Manage High Content Screening CellProfiler Pipelines with Apache Airflow

If you are running a High Content Screening pipeline, you probably have a lot of moving pieces. As a non-exhaustive list, you need to:

  • Trigger CellProfiler analyses, either from a LIMS, by watching a filesystem, or through some other process.
  • Keep track of dependencies between CellProfiler analyses: first run an illumination correction, and then run your analysis.
  • Split large datasets so the analysis finishes sometime this century: split your analysis, run the pieces, and then gather the results.
  • Decide how to organize your results once you have them: put your data in a database and set up in-depth analysis pipelines.
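To make the dependency and split/gather steps above concrete, here is a minimal sketch in plain Python, using only the standard library rather than Airflow itself. The task names, image filenames, and batch size are hypothetical and chosen only for illustration. A workflow system tracks essentially this kind of graph for you.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def split_into_batches(items, batch_size):
    """Split a list into consecutive batches of at most batch_size items."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def build_task_graph(n_batches):
    """Map each task name to the set of tasks that must finish before it."""
    graph = {"illumination_correction": set()}
    analysis_tasks = [f"analysis_batch_{i}" for i in range(n_batches)]
    for task in analysis_tasks:
        # Every analysis batch waits for the illumination correction.
        graph[task] = {"illumination_correction"}
    # The gather step waits for all analysis batches.
    graph["gather_results"] = set(analysis_tasks)
    return graph


# Hypothetical image names standing in for a real plate.
images = [f"plate1_well{i:02d}.tif" for i in range(10)]
batches = split_into_batches(images, batch_size=4)

graph = build_task_graph(len(batches))
order = list(TopologicalSorter(graph).static_order())
print(order)
```

The illumination correction always comes first in the resulting order and the gather step always comes last. The analysis batches in between do not depend on each other, so a scheduler is free to run them in parallel.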

These tasks are much easier to accomplish when you have a system or framework that is built for scientific workflows.

If you prefer to watch, I have a video where I go through all the steps in this tutorial.

Enter Apache Airflow

Apache Airflow is:

Airflow is a platform created by the community to programmatically author, schedule and...


Setup a High Content Screening Imaging Platform with Label Studio

For a few years now I have been on a quest to find a tool I really like for annotating HCS images using a web interface. I've used several tools, including a desktop application called LabelImg, and I have finally found one that checks all the boxes: Label Studio!

Label Studio is an annotation tool for images, audio, and text. Here we'll be concentrating on images as our medium of choice.

I go through the process in this video.


Grab the data

You can, of course, use your own data, but for this tutorial I will be using a publicly available C. elegans dataset from the Broad BioImage Benchmark Collection.

mkdir data
cd data
wget
unzip

HCS images are often very dark when opened in a system viewer. To use them for the rest of the pipeline we will have to do a two-step conversion process, first using bftools to convert from tif -> png, and then using ImageMagick to do a levels...
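As a rough illustration of what a levels adjustment does, the sketch below linearly stretches a handful of pixel intensities so that the darkest value maps to 0 and the brightest to 255. It assumes the darkness comes from the image using only a small part of its available intensity range, which is a common cause but not the only one. The function name and pixel values are hypothetical, and this shows the general idea in plain Python, not the specific bftools or ImageMagick commands used in the tutorial.

```python
def stretch_levels(pixels, out_max=255):
    """Linearly rescale intensities so the minimum maps to 0 and the maximum to out_max."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        # A flat image has no contrast to stretch; map everything to 0.
        return [0 for _ in pixels]
    # Integer arithmetic with floor division keeps the results exact.
    return [(p - lo) * out_max // (hi - lo) for p in pixels]


# Hypothetical raw intensities that use only a small slice of a 16-bit range.
dark = [100, 150, 200, 300]
stretched = stretch_levels(dark)
print(stretched)
```

After the stretch the values span the full 0 to 255 range, which is why the same image looks much brighter in an ordinary viewer.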


