Updating the OneCellPipe image

To update the data or software bundled with OneCellPipe, the Docker and Singularity containers have to be re-created. This example puts a new Bowtie index in place. All commands are run on the command line of an Ubuntu Linux system with Docker installed and running.

Part 1

  1. Preparation: Build the Bowtie index with the following steps.
    Get new genome data for human from Ensembl:

    wget ftp://ftp.ensembl.org/pub/release-91/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa.gz
    wget ftp://ftp.ensembl.org/pub/release-91/gtf/homo_sapiens/Homo_sapiens.GRCh38.91.gtf.gz
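
Large FTP downloads are occasionally truncated, so it can be worth testing the archives before building the index. The following is a minimal sketch, not part of OneCellPipe: check_gz is a hypothetical helper name, and it only relies on the standard "gzip -t" integrity test.

```shell
# Hypothetical helper (not part of OneCellPipe or indrops):
# returns 0 only if every given file passes the gzip integrity test.
check_gz() {
    status=0
    for f in "$@"; do
        if gzip -t "$f" 2>/dev/null; then
            echo "ok: $f"
        else
            echo "corrupt or missing: $f" >&2
            status=1
        fi
    done
    return "$status"
}

# Usage in the download directory:
#   check_gz *.fa.gz *.gtf.gz
```

A non-zero return value means at least one file should be downloaded again.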
    
  2. Build the index using Bowtie through indrops. Make sure the indrops config file contains the correct values for bowtie_index and bowtie_dir. The first half of my default.yaml file looks like this (the index will be built as “/home/ubuntu/ref/Homo_sapiens.GRCh38.annotated”):
    project_name : "test"
    project_dir : "/home/ubuntu/data/indrops/"
    sequencing_runs :
      - name : "Run3"
        version : 'v2'
        dir : "/home/ubuntu/data/seq_runs/run_v2_split"
        fastq_path : "{library_prefix}_{split_affix}_{read}_001.fastq.gz"
        split_affixes : ["L001", "L002"]
        libraries :
          - {library_name: "test_lib1", library_prefix: "lib1"}
    paths :
      bowtie_index : "/home/ubuntu/ref/Homo_sapiens.GRCh38.annotated"
      bowtie_dir : '/home/ubuntu/tools/bowtie-1.2.2/'
      rsem_dir : '/home/ubuntu/tools/RSEM/RSEM-1.3.0/'
      python_dir : '/usr/bin/'
      java_dir : '/usr/bin/'
      samtools_dir : '/home/ubuntu/tools/SAMTOOLS_DIR/samtools-1.3.1/bin/'


    python indrops.py default.yaml build_index \
        --genome-fasta-gz /home/ubuntu/ref/Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa.gz \
        --ensembl-gtf-gz /home/ubuntu/tools/indrops/ref/Homo_sapiens.GRCh38.91.gtf.gz
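
Before moving on, it may help to confirm that the index was actually written. The sketch below assumes that build_index produces standard Bowtie 1 index files under the configured bowtie_index prefix (six files ending in .ebwt, or .ebwtl for large indexes); check_bowtie_index is a hypothetical helper name, not an indrops command.

```shell
# Hypothetical helper (not part of indrops): checks that all six
# Bowtie 1 index files exist and are non-empty for a given prefix.
check_bowtie_index() {
    prefix=$1
    for ext in ebwt ebwtl; do
        missing=0
        for part in 1 2 3 4 rev.1 rev.2; do
            [ -s "${prefix}.${part}.${ext}" ] || missing=1
        done
        if [ "$missing" -eq 0 ]; then
            echo "complete Bowtie index found: ${prefix} (.${ext})"
            return 0
        fi
    done
    echo "incomplete or missing Bowtie index: ${prefix}" >&2
    return 1
}

# Usage with the prefix from default.yaml:
#   check_bowtie_index /home/ubuntu/ref/Homo_sapiens.GRCh38.annotated
```
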
  3. Get the current Docker image. We are using the Docker Hub user “1cbdev”.

    export DOCKER_ID_USER="1cbdev"
    sudo docker pull 1cbdev/ocb-with-ref
  4. Start a container from the image and open a shell inside it. The directory /home/ubuntu on the host machine will be available through the path /tmp inside the Docker container.

    IMAGE="1cbdev/ocb-with-ref:5"
    sudo docker run -it --rm -m 4g \
        -e HOME="/home/onecellbio" \
        --mount type=bind,source=/home/ubuntu,target=/tmp \
        -e PATH="/home/onecellbio/pyndrops/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/home/onecellbio:/home/onecellbio/miniconda2/bin" \
        --name onecellpipe "${IMAGE}" /bin/bash
  5. Replace the Bowtie index directory with the new one built on the host machine (the host's ref directory, visible as /tmp/ref inside the container):

    cd /home/onecellbio/
    rm -rf ref 
    cp -r /tmp/ref .
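
Since the index is large, a quick comparison of source and copy can catch an incomplete copy before the container is exported. This is a minimal sketch, assuming the diff utility is available inside the container; same_tree is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds only if two directory trees
# contain the same files with the same content.
same_tree() {
    src=$1
    dst=$2
    if diff -rq "$src" "$dst" >/dev/null 2>&1; then
        echo "identical: $src and $dst"
    else
        echo "differences found between $src and $dst" >&2
        return 1
    fi
}

# Usage inside the container:
#   same_tree /tmp/ref /home/onecellbio/ref
```
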
  6. If there are other things you would like to update, this is the time.

Part 2

  1. To create a new image from the running, modified container, open a new terminal window to access your container “from the outside” and export it. (You can use the name you assigned previously or the container ID shown by docker ps.)

    sudo docker export onecellpipe | gzip > onecellpipe-NEW-VERSION.tar.gz
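  2. Let indrops know that there is a new index by updating the setting in onecellpipe/bin/indrop_fixed.yaml. The path must point to the index prefix inside the container; adjust the file name if your index was built under a different prefix: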
     bowtie_index : '/home/onecellbio/ref/Homo_sapiens.GRCh38.91.annotated'
  3. Update the config file in bin/nextflow.standard.config; the following parts might need updating:
    image_name_singularity = "onecellpipe-26-2.simg"
    image_name_docker = "onecellpipe-26-2.tar.gz"
    docker_hub_id_1 = "1cbdev/ocb-no-ref:4"
    docker_hub_id_2 = "1cbdev/ocb-w-ref:3"
  4. Test the new image with a copy of the sample data provided:

    cp -r sampledata testrun
    nextflow onecellpipe.nf --docker 1 --dir testrun
  5. If everything looks OK, remove the test folder:

    rm -rf testrun
  6. Re-import the image. Make sure to replace the values of NEWVERSION and VERSION_NUMBER with increments of the current numbers. NEWVERSION must match the name of the archive exported in step 1, and DOCKER_ID_USER must also be set in this terminal.
    export NEWVERSION="27-2"
    export VERSION_NUMBER="8"
    sudo docker import -m "version ${VERSION_NUMBER}" onecellpipe-${NEWVERSION}.tar.gz ${DOCKER_ID_USER}/ocb-with-ref:${VERSION_NUMBER}
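
The increment itself can be scripted. The sketch below assumes that only the leading number of a version string is increased, as in the change from "26-2" to "27-2" used in this example; bump_version is a hypothetical helper name and expects a string that starts with an integer.

```shell
# Hypothetical helper: increments the leading integer of a version
# string and keeps any "-suffix" unchanged.
bump_version() {
    current=$1
    major=${current%%-*}              # text before the first hyphen
    case $current in
        *-*) rest="-${current#*-}" ;; # everything after the first hyphen
        *)   rest="" ;;
    esac
    echo "$((major + 1))${rest}"
}

# Usage:
#   bump_version 26-2   # image file version
#   bump_version 7      # Docker tag
```
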
  7. Check that the image was imported and note its image ID (the output below is an example):
    sudo docker images
    REPOSITORY            TAG   IMAGE ID       CREATED          SIZE
    1cbdev/ocb-with-ref   7     d10807949159   14 seconds ago   8.42GB
    1cbdev/ocb-with-ref   5     476fa069928a   2 months ago     6.16GB
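
Instead of copying the image ID by hand, it can be picked out of the listing. A minimal sketch, assuming the default table layout of docker images shown above (repository, tag and ID in the first three columns); image_id_for is a hypothetical helper name.

```shell
# Hypothetical helper: reads `docker images` table output on stdin and
# prints the image ID for the given repository ($1) and tag ($2).
image_id_for() {
    awk -v repo="$1" -v tag="$2" \
        '($1 "") == (repo "") && ($2 "") == (tag "") { print $3 }'
}

# Usage:
#   sudo docker images | image_id_for 1cbdev/ocb-with-ref 7
```

Docker can also filter directly: "sudo docker images -q 1cbdev/ocb-with-ref:7" prints only the ID of the matching image.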
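  8. Push the image to the Docker registry (you will need the Docker Hub password for this). The imported image is first tagged with the repository name used for the push; VERSION_NUMBER is the tag chosen in step 6:

    sudo docker login
    sudo docker tag d10807949159 ${DOCKER_ID_USER}/ocb-w-ref:${VERSION_NUMBER}
    sudo docker push ${DOCKER_ID_USER}/ocb-w-ref:${VERSION_NUMBER}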
  9. You can use the file onecellpipe-NEW-VERSION.tar.gz as the base image for Docker-based pipelines. For Singularity-based pipelines, pull the image from the Docker registry once to create the “simg” file:
    singularity build onecellpipe-NEW-VERSION.simg docker://${DOCKER_ID_USER}/ocb-w-ref:${VERSION_NUMBER}
  10. You can re-use the running Docker container to create a second version: the image without the reference data. This can be useful when running on separate machines, because only the smaller image has to be downloaded for the steps that do not perform the genome alignment.