Nvidia-Docker containers for your JupyterLab-based TensorFlow-GPU environment with a Mask R-CNN example, on Ubuntu 18.04 LTS

This guide sets up a convenient development environment for TensorFlow-based deep learning on NVIDIA cards using nvidia-docker containers. The work itself happens in JupyterLab.

GitHub: https://github.com/vishwakarmarhl/dl-lab-docker

 

Overview

  1. Setup Python
  2. Setup Docker
  3. Setup Nvidia-Docker
  4. Create Deep learning container
  5. Run Mask R-CNN example in container
  6. Manage containers using Portainer

References

 

1. Setup Python

We need a virtual environment to work in, and for that we use virtualenvwrapper

sudo apt-get install -y build-essential cmake unzip pkg-config ubuntu-restricted-extras git python3-dev python3-pip python3-numpy
sudo apt-get install -y freeglut3 freeglut3-dev libxi-dev libxmu-dev
sudo pip3 install virtualenv virtualenvwrapper

Edit the ~/.bashrc file, add the following entries, and source it

# virtualenv and virtualenvwrapper
export WORKON_HOME=$HOME/.virtualenvs
export VIRTUALENVWRAPPER_PYTHON=/usr/bin/python3
source /usr/local/bin/virtualenvwrapper.sh
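
After saving, reload your shell configuration so the wrapper commands become available:

source ~/.bashrc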

Create a virtual environment and install packages into it

mkvirtualenv cv3 -p python3
workon cv3
pip install numpy scipy scikit-image scikit-learn
pip install imutils pyzmq ipython matplotlib imgaug

More on this is available in Compile and Setup OpenCV 3.4.x on Ubuntu 18.04 LTS with Python Virtualenv for Image processing with Ceres, VTK, PCL

 

2. Setup Docker

Install docker-ce for Ubuntu 18.04, keeping in mind its compatibility with the nvidia-docker installation that comes next

The repository setup is critical here; follow the instructions in the official installation guide at https://docs.docker.com/install/linux/docker-ce/ubuntu/
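
At the time of writing, the repository setup from that guide looks roughly as below; treat this as a sketch and check the guide for the current keys and URLs.

# Install prerequisites, then add Docker's official GPG key and stable repository
sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl gnupg-agent software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository \
   "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
sudo apt-get update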

Just install docker-ce now

sudo apt-get install docker-ce=5:19.03.2~3-0~ubuntu-bionic 
sudo apt-get install docker-ce-cli=5:19.03.2~3-0~ubuntu-bionic 
sudo apt-get install containerd.io
sudo usermod -aG docker $USER
sudo systemctl enable docker 

Reboot the machine now and run "docker run hello-world" to test the installation
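
A minimal sanity check after the reboot:

# Should run without sudo thanks to the docker group membership added above
docker run hello-world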

 

3. Setup Nvidia-Docker

Install the nvidia-docker runtime, which allows containers to access the GPU hardware. Docker 19.03 is required for the --gpus flag used below.

# Add the package repositories
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker

Test the runtime using nvidia-smi on official containers

# Test nvidia-smi with an official CUDA base image
docker run --gpus all nvidia/cuda:9.0-base nvidia-smi

You can also configure docker to use the nvidia runtime by default. Write the config below to /etc/docker/daemon.json and restart docker afterwards

# Use tee here, since a plain shell redirection would not run with root privileges
sudo tee /etc/docker/daemon.json <<'EOF'
{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
EOF
sudo systemctl restart docker
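
With nvidia as the default runtime, containers can see the GPU even without the --gpus flag; a quick sanity check using the same CUDA image as above:

# The GPU should now be visible without --gpus
docker run --rm nvidia/cuda:9.0-base nvidia-smi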

 

Advanced instructions for moving your docker image storage to a different location

Reference: https://forums.docker.com/t/how-do-i-change-the-docker-image-installation-directory/1169

Ubuntu/Debian: edit your /etc/default/docker file with the -g option: DOCKER_OPTS="--dns 8.8.8.8 --dns 8.8.4.4 -g /mnt"
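
Note that on systemd-managed installs like Ubuntu 18.04, /etc/default/docker is often ignored, so the more reliable route is the data-root key in /etc/docker/daemon.json. A minimal sketch, assuming /mnt/docker as the target path; merge the key into your existing config rather than overwriting it:

{
    "data-root": "/mnt/docker"
}

Then restart the daemon with sudo systemctl restart docker.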

4. Create a Deep Learning Container

Use the Dockerfile and example provided in the dl-lab-docker repository

git clone https://github.com/vishwakarmarhl/dl-lab-docker.git
cd dl-lab-docker

Now let's build an image and run it locally.

Make sure nvidia-docker is installed and the default runtime is nvidia.
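
One quick way to verify this is docker info, which reports the configured runtimes:

# Both the available runtimes and the default runtime should mention nvidia
docker info | grep -i runtime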

docker build -t dl-lab-docker:latest . -f Dockerfile.dl-lab.xenial

I have a prebuilt docker image containing tensorflow-gpu==1.5.0 with CUDA 9.0 and cuDNN 7.0.5, which can be run as below

docker run --gpus all -it --ipc=host -p 8888:8888 \
           dl-lab-docker:latest

Finally, access the JupyterLab page via the tokenized URL printed in the container logs, e.g. http://127.0.0.1:8888/?token=31dcb0f9e

This docker image is also available on my Docker Hub as vishwakarmarhl/dl-lab-docker, so you can skip the build and pull the image directly.
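
Assuming the latest tag on Docker Hub, pulling the prebuilt image is just:

docker pull vishwakarmarhl/dl-lab-docker:latest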

Oh, and by the way: since this is a development environment, do not rely on the code baked into the container. Mount your own development folders and work in those.

docker run --gpus all -it --ipc=host -v $(pwd):/module/host_workspace \
           -p 8888:8888 vishwakarmarhl/dl-lab-docker:latest

[Image: JupLab_Explorer]

 

5. Container with Mask R-CNN in Jupyter Lab

The code directory contains a Dockerfile to make it easy to get up and running with TensorFlow via Docker.

Navigate to the mrcnn codebase as below and open up masker.ipynb

[Image: JupyterLab]

Below is an example run of the Mask R-CNN model taken from https://github.com/matterport/Mask_RCNN

[Image: JupyterLab-MRCNN-example]

 

6. Manage Containers using Portainer

Since we are dealing with docker containers, things quickly get messy in a development environment as containers pile up. We will handle this with the Portainer management interface

docker volume create portainer_data 
docker run -d -p 8000:8000 -p 9000:9000 -v /var/run/docker.sock:/var/run/docker.sock -v portainer_data:/data portainer/portainer

[Image: Portainer]

The Portainer interface should be available at http://0.0.0.0:9000/#/home

 

Finally, you can take a look at the images in my Docker Hub (dl-lab-docker) repository, which automatically builds from the GitHub Dockerfile