How to Access an NVIDIA GPU from a Docker Container?

Nowadays, most machine learning applications require an NVIDIA GPU to speed up training and inference. In this tutorial, we will see how you can use Docker for your machine learning application while still accessing your GPU(s), which makes your life easier when you want to share your work and/or deploy it to other machines.

Why use Docker?

You probably already know that there are a lot of prerequisites to satisfy before you can install TensorFlow or PyTorch and start building your machine learning app. And if you didn't know before, now you know ;)

You can of course follow the official procedure to install your NVIDIA GPU driver, CUDA, and the TensorFlow (or PyTorch) libraries from the official websites. However, you may learn the hard way that installing all these libraries and drivers directly can easily mess up your computer and your graphics card setup, especially if you need different versions of the libraries for different projects. That's why I would highly recommend installing TensorFlow/PyTorch inside a Docker container.

“Docker is essentially a self-contained OS with all the dependencies necessary for a smooth installation.”

STEP1 — Set up Docker

Install Docker:

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository \
"deb [arch=amd64] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) stable"
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io

Add your user to the docker group:

sudo usermod -aG docker $USER
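
Note that the group change only takes effect after you log out and log back in. To apply it in your current shell without re-logging, you can run:

newgrp docker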

STEP2 — Set up the NVIDIA driver

Check that the NVIDIA driver is installed with the command nvidia-smi. You should get something like the following output:
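
(Illustrative values only; your driver version, CUDA version, and GPU model will differ.)

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.64       Driver Version: 440.64       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    Off  | 00000000:01:00.0  On |                  N/A |
| 27%   38C    P8     9W / 180W |    385MiB /  8116MiB |      2%      Default |
+-------------------------------+----------------------+----------------------+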

If the NVIDIA driver is not installed, follow the instructions on the NVIDIA driver downloads page to install it.

STEP3 — Install the NVIDIA container runtime

curl -s -L https://nvidia.github.io/nvidia-container-runtime/gpgkey | sudo apt-key add -
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-container-runtime/$distribution/nvidia-container-runtime.list |\
sudo tee /etc/apt/sources.list.d/nvidia-container-runtime.list
sudo apt-get update
sudo apt-get install nvidia-container-runtime
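
Depending on your setup, you may also need to register the NVIDIA runtime with Docker yourself. A minimal sketch of /etc/docker/daemon.json, assuming the runtime binary was installed to /usr/bin/nvidia-container-runtime:

{
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}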

Restart Docker:

sudo systemctl stop docker
sudo systemctl start docker

Now you are ready to run your first CUDA application in Docker! But before that, you can test Docker with the hello-world image.

$ docker run hello-world
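
If Docker is set up correctly, the output should start with something like:

Hello from Docker!
This message shows that your installation appears to be working correctly.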

STEP4 — Run CUDA in Docker

You can run a Docker container from one of the images available on Docker Hub by running the following command:

$ docker run --gpus all --rm nvidia/cuda nvidia-smi

But before doing this, make sure to choose the right base image for your application (tags take the form {version}-cudnn*-{devel|runtime}). At the time of writing, the newest one is 10.2-cudnn7-devel.

Check that the NVIDIA runtime works in Docker with:

docker run --gpus all nvidia/cuda:10.2-cudnn7-devel nvidia-smi

💡 You can specify the number of GPUs, and even pick specific GPUs, with the --gpus flag, as shown below.
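
For example (the device indices here are just for illustration):

$ docker run --gpus 2 --rm nvidia/cuda:10.2-cudnn7-devel nvidia-smi
$ docker run --gpus '"device=0,2"' --rm nvidia/cuda:10.2-cudnn7-devel nvidia-smi

The first command exposes any 2 GPUs to the container; the second exposes only GPUs 0 and 2.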

To be sure that the container is actually accessing the GPU(s), run nvidia-smi inside the container; you should get the same output as on the host machine.

STEP5 — Run CUDA in Docker + TensorFlow

Download the TensorFlow Docker images with GPU support:

$ docker pull tensorflow/tensorflow:latest-gpu-py3
$ docker pull tensorflow/tensorflow:latest-gpu-py3-jupyter

Test that the image is working properly:

$ docker run --gpus all -it --rm tensorflow/tensorflow:latest-gpu-py3 python -c "import tensorflow as tf; print(tf.__version__); print(tf.test.is_gpu_available()); print(tf.test.is_built_with_cuda())"

This should print the TensorFlow version and whether GPU support is available, along with the name/type of the GPU that was detected.
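
If your image ships TensorFlow 2.1 or newer, you can also list the visible GPUs explicitly (an empty list means the container has no GPU access):

$ docker run --gpus all -it --rm tensorflow/tensorflow:latest-gpu-py3 python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"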

To run the TensorFlow container and explore it, create a new container from the TensorFlow image:

$ docker run -it --gpus all --rm tensorflow/tensorflow:latest-gpu-py3

Once you are logged in to the container, you can explore it using ls, cd, nvidia-smi, etc. To exit the container, just type "exit".

For a practical example, we will share a local folder with the container and use a Jupyter notebook. First, create a directory to exchange files between your machine and the container:

$ mkdir ~/ws_host

Then run the container using the following command:

$ docker run -u $(id -u):$(id -g) --gpus all -it --rm --name app1_container -v ~/ws_host:/work_app1 -p 8888:8888 -p 6006:6006 tensorflow/tensorflow:latest-gpu-py3-jupyter

The different options are used for:

-u $(id -u):$(id -g)    # run the container with your user and group ID
--gpus all              # enable GPU support
-it                     # run an interactive container inside a terminal
--rm                    # automatically clean up the container and remove the file system on exit
--name app1_container   # give it a friendly name
-v ~/ws_host:/work_app1 # share a directory between the host and the container
-p 8888:8888            # forward port 8888 for the Jupyter notebook
-p 6006:6006            # forward port 6006 for TensorBoard

Once the container is running, you should get a URL starting with something like "http://127.0.0.1:8888/?token=…". Copy and paste it into your browser to open the Jupyter notebook interface.

In parallel, you can use docker exec to run a command inside your running container, for example if you just want a terminal:

$ docker exec -it app1_container bash
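
To confirm that the shared folder works, you can create a file from inside the container and check that it shows up on the host (the file name here is just an example):

# inside the container
echo "hello from the container" > /work_app1/test.txt
# on the host
ls ~/ws_host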

STEP6 — Run CUDA in Docker + PyTorch + TensorFlow

If you want to use PyTorch instead of TensorFlow, or both in the same project, you can use one of the following Dockerfiles to build an image first and then run a container.

CUDA Docker + PyTorch

In a folder, create a file named "Dockerfile" and copy-paste the following lines into it:

FROM nvidia/cuda:11.1.1-base
# Check the version of CUDA you have and adjust the line above
# accordingly. A list of the available images can be found on
# hub.docker.com
RUN mkdir /home/app
RUN apt-get update -y && \
    apt-get install ffmpeg libsm6 libxext6 -y && \
    apt-get install git -y && \
    apt-get install python3-pip -y
RUN pip3 install --no-cache-dir --upgrade pip
# https://pytorch.org/
# Update the line below according to the CUDA version of the host
# system - check the link above to find the right wheel
RUN pip3 install torch==1.3.0+cu100 torchvision==0.4.1+cu100 -f https://download.pytorch.org/whl/torch_stable.html
WORKDIR /home/app

Once you have the file, open a terminal and run the following command to build your image:

$ docker build -t nvidia_gpu_torch .

After this, you can run a container from the image:

$ docker run --gpus all -it --rm --name app_node_1 -v ~/ws_app/cracks:/home/ws_app -p 8888:8888 nvidia_gpu_torch bash

To access the running container from another terminal, run the following command:

$ docker exec -it app_node_1 bash
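
Inside the container, you can run a quick sanity check that PyTorch sees the GPU, for example:

$ python3 -c "import torch; print(torch.cuda.is_available()); print(torch.cuda.get_device_name(0))"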

In case you want to have both PyTorch and TensorFlow in your Docker container, use the Dockerfile below and follow the same instructions to create an image and run a container.

CUDA Docker + PyTorch + TensorFlow

FROM tensorflow/tensorflow:latest-gpu-py3-jupyter
RUN mkdir /home/app
RUN apt-get update -y && \
    apt-get install ffmpeg libsm6 libxext6 -y && \
    apt-get install git -y && \
    apt-get install python3-pip -y
RUN pip3 install --no-cache-dir --upgrade pip
# https://pytorch.org/
# Update the line below according to the CUDA version of the host
# system - check the link above to find the right wheel
RUN pip3 install torch==1.3.0+cu100 torchvision==0.4.1+cu100 -f https://download.pytorch.org/whl/torch_stable.html
WORKDIR /home/app

STEP7 — Final Check

In your running container, you can verify that you have access to your GPU using nvidia-smi, and from your Python script via TensorFlow and/or PyTorch using the code snippet below:

# tensorflow: list all devices visible to TensorFlow
from tensorflow.python.client import device_lib
device_lib.list_local_devices()

# pytorch: check CUDA availability and get the GPU name
import torch
torch.cuda.is_available()
torch.cuda.get_device_name()

RESOURCES

https://www.celantur.com/blog/run-cuda-in-docker-on-linux
