Tensorflow-GPU setup with cuDNN and NVIDIA CUDA 9.0 on Ubuntu 18.04 LTS

Prerequisite: a machine with an NVIDIA graphics card and CUDA already installed.


CUDA Setup

The driver and CUDA toolkit installation is described in a previous blogpost.

One change is needed here: the TensorFlow setup requires CUDA toolkit 9.0, so CUDA 9.1 is removed and 9.0 is installed instead.

# Clean CUDA 9.1 and install 9.0
$ sudo /usr/local/cuda/bin/uninstall_cuda_9.1.pl 
$ sudo rm -rf /usr/local/cuda-9.1
$ sudo ./cuda_9.0.176_384.81_linux.run --override

# Make sure environment variables are set for test
$ source ~/.bashrc 
$ sudo ln -s /usr/bin/gcc-6 /usr/local/cuda/bin/gcc
$ sudo ln -s /usr/bin/g++-6 /usr/local/cuda/bin/g++
$ cd ~/NVIDIA_CUDA-9.0_Samples/
$ make -j12
$ ./bin/x86_64/linux/release/deviceQuery

Test Successful
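
Besides deviceQuery, the installed toolkit version can be confirmed from the output of nvcc --version. The following is a minimal sketch, assuming the banner contains the usual "release X.Y" text and that nvcc lives at /usr/local/cuda/bin/nvcc; the helper names are hypothetical and not part of CUDA.

```python
import re
import subprocess


def parse_nvcc_release(output):
    """Return the CUDA release as a (major, minor) tuple, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def installed_cuda_release(nvcc="/usr/local/cuda/bin/nvcc"):
    """Run nvcc and parse its version banner; None if nvcc cannot be run."""
    try:
        result = subprocess.run([nvcc, "--version"], stdout=subprocess.PIPE,
                                universal_newlines=True, check=True)
    except (OSError, subprocess.CalledProcessError):
        return None
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    release = installed_cuda_release()
    print("CUDA release:", release)
    if release != (9, 0):
        print("Warning: this TensorFlow setup expects CUDA 9.0")
```

Parsing is kept separate from running the command so the check can be tried on a pasted banner as well.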

cuDNN Setup

Referenced from a Medium blogpost.

The following steps are pretty much the same as the installation guide using .deb files (strange that the cuDNN guide is better than the CUDA one).


  1. Go to the cuDNN download page (registration required) and select the latest cuDNN 7.1.* version made for CUDA 9.0.
  2. Download all 3 .deb files: the runtime library, the developer library, and the code samples library for Ubuntu 16.04.
  3. In your download folder, install them in the same order:
# (the runtime library)
$ sudo dpkg -i libcudnn7_7.1.4.18-1+cuda9.0_amd64.deb
# (the developer library)
$ sudo dpkg -i libcudnn7-dev_7.1.4.18-1+cuda9.0_amd64.deb
# (the code samples)
$ sudo dpkg -i libcudnn7-doc_7.1.4.18-1+cuda9.0_amd64.deb

# To uninstall cuDNN later, remove the packages in reverse order
$ sudo dpkg -r libcudnn7-doc libcudnn7-dev libcudnn7

Now, we can verify the cuDNN installation (below is just the official guide, which surprisingly works out of the box):

  1. Copy the code samples somewhere you have write access: cp -r /usr/src/cudnn_samples_v7/ ~/
  2. Go to the MNIST example code: cd ~/cudnn_samples_v7/mnistCUDNN.
  3. Compile the MNIST example: make clean && make -j4
  4. Run the MNIST example: ./mnistCUDNN. If your installation is successful, you should see Test passed! at the end of the output.
(cv3) rahul@Windspect:~/cv/cudnn_samples_v7/mnistCUDNN$ ./mnistCUDNN
cudnnGetVersion() : 7104 , CUDNN_VERSION from cudnn.h : 7104 (7.1.4)
Host compiler version : GCC 5.4.0
There are 2 CUDA capable devices on your machine :
device 0 : sms 28  Capabilities 6.1, SmClock 1582.0 Mhz, MemSize (Mb) 11172, MemClock 5505.0 Mhz, Ecc=0, boardGroupID=0
device 1 : sms 28  Capabilities 6.1, SmClock 1582.0 Mhz, MemSize (Mb) 11163, MemClock 5505.0 Mhz, Ecc=0, boardGroupID=1
Using device 0


Result of classification: 1 3 5
Test passed!
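
The integer printed by cudnnGetVersion() packs the version into one number. Here is a small sketch for decoding it, assuming the major*1000 + minor*100 + patch encoding that matches the 7104 / 7.1.4 pair in the output above; the function name is hypothetical.

```python
def decode_cudnn_version(version):
    """Split an integer such as 7104 into (major, minor, patch)."""
    major, rest = divmod(version, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


if __name__ == "__main__":
    print(decode_cudnn_version(7104))  # (7, 1, 4)
```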

In case of a compilation error such as:


/usr/local/cuda/include/cuda_runtime_api.h:1683:101: error: use of enum ‘cudaDeviceP2PAttr’ without previous declaration
extern __host__ __cudart_builtin__ cudaError_t CUDARTAPI cudaDeviceGetP2PAttribute(int *value, enum cudaDeviceP2PAttr attr, int srcDevice, int dstDevice);
/usr/local/cuda/include/cuda_runtime_api.h:2930:102: error: use of enum ‘cudaFuncAttribute’ without previous declaration
 extern __host__ __cudart_builtin__ cudaError_t CUDARTAPI cudaFuncSetAttribute(const void *func, enum cudaFuncAttribute attr, int value);
In file included from /usr/local/cuda/include/channel_descriptor.h:62:0,
                 from /usr/local/cuda/include/cuda_runtime.h:90,
                 from /usr/include/cudnn.h:64,
                 from mnistCUDNN.cpp:30:

Solution: open the header with sudo vim /usr/include/cudnn.h

Replace the line '#include "driver_types.h"'
with '#include <driver_types.h>'
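
If you would rather script the edit than do it in vim, the same replacement can be sketched in Python. This is only an illustration with hypothetical helper names; back up the header first and run it with sufficient permissions if you point it at /usr/include/cudnn.h.

```python
def patch_cudnn_header(text):
    """Switch the driver_types.h include from quote form to angle-bracket form."""
    return text.replace('#include "driver_types.h"', '#include <driver_types.h>')


def patch_file(path):
    """Rewrite the header in place; returns True if anything changed."""
    with open(path) as handle:
        original = handle.read()
    patched = patch_cudnn_header(original)
    if patched == original:
        return False
    with open(path, "w") as handle:
        handle.write(patched)
    return True
```

Calling patch_file a second time returns False, so the script is safe to re-run.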


Configure the CUDA & cuDNN Environment Variables

# CUPTI libraries are at /usr/local/cuda/extras/CUPTI/lib64
export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/local/cuda-9.0/lib64 
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/local/cuda-9.0/lib 
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/local/cuda/extras/CUPTI/lib64

source ~/.bashrc
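
After sourcing ~/.bashrc it is worth checking that the directories actually ended up on LD_LIBRARY_PATH. A minimal sketch, assuming the two directories below are the ones that matter for this setup; the function name is hypothetical.

```python
import os

# Directories taken from the export lines above.
REQUIRED_DIRS = [
    "/usr/local/cuda-9.0/lib64",
    "/usr/local/cuda/extras/CUPTI/lib64",
]


def missing_library_dirs(ld_library_path, required=REQUIRED_DIRS):
    """Return the required directories absent from a colon-separated path."""
    present = {os.path.normpath(p) for p in ld_library_path.split(":") if p}
    return [d for d in required if os.path.normpath(d) not in present]


if __name__ == "__main__":
    missing = missing_library_dirs(os.environ.get("LD_LIBRARY_PATH", ""))
    print("Missing from LD_LIBRARY_PATH:", missing or "nothing")
```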

TensorFlow installation

The Python environment is set up using a virtualenv located at /opt/pyenv/cv3

$ source /opt/pyenv/cv3/bin/activate
$ pip install numpy scipy matplotlib 
$ pip install scikit-image scikit-learn ipython

Referenced from the official TensorFlow guide. Run only the one line that matches your Python version and hardware:

$ pip install --upgrade tensorflow      # for Python 2.7
$ pip3 install --upgrade tensorflow     # for Python 3.n
$ pip install --upgrade tensorflow-gpu  # for Python 2.7 and GPU
$ pip3 install --upgrade tensorflow-gpu==1.5 # for Python 3.n and GPU

# remove tensorflow
$ pip3 uninstall tensorflow-gpu

Now, run a test

(cv3) rahul@Windspect:~$ python
Python 3.5.2 (default, Nov 23 2017, 16:37:01)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()
2018-08-14 18:03:45.024181: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-08-14 18:03:45.261898: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1405] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:03:00.0
totalMemory: 10.91GiB freeMemory: 10.75GiB
2018-08-14 18:03:45.435881: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1405] Found device 1 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:04:00.0
totalMemory: 10.90GiB freeMemory: 10.10GiB
2018-08-14 18:03:45.437318: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1484] Adding visible gpu devices: 0, 1
2018-08-14 18:03:46.100062: I tensorflow/core/common_runtime/gpu/gpu_device.cc:965] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-08-14 18:03:46.100098: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0 1
2018-08-14 18:03:46.100108: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] 0: N Y
2018-08-14 18:03:46.100114: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] 1: Y N
2018-08-14 18:03:46.100718: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1097] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10398 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:03:00.0, compute capability: 6.1)
2018-08-14 18:03:46.262683: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1097] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 9769 MB memory) -> physical GPU (device: 1, name: GeForce GTX 1080 Ti, pci bus id: 0000:04:00.0, compute capability: 6.1)
>>> print(sess.run(hello))
b'Hello, TensorFlow!'

It looks like TensorFlow is able to discover and use both NVIDIA GPUs.
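
The same check can be done without reading the log. The sketch below assumes a TensorFlow 1.x install, where tensorflow.python.client.device_lib is available; the two helper names are hypothetical.

```python
def gpu_device_names(devices):
    """Keep only the names of devices whose type is GPU."""
    return [d.name for d in devices if d.device_type == "GPU"]


def list_local_gpus():
    """Ask TensorFlow 1.x for its local devices and return the GPU names."""
    # Imported lazily so the helper above stays usable without TensorFlow.
    from tensorflow.python.client import device_lib
    return gpu_device_names(device_lib.list_local_devices())
```

Running print(list_local_gpus()) inside the virtualenv should list one entry per card; an empty list suggests the CPU-only package was installed or the CUDA libraries were not found.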


Now add keras to the system

pip install pillow h5py keras autopep8

Edit the configuration file with vim ~/.keras/keras.json:

{
    "image_data_format": "channels_last",
    "backend": "tensorflow",
    "epsilon": 1e-07,
    "floatx": "float32"
}

A quick test for Keras at the Python CLI looks like this:

(cv3) rahul@Windspect:~/workspace$ python
Python 3.5.2 (default, Nov 23 2017, 16:37:01) [GCC 5.4.0 20160609] on linux
>>> import keras
Using TensorFlow backend.
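
A hand-edited keras.json is easy to break with a stray comma. The sketch below is one way to sanity-check it, assuming the four keys shown in the configuration above are the ones you expect; the function name is hypothetical.

```python
import json
import os

# The four keys from the keras.json shown above.
EXPECTED_KEYS = {"image_data_format", "backend", "epsilon", "floatx"}


def check_keras_config(text):
    """Parse keras.json text and list any expected keys that are missing."""
    config = json.loads(text)  # raises ValueError on malformed JSON
    missing = sorted(EXPECTED_KEYS - set(config))
    return config, missing


if __name__ == "__main__":
    path = os.path.expanduser("~/.keras/keras.json")
    if os.path.exists(path):
        with open(path) as handle:
            config, missing = check_keras_config(handle.read())
        print("backend:", config.get("backend"), "| missing keys:", missing)
```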





Compile and Setup OpenCV 3.4.x on Ubuntu 18.04 LTS with Python Virtualenv for Image processing with Ceres, VTK, PCL

OpenCV: Open Source Computer Vision Library


Documentation: https://docs.opencv.org/3.4.2/

OpenCV Source: https://github.com/opencv/opencv


A. Setup an external HDD/SSD for this setup


B. Environment (Ubuntu 18.04 LTS)


Python3 setup

Install the needed packages in a Python virtualenv. Refer to the similar Windows Anaconda setup, or look at the Ubuntu-based info here.

sudo apt-get install build-essential cmake unzip pkg-config 
sudo apt-get install ubuntu-restricted-extras
sudo apt-get install python3-dev python3-numpy
sudo apt-get install git python3-pip virtualenv
sudo pip3 install virtualenv
rahul@karma:~$ virtualenv -p /usr/bin/python3 cv3
Already using interpreter /usr/bin/python3
Using base prefix '/usr'
New python executable in /home/rahul/cv3/bin/python3
Also creating executable in /home/rahul/cv3/bin/python
Installing setuptools, pkg_resources, pip, wheel...

Activate and Deactivate the python Environment

rahul@karma:~$ source ~/cv3/bin/activate
(cv3) rahul@karma:~$ python
Python 3.6.5 (default, Apr 1 2018, 05:46:30) 
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> print("Test",4*5)
Test 20
>>> exit()
(cv3) rahul@karma:~$ deactivate

Alternatively, a great way to use virtualenv is through virtualenvwrapper.

sudo pip3 install virtualenv virtualenvwrapper

Add these to your ~/.bashrc file

# virtualenv and virtualenvwrapper
export WORKON_HOME=$HOME/.virtualenvs
export VIRTUALENVWRAPPER_PYTHON=/usr/bin/python3
source /usr/local/bin/virtualenvwrapper.sh

Now, run “source ~/.bashrc” to set the environment

Create a Virtual environment
rahul@karma:~$ mkvirtualenv cv3 -p python3
Already using interpreter /usr/bin/python3
Using base prefix '/usr'
New python executable in /home/rahul/.virtualenvs/cv3/bin/python3
Also creating executable in /home/rahul/.virtualenvs/cv3/bin/python
Installing setuptools, pkg_resources, pip, wheel...done.
virtualenvwrapper.user_scripts creating /home/rahul/.virtualenvs/cv3/bin/predeactivate
virtualenvwrapper.user_scripts creating /home/rahul/.virtualenvs/cv3/bin/postdeactivate
virtualenvwrapper.user_scripts creating /home/rahul/.virtualenvs/cv3/bin/preactivate
virtualenvwrapper.user_scripts creating /home/rahul/.virtualenvs/cv3/bin/postactivate
virtualenvwrapper.user_scripts creating /home/rahul/.virtualenvs/cv3/bin/get_env_details
(cv3) rahul@karma:~$
Activate/Deactivate virtual env
rahul@karma:~$ workon cv3
(cv3) rahul@karma:~$ deactivate 
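
When several shells are open it is easy to lose track of whether the environment is active. Here is a small sketch that checks from inside Python, assuming the usual conventions: older virtualenv releases set sys.real_prefix, while venv and newer virtualenv make sys.base_prefix differ from sys.prefix. The function name is hypothetical.

```python
import sys


def in_virtualenv(prefix, base_prefix=None, real_prefix=None):
    """True when the interpreter prefixes indicate an isolated environment."""
    if real_prefix is not None:  # set by older virtualenv releases
        return True
    return base_prefix is not None and base_prefix != prefix


if __name__ == "__main__":
    active = in_virtualenv(sys.prefix,
                           getattr(sys, "base_prefix", None),
                           getattr(sys, "real_prefix", None))
    print("virtualenv active:", active, "->", sys.prefix)
```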

Install basic packages for the vision work.

(cv3) rahul@karma:~$ pip install numpy scipy matplotlib scikit-image scikit-learn ipython

Java installation from this blog

sudo add-apt-repository ppa:linuxuprising/java
sudo apt update
sudo apt install oracle-java10-installer
sudo apt install oracle-java10-set-default
sudo apt-get install ant

GTK support for GUI features, camera support (libv4l), media support (ffmpeg, gstreamer), etc. Additional packages for image formats come mostly from ubuntu-restricted-extras.
sudo apt-get install libjpeg-dev libpng-dev libtiff-dev ffmpeg
sudo apt-get install libavcodec-dev libavformat-dev libswscale-dev libv4l-dev
sudo apt-get install libxvidcore-dev libx264-dev libvorbis-dev
sudo apt-get install libgtk-3-dev ccache imagemagick
sudo apt-get install liblept5 leptonica-progs libleptonica-dev
sudo apt-get install qt5-default libgtk2.0-dev libtbb-dev
sudo apt-get install libatlas-base-dev gfortran libblas-dev liblapack-dev 
sudo apt-get install libdvd-pkg libgstreamer-plugins-base1.0-dev
sudo apt-get install libfaac-dev libmp3lame-dev libtheora-dev
sudo apt-get install libxine2-dev libv4l-dev x264 v4l-utils
sudo apt-get install libopencore-amrnb-dev libopencore-amrwb-dev

# Optional dependencies
sudo apt-get install libprotobuf-dev protobuf-compiler
sudo apt-get install libgoogle-glog-dev libgflags-dev
sudo apt-get install libgphoto2-dev libeigen3-dev libhdf5-dev doxygen


VTK for SFM Modules

SFM setup: https://docs.opencv.org/3.4.2/db/db8/tutorial_sfm_installation.html

sudo apt-get install libxt-dev libglew-dev libsuitesparse-dev
sudo apt-get install tk8.5 tcl8.5 tcl8.5-dev tcl-dev

Ceres-Solver: http://ceres-solver.org/installation.html

# If you want to build Ceres as a *shared* library,
# you must first add the following PPA:
sudo add-apt-repository ppa:bzindovic/suitesparse-bugfix-1319687
sudo apt-get update
sudo apt-get install libsuitesparse-dev
git clone https://ceres-solver.googlesource.com/ceres-solver
cd ceres-solver
mkdir build && cd build
export CXXFLAGS="-std=c++11" 
cmake ..
make -j4
make test
sudo make install


sudo apt-get install libblas-dev libblas-doc liblapacke-dev liblapack-doc


VTK Setup, https://gitlab.kitware.com/vtk/vtk.git

Configure and build with QT support

git clone git://vtk.org/VTK.git VTK
cd VTK
mkdir VTK-build
cd VTK-build
cmake -DVTK_Group_Qt:BOOL=ON ..
make -j4
sudo make install
$ cp -r ~/cv/VTK/VTK-build/lib/python3.6/site-packages/* ~/.virtualenvs/cv3/lib/python3.6/site-packages/
$ export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/lib:/usr/local/lib"
$ sudo ldconfig


FLANN (a PCL dependency), built from the 1.8.4 source archive

cd flann-1.8.4-src/ && mkdir build && cd build
cmake ..
make -j4 
sudo make install


PCL download: http://www.pointclouds.org/downloads/linux.html

sudo apt-get install libusb-1.0-0-dev libusb-dev libudev-dev
sudo apt-get install mpi-default-dev openmpi-bin openmpi-common 
sudo apt-get install libboost-all-dev libpcap-dev
sudo apt-get install libqhull* libgtest-dev
sudo apt-get install freeglut3-dev pkg-config
sudo apt-get install libxmu-dev libxi-dev
sudo apt-get install mono-complete
sudo apt-get install openjdk-8-jdk openjdk-8-jre
git clone https://github.com/PointCloudLibrary/pcl 
# https://github.com/PointCloudLibrary/pcl/archive/pcl-1.8.1.tar.gz 
cd pcl && mkdir build && cd build 
CXXFLAGS="-std=gnu++11" cmake -DBUILD_apps=ON \
 -DBUILD_apps_point_cloud_editor=ON \
 -DBUILD_apps_cloud_composer=ON \
 -DBUILD_apps_modeler=ON \
 -DBUILD_apps_3d_rec_framework=ON \
 -DBUILD_examples=ON ..
make -j8 
sudo make install

Official OpenCV installation

wget -O opencv.zip https://github.com/opencv/opencv/archive/3.4.2.zip
wget -O opencv_contrib.zip https://github.com/opencv/opencv_contrib/archive/3.4.2.zip
unzip opencv.zip
unzip opencv_contrib.zip
Packages needed for OpenCV

Configure OpenCV with CMake
$ cd ~/cv/opencv-3.4.2 && mkdir build && cd build
$ cmake -D BUILD_opencv_java=OFF \
-D OPENCV_EXTRA_MODULES_PATH=~/cv/opencv_contrib-3.4.2/modules \
-D PYTHON_EXECUTABLE=~/.virtualenvs/cv3/bin/python ..

Make sure the Python 3 interpreter and other dependencies are configured correctly.

Compiling with CUDA (Setup instructions)

Add -D WITH_CUDA=ON to the options used above:

$ cmake -D WITH_CUDA=ON \
-D OPENCV_EXTRA_MODULES_PATH=~/cv/opencv_contrib-3.4.2/modules \
-D PYTHON_EXECUTABLE=~/.virtualenvs/cv3/bin/python ..


Compile, Install and Verify
(cv3) rahul@karma:~/cv/opencv-3.4.2/build$ make -j4
$ sudo make install
$ sudo sh -c 'echo "/usr/local/lib" >> /etc/ld.so.conf.d/opencv.conf'
$ sudo ldconfig
$ pkg-config --modversion opencv
Setup the cv shared libraries
(cv3) rahul@karma$ ls -l /usr/local/lib/python3.6/site-packages
total 5172
-rw-r--r-- 1 root staff 5292240 Jul 12 13:32 cv2.cpython-36m-x86_64-linux-gnu.so
# or use the find command 
$ find /usr/local/lib/ -type f -name "cv2*.so"
$ cd /usr/local/lib/python3.6/site-packages/
$ sudo mv cv2.cpython-36m-x86_64-linux-gnu.so cv2.so
$ cd ~/.virtualenvs/cv3/lib/python3.6/site-packages/
$ ln -s /usr/local/lib/python3.6/site-packages/cv2.so cv2.so
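
The find command above can also be written in Python, which is handy if you want to script the symlink step later. A minimal sketch, assuming the bindings were installed somewhere under /usr/local/lib; the function name is hypothetical.

```python
from pathlib import Path


def find_cv2_modules(root="/usr/local/lib"):
    """Return compiled cv2 extension modules found anywhere under root."""
    return sorted(p for p in Path(root).rglob("cv2*.so") if p.is_file())


if __name__ == "__main__":
    for module in find_cv2_modules():
        print(module)
```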

C. Test

(cv3) rahul@karma:~$ python
Python 3.6.5 (default, Apr 1 2018, 05:46:30) 
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information
>>> import cv2
>>> cv2.__version__
>>> exit()
(cv3) rahul@karma:~$


Anaconda for your Image Processing, Machine Learning, Neural Networks, Computer Vision development environment using VS Code

Python is a great language, and I will not go into explaining why. Here is a brief setup for your development environment in case you are tinkering with computer vision problems and want to learn neural networks on your Windows laptop.

Anaconda3 5.0

64 bit Download: https://www.anaconda.com/download

Install Anaconda with the default options.

  • Anaconda Navigator is a great place to look at your environment and activate them as per your need.
  • In case you want to have a Python 2x and 3x environment side by side, then you can create them in navigator. Here I have a base(root) setup with Python 3.6 and an additional Python 2.7 environment.
  • To use a particular environment, click on it in the navigator or go to the Anaconda prompt and execute the following command:
(base)C:\Users\Karma>activate Py27
  • To deactivate, use:
(Py27)C:\Users\Karma>deactivate
  • To create a new environment, use the following command:
(base)C:\Users\Karma>conda create -n Py27 python=2.7 anaconda


Whenever you want to use a particular environment, just go to the environments section and activate it. This will set up your Python with the packages and versions configured in that environment. In the screenshot above I have TensorFlow in my base environment, although it's always better to have a separate environment for it.

In case you are using Cmder like me then go for this:

Depending on where you installed Anaconda, run one of the following two activation commands:
> C:\Anaconda3\Scripts\activate.bat C:\Anaconda3
> C:\Users\Karma\Anaconda3\Scripts\activate.bat C:\Users\Karma\Anaconda3
> conda info --envs
> conda activate py27
> conda deactivate
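
If you want to pick an environment from a script, the text printed by conda info --envs can be parsed. This is a rough sketch that assumes the usual layout (comment lines starting with #, then one "name [*] path" row per environment); the function name is hypothetical and rows that list only a path are skipped.

```python
def parse_conda_envs(output):
    """Return (name, path, is_active) tuples from `conda info --envs` text."""
    envs = []
    for line in output.splitlines():
        parts = line.split()
        if not parts or parts[0].startswith("#"):
            continue  # blank line or header comment
        if len(parts) < 2:
            continue  # unnamed environment listed by path only
        active = parts[1] == "*"
        path = " ".join(parts[2:] if active else parts[1:])
        envs.append((parts[0], path, active))
    return envs
```

The active environment is the row marked with an asterisk.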

Let's use the package manager “conda” for the setup.

Run the following installation commands in the Anaconda Command Prompt, which opens with a prompt such as (C:\Anaconda3) C:\Users\Karma>:

In order to find packages, you should look at the Anaconda repository ( https://anaconda.org/anaconda/repo )

# Adding the menpo channels and install opencv
conda install -c https://conda.binstar.org/menpo opencv
conda config --add channels menpo
conda install -c menpo opencv

# or directly use conda-forge
conda install -c conda-forge opencv

# Install packages
conda install numpy
conda install scipy
conda install matplotlib

# List packages
conda list


If the OpenCV installation did not go through, we can use the pre-built Windows binaries maintained by Christoph Gohlke at https://www.lfd.uci.edu/~gohlke/pythonlibs/#opencv

Download the wheel file that matches your Python version and install it with pip. You can remove these modules later by using “pip uninstall <package>”.

(base)λ pip install opencv_python-3.4.0-cp36-cp36m-win_amd64.whl
Processing c:\users\karma\downloads\opencv_python-3.4.0-cp36-cp36m-win_amd64.whl
Installing collected packages: opencv-python
Successfully installed opencv-python-3.4.0
(base)λ pip install opencv_python-3.4.0+contrib-cp36-cp36m-win_amd64.whl
Processing c:\users\karma\downloads\opencv_python-3.4.0+contrib-cp36-cp36m-win_amd64.whl
Installing collected packages: opencv-python
Successfully installed opencv-python-3.4.0+contrib
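
The cp36 part of the wheel name has to match your interpreter, otherwise pip refuses the file. The sketch below splits a wheel filename into its fields, assuming the standard name-version-python-abi-platform layout without an optional build tag; the helper names are hypothetical.

```python
import sys


def parse_wheel_filename(filename):
    """Split a wheel filename into its five dash-separated fields."""
    stem = filename[:-len(".whl")] if filename.endswith(".whl") else filename
    name, version, python_tag, abi_tag, platform_tag = stem.split("-")
    return {"name": name, "version": version, "python": python_tag,
            "abi": abi_tag, "platform": platform_tag}


def matches_interpreter(filename, version_info=sys.version_info):
    """True when the wheel's CPython tag equals the given interpreter version."""
    expected = "cp{}{}".format(version_info[0], version_info[1])
    return parse_wheel_filename(filename)["python"] == expected
```

For example, matches_interpreter("opencv_python-3.4.0-cp36-cp36m-win_amd64.whl") is only true when run under Python 3.6.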

In my case I used SIFT and SURF implementations which were made available in the contrib packages.

Now that we have the packages set up, let's test them in the Python interpreter.
Use the following commands on the Python CLI.

import numpy as np
import cv2


TensorFlow installation instructions: https://www.tensorflow.org/install/install_windows

To install this package with conda run:
conda install -c conda-forge tensorflow

Version changes based on the repository you are trying to download from.

I typically use VS Code but if you like smooth scrolling go for Sublime.

In VS Code I use ms-python.python, tht13.python extensions to simplify my workspace.


Debugging is critical when working with any kind of code, so here is some configuration to get you started.

  • Verify that the workspace settings.json file has the right Python path
"python.pythonPath": "C:\\Anaconda3\\python.exe"
  • Add a launch.json in your project .vscode folder with the following values
{
   "name": "Python",
   "type": "python",
   "pythonPath": "${config:python.pythonPath}",
   "request": "launch",
   "stopOnEntry": true,
   "console": "none",
   "program": "${file}",
   "cwd": "${workspaceFolder}",
   "debugOptions": [
      "WaitOnAbnormalExit",
      "WaitOnNormalExit",
      "RedirectOutput"
   ]
}
This will get you set up for debugging. Here is how the debug interface looks once you have set breakpoints and stepped through the code.
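
If you set up several projects, the launch.json can be generated instead of typed. A minimal sketch, assuming VS Code's usual file layout with a "version" field and a "configurations" list wrapped around the values shown above; the option values are copied from this post and may differ for newer versions of the Python extension, and the function names are hypothetical.

```python
import json


def build_launch_config(python_path_setting="${config:python.pythonPath}"):
    """Build a launch.json structure around the debug options from this post."""
    config = {
        "name": "Python",
        "type": "python",
        "pythonPath": python_path_setting,
        "request": "launch",
        "stopOnEntry": True,
        "console": "none",
        "program": "${file}",
        "cwd": "${workspaceFolder}",
        "debugOptions": ["WaitOnAbnormalExit", "WaitOnNormalExit", "RedirectOutput"],
    }
    return {"version": "0.2.0", "configurations": [config]}


def render_launch_json():
    """Serialize the configuration as indented JSON text."""
    return json.dumps(build_launch_config(), indent=4)


if __name__ == "__main__":
    print(render_launch_json())
```

Redirect the output to .vscode/launch.json in the project folder.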

Good Luck.