Jetson AGX Xavier Developer Kit Setup for Deep Learning (TensorFlow, PyTorch and JupyterLab) with JetPack 4.x SDK

NVIDIA Jetson AGX Xavier Developer Kit
https://developer.nvidia.com/embedded/jetson-agx-xavier-developer-kit

There are many examples available at https://github.com/NVIDIA-AI-IOT/tf_trt_models that can be used to create a custom detector. However, here we look at a basic development environment to get started with TensorFlow, PyTorch and JupyterLab on the device.

https://developer.nvidia.com/embedded/twodaystoademo

Setup the Jetson AGX Xavier Developer Kit

  1. Use the NVIDIA SDK Manager for flashing (https://docs.nvidia.com/sdk-manager/install-with-sdkm-jetson/index.html)
  2. Use the SDK Manager to also install JetPack and other components
  3. Install the Intel Wireless-AC 8265 card for Wi-Fi and Bluetooth (https://www.jetsonhacks.com/2019/04/08/jetson-nano-intel-wifi-and-bluetooth/)
  4. Install M.2 NVMe SSD storage (https://www.jetsonhacks.com/2018/10/18/install-nvme-ssd-on-nvidia-jetson-agx-developer-kit/)
  5. Move the rootfs to the SSD (https://github.com/jetsonhacks/rootOnNVMe)
  6. Jetson Zoo reference: https://elinux.org/Jetson_Zoo

With this we have the AGX Xavier kit running JetPack 4.4 DP with the rootfs on the SSD and internet access via the Intel 8265 Wi-Fi card.

Jetson family of products (we are looking at the AGX Xavier)

Deep Learning Environment/Framework Setup

  1. Set up virtualenvwrapper so that each framework gets its own Python environment
mkvirtualenv <environment_name> -p python3
  2. Install TensorFlow 1.15 and 2.1 with Python 3.6 and JetPack 4.4 DP
    (https://docs.nvidia.com/deeplearning/frameworks/install-tf-jetson-platform/index.html)
sudo apt-get install libhdf5-serial-dev hdf5-tools libhdf5-dev zlib1g-dev zip libjpeg8-dev liblapack-dev libblas-dev gfortran
sudo apt-get install python3-pip
pip3 install -U pip
pip3 install -U pip testresources setuptools numpy==1.16.1 future==0.17.1 mock==3.0.5 h5py==2.9.0 keras_preprocessing==1.0.5 keras_applications==1.0.8 gast==0.2.2 futures protobuf pybind11

Make sure you install the Python packages above in each of the TensorFlow 1.15 and 2.1 virtual environments (tf1 and tf2) that we create next.

https://forums.developer.nvidia.com/t/official-tensorflow-for-jetson-agx-xavier

a.) Create a virtual environment for the TensorFlow 1.15 installation
mkvirtualenv tf1 -p python3

# TF-1.15
pip3 install --pre --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v44 'tensorflow<2'

b.) Create a virtual environment for the TensorFlow 2.1.0 installation
mkvirtualenv tf2 -p python3

# TF-2.x
pip3 install --pre --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v44 tensorflow

c.) Test the installation with MNIST LeNet for both TensorFlow versions (https://forums.developer.nvidia.com/t/problem-to-install-tensorflow-on-xavier-solved/64991/11)
Make sure to upgrade Keras to 2.2.4 (pip install keras==2.2.4)
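
Before the full MNIST run, a quick GPU visibility check helps; a minimal sketch (run it inside each environment, it handles both the TF 1.x and 2.x APIs):

workon tf1    # repeat with tf2
python3 << 'EOF'
# Print the TF version and whether a GPU is visible
import tensorflow as tf
print("TensorFlow:", tf.__version__)
if tf.__version__.startswith('1.'):
    print("GPU available:", tf.test.is_gpu_available())
else:
    print("GPUs:", tf.config.list_physical_devices('GPU'))
EOF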


3. Install PyTorch 1.5 in a virtualenv (https://forums.developer.nvidia.com/t/pytorch-for-jetson-nano-version-1-5-0-now-available)
mkvirtualenv tor -p python3

# Python 3.6 and Jetpack 4.4 DP
wget https://nvidia.box.com/shared/static/3ibazbiwtkl181n95n9em3wtrca7tdzp.whl -O torch-1.5.0-cp36-cp36m-linux_aarch64.whl
sudo apt-get install python3-pip libopenblas-base libopenmpi-dev 
pip3 install Cython testresources setuptools pybind11
pip3 install numpy torch-1.5.0-cp36-cp36m-linux_aarch64.whl

Select the version of torchvision to download depending on the version of PyTorch that you have installed:

PyTorch v1.0 - torchvision v0.2.2
PyTorch v1.1 - torchvision v0.3.0
PyTorch v1.2 - torchvision v0.4.0
PyTorch v1.3 - torchvision v0.4.2
PyTorch v1.4 - torchvision v0.5.0
PyTorch v1.5 - torchvision v0.6.0  <---- Selected for Installation 

Install torchvision

sudo apt-get install libjpeg-dev zlib1g-dev
git clone --branch v0.6.0 https://github.com/pytorch/vision torchvision   # see above for version of torchvision to download
cd torchvision
python setup.py install
cd ../  # attempting to load torchvision from build dir will result in import error

Test the PyTorch installation using MNIST: https://github.com/pytorch/examples/blob/master/mnist/main.py
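
A quick sanity check before the full MNIST example; a sketch (run it from outside the torchvision build directory, as noted above):

workon tor
cd ~    # avoid importing torchvision from its build dir
python3 << 'EOF'
# Confirm PyTorch/torchvision import and that CUDA works end to end
import torch, torchvision
print("PyTorch:", torch.__version__, "| torchvision:", torchvision.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
    x = torch.rand(3, 3, device='cuda')
    print(x @ x.t())    # tiny matmul on the GPU
EOF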


4. Install Jupyter for development across these virtualenv/kernels
Reference: https://jupyterlab.readthedocs.io/en/stable/getting_started/installation.html

a.) Add /home/nv/.local/bin/ to PATH for locally or user-installed packages
export PATH=/home/nv/.local/bin:$PATH

b.) Install Jupyterlab and add kernel from virtualenv path

python3 -m pip install jupyterlab ipykernel
python3 -m jupyter --version

Set up the kernel spec file for the virtual environment created at ~/.virtualenvs/tor/

python3 -m ipykernel install --user --name=tor 

Installed kernelspec tor in ~/.local/share/jupyter/kernels/tor
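
Alternatively, a sketch that avoids the manual kernel.json edit in the next step: registering each kernel with the environment's own interpreter writes the correct argv automatically:

for env in tf1 tf2 tor; do
    ~/.virtualenvs/$env/bin/python -m pip install ipykernel
    ~/.virtualenvs/$env/bin/python -m ipykernel install --user --name=$env
done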

c.) Edit the kernel.json file in the env's kernelspec folder and change the default argv from "/usr/bin/python3" to "/home/nv/.virtualenvs/tor/bin/python":

 "argv": [
  "/home/nv/.virtualenvs/tor/bin/python",
  "-m",
  "ipykernel_launcher",
  "-f",
  "{connection_file}"
 ],
 "display_name": "tor",
 "language": "python"
}

Verify the available kernels using kernelspec for each Python virtual environment. We have three in the current setup, corresponding to the TensorFlow 1.15, TensorFlow 2.1 and PyTorch 1.5 installations (plus the default python3 kernel).

nv@agx$ python3 -m jupyter kernelspec list
Available kernels:
  python3    /home/nv/.local/share/jupyter/kernels/python3
  tf1        /home/nv/.local/share/jupyter/kernels/tf1
  tf2        /home/nv/.local/share/jupyter/kernels/tf2
  tor        /home/nv/.local/share/jupyter/kernels/tor

Start the JupyterLab server and select the "tor" kernel:

python3 -m jupyter lab --allow-root --ip=0.0.0.0 --no-browser

Tested with the PyTorch kernel from the "tor" virtual environment and the MNIST PyTorch code at https://github.com/pytorch/examples/blob/master/mnist/main.py

Now you can try all the great resources on NVIDIA’s web page https://developer.nvidia.com/embedded/twodaystoademo


Thanks to JetsonHacks for all the great reference tutorials

@Jetsonhacks

Nvidia-Docker containers for your JupyterLab-based TensorFlow-GPU environment with a Mask R-CNN example, on Ubuntu 18.04 LTS

This will set up a convenient TensorFlow-based deep learning development environment on NVIDIA cards using nvidia-docker containers. The work can happen in JupyterLab.

Github: https://github.com/vishwakarmarhl/dl-lab-docker

 

Overview

  1. Setup Python
  2. Setup Docker
  3. Setup Nvidia-Docker
  4. Create Deep learning container
  5. Run Mask R-CNN example in container
  6. Manage containers using Portainer

References

 

1. Setup Python

We need a virtual environment to work in, and for that we use virtualenvwrapper.

sudo apt-get install -y build-essential cmake unzip pkg-config ubuntu-restricted-extras git python3-dev python3-pip python3-numpy
sudo apt-get install -y freeglut3 freeglut3-dev libxi-dev libxmu-dev
sudo pip3 install virtualenv virtualenvwrapper

Edit the ~/.bashrc file, add the following entries, and source it:

# virtualenv and virtualenvwrapper
export WORKON_HOME=$HOME/.virtualenvs
export VIRTUALENVWRAPPER_PYTHON=/usr/bin/python3
source /usr/local/bin/virtualenvwrapper.sh

Create a virtual environment and install packages into it:

mkvirtualenv cv3 -p python3
workon cv3
pip install numpy scipy scikit-image scikit-learn
pip install imutils pyzmq ipython matplotlib imgaug

More on this is available in the post "Compile and Setup OpenCV 3.4.x on Ubuntu 18.04 LTS with Python Virtualenv for Image processing with Ceres, VTK, PCL".

 

2. Setup Docker

Install docker-ce for Ubuntu 18.04, keeping in mind its compatibility with the nvidia-docker installation that comes next.

The repository setup is critical here; follow the instructions in the official installation guide at https://docs.docker.com/install/linux/docker-ce/ubuntu/

Then install docker-ce:

sudo apt-get install docker-ce=5:19.03.2~3-0~ubuntu-bionic 
sudo apt-get install docker-ce-cli=5:19.03.2~3-0~ubuntu-bionic 
sudo apt-get install containerd.io
sudo usermod -aG docker $USER
sudo systemctl enable docker 

Reboot the machine now and run "docker run hello-world" as a test.

 

3. Setup Nvidia-Docker

Install the nvidia-docker runtime, which allows containers to access the GPU hardware. Docker 19.03 or later is needed for the --gpus flag.

# Add the package repositories
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker

Test the runtime by running nvidia-smi in an official CUDA container:

# Test nvidia-smi with an official CUDA base image
docker run --gpus all nvidia/cuda:9.0-base nvidia-smi
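
As a further check, an optional sketch (assuming the public tensorflow/tensorflow:latest-gpu image) confirms that a framework container can also see the GPU:

docker run --gpus all --rm tensorflow/tensorflow:latest-gpu \
       python -c "import tensorflow as tf; print(tf.test.is_gpu_available())"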

You can also configure Docker to use the NVIDIA runtime by default. Edit /etc/docker/daemon.json (for example with sudo vim /etc/docker/daemon.json) so that it looks like the following:

{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
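
A quick way to confirm the change took effect (the exact output format varies by Docker version):

sudo systemctl restart docker
docker info | grep -i runtime
# "nvidia" should appear among the runtimes and as the default runtime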

 

Advanced instructions for moving your docker image storage to a different location

Reference: https://forums.docker.com/t/how-do-i-change-the-docker-image-installation-directory/1169

Ubuntu/Debian: edit your /etc/default/docker file with the -g option: DOCKER_OPTS="--dns 8.8.8.8 --dns 8.8.4.4 -g /mnt"
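
On newer Docker releases the -g flag is deprecated in favor of the "data-root" key in /etc/docker/daemon.json. A minimal sketch (the /mnt/docker path is an assumption; substitute your own mount point):

{
    "data-root": "/mnt/docker"
}

Restart Docker afterwards with sudo systemctl restart docker.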

4. Create a Deep Learning Container

Use the Dockerfile and example in the dl-lab-docker repository:

git clone https://github.com/vishwakarmarhl/dl-lab-docker.git
cd dl-lab-docker

Now let's build an image and run it locally. Make sure nvidia-docker is installed and the default runtime is nvidia.

docker build -t dl-lab-docker:latest . -f Dockerfile.dl-lab.xenial

I have a prebuilt Docker image containing tensorflow-gpu==1.5.0 with CUDA 9.0 and cuDNN 7.0.5, which can be run as below:

docker run --gpus all -it --ipc=host -p 8888:8888 \
           dl-lab-docker:latest

Finally, access the JupyterLab page at http://127.0.0.1:8888/?token=31dcb0f9e (the exact token is printed in the container logs).

This Docker image is also available on Docker Hub as vishwakarmarhl/dl-lab-docker, so instead of building it you can pull and use the image directly.

Oh, and by the way: since this is a development environment, do not rely on the code provided in the container. Mount your own development folders instead:

docker run -it --ipc=host -v $(pwd):/module/host_workspace \
           -p 8888:8888 vishwakarmarhl/dl-lab-docker:latest


 

5. Container with Mask R-CNN in Jupyter Lab

The code directory contains a Dockerfile to make it easy to get up and running with TensorFlow via Docker.

Navigate to the mrcnn codebase as below and open up masker.ipynb.


Below is an example run of the Mask R-CNN model taken from https://github.com/matterport/Mask_RCNN


 

6. Manage Containers using Portainer

Since we are dealing with Docker containers, the number of containers quickly becomes messy in a development environment. We will deal with this using the Portainer management interface:

docker volume create portainer_data 
docker run -d -p 8000:8000 -p 9000:9000 -v /var/run/docker.sock:/var/run/docker.sock -v portainer_data:/data portainer/portainer


The Portainer interface should now be available at http://localhost:9000/#/home

 

Finally, you can take a look at the images in the Docker Hub (dl-lab-docker) repository, which is built automatically from the GitHub Dockerfile.

 

 

 

TensorFlow-GPU setup with cuDNN and NVIDIA CUDA 9.0 on Ubuntu 18.04 LTS

Prerequisite: CUDA should be installed on a machine with an NVIDIA graphics card.

 

CUDA Setup

Driver and CUDA toolkit installation is described in a previous blogpost, with a slight change since this TensorFlow setup requires CUDA toolkit 9.0:

# Clean CUDA 9.1 and install 9.0
$ sudo /usr/local/cuda/bin/uninstall_cuda_9.1.pl
$ sudo rm -rf /usr/local/cuda-9.1
$ sudo ./cuda_9.0.176_384.81_linux.run --override

# Make sure environment variables are set for test
$ source ~/.bashrc 
$ sudo ln -s /usr/bin/gcc-6 /usr/local/cuda/bin/gcc
$ sudo ln -s /usr/bin/g++-6 /usr/local/cuda/bin/g++
$ cd ~/NVIDIA_CUDA-9.0_Samples/
$ make -j12
$ ./deviceQuery

Test Successful

cuDNN Setup

Referenced from a medium blogpost.

The following steps are pretty much the same as the installation guide using .deb files (strange that the cuDNN guide is better than the CUDA one).


  1. Go to the cuDNN download page (need registration) and select the latest cuDNN 7.1.* version made for CUDA 9.0.
  2. Download all 3 .deb files: the runtime library, the developer library, and the code samples library for Ubuntu 16.04.
  3. In your download folder, install them in the same order:
# (the runtime library)
$ sudo dpkg -i libcudnn7_7.1.4.18-1+cuda9.0_amd64.deb
# (the developer library)
$ sudo dpkg -i libcudnn7-dev_7.1.4.18-1+cuda9.0_amd64.deb
# (the code samples)
$ sudo dpkg -i libcudnn7-doc_7.1.4.18-1+cuda9.0_amd64.deb

# to remove them later, if needed (reverse order)
$ sudo dpkg -r libcudnn7-doc libcudnn7-dev libcudnn7

Now, we can verify the cuDNN installation (below is just the official guide, which surprisingly works out of the box):

  1. Copy the code samples somewhere you have write access: cp -r /usr/src/cudnn_samples_v7/ ~/
  2. Go to the MNIST example code: cd ~/cudnn_samples_v7/mnistCUDNN.
  3. Compile the MNIST example: make clean && make -j4
  4. Run the MNIST example: ./mnistCUDNN. If your installation is successful, you should see Test passed! at the end of the output.
(cv3) rahul@Windspect:~/cv/cudnn_samples_v7/mnistCUDNN$ ./mnistCUDNN
cudnnGetVersion() : 7104 , CUDNN_VERSION from cudnn.h : 7104 (7.1.4)
Host compiler version : GCC 5.4.0
There are 2 CUDA capable devices on your machine :
device 0 : sms 28  Capabilities 6.1, SmClock 1582.0 Mhz, MemSize (Mb) 11172, MemClock 5505.0 Mhz, Ecc=0, boardGroupID=0
device 1 : sms 28  Capabilities 6.1, SmClock 1582.0 Mhz, MemSize (Mb) 11163, MemClock 5505.0 Mhz, Ecc=0, boardGroupID=1
Using device 0

...

Result of classification: 1 3 5
Test passed!

In case of compilation error

Error

/usr/local/cuda/include/cuda_runtime_api.h:1683:101: error: use of enum ‘cudaDeviceP2PAttr’ without previous declaration
extern __host__ __cudart_builtin__ cudaError_t CUDARTAPI cudaDeviceGetP2PAttribute(int *value, enum cudaDeviceP2PAttr attr, int srcDevice, int dstDevice);
/usr/local/cuda/include/cuda_runtime_api.h:2930:102: error: use of enum ‘cudaFuncAttribute’ without previous declaration
 extern __host__ __cudart_builtin__ cudaError_t CUDARTAPI cudaFuncSetAttribute(const void *func, enum cudaFuncAttribute attr, int value);
                                                                                                      ^
In file included from /usr/local/cuda/include/channel_descriptor.h:62:0,
                 from /usr/local/cuda/include/cuda_runtime.h:90,
                 from /usr/include/cudnn.h:64,
                 from mnistCUDNN.cpp:30:

Solution: sudo vim /usr/include/cudnn.h

replace the line '#include "driver_types.h"' 
with '#include <driver_types.h>'
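
The same fix as a one-liner, equivalent to the manual edit above:

sudo sed -i 's|#include "driver_types.h"|#include <driver_types.h>|' /usr/include/cudnn.h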

 

Configure the CUDA & cuDNN Environment Variables

# CUPTI libraries are at /usr/local/cuda/extras/CUPTI/lib64
export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/local/cuda-9.0/lib64 
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/local/cuda-9.0/lib 
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/local/cuda/extras/CUPTI/lib64

source ~/.bashrc

TensorFlow installation

The Python environment is set up using a virtualenv located at /opt/pyenv/cv3

$ source /opt/pyenv/cv3/bin/activate
$ pip install numpy scipy matplotlib 
$ pip install scikit-image scikit-learn ipython

Referenced from the official TensorFlow guide:

$ pip install --upgrade tensorflow        # for Python 2.7
$ pip3 install --upgrade tensorflow       # for Python 3.n
$ pip install --upgrade tensorflow-gpu    # for Python 2.7 and GPU
$ pip3 install --upgrade tensorflow-gpu==1.5 # for Python 3.n and GPU

# remove tensorflow
$ pip3 uninstall tensorflow-gpu

Now, run a test

(cv3) rahul@Windspect:~$ python
Python 3.5.2 (default, Nov 23 2017, 16:37:01)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()
2018-08-14 18:03:45.024181: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-08-14 18:03:45.261898: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1405] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:03:00.0
totalMemory: 10.91GiB freeMemory: 10.75GiB
2018-08-14 18:03:45.435881: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1405] Found device 1 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:04:00.0
totalMemory: 10.90GiB freeMemory: 10.10GiB
2018-08-14 18:03:45.437318: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1484] Adding visible gpu devices: 0, 1
2018-08-14 18:03:46.100062: I tensorflow/core/common_runtime/gpu/gpu_device.cc:965] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-08-14 18:03:46.100098: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0 1
2018-08-14 18:03:46.100108: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] 0: N Y
2018-08-14 18:03:46.100114: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] 1: Y N
2018-08-14 18:03:46.100718: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1097] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10398 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:03:00.0, compute capability: 6.1)
2018-08-14 18:03:46.262683: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1097] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 9769 MB memory) -> physical GPU (device: 1, name: GeForce GTX 1080 Ti, pci bus id: 0000:04:00.0, compute capability: 6.1)
>>> print(sess.run(hello))
b'Hello, TensorFlow!'

Looks like it is able to discover and use the NVIDIA GPUs.

KERAS

Now add keras to the system

pip install pillow h5py keras autopep8

Edit the configuration: vim ~/.keras/keras.json

{
    "image_data_format": "channels_last",
    "backend": "tensorflow",
    "epsilon": 1e-07,
    "floatx": "float32"
}

A test for Keras would look like this at the Python CLI:

(cv3) rahul@Windspect:~/workspace$ python
Python 3.5.2 (default, Nov 23 2017, 16:37:01) [GCC 5.4.0 20160609] on linux
>>> import keras
Using TensorFlow backend.
>>>
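
Beyond the import check, a slightly fuller smoke test; a sketch using random data, just to exercise the TensorFlow backend end to end:

python << 'EOF'
# Train a tiny dense model on random data for one epoch
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
model = Sequential([Dense(16, activation='relu', input_shape=(8,)), Dense(1)])
model.compile(optimizer='adam', loss='mse')
model.fit(np.random.rand(64, 8), np.random.rand(64, 1), epochs=1, batch_size=16)
EOF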

 

END.

 

Quick Apt Repository way – NVIDIA CUDA 9.x on Ubuntu 18.04 LTS installation

The same NVIDIA CUDA 9.1 setup on Ubuntu 18.04 LTS, this time using the apt repository. This approach works and is simpler to deal with. Reference is taken from this askubuntu discussion.

Look up the solution to the Nouveau issue from this blogpost.

sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
sudo ubuntu-drivers autoinstall
sudo reboot
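
Once the machine is back up, confirm the driver is active:

nvidia-smi
# should print the driver version and list the GPU(s)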

Now install the CUDA toolkit

sudo apt install gcc-6 g++-6
sudo apt install nvidia-cuda-toolkit


Run the installer

root@wind:~/Downloads# ./cuda_9.1.85_387.26_linux --override


Set up the environment variables

# Environment variables
export PATH=/usr/local/cuda-9.1/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/local/cuda-9.1/lib64 
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/local/cuda-9.1/lib

Create soft links for the gcc-6 compiler:

sudo ln -s /usr/bin/gcc-6 /usr/local/cuda/bin/gcc
sudo ln -s /usr/bin/g++-6 /usr/local/cuda/bin/g++
sudo reboot

Test

cd ~/NVIDIA_CUDA-9.1_Samples/
make -j4

Upon completion of the compilation, test using the deviceQuery binary:

$ cd ~/NVIDIA_CUDA-9.1_Samples/bin/x86_64/linux/release
$ ./deviceQuery


# Register the CUDA libraries with the dynamic linker
$ sudo bash -c "echo /usr/local/cuda/lib64/ > /etc/ld.so.conf.d/cuda.conf"
$ sudo ldconfig

DONE

NVIDIA CUDA 9.x on Ubuntu 18.04 LTS installation

Guide

An installation guide to take you through the NVIDIA graphics driver as well as the CUDA toolkit setup on Ubuntu 18.04 LTS.

A. Know your cards

Verify what graphics card you have on your machine

rahul@karma:~$ lspci | grep VGA
04:00.0 VGA compatible controller: 
NVIDIA Corporation GM204 [GeForce GTX 970] (rev a1)
rahul@karma:~$ sudo lshw -C video
 *-display 
 description: VGA compatible controller
 product: GM204 [GeForce GTX 970]
 vendor: NVIDIA Corporation
 physical id: 0
 bus info: pci@0000:04:00.0
 version: a1
 width: 64 bits
 clock: 33MHz
 capabilities: pm msi pciexpress vga_controller bus_master cap_list rom
 configuration: driver=nouveau latency=0
 resources: irq:30 memory:f2000000-f2ffffff memory:e0000000-efffffff memory:f0000000-f1ffffff ioport:2000(size=128) memory:f3080000-f30fffff

Download the right driver

I downloaded version 390.67 for the GeForce GTX 970.


B. Nouveau problem kills your GPU rush

However, there are solutions available.

Here is what worked for me:

  1. Remove all NVIDIA packages (skip this if your system is freshly installed):
    sudo apt-get remove nvidia* && sudo apt autoremove
    
  2. Install some packages for building kernel modules:
    sudo apt-get install dkms build-essential linux-headers-generic
    
  3. Now blacklist and disable the nouveau kernel driver:
    sudo vim /etc/modprobe.d/nvidia-installer-disable-nouveau.conf
    

Insert the following lines into nvidia-installer-disable-nouveau.conf:

blacklist nouveau
blacklist lbm-nouveau
options nouveau modeset=0
alias nouveau off
alias lbm-nouveau off

Save and exit.

  4. Disable the kernel nouveau by typing the following command (nouveau-kms.conf may not exist; that is OK):
    rahul@wind:~$ echo options nouveau modeset=0 | sudo tee -a /etc/modprobe.d/nouveau-kms.conf
    options nouveau modeset=0
    
  5. Rebuild the initramfs:
    rahul@wind:~$ sudo update-initramfs -u
    update-initramfs: Generating /boot/initrd.img-4.15.0-23-generic
    
  6. Reboot, then confirm nouveau is unloaded (see the check below)
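
After the reboot, verify that nouveau is no longer loaded before running the installer:

lsmod | grep nouveau
# no output means nouveau is disabled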
Run the installer in runlevel 3
$ sudo init 3 
$ sudo bash
$ ./NVIDIA-Linux-x86_64-390.67.run

Uninstall

To stop using the driver and uninstall it later:
sudo nvidia-installer --uninstall

C. NVIDIA X Server Settings

Install this from the Ubuntu Software Center.

D. Start the CUDA related setup

We will need the CUDA toolkit 9.1, which supports the GTX 970 (compute capability 5.2). So download the local installer for Ubuntu.


Downloaded the "cuda_9.1.85_387.26_linux.run" local installation file.

$ sudo add-apt-repository ppa:graphics-drivers/ppa
$ sudo apt install nvidia-cuda-toolkit gcc-6

Steps are taken from the CUDA 9.1 official documentation

  1. Perform the pre-installation actions.
  2. Disable the Nouveau drivers. We did this in the driver installation above.
  3. Reboot into text mode (runlevel 3). This can usually be accomplished by adding the number "3" to the end of the system's kernel boot parameters, or by changing the runlevel with 'sudo init 3'.
  4. Verify that the Nouveau drivers are not loaded. If the Nouveau drivers are still loaded, consult your distribution's documentation to see if further steps are needed to disable Nouveau.
  5. Run the installer and follow the on-screen prompts:
$ chmod +x cuda_9.1.85_387.26_linux
rahul@wind:~/Downloads$ ./cuda_9.1.85_387.26_linux --override


Since we already installed the driver above, we say NO to the NVIDIA accelerated graphics driver installation question.


This will install the CUDA stuff in the following locations

  • CUDA Toolkit /usr/local/cuda-9.1
  • CUDA Samples $(HOME)/NVIDIA_CUDA-9.1_Samples

We can verify the graphics card using the nvidia-smi command.


Uninstallation

cd /usr/local/cuda-9.1/bin
sudo ./uninstall_cuda_9.1.pl

 

E. Environment Variables

rahul@wind:~$ vim ~/.bashrc

# Add the following to the environment variables
export PATH=/usr/local/cuda-9.1/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/local/cuda-9.1/lib64 
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/local/cuda-9.1/lib

rahul@wind:~$ source ~/.bashrc
rahul@wind:~$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Tue_Jun_12_23:07:04_CDT_2018
Cuda compilation tools, release 9.1, 

 

F. Test

Ensure you have the right driver versions

rahul@wind:$ cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 390.67 Fri Jun 1 04:04:27 PDT 2018
GCC version: gcc version 7.3.0 (Ubuntu 7.3.0-16ubuntu3)

Change directory to the NVIDIA CUDA Samples and compile them

rahul@wind:~/NVIDIA_CUDA-9.1_Samples$ make

Now run the device query test

rahul@wind:~/NVIDIA_CUDA-9.1_Samples/bin/x86_64/linux/release$ ./deviceQuery
./deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

 

END