Install Tensorflow 1.9.0 with GPU from source on Ubuntu 18.04 Bionic Beaver

Dmytro Kisil
9 min readJun 28, 2018

Note: This is updated version of my previous article. Differences are in updated bazel and Tensorflow version (check step 12). CUDA, NCCL and cuDNN not changed. All other information remained nearly the same.

You prepared some snacks and ready for this long process? Okay, I wish you good luck and… let`s go!

Step 1: Update and Upgrade your system:

sudo apt-get update 
sudo apt-get upgrade

Step 2: Verify You Have a CUDA-Capable GPU:

lspci | grep -i nvidia

If you do not see any settings, update the PCI hardware database that Linux maintains by entering update-pciids (generally found in /sbin) at the command line and rerun the previous lspci command.

Step 3: Verify You Have a Supported Version of Linux:

To determine which distribution and release number you’re running, type the following at the command line:

uname -m && cat /etc/*release

The x86_64 line indicates you are running on a 64-bit system which is supported by cuda 9.1.

Step 4: Install Dependencies:

Required to compile from source:

sudo apt-get install build-essential 
sudo apt-get install cmake git unzip zip
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt-get update
sudo apt-get install python2.7-dev python3.5-dev python3.6-dev pylint

Step 5: Install linux kernel header:

Goto terminal and type:

uname -r

You can get something like “4.15.0–23-generic”. To current install cuda you need to upgrade kernel to 4.16 version:

wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.16/linux-headers-4.16.0-041600_4.16.0-041600.201804012230_all.deb
wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.16/linux-headers-4.16.0-041600-generic_4.16.0-041600.201804012230_amd64.deb
wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.16/linux-image-4.16.0-041600-generic_4.16.0-041600.201804012230_amd64.deb

Once you’ve downloaded all the above kernel files, now install them as follows. Linux-headers will also be installed with this command:

sudo dpkg -i *.deb

Once the installation is complete, reboot your machine and verify that the new kernel version is being used:

uname -sr

You must get something like this:

For now you have a correct version of the kernel

And that’s it. You are now using a much more recent kernel version than the one installed by default with Ubuntu.

Step 6: Install NVIDIA CUDA 9.2:

Remove previous cuda installation(if you installed cuda before):

sudo apt-get purge nvidia*
sudo apt-get autoremove
sudo apt-get autoclean
sudo rm -rf /usr/local/cuda*

Install cuda:

sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1710/x86_64/7fa2af80.pub
echo "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1710/x86_64 /" | sudo tee /etc/apt/sources.list.d/cuda.list
sudo apt-get update
sudo apt-get -o Dpkg::Options::="--force-overwrite" install cuda-9-2 cuda-drivers

On step 6 when execute last command be careful!

Building for 4.15.0-22-genericBuilding for architecture x86_64Building initial module for 4.15.0-22-generic

If this line not update for few minutes, open System Monitor and wait, when the load of the CPU cores will decrease. Don`t type immediately!

Then try this in terminal:

use ESC few times and then type: password + Enter + password + Enter..

If not helped:

use ESC few times and then type: password + Enter + password + Enter..

Be patient with yourself, typed password slowly. After 10 try use ESC and type again.

If you install cuda on a fresh system, you need to type password «blindly » just once. Else be prepared to do this twice: when build kernel and when see this message:

+++writing new private key to ‘/var/lib/shim-signed/mok/MOK.priv’— — -

And you will succeed!

Step 7: Reboot the system to load the NVIDIA drivers.

Step 8: Go to terminal and type:

echo 'export PATH=/usr/local/cuda-9.2/bin${PATH:+:${PATH}}' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-9.2/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}' >> ~/.bashrc
source ~/.bashrc
sudo ldconfig
nvidia-smi

Check driver version probably Driver Version: 396.26

For now if you use nvidia-smi command you get temp for GPU and nothing more (no process found below). And you have low screen resolution because your nvidia-drivers not detect ligament (GPU-monitor).

Yes, I know how hard to work, having 1024x768 resolution on FullHD monitor

But you can fix this with Xorg!

Use this command to create Xorg:

sudo nvidia-xconfig

That`s create a file a file xorg.conf in path: (etc/X11/xorg.conf). To change resolution you need just reboot your system. And you can go to step 9.

If this not helped change this file (xorg.conf). To do this use this command with parameters of your monitor. My command look like this:

cvt 1920x1080 60

Press Enter and you got

Modeline “1920x60_60.00” 9.25 1920 1976 2160 2400 60 63 73 76 -hsync +vsync

Just copy this and open xorg.conf:

sudo -i gedit /etc/X11/xorg.conf

Paste here (instead of modeline). Change HorizSync and VertRefresh too:

Like this

And reboot after it. For now your screen resolution should be the same as before.For now you can type

nvidia-settings

And see this:

Now you can see temperature and other useful information about your GPU.

Step 9: Install cuDNN 7.1.4:

NVIDIA cuDNN is a GPU-accelerated library of primitives for deep neural networks.

Go to https://developer.nvidia.com/cudnn and download Login and agreement required

After login and accepting agreement.

Download the following:

cuDNN v7.1.4 Library for Linux

Goto downloaded folder and in terminal perform following:

tar -xf cudnn-9.2-linux-x64-v7.1.tgz
sudo cp -R cuda/include/* /usr/local/cuda-9.2/include
sudo cp -R cuda/lib64/* /usr/local/cuda-9.2/lib64

Step 10: Install NCCL 2.2.13:

NVIDIA Collective Communications Library (NCCL) implements multi-GPU and multi-node collective communication primitives that are performance optimized for NVIDIA GPUs

Go to https://developer.nvidia.com/nccl and attend survey to download Nvidia NCCL.

Download following after completing survey.

Download NCCL v2.2.13, for CUDA 9.2 -> NCCL 2.2.13 O/S agnostic and CUDA 9.2

Go to downloaded folder and in terminal perform following:

tar -xf nccl_2.2.13-1+cuda9.2_x86_64.txz
cd nccl_2.2.13-1+cuda9.2_x86_64
sudo cp -R * /usr/local/cuda-9.2/targets/x86_64-linux/
sudo ldconfig

Step 11: Install Dependencies

Install libcupti:

sudo apt-get install libcupti-dev
echo 'export LD_LIBRARY_PATH=/usr/local/cuda/extras/CUPTI/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc

Python related:

To install these packages for Python 2.7, issue the following command:

sudo apt-get install python-numpy python-dev python-pip python-wheel

To install these packages for Python 3.n, issue the following command:

sudo apt-get install python3-numpy python3-dev python3-pip python3-wheel

Step 12: Configure Tensorflow from source:

Download bazel(version 0.16):

cd ~/
wget https://github.com/bazelbuild/bazel/releases/download/0.16.0/bazel-0.16.0-installer-linux-x86_64.sh
chmod +x bazel-0.16.0-installer-linux-x86_64.sh
./bazel-0.16.0-installer-linux-x86_64.sh --user
echo 'export PATH="$PATH:$HOME/bin"' >> ~/.bashrc

Reload environment variables

source ~/.bashrc
sudo ldconfig

Start the process of building TensorFlow by downloading latest tensorflow 1.9.0.

cd ~/
git clone https://github.com/tensorflow/tensorflow.git
cd tensorflow
git pull
git checkout r1.9
./configure

Give python path in

Please specify the location of python. [Default is /usr/bin/python]: /usr/bin/python3

Press enter two times

Do you wish to build TensorFlow with jemalloc as malloc support? [Y/n]: Y
Do you wish to build TensorFlow with Google Cloud Platform support? [Y/n]: Y
Do you wish to build TensorFlow with Hadoop File System support? [Y/n]: Y
Do you wish to build TensorFlow with Amazon S3 File System support? [Y/n]: Y
Do you wish to build TensorFlow with Apache Kafka Platform support? [Y/n]: Y
Do you wish to build TensorFlow with XLA JIT support? [y/N]: N
Do you wish to build TensorFlow with GDR support? [y/N]: N
Do you wish to build TensorFlow with VERBS support? [y/N]: N
Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: N
Do you wish to build TensorFlow with CUDA support? [y/N]: Y
Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 9.0]: 9.2
Please specify the location where CUDA 9.2 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: /usr/local/cuda-9.2
Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7.0]: 7.1.4
Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda-9.2]: /usr/local/cuda-9.2
Do you wish to build TensorFlow with TensorRT support? [y/N]: N
Please specify the NCCL version you want to use. [Leave empty to default to NCCL 1.3]: 2.2
Please specify the location where NCCL 2 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda-9.2]: /usr/local/cuda-9.2/targets/x86_64-linux

Now you need compute capability which we have noted at step 1 eg. 6.1

Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 6.1] 6.1Do you want to use clang as CUDA compiler? [y/N]: N
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]: /usr/bin/gcc
Do you wish to build TensorFlow with MPI support? [y/N]: N
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]: -march=native
Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]:N

Configuration finished

Step 13: Build Tensorflow using bazel

The next step in the process to install tensorflow GPU version will be to build tensorflow using bazel. This process takes a fairly long time.

To build a pip package for TensorFlow you would typically invoke the following command:

bazel build --config=opt //tensorflow/tools/pip_package:build_pip_package

Note: if you got error like unsupported platform then make sure you are running correct pip command associated with the python you used while configuring tensorflow build.

add "--config=mkl" if you want Intel MKL support for newer intel cpu for faster training on cpu
add "--config=monolithic" if you want static monolithic build (try this if build failed)
add "--local_resources 2048,.5,1.0" if your PC has low ram causing Segmentation fault or other related errors

This process will take a lot of time. It may take 1–2 hours or maybe even more. Be ready to wait!

Also if you got error like Segmentation Fault then try again it usually worked.

Note: With previous version tensorflow and bazel build took 2h 3min. For me, updated version build 1hrs 32min. It worth it to update as you think?)

This operation takes the most time in this guide

The bazel build command builds a script named build_pip_package. Running this script as follows will build a .whl file within the tensorflow_pkg directory:

To build whl file issue following command:

bazel-bin/tensorflow/tools/pip_package/build_pip_package tensorflow_pkg

To install tensorflow with pip:

cd tensorflow_pkg

or python 2: (use sudo if required)

pip2 install tensorflow*.whl

for python 3: (use sudo if required)

pip3 install tensorflow*.whl

Note: if you got error like unsupported platform then make sure you are running correct pip command associated with the python you used while configuring tensorflow build.

You can check pip version and associated python by following command

pip -V

Run in terminal

python3
import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))

If the system outputs the following, then you are ready to begin writing tensorflow programs:

Hello, TensorFlow!

Success! You have now successfully installed tensorflow 1.9.0 on your machine.

Congratulations!

For now you can check which tensorflow version you install:

pip3 show tensorflow

This article based on awesome guide, which available here. I am only supplemented it with the errors I encountered and how to avoid them.

Those answers maybe can help you during the process:

https://www.tecmint.com/upgrade-kernel-in-ubuntu/

https://devtalk.nvidia.com/default/topic/1036167/stuck-trying-to-intall-nvidia-390-ubuntu-18-04-lts-/?offset=10

https://askubuntu.com/questions/647708/cannot-edit-xorg-conf-permissions

https://askubuntu.com/questions/4253/getting-screen-resolution-correct-with-nvidia-drivers

https://unix.stackexchange.com/questions/387735/how-to-set-a-custom-resolution-with-nvidia-drivers-installed

Thanks for your attention!

--

--