How to install CUDA and cuDNN on Ubuntu 16.04 LTS

I’ve been running this machine for TensorFlow and Keras with a Jupyter notebook.

This is my environment:

$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=16.04
DISTRIB_CODENAME=xenial
DISTRIB_DESCRIPTION="Ubuntu 16.04.3 LTS"

$ lspci | grep VGA
01:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1080] (rev a1)

Disable secure boot on UEFI

First, we need to disable “Secure Boot” in the UEFI menu because it prevents the NVIDIA driver from loading into the OS. Enter the UEFI menu at boot time by pressing a key such as Delete.

That was the first time I had heard of it. I used to use BIOS, but many recently released motherboards use UEFI, and UEFI has Secure Boot (see the details here; this was helpful for me).

If you use an ASUS motherboard, this document can also be helpful.

Remove old NVIDIA driver and CUDA

$ dpkg -l | grep nvidia
$ dpkg -l | grep cuda
$ sudo apt-get --purge remove nvidia-*
$ sudo apt-get --purge remove cuda-*

Install the new NVIDIA driver and CUDA 9.0

Download CUDA Toolkit 9.0

Download it from the NVIDIA developer site.

Install

$ sudo dpkg -i cuda-repo-ubuntu1604_9.0.176-1_amd64.deb
$ sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/7fa2af80.pub
$ sudo apt-get update
$ sudo apt-get install cuda

Add PATH and LD_LIBRARY_PATH

~/.bashrc

export PATH="/usr/local/cuda/bin:$PATH"
export LD_LIBRARY_PATH="/usr/local/cuda/lib64:$LD_LIBRARY_PATH"

Then reboot.

$ sudo reboot

Check that the driver was installed

$ nvidia-smi

This command shows the status of the graphics card installed in your machine, like this:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.25                 Driver Version: 390.25                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    Off  | 00000000:01:00.0  On |                  N/A |
| 27%   36C    P8    12W / 180W |   7753MiB /  8116MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1351      G   /usr/lib/xorg/Xorg                            24MiB |
|    0     12638      C   /home/jupyter/tensorflow/bin/python3        7717MiB |
+-----------------------------------------------------------------------------+

Install cuDNN

Download cuDNN

Download the Debian packages from the NVIDIA developer site.

Install the runtime library

$ sudo dpkg -i libcudnn7_7.0.5.15-1+cuda9.0_amd64.deb

Install the developer library

$ sudo dpkg -i libcudnn7-dev_7.0.5.15-1+cuda9.0_amd64.deb

Add CUDA_HOME

~/.bashrc

export CUDA_HOME=/usr/local/cuda-9.0

Check that cuDNN works

$ cp -r /usr/src/cudnn_samples_v7/ $HOME
$ cd  $HOME/cudnn_samples_v7/mnistCUDNN
$ make clean && make
$ ./mnistCUDNN
Test passed!
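
Since this machine is for TensorFlow, it is also worth checking from Python that TensorFlow can actually see the GPU. This is just a quick sanity check and assumes a GPU build of TensorFlow (tensorflow-gpu) is already installed:

import tensorflow as tf

# Prints something like '/device:GPU:0' when a GPU is visible,
# or an empty string when it is not.
print(tf.test.gpu_device_name())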

Stop LightDM

LightDM starts running when you install CUDA. If you use Ubuntu Server, it doesn’t need to be running.

Disable LightDM

Rewrite GRUB_CMDLINE_LINUX as below:

/etc/default/grub

GRUB_CMDLINE_LINUX="systemd.unit=multi-user.target"

Update and reboot.

$ sudo update-grub
$ sudo reboot

I implemented GANs

I implemented a GAN (Generative Adversarial Network) because I wanted to learn about GANs with TensorFlow.

I referred to the article here.

And this is my code implementing the GAN. It is almost the same as the code I referred to.

My GAN trains on the MNIST handwritten digit images for 100k steps.
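
For reference, here is a minimal sketch of the kind of vanilla GAN described above, written for TensorFlow 1.x. The layer sizes, variable names, and hyperparameters are my own illustrative choices and do not exactly match the code in the repository I linked:

import numpy as np
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

Z_DIM = 100  # size of the noise vector fed to the generator

def generator(z):
    # Two-layer fully connected generator: noise -> 784-pixel image in [0, 1]
    with tf.variable_scope('generator'):
        h = tf.layers.dense(z, 128, activation=tf.nn.relu)
        return tf.layers.dense(h, 784, activation=tf.nn.sigmoid)

def discriminator(x, reuse=False):
    # Two-layer fully connected discriminator returning a real/fake logit
    with tf.variable_scope('discriminator', reuse=reuse):
        h = tf.layers.dense(x, 128, activation=tf.nn.relu)
        return tf.layers.dense(h, 1)

z = tf.placeholder(tf.float32, [None, Z_DIM])
x = tf.placeholder(tf.float32, [None, 784])

g_sample = generator(z)
d_logit_real = discriminator(x)
d_logit_fake = discriminator(g_sample, reuse=True)

# The discriminator pushes real images toward 1 and generated images toward 0;
# the generator tries to make the discriminator output 1 for its samples.
d_loss = tf.reduce_mean(
    tf.nn.sigmoid_cross_entropy_with_logits(
        logits=d_logit_real, labels=tf.ones_like(d_logit_real)) +
    tf.nn.sigmoid_cross_entropy_with_logits(
        logits=d_logit_fake, labels=tf.zeros_like(d_logit_fake)))
g_loss = tf.reduce_mean(
    tf.nn.sigmoid_cross_entropy_with_logits(
        logits=d_logit_fake, labels=tf.ones_like(d_logit_fake)))

d_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, 'discriminator')
g_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, 'generator')
d_train = tf.train.AdamOptimizer(1e-4).minimize(d_loss, var_list=d_vars)
g_train = tf.train.AdamOptimizer(1e-4).minimize(g_loss, var_list=g_vars)

mnist = input_data.read_data_sets('MNIST_data')
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(100000):  # 100k steps, as in the experiment above
        batch, _ = mnist.train.next_batch(128)
        noise = np.random.uniform(-1.0, 1.0, size=[128, Z_DIM])
        sess.run(d_train, feed_dict={x: batch, z: noise})
        sess.run(g_train, feed_dict={z: noise})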

These are the generated images at 0, 1k, 5k, 50k, and 100k steps.

0 step


The 0-step image is just noise.

1k steps


The 1k-step image is still just noise.

5k steps


The 5k-step image looks like handwritten digits, but it is still a little noisy.

50k and 100k steps


50k steps


100k steps

The 50k-step and 100k-step images look pretty much like handwritten digits.

The implementation above is a vanilla GAN, so using an extended GAN such as DCGAN should improve the quality of the generated handwritten images. I’ll also try some other training data.

However, I am going to implement a GAN on iOS with Core ML before trying DCGAN and other training images.

What is the difference between tf.placeholder and tf.Variable

I read this tutorial. There were two confusing terms, tf.placeholder and tf.Variable, so I looked into the difference between them.

A tf.placeholder receives its value when a session runs a computation.
Once you feed a value into a tf.placeholder, the placeholder cannot change that value by itself.

A tf.Variable can change its own value via the assign method, so tf.Variable is literally a “variable.”
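
A tiny example of that difference (the names here are just for illustration):

import tensorflow as tf

x = tf.placeholder(tf.float32)   # value is supplied at run time via feed_dict
v = tf.Variable(1.0)             # value lives in the graph and can be updated
update = tf.assign(v, v + 1.0)   # an op that changes the variable's value

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(x * 2.0, feed_dict={x: 3.0}))  # 6.0: the placeholder is fed per call
    print(sess.run(update))                       # 2.0: the variable is updated in place
    print(sess.run(v))                            # 2.0: the new value persists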

Finally, I found this question beneficial.

The top-rated answer said:

You use tf.Variable for trainable variables such as weights and biases for your model.

weights = tf.Variable(
    tf.truncated_normal([IMAGE_PIXELS, hidden1_units],
                        stddev=1.0 / math.sqrt(float(IMAGE_PIXELS))),
    name='weights')

biases = tf.Variable(tf.zeros([hidden1_units]), name='biases')

tf.placeholder is used to feed actual training examples.

images_placeholder = tf.placeholder(tf.float32, shape=(batch_size, IMAGE_PIXELS))
labels_placeholder = tf.placeholder(tf.int32, shape=(batch_size))

for step in xrange(FLAGS.max_steps):
    feed_dict = {
       images_placeholder: images_feed,
       labels_placeholder: labels_feed,
     }
    _, loss_value = sess.run([train_op, loss], feed_dict=feed_dict)

And another answer said:

The more important difference is their role within TensorFlow. Variables are trained over time; placeholders are input data that doesn’t change as your model trains (like input images, and class labels for those images).

Thanks to the Stack Overflow answers, I roughly understood the answer to my question.