Install CUDA, cuDNN, Nvidia Driver on Ubuntu 20.04 for Tensorflow2

기존에 설치되어 있는 것들 제거. Ubuntu 20.04에는 기본으로 Nvidia Driver가 설치되어 있으므로 이를 지워주어야 함.

$ sudo apt-get purge nvidia*
$ sudo apt-get autoremove
$ sudo apt-get autoclean

CUDA 설치

1. 레포지토리 키 등록

$ sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/7fa2af80.pub

2. apt에 레포지토리 주소 등록

$ sudo bash -c 'echo "deb http://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64 /" >> /etc/apt/sources.list.d/cuda.list'
$ sudo bash -c 'echo "deb http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64 /" >> /etc/apt/sources.list.d/cuda.list'

3. CUDA, Nvidia Driver 설치 (CUDA를 설치하면 자동으로 이에 의존된 드라이버 자동으로 설치)

$ sudo apt update
$ sudo apt install cuda
$ sudo apt install cuda-10-1

4. cuDNN 설치를 위한 repo 패키지 설치

$ wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu2004/x86_64/nvidia-machine-learning-repo-ubuntu2004_1.0.0-1_amd64.deb
$ sudo dpkg -i nvidia-machine-learning-repo-ubuntu2004_1.0.0-1_amd64.deb

5. cuDNN 레포지토리 추가

$ sudo bash -c 'echo "deb http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 /" >> /etc/apt/sources.list.d/nvidia-machine-learning.list'

6. cuDNN 설치 (CUDA 버전에 맞는 버전으로 설치해야 함)

$ sudo apt install libcudnn7-dev=7.6.5.32-1+cuda10.1 libcudnn7=7.6.5.32-1+cuda10.1

추후 업데이트시 최신버전으로 업데이트 되는 것을 방지

$ sudo apt-mark hold libcudnn7 libcudnn7-dev
libcudnn7 set on hold.
libcudnn7-dev set on hold.

이와 같이 설치를 완료하고, 재부팅한 다음… 드라이버가 잘 설치되어 있는지 확인

$ nvidia-smi


TensorFlow2 설치

$ sudo apt install python3-pip
$ sudo pip3 install -U tensorflow-gpu

환경변수 셋업

$ echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-11.1/lib64:/usr/local/cuda-10.2/lib64:/usr/local/cuda-10.1/lib64' >> ~/.bashrc
$ source ~/.bashrc

실행 및 설치 확인

$ python3
Python 3.8.5 (default, Jul 28 2020, 12:59:40)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
import tensorflow as tf
2020-10-08 15:02:50.803732: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
tf.config.list_physical_devices('GPU')
2020-10-08 15:03:01.717511: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
2020-10-08 15:03:01.733336: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-10-08 15:03:01.733585: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 1650 computeCapability: 7.5
coreClock: 1.56GHz coreCount: 16 deviceMemorySize: 3.82GiB deviceMemoryBandwidth: 119.24GiB/s
2020-10-08 15:03:01.733603: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-10-08 15:03:01.734679: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2020-10-08 15:03:01.735901: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2020-10-08 15:03:01.736061: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2020-10-08 15:03:01.737163: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2020-10-08 15:03:01.737773: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2020-10-08 15:03:01.740221: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7
2020-10-08 15:03:01.740295: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-10-08 15:03:01.740563: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-10-08 15:03:01.740768: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

Build and Install OpenCV 4.2.0 on Ubuntu with GPU support

Current OpenCV version only support CUDA 10.1.

Download sources

$ wget https://github.com/opencv/opencv/archive/4.2.0.tar.gz
$ wget https://github.com/opencv/opencv_contrib/archive/4.2.0.tar.gz

Extract the compressed files

$ tar zxf opencv-4.2.0.tar.gz
$ tar zxf opencv_contrib-4.2.0.tar.gz

CMake configuration

$ cd opencv-4.2.0
$ mkdir build
$ cd build
$ cmake -DOPENCV_EXTRA_MODULES_PATH=../../opencv_contrib-4.2.0/modules -DWITH_CUDA=ON -DBUILD_EXAMPLES=ON -DCMAKE_BUILD_TYPE=Release ..

Build

$ make -j8

It takes 15 ~ 20 minutes. And install,

$ sudo make install

Check installations

$ opencv_version 
4.2.0

$ python3
Python 3.6.9 (default, Nov  7 2019, 10:44:02) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
>>> cv2
<module 'cv2' from '/usr/local/lib/python3.6/dist-packages/cv2/python-3.6/cv2.cpython-36m-x86_64-linux-gnu.so'>
>>> 

Done.

Install Tensorflow with GPU support

Reference: https://www.tensorflow.org/install/gpu

  • Install CUDA 10.1 with GPU Driver
$ wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-repo-ubuntu1804_10.2.89-1_amd64.deb
$ sudo dpkg -i cuda-repo-ubuntu1804_10.1.243-1_amd64.deb
$ sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
$ sudo apt-get update
$ wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
$ sudo apt install ./nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
$ sudo apt-get update
$ sudo apt install cuda
$ sudo apt install cuda-10-1
  • Install cuDNN
$ sudo apt install libcudnn7-dev
  • Install TensorRT
$ sudo apt install libnvinfer-dev=6.0.1-1+cuda10.1 libnvinfer6=6.0.1-1+cuda10.1 libnvinfer-plugin6=6.0.1-1+cuda10.1
  • Install python3-pip and update
$ sudo apt install python3-pip
$ sudo pip3 install -U pip
  • Install Tensorflow
$ sudo pip3 install -U tensorflow-gpu
$ sudo pip3 install -U launchpadlib

Check the installation

$ python3
Python 3.6.9 (default, Nov  7 2019, 10:44:02)                                                                                                                                                                                         
[GCC 8.3.0] on linux                                                                                                                                                                                                                  
Type "help", "copyright", "credits" or "license" for more information.                                                                                                                                                                
>>> import tensorflow as tf                                                                                                                                                                                                           
2020-01-17 00:54:07.956548: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer.so.6                                                                                       
2020-01-17 00:54:07.957808: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer_plugin.so.6
>>>
>>> tf.config.list_physical_devices('GPU')
2020-01-17 01:06:54.209819: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-01-17 01:06:54.244560: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-01-17 01:06:54.244971: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties: 
pciBusID: 0000:68:00.0 name: GeForce RTX 2080 Ti computeCapability: 7.5
coreClock: 1.545GHz coreCount: 68 deviceMemorySize: 10.76GiB deviceMemoryBandwidth: 573.69GiB/s
2020-01-17 01:06:54.245001: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-01-17 01:06:54.245027: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-01-17 01:06:54.246771: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-01-17 01:06:54.247101: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-01-17 01:06:54.248591: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-01-17 01:06:54.249448: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-01-17 01:06:54.249496: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-01-17 01:06:54.249579: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-01-17 01:06:54.249978: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-01-17 01:06:54.250314: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

Done.