
Note: there are many similar questions, but for different versions of Ubuntu and somewhat different libraries. I have not been able to figure out what combination of symbolic links and additional environment variables such as LD_LIBRARY_PATH would work.

Here is my NVIDIA configuration:

$ nvidia-smi
Tue Apr  6 11:35:54 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.80.02    Driver Version: 450.80.02    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 2070    Off  | 00000000:01:00.0 Off |                  N/A |
| 18%   25C    P8     9W / 175W |     25MiB /  7982MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1081      G   /usr/lib/xorg/Xorg                 20MiB |
|    0   N/A  N/A      1465      G   /usr/bin/gnome-shell                3MiB |
+-----------------------------------------------------------------------------+

When running a TensorFlow program, the following happened:

2021-04-06 14:35:01.589906: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] 
Could not load dynamic library 'libcudnn.so.8'; dlerror: 
libcudnn.so.8: cannot open shared object file: No such file or directory
2021-04-06 14:35:01.589914: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1757] 
Cannot dlopen some GPU libraries. Please make sure the missing
 libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at 
https://www.tensorflow.org/install/gpu for how to download 
and setup the required libraries for your platform.
Skipping registering GPU devices...
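A quick way to double-check is to ask the dynamic loader directly (plain ldconfig and grep, nothing distro-specific):

# Empty output here means libcudnn.so.8 is not registered on the default library search path
ldconfig -p | grep libcudnn
# Also show any extra directories the loader would search
echo "$LD_LIBRARY_PATH"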

Has anyone seen this particular mix and how did you resolve it?

Here is one of the additional fixes I attempted, with no change:

conda install cudatoolkit=11.0
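For reference, here is a quick way to see what that command actually put into the environment (a separate cudnn package would have to show up for TensorFlow to find libcudnn.so.8):

# List the CUDA-related packages in the active conda environment
conda list | grep -Ei "cudatoolkit|cudnn"
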
  • cuDNN isn't part of the CUDA toolkit; it is distributed separately and needs to be installed separately. If you used Anaconda to install TensorFlow, it should have been installed automatically; if it wasn't, something is broken in conda. If you didn't use conda, you will need to install it by hand in the way TensorFlow expects. Commented Apr 6, 2021 at 22:42
  • Ah ok, I had forgotten. Thanks for the pointer; make it an answer if you wish. And yes, I am using conda, but that has never worked with TensorFlow for me. I am going to nuke it and start afresh with pipenv. Commented Apr 6, 2021 at 22:55
  • Sounds like cuDNN is not installed, judging from the error message. TensorFlow GPU requires CUDA 11.0 and cuDNN 8.0. Follow the steps on the TensorFlow site for Linux setup. Thanks! Commented Apr 27, 2021 at 7:22

4 Answers


So I had the same issue. As the comments say, it's because you need to install cuDNN. For that, there is a guide here.

But since I already know your distro (Ubuntu 20.04), I can give you the commands directly:

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
export last_public_key=3bf863cc # SEE NOTE BELOW
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/${last_public_key}.pub
sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/ /"
sudo apt-get update
sudo apt-get install libcudnn8
sudo apt-get install libcudnn8-dev

where ${last_public_key} is the most recent public key (file with the .pub extension) published at https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/. (As of March 8th, 2023, when this post was edited, it was 3bf863cc.)

And if you want to install a specific version, the last 2 commands would be replaced with

sudo apt-get install libcudnn8=${cudnn_version}-1+${cuda_version}
sudo apt-get install libcudnn8-dev=${cudnn_version}-1+${cuda_version}

where ${cudnn_version} is, for example, 8.2.4.* and ${cuda_version} is, for example, cuda11.0 (I see you have 11.0 in the nvidia-smi output; I have not tested that combination myself since mine was 11.4, but I expect it to work).
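If you are not sure which exact version strings the repository currently offers, apt can list them before you pin anything (just a convenience check, not part of the official guide):

# Show every libcudnn8 version available from the configured repositories,
# so the ${cudnn_version}-1+${cuda_version} string can be copied exactly
apt-cache madison libcudnn8

# Optional: hold the packages afterwards so a routine apt upgrade does not replace them
sudo apt-mark hold libcudnn8 libcudnn8-dev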


5 Comments

  • For me, what worked on Ubuntu 20.04 and the RTX 3060 machine I have is: sudo apt install libcudnn8=8.2.4.15-1+cuda11.2
  • For me, I needed to run sudo apt-get install libcudnn8. The exact version was determined automatically.
  • This solves the issue for me. [Ubuntu 20.04, RTX-3070]
  • You have no idea how long I struggled with this. Thank you.
  • YES! Thank you. I have been trying to solve this problem for longer than I want to admit.

On Ubuntu 22.04 I used:

sudo apt install nvidia-cudnn
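To verify the install afterwards (assuming TensorFlow is already present in the active Python environment), a one-liner like this should list at least one GPU once cuDNN can be loaded:

# Prints something like [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"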

4 Comments

  • Note: for OSes using pacman by default (e.g. Arch and derivatives like Manjaro), one can use sudo pacman -S cudnn.
  • Thanks a lot, @André Berenguel. It worked on Ubuntu 22.04.
  • Also works for Ubuntu 24.04 after adding the CUDA NVIDIA repositories.
  • It works on Debian 12.

I had the same issue; my Linux OS is CentOS 7.6. Since I don't have sudo access on this machine, I solved the issue by installing cuDNN from Anaconda.

In the environment where the latest TensorFlow is installed:

conda install -c anaconda cudnn
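If TensorFlow still cannot find the library after this (it usually picks it up automatically), the environment's lib directory can be added to the loader path; this assumes the standard conda layout where $CONDA_PREFIX points at the active environment:

# Make the cuDNN shared objects installed into the conda environment visible to the dynamic loader
export LD_LIBRARY_PATH="$CONDA_PREFIX/lib:$LD_LIBRARY_PATH"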

You can check the installed packages using conda list (I had previously installed cudatoolkit from Anaconda):

Name                      Version              Build  Channel
cudatoolkit               11.3.1               h2bc3f7f_2
cudnn                     8.2.1                cuda11.3_0

You can check whether TensorFlow and the GPUs are talking to each other:

(tf2_6) [xxxx@co-dept-rd-gpu-01 envs]$ python
Python 3.8.12 (default, Oct 12 2021, 13:49:34)
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> gpus = tf.config.experimental.list_physical_devices('GPU')
>>> for gpu in gpus:
...     print("Name:", gpu.name, "  Type:", gpu.device_type)
...
Name: /physical_device:GPU:0   Type: GPU
Name: /physical_device:GPU:1   Type: GPU

4 Comments

  • Actually, this can help me solve an issue installing stuff on my company's server... I will give it a try.
  • In case you have a conda installation for PyTorch or TensorFlow, this is safer if you don't want to mess with your environment variables, as opposed to sudo apt install nvidia-cudnn.
  • I did everything; the only thing that worked for me was sudo apt install nvidia-cudnn.
  • Some additional documentation: tensorflow.org/install/pip

Needs CUDA 11.8 (8 for cuDNN 8):

https://anaconda.org/conda-forge/cudnn

conda install -c conda-forge cudnn
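If you need to match a particular CUDA line, the conda-forge package can also be pinned to a version; 8.9 below is only an illustrative version string, so check the page above for what is actually published:

# Pin cuDNN to a specific release from conda-forge (illustrative version number)
conda install -c conda-forge cudnn=8.9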
