A Spanish version of this post is also available.
This post explains how to install the CUDA 9.1 Production Release on a Debian Stretch system. The first step is to download the driver from the official Nvidia website, selecting the model of your video card. In my case, I have a server with two video cards: a GeForce GTX 660 and a GeForce GTX 650. If you are not sure which driver version to install, you can look it up at the following link.
http://www.nvidia.com.mx/Download/index.aspx?lang=en-us
The latest driver available for my video cards is version 390.25, which can be downloaded from http://us.download.nvidia.com/XFree86/Linux-x86_64/390.25/NVIDIA-Linux-x86_64-390.25.run. In general, make sure to download the latest driver version available for your video card.
You also need to download the CUDA Toolkit 9.1 from the Nvidia page at https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=1704&target_type=runfilelocal. The installation file needed is the runfile (.run) version for Ubuntu; the file name is cuda_9.1.85_387.26_linux.run.
Next, we verify the requirements for installing CUDA. First, confirm that the system has a CUDA-capable GPU.
cuda_9_1_install.sh
# verify we have a CUDA-capable GPU
lspci | grep -i nvidia
The lspci command above should list at least one Nvidia device; if it prints nothing, check the status of the video card. Next, we check our version of Linux.
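The same check can be wrapped in a small script that prints a readable verdict. This is only a sketch: `has_nvidia_gpu` is a hypothetical helper name, and the script assumes `lspci` comes from the `pciutils` package, as it does on Debian.

```shell
#!/bin/sh
# has_nvidia_gpu is a hypothetical helper, not part of any Nvidia tool.
# $1: text in the format printed by lspci
has_nvidia_gpu() {
    printf '%s\n' "$1" | grep -qi 'nvidia'
}

if command -v lspci >/dev/null 2>&1; then
    if has_nvidia_gpu "$(lspci)"; then
        echo "Nvidia device found"
    else
        echo "No Nvidia device listed; check the status of the video card"
    fi
else
    echo "lspci not found; on Debian it is provided by the pciutils package"
fi
```

Keeping the match in a function makes it easy to try against saved `lspci` output from another machine.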
cuda_9_1_install.sh
# Linux version
uname -m && cat /etc/*release
The output tells us that the system is running Debian Stretch, the version used and supported in this post.
x86_64
PRETTY_NAME="Debian GNU/Linux 9 (stretch)"
NAME="Debian GNU/Linux"
VERSION_ID="9"
VERSION="9 (stretch)"
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"
Now you need to check that you have a valid GCC compiler.
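If you prefer a yes/no answer over reading the output by eye, the check could be scripted roughly as follows. This is a sketch under assumptions: `is_debian_stretch` is a hypothetical helper, and it relies on `/etc/os-release` defining `ID` and `VERSION_ID` as in the output shown above.

```shell
#!/bin/sh
# is_debian_stretch is a hypothetical helper.
# $1: path to an os-release style file (defaults to /etc/os-release)
is_debian_stretch() {
    file="${1:-/etc/os-release}"
    [ -r "$file" ] || return 1
    # Source the file in a subshell so its variables do not leak out
    (
        . "$file"
        [ "$ID" = "debian" ] && [ "$VERSION_ID" = "9" ]
    )
}

if [ "$(uname -m)" = "x86_64" ] && is_debian_stretch; then
    echo "x86_64 Debian Stretch detected"
else
    echo "This system differs from the one used in this post"
fi
```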
cuda_9_1_install.sh
# verify gcc version
gcc --version
If you do not have a valid GCC compiler, you can install it with the following command.
cuda_9_1_install.sh
# gcc install
sudo apt-get install build-essential
Before proceeding with the installation, you need to uninstall any previously installed version of CUDA. If this is the first time you are installing CUDA, you can skip this step. To uninstall a previous version of CUDA, use the following command.
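The two steps above can be combined so that the install command is only suggested when something is actually missing. A minimal sketch, where `need_tool` is a hypothetical helper name:

```shell
#!/bin/sh
# need_tool is a hypothetical helper: it succeeds when the command is missing.
need_tool() {
    ! command -v "$1" >/dev/null 2>&1
}

if need_tool gcc || need_tool make; then
    echo "Toolchain incomplete; install it with: sudo apt-get install build-essential"
else
    # Print only the first line, which carries the version number
    gcc --version | head -n 1
fi
```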
cuda_9_1_install.sh
# cuda uninstall
sudo /usr/local/cuda-X.Y/bin/uninstall_cuda_X.Y.pl
Here X.Y is the CUDA version. If you are updating the Nvidia driver, you do not need to do anything, since the installer removes the previous driver automatically. If you want to uninstall the Nvidia driver, you can do so with the following command.
cuda_9_1_install.sh
# nvidia driver uninstall
sudo /usr/bin/nvidia-uninstall
It is likely that you will also need to uninstall the Nvidia packages if they were installed from the repository. For this, the following command may be sufficient.
sudo apt-get remove 'nvidia*'
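Before removing anything, it can help to see which Nvidia packages are actually installed. The sketch below filters `dpkg -l` output; `list_nvidia_packages` is a hypothetical helper, and it assumes the usual `dpkg -l` layout in which the first column is the status (`ii` for installed) and the second is the package name.

```shell
#!/bin/sh
# list_nvidia_packages is a hypothetical helper.
# Reads dpkg -l style lines on stdin and prints the names of
# installed (status "ii") packages whose name starts with "nvidia".
list_nvidia_packages() {
    awk '$1 == "ii" && $2 ~ /^nvidia/ { print $2 }'
}

if command -v dpkg >/dev/null 2>&1; then
    dpkg -l | list_nvidia_packages
else
    echo "dpkg not found; this check only applies to Debian-based systems"
fi
```

An empty result suggests the driver was not installed through the repository, in which case the `apt-get remove` step has nothing to do.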
Once the previous steps are complete and the installation files are available, we disable the nouveau driver, which is the default driver shipped with Debian. To do this, create a new file.
cuda_9_1_install.sh
# edit debian driver configuration file (root is needed to write under /etc)
sudo vim /etc/modprobe.d/disable-nouveau.conf
Then add the following lines.
cuda_9_1_install.sh
# blacklist default driver
blacklist nouveau
options nouveau modeset=0
Note: to check whether the nouveau driver is running, execute the following command. If it prints any output, the nouveau driver is loaded on the system.
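If you would rather not open an editor, the same file can be written from the shell. The following is a sketch: `write_nouveau_blacklist` is a hypothetical helper, and the call against the real path is left commented out so the script is safe to run as an ordinary user.

```shell
#!/bin/sh
# write_nouveau_blacklist is a hypothetical helper.
# $1: file to write the two configuration lines to
write_nouveau_blacklist() {
    cat > "$1" <<'EOF'
blacklist nouveau
options nouveau modeset=0
EOF
}

# Real target used in this post (needs root), commented out on purpose:
# write_nouveau_blacklist /etc/modprobe.d/disable-nouveau.conf

# Dry run against a temporary file to preview the result
preview="$(mktemp)"
write_nouveau_blacklist "$preview"
cat "$preview"
rm -f "$preview"
```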
cuda_9_1_install.sh
# verify driver
lsmod | grep nouveau
After creating the file that blacklists the default driver, restart the system normally. Once it has restarted, you will notice that the screen resolution has dropped, which means that the nouveau driver was not loaded. Next, install the Linux headers from the repository, which are needed to compile the Nvidia driver, with the following command.
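A slightly stricter variant reads `/proc/modules`, the file that `lsmod` itself formats, and matches the module name exactly instead of any line containing "nouveau". This is a sketch; `module_loaded` is a hypothetical helper name.

```shell
#!/bin/sh
# module_loaded is a hypothetical helper.
# $1: module name, $2: modules file (defaults to /proc/modules)
module_loaded() {
    # Each line of /proc/modules starts with the module name and a space
    grep -q "^$1 " "${2:-/proc/modules}" 2>/dev/null
}

if module_loaded nouveau; then
    echo "nouveau is still loaded"
else
    echo "nouveau is not loaded"
fi
```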
cuda_9_1_install.sh
# dependencies
sudo apt-get install linux-headers-$(uname -r)
After installing these packages, we remove the nouveau driver completely from the system with the following command.
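To confirm that the headers for the running kernel are in place, you can look for the kernel build directory. This sketch assumes the conventional `/lib/modules/<release>/build` location; `headers_dir` is a hypothetical helper name.

```shell
#!/bin/sh
# headers_dir is a hypothetical helper.
# $1: kernel release string, as printed by uname -r
headers_dir() {
    printf '/lib/modules/%s/build\n' "$1"
}

dir="$(headers_dir "$(uname -r)")"
if [ -e "$dir" ]; then
    echo "Kernel headers found at $dir"
else
    echo "Kernel headers missing; expected them at $dir"
fi
```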
cuda_9_1_install.sh
# remove
sudo apt-get remove --purge xserver-xorg-video-nouveau
Once the driver is removed, restart the system in recovery mode. After rebooting into the console, we proceed with the driver installation. First, you need to know which compiler version the kernel was built with, because the installer has to compile the Nvidia kernel module. If you do not know the compiler or kernel version, you can find it with the following command.
cuda_9_1_install.sh
# find curr gcc version
cat /proc/version
Once you know the correct compiler version, set it with an export as follows, replacing gcc-4.8 with the version reported in /proc/version.
cuda_9_1_install.sh
# set gcc version
export CC=gcc-4.8
Now you can proceed with the driver installation. Navigate to the directory where the driver was downloaded and run the installer.
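Picking the version out of `/proc/version` by eye is easy to get wrong, so it could be extracted with `sed`. This is a sketch under an assumption: it expects the `gcc version X.Y.Z` wording that kernels of this era print, and `kernel_gcc_version` is a hypothetical helper name.

```shell
#!/bin/sh
# kernel_gcc_version is a hypothetical helper.
# $1: a string in the format of /proc/version
# Prints the major.minor gcc version, or nothing if the pattern is absent.
kernel_gcc_version() {
    printf '%s\n' "$1" |
        sed -n 's/.*gcc version \([0-9][0-9]*\)\.\([0-9][0-9]*\).*/\1.\2/p'
}

if [ -r /proc/version ]; then
    ver="$(kernel_gcc_version "$(cat /proc/version)")"
    if [ -n "$ver" ]; then
        echo "Kernel was built with gcc $ver"
    else
        echo "Could not find a 'gcc version' string in /proc/version"
    fi
fi
```

Note that the name of the versioned compiler binary depends on how your distribution packages GCC, so check which `gcc-*` binaries are installed before setting `CC`.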
cuda_9_1_install.sh
# execute driver install
sudo sh NVIDIA-Linux-x86_64-390.25.run
Once the installation is finished, restart the system. The resolution should be back to normal, which means that the Nvidia driver was installed correctly. Now we install the CUDA toolkit as follows.
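Once the system has been restarted with the new driver, the installed version can also be checked from the console instead of relying on the screen resolution. The sketch below assumes the driver exposes `/proc/driver/nvidia/version` with a line containing `Kernel Module` followed by the version number; `driver_version` is a hypothetical helper name.

```shell
#!/bin/sh
# driver_version is a hypothetical helper.
# $1: version file (defaults to /proc/driver/nvidia/version)
driver_version() {
    f="${1:-/proc/driver/nvidia/version}"
    [ -r "$f" ] || return 1
    sed -n 's/.*Kernel Module  *\([0-9][0-9.]*\).*/\1/p' "$f" | head -n 1
}

if v="$(driver_version)" && [ -n "$v" ]; then
    echo "Nvidia kernel module version: $v"
else
    echo "Nvidia kernel module does not appear to be loaded"
fi

# nvidia-smi ships with the driver and prints a summary of the detected GPUs
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi || echo "nvidia-smi could not talk to the driver"
fi
```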
cuda_9_1_install.sh
# execute toolkit install
sudo sh cuda_9.1.85_387.26_linux.run
When asked, choose not to install the Nvidia driver, since it has already been installed, and accept the default installation directory for the samples. The last step is to compile the samples and run them: navigate to the samples directory and run make. The default samples directory is usually in your home directory, so the following commands should do the trick.
cd ~/NVIDIA_CUDA-9.1_Samples
make
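After the build, you may want to put the toolkit on your path and run one of the compiled samples. The following is a sketch under assumptions: it uses the default toolkit prefix `/usr/local/cuda-9.1` and the default samples location, and `CUDA_PREFIX` is just an illustrative variable name. The `PATH` and `LD_LIBRARY_PATH` lines follow the post-installation steps in the CUDA Installation Guide listed below.

```shell
#!/bin/sh
# CUDA_PREFIX is an illustrative variable; adjust it if you chose another prefix.
CUDA_PREFIX=/usr/local/cuda-9.1

# Prepend the toolkit directories without leaving a stray colon
export PATH="$CUDA_PREFIX/bin${PATH:+:${PATH}}"
export LD_LIBRARY_PATH="$CUDA_PREFIX/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"

# Show the compiler version if the toolkit is installed at that prefix
if command -v nvcc >/dev/null 2>&1; then
    nvcc --version
else
    echo "nvcc not found under $CUDA_PREFIX/bin"
fi

# deviceQuery is one of the samples; it lists the CUDA devices it can see
dq="$HOME/NVIDIA_CUDA-9.1_Samples/bin/x86_64/linux/release/deviceQuery"
if [ -x "$dq" ]; then
    "$dq" || echo "deviceQuery reported a problem"
else
    echo "deviceQuery has not been built yet"
fi
```

To make the two exports permanent, they can be added to your shell profile.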
And that is pretty much it!
Reference Links
CUDA Installation Guide
https://developer.download.nvidia.com/compute/cuda/9.1/Prod/docs/sidebar/CUDA_Installation_Guide_Linux.pdf
CUDA Quick Start Guide
http://developer.download.nvidia.com/compute/cuda/9.1/Prod/docs/sidebar/CUDA_Quick_Start_Guide.pdf
CUDA Toolkit Official Documentation
http://docs.nvidia.com/cuda/index.html
CUDA Download Page
https://developer.nvidia.com/cuda-downloads
Nvidia Driver Download Page
http://www.nvidia.com.mx/Download/index.aspx?lang=en-us
CUDA 9.1 Performance Presentation
http://on-demand.gputechconf.com/gtc/2017/video/s7495-jain-optimizing-application-performance-cuda-profiling.mp4
CUDA Toolkit Release Notes
https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html
Enjoy! 🙂
-Yohan