The aim of this blog is to explore Linux security topics using a data science approach. Many people don't like the idea of putting proprietary blobs of code on their nice open source system. But I am pragmatic about things and have to admit that Nvidia is the king of GPUs right now. And GPUs have been the way to accelerate Deep Learning for the last few years. So, today I'll go over what it takes to correctly set up a CUDA development environment on Fedora 25. This is a continuation of the earlier post about how to get an Nvidia GPU card set up in Fedora. That step is a prerequisite to this blog post.
CUDA is the name that Nvidia has given to a development environment for creating high performance GPU-accelerated applications. CUDA libraries enable acceleration across multiple domains such as linear algebra, image and video processing, deep learning, and graph analytics. These libraries offload work normally done on a CPU to the GPU. Any program created with the CUDA toolkit is tied to the Nvidia family of GPUs.
Setting it up
The first step is to go get the toolkit. This is not shipped by any distribution. You have to get it directly from Nvidia. You can find the toolkit here:
Below is a screenshot of the web site. All the dark boxes are the options that I selected. I like the local rpm option because that installs all CUDA rpms in a local repo that you can then install as you need.
Download it. Even though it says F23, it still works fine on F25.
The day I downloaded it, 8.0.44 was the current release. Today it's different. So, I'll continue by using my version numbers and you'll have to make the appropriate substitutions. Let's continue the setup as root...
This installs a local repo of cuda developer rpms. The repo is located in /var/cuda-repo-8-0-local/. You can list the directory to see all the rpms. Let's install the core libraries that are necessary for Deep Learning:
Next, we need to make sure that the utilities provided, such as the GPU software compiler, nvcc, are in our path and that the libraries can be found. The easiest way to do this is by creating a bash profile file that gets included when you start a shell.
Edit /etc/profile.d/cuda.sh (which is a new file you are creating now):
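A minimal sketch of what such a profile script could contain is below. It assumes the toolkit was installed under /usr/local/cuda-8.0, which is my assumption rather than something stated in this post, and the variable name cuda_dir is made up for illustration. Adjust the path to wherever nvcc actually landed on your system.

```shell
# Sketch of a profile script for CUDA.
# ASSUMPTION: the toolkit lives in /usr/local/cuda-8.0; change this if
# your install put it somewhere else.
cuda_dir=/usr/local/cuda-8.0

# Put nvcc and the other CUDA utilities on the search path.
export PATH="$PATH:$cuda_dir/bin"

# Let the dynamic linker find the CUDA libraries. The ${VAR:+...} form
# avoids a stray leading colon when LD_LIBRARY_PATH starts out empty.
export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$cuda_dir/lib64"

# Don't leave the helper variable behind in every login shell.
unset cuda_dir
```

After opening a new shell, running "command -v nvcc" should print the compiler's location if the path is right.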
The reason CUDA is aimed at F23 rather than F25 is that Nvidia is not testing against the newest gcc. So, they put something in the headers to make it fail.
I spoke with people from Nvidia at the GTC conference about why they don't support new gcc. Off the record they said they do extensive testing on everything they support and that it's just not something they developed with when creating CUDA 8, but newer gcc will probably be supported in CUDA 9.
It's easy enough to fix by altering the one line in the header that tests the gcc version. Since we have gcc-6.3, we can change the header so that it only fails for gcc 7 or later. To do this:
On line 119 change from:
#if __GNUC__ > 5
to:
#if __GNUC__ > 6
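If you would rather script that one-line change than open an editor, here is a rough sketch using sed. To keep it safe to try, it works on a scratch file rather than the real header: the file name scratch_header.h and the #error text are placeholders I made up, and only the #if line comes from the instructions above.

```shell
# Work on a scratch file so nothing under the CUDA install is touched.
# ASSUMPTION: scratch_header.h and the #error message are stand-ins.
workdir=$(mktemp -d)
header="$workdir/scratch_header.h"
printf '%s\n' \
    '#if __GNUC__ > 5' \
    '#error placeholder: unsupported gcc version' \
    '#endif' > "$header"

# Raise the threshold from 5 to 6 so that gcc 6.x passes the check.
sed -i 's/^#if __GNUC__ > 5$/#if __GNUC__ > 6/' "$header"

# Show the edited file.
cat "$header"
```

On the real header you would probably want to keep a backup (GNU sed's -i.bak does that) and limit the substitution to line 119 by prefixing the s command with 119.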
This will allow things to compile with current gcc. There is one more thing that we need to fix in the headers so that Theano can compile GPU code later. The error looks like this:
math_functions.h(8901): error: cannot overload functions distinguished by return type alone
This is because gcc also defines the function, and its definition conflicts with the one Nvidia ships. The solution as best I can tell is simply to:
and around lines 8897 and 8901 you will find:
/* GCC 6.1 uses ::isnan(double x) for isnan(double x) */
__DEVICE_FUNCTIONS_DECL__ __cudart_builtin__ int isnan(double x) throw();
__DEVICE_FUNCTIONS_DECL__ __cudart_builtin__ constexpr bool isnan(long double x);
__DEVICE_FUNCTIONS_DECL__ __cudart_builtin__ constexpr bool isinf(float x);
/* GCC 6.1 uses ::isinf(double x) for isinf(double x) */
__DEVICE_FUNCTIONS_DECL__ __cudart_builtin__ int isinf(double x) throw();
__DEVICE_FUNCTIONS_DECL__ __cudart_builtin__ constexpr bool isinf(long double x);
What I did was comment out the two lines that immediately follow the comments about gcc 6.1, that is, the "int isnan(double x) throw();" and "int isinf(double x) throw();" declarations.
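The same edit can be sketched with sed. This is only an illustration on a scratch copy: the file name scratch_math_functions.h is made up, and the scratch file holds just the declarations next to the two gcc 6.1 comments, not the whole header.

```shell
# Build a scratch file holding the declarations around the two comments.
# ASSUMPTION: scratch_math_functions.h is a stand-in, not the real header.
workdir=$(mktemp -d)
scratch="$workdir/scratch_math_functions.h"
cat > "$scratch" <<'EOF'
/* GCC 6.1 uses ::isnan(double x) for isnan(double x) */
__DEVICE_FUNCTIONS_DECL__ __cudart_builtin__ int isnan(double x) throw();
__DEVICE_FUNCTIONS_DECL__ __cudart_builtin__ constexpr bool isnan(long double x);
/* GCC 6.1 uses ::isinf(double x) for isinf(double x) */
__DEVICE_FUNCTIONS_DECL__ __cudart_builtin__ int isinf(double x) throw();
__DEVICE_FUNCTIONS_DECL__ __cudart_builtin__ constexpr bool isinf(long double x);
EOF

# For each "GCC 6.1 uses" comment, step to the next line (n) and
# prefix it with "//" so the compiler ignores that declaration.
sed -i '/GCC 6.1 uses/{n;s|^|//|}' "$scratch"

# Show the result: two declarations are now commented out.
cat "$scratch"
```

Only the line directly after each comment is touched, so the long double overloads stay active.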
OK. Next we need to fix the cuda install paths just a bit. As root:
One of the goals of this blog is to explore Deep Learning. You will need the cuDNN libraries for that. So, let's put that in place while we are setting up the system. For some reason this is not shipped in an rpm and this leads to a manual installation that I don't like.
You'll need cuDNN version 5. Go to:
To get this you have to have a membership in the Nvidia Developer Program. It's free to join.
Look for "Download cuDNN v5 (May 27, 2016), for CUDA 8.0". Get the Linux one. I moved it to /var/cuda-repo-8-0-local. Assuming you did, too...as root:
To verify the setup, we will build some of the sample programs shipped with the toolkit. I had you install them quite a few steps ago. The following instructions assume that you have used my recipe for an rpm build environment. As a normal user:
When it's done (and hopefully it's successful):
You should get something like:
You can also check the device bandwidth as follows:
You should see something like:
At this point you are done. I will refer back to these instructions in the future. If you see anything that is wrong or needs updating, please comment on this article.