Saturday, August 5, 2017

Theano Deep Learning Framework on Fedora 25

 [This article is rich in hyperlinks. I chose these to do a better job of explaining things than I can normally do. Please visit them.]

A few articles ago we covered Torch 7 and how to set it up. There are several other frameworks that are important each having advantages in one area or another. Its important to have access to all of them because you never know when a killer app lands on any one of them. Today we will show how to setup theano. Theano is one of the older frameworks and takes a unique approach to GPU acceleration. When you run a program that uses GPU acceleration, it generates and compiles CUDA code based on what your program describes.


THEANO
In the last article about AI, I mentioned that you can setup an account specifically to run AI programs. This is because most of the frameworks install things to your home directory. Sometimes they want versions of things that clash with other frameworks. Sounds like a classic use case for containers. But I wanted to set this up on bare metal so let's dive in.

Theano is python based. It typically wants things that are newer than the system python libraries. So, I'll show you how to set all this up. If you want to create a new ai account, go ahead and do that and/or log in under the account you want to set this up in.

The first step is to download miniconda which is a scaled back version of anaconda which is a package installer used by Continuum Analytics. (There is some overlap in names with Anaconda the Fedora and Red Hat package installer. They are not the same.) They have lots of scientific computing packages ready to install. Look over this list to get a feel for it.

To install miniconda, do this:

wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh


Click through the license and accept the default locations.

Once it's done, source .bashrc to update your variables.

source ~/.bashrc

Now, let's install theano. First we need to tell it where our CUDA libraries are installed. If you need information on how to setup a CUDA development environment, see this blog post.

export CUDA_ROOT="/usr/local/cuda"
conda install theano pygpu python=3

This will download and install theano for python 3 and all its dependencies. Contiuum is shipping python 3.6 which is ahead of Fedora's python 3.5. Next create a .theanorc file in the homedir. In it, put this:

[cuda]
root = /usr/local/cuda

[nvcc]
flags = -std=c++11


This fixes the nvidia compiler to not choke on the gcc/glibc headers and in a more permanent way where to find the CUDA environment. It is also important at this point that you have fixed /usr/local/cuda/include/math_functions.h as I explained in the article about setting up your CUDA development environment. Theano is the one that chokes on that bad code.

Next, we should test the setup to see if it works. We will start with the bottom layer, pygpu. If this is not working, then something went wrong and nothing else will work. I took the following from this article: http://blog.mdda.net/oss/2015/07/07/nvidia-on-fedora-22. You don't have to make this a program. Just use the python shell.

$ python3
>>> import pygpu
>>> pygpu.init('cuda0')


If its working, you should see
<pygpu.gpuarray.GpuContext object at 0x7f1547e79550

Good. Let's exit.

>>> quit()


Now let's test theano itself. The idea here is to make sure it works with simple apps before you jump into a complex AI program and then find trouble. Let's make a program. Copy this into a file we'll call gpu_check.py in the homedir.


from theano import function, config, shared, sandbox
import theano.tensor as T
import numpy
import time

vlen = 10 * 30 * 768  # 10 x #cores x # threads per core
iters = 1000

rng = numpy.random.RandomState(22)
x = shared(numpy.asarray(rng.rand(vlen), config.floatX))
f = function([], T.exp(x))
print(f.maker.fgraph.toposort())
t0 = time.time()
for i in range(iters):
    r = f()
t1 = time.time()
print('Looping %d times took' % iters, t1 - t0, 'seconds')
print('Result is', r)
if numpy.any([isinstance(x.op, T.Elemwise) for x in f.maker.fgraph.toposort()]):
    print('Used the cpu')
else:
    print('Used the gpu')


We will run 2 tests. One to check that the CPU is working and one to see that the GPU is working.


# THEANO_FLAGS=mode=FAST_RUN,floatX=float32,device=cpu python3 gpu_check.py
# THEANO_FLAGS=mode=FAST_RUN,floatX=float32,device=gpu python3 gpu_check.py


When you test the gpu, if you see an errors like:

miniconda3/lib/python3.6/site-packages/numpy/core/include/numpy/ndarraytypes.h(84): error: expected a "}"
/home/ai3/miniconda3/lib/python3.6/site-packages/numpy/core/include/numpy/ndarraytypes.h(446): error: identifier "NPY_NTYPES_ABI_COMPATIBLE" is undefined
...
WARNING (theano.sandbox.cuda): CUDA is installed, but device gpu is not available  (error: cuda unavailable)
...
Used the cpu


This is normal the first time. You need to edit ~/miniconda3/lib/python3.6/site-packages/numpy/core/include/numpy/ndarraytypes.h.

On line 84, put the whole NPY_ATTR_DEPRECATE line in comments /* */ including the ending comma, save, and retest.

When you see:

Using gpu device 0: GeForce GTX 1080 Ti (CNMeM is disabled, cuDNN 5005)
...
Used the gpu


you are ready for theano...

Next blog post I'll show you something really cool that uses theano.

No comments: