Running Deep Learning extension with CUDA 10.2?

jacobcybulski · April 2020

I can see that the new version of the Deep Learning extension requires CUDA 10.0. However, the new TensorFlow, which I also use on my system, requires CUDA 10.1+ and also runs with the newest version, CUDA 10.2. The release notes for the extension suggest contacting RM for assistance. As it is, the preferences for the GPU/CPU switch complain about my CUDA version. I imagine I may need to set up a multi-CUDA system on my Ubuntu 18.04? Or is there some easy tweak to run the extension with the newer version of CUDA?

jczogalla · April 2020

Hi @jacobcybulski,

we currently rely on CUDA 10.0, so a multi-CUDA setup might be a possibility.

We are also currently working on the next version, which would rely on 10.2, but the release date is not clear yet.

Cheers

Jan

pschlunder · April 2020

Hey @jacobcybulski

you can find a version built against CUDA 10.2 and cuDNN 7.6 here:

https://rapidminer-my.sharepoint.com/:u:/p/jczogalla/Eb6AEfElolJHjrT6lUNt7xQBGu2d4NmCRsKMRJ5rdwKXiw?e=fFXP25

(The link is only valid until May 14th. If you need the extension and the link has expired, please point it out and we'll update it.)

You can place the downloaded jar in your .RapidMiner/extensions folder. Once we release 0.9.4, it should be used automatically since it's a newer version.

Another option would be to also install 10.0 and set the CUDA environment variable to the 10.0 version for the environment you're using RapidMiner in.

Hope this helps,

Philipp
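
A minimal sketch of the second option (keeping CUDA 10.0 installed next to a newer toolkit and selecting it per shell). It assumes the toolkits sit in versioned directories such as /usr/local/cuda-10.0 with the usual bin and lib64 subdirectories. The helper name `use_cuda` and the `CUDA_BASE` override are hypothetical, and `CUDA_HOME` is only one plausible reading of "the CUDA environment variable"; the post does not say which variable the extension reads.

```shell
# Hypothetical helper: select one of several side-by-side CUDA toolkits
# for the current shell. Assumes versioned install directories like
# /usr/local/cuda-10.0 (CUDA_BASE can override the /usr/local prefix).
use_cuda() {
    uc_version="$1"
    uc_home="${CUDA_BASE:-/usr/local}/cuda-$uc_version"
    if [ ! -d "$uc_home" ]; then
        echo "CUDA $uc_version not found at $uc_home" >&2
        return 1
    fi
    # Prepend so this toolkit is found before any other CUDA install.
    export CUDA_HOME="$uc_home"
    export PATH="$uc_home/bin${PATH:+:$PATH}"
    export LD_LIBRARY_PATH="$uc_home/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
}
```

Running `use_cuda 10.0` and then starting RapidMiner from the same shell would hand these settings to the JVM, under the assumptions above.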

jacobcybulski · April 2020

@jczogalla I have got a workaround! When you export LD_LIBRARY_PATH and PATH settings pointing to /usr/local/cuda within RapidMiner.sh, miraculously it is then possible to switch from CPU to GPU, and Deep Learning operators actually execute on a GPU!

I have tried to set these environment variables in /etc/profile and /etc/environment, but it made no difference. Perhaps there is some global setting for the JVM?
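
The workaround might look roughly like the lines below, added to RapidMiner.sh before the JVM is started. Only /usr/local/cuda is named in the post; the bin and lib64 subdirectories follow the usual CUDA toolkit layout and are an assumption, as is the use of a `CUDA_HOME` variable.

```shell
# Assumed additions to RapidMiner.sh, placed before the JVM is launched.
# /usr/local/cuda comes from the post; bin and lib64 are the usual
# toolkit subdirectories and are an assumption here.
CUDA_HOME="${CUDA_HOME:-/usr/local/cuda}"

# Prepend the CUDA locations, keeping whatever was already set.
export PATH="$CUDA_HOME/bin${PATH:+:$PATH}"
export LD_LIBRARY_PATH="$CUDA_HOME/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

Setting the variables inside the launcher script means they reach the JVM however RapidMiner is started.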

MartinLiebig · April 2020

Hi @jacobcybulski ,

I think the build we have on the marketplace requires a specific CUDA version. We may be able to provide a custom build, right @pschlunder ?

Cheers,

Martin

jacobcybulski · April 2020

Thanks @jczogalla and @mschmitz , I think I may need to reorganise my libraries to use a multi-CUDA setup.

Jacob

jacobcybulski · April 2020

Hi @pschlunder , this would be fantastic! However, the link to rapidminer-my.sharepoint.com is not public so I cannot download it. If you could change its access to anyone this would be great. Thanks. Jacob

pschlunder · April 2020

Oh, sorry! Updated.

jacobcybulski · April 2020

@pschlunder , thanks a lot - I have downloaded the JAR file and will be playing with it. I've dropped it into .RapidMiner/extensions and it seems to be recognised. However, I am still having issues with the GPU. When I peeked into .RapidMiner/extensions/workspace/rmx_deeplearning, I could see the 0.9.4 snapshots for the CPU backend and the libs, but the GPU backend is still version 0.9.0 (in the .javacpp cache there is only a CPU backend). Perhaps the GPU backend gets compiled only when the GPU option is happily accepted? Or is it only a CPU-compiled snapshot? The RM error on switching to the GPU backend is still that it is looking for CUDA 10.0.

Jacob

jczogalla · April 2020

Hi @jacobcybulski

I think this might now be a problem with how your path is set up. You are correct to assume that the GPU backend is only extracted/installed when it finds the correct CUDA version. Make sure that your path contains the CUDA 10.2 location and that it is before any other CUDA references in the path. I'm not sure about other environment variables in Linux that Java might pick up about libraries...

Cheers
Jan
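
One way to check the ordering described above is to list every CUDA-related entry of a search path in lookup order. This is only a sketch: the function names `cuda_entries` and `first_cuda_entry` are hypothetical, and matching on the substring "cuda" assumes the toolkit directories carry that name.

```shell
# Print every entry of a colon-separated search path that mentions
# "cuda", in the order the entries are searched.
# Usage: cuda_entries "$PATH"   or   cuda_entries "$LD_LIBRARY_PATH"
cuda_entries() {
    printf '%s\n' "$1" | tr ':' '\n' | grep -i 'cuda'
}

# Print only the first CUDA entry, i.e. the one that is found first.
first_cuda_entry() {
    cuda_entries "$1" | head -n 1
}
```

If `first_cuda_entry "$PATH"` does not print the CUDA 10.2 location, an older toolkit is being found first.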

jacobcybulski · April 2020

Thanks @jczogalla , I'll have to play with this. Interestingly, nvidia-smi finds it all just perfectly.

jczogalla · April 2020

Yeah, I guess they have some better heuristics for that

jacobcybulski · April 2020

Hi @jczogalla , it seems there is nothing I can do to make the Deep Learning extension switch to GPU, in either version of the extension. I removed all my NVIDIA drivers, CUDA and cuDNN libraries, cleaned the system and installed only CUDA 10.0 with cuDNN 7.4 as required. When switching to GPU I am always told it failed, which leaves me with only one conclusion: that the RM Educational License is considered free for the purpose of running with GPU?

In case my conclusion is incorrect, I include an observation here. The CUDA toolkit may report a different version than the NVIDIA driver, which comes with its own CUDA libraries. NVIDIA driver 415 comes with CUDA 10.0, 418 with CUDA 10.1 and 440 with CUDA 10.2. These versions are reported by nvidia-smi, irrespective of which CUDA version is currently installed in /usr/local/cuda and pointed to by $PATH and $LD_LIBRARY_PATH; that version is reported by nvcc. So I ensured that all sources of system information on my Ubuntu 18.04 tell the same story. Here it is:

jacob@goblin-galore:~$ nvidia-smi
Thu Apr 23 16:12:41 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 415.27       Driver Version: 415.27       CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  Off  | 00000000:17:00.0 Off |                  N/A |
|  0%   33C    P8    10W / 280W |      2MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 108...  Off  | 00000000:65:00.0  On |                  N/A |
|  0%   61C    P0    66W / 280W |    248MiB / 11175MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
| GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    1      1777      G   /usr/lib/xorg/Xorg                           167MiB |
|    1      3121      G   /usr/bin/gnome-shell                          79MiB |
+-----------------------------------------------------------------------------+

jacob@goblin-galore:~$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130

jacob@goblin-galore:~$ cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 4
#define CUDNN_PATCHLEVEL 2
--
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)

#include "driver_types.h"
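
The three checks above can be reduced to bare version numbers for a side-by-side comparison. The sketch below assumes the output formats shown in this post; the function names are hypothetical, and each function reads the relevant command output or header file on standard input.

```shell
# CUDA version reported by the driver, read from `nvidia-smi` output.
driver_cuda_version() {
    sed -n 's/.*CUDA Version: \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# CUDA toolkit release, read from `nvcc -V` output.
toolkit_cuda_version() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# cuDNN version as major.minor.patch, read from the contents of cudnn.h.
cudnn_version() {
    awk '$1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
         $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
         $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
         END { if (major != "") print major "." minor "." patch }'
}
```

Typical use would be `nvidia-smi | driver_cuda_version`, `nvcc -V | toolkit_cuda_version` and `cudnn_version < /usr/local/cuda/include/cudnn.h`, and then comparing the results with the versions the extension expects.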

I am using RM 9.6 with Deep Learning 0.9.3 which gives me the same error as in the new 0.9.4 snapshot:

Error while switching to GPU backend. Either CUDA 10.0 is not installed or you have a free license. Check the log for more information.

Any ideas?

Jacob

jczogalla · April 2020

Hm. I am not sure about the license. Afaik, an educational license does not count as a free license.

Regarding the error message, in the 0.9.4 snapshot version, we simply forgot to adjust the message to show 10.2 instead of 10.0...
Can you provide your Studio log file? You can share it via PM if you like. I'm not sure how good the logging is, but maybe we can see something there.

Other than that I am not sure why it would not work, since this version did work for other people before, but that might have been on Windows machines.

jacobcybulski · April 2020

Unfortunately, I cannot test it on my Windows machine with GPUs as it is locked in my office at work, to be opened only after COVID-19 goes away. I'll dig out the logs though, as I am very keen on getting it right!

jczogalla · April 2020

Hi @jacobcybulski
That's great to hear! I think the problem here is the special handling of the LD library path on Linux systems. There might be global JVM settings, but changing those might hurt other Java programs. And yes, you would have to touch the RapidMiner.sh file because there is no other way to put that in there.
We'll make a note and think about a way to provide the CUDA path as a setting, similar to what we do with Python.
