
What is the expected accuracy of the CNN tutorial?

Friedemann Member, University Professor Posts: 27
edited April 2021 in Help
I have loaded the tutorial process and fed in data from here:
(42000 jpeg images in 10 folders)

The process runs fine (approx. 45 minutes with CPU and about 35 minutes with GPU). If I use the training set as the test set as well, I reach an accuracy of only about 9.97%, because every image is classified as a 9. Am I doing something wrong, or is there something wrong with the Deep Learning extension?
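The 9.97% figure is itself the tell-tale: a model that degenerates into predicting the same class for every image scores exactly that class's share of the data, which is roughly 1/10 on MNIST. A minimal sanity-check sketch (toy labels, not the actual MNIST class distribution):

```python
# Sanity check (assumption: roughly balanced labels, as in MNIST):
# a degenerate model that outputs the same class for every instance
# achieves an accuracy equal to that class's share of the data.
from collections import Counter

def constant_classifier_accuracy(labels, predicted_class):
    """Accuracy of a model that always predicts `predicted_class`."""
    counts = Counter(labels)
    return counts[predicted_class] / len(labels)

# Toy label set: 10 classes, each making up 10% of the instances.
labels = [i % 10 for i in range(1000)]
print(constant_classifier_accuracy(labels, 9))  # 0.1
```

So an accuracy near the per-class frequency, combined with a single predicted class, points to a collapsed model rather than a merely weak one.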

Btw, classifying the CSV version of the images using two fully connected layers (plus output layer) reaches an accuracy of about 97%.

Update: Using the full data set for both training and testing of the non-CNN net, an accuracy of 99.91% is reached (execution time 7:45 min with GPU).

Cheers
Friedemann

Best Answer

  • Friedemann Member, University Professor Posts: 27
    edited April 2021 Solution Accepted
    Sorry, wrong thread! Please ignore!

    Question has been answered in another thread:

    In a nutshell:
    You have to specify the input shape manually! The default is set to "automatic". When automatic mode is switched off, a number of parameters become available that specify how to map the data onto the tensor.
    Important: This approach assumes that the data is stored in the CSV as a sequence of rows, with a single line per instance. Multi-channel data is represented as a "sequence" of complete instances per channel on the same line, indicated by the "depth" parameter of the input shape.
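To make that row-to-tensor mapping concrete, here is a small plain-Python sketch (the function name and sizes are illustrative, not the extension's API): each CSV line holds `depth` complete `height` x `width` images laid end to end, and the manual input-shape parameters tell the operator how to cut the line up.

```python
def row_to_tensor(row, depth, height, width):
    """Reshape a flat list of depth*height*width values into a nested
    [channel][row][col] structure. The line is read channel-major:
    `depth` complete images stored one after another."""
    assert len(row) == depth * height * width
    tensor = []
    for c in range(depth):
        channel = []
        offset = c * height * width
        for y in range(height):
            start = offset + y * width
            channel.append(row[start:start + width])
        tensor.append(channel)
    return tensor

# One CSV line holding a 2x2 "image" with 3 channels (e.g. RGB):
row = [1, 2, 3, 4,   5, 6, 7, 8,   9, 10, 11, 12]
t = row_to_tensor(row, depth=3, height=2, width=2)
print(t[0])  # first channel: [[1, 2], [3, 4]]
```

For a 28x28 grayscale MNIST CSV this would mean depth=1, height=28, width=28, i.e. 784 values per line.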

Answers

  • lionelderkrikor RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
    Hi @Friedemann, Hi dear community,

    Several weeks ago, I observed something very similar when performing time series classification with the Deep Learning extension, using an LSTM layer inside the Deep Learning (Tensor) operator.
    The input of the process is a collection of time series with a label taking one of 6 classes.
    Like @Friedemann, I used the training set as the test set, and the model systematically predicted the same single class for every instance.

    I should mention that I performed the same classification task in a Python notebook (using Keras/TensorFlow), and there I get around 60% accuracy...!
    The process is in the attached file (it is basically the same process as the "ICU mortality classification" process in the Deep Learning samples folder).
    I can share the data on request if you want to reproduce what I observed with this process and understand what is going on.

    Regards,

    Lionel   
  • pschlunder Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, RMResearcher, Member Posts: 96 RM Research
    Thank you for reaching out. As you said, this does not sound correct.
    I've tried to reproduce the error, but I couldn't. When I test the tutorial process from the "Add Convolutional Layer" operator with the trainingSample folder (containing 61 samples per label) from the MNIST jpg export you shared, and use this sample both for training and testing, I get an accuracy of around 88%.
    Hence I'd like to learn more about your setup:
    • Which versions of the Deep Learning and the ND4J Back-End extensions are you using?
    • Can you maybe share the exact process you've used for testing?
    Regards,
    Philipp
  • pschlunder Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, RMResearcher, Member Posts: 96 RM Research
    Thank you for reaching out as well :) Can you maybe share the data set with me? You've got a PM regarding the data set. I've checked the process and, apart from the normalization being applied too early (which might be a relic of an incorrect old tutorial process), I've not seen any obvious error.

    Regards,
    Philipp
  • Friedemann Member, University Professor Posts: 27
    edited April 2021
    Hi Philipp,

    I have attached the process I am using - just the tutorial process with the data directories specified. I am using RM 9.9 rev 0f5626 Platform WIN64 (full version info in the second attachment).
    Deep Learning: 1.1.2
    ND4J: 1.0.0
    Image Handling: 0.2.1
    I am using the full training set (42,000 images).

    Cheers
    Friedemann

  • Friedemann Member, University Professor Posts: 27
    edited April 2021
    Update: Tried the tutorial with the training sample and got an accuracy of 45.33% when using the training data as test data - not promising either.
    I noticed that RM uses about 19 GB of main memory on startup (blank process). Switching off CUDA does not really change the picture. Is that intended?

    Update: I have disabled all extensions and RM still uses almost 19 GB of main memory.


  • Friedemann Member, University Professor Posts: 27
    I think I have found the reason for the problem. After restricting the memory of RM to 12,000 MB and setting the mini batch size to 2,000, the model now reached an accuracy of 69% (full training set). It seems that it was running out of memory, because the JVM allocates almost 19 GB at startup and the machine has only 32 GB (a few other things are running in parallel). The system monitor now shows a memory utilization of 9.7 GB for the JVM and 8.4 GB for the "off-heap memory" (whatever that means). 5.3 GB of GPU memory is used.
  • pschlunder Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, RMResearcher, Member Posts: 96 RM Research

    The Deep Learning extension does use memory outside the JVM (off-heap) for its calculations, that's correct. After setting the JVM memory available for RapidMiner Studio, you can specify the maximum off-heap memory via the RapidMiner Studio settings, using the "Backend" tab:
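For intuition on where that out-of-JVM memory goes: the raw input tensors themselves are comparatively small, so most of the off-heap footprint comes from the backend's workspaces, activations and gradients rather than from the mini-batch data. A rough back-of-envelope sketch for the input side only (assuming 28x28 grayscale images stored as 32-bit floats; these sizes are assumptions, not measured values):

```python
def batch_bytes(batch_size, height=28, width=28, channels=1, bytes_per_value=4):
    """Bytes occupied by the raw input tensor of one mini-batch
    (assumes float32 values; ignores activations, gradients and
    backend workspace overhead, which usually dominate)."""
    return batch_size * channels * height * width * bytes_per_value

mb = batch_bytes(2000) / 2**20
print(f"{mb:.1f} MiB for a 2000-image mini-batch")  # ~6.0 MiB
```

So a 2,000-image mini-batch of MNIST inputs is only a few MiB; if memory pressure changes the results, it is the surrounding buffers, not the batch itself, that are filling the off-heap allocation.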


  • Friedemann Member, University Professor Posts: 27
    Further tests showed a disappointing result. If I decrease the batch size from 2,000 or increase the number of epochs, the accuracy gets worse and I end up with an accuracy of 10% as before. There are no error messages about out-of-memory conditions or anything similar. So, to get back to the original question: what is the expected accuracy of the tutorial process with the MNIST training set? Frankly, I doubt that the CNN layer works correctly.