RapidMiner 7.5 memory increase
Hi guys,
I have a data set of ~244,000 rows and I am training a support vector machine with a polynomial kernel, using a pre-specified grid (Loop Parameters) and 5-fold cross validation. I have noticed that these analyses take a long time to complete, as did one I ran with a neural network. I tried Z-transformation, PCA, and sampling, but I still need to use the whole data set for my churn prediction, which takes ages. I have 12 GB of RAM, and the resource monitor says it only uses 1.6 GB while it will 'use up to 11GB'. Why doesn't it use the full memory? Does the programme have a limitation? If so, can it be lifted somehow? I would appreciate any help.
Answers
Hi,
thanks to a great effort in the first iteration of making the data core of RapidMiner Studio more effective in 7.5, it will both run faster and use less memory for the vast majority of cases compared to previous versions.
In other words, in your case memory is not the limiting factor; it's the CPU speed. Your memory is not restricted, but if the process does not need more memory, why should it be used? If an algorithm required more memory, it would take it.
Depending on the operators you use, more CPU cores could make a dramatic difference. For example, a 10-fold Cross Validation can make use of up to 11 cores (depending on parameters) to run all 10 folds and the overall model building in parallel. In that case, the number of CPU cores is restricted by your license (and obviously your hardware).
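To make the "up to 11 cores" arithmetic concrete, here is a rough Python sketch. It is only a back-of-the-envelope model, not how RapidMiner schedules work internally: the function names are made up for illustration, and it assumes every model build takes about the same time and that each core handles one build at a time.

```python
import math


def cv_task_count(folds: int, final_model: bool = True) -> int:
    """Independent model builds in one k-fold cross validation:
    one per fold, plus (optionally) the overall model on all data."""
    return folds + (1 if final_model else 0)


def sequential_rounds(tasks: int, cores: int) -> int:
    """Rounds of work needed when each core runs one task at a time
    and all tasks are assumed to take equally long."""
    return math.ceil(tasks / cores)


tasks = cv_task_count(10)
print(tasks)  # 11

for cores in (1, 4, 11):
    # 1 core -> 11 rounds, 4 cores -> 3 rounds, 11 cores -> 1 round
    print(cores, sequential_rounds(tasks, cores))
```

Under these assumptions, going from 1 core to 4 cuts the cross validation from 11 rounds to 3, which is why the core count matters far more here than the amount of free memory.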
Regards,
Marco Böck
Hi Mr. Bock,
Thank you for your reply. I already thought it had something to do with CPU. I have 4 cores active, and it says that it uses 20% of processing power. Is CPU usage limited then due to the license? I have an educational license.
Regards,
Jovan Gligorevic
Hi Jovan,
The educational license only uses 1 logical processor (see the table here: https://rapidminer.com/educational-program/ ). If you have 4 cores, this typically results in 8 logical processors thanks to hyperthreading ( https://en.wikipedia.org/wiki/Hyper-threading ). Out of these 8 logical processors, you only use 1.
Only our commercial license supports parallel computation using multiple cores. See the table here for comparison: https://rapidminer.com/pricing/
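The logical-processor arithmetic above can be checked with a few lines of Python. This is just an illustrative sketch: the helper names are hypothetical, and the default of 2 threads per core assumes hyperthreading is enabled, which is not true of every CPU.

```python
import os


def logical_processors(physical_cores: int, threads_per_core: int = 2) -> int:
    """Logical processors, assuming every core exposes the same number of threads."""
    return physical_cores * threads_per_core


def single_processor_share(logical: int) -> float:
    """Percentage of total CPU capacity represented by one busy logical processor."""
    return 100.0 / logical


logical = logical_processors(4)
print(logical)                          # 8
print(single_processor_share(logical))  # 12.5

# What the operating system reports on the machine running this script
# (can be None if the count cannot be determined).
print(os.cpu_count())
```

So with 4 hyperthreaded cores, a process pinned to a single logical processor can account for roughly 12.5% of the total CPU capacity on its own; anything beyond that in the system monitor comes from other activity on the machine.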
Best,
Ingo
Hi @jahagirdar_adna, for RM Studio there is a Resource Monitor built into the application. Go to View -> Show Panel -> Resource Monitor. It shows the memory usage for your process. I also use my computer's normal Activity Monitor all the time.
To be honest, if you're pushing your resources to a point where this is a problem, you're probably better off (a) getting more resources, or better still (b) going to RM Server where you can monitor and scale your resources MUCH better than in Studio (which is really a prototyping env).
Scott