Serious Memory Leak
To the Rapid Miner development team:
There is a very serious memory leak in Version 5.1. I am reading in a large CSV file (900,000 rows). The system monitor shows memory usage slowly increasing, as expected. But when the process finishes and a new process is started, the memory usage starts at the same level where it was when the first process ended, and the second process then crashes due to lack of memory!
I have tested this with the Windows performance monitor as well, which confirmed that the memory was not being released when the process ended.
I am using the "Free Memory" operator, which seems to have no effect.
The only way to run the second process is to restart RapidMiner!
Please correct this error as soon as possible!
Thanks!
Answers
I have had something similar with v5.1. When running LoopAttributes with a single GenerateAttribute operator inside, it runs out of memory and seizes up after fewer than 200 iterations (new attributes). The dimensionality of each example vector is 28 (reals) and the total number of example vectors is 20,000, so I cannot see any cause for lack of memory: my machine has 8 GB of RAM, no other applications are running, and the Xms parameter for Java is set to 6 GB.
ChrisI
Worse, it doesn't release when the job is over even with a Free Memory box as the last step of the job. That means if I run another job immediately afterward, it will fail due to insufficient memory. I'd be happy to provide more information if someone can tell me what is needed to troubleshoot this issue.
Uwe
I think several different problems are described in this thread. The memory staying claimed probably does no harm: RapidMiner just does some background calculations (e.g. updating the memory monitor), and since it does not need the memory, garbage collection is not triggered. As soon as the memory is needed, it will be cleared.
Just a guess: did you leave the results view open? In that case the data also stays in memory. The Free Memory operator only triggers garbage collection explicitly, which frees data that is not needed for anything. That can speed things up later, but it does not free any memory that would not be freed automatically anyway, so it won't solve any out-of-memory problems. Please try closing the results tab before running the second process. If that helps, we are done; if not, we will have a look at it.
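Marius's point about the Free Memory operator can be illustrated at the JVM level. Below is a minimal Java sketch (not RapidMiner code; the byte array merely stands in for a loaded ExampleSet): an explicit `System.gc()` call, which is essentially what Free Memory triggers, can never reclaim data that is still strongly referenced, e.g. by an open results view.

```java
import java.lang.ref.WeakReference;

public class GcHint {
    public static void main(String[] args) {
        byte[] data = new byte[8 * 1024 * 1024];      // stands in for a loaded ExampleSet
        WeakReference<byte[]> ref = new WeakReference<>(data);

        System.gc();                                  // explicit hint, like Free Memory
        System.out.println(ref.get() != null);        // true: the strong reference keeps it alive

        data = null;                                  // analogous to closing the results view
        System.gc();                                  // only now is the array eligible
        // ref.get() may now return null; actual collection is not guaranteed by the spec.
    }
}
```

In other words, as long as the results view holds a reference to the data, no amount of explicit garbage collection can release it.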
Please check that RapidMiner can really access that much memory. If not, please try the Xmx option instead of Xms.
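One way to verify that the JVM really has access to the configured memory is to query `Runtime` directly. This is a hedged sketch (the class name `HeapCheck` is made up; flag values are examples only) that prints the effective heap limit set by -Xmx:

```java
public class HeapCheck {
    public static void main(String[] args) {
        // Run with e.g. `java -Xmx6g HeapCheck` to see whether the flag took effect.
        Runtime rt = Runtime.getRuntime();
        long maxMb   = rt.maxMemory()   / (1024 * 1024); // upper bound set by -Xmx
        long totalMb = rt.totalMemory() / (1024 * 1024); // heap currently claimed from the OS
        long freeMb  = rt.freeMemory()  / (1024 * 1024); // unused part of the claimed heap
        System.out.println("max=" + maxMb + "MB total=" + totalMb + "MB free=" + freeMb + "MB");
    }
}
```

If `max` is far below the memory you intended to give RapidMiner, the -Xmx setting is not reaching the JVM.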
@all: please let us know if your problems persist.
Best,
Marius
I have checked the RapidMiner memory use via the System Monitor in the Results screen. With Xms set to 6GB it frequently clocks 5.2 GB.
Chris.
I run a loop on an ExampleSet with 20,000 vectors (examples), each vector made up of 8 integers which I subsequently convert to reals.
The loop grinds to a halt at 218 iterations, at which point the memory usage reads 4.2 GB at maximum. Either I am doing something stupid or there is something weird going on.
How can I get the xml data and ExampleSet to you?
Kindest Regards,
ChrisI
Referring to my posting on the looping problem, I have now managed to read Marius' instructions on posting.
Here is the xml:
It now uses the Generate Data operator instead of Read CSV, so anyone can paste and run it.
I kept an eye on the amount of memory RapidMiner was using:
idle memory usage (used by the system, not RapidMiner): 2.0 GB
after starting RapidMiner: 2.6 GB
after loading and running the process: 2.8 GB
after pressing run one more time: 2.9 GB
after pressing run 5 more times: 3.1 GB
after pressing run 5 more times: 3.2 GB
after pressing run 5 more times: 3.4 GB
after pressing run 5 more times: 3.6 GB
after pressing run 5 more times: 3.7 GB
after pressing run many more times: 6.7 GB
after pressing run many more times: 7.4 GB
after pressing run many more times: 8.2 GB
http://img1.uploadscreenshot.com/images/orig/2/4106595534-orig.jpg
edit: if you wish I can try to do the same thing on Ubuntu linux and on a machine with even more memory.
Best regards,
Wessel
Tried using the GenerateData operator instead of ReadCSV, just in case something was confounding the issue. No change.
The machine locks up, indicating 5.8 GB of memory used.
Using MaterializeData and FreeMemory operators inside the loop keeps the memory consumption down, BUT the execution speed is then totally unacceptable.
???
ChrisI
What I found out is the following: the JVM claims a lot of system memory quite quickly and almost never frees it. Internally, however, the memory actually used (and not just claimed) by RapidMiner is cleaned up between or during process runs.
As a test process I used the process posted above with 1000 examples.
Running the same process with 20,000 examples probably does not work, since with 1000 examples it already needs about 1 GB of memory (this is probably improvable, and certainly will be improved in the future). At least the memory is correctly cleaned up (inside the JVM) between process runs, and RapidMiner does not run out of memory as long as the example sets are reasonably sized.
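Marius's claimed-vs-used distinction can be observed directly through `Runtime`. The following sketch (hypothetical class name `ClaimedVsUsed`; the byte array simulates a process run) shows the used heap rising during an allocation and dropping after a GC, while the claimed heap reported by `totalMemory()` typically stays where it was:

```java
public class ClaimedVsUsed {
    // Used heap = claimed heap minus the unused part of it, in MB.
    static long usedMb() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
    }

    public static void main(String[] args) {
        long before = usedMb();
        byte[] big = new byte[64 * 1024 * 1024];  // simulate a process run holding data
        long during = usedMb();

        big = null;                                // the run ends, data becomes garbage
        System.gc();                               // between-run cleanup
        long after = usedMb();

        System.out.println(before + " MB -> " + during + " MB -> " + after + " MB");
        // 'during' exceeds 'before'; 'after' usually drops back down, yet the
        // claimed heap (Runtime.totalMemory()) is generally not returned to the OS.
    }
}
```

This matches what the system monitor shows: the OS sees the claimed heap, which stays high, even though the JVM-internal usage drops between runs.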
Best, Marius
Judging by your description, this should work.
What I do now, if I need more memory, is simply to close and restart RapidMiner.
It is a pity, but c'est la vie I guess. :-\
I have tried using the GenerateProduct operator on the same ExampleSet and it works quickly without a hitch so far.
ChrisI
Around 17:05 I see the amount of memory claimed by the JVM go down.
For example:
That looks great! It would have a huge impact on RM's looping capabilities, which seem(!) to be an Achilles' heel at the moment.
I have started using GenerateProduct, and with careful thinking it looks as if GenerateAggregation may also help me; otherwise I shall just output suitable files to R and get things done there.
Anyway, I hope your fix comes out soon.
Kindest Regards,
ChrisI
It seems not much progress has been made on the RM performance front since the last post here.
Please have a look at http://rapid-i.com/rapidforum/index.php/topic,5385.0.html
thx
f