"Leaking Memory Bug"
Hi @ all!
First of all: I'm sorry that this post has become a bit long, but I would rather have it said.
RapidMiner has suffered from inexplicably increasing memory consumption for some years and versions (e.g. http://rapid-i.com/rapidforum/index.php/topic,472.0.html or http://rapid-i.com/rapidforum/index.php/topic,1911.0.html). Lately I've been running into this problem: I've been running some parameter optimizations with inner cross-validations. The memory usage increased over the hours until I got an OutOfMemoryError.
I didn't understand this behaviour, as systematically trying parameter combinations should not create more and more objects. As I ran from the command line, it was not a (direct) GUI issue. I used no breakpoints. As I used some custom-made operators, I started debugging to find my own error. I found none. But I found a memory leak in RapidMiner itself (surely it's not the holy grail, but hopefully some insight). I didn't file this as a bug as it is more of a design flaw (no offense meant).
The "offending" classes are really basic: SimpleAttributes and AttributeRole. To illustrate the problem, let me show you what happens when a SimpleExampleSet is created (the same happens when one is cloned):
- a new SimpleAttributes object is created
- each attribute is wrapped into an AttributeRole object
- these AttributeRoles are added to the SimpleAttributes object; while adding:
  - the AttributeRole is stored in a list (a field of SimpleAttributes)
  - the SimpleAttributes is registered as owner of the AttributeRole; while registering:
    - the SimpleAttributes is added to a list (a field of AttributeRole)
So we have an AttributeRole referencing a SimpleAttributes object and this SimpleAttributes object referencing the same AttributeRole.
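To make the structure concrete, here is a minimal Java sketch of the two back-pointing lists. The classes `Attrs` and `Role` are hypothetical stand-ins for SimpleAttributes and AttributeRole, and none of the method names are taken from the RapidMiner source. It only models the references described in the steps above.

```java
import java.util.ArrayList;
import java.util.List;

public class CircularReferenceSketch {

    // Hypothetical stand-in for SimpleAttributes: keeps a list of its roles.
    static class Attrs {
        final List<Role> roles = new ArrayList<>();

        void add(Role role) {
            roles.add(role);      // Attrs -> Role
            role.addOwner(this);  // Role -> Attrs, which closes the cycle
        }
    }

    // Hypothetical stand-in for AttributeRole: wraps an attribute (here just
    // its name) and remembers every container that owns it.
    static class Role {
        final String attributeName;
        final List<Attrs> owners = new ArrayList<>();

        Role(String attributeName) {
            this.attributeName = attributeName;
        }

        void addOwner(Attrs owner) {
            owners.add(owner);
        }
    }

    // Mirrors the creation steps: one container, one role per attribute.
    static Attrs build(String... attributeNames) {
        Attrs attrs = new Attrs();
        for (String name : attributeNames) {
            attrs.add(new Role(name));
        }
        return attrs;
    }

    public static void main(String[] args) {
        Attrs attrs = build("a1", "a2", "label");
        Role first = attrs.roles.get(0);
        // The role points back at the very container that holds it.
        System.out.println(first.owners.get(0) == attrs); // prints "true"
    }
}
```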
This circular reference can be broken by
A) removing the Attribute(Role) from the SimpleAttributes
B) clearing all Attribute(Role)s
C) removing the ownership
A and C are never used, B only seldom [according to Eclipse->Open Call Hierarchy]. So every SimpleExampleSet contains a reference to a SimpleAttributes object that (via its AttributeRoles) references itself. Now imagine this SimpleExampleSet is not referenced anymore (for example after being used inside an IteratingChain). The GarbageCollector finalizes the SimpleExampleSet but can never(!) free the SimpleAttributes, as it is referenced by several AttributeRoles, and never(!) free the AttributeRoles, as they are referenced by the SimpleAttributes. Each time a SimpleExampleSet is cloned (with almost every iteration of any ValidationChain or ParameterOptimization), a new SimpleAttributes object and new AttributeRoles are created. Both object types accumulate in the heap until it is full. This can be checked in Eclipse: show all instances of SimpleAttributes after some iterations.
Unfortunately I have no idea how to solve this problem. Both references are needed. Perhaps some AttributeOwnership object could be introduced to eliminate the circular reference. But this would require some deep changes in RM...
This is now open for discussion. Maybe I've missed something.
Best regards,
chero
Answers
Thanks for this analysis, it is highly appreciated! I really consider this a very valuable contribution from the community, and it shows the strength of open source. I put this on our task board immediately, and it won't vanish from it until it is fixed. This looks like a high-priority issue. We'll keep you informed on this board.
Cheers,
Simon
Well, it seems I was slightly wrong. I tried to create an example process for checking this bug, and I nearly failed. It seems the cloning itself is not the problem. The GC seems to be able to resolve circular references (of course it must be; how would doubly linked lists work otherwise?). Nevertheless, there are processes where the number of SimpleAttributes and AttributeRole objects explodes.
Here is an example process for this phenomenon: Eliminating either the ReplaceMissingValues operator or the ModelGrouper -- or both -- makes the behaviour disappear. I have no idea why this is the case. In any case, the memory contains only one KnnRegressionModel, one ValueReplenishmentModel and one GroupedModel.
[Edit:] --- (forget about THAT)
Additionally, in the first fold of the first X-Validation, in both learning and testing, 11 additional AttributeRoles and two SimpleAttributes are created (in addition to those mentioned above). No idea why this happens.
Best regards.
I found it! Several things are (not) working together.
a) Each time a PredictionModel is applied a new Prediction Attribute is created. This attribute is added to the SimpleAttributes object of the ExampleSet and to the ExampleTable.
b) After each X-Validation step the prediction labels are removed
So what normally happens in Validation is as follows:
- the original ExampleSet E referencing an ExampleTable T and some SimpleAttributes object SA is cloned
- the cloned ExampleSet E' references the same ExampleTable T as E but has its own SimpleAttributes SA'
- this ExampleSet is split for learning and validation
In testing:
- a formerly present prediction is saved
... (X)
- a prediction attribute P is created
- P is added to T
- P is wrapped in an AttributeRole R
- R is added as prediction to SA' (making R and P reference SA' as owner)
...
- if there is a new prediction label and there was an old prediction label, the prediction label is removed from the ExampleSet and the ExampleTable
- after the validation loop E' is discarded and freed by GC
This works fine and is no problem. But: (X) is where the evaluation of the test subprocess starts. Anything can happen there. For example, the ExampleSet in use could be cloned (e.g. by some preprocessing operator). Then the new prediction label is added to the cloned ExampleSet. The ExampleSet the ValidationChain sees isn't changed, so the prediction attribute isn't discarded from the MemoryTable.
So: The original ExampleSet E is still referenced (e.g. by the input port in use)! E in turn references T, T references all created P, each P references its SA', and each SA' also references all(!) cloned AttributeRoles, which in turn reference all cloned Attributes. QED 8)
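The reference chain can be checked with a small reachability walk, which is roughly what the mark phase of a tracing garbage collector does. The following Java sketch is only a model: `Node`, `reachable` and `scenario` are hypothetical helpers, and the node names (inputPort, E, T, SA, SA', R, P) are just the labels used in this post, not RapidMiner classes. It shows that the SA'/R/P cycle alone is unreachable once nothing points into it, but becomes reachable as soon as T still holds P.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class ReachabilitySketch {

    // Hypothetical node of an object graph; refs are outgoing strong references.
    static class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }

        Node ref(Node target) {
            refs.add(target);
            return this;
        }
    }

    // Collects the names of all nodes reachable from the given roots.
    // Anything not in the result would be eligible for collection.
    static Set<String> reachable(Node... roots) {
        Set<Node> seen = new LinkedHashSet<>();
        Deque<Node> todo = new ArrayDeque<>(Arrays.asList(roots));
        while (!todo.isEmpty()) {
            Node n = todo.pop();
            if (seen.add(n)) {
                todo.addAll(n.refs);
            }
        }
        Set<String> names = new LinkedHashSet<>();
        for (Node n : seen) {
            names.add(n.name);
        }
        return names;
    }

    // Builds the graph from the post; the flag models whether the prediction
    // attribute P was left behind in the shared table T.
    static Set<String> scenario(boolean predictionLeftInTable) {
        Node port = new Node("inputPort");
        Node e = new Node("E");
        Node t = new Node("T");
        Node sa = new Node("SA");
        Node saClone = new Node("SA'");
        Node r = new Node("R");
        Node p = new Node("P");

        port.ref(e);              // the input port still holds E
        e.ref(t).ref(sa);         // E references T and its own SA
        saClone.ref(r);           // SA' holds the role R
        r.ref(saClone).ref(p);    // R points back to SA' and wraps P
        p.ref(saClone);           // P references SA' as owner
        if (predictionLeftInTable) {
            t.ref(p);             // P was never removed from T
        }
        return reachable(port);
    }

    public static void main(String[] args) {
        // Cycle with no path from a root: collectible.
        System.out.println(scenario(false).contains("SA'")); // prints "false"
        // Same cycle, but T still references P: everything stays alive.
        System.out.println(scenario(true).contains("SA'"));  // prints "true"
    }
}
```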
Hope you have fun with that,
chero
Just to prove my analysis, I'll add a "bug fix":
I think it is not possible to handle this problem inside the ValidationChain operator. Therefore a new operator eliminating the prediction label is needed, like this: I've tried the operator -- half of the overhead is eliminated. So my "theory" is correct, but only half the truth.
Best regards,
chero
thanks for investigating this further. I hope we can come up with a solution without a custom operator soon.
Cheers,
Simon