Running concurrent processes
Can anyone confirm or deny that RM 5.3 is capable of running multiple processes concurrently?
I am integrating RM in a Java project and am having an issue when attempting to run multiple RM Processes concurrently. Here is an overview of what I am doing:
1. Main thread: initialize RapidMiner
2. Create a runnable that executes a RM Process via a File
3. Submit 20 instances of this runnable to a thread pool
4. The 20 runs each return different results in the resulting IOContainer <-- THIS IS THE PROBLEM
NOTES:
- The Process contains only static data.
- The Process produces the same result on every run when running it serially.
- The Process uses the Series extension (rmx_series-5.3.0.jar)
- The Runnables do not share any data.
"a warning: Be careful with static references. RM 6 will probably allow multiple processes opened, and RapidAnalytics certainly does."
http://rapid-i.com/rapidforum/index.php/topic,2917.msg11719.html#msg11719
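The setup in the numbered steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name `ConcurrentRunCheck` and the helper `distinctResults` are made up for this sketch, and the task is a stand-in that returns a constant so the code runs without RapidMiner on the classpath. In the real project the task body would load the process file and run it.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ConcurrentRunCheck {

    /** Submits `runs` copies of the task to a fixed pool and returns the distinct results. */
    public static <T> Set<T> distinctResults(Callable<T> task, int runs, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<T>> futures = new ArrayList<>();
            for (int i = 0; i < runs; i++) {
                futures.add(pool.submit(task));
            }
            Set<T> distinct = new HashSet<>();
            for (Future<T> future : futures) {
                distinct.add(future.get()); // blocks until that run has finished
            }
            return distinct;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in task (assumption): the real one would execute the RM process
        // from its file and return some comparable summary of the IOContainer.
        Callable<String> task = () -> "same-result";

        Set<String> distinct = distinctResults(task, 20, 4);
        System.out.println(distinct.size() == 1
                ? "all runs agree"
                : "runs disagree: " + distinct.size() + " distinct results");
    }
}
```

With a task that is truly independent of shared state, the set holds exactly one element. More than one element for a process that is deterministic when run serially points at state shared between the runs.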
Answers
RapidMiner 5.3 is not capable of that.
Regards,
Marco
Most of the new frontiers in training systems (like finding interesting bits in huge problem spaces) depend on how ubiquitous parallel processing has become.
To really access RM's power on machines with 4-16+ cores, when working on embarrassingly parallel problems, clean process separation seems an absolute necessity. I haven't dug deep enough into the code to see exactly where objects (iterators, RNGs, etc?) are being shared (assuming that is the issue), but I would like to express how useful it would be to clean up those dependencies in a future release of RapidMiner.
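To make the suspicion about shared objects concrete, here is a small, self-contained sketch of how one static object can break otherwise independent runs. It is not taken from the RapidMiner code base: the class `SharedStateDemo`, its methods, and the choice of a shared `java.util.Random` are all assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SharedStateDemo {

    // Hypothetical JVM-wide generator, standing in for any static shared object.
    private static final Random SHARED = new Random();

    /** Reseeds the shared generator and draws from it; repeatable only if no other thread uses SHARED meanwhile. */
    public static long runWithSharedRng() {
        SHARED.setSeed(42L);
        long sum = 0;
        for (int i = 0; i < 100_000; i++) {
            sum += SHARED.nextInt(1000);
        }
        return sum;
    }

    /** Same computation, but each call owns its generator, so calls cannot disturb each other. */
    public static long runWithOwnRng() {
        Random own = new Random(42L);
        long sum = 0;
        for (int i = 0; i < 100_000; i++) {
            sum += own.nextInt(1000);
        }
        return sum;
    }

    /** Runs the task `runs` times on a 4-thread pool and returns the distinct results. */
    public static Set<Long> distinctResults(Callable<Long> task, int runs) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (int i = 0; i < runs; i++) {
                futures.add(pool.submit(task));
            }
            Set<Long> distinct = new HashSet<>();
            for (Future<Long> future : futures) {
                distinct.add(future.get());
            }
            return distinct;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("shared generator: "
                + distinctResults(SharedStateDemo::runWithSharedRng, 20).size() + " distinct result(s)");
        System.out.println("own generator:    "
                + distinctResults(SharedStateDemo::runWithOwnRng, 20).size() + " distinct result(s)");
    }
}
```

Called one after another, both methods return the same value every time, which matches the "fine when run serially" symptom. Under the thread pool, the shared variant usually reports several distinct results because the threads reseed and draw from the same generator in an unpredictable order, while the per-call variant always reports one.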
Think of me at 2 AM, grasping for that breakthrough, the sense of achievement and pride welling up within me as the points of data plot a beautiful line... and having to fall asleep hurt, distraught and confused in the fetal position. Why RapidMiner, why?! I thought you were my friend!
Ahem. Anyway... right now I have to do some evil things, like starting up separate JVMs depending on how many cores are on the machine. It's cave-man threading. Ungabunga! Somehow I know we can do better!
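For anyone wanting to try the same separate-JVM workaround, a minimal sketch using only the standard library might look like this. The worker class `com.example.ProcessWorker` and the `process.rmp` argument are placeholders, not real names from my project; the worker is assumed to be a small `main` class that initializes RapidMiner and runs one process.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class JvmPerCoreLauncher {

    /** Builds the command line for one worker JVM, reusing this JVM's java binary and classpath. */
    public static List<String> workerCommand(String mainClass, String... workerArgs) {
        String javaBin = System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";
        List<String> command = new ArrayList<>();
        command.add(javaBin);
        command.add("-cp");
        command.add(System.getProperty("java.class.path"));
        command.add(mainClass);
        command.addAll(Arrays.asList(workerArgs));
        return command;
    }

    public static void main(String[] args) throws Exception {
        int cores = Runtime.getRuntime().availableProcessors();

        // Note: Process here is java.lang.Process (an OS process), not an RM process.
        List<Process> workers = new ArrayList<>();
        for (int i = 0; i < cores; i++) {
            // Hypothetical worker class and arguments; replace with your own.
            ProcessBuilder builder = new ProcessBuilder(
                    workerCommand("com.example.ProcessWorker", "process.rmp", String.valueOf(i)));
            builder.inheritIO(); // forward worker output to this console
            workers.add(builder.start());
        }

        // Wait for every worker and report its exit code.
        for (int i = 0; i < workers.size(); i++) {
            int exitCode = workers.get(i).waitFor();
            System.out.println("worker " + i + " exited with code " + exitCode);
        }
    }
}
```

Each worker gets its own heap and its own static state, so nothing can leak between runs. The cost is JVM startup time and memory per worker, plus having to pass results back through files or standard output.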
If you have any ideas on where the problem(s) might be in more concrete terms, I am certainly all ears and would happily dive into the code to try and help solve the mystery...
The RapidMiner 5 design was done quite some years ago, at a time when multiple processes and multi-threading in general were not yet that big a deal (sadly).
We will, however, rectify these shortcomings in the future.
Regards,
Marco