parallelization and CPU optimization
sgenzer
With RM 7.3's big improvement on Cross-Validation performance, I would like to suggest that RM parallelize and/or optimize CPU performance on:
1) k-means clustering (on a 6-core machine I still only see use of 1 core)
2) Decision Tree
3) Process Documents from Data (Text Processing extension)
4) Loop (all the variations)
5) Branch and Select Subprocess
Scott
Comments
Hi Scott,
Great suggestions, thanks a lot. I can already confirm that some of this is in the making as we speak.
Let me ask a few clarifying questions:
2) Decision Tree (and Random Forest) has had a parallel implementation since RapidMiner 6.2. Based on our tests, it is on par with some of the fastest tree learner implementations. Can you name specific circumstances (e.g., many nominal attributes) where you feel the execution speed is not great?
3) Process Documents from Data: this operator was significantly sped up in version 7.2.1 of the Text Processing extension, which was released a few weeks ago. Have you had a chance to test that? Do you still feel that it is too slow?
Thanks, Zoltan
Good morning Zoltan,
I may have spoken too soon about the Decision Tree: I have not benchmarked it recently to see whether it is indeed using multiple cores. Yes, I am usually using Decision Tree with a ton of nominal attributes.
As for Process Documents from Data, this is what I was doing yesterday and yes, I can confirm that it is only using 1 core. It is slow. I was watching it spin for a long time while simultaneously watching my gorgeous 6-core processor being underutilized.
Thanks!
Scott
OK, Decision Tree is indeed cranking up CPU usage.
Scott