ALL FEATURE REQUESTS HERE ARE MONITORED BY OUR PRODUCT TEAM.
VOTING MATTERS!
IDEAS WITH HIGH NUMBERS OF VOTES (USUALLY ≥ 10) ARE PRIORITIZED IN OUR ROADMAP.
NOTE: IF YOU WISH TO SUGGEST A NEW FEATURE, PLEASE POST A NEW QUESTION AND TAG AS "FEATURE REQUEST". THANK YOU.
Add a native Rank operator to RapidMiner Studio
Telcontar120
RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
Several recent threads have asked how to calculate ranks in RapidMiner. There is a Rank operator in the old, unsupported (and somewhat buggy) Finance & Economics extension, but it is hard to recommend that solution, especially to newer users. The alternative using native RapidMiner operators is cumbersome and complex for something as conceptually simple as a rank calculation. It would be much easier if RapidMiner simply added a native Rank operator to the basic data ETL toolkit.
Comments
Dortmund, Germany
A more sophisticated version would also provide options for the sort direction (ascending vs. descending), for handling tied values (assign the lowest rank, the highest rank, or the midpoint rank), and for either replacing the original attribute or adding a new attribute with the rank value.
This is conceptually similar to assigning a percentile value to every example. There are many contexts in which this is a useful transformation, such as many non-parametric calculations, or using rank values rather than raw values as predictors in models to eliminate scale effects (e.g., from outliers) while preserving ordinality.
All of this can be done manually in RapidMiner today, but it requires a daisy chain of related operators (e.g., Generate Copy, Sort, Generate ID) that would be nice to combine into one simple operator.
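To make the requested behavior concrete, here is a minimal sketch in plain Python (not RapidMiner code) of what such an operator might compute. The function name `rank_values` and the parameters `descending` and `ties` are hypothetical, chosen only for illustration; the three tie modes correspond to the lowest, highest, and midpoint options described above. It mirrors the manual chain: sort, number the sorted rows, then map the numbers back to the original order.

```python
from typing import List, Sequence


def rank_values(
    values: Sequence[float],
    descending: bool = False,
    ties: str = "min",
) -> List[float]:
    """Return the 1-based rank of each value, in the original input order.

    ties: "min" (lowest rank), "max" (highest rank), or "average" (midpoint).
    All names here are illustrative assumptions, not an existing API.
    """
    if ties not in ("min", "max", "average"):
        raise ValueError("ties must be 'min', 'max', or 'average'")

    # "Sort" step: indices of the values in rank order.
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=descending)

    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Find the end of the current group of tied values.
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1

        # "Generate ID" step: positions start + 1 .. end + 1 belong to this group.
        lowest, highest = start + 1, end + 1
        if ties == "min":
            group_rank = float(lowest)
        elif ties == "max":
            group_rank = float(highest)
        else:
            group_rank = (lowest + highest) / 2

        # Write the rank back at each value's original position.
        for k in range(start, end + 1):
            ranks[order[k]] = group_rank
        start = end + 1
    return ranks


print(rank_values([10, 20, 20, 30], ties="average"))  # [1.0, 2.5, 2.5, 4.0]
```

Returning a separate list corresponds to the "add a new attribute" option; overwriting the input column would be the "replace" option.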
Lindon Ventures
Data Science Consulting from Certified RapidMiner Experts
P.S. I'd like a percentile operator for exactly the same reason! Once again, it can be done manually using a Loop and operators similar to the ones above, with the additional complexity of calculating the percentile value from the raw rank value.
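For illustration, the extra step from raw rank to percentile could look like the sketch below, again in plain Python rather than RapidMiner. It assumes one common convention, percentile = 100 × rank / n, with ranks already computed in ascending order; other conventions exist (for example, ones based on rank − 1), and the name `ranks_to_percentiles` is hypothetical.

```python
from typing import List, Sequence


def ranks_to_percentiles(ranks: Sequence[float]) -> List[float]:
    """Convert 1-based ranks to percentile ranks on a 0-100 scale.

    Assumes the convention percentile = 100 * rank / n, where n is the
    number of examples. This is one of several definitions in use.
    """
    n = len(ranks)
    if n == 0:
        return []
    return [100.0 * r / n for r in ranks]


# Midpoint ranks for the values 10, 20, 20, 30.
print(ranks_to_percentiles([1.0, 2.5, 2.5, 4.0]))  # [25.0, 62.5, 62.5, 100.0]
```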
Lindon Ventures
Data Science Consulting from Certified RapidMiner Experts
I think needing four or more operators for one frequent transformation is reason enough to put this into a single operator. I will create a ticket for the operator toolbox, and we will have to see how best to fit it in. If you have further details on how the operator should work or what options it should provide, feel free to post them. The more detail the better.
Best regards,
Fabian
I realized you could actually have a single operator handle both raw ranks and percentile ranks, with another option to control the output format (rank vs. percentile rank).
Lindon Ventures
Data Science Consulting from Certified RapidMiner Experts