
Downsampling operators

20160041 Member Posts: 6 Learner I
Hi,
Could you please tell me how I can achieve downsampling with imbalanced data in RM? I have used the random sampling and sampling bootstrap operators, and I would also like to know the difference between the two.
Thank you

Best Answers

  • rfuentealba RapidMiner Certified Analyst, Member, University Professor Posts: 568 Unicorn
    Solution Accepted
    Hi,

    In the Mannheim Toolbox extension, there is a Sample - Balance operator that does just this.

    (Opinions and fundamental techniques aside, you might want to work with weighting instead of sampling.)

    All the best,

    Rodrigo.
  • Telcontar120 RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    Solution Accepted
    I second the idea of weighting, which is my preferred approach. Downsampling should be used primarily when you have many more cases than needed (either in general, or specifically of the majority class). There are diminishing returns to larger and larger samples, so if your development population is hundreds of thousands of cases, then you likely don't need them all. But if the absolute number of minority-class cases is small, then you probably don't want to downsample the majority class to match it, as too much information would be lost.
    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
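
For readers who want to see the ideas in this thread outside of RapidMiner, here is a minimal Python sketch using only the standard library. It is not how any RapidMiner operator is implemented. The function names (downsample_majority, bootstrap_sample, balancing_weights) and the 90/10 toy data are assumptions made up for illustration. The sketch contrasts plain random sampling (each row drawn at most once, i.e. without replacement) with bootstrap sampling (rows drawn with replacement, so duplicates can occur), and shows one common weighting heuristic, n / (k * class_count), as an alternative to discarding majority rows.

```python
import random
from collections import Counter, defaultdict


def downsample_majority(rows, label_of, seed=0):
    """Cap every class at the size of the smallest class.

    Rows are drawn WITHOUT replacement, so no row appears twice.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row in rows:
        groups[label_of(row)].append(row)
    cap = min(len(group) for group in groups.values())
    sampled = []
    for group in groups.values():
        # rng.sample never returns the same element twice.
        sampled.extend(rng.sample(group, cap))
    return sampled


def bootstrap_sample(rows, size, seed=0):
    """Draw `size` rows WITH replacement; the same row may repeat."""
    rng = random.Random(seed)
    return [rng.choice(rows) for _ in range(size)]


def balancing_weights(labels):
    """Per-class weights n / (k * count): each class ends up with the
    same total weight, and no rows are thrown away."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * count) for cls, count in counts.items()}


if __name__ == "__main__":
    # Hypothetical toy data: 90 majority rows, 10 minority rows.
    rows = [("neg", i) for i in range(90)] + [("pos", i) for i in range(10)]

    balanced = downsample_majority(rows, label_of=lambda r: r[0])
    print(Counter(label for label, _ in balanced))

    boot = bootstrap_sample(rows, size=len(rows))
    print(len(boot), "rows drawn,", len(set(boot)), "distinct")

    print(balancing_weights([label for label, _ in rows]))
```

With the toy data, downsampling keeps only 20 of the 100 rows, which is the information loss mentioned above, whereas the weights keep all 100 rows and simply make each minority row count for more.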