SMOTE Upsampling Operator With Multi-Label Classification

Nawaf Member Posts: 16 Learner I
Hi!
 I wanted to ask whether the SMOTE Upsampling operator can be used with multi-label classification. If so, how? If not, which alternative operator can handle imbalanced classes?

Best Answer

  • MartinLiebig Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,533 RM Data Scientist
    Solution Accepted
    Good question. Both ways are feasible and can be successful. One thing to keep in mind: if you use tree-based models such as a random forest (RF), the additional examples from upsampling allow "deeper trees", simply because there are more examples. You thus get a very different tree.

    Best,
    Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
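
The two options being compared in this thread are upsampling with SMOTE and tuning the decision threshold. As a rough illustration of the threshold option, here is a minimal Python sketch, not RapidMiner's own operators. The helper names `f1_at` and `best_threshold`, the choice of F1 as the metric, and the scores and labels are all assumptions made up for this example.

```python
def f1_at(scores, labels, threshold):
    """F1 of the positive class when predicting 1 for score >= threshold."""
    tp = fp = fn = 0
    for score, label in zip(scores, labels):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 1:
            fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(scores, labels):
    """Try every observed score as a cut-off and keep the one with the highest F1."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: f1_at(scores, labels, t))


# Made-up validation scores for one rare label (3 positives out of 10).
scores = [0.05, 0.10, 0.15, 0.20, 0.22, 0.30, 0.35, 0.40, 0.45, 0.60]
labels = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]

threshold = best_threshold(scores, labels)
print("F1 at default 0.5:", round(f1_at(scores, labels, 0.5), 3))
print("best threshold:", threshold, "F1:", round(f1_at(scores, labels, threshold), 3))
```

The threshold should be chosen on validation data rather than on the test set. In a multi-label setup with one binary model per label, the same search would presumably be repeated for each label.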

Answers

  • MartinLiebig Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,533 RM Data Scientist
    Hi @nawaf,
    Sure. You just apply it #classes - 1 times to bring all classes to the same level.

    Best,
    Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • David_A Administrator, Moderator, Employee-RapidMiner, RMResearcher, Member Posts: 297 RM Research
    Hi @Nawaf,

    You could simply run SMOTE multiple times, once for each minority class. Afterwards you have an upsampled data set in which all classes are balanced. Of course, this is only really feasible when the number of classes is not too high.

    Best,
    David
  • Nawaf Member Posts: 16 Learner I
    Thanks, folks, for your responses! The difference in the number of examples between class 0 and class 1 (using binary classification) is very high, as is typical for a multi-label classification problem. So do you think finding the best threshold is better than applying SMOTE?
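
The approach suggested in the answers above, running SMOTE once per minority class until every class matches the largest one, can be sketched in plain Python. This is only an illustration under stated assumptions, not the RapidMiner SMOTE Upsampling operator: `smote_like` is a simplified stand-in that interpolates between an example and one of its nearest same-class neighbours, and `balance_all_classes`, the parameter `k`, and the toy data are hypothetical names and values chosen for the example.

```python
import random
from collections import Counter


def smote_like(points, n_new, k=3, rng=None):
    """Create n_new synthetic points by interpolating between a randomly
    chosen point and one of its k nearest neighbours from the same class."""
    rng = rng or random.Random(0)
    if len(points) < 2:
        raise ValueError("need at least two examples to interpolate")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(points)
        # Candidate neighbours: every other point of this class, nearest first.
        others = [p for p in points if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position on the segment between base and neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


def balance_all_classes(X, y, k=3, seed=0):
    """Upsample each non-majority class in turn up to the majority class size."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        if count == target:
            continue  # already at the majority level, nothing to add
        members = [x for x, lab in zip(X, y) if lab == label]
        new_points = smote_like(members, target - count, k=k, rng=rng)
        X_out.extend(new_points)
        y_out.extend([label] * len(new_points))
    return X_out, y_out


# Toy data with three classes of sizes 6, 3 and 2.
X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.2, 0.4), (0.4, 0.2),
     (5.0, 5.0), (5.2, 5.1), (5.1, 4.8),
     (9.0, 0.0), (9.3, 0.4)]
y = ["a"] * 6 + ["b"] * 3 + ["c"] * 2

X_bal, y_bal = balance_all_classes(X, y)
print(Counter(y_bal))
```

With three classes, two of them are upsampled, which matches the "#classes - 1" rule of thumb. Each pass only looks at examples of a single class, so the cost grows with the number of classes.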