
Problem with an imbalanced dataset

hanaabdalrahman Member Posts: 9 Learner III
edited December 2018 in Help

Hello, I am new to data mining and RapidMiner, and I have a problem with an imbalanced data set. I work with Decision Tree, Naïve Bayes, and Random Forest. The accuracy of DT and NB is very good, but it does not reflect real performance. My question is: what are the best operators to use with these three techniques? My data set contains 1031 samples.

 

hana mohamed

student

Answers

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    @hanaabdalrahman you will need to use the Sample operator and toggle on the 'balance data' option. Then enter the classes and the number of samples to draw for each class.
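
For readers who want to see what per-class balanced sampling amounts to, here is a minimal Python sketch. It is not the RapidMiner Sample operator; the function `balanced_sample`, its parameters, and the toy rows are hypothetical, and the class sizes are taken from the asker's later reply and assumed to be exact.

```python
import random
from collections import defaultdict

def balanced_sample(rows, label_of, per_class, seed=0):
    """Draw up to per_class[label] rows from each class, without replacement."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row in rows:
        by_class[label_of(row)].append(row)
    sample = []
    for label, wanted in per_class.items():
        pool = by_class.get(label, [])
        # Never ask for more rows than the class actually has.
        sample.extend(rng.sample(pool, min(wanted, len(pool))))
    rng.shuffle(sample)
    return sample

# Toy data: 986 "true" rows and 44 "false" rows (assumed sizes).
rows = [("true", i) for i in range(986)] + [("false", i) for i in range(44)]
balanced = balanced_sample(rows, lambda r: r[0], {"true": 44, "false": 44})
```

Sampling the majority class down to 44 rows balances the classes but discards most of the data, which is one reason oversampling is often considered for a set this small.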

  • MartinLiebig Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,533 RM Data Scientist

    Hi Hana,

    I recommend using the SMOTE operator, which is part of the Operator Toolbox extension.

     

    Best,

    Martin

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
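
The core idea behind SMOTE, creating synthetic minority examples by interpolating between a minority point and one of its nearest minority neighbours, can be sketched in a few lines of Python. This is an illustrative sketch only, not the Operator Toolbox implementation; the function `smote`, its parameters, and the sample points are hypothetical, and it assumes purely numeric attributes and at least two minority points.

```python
import math
import random

def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    point and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Neighbours are searched within the minority class only.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic

# Hypothetical minority-class points with two numeric attributes.
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote(minority, n_new=6, k=2)
```

Because each synthetic point lies on a segment between two real minority points, the new examples stay inside the region the minority class already occupies instead of being exact duplicates.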
  • hanaabdalrahman Member Posts: 9 Learner III

    Thanks for the reply...

    But how do I use it? The class false has only 44 samples and the class true has about 986.
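
These class counts explain why the reported accuracy looks good without being meaningful. The short Python sketch below works through the arithmetic for a model that always predicts true, assuming the counts are exactly 986 and 44 and treating true as the positive class; the helper `metrics` is hypothetical.

```python
def metrics(tp, fn, tn, fp):
    """Return accuracy, recall per class, and balanced accuracy."""
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    recall_pos = tp / (tp + fn)
    recall_neg = tn / (tn + fp)
    balanced_accuracy = (recall_pos + recall_neg) / 2
    return accuracy, recall_pos, recall_neg, balanced_accuracy

# A model that always answers "true": all 986 true rows are right,
# all 44 false rows are wrong (counts assumed exact).
acc, rec_true, rec_false, bal = metrics(tp=986, fn=0, tn=0, fp=44)
print(f"accuracy={acc:.3f} recall_false={rec_false:.1f} balanced={bal:.1f}")
# prints: accuracy=0.957 recall_false=0.0 balanced=0.5
```

An accuracy near 96% is therefore reachable without detecting a single false example, so per-class recall or balanced accuracy is a more informative measure here.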

     

     

  • hanaabdalrahman Member Posts: 9 Learner III

    Thanks...

    I work with version 8.0.001, and this operator is not available in it. What is the best alternative, and how does it work?

     

  • hanaabdalrahman Member Posts: 9 Learner III

    Thanks very much. The upsampling operator solved the problem.
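
The thread does not say which upsampling operator was used or how it was configured. As a rough illustration of the general technique, the Python sketch below does plain random upsampling: it duplicates minority rows, drawn with replacement, until every class matches the largest one. The function `upsample` and the toy rows are hypothetical, and the class sizes are assumed to be exactly 986 and 44.

```python
import random

def upsample(rows, label_of, seed=0):
    """Duplicate rows of smaller classes (with replacement) until every
    class has as many rows as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(label_of(row), []).append(row)
    target = max(len(group) for group in by_class.values())
    result = []
    for group in by_class.values():
        result.extend(group)  # keep every original row
        result.extend(rng.choices(group, k=target - len(group)))
    rng.shuffle(result)
    return result

# Toy data with the class sizes mentioned in this thread (assumed exact).
rows = [("true", i) for i in range(986)] + [("false", i) for i in range(44)]
upsampled = upsample(rows, lambda r: r[0])
```

Upsampling is normally applied to the training data only. If duplicated rows end up in both the training and the test split, the measured performance will look better than it really is.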
