
Multi Class Labels

piroozman Member Posts: 1 Learner I
edited December 2019 in Help
Hi, I'm working with a data set of about 1,220 samples whose label has more than two categories. I want to train and test about four predictive models, such as Decision Tree and Naive Bayes, and I need the confusion matrix and ROC curves for each of them. However, ROC can only be applied to a label with two categories. Here are my questions:
1- How can I use a cross-validation operator to train and test the models?
2- How can I produce ROC curves for these models?
I have read some articles about data whose label has more than two categories (for example, search for "ROC for Multiclass Implementation in Rapid Miner kaggle").
I even used the XML of the above sample to learn from, but in the end I couldn't solve my problem. In fact, I don't know which operators would be useful here.
Thanks for your answer.

Best Answer

  • Telcontar120 RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    Solution Accepted
    If you follow the tutorials for the Cross Validation operator, you should be able to configure a process that uses your desired ML algorithms and generates the confusion matrix without any problem, since the confusion matrix supports multi-class labels.
    However, for the ROC curve things are more complicated. The best way to handle ROC for multi-class problems is the "one vs. all others" approach. If you have 3 classes, you recode your label as 1 vs. not-1, then run a predictive model and produce the ROC curve (since it is now a two-class problem, this is easily done using the Performance (Binominal) operator). Then you recode the label as 2 vs. not-2 and repeat, and then as 3 vs. not-3 and repeat again. Unfortunately this needs to be done manually; there is no single operator in RapidMiner that will handle all this recoding for you. Finally, you can take the resulting ROC AUC values and average them or analyze them in some other way.

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
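
For readers who want to check the arithmetic behind the "one vs. all others" idea outside RapidMiner, here is a minimal Python sketch. It is not a RapidMiner process and not the answerer's own method. It assumes you have exported, for every test example, the true label and one confidence score per class. The function names `binary_auc` and `one_vs_rest_auc` and the toy data are made up for illustration. The AUC of each recoded two-class problem is computed with the rank-sum identity (ties get average ranks), and the per-class values are then averaged without weighting.

```python
def binary_auc(is_positive, scores):
    """ROC AUC of a two-class problem via the rank-sum (Mann-Whitney U) identity."""
    n_pos = sum(is_positive)
    n_neg = len(is_positive) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative example")

    # Rank the scores in ascending order; tied scores share their average rank.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    rank_sum_pos = sum(r for r, pos in zip(ranks, is_positive) if pos)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def one_vs_rest_auc(labels, confidences):
    """Recode the label as 'c vs. not-c' for every class c and average the AUCs.

    labels:      list of true class names
    confidences: list of dicts mapping class name -> confidence for that example
    """
    per_class = {}
    for c in sorted(set(labels)):
        is_pos = [y == c for y in labels]          # the recoded two-class label
        scores = [conf[c] for conf in confidences]  # confidence for class c
        per_class[c] = binary_auc(is_pos, scores)
    macro_average = sum(per_class.values()) / len(per_class)
    return per_class, macro_average


# Toy data (invented): three classes, two examples each.
labels = ["a", "a", "b", "b", "c", "c"]
confidences = [
    {"a": 0.7, "b": 0.2, "c": 0.1},
    {"a": 0.4, "b": 0.5, "c": 0.1},
    {"a": 0.2, "b": 0.6, "c": 0.2},
    {"a": 0.3, "b": 0.3, "c": 0.4},
    {"a": 0.1, "b": 0.2, "c": 0.7},
    {"a": 0.2, "b": 0.3, "c": 0.5},
]
per_class, macro_average = one_vs_rest_auc(labels, confidences)
print(per_class, macro_average)
```

The unweighted average treats every class equally. If your classes are very unbalanced, weighting each AUC by class frequency is one of the "other ways" of combining them, and you may prefer to look at the per-class values separately.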