Modelling with a label that has 6 classes, and using 4 other polynominal attributes
Hi
I am new to RapidMiner and would like to find out whether there is a model I could use to predict a label with 6 classes.
I have a data set with 5 fields:
1. Industry Type (label)
2. Cardname
3. Education Level
4. Gender
5. Marital Status
I would like to use attributes 2-5 to predict the industry type. I have attached the file for reference.
Thanks in advance for the assistance.
Answers
Hi @18a641r,
First, if you are new to RapidMiner, I encourage you to watch these training videos to learn the basics of RapidMiner.
I played a little with your data, and I'm not able to find a relevant model (the best model has an accuracy of ~16%, which is about what random guessing among 6 classes would give if the classes are roughly balanced: 1/6 ≈ 16.7%).
When we look at your data, all the attributes are very "homogeneous" / "uniform".
So I think no algorithm is able to find correlations between your 4 attributes and your label (IndustryType).
You can test different models and, for a given model, play with its parameters to see how the model's performance evolves.
It's a good method to begin to learn RapidMiner.
Finally, you can find here a basic process implementing a Decision Tree model:
I hope it helps,
Regards,
Lionel
NB: Sometimes you have to resign yourself: although Machine Learning is a powerful tool, it is helpless in the face of certain problems...
Hi @18a641r
Like @lionelderkrikor, I took a look at your data, and it looks more like a cross join among 4 categories (Card Name, Education Level, Gender and Marital Status). A cross join between two entities gives you all the possible combinations of classes.
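To make the cross join idea concrete, here is a minimal Python sketch (not a RapidMiner process). The attribute names come from the question, but the category values are hypothetical placeholders, not taken from the attached file.

```python
from itertools import product

# Hypothetical category values, for illustration only.
genders = ["Female", "Male"]
marital_statuses = ["Single", "Married"]
education_levels = ["High School", "Bachelor", "Master"]

# A cross join yields every possible combination of the categories.
combinations = list(product(genders, marital_statuses, education_levels))

for combo in combinations:
    print(combo)

# The number of rows is the product of the category counts.
print(len(combinations), "combinations")
```

If a data set contains every combination about equally often for every label value, the attributes carry no information about the label.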
Let's dive deeper (not as deep as my sensei Lionel, but enough to build an idea of how classification problems work). Take the following table, where AL is the label and A1 and A2 are the attributes:
AL A1 A2
0 0 0
0 0 1
0 1 0
0 1 1
1 0 0
1 0 1
1 1 0
1 1 1
If you apply a model (let's say a Decision Tree, which is the easiest one to understand and the first one you are presented with when you open the RapidMiner Titanic Tutorial), it will be only 50% confident in its prediction for any combination of A1 and A2, because each combination appears once with label 0 and once with label 1. On the other hand, consider the following:
AL A1 A2
1 0 0
0 0 1
0 1 0
1 1 1
1 0 0
0 0 1
0 1 0
1 1 1
(It's simple: if A1 and A2 are equal, the label is 1; otherwise it is 0. A model can learn this rule with full confidence.)
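The difference between the two tables can be checked with a small Python sketch. It is not the RapidMiner process and not a real decision tree: it only counts, for each (A1, A2) combination, how often the majority label occurs, which is roughly the confidence a fully grown tree leaf would report. The function name `label_confidence` is made up for this example.

```python
from collections import Counter, defaultdict

def label_confidence(rows):
    """Map each (A1, A2) combination to (majority label, share of that label)."""
    counts = defaultdict(Counter)
    for label, a1, a2 in rows:
        counts[(a1, a2)][label] += 1
    result = {}
    for combo, label_counts in counts.items():
        majority_label, majority_count = label_counts.most_common(1)[0]
        result[combo] = (majority_label, majority_count / sum(label_counts.values()))
    return result

# First table: every combination appears once with each label (rows are AL, A1, A2).
cross_join = [
    (0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1),
    (1, 0, 0), (1, 0, 1), (1, 1, 0), (1, 1, 1),
]

# Second table: the label is 1 when A1 equals A2, else 0.
structured = [
    (1, 0, 0), (0, 0, 1), (0, 1, 0), (1, 1, 1),
    (1, 0, 0), (0, 0, 1), (0, 1, 0), (1, 1, 1),
]

for name, rows in [("cross join", cross_join), ("structured", structured)]:
    print(name)
    for (a1, a2), (label, conf) in sorted(label_confidence(rows).items()):
        print(f"  A1={a1} A2={a2} -> label {label}, confidence {conf:.0%}")
```

For the first table every combination comes out at 50%, so the attributes say nothing about the label; for the second, every combination comes out at 100%.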
Here you have an XML process for the latter, just to feed your curiosity. (Turns out I was preparing a class for tomorrow and was working on the same things.) You need an extension, the Operator Toolbox; I use it a lot to create example sets on the fly to test things.
Run the process, then try replacing something and see what happens.
Hope it helps!