Classification with ordinal data
MaltePetersen
Member Posts: 7 Learner II
I am new to data science and RapidMiner. I made a prediction with Auto Model on a dataset that consists of nominal and ordinal data. I read online that classification is normally done only with nominal data. This raises the question: can my classification be accurate? And which method would be the right one for my use case?
Best Answers
varunm1 Member Posts: 1,207 Unicorn
Hello @MaltePetersen
Good question. I am not sure whether RapidMiner Auto Model can detect ordinal data automatically (I don't think it can). My preference is to treat ordinal data as nominal data. Some papers suggest converting ordinal data to numeric, but numeric data is treated as continuous and equally spaced, which might not be true in the ordinal case. There are pros and cons to both.
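To make the trade-off concrete, here is a minimal sketch in plain Python (outside RapidMiner) of the two encodings. The rating values and the helper names `encode_numeric` and `encode_nominal` are made up for illustration; they are not part of any RapidMiner API.

```python
# Hypothetical ordinal ratings on a 0-10 scale (illustrative values only).
ratings = [3, 7, 7, 10, 0]


def encode_numeric(values):
    """Treat ratings as numbers: keeps the order but assumes equal spacing."""
    return [float(v) for v in values]


def encode_nominal(values, levels):
    """Treat ratings as categories: one indicator per level, order is discarded."""
    return [[1 if v == level else 0 for level in levels] for v in values]


levels = list(range(11))  # 0..10 inclusive, so 11 possible categories
numeric = encode_numeric(ratings)
nominal = encode_nominal(ratings, levels)

print(numeric)     # one number per rating
print(nominal[0])  # eleven indicators for the first rating, exactly one set to 1
```

The numeric version lets a model use the fact that 7 lies between 3 and 10, but it also assumes the step from 3 to 4 means the same as the step from 9 to 10. The nominal version makes no spacing assumption, at the cost of forgetting the order entirely.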
@IngoRM might provide more information.
Regards,
Varun
https://www.varunmandalapu.com/
Be Safe. Follow precautions and Maintain Social Distancing
varunm1 Member Posts: 1,207 Unicorn
Yep, you have the option "Change to category", right? That is the one that converts your number columns to category (which is also called nominal).
Sorry if I got confused. Just to clarify, you are trying to convert the "number" format to the "nominal" format, right?
Regards,
Varun
Answers
Ingo
One other caution though is that you should probably look at the number of distinct categories that you have. If you have very many categories, and the relationship is fairly linear, then that might be an argument for treating the data as numerical. Otherwise, you may need to consider binning or other combinations of values to get the most stability out of the model. Having an attribute with too many nominal values (whether as a predictor, or even worse, as the label) can definitely cause complications, instability, or deterioration in performance.
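As a rough illustration of the binning idea, the sketch below collapses a 0-10 rating into three coarse groups in plain Python. The cut points (3 and 6) and the function name `bin_rating` are assumptions chosen for the example; suitable bins depend on how the ratings are actually distributed.

```python
def bin_rating(rating):
    """Map a 0-10 rating to one of three coarse bins (cut points are assumptions)."""
    if not 0 <= rating <= 10:
        raise ValueError(f"rating out of range: {rating}")
    if rating <= 3:
        return "low"
    if rating <= 6:
        return "medium"
    return "high"


# Hypothetical ratings covering both ends of the scale and the bin edges.
ratings = [0, 3, 4, 6, 7, 10]
binned = [bin_rating(r) for r in ratings]
print(binned)  # ['low', 'low', 'medium', 'medium', 'high', 'high']
```

Fewer, better-populated categories tend to give a nominal attribute more examples per value, which is the stability argument made above.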
Lindon Ventures
Data Science Consulting from Certified RapidMiner Experts
To further specify my request: the dataset I am analysing is about speed dating. My ordinal data mostly describes how the participants rate their partner, for example a rating of the partner's appearance or humour between 0 and 10. With this data we try to find out which attributes carry the most weight and try to make predictions for new data.
As mentioned earlier, my preference is to treat them as nominal, since 10 is not a huge number of categories. Please feel free to ask anything you need; we are happy to help.
Varun
How did RapidMiner read your data? Is it in numerical form or nominal form?
How to check this: in Turbo Prep, you can see the data type under the attribute name.
Varun
You can select the attributes with the "number" type, then choose "Transform" and "Change Type" to convert them to category. Here the category data type means nominal.
Varun
I tried that, but Turbo Prep is not giving me that option.
Or would the better approach be to check which model has the lowest classification error and then choose that model?
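For reference, a comparison of that kind could be sketched as follows in plain Python. The labels, the predictions, and the model names `model_a` and `model_b` are invented for the example, and the sketch assumes the predictions come from held-out data the models were not trained on; an error measured on the training data would not be a fair basis for choosing.

```python
def classification_error(actual, predicted):
    """Fraction of examples whose predicted label differs from the actual label."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("need at least one example")
    wrong = sum(1 for a, p in zip(actual, predicted) if a != p)
    return wrong / len(actual)


# Hypothetical held-out labels and predictions from two candidate models.
actual = ["match", "no match", "no match", "match", "no match"]
predictions = {
    "model_a": ["match", "no match", "match", "match", "no match"],
    "model_b": ["no match", "no match", "match", "no match", "no match"],
}

errors = {name: classification_error(actual, preds) for name, preds in predictions.items()}
best = min(errors, key=errors.get)  # model with the lowest error
print(errors, best)
```

With so few examples the difference between two models can easily be noise, so in practice the comparison would be run on a much larger held-out set or with cross-validation.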