"Regress SVM with numeric and nominal?"
Hello,
I want to build a regression SVM that will train on a "score". The problem is that the data has both numeric and nominal attributes.
For example:
Age: numeric
Favorite Color: (red, green, blue) nominal
Favorite Food: (meat, chicken, fish) nominal
Weight: numeric
Calories per day: numeric
Postal Code: (90026, 90028, etc.) Looks numeric but really is nominal
The actual data has about 30 features of which about 15 are nominal and 15 are numeric.
Any ideas on how to build the proper data set and model for a regression SVM?
Thanks!
Answers
Your problem is to convert the nominal values into numerical ones. You could use nominal2numeric directly, but I think it would be better to binarize the data first. That means every value of a nominal attribute becomes its own column: favourite color = red, favourite color = green, and so on. A cell contains true if the example has that value and false otherwise.
You can then translate these columns into numerical values with nominal2numeric and process them with the SVM.
This method keeps you from introducing ordinal information that simply isn't in the data, which is what happens when you associate colors with numbers (green = 0, red = 1, blue = 2).
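To make the binarization concrete, here is a minimal sketch in plain Python rather than in RapidMiner. The function name one_hot_encode, the attribute names, and the three sample rows are all made up for illustration; only the idea (one 0/1 column per nominal value, numeric attributes left untouched) comes from the description above. Postal codes are listed as nominal even though they look numeric.

```python
def one_hot_encode(rows, nominal_attributes):
    """Replace each nominal attribute with one 0/1 column per observed value.

    Hypothetical helper for illustration; not a RapidMiner operator.
    """
    # Collect the distinct values of every nominal attribute.
    # Values are compared as strings and sorted for a stable column order.
    categories = {
        name: sorted({str(row[name]) for row in rows})
        for name in nominal_attributes
    }

    encoded = []
    for row in rows:
        out = {}
        for name, value in row.items():
            if name in categories:
                # One binary column per value, e.g. "favorite_color = red".
                for category in categories[name]:
                    out[f"{name} = {category}"] = 1 if str(value) == category else 0
            else:
                # Numeric attributes pass through unchanged.
                out[name] = value
        encoded.append(out)
    return encoded


# Illustrative data only (assumed values, not from the original post's data set).
rows = [
    {"age": 34, "favorite_color": "red", "postal_code": 90026},
    {"age": 27, "favorite_color": "green", "postal_code": 90028},
    {"age": 45, "favorite_color": "blue", "postal_code": 90026},
]

encoded = one_hot_encode(rows, ["favorite_color", "postal_code"])
for row in encoded:
    print(row)
```

Every row now holds only numbers, and no color or postal code is ranked above another, so the result can be handed to a regression SVM.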
Greetings,
Sebastian