"SVM or Regression from data in database - how to??"
I'm VERY new to RM; I just installed it today.
So far, I'm very impressed and a bit overwhelmed by all the options it has.
I was hoping someone could help me design a model/workflow in the GUI for a simple problem.
-My data is stored in MySQL (I do understand how to use DatabaseExampleSource to access the raw data).
-The input is 4 columns. The first is a unique ID, the next 2 are various features (numbers), the last column is the result.
Fields: ID, first_measure, Second_measure, resulting_score
Example Data: 1, 13.5, 57.2, 6.12312313
I would like to use RM to create a "predictor" for this data. Build a model based on many training examples. One thought is regression, the other is an SVM. I might also expand into a model with 50-60 features. In that case, it would be nice to use some kind of genetic algorithm to learn the best features and correlation for the most accurate prediction.
As I wrote above, I can connect to my database and select the data. I'm not sure what to do with the data once I have it.
Any advice?
Answers
But enough of advertising... The first step of your task is to designate your ID and your resulting_score columns as special attributes, namely as (who would have thought) id and label, respectively. This can be done by setting the parameters [tt]id_attribute[/tt] and [tt]label_attribute[/tt] of the [tt]DatabaseExampleSource[/tt] operator to the appropriate column names. Note that this designation can also be done separately with the [tt]ChangeAttributeRole[/tt] operator, using one instance for each attribute.
The second step is to simply place the [tt]LinearRegression[/tt] or e.g. the [tt]LibSVM[/tt] operator in the process. If you then run the process, it should give you a regression or SVM model, respectively.
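If it helps to see what these two steps amount to conceptually, here is a minimal sketch in plain Python. It is not RM code and not how RM implements its operators. The rows are made-up sample data in the (ID, first_measure, Second_measure, resulting_score) layout from the question, and the helper names [tt]split_roles[/tt], [tt]fit_linear_regression[/tt] and [tt]predict[/tt] are hypothetical. The ID is set aside, the last column is treated as the label, and an ordinary least-squares fit is run on the two remaining columns.

```python
# Illustration only: plain Python, not RapidMiner.
# The rows below are hypothetical sample data, generated from
# resulting_score = 1 + 2 * first_measure + 0.5 * Second_measure.
rows = [
    # (ID, first_measure, Second_measure, resulting_score)
    (1, 1.0, 2.0, 4.0),
    (2, 2.0, 1.0, 5.5),
    (3, 3.0, 5.0, 9.5),
    (4, 4.0, 3.0, 10.5),
    (5, 5.0, 8.0, 15.0),
]


def split_roles(row):
    """Separate one row into its id, its regular attributes, and its label."""
    row_id, *features, label = row
    return row_id, features, label


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [list(a[i]) + [b[i]] for i in range(n)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_linear_regression(rows):
    """Least-squares fit; returns [intercept, weight_1, weight_2, ...]."""
    xs, ys = [], []
    for row in rows:
        _row_id, features, label = split_roles(row)  # the id is never a feature
        xs.append([1.0] + list(features))            # leading 1.0 = intercept term
        ys.append(label)
    k = len(xs[0])
    # Normal equations: (X^T X) w = X^T y
    xtx = [[sum(x[i] * x[j] for x in xs) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * y for x, y in zip(xs, ys)) for i in range(k)]
    return solve(xtx, xty)


def predict(weights, features):
    """Apply the fitted model to one new example."""
    return weights[0] + sum(w * f for w, f in zip(weights[1:], features))


weights = fit_linear_regression(rows)
print(weights)
print(predict(weights, [2.0, 4.0]))
```

The point of the sketch is only the role split: if the ID were left in as a regular attribute, the model would try to learn from it, which is exactly what the id designation prevents.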
The task of genetic feature selection is a bit more complicated. I strongly advise you to have a look at the RM built-in tutorial (i.e. the example processes that come with RM). It also includes examples of feature selection, and you should easily get an idea of how this works from them.
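To give a rough idea of what a genetic feature selection does, here is a small sketch in plain Python. Again, this is not RM code and not the algorithm RM's operators use; it is a toy version under my own assumptions. Each individual is a mask saying which features are switched on, and its fitness is the leave-one-out error of a simple 1-nearest-neighbour predictor (chosen only to keep the example short). The data is randomly generated and hypothetical, and the names [tt]loo_error[/tt] and [tt]genetic_feature_selection[/tt] are made up for this illustration.

```python
import random


def loo_error(rows, mask):
    """Leave-one-out mean absolute error of a 1-nearest-neighbour predictor
    that only looks at the features switched on in mask."""
    if not any(mask):
        return float("inf")  # an empty feature set is useless
    total = 0.0
    for i, (xi, yi) in enumerate(rows):
        best_dist, best_label = float("inf"), None
        for j, (xj, yj) in enumerate(rows):
            if i == j:
                continue
            dist = sum((a - b) ** 2 for a, b, on in zip(xi, xj, mask) if on)
            if dist < best_dist:
                best_dist, best_label = dist, yj
        total += abs(yi - best_label)
    return total / len(rows)


def genetic_feature_selection(rows, generations=20, pop_size=12,
                              mutation_rate=0.1, seed=0):
    """Toy genetic algorithm over feature masks (tuples of booleans)."""
    rng = random.Random(seed)
    n = len(rows[0][0])
    # Start with the "use everything" mask plus random masks.
    population = [tuple([True] * n)]
    population += [tuple(rng.random() < 0.5 for _ in range(n))
                   for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=lambda m: loo_error(rows, m))
        parents = population[: pop_size // 2]      # selection: keep the better half
        children = [parents[0]]                    # elitism: best mask survives as is
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)              # one-point crossover
            child = a[:cut] + b[cut:]
            child = tuple((not g) if rng.random() < mutation_rate else g
                          for g in child)          # mutation: flip some bits
            children.append(child)
        population = children
    return min(population, key=lambda m: loo_error(rows, m))


# Hypothetical data: 5 features, but the label depends only on the first two.
data_rng = random.Random(1)
rows = []
for _ in range(30):
    x = [data_rng.uniform(0, 10) for _ in range(5)]
    rows.append((x, 2.0 * x[0] + 0.5 * x[1]))

best = genetic_feature_selection(rows)
print(best, loo_error(rows, best))
```

Because the best mask of each generation is carried over unchanged, the result can never be worse (on this error measure) than using all features. With 50-60 features you would want a larger population and a proper validation scheme, which is what the RM example processes show.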
Hope that helps,
Tobias
Thank you for the quick answer.
I can't wait to get good with RM. I see so many great possibilities!
One additional question: can I specify some details about a feature? For example, one of my features is the ID number of a category. We keep it as an integer in our DB. I want to tell RM that it is not an actual number to average, etc., but just an identifier of a category. (I guess one way would be to translate it into a string such as "ID-1", but I was hoping there was a nicer way to do this in RM.)
Thanks again!!!!
-N