Getting started with my data set in RM 5.0
Hi all,
I'm new to RapidMiner and data mining (although I've done what, in retrospect, was some very basic data mining in the past). I do have some university level statistics under my belt, but that is about it.
I've created a data set that I would like to work on. In general terms, I have two numeric inputs which more or less follow a linear regression. As for the "more or less": I have a handful of non-numeric categorizations associated with each data pair on the linear regression. I suspect these non-numerics will explain some of the directional wobble around the regression line (if that makes sense), and so I would like to run some data mining trials against the data.
Now, from what I can understand, this data is 'polynominal' according to RapidMiner so I am having a difficult time finding a mining function that works with the data set I've described. What are some good options for me to start with?
Thanks in advance.
Answers
I came to RapidMiner primarily because it provided a nice environment for testing Support Vector Machines against large stacks of data. Why SVMs? Partly because of their speed compared to induction or neural nets, partly because they avoid the dreaded neural local pothole problem, and partly because they are like Swiss Army knives and can handle just about any combo of data types. The weird thing is that it worked as I had planned, because well-tuned SVMs are competitive, and because RM enables testing harnesses to be implemented quickly, even by mental midgets such as myself.
You could transform the polynominal attributes into binominal attributes with the polynominal-to-binominal conversion. You can then turn these into binary 0/1 coded attributes that can be used by numerical methods like SVMs. This is a common way to handle such attributes.
Greetings,
Sebastian
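To make the 0/1 coding concrete, here is a minimal sketch in plain Python. It is not RapidMiner's operator, just an illustration of the idea; the function name `one_hot_encode` and the `region` attribute with its values are made up for the example.

```python
def one_hot_encode(values):
    """Map a list of nominal values to 0/1 indicator columns.

    Returns the category names (one per column) and one row of
    0/1 flags per input value.
    """
    # Sort the distinct categories so the column order is reproducible.
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


# Hypothetical nominal attribute with three categories.
region = ["north", "south", "east", "south"]
categories, encoded = one_hot_encode(region)

print(categories)
for label, row in zip(region, encoded):
    print(label, row)
```

Each polynominal attribute becomes one 0/1 column per category, so a numerical learner sees only numbers. For a plain linear regression you would usually drop one of the columns per attribute to avoid redundancy with the intercept.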