"multicollinearity in linear regression"
Sorry if I am posting in the wrong forum; please let me know the proper place for my question.
I am trying to build a linear regression model that predicts revenue from the month of the year, which is a nominal attribute with 12 values, and from other numeric attributes. I applied a nominal-to-binominal operator and a nominal-to-numeric operator before the linear regression operator. However, the resulting model included all 12 dummy variables (produced by the two conversion operators) and an intercept. Since the dummy variables always sum to one, the resulting model has multicollinearity. Why isn't one of the dummy variables dropped automatically in the process? Or is it the user's responsibility to drop it? If so, how?
Thanks in advance.
-Xiaoyan
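To make the concern in the question concrete, here is a minimal sketch in plain Python (not RapidMiner). The month labels and the four sample observations are made up for illustration. It shows that, with one dummy per month plus an intercept, the intercept column equals the sum of the dummy columns in every row, which is the exact linear dependency behind the multicollinearity.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def dummy_code(value, levels):
    """Return one 0/1 indicator per level (full dummy coding)."""
    return [1 if value == level else 0 for level in levels]


# Hypothetical sample of observed months.
observed = ["Jan", "Mar", "Dec", "Jul"]

# Each row: intercept (always 1) followed by the 12 month dummies.
rows = [[1] + dummy_code(month, MONTHS) for month in observed]

# intercept - sum(dummies) is zero in every row, so the 13 columns
# are linearly dependent and the coefficients are not uniquely determined.
residuals = [row[0] - sum(row[1:]) for row in rows]
print(residuals)  # [0, 0, 0, 0]
```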
Answers
How should the computer be able to guess that you want to exclude one of your attributes from the model building process? Only on the assumption: "Well, he has an attribute with twelve values. Hey, that's probably a date, and since he applies a linear regression he must be from the financial field! So it must be fine to silently remove an input attribute automatically..."
No, it is of course the user's responsibility to remove the attributes that should not be considered during model building. There are two ways to do this:
Either filter the attributes out by applying a Select Attributes operator beforehand, or set their role to a special one. Attributes with special roles are not used as input for the analysis. Predefined special roles may have a specific meaning (like label), but you can define your own roles in the "Set Role" operator by simply typing a name.
I would suggest taking a close look at the English manual, where all these basic ideas of RapidMiner are explained.
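As an illustration of what removing one dummy attribute achieves, here is a small sketch in plain Python. It is not RapidMiner code; inside RapidMiner the removal would be done with the operators mentioned above. The function name and the "month=..." column names are hypothetical. One month is treated as the reference level and gets no column of its own, so its effect is absorbed by the intercept.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def dummy_code_with_reference(value, levels, reference):
    """Return 0/1 indicators for every level except the reference level."""
    if reference not in levels:
        raise ValueError("unknown reference level: " + reference)
    if value not in levels:
        raise ValueError("unknown value: " + value)
    return {"month=" + level: int(value == level)
            for level in levels if level != reference}


# January is the (arbitrarily chosen) reference level.
jan = dummy_code_with_reference("Jan", MONTHS, reference="Jan")
jul = dummy_code_with_reference("Jul", MONTHS, reference="Jan")

# 11 columns remain; the reference month is encoded as all zeros,
# so the dummies no longer sum to one in every row.
print(len(jan), sum(jan.values()))  # 11 0
print(len(jul), sum(jul.values()))  # 11 1
```

With this coding, each remaining coefficient reads as the difference in predicted revenue relative to the reference month.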
Greetings,
Sebastian