Need to build a model for economic viability. Not sure how to go ahead with variable selection
Rustyboltcutter
Hey guys,
So I am trying to build a model to predict the economic viability of a property using the variables below.
So I created Listed_Year from the Listed data. Basically, any property that was listed up to the end of 2018 is an old property, and anything listed after that is a new one.
The logic I used to create the economic viability variable is this if rule:
if(listed_year == "New" && overall_satisfaction>3,"Economically viable",if(listed_year == "Old" && overall_satisfaction>3,"Economically viable","not Economically Vaible"))
But when I run this model I get 100% accuracy and a Kappa of 1, which obviously means it is overfit and will not work at all.
Would really love some input on how to move ahead and how to actually get this to work.
Answers
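Indeed, there may be overfitting, or one of your attributes may be perfectly correlated with your label attribute. Here the label is computed directly from overall_satisfaction, so if that attribute is also used as an input, the model can simply reproduce the rule.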
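To make the correlation point concrete, here is a minimal Python sketch (not RapidMiner code) that mirrors the if rule from the question and compares it with a rule that looks only at overall_satisfaction. The function names and the rating values are made up for illustration, the label spelling is normalized, and it assumes listed_year only ever takes the values "New" and "Old".

```python
def viability(listed_year, overall_satisfaction):
    # Mirrors the nested if rule from the question (label spelling normalized).
    if listed_year == "New" and overall_satisfaction > 3:
        return "Economically viable"
    if listed_year == "Old" and overall_satisfaction > 3:
        return "Economically viable"
    return "not Economically viable"


def simplified(overall_satisfaction):
    # Hypothetical helper: the same label, computed from the rating alone.
    if overall_satisfaction > 3:
        return "Economically viable"
    return "not Economically viable"


# Illustrative ratings only; these are not taken from the original dataset.
ratings = [1.0, 2.5, 3.0, 3.5, 4.0, 5.0]

# Collect every (listed_year, rating) pair where the two rules disagree.
mismatches = [
    (year, rating)
    for year in ("New", "Old")
    for rating in ratings
    if viability(year, rating) != simplified(rating)
]

print(mismatches)  # → []
```

Because the two rules never disagree, listed_year does not influence the label, and any learner that sees overall_satisfaction can recover the label with a single split at 3. One way forward is probably to exclude overall_satisfaction from the inputs, or to define viability from information that is not among the predictors.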
RapidMiner can perform relevant feature selection (and optionally feature generation) automatically. For that, please use the operator called Automatic Feature Engineering.
Another piece of advice I can give you is simply to submit your dataset to RapidMiner's Auto Model. In this case, it is "all inclusive": RapidMiner takes care of everything. It first performs a "preliminary" feature selection based on the "quality" of each feature, then a feature selection (and optionally a feature generation, depending on your settings), the modelling, and the estimation of each model's performance. At the end of the calculations, RapidMiner presents all the results (the performance of each model).
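As a rough illustration of what a "quality"-based preliminary selection can look like, here is a small Python sketch. It is not RapidMiner's implementation: the function names, the thresholds, and the example columns are all assumptions, and the three measures (share of missing values, share of distinct values, share of the most frequent value) are only a simplified stand-in for the column quality checks Auto Model reports.

```python
from collections import Counter


def column_quality(values):
    """Return simple quality measures for one column (None marks a missing value)."""
    present = [v for v in values if v is not None]
    missing = 1 - len(present) / len(values)
    if not present:
        return {"missing": 1.0, "id_ness": 0.0, "stability": 0.0}
    counts = Counter(present)
    # Share of distinct values: close to 1.0 means the column behaves like an ID.
    id_ness = len(counts) / len(present)
    # Share of the most frequent value: close to 1.0 means nearly constant.
    stability = counts.most_common(1)[0][1] / len(present)
    return {"missing": missing, "id_ness": id_ness, "stability": stability}


def prefilter(table, max_missing=0.5, max_id_ness=0.95, max_stability=0.95):
    """Keep columns whose measures stay within the (assumed) thresholds."""
    kept = []
    for name, values in table.items():
        q = column_quality(values)
        if (q["missing"] <= max_missing
                and q["id_ness"] <= max_id_ness
                and q["stability"] <= max_stability):
            kept.append(name)
    return kept


# Hypothetical columns, invented for this example.
table = {
    "property_id": [101, 102, 103, 104, 105, 106],        # unique per row
    "country": ["NL", "NL", "NL", "NL", "NL", "NL"],      # constant
    "room_type": ["Entire", "Private", "Entire",
                  "Shared", "Private", "Entire"],
    "price": [None, None, None, None, 80, 95],            # mostly missing
}

kept = prefilter(table)
print(kept)  # → ['room_type']
```

The distinct-value check is mainly meaningful for nominal columns, since a continuous numeric column is naturally close to all-distinct. Note also that none of these checks would flag an attribute that the label was derived from; that kind of column shows up as a suspiciously high correlation with the label instead.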
Please let me know if you have other questions.
Hope this helps,
Regards,
Lionel