Unexpected results from Automatic Feature Engineering
So I am trying to squeeze out the most accurate regression possible on my model, and for that I have narrowed it down to GLM, GBT, and SVM as the best learners for my data. I first tried to optimize GLM since it trains the fastest.
I then generated a bunch of features manually with loops and selected the best broad group for GLM (we are still talking about 400+ features). This group was not optimal for SVM or GBT, but I wasn't optimizing those yet.
I then ran AFE on that set to get the best GLM performance possible. It was no surprise that I got 8 or 9 optimal features that gave me the same GLM performance I had with 400+. So I was happy about that and applied that FeatureSet to my data to cut out the long AFE process.
However, this new dataset performs considerably better with most learners, including SVM and GBT, even though it was optimized for GLM.
I then tried to repeat the process for SVM, thinking that if I got such an improvement from a GLM-oriented FeatureSet, I would get an even better one from running AFE on SVM. But no. The SVM AFE returned a SIMPLER FeatureSet (even when I selected for accuracy) with decent performance, but it did not beat the GLM AFE FeatureSet.
I did not think that was possible under most circumstances, and yet it happened.
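To make the setup concrete outside RapidMiner, here is a minimal scikit-learn sketch of the same idea: a wrapper-style feature selection driven by the fast linear learner, with the resulting small feature set then evaluated on the other learners. The synthetic data, the Ridge/SVR/GradientBoostingRegressor stand-ins for GLM/SVM/GBT, and the forward-selection wrapper are illustrative assumptions, not the poster's actual data or RapidMiner's AFE operator.

```python
# Minimal sketch, assuming a scikit-learn setting: Ridge, SVR and
# GradientBoostingRegressor stand in for GLM, SVM and GBT; the data is synthetic.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVR

# Synthetic stand-in for the 400+ manually engineered features
X, y = make_regression(n_samples=500, n_features=400, n_informative=10,
                       noise=0.1, random_state=42)

# Wrapper-style selection driven by the fast linear learner
# (loosely analogous to running AFE with GLM as the inner learner)
selector = SequentialFeatureSelector(Ridge(), n_features_to_select=9,
                                     direction="forward", cv=3)
X_small = selector.fit_transform(X, y)  # the "applied FeatureSet"

# Evaluate the reduced feature set with every learner,
# not just the one it was tuned for
for name, model in [("GLM (Ridge)", Ridge()),
                    ("SVM (SVR)", SVR()),
                    ("GBT", GradientBoostingRegressor())]:
    score = cross_val_score(model, X_small, y, cv=3, scoring="r2").mean()
    print(f"{name}: mean R^2 = {score:.3f}")
```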
Best Answer
IngoRM Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
Hi,
In general, the wrapper approach we are using with AFE is supposed to deliver a specific feature set for the inner learner the feature engineering is optimized for. And while those feature sets are often somewhat similar across multiple models, they typically also differ at least somewhat depending on the model type, so I understand your confusion.
Here is the most likely reason why the feature set from the GLM also works better for the SVM than the one created for the SVM itself: the SVM is MUCH slower than the GLM learner, which means that in the same amount of time many more feature sets will be tried in the GLM case than in the SVM case.
The SVM therefore simply did not have the same time to find the optimal set before the optimization was stopped. In that sense, the SVM feature set was still somewhat suboptimal for the SVM. The GLM feature set, which was optimized for a different learner but had more time to be developed, happens to beat the one found for the SVM (so far).
There could also just be smaller random effects causing this, but in my experience the reason above is typically why other feature sets - which are likely not optimal for the model either - can outperform the one optimized for the model, which simply has not been optimized enough (yet) and is therefore even more suboptimal.
Hope this helps,
Ingo
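The time-budget argument can be illustrated with a toy wrapper search: under the same wall-clock budget, a search wrapped around a fast learner evaluates far more candidate feature sets than one wrapped around a slow learner, so the slow learner's search is stopped while its best set is still comparatively suboptimal. The random-subset search and the Ridge/SVR stand-ins below are assumed simplifications, not RapidMiner's actual evolutionary AFE algorithm.

```python
# Toy illustration of the time-budget effect (an assumed random-subset wrapper
# search, not RapidMiner's actual AFE).
import time
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVR

X, y = make_regression(n_samples=400, n_features=100, n_informative=8,
                       noise=0.1, random_state=0)
rng = np.random.default_rng(0)

def budgeted_search(model, budget_seconds, subset_size=8):
    """Try random feature subsets until the wall-clock budget runs out."""
    best_score, best_subset, evaluated = -np.inf, None, 0
    start = time.time()
    while time.time() - start < budget_seconds:
        subset = rng.choice(X.shape[1], size=subset_size, replace=False)
        score = cross_val_score(model, X[:, subset], y, cv=3, scoring="r2").mean()
        evaluated += 1
        if score > best_score:
            best_score, best_subset = score, subset
    return best_score, best_subset, evaluated

# The fast learner gets through far more candidate sets in the same budget,
# so its best feature set tends to be closer to optimal when the search stops.
for name, model in [("fast inner learner (Ridge)", Ridge()),
                    ("slow inner learner (SVR)", SVR())]:
    score, _, n_tried = budgeted_search(model, budget_seconds=5)
    print(f"{name}: {n_tried} feature sets tried in 5 s, best R^2 = {score:.3f}")
```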
Answers
Out of curiosity, is the difference in performance huge? I have seen a few instances in research where GLM performed comparably to SVMs, but not cases where GLM totally outperformed SVM by a large margin.
Varun
https://www.varunmandalapu.com/
Be Safe. Follow precautions and Maintain Social Distancing