"Different performance from Backward Elimination when not using the operator"
I used the Backward Elimination operator to optimize the AUC of a logistic regression model by eliminating some attributes. However, when I remove the Backward Elimination operator and eliminate the same attributes myself with the Select Attributes operator (based on the Backward Elimination operator's results), the resulting AUC is not the same (it is lower). The same happens with other optimization operators (Optimize Parameters (Grid), Forward Selection).
How do these optimization operators work, and how do they differ from doing the same thing manually (without the optimization operator)?
My data has 2030 instances with 33 features and 1 binary dependent variable.
Answers
Hi @aphongme,
I'm not a specialist in feature selection algorithms, so I don't know why you don't obtain the same AUC manually as you do with the feature selection operators.
However, for a partial answer on how these algorithms work, you can find a resource (especially part 1 and part 2) by following this link:
https://community.rapidminer.com/t5/RapidMiner-Studio-Knowledge-Base/Multi-Objective-Feature-Selection-Part-1-The-Basics/ta-p/45775/jump-to/first-unread-message
I hope it helps.
Regards,
Lionel
This also happens when I use the Optimize Parameters (Grid) operator. When I run the process with the parameters it returned, but without the operator, the AUC decreases significantly.
Hi again @aphongme,
Can you check your XML process and share it? (The process you shared in the other topic is broken.)
One line of investigation would be to first build the ROC curves in both cases (case 1: manual selection; case 2: feature selection or Optimize Parameters operators) and compare them (for example with the Compare ROCs operator).
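If it helps to check the comparison outside RapidMiner, the idea can be sketched in plain Python. This is only an illustration: the labels and the two score lists are made-up toy values standing in for exported predictions from the two cases, and `roc_points` and `trapezoid_auc` are hypothetical helper names, not RapidMiner functions.

```python
def roc_points(labels, scores):
    """Return ROC points (FPR, TPR) from (0, 0) to (1, 1).

    labels are 0/1, scores are confidences for class 1.
    Tied scores are processed together so they form one point.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative example")
    # Walk through the examples from highest to lowest score.
    order = sorted(range(len(labels)), key=lambda i: -scores[i])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(order):
        threshold = scores[order[i]]
        while i < len(order) and scores[order[i]] == threshold:
            if labels[order[i]] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def trapezoid_auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy data (assumption): same labels, two different sets of confidences.
labels = [1, 1, 0, 1, 0, 0]
scores_manual = [0.9, 0.3, 0.7, 0.6, 0.8, 0.2]     # case 1: manual selection
scores_optimized = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]  # case 2: optimization operator

auc_manual = trapezoid_auc(roc_points(labels, scores_manual))
auc_optimized = trapezoid_auc(roc_points(labels, scores_optimized))
print("manual:", auc_manual, "optimized:", auc_optimized)
```

Comparing the point lists as well as the two AUC values shows where along the curve the two cases diverge, which a single AUC number hides.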
Regards,
Lionel
Are you making sure to use a specific random seed to ensure reproducible results?
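To illustrate why the seed matters, here is a minimal Python sketch, not RapidMiner code. It assumes synthetic data and fixed scores (no model is retrained), and the helper names `rank_auc` and `kfold_mean_auc` are hypothetical. Even so, the mean AUC over shuffled folds changes with the seed that controls the shuffle, and repeats exactly only when the seed is fixed.

```python
import random


def rank_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("fold contains only one class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def kfold_mean_auc(labels, scores, k, seed):
    """Shuffle row indices with the given seed, split into k folds, average per-fold AUC."""
    idx = list(range(len(labels)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    aucs = [rank_auc([labels[i] for i in fold], [scores[i] for i in fold]) for fold in folds]
    return sum(aucs) / k


# Synthetic data (assumption): 200 rows, alternating labels, noisy scores.
data_rng = random.Random(0)
labels = [i % 2 for i in range(200)]
scores = [y + data_rng.gauss(0.0, 1.0) for y in labels]

# Different seeds give different fold assignments, so the mean AUC can differ.
for seed in (1, 2, 3):
    print("seed", seed, "mean AUC", kfold_mean_auc(labels, scores, 5, seed))

# The same seed reproduces the same folds and therefore the same value.
first_run = kfold_mean_auc(labels, scores, 5, seed=1)
second_run = kfold_mean_auc(labels, scores, 5, seed=1)
print("reproducible:", first_run == second_run)
```

In a real process the model is also retrained on each split, so the seed can affect the result even more than in this sketch.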
Lindon Ventures
Data Science Consulting from Certified RapidMiner Experts