Should I put OptimizeParameters inside XVal ?
Hi everybody,
I'm doing an SVM classification (inside an XValidation loop) and optimizing my kernel parameters with "Optimize Parameters".
I'm doing the SVM classification inside XValidation to avoid overfitting my SVM model, but the Optimize Parameters operator (which sits on top of it) simply iterates over all parameter combinations and returns the best one.
Doesn't this lead to overfitting of the kernel parameters? So, should I put Optimize Parameters inside another XValidation?
I'm asking because the results I get with RapidMiner are always slightly better than the results from the software DTREG. DTREG does its parameter optimization inside a separate cross-validation loop, so I wonder if I should do the same in RapidMiner.
Many thanks,
Axel
Answers
Instead, perform the parameter search to get the best accuracy on ALL your data, then perform an X-Validation (say 10-fold, depending on how much data you have) to see how your model generalizes to unseen data. ;D
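Here's a rough sketch of that workflow in Python/scikit-learn terms, just to illustrate the idea (the data set, parameter grid, and fold counts are placeholders, not your actual RapidMiner process):

```python
# Illustration only: parameter search on ALL data, then a separate
# cross validation with the best parameters to estimate generalization.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)  # placeholder data set

# 1) Parameter search on the complete data (like Optimize Parameters on the full set).
param_grid = {"C": [0.1, 1, 10], "gamma": [0.001, 0.01, 0.1]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=10)
search.fit(X, y)
print("Best parameters:", search.best_params_)

# 2) 10-fold X-Validation with the best found parameters to see how the model generalizes.
scores = cross_val_score(SVC(kernel="rbf", **search.best_params_), X, y, cv=10)
print("Estimated accuracy: %.3f +/- %.3f" % (scores.mean(), scores.std()))
```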
-Gagi
In fact, there are two sides to take into account. As Gagi said, if you want to find the best parameters, you have to use the complete data, and hence the setup you have right now.
But, and this is the second side, you have to keep in mind that you might have overfitted the parameters to your training data, and hence the resulting performance might be too optimistic. To check this, you should put Optimize Parameters inside another XValidation. This will give you rather pessimistic results, because you didn't use all the data for the optimization. The difference between the two performances gives you an impression of how reliable the performance of the optimized parameters is.
After all this, you can train the model on the complete data set using the best parameters found; this is then the best possible model.
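To make the nested setup concrete, here is a minimal sketch in Python/scikit-learn terms (just an illustration of the idea, not the RapidMiner operators themselves; the data set and grid are placeholders):

```python
# Illustration only: Optimize Parameters nested inside an outer XValidation.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)  # placeholder data set

param_grid = {"C": [0.1, 1, 10], "gamma": [0.001, 0.01, 0.1]}

# Inner loop: parameter optimization (plays the role of Optimize Parameters + inner XValidation).
inner = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)

# Outer loop: estimates how well the whole "optimize, then train" procedure generalizes.
# This estimate tends to be the more pessimistic one.
outer_scores = cross_val_score(inner, X, y, cv=10)
print("Nested accuracy: %.3f +/- %.3f" % (outer_scores.mean(), outer_scores.std()))

# Finally, train on the complete data set with the best found parameters.
inner.fit(X, y)
final_model = inner.best_estimator_
```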
Greetings,
Sebastian