External Validation
I built the following process and got good performance results from cross validation. Now I want to run an external data set through this very same model. How do I do that?
In the process, Retrieve valdays_complete is the external set, and Filter Examples (2) selects the dementia subgroup (the same subgroup that was used for modelling).
Best Answers
BalazsBarany Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert Posts: 955 Unicorn
Hi,
doing the backward elimination and the other feature selection *before* the cross validation is not the best approach. You want to validate the entire modeling process, and feature selection is an important part of that. It does take longer because of the repetitions, but you should put the feature selection into the cross validation in the main process.
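For readers who think in code rather than operators, here is a rough sketch of what "feature selection inside the cross validation" means. It is plain Python, not RapidMiner. The nearest-centroid learner, the inner hold-out split used for scoring, the fold assignment and the toy data are all assumptions made for illustration. The point is only that the selection step is repeated on the training part of every fold and never sees that fold's test rows.

```python
import random

def fit(rows, labels, feats):
    """Nearest-centroid stand-in learner: one mean vector per class."""
    sums, counts = {}, {}
    for row, y in zip(rows, labels):
        acc = sums.setdefault(y, [0.0] * len(feats))
        for i, f in enumerate(feats):
            acc[i] += row[f]
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

def predict(model, row, feats):
    def dist(y):
        return sum((row[f] - model[y][i]) ** 2 for i, f in enumerate(feats))
    return min(model, key=dist)

def accuracy(model, rows, labels, feats):
    hits = sum(predict(model, r, feats) == y for r, y in zip(rows, labels))
    return hits / len(rows)

def backward_elimination(rows, labels, feats):
    """Greedily drop attributes while an inner hold-out score does not get worse."""
    cut = len(rows) * 2 // 3
    def score(fs):
        model = fit(rows[:cut], labels[:cut], fs)
        return accuracy(model, rows[cut:], labels[cut:], fs)
    current, best = list(feats), score(feats)
    dropped = True
    while dropped and len(current) > 1:
        dropped = False
        for f in current:
            trial = [g for g in current if g != f]
            s = score(trial)
            if s >= best:
                current, best, dropped = trial, s, True
                break
    return current

def cross_validate(rows, labels, feats, k=5):
    scores, chosen = [], []
    for fold in range(k):
        test = set(range(fold, len(rows), k))
        tr_x = [r for i, r in enumerate(rows) if i not in test]
        tr_y = [y for i, y in enumerate(labels) if i not in test]
        te_x = [rows[i] for i in sorted(test)]
        te_y = [labels[i] for i in sorted(test)]
        # Selection runs on the training rows of this fold only.
        selected = backward_elimination(tr_x, tr_y, feats)
        model = fit(tr_x, tr_y, selected)
        scores.append(accuracy(model, te_x, te_y, selected))
        chosen.append(selected)
    return scores, chosen

# Made-up data: "signal" separates the classes, "noise" does not.
rng = random.Random(0)
labels = [i % 2 for i in range(60)]
rows = [{"noise": rng.uniform(-1, 1), "signal": 10.0 * y + rng.uniform(-1, 1)}
        for y in labels]
scores, chosen = cross_validate(rows, labels, ["noise", "signal"])
print(scores)
print(chosen)
```

Because the selection is redone in each fold, the fold scores estimate the whole procedure (selection plus learner), which is what you want to compare against the external data later.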
Does the random attribute stay in the data after the feature selection? One would expect that it is eliminated. So it shouldn't be in the model, and then it won't be relevant.
You can connect any results you're interested in to the result ports. It will be interesting to compare the validation performance to the external data set performance.
Regards,
Balázs
BalazsBarany Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert Posts: 955 Unicorn
Hi @Norita,
you can use the Remember operator inside the cross validation, after the feature selection, to remember the weights, for example. Then, after the validation, you would use Recall to retrieve the result.
The list of the attributes is also available in most models, but it's usually harder to retrieve it from those.
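As a loose analogy only (plain Python, not RapidMiner), the Remember/Recall pattern amounts to stashing an intermediate result under a name inside the validation loop and fetching it again afterwards. The store, the `select_weights` stand-in, the per-fold names and the toy data below are all hypothetical. How RapidMiner itself handles repeated Remember calls with the same name is not shown here.

```python
_store = {}

def remember(name, obj):
    """Keep an intermediate result under a name (stand-in for Remember)."""
    _store[name] = obj

def recall(name):
    """Fetch a stored result again (stand-in for Recall)."""
    return _store[name]

def select_weights(train_rows, attributes):
    """Hypothetical selection step: weight 1 if the attribute varies, else 0."""
    return {a: int(len({r[a] for r in train_rows}) > 1) for a in attributes}

attributes = ["age", "score", "const"]
rows = [{"age": i, "score": i % 2, "const": 1} for i in range(9)]
k = 3

for fold in range(k):
    train = [r for i, r in enumerate(rows) if i % k != fold]
    weights = select_weights(train, attributes)
    remember(f"weights_fold_{fold}", weights)  # inside the validation loop
    # ... model training and testing for this fold would go here ...

# After the validation: recall the stored weights and count how often
# each attribute was kept across the folds.
kept = {a: sum(recall(f"weights_fold_{f}")[a] for f in range(k))
        for a in attributes}
print(kept)
```

Counting how often an attribute survives across folds is one simple way to see how stable the selection is.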
Regards,
Balázs
Answers
the usual way to do this would be *another* Cross Validation around the Backward Elimination. You should validate the entire modeling process, not just the feature selection. The modeling in the outer validation would of course be the same as inside the Backward Elimination.
The outer Cross Validation has a "mod" output that gives you the model. You can then use Apply Model to apply this model on a new data set (the external data) given that it has the same attributes with the same type as those that went into the model. (Additional attributes don't matter.)
So if you do a lot of preprocessing in Cleanse.Days.Data, you will need to do the same process on the external data to achieve the attribute structure expected by the model.
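To make the "same attributes, same preprocessing" requirement concrete, here is a minimal plain-Python sketch (not RapidMiner). It assumes the preprocessing is a simple standardization. The attribute names `a1`, `a2`, `extra` and all values are made up. The parameters are learned on the training data once and then reused unchanged on the external rows. Extra attributes are ignored, and a missing or non-numeric attribute raises an error.

```python
from statistics import mean, pstdev

def fit_preprocessing(train_rows, attributes):
    """Learn per-attribute mean and standard deviation on the training data."""
    params = {}
    for a in attributes:
        values = [r[a] for r in train_rows]
        params[a] = (mean(values), pstdev(values) or 1.0)
    return params

def apply_preprocessing(rows, params):
    """Apply the training-time parameters to any data set, e.g. the external one."""
    out = []
    for r in rows:
        missing = [a for a in params if a not in r]
        if missing:
            raise KeyError(f"attributes expected by the model are missing: {missing}")
        bad = [a for a in params if not isinstance(r[a], (int, float))]
        if bad:
            raise TypeError(f"attributes with an unexpected type: {bad}")
        # Only the attributes the model was trained on are passed through.
        out.append({a: (r[a] - m) / s for a, (m, s) in params.items()})
    return out

train = [{"a1": 1.0, "a2": 10.0}, {"a1": 3.0, "a2": 30.0}]
external = [{"a1": 2.0, "a2": 40.0, "extra": "ignored"}]

params = fit_preprocessing(train, ["a1", "a2"])
external_ready = apply_preprocessing(external, params)
print(external_ready)
```

Note that the parameters come from the training data only. Recomputing them on the external set would give the model inputs on a different scale than the one it was trained on.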
Regards,
Balázs