Answers
I don't believe this is possible. But I'm also not sure what you really intended by this request, because by definition, k-fold cross-validation requires that every example appear exactly once in a test set (and in the training set for the other k-1 folds).
As you already know, the model produced by cross-validation is based on the entire dataset. The cross-validation procedure is simply designed to estimate how the model might perform on unseen data in a more statistically robust way than the older approach of a static two-way split into a training versus testing set. So why would you need to extract the specific example sets used in cross-validation? The entire dataset is ultimately used both for training and testing in cross-validation.
If you really need to do this, then I think you are going to have to set up a kind of manual cross-validation by creating static segments and then building the model and running the test statistics on each segment separately using loops. But it seems like a lot of effort to build manually what cross-validation already does automatically.
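As a hedged illustration of that manual approach (outside RapidMiner), here is a sketch in Python with scikit-learn: an explicit k-fold loop that keeps the exact test segment and performance of each fold. The dataset and model choices are only placeholders.

```python
# Manual k-fold cross-validation that records, per fold, which examples
# were used for testing and how the fold-local model performed.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import KFold
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
kf = KFold(n_splits=5, shuffle=True, random_state=42)

fold_results = []  # one entry per fold: (test indices, fold accuracy)
for train_idx, test_idx in kf.split(X):
    model = DecisionTreeClassifier(random_state=0)
    model.fit(X[train_idx], y[train_idx])
    acc = accuracy_score(y[test_idx], model.predict(X[test_idx]))
    fold_results.append((test_idx, acc))

# Sanity check: every example appears in exactly one test fold.
all_test = np.concatenate([idx for idx, _ in fold_results])
assert sorted(all_test) == list(range(len(X)))
```

This makes explicit what the built-in operator does internally: each example lands in exactly one test segment, which is why extracting "the" test set of cross-validation is ambiguous.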
Lindon Ventures
Data Science Consulting from Certified RapidMiner Experts
You add a Store operator on the training and test side, and then some macro logic.
I'm not sure why you would want to store it, but hopefully the attached example will give you some ideas.
May I ask why you want to do that?
1) Because I want to use the predictions
Then you can use the operator X-Prediction.
2) Because you want to do something else
A possibility here is to define the k different samples yourself outside RapidMiner and then define a batch variable. After that you can use Batch-X-Validation.
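The same idea of handing predefined partitions to the cross-validation routine can be sketched in Python with scikit-learn, where `PredefinedSplit` plays the role of the batch variable. The random fold assignment below is purely illustrative.

```python
# Cross-validation over user-defined partitions: each row carries a
# "batch" id, and the splitter tests on one batch at a time.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import PredefinedSplit, cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Assign each example to one of 3 batches (here randomly; in practice
# this could encode any grouping you defined outside the tool).
rng = np.random.default_rng(0)
batch = rng.integers(0, 3, size=len(X))

scores = cross_val_score(DecisionTreeClassifier(random_state=0),
                         X, y, cv=PredefinedSplit(batch))
print(scores)  # one accuracy value per predefined batch
```

Because you control the batch column, you also know exactly which examples formed each test set, which answers the original question of extracting the fold memberships.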
Hi,
a few remarks:
RapidMiner has two operators: X-Validation to get a performance estimate, and X-Prediction to get a scored sample. Sadly, there is no built-in operator to do both things at once. I am using the attached building block for this.
Why shouldn't I do this? Well, to be honest, it is very dangerous. People tend to look at the scored data set and build new variables that fix issues with individual examples. This is obviously overtraining by hand and should be treated with care, or better, avoided.
Why should I do this? Well, I personally use it in regression problems to get a scatterplot of true vs. predicted values. In this scatterplot you can see biases (sometimes only in certain regions), nonlinearities, and so on. I think this is very useful.
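A rough Python equivalent of getting both outputs from one cross-validation is scikit-learn's `cross_val_predict`, which returns the out-of-fold prediction for every row; the dataset and model below are placeholders.

```python
# One cross-validation pass yielding both a scored sample (out-of-fold
# predictions for every example) and an overall performance estimate.
from sklearn.datasets import load_diabetes
from sklearn.model_selection import cross_val_predict
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score

X, y = load_diabetes(return_X_y=True)

# Each prediction comes from the fold in which that row was held out.
pred = cross_val_predict(Ridge(), X, y, cv=5)

print("cross-validated R^2:", r2_score(y, pred))
# Plotting y against pred (e.g. matplotlib's plt.scatter(y, pred))
# exposes regional bias or nonlinearity that a single score hides.
```

The scatter of `y` against `pred` is exactly the true-vs-predicted diagnostic described above.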
~Martin
Dortmund, Germany