"How to access train and test instances in each fold for an N-fold cross validation"
kashif_khan
Member Posts: 19 Contributor II
Hi Folks,
I am working on a data mining problem in RapidMiner where I have to access the instances in each fold of an N-fold cross validation with a classifier. I can access the instances in the "Test" subprocess of the Validation operator, since it gives me an instance of "ExampleSet", but I cannot do the same in the "Training" subprocess, which yields an instance of "DistributionModel". I am trying to iterate over them in my code. How can I get the instances in the test and train split for each fold separately? How can I cast DistributionModel to an ExampleSet?
I really appreciate your help ...
Answers
1) When you open the X-Validation operator in your process in the RapidMiner Studio GUI, you see a "Training" subprocess on the left and a "Testing" subprocess on the right. Notice the ports on the top right side of each subprocess. If you want to access data from them in your code, they need to be connected. So if you want to access the training data, you have to pipe it to the "thr" port.
Another option would be to access the input ports on the left instead of the output ports on the right. That way you can access whatever comes into each subprocess.
2) You cannot cast DistributionModel to an ExampleSet. An ExampleSet is your actual data (think database table) and the DistributionModel is a model which is used to generate predictions based on your actual data. They are completely different things.
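To make the train/test distinction per fold concrete, here is a minimal, library-free Java sketch. It does not use the RapidMiner API: the class `FoldSplitSketch`, the `Fold` holder and the `split` method are hypothetical names made up for illustration, and the round-robin assignment of rows to folds is an assumption chosen for simplicity, not the sampling that X-Validation actually performs. It only shows that each fold has its own set of training rows and test rows, and that both are data, unlike the model learned from the training rows.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is NOT RapidMiner code.
final class FoldSplitSketch {

    /** Row indices used for training and for testing in one fold. */
    static final class Fold {
        final List<Integer> train;
        final List<Integer> test;

        Fold(List<Integer> train, List<Integer> test) {
            this.train = train;
            this.test = test;
        }
    }

    /**
     * Splits row indices 0..numRows-1 into numFolds folds.
     * Assumption: rows are assigned round-robin (row % numFolds),
     * which is simpler than shuffled or stratified sampling.
     */
    static List<Fold> split(int numRows, int numFolds) {
        if (numFolds < 2 || numFolds > numRows) {
            throw new IllegalArgumentException("need 2 <= numFolds <= numRows");
        }
        List<Fold> folds = new ArrayList<>();
        for (int f = 0; f < numFolds; f++) {
            List<Integer> train = new ArrayList<>();
            List<Integer> test = new ArrayList<>();
            for (int row = 0; row < numRows; row++) {
                // A row is a test row in exactly one fold and a training row in all others.
                if (row % numFolds == f) {
                    test.add(row);
                } else {
                    train.add(row);
                }
            }
            folds.add(new Fold(train, test));
        }
        return folds;
    }

    public static void main(String[] args) {
        List<Fold> folds = split(10, 5);
        for (int f = 0; f < folds.size(); f++) {
            System.out.println("fold " + f
                    + " train=" + folds.get(f).train
                    + " test=" + folds.get(f).test);
        }
    }
}
```

In an actual RapidMiner process the rows would come from the ExampleSet delivered at the subprocess ports described in 1), not from index lists like these.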
Regards,
Marco