[SOLVED] Automatic dataset shuffle
Dear All,
I have 5 different datasets (from 5 different users).
I wish to do "user-cross-validation".
Meaning, I wish to test on user n, and train on all other users, for n = 1, ..., 5.
Any way to do this automatically?
I can retrieve all 5 data sets, but after this, I should "dynamically" join them.
Best regards,
Wessel
Answers
And then use the 'linear sampling' option?
Best regards,
Wessel
OK, unfortunately there is no easy out-of-the-box-with-a-single-operator method for this. But - because of the almighty tool-box power of RapidMiner - we can try to mimic a cross-validation with your desired behaviour!
There are actually several methods for this. One could work like this: before appending your datasets, add a special attribute, let us say 'set_id', to every single dataset. This attribute contains the number of the example set (1, 2, 3, ..., k). Then append all of the datasets. After this you can loop k times and filter the training and test data with the help of this attribute: in iteration n, the examples with set_id = n form the test set and all other examples form the training set. After you calculate the performance for each iteration, you can average the results.
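Outside of RapidMiner, the same idea (tag each dataset with a 'set_id', append, loop k times, filter, average) can be sketched in a few lines of plain Python. This is only an illustration of the logic, not the RapidMiner process itself. The function names, the row layout (a `(features, label)` tuple) and the majority-class stand-in learner are all assumptions made for the example; you would swap in your real learner and performance measure.

```python
from collections import Counter
from statistics import mean


def append_with_set_id(datasets):
    """Append all datasets into one list, tagging every row with its set_id (1..k)."""
    combined = []
    for set_id, rows in enumerate(datasets, start=1):
        for features, label in rows:
            combined.append({"set_id": set_id, "features": features, "label": label})
    return combined


def train_majority(rows):
    """Stand-in learner (assumption): always predicts the most frequent training label."""
    majority_label = Counter(r["label"] for r in rows).most_common(1)[0][0]
    return lambda features: majority_label


def accuracy(model, rows):
    """Fraction of rows whose predicted label matches the true label."""
    return mean(1.0 if model(r["features"]) == r["label"] else 0.0 for r in rows)


def leave_one_set_out(datasets, train=train_majority):
    """Test on set n, train on all other sets, for n = 1..k; return per-set and mean accuracy."""
    combined = append_with_set_id(datasets)
    scores = []
    for held_out in sorted({r["set_id"] for r in combined}):
        # Filter on set_id: one set is held out for testing, the rest is used for training.
        train_rows = [r for r in combined if r["set_id"] != held_out]
        test_rows = [r for r in combined if r["set_id"] == held_out]
        model = train(train_rows)
        scores.append(accuracy(model, test_rows))
    return scores, mean(scores)
```

Calling `leave_one_set_out(datasets)` with a list of 5 per-user datasets returns one score per held-out user plus their average, which corresponds to the averaged performance described above.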
Here is an example of such a process with 5 identical iris datasets:
If you find a more elegant or remarkable way to achieve this, feel free to post it here.