"Customized X-fold cross-validation"
Hi,
I want to perform an X-fold cross-validation that does not operate on
the folds defined by RapidMiner's XValidation "sampling_type" parameter,
but on folds constructed from a "marker" in the examples provided by an
ExampleSource operator.
To be more precise, my input examples (pairs of feature vectors and
labels) used for classification contain an attribute that identifies the
application each example was extracted from. Let's say the
examples come from three applications "A", "B", and "C", and each
example contains an attribute holding one of these three characters.
Based on this, I would like to perform a 3-fold cross-validation where, in
the first run, examples from "A" are held out, the model is learned on
examples from "B" and "C" and then tested on the examples from "A", and so on.
Is there an operator for that in RapidMiner?
Regards,
Paul
Answers
There is a special variant of XValidation called BatchXValidation, which uses an attribute with the special role "batch" to define the splitting sets. I post a process below that makes use of this operator.
Greetings,
Sebastian
sorry for the late answer. ;-)
I didn't really get the idea of your model. How does the BatchXValidation operator
work? I assume that it relies on the operator ChangeAttributeRole (also on
AttributeSubsetProcessing?), but it's not clear to me how the operators
communicate.
Let's say I have this example set:
att1; att2; att3; label
1; 2; A; YES
2; 2; A; NO
3; 4; B; YES
1; 4; C; NO
2; 4; C; NO
4; 4; C; YES
and I would like to have a 3-fold cross-validation where in each
run of the validation I want to exclude the examples belonging
to the class (A,B,C) specified by attribute "att3".
Thus, the cross-validation would look something like:
1. step: Exclude examples from class A, learn model for examples
from class B and C, and apply this model to examples from class A
2. step: Exclude B, learn for A and C, apply to B
3. step: Exclude C, learn for A and B, apply to C
How can I model this type of validation?
And is there a way to figure out within the BatchXValidation operator
which examples are currently excluded (like att3=A in 1. step)?
Thank you.
Regards,
Paul
The BatchXValidation operator does not divide examples of the same batch over folds. Instead, each batch is always placed completely into one fold.
So, if you define your attribute att3 as the batch attribute and set the number of validations of the BatchXValidation to the number of distinct values in att3, this should do the trick.
In the first round, the first fold, containing all As, is held out and learning is carried out on the remaining folds. And so on...
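To make the splitting logic concrete outside of RapidMiner, here is a minimal Python sketch of the same idea: one fold per distinct batch value, with the whole batch held out for testing. This is not RapidMiner's implementation. The names `EXAMPLES` and `batch_folds` are made up for illustration, and the last three rows of the example set are read as att1=1/2/4 with att2=4, which is an assumption about the data posted above.

```python
from collections import defaultdict

# Example set from the thread. The att1/att2 values of the three "C" rows
# are an assumption (read as 1;4, 2;4 and 4;4).
EXAMPLES = [
    {"att1": 1, "att2": 2, "att3": "A", "label": "YES"},
    {"att1": 2, "att2": 2, "att3": "A", "label": "NO"},
    {"att1": 3, "att2": 4, "att3": "B", "label": "YES"},
    {"att1": 1, "att2": 4, "att3": "C", "label": "NO"},
    {"att1": 2, "att2": 4, "att3": "C", "label": "NO"},
    {"att1": 4, "att2": 4, "att3": "C", "label": "YES"},
]


def batch_folds(examples, batch_key):
    """Yield (held_out_value, train, test), one fold per distinct batch value.

    A batch is never split: all examples sharing a batch value end up
    together in the test set of exactly one fold.
    """
    by_batch = defaultdict(list)
    for example in examples:
        by_batch[example[batch_key]].append(example)

    for held_out in sorted(by_batch):
        test = by_batch[held_out]
        train = [
            example
            for value, rows in by_batch.items()
            if value != held_out
            for example in rows
        ]
        yield held_out, train, test


if __name__ == "__main__":
    for held_out, train, test in batch_folds(EXAMPLES, "att3"):
        # A learner would be fitted on `train` and applied to `test` here.
        print(f"held out {held_out}: train on {len(train)}, test on {len(test)}")
```

Because the generator yields the held-out batch value together with each split, the sketch always knows which group is currently excluded. Whether BatchXValidation exposes that value inside the process is not covered here.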
I hope this clarifies it?
Greetings,
Sebastian
yes, the x-validation is now clear. Thank you.
Best,
Paul