
Cross validation

Yasmin Member Posts: 5 Contributor I
edited December 2018 in Help
Hello
I have a question about the output of cross validation. If we take 90% of the data for training and 10% for testing, why does the result show the whole data set rather than just the 10% test part?
I would be grateful if someone could answer my question.
Yasmin

Best Answers

  • lionelderkrikor RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
    Solution Accepted

    Hi @Yasmin,

     

    Legitimate question!

    Here is a possible answer:

    For a 10-fold cross validation, RapidMiner actually performs 11 iterations.

    During the last iteration, RapidMiner applies the model to the whole training data set, so the training set and the test set have the same length.

     

    Regards,

     

    Lionel

     

    NB: You can visualize this behaviour by setting a "Breakpoint After" on the Apply Model operator (inside the Cross Validation operator).

     

     

  • tftemme Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, RMResearcher, Member Posts: 164 RM Research
    Solution Accepted

    Hi @Yasmin,

     

    While it is true that the Cross Validation operator builds the final model on the whole data set (and thus performs an 11th iteration of the Training subprocess, if the model port is connected), the Test subprocess is only performed 10 times. That is also why all of your input data appears at the test result port: in every iteration, a different 10% of your input data is used as the test set, so within the Cross Validation every Example of your input data is used exactly once for testing.
    At the outer result port, all test sets are appended together, so you get your whole input data set back. You can visualize this by adding a Generate Attributes operator to the Test subprocess of the Cross Validation and generating an attribute named iteration with the value eval(%{a}) (the macro %{a} contains the number of times the current operator has been applied).

     

    Best regards,
    Fabian

  • Yasmin Member Posts: 5 Contributor I
    Solution Accepted
    Thank you so much @lionelderkrikor and @tftemme for your complete responses.
    Best regards.
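
For readers who want to see the bookkeeping outside RapidMiner, the behaviour Fabian describes can be sketched in a few lines of plain Python. This is only an illustration, not RapidMiner's implementation: the function names, the 100-example data set, and the contiguous (unshuffled) fold assignment are all assumptions made for the example. It counts how often training and testing would run, and tags each tested example with the iteration in which it was tested, much like the iteration attribute suggested above.

```python
def make_folds(n_examples, k):
    """Split the indices 0..n_examples-1 into k contiguous test folds.

    Assumption for illustration: folds are contiguous and unshuffled.
    """
    indices = list(range(n_examples))
    base, extra = divmod(n_examples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(indices[start:start + size])
        start += size
    return folds


def simulate_cross_validation(n_examples=100, k=10, build_final_model=True):
    """Count training/testing runs and collect the appended test sets."""
    training_runs = 0
    testing_runs = 0
    appended_test_rows = []  # (example index, iteration in which it was tested)

    for iteration, test_fold in enumerate(make_folds(n_examples, k), start=1):
        held_out = set(test_fold)
        train_indices = [i for i in range(n_examples) if i not in held_out]
        assert len(train_indices) + len(test_fold) == n_examples

        training_runs += 1  # a model is trained on the remaining ~90%
        testing_runs += 1   # and applied to the held-out ~10%
        appended_test_rows.extend((i, iteration) for i in test_fold)

    if build_final_model:
        # Extra training run on the whole data set; there is no extra test run.
        training_runs += 1

    return training_runs, testing_runs, appended_test_rows


training_runs, testing_runs, rows = simulate_cross_validation()
print(training_runs, testing_runs, len(rows))  # prints: 11 10 100
```

The appended test sets contain every example exactly once, which is why the output has as many rows as the input even though each individual test fold holds only 10% of the data.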