Split-Validation Issue

amitd Member, University Professor Posts: 49 Maven
@sgenzer, I believe there may be an issue with the Split Validation operator. The model that the Split Validation process outputs does not correspond to the model used to compute the validation performance metrics.
I have attached an Excel spreadsheet that reproduces the computations with formulas. The RMSE computed for the validation dataset (using the Performance operator) corresponds to "ValidModel and ApplyModel" (in the Excel worksheet), which is one of the models the process produces when it is dissected with Remember/Recall operators and breakpoints. However, the RapidMiner process outputs a LinearRegression model that is the same as "TrainModel" (in the Excel worksheet), whose RMSE does not match the one given by the Performance (Regression) operator. Why the discrepancy? Which is the correct model here?

I have reproduced this issue with multiple datasets and have documented it in a process that uses the sample Polynomial dataset. Any ideas on what may be going on here?
Answers

  • amitd Member, University Professor Posts: 49 Maven
    Thank you, that makes sense. Ideally, it would have been better to get direct access to the model that is fit on the training data and used for evaluation on the validation partition.
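
For readers who want to see the kind of mismatch discussed in this thread outside RapidMiner, here is a minimal NumPy sketch. It is not RapidMiner's implementation. It assumes, as the reply above suggests, that the validation RMSE comes from a model fit on the training partition only, while the delivered model is fit on a different set of rows (here, hypothetically, the full dataset). The synthetic data, the 70/30 split, and all function names are illustrative.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns [intercept, coefficients...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    """Apply a fitted linear model to new rows."""
    return np.column_stack([np.ones(len(X)), X]) @ coef

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Synthetic regression data (illustrative, not the Polynomial sample dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X @ np.array([1.5, -2.0, 0.5])
     + 0.3 * X[:, 0] ** 2
     + rng.normal(scale=0.5, size=200))

# Assumed 70/30 split into training and validation partitions.
split = 140
X_train, y_train = X[:split], y[:split]
X_valid, y_valid = X[split:], y[split:]

# Model used for the validation metric: fit on the training partition only.
train_model = fit_linear(X_train, y_train)
# Model assumed to be delivered at the end: refit on all rows.
full_model = fit_linear(X, y)

rmse_reported = rmse(y_valid, predict(train_model, X_valid))
rmse_full = rmse(y_valid, predict(full_model, X_valid))

print("train-only coefficients:", train_model)
print("full-data coefficients: ", full_model)
print("validation RMSE, train-only model:", rmse_reported)
print("validation RMSE, full-data model: ", rmse_full)
```

Under these assumptions the two coefficient vectors differ, so recomputing the validation RMSE by hand from the delivered model will not reproduce the reported value. Storing the train-only model separately (for example with Remember/Recall, as described in the question) is one way to keep the model that actually produced the metric.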