How do I use Auto Model for data that I have already split into train and test?
I am trying to solve an imbalanced binary classification problem using a model to predict the minority class (stroke victims). I used oversampling on the training data to make synthetic instances of stroke cases so that I could address the data imbalance issue.
However, I kept the test data at its normal imbalanced distribution rather than oversampling it too, because I want to test my model on the real-world distribution. I would like to use RapidMiner's Auto Model feature, but every time I try, it splits my training data into another train-test split and does its own thing.
How do I use Auto Model while specifying the data that those models should be trained on and the data that it should be tested on?
Answers
You'll need to save the Auto Model output (or just your best model), and then you can use the Apply Model operator to score your hold-out data.
Scoring demo | RapidMiner Studio
Machine learning - classification | RapidMiner Auto Model
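For anyone who wants to see the idea outside the RapidMiner GUI, here is a rough plain-Python sketch of the same workflow: oversample only the training split, fit a model on it, then apply that saved model to the untouched hold-out set. This is not RapidMiner's API. The column names (`age`, `stroke`), the toy data generator, and the one-feature threshold "model" are all made up for illustration, and simple random duplication stands in for the synthetic oversampling mentioned in the question.

```python
import random
from collections import Counter

LABEL = "stroke"  # hypothetical label column name


def make_rows(n, positive_rate, rng):
    """Generate toy rows with one numeric feature ('age') and a binary label."""
    rows = []
    for _ in range(n):
        label = 1 if rng.random() < positive_rate else 0
        age = rng.gauss(70, 8) if label else rng.gauss(45, 12)
        rows.append({"age": age, LABEL: label})
    return rows


def oversample_minority(rows, rng):
    """Duplicate randomly chosen rows of the smaller class until classes are equal in size."""
    by_class = {}
    for row in rows:
        by_class.setdefault(row[LABEL], []).append(row)
    target = max(len(members) for members in by_class.values())
    balanced = []
    for members in by_class.values():
        balanced.extend(members)
        balanced.extend(rng.choices(members, k=target - len(members)))
    return balanced


def fit_threshold(train_rows):
    """Toy 'training': pick the age cutoff that classifies the most training rows correctly."""
    def correct(cut):
        return sum((r["age"] >= cut) == bool(r[LABEL]) for r in train_rows)
    return max(sorted(r["age"] for r in train_rows), key=correct)


def apply_model(threshold, rows):
    """Score rows with the fitted cutoff (the analogue of applying a saved model)."""
    return [1 if r["age"] >= threshold else 0 for r in rows]


def minority_metrics(y_true, y_pred):
    """Precision and recall for the positive (minority) class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


rng = random.Random(42)
train = make_rows(300, 0.10, rng)    # imbalanced training split
holdout = make_rows(200, 0.10, rng)  # imbalanced hold-out, never resampled

balanced_train = oversample_minority(train, rng)  # balance the training data only
model = fit_threshold(balanced_train)             # fit on the balanced data
predictions = apply_model(model, holdout)         # score the real-world distribution

precision, recall = minority_metrics([r[LABEL] for r in holdout], predictions)
print("training class counts:", Counter(r[LABEL] for r in balanced_train))
print("hold-out class counts:", Counter(r[LABEL] for r in holdout))
print(f"hold-out precision={precision:.2f} recall={recall:.2f}")
```

The point of the sketch is only the order of operations: the hold-out rows never pass through the oversampling step, so the precision and recall are measured on the original class balance.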