Test existing model on a different dataset
viktorvanbeerse
Hi,
I have two very similar datasets (same attributes and label), but one of them is incomplete. The assignment is to develop a predictive model (Decision Tree and Logistic Regression) on the "incomplete" data and validate it on the other dataset. In other words, the "incomplete" dataset is the training set and the "complete" dataset is the test set. Does anybody know whether this can be set up with the cross-validation/performance operators?
Thank you in advance
Viktor
Answers
This sounds backwards. A classification model has to be trained on data that carries a label, which means you already have some 'truth' in a historical dataset. For example, you might have a training dataset labeled churn and loyal. You train on that, and then you use the "incomplete" dataset as your scoring set, for which the model generates the predictions.
Hi,
Not sure what you mean by incomplete data, but assuming it means that some attributes have missing values, it should be straightforward, as long as your training data has enough examples for each of the desired labels. See if the sample process below does what you want.
You may need to do some pre-processing though, depending on the learning algorithm you choose.
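To make the idea concrete outside of RapidMiner, here is a minimal plain-Python sketch. It is not the sample process mentioned above, only an illustration under some assumptions: the toy data (tenure, monthly charge, churn label) is made up, "incomplete" is taken to mean missing attribute values, mean imputation is just one possible pre-processing step, and a depth-1 tree (a decision stump) stands in for a full Decision Tree. The point is the workflow: learn the pre-processing and the model on the incomplete set only, then apply both unchanged to the complete set and measure performance there, with no cross-validation involved.

```python
from statistics import mean

def fit_imputer(rows):
    """Learn per-column means from the training rows, ignoring missing (None) values."""
    return [mean(r[j] for r in rows if r[j] is not None) for j in range(len(rows[0]))]

def impute(rows, col_means):
    """Replace every None with the column mean learned on the training set."""
    return [[col_means[j] if v is None else v for j, v in enumerate(r)] for r in rows]

def fit_stump(X, y):
    """Depth-1 decision tree: keep the (column, threshold, polarity) with the best training accuracy."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({r[j] for r in X}):
            for polarity in (1, 0):  # class predicted when value > threshold
                preds = [polarity if r[j] > t else 1 - polarity for r in X]
                acc = sum(p == label for p, label in zip(preds, y)) / len(y)
                if best is None or acc > best[0]:
                    best = (acc, j, t, polarity)
    return best[1:]

def predict(stump, X):
    j, t, polarity = stump
    return [polarity if r[j] > t else 1 - polarity for r in X]

def accuracy(y_true, y_pred):
    return sum(a == b for a, b in zip(y_true, y_pred)) / len(y_true)

# Hypothetical "incomplete" training set: [tenure, monthly_charge], label 1 = churn.
train_X = [[1.0, 80.0], [2.0, None], [None, 75.0], [3.0, 90.0],
           [10.0, 40.0], [12.0, None], [None, 35.0], [15.0, 30.0]]
train_y = [1, 1, 1, 1, 0, 0, 0, 0]

# Hypothetical "complete" test set with the same attributes and label.
test_X = [[1.5, 85.0], [2.5, 70.0], [11.0, 45.0], [14.0, 32.0]]
test_y = [1, 1, 0, 0]

col_means = fit_imputer(train_X)                  # pre-processing is fitted on training data only
stump = fit_stump(impute(train_X, col_means), train_y)

# Apply the same imputer and the trained model to the separate test set.
test_pred = predict(stump, impute(test_X, col_means))
test_acc = accuracy(test_y, test_pred)
print("test accuracy:", test_acc)
```

In RapidMiner terms this corresponds roughly to training on one example set, applying the model to the other, and feeding the result into a performance operator, rather than wrapping everything in a cross-validation.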