How can I apply my model with optimize parameters on a test set?

Samira_123 · May 2020

Hello,

I have a question regarding my classification assignment. I have to predict whether or not donors will donate (class 0 and class 1).

I built a model using the Optimize Parameters operator (as was advised here) with a Random Forest. I got a reasonable kappa, a good confusion matrix and a cost matrix.

The performance of the model is satisfying but I have an issue.
I want to use Apply Model on a test set (loaded with Read CSV) with the model I built using Optimize Parameters. However, when I try to apply the model to get the predictions for this test set, I run into an issue in RapidMiner.

I need to apply the model on this test set to get the class predictions of the donors but unfortunately I can't.

I tried to find information online but didn't find anything relevant. The way I proceed is perhaps not correct.

Once I get the class predictions for this test set, I have to export them with Write CSV.

Thank you,

Wish you all a good weekend!

lionelderkrikor · May 2020

@Samira_123,

The attributes in your training set and in your test set must be exactly the same.
In other words, every preprocessing step you applied to your training set before modelling has to be applied to your test set too.
In particular, I see that your training process uses:
- two Generate Attributes operators: you have to generate the same attributes in your test set before scoring.
- a Nominal to Numerical operator: this operator performs one-hot encoding on your training set and generates new attributes.
You have to apply this operator to the corresponding attributes in your test set too, according to the following principle:

Image: https://us.v-cdn.net/6030995/uploads/editor/nk/vp2l1h17h9bg.png
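Outside RapidMiner, the same principle can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the helper names `fit_one_hot` and `apply_one_hot`, the `region` attribute and the sample rows are all hypothetical, and the code is not what the Nominal to Numerical operator does internally. The point is that the encoding is learned from the training set once and then reused unchanged on the test set.

```python
# Hypothetical helpers illustrating "learn on training data, apply to test data".

def fit_one_hot(rows, column):
    """Learn the list of categories for `column` from the training rows only."""
    return sorted({row[column] for row in rows})

def apply_one_hot(rows, column, categories):
    """Replace `column` with one 0/1 attribute per category learned in training."""
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != column}
        for cat in categories:
            new_row[f"{column} = {cat}"] = 1 if row[column] == cat else 0
        encoded.append(new_row)
    return encoded

# Made-up sample data (not taken from the thread).
train = [
    {"age": 34, "region": "north"},
    {"age": 51, "region": "south"},
    {"age": 42, "region": "west"},
]
test = [
    {"age": 29, "region": "south"},
    {"age": 63, "region": "east"},  # category never seen in training
]

categories = fit_one_hot(train, "region")          # learned once, on training data
train_encoded = apply_one_hot(train, "region", categories)
test_encoded = apply_one_hot(test, "region", categories)

# Both sets now expose exactly the same attributes, in the same order.
print(list(train_encoded[0]) == list(test_encoded[0]))
```

If the encoding were instead learned separately on the test set, the unseen category `east` would create an attribute the model has never seen, and the two sets would no longer match.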

Regards,

Lionel

lionelderkrikor · May 2020

Hi @Samira_123,

In your training process, you can use the Store operator to save your trained model in the RapidMiner repository:

Image: https://us.v-cdn.net/6030995/uploads/editor/uv/kblwruextnvd.png

Then open a new process, retrieve the model from the RapidMiner repository, and use it to score your test set with the Apply Model operator:

Image: https://us.v-cdn.net/6030995/uploads/editor/1u/jutgsc4t6he0.png
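For readers who think in code, the store / retrieve / apply pattern looks roughly like the sketch below. Everything in it is an assumption made for illustration: the toy majority-class "model", the function names `train_majority_model` and `apply_model`, the file name `donor_model.pkl` and the sample rows. It does not reproduce the attached processes; it only shows that training and scoring can be two separate steps connected by a stored model.

```python
import pickle
import tempfile
from collections import Counter
from pathlib import Path

def train_majority_model(rows, label):
    """Toy 'model': remembers the most frequent class in the training data."""
    counts = Counter(row[label] for row in rows)
    return {"label": label, "prediction": counts.most_common(1)[0][0]}

def apply_model(model, rows):
    """Return copies of the rows with a prediction attribute added."""
    name = f"prediction({model['label']})"
    return [{**row, name: model["prediction"]} for row in rows]

# --- training step: build the model and store it ---
train = [
    {"age": 34, "donated": 0},
    {"age": 51, "donated": 1},
    {"age": 42, "donated": 0},
]
model_path = Path(tempfile.mkdtemp()) / "donor_model.pkl"  # hypothetical location
with open(model_path, "wb") as f:
    pickle.dump(train_majority_model(train, "donated"), f)

# --- scoring step: retrieve the stored model and apply it to unlabeled data ---
with open(model_path, "rb") as f:
    stored_model = pickle.load(f)

test = [{"age": 29}, {"age": 63}]
scored = apply_model(stored_model, test)
print(scored)
```

The scoring step never retrains anything: it only needs the stored model and a test set whose attributes match the ones used in training.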

Attached are the two processes, which use the Titanic datasets (training and testing).

If you still encounter an error after trying this solution, please describe the issue and share your process and your data so that we can reproduce, understand and fix it.

Regards,

Lionel

Samira_123 · May 2020

Hi @lionelderkrikor,

Thank you for your answer

I followed these steps. Here are the screenshots and the datasets for my model and my database. I had to join the first 3 tables to build my model, and then I needed to use donors to predict as my test set (this dataset has only one column, 'potentional donors').

I did this initially, but there is still an issue:

Image: https://us.v-cdn.net/6030995/uploads/editor/l5/4uhauyfsdrlz.png

Samira_123 · May 2020

@lionelderkrikor

As there is only one column in donors to predict, I should have joined the 4 tables in the beginning instead of joining only 3 of them.

I was just afraid of biasing my model by using it from the beginning, but I do use Split Data in the optimization process.

Thank you for your answer
