
How to get the performance on test data where the label has no values

User111113 Member Posts: 24 Maven
Hi All,

I tried to find out how my model performs on the training data, and I was able to do that successfully.

         

Now I wanted to see how it will perform on test data, so I added another Apply Model and a Performance operator, and of course my test data, like below.

                          
I got the errors below. That's probably because the label attribute is blank in my test data, since I wanted to see what values the model would predict. I am able to get the prediction results, but can I also see how my model performs on a completely new set of data with no values in the label? If yes, how?
squared_error: unknown

root_mean_squared_error: unknown

If I put a Set Role operator between Apply Model and Performance, I can set that predicted variable as my label, but that isn't right, because the predicted column is not present in the original test data, so that doesn't work either.
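
For context on why the metrics come back as "unknown": root mean squared error is the square root of the average squared difference between the actual and predicted label values, so with an empty label column there is nothing to compare the predictions against. A minimal Python sketch of the idea (the numbers are invented, and this is not the RapidMiner process itself):

```python
import math

def rmse(actuals, predictions):
    """Root mean squared error; needs one known actual value per prediction."""
    if not actuals or len(actuals) != len(predictions):
        return None  # no ground truth: the metric is undefined ("unknown")
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actuals, predictions)) / len(actuals))

predictions = [0.42, 0.35, 0.51]   # hypothetical predicted values for three test rows
actuals = []                       # the label column in the test data is empty

print(rmse(actuals, predictions))  # None: nothing to compare the predictions against
```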

        

Best Answer

  • User111113 Member Posts: 24 Maven
    Solution Accepted
    Yes, the Nov 2019 data was not in the training set, but I feel I have very limited options for how many models I can run. I see only 2 or 3 models, mostly GBT and Random Forest, that work with my data, since it has only one real/integer variable, the response rate, and all the others are polynominal.

Answers

  • varunm1 Member Posts: 1,207 Unicorn
    Can I also see how my model performs on a completely new set of data with no values in the label? If yes, how?
    Nope, regular performance metrics cannot be calculated without the original known label.
    Regards,
    Varun
    https://www.varunmandalapu.com/

    Be Safe. Follow precautions and Maintain Social Distancing

  • User111113 Member Posts: 24 Maven
    @varunm1

    Thank you for your response.

    My next step was to run it without the Performance operator and save the results to an Excel file. Then I used that Excel file as input to the same model to see the error rate, and it came out as 0. Can you tell me why?

    Please see the screenshot below.


      


     
  • User111113 Member Posts: 24 Maven
    I did one more thing and I think I did it right this time.

    The result set generated above came from the model itself, so feeding the same data back to the model would obviously show zero deviation.

    Now I used the original data. For example, I predicted the response rate for Nov 2019, and I already have the actual values, so I fed those in to see how much the result set deviates from the original, and I got a root mean squared error of 0.016,

    which isn't bad. What do you think?
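
The two comparisons above can be made concrete with a small sketch: re-using the model's own predictions as the label can only ever give an error of 0, while comparing the predictions against the real Nov 2019 values gives a meaningful error. In plain Python, with made-up numbers rather than the actual data:

```python
import math

def rmse(actuals, predictions):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actuals, predictions)) / len(actuals))

predictions = [0.40, 0.36, 0.52]       # hypothetical predicted response rates for Nov 2019
actuals_nov_2019 = [0.41, 0.34, 0.53]  # hypothetical actual response rates

# Predictions compared with themselves (predictions re-used as the label): always 0
print(rmse(predictions, predictions))                 # 0.0

# Predictions compared with the real, held-out actuals: a meaningful error
print(round(rmse(actuals_nov_2019, predictions), 3))  # ~0.014 with these made-up numbers
```
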
  • varunm1 Member Posts: 1,207 Unicorn
    If this Nov 2019 data was not in your training set, then that RMSE is low, which is fine. You can try different models and see if you can get better results.
    Regards,
    Varun
    https://www.varunmandalapu.com/

    Be Safe. Follow precautions and Maintain Social Distancing

  • Telcontar120 RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    Basically, to find the performance of newly scored data, you will need to wait until enough time passes for you to assign the label using the same logic that was embedded in your original model development sample. Then you can load that in, merge it with the dataset containing the predictions, and use the typical performance operators on that combined dataset to see how the model did.
    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
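
A rough outline of that workflow in plain Python, with invented ids and values; the same idea in RapidMiner would be merging the stored predictions with the labels once they become available, then applying the usual performance operators to the combined data:

```python
import math

# Predictions saved when the model was applied (e.g. exported to Excel/CSV), keyed by a row id
predictions = {"2019-11-01": 0.42, "2019-11-02": 0.35, "2019-11-03": 0.51}

# Actual response rates that only became available later, keyed by the same id
actuals = {"2019-11-01": 0.40, "2019-11-02": 0.37, "2019-11-03": 0.50}

# Merge the two sets on the shared id, keeping only rows present in both
pairs = [(actuals[k], predictions[k]) for k in predictions if k in actuals]

# Now the usual performance metric can be computed on the combined data
rmse = math.sqrt(sum((a - p) ** 2 for a, p in pairs) / len(pairs))
print(round(rmse, 3))  # ~0.017 with these made-up values
```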