model application/testing problem [SOLVED]
Hello,
I have a problem with model application.
I am using RapidMiner 5.3 with a decision tree as the model. I have already trained the model on labeled data and saved it. What I am trying to do is run the model on unlabeled data. Unfortunately, I get this error message for several attributes (the ones occurring in my decision tree):
Tree: The internal nominal mappings are not the same between training and application for attribute ...
What else could be done to fix the problem?
Thanks a lot for the advice
Regards
Bella
Answers
It could be that the test data contains an attribute with a nominal value that has not been seen during the training phase.
In other words if there is an attribute called "colour" with values "red", "green" and "blue" in the training phase, whilst during the test phase a new value like "purple" is seen, the model does not know how to treat this.
If that is the problem, then you would have to pre-process the test data to check that it is correct and usable by the model. I could imagine using the training data to drive various filtering steps, with any leftover examples in the test data being flagged. In fact, I think I will create a video about it.
regards
Andrew
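As a rough illustration of the check described above, here is a minimal Python sketch, not a RapidMiner process. It assumes the training and test data are available as lists of dictionaries; the function name `find_unseen_values` and the sample rows are hypothetical and only mirror the "colour" example.

```python
def find_unseen_values(train_rows, test_rows, attributes):
    """Return {attribute: sorted values that occur in test_rows but not in train_rows}."""
    unseen = {}
    for attr in attributes:
        # All values the model could have seen for this attribute during training.
        seen = {row[attr] for row in train_rows}
        # Values in the test data that were never seen in training.
        new_values = {row[attr] for row in test_rows} - seen
        if new_values:
            unseen[attr] = sorted(new_values)
    return unseen


# Hypothetical data mirroring the "colour" example.
train = [{"colour": "red"}, {"colour": "green"}, {"colour": "blue"}]
test = [{"colour": "green"}, {"colour": "purple"}]

unseen = find_unseen_values(train, test, ["colour"])
# Test examples carrying an unseen value are flagged instead of being scored.
flagged = [row for row in test if row["colour"] in unseen.get("colour", [])]
```

An empty result from `find_unseen_values` would suggest that unseen values are not the cause and that the problem lies elsewhere.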
Thanks a lot for the answer. Actually, I am testing the model on the same data it was trained on, just to see whether my classification is really good (the performance accuracy value was 81%). In the future I might have to apply it to truly unlabeled data, which is why I wanted to check that it works on the same data with the label attribute removed.
But since RapidMiner splits the data into training and testing parts during model creation, maybe you are right that something new occurred. Could it? How can I check how the program splits the sample into training and testing parts, and how can I control this? Could this have caused the problem?
And if the problem is that not all attribute values are taken into account, is my classification really good then? :(
I have checked that the range of attribute values is the same in the testing and training sets. But maybe that is a separate question?
Thanks for your opinion
Bella
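The error message speaks of "internal nominal mappings" rather than of missing values, so one possible explanation (an assumption, not a confirmed diagnosis) is that two data sets can contain exactly the same set of values and still end up with different value-to-index mappings. The Python sketch below illustrates this under the assumption that indices are assigned in order of first appearance; `build_mapping` is a hypothetical helper, not a RapidMiner function.

```python
def build_mapping(values):
    """Assign each distinct value an index in order of first appearance (assumed scheme)."""
    mapping = {}
    for value in values:
        if value not in mapping:
            mapping[value] = len(mapping)
    return mapping


# Same set of values, but read in a different order.
train_map = build_mapping(["red", "green", "blue", "green"])
apply_map = build_mapping(["blue", "red", "green"])

same_values = set(train_map) == set(apply_map)   # the value ranges match
same_mapping = train_map == apply_map            # the internal indices do not
```

Under that assumption, checking that the range of values matches would not be enough; the order in which values are first encountered would matter too.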
Do you have binominal attributes in your data?
Andrew
Most of the attributes are nominal. But it seems I have solved the problem, or at least I am able to avoid it: in the newer RapidMiner version (6.0) I do not get any error and I am able to predict the labels.
I read more posts on the forum and, based on them, decided to try the process in RM 6.0, as many people have reported bugs in version 5.3. So that's my solution.
Nevertheless, thanks for your advice and help.
Best regards
Bella