ARFF files with ? for nominal data
Hi all,
I'm new to RapidMiner, so I apologize in advance for any stupid comments that I make.
I have an ARFF file on which I am trying to run a Decision Tree. The problem is that one of my nominal variables has only "?" as values, and the decision tree algorithm fails with an error message. If I remove that variable beforehand, it finishes correctly with the right result. Is there any way to alleviate that problem? I am going to automatically process a lot of those ARFF files, which are also automatically generated, so if there is a way to handle the situation more gracefully it would be great.
Thank you very much for the help, it is highly appreciated.
Nikolay
Answers
There are several possible solutions:
As a first step, you can use the Remove Useless Attributes operator to filter out all attributes that always have the same value. This also applies to attributes whose values are always unknown.
Another solution is to use the Replace Missing Values or the Impute Missing Values operator. Their documentation has more information.
Last but not least, you could simply filter out attributes that have missing values with the Select Attributes operator.
Which of these solutions suits you best depends on your task and on what you are going to do with the generated model.
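Since the files are generated automatically, the same filtering could also be done before the data reaches RapidMiner. The following is only a minimal sketch in Python, not something RapidMiner provides: `drop_all_missing_attributes` is a hypothetical helper, and it assumes dense ARFF rows (no sparse `{...}` rows) with no commas inside quoted values.

```python
def drop_all_missing_attributes(arff_text: str) -> str:
    """Return ARFF text without attributes whose every value is '?'.

    Assumptions (not checked): dense data rows, one value per attribute,
    and no commas inside quoted values.
    """
    header, attributes, rows = [], [], []
    in_data = False
    for line in arff_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("%"):
            continue  # skip blank lines and ARFF comments
        lower = stripped.lower()
        if in_data:
            rows.append([value.strip() for value in stripped.split(",")])
        elif lower.startswith("@attribute"):
            attributes.append(stripped)
        elif lower.startswith("@data"):
            in_data = True
        else:
            header.append(stripped)  # e.g. the @relation line

    # Keep an attribute if at least one row has a known value for it.
    # With no data rows at all, keep everything.
    keep = [
        i for i in range(len(attributes))
        if not rows or any(row[i] != "?" for row in rows)
    ]

    out = header + [attributes[i] for i in keep] + ["@data"]
    out += [",".join(row[i] for i in keep) for row in rows]
    return "\n".join(out) + "\n"


sample = """@relation demo
@attribute outlook {sunny,rainy}
@attribute note {a,b}
@attribute play {yes,no}
@data
sunny,?,yes
rainy,?,no
"""

cleaned = drop_all_missing_attributes(sample)
print(cleaned)
```

In the sample, the `note` attribute contains only "?" and is dropped, while the other two attributes and their values are kept. Inside RapidMiner itself, the operators mentioned above remain the more direct route.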
Greetings,
Sebastian
Thank you very much for the quick and helpful reply. It was exactly what I needed.
Best Regards,
Nikolay