RapidMiner: Handling nominal missing attributes
Hi fellas,
I'm a total noob in RapidMiner, I've just installed it on Mac.
My problem is sort of interesting for me. I'd appreciate any help. Here is my process:
In my process, when I apply the "Replace Missing Values" operator to a data set and run the process, only the numeric missing values are replaced (by their average). Nominal (binominal and polynominal) missing values are still missing.
However (this is where it gets strange for me), when I hover over the output node of the "Replace Missing Values" operator in the diagram (process view), it shows all missing values as replaced.
I'd really like to know whether this is a bug or whether I'm making some ridiculous mistake.
Thanks a lot.
sav
Answers
"For nominal attributes the mode is used for the average, i.e. the nominal value which occurs most often in the data. For nominal attributes and replacement type zero the first nominal value defined for this attribute is used. The replenishment "value" indicates that the user defined parameter should be used for the replacement."
They get replaced by the mode. If you don't want this, replace only the numeric variables by subsetting them.
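To make the rule quoted above concrete, here is a minimal sketch in plain Python of "average for numeric, mode for nominal". It is not RapidMiner's code: the function name `replace_missing`, the use of `None` as the missing marker, and the sample table are all assumptions made up for illustration.

```python
from statistics import mean, mode

MISSING = None  # assumption: None stands in for a missing value in this sketch


def replace_missing(column):
    """Fill missing entries: mean for numeric columns, mode for nominal ones."""
    present = [v for v in column if v is not MISSING]
    if not present:
        return list(column)  # nothing to derive a replacement from
    if all(isinstance(v, (int, float)) for v in present):
        fill = mean(present)   # numeric attribute: use the average
    else:
        fill = mode(present)   # nominal attribute: use the most frequent value
    return [fill if v is MISSING else v for v in column]


# Hypothetical example data: one numeric and one nominal attribute.
table = {
    "age": [20, None, 40, 30],
    "colour": ["red", None, "red", "blue"],
}
filled = {name: replace_missing(values) for name, values in table.items()}
```

With this sample data, the missing `age` becomes the average of the other three values and the missing `colour` becomes `"red"`, the most frequent value.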
Hope this helps,
Ernesto
Thanks for the reply.
The problem is that nominal values don't get replaced by anything when I look at the Result perspective/view, and when I export the results, the missing nominal values are still missing while the missing numeric values are replaced by the average. The interesting thing is that all missing values seem replaced when I roll over the result node in the process view (as shown in the pictures above). It is as if the "Replace Missing Values" operator ignores the nominal values when it comes to the results, but not in the process view (where you design the process).
What I want is to replace missing numeric values with their average (which happens) and to replace missing nominal values with their mode (which doesn't happen). This is the output of values.preprocessing (an output of the "Replace Missing Values" operator).
I'd appreciate any help.
Thanks,
Sav
The metadata preview (which you see in the tooltip when hovering over the node) is just that: a preview, which may or may not match what actually happens when the operator executes. Sometimes you cannot know what an operator will produce without actually executing it (and we obviously cannot do that for the preview), so the metadata preview may be wrong.
That aside, there was indeed a bug involved here which prevented the replacement of nominal attributes. I have fixed the problem so your process should work once the next update gets released.
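Since the preview cannot be relied on, one way to check what the operator really produced is to count the missing cells in the exported results. The sketch below is only an illustration in plain Python and makes several assumptions: the results were exported as comma-separated text with a header row, missing cells appear as `?` or as empty strings, and `count_missing` is a hypothetical helper name.

```python
import csv
import io


def count_missing(csv_text, missing_marker="?"):
    """Return {column name: number of missing cells} for CSV text with a header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = {name: 0 for name in reader.fieldnames or []}
    for row in reader:
        for name in counts:
            value = row[name]
            # Treat absent cells, empty cells, and the marker as missing.
            if value is None or value.strip() in ("", missing_marker):
                counts[name] += 1
    return counts


# Hypothetical export: the nominal column still contains one missing value.
sample = "age,colour\n20,red\n30,?\n40,blue\n"
remaining = count_missing(sample)
```

Any column with a non-zero count in `remaining` still has missing values after execution, whatever the tooltip in the process view suggested.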
Regards,
Marco