The Altair Community is migrating to a new platform to provide a better experience for you. In preparation for the migration, the Altair Community is on read-only mode from October 28 - November 6, 2024. Technical support via cases will continue to work as is. For any urgent requests from Students/Faculty members, please submit the form linked here
Replace missing values with random values?
Hi there,
is it possible to replace missing values in nominal attributes with random values?
I have a nominal ID (every feature has its own value, like a name), but some of them are missing. If I replace the missing values with average, min or max values, it would destroy the uniqueness of each feature. Can I tell RM to replace each of the missing values by a different value?
Thanks for the help!
is it possible to replace missing values in nominal attributes with random values?
I have a nominal ID (every feature has its own value, like a name), but some of them are missing. If I replace the missing values with average, min or max values, it would destroy the uniqueness of each feature. Can I tell RM to replace each of the missing values by a different value?
Thanks for the help!
0
Answers
yes this is possible in the latest version of RapidMiner. You can add a "Generate Attributes" operator to your process and then click on edit list, and add.
attribute name: choose the exact attribute name that contains missing values
function expressions: if(missing(YOUR_ATTRIBUTE_NAME), round(rand()*YOUR_DESIRED_NUMBER), YOUR_ATTRIBUTE_NAME)
this will replace all missing values with a random value in the range of your choice.
Regards,
Marco
This solution actually does not work. The problem is, it replaces missing values with random values of a range of that attribute. But I would like to replace missing values with random values that do not occour in the other values of this attribute. This is important because this attribute will become the ID, later in the process.
So the replaced values have to be: not included in this attribute beforehand, may not repeat themselfes, should be random (so in case the count of missing values increases from one execution to another).
I hope this specifies the problem a little better. : )
Best regards!
I'm afraid in that case you will have no choice but to use the Execute Script operator and write your own code.
Regards,
Marco