How to balance examples?
Hello everybody,
I have a classification problem with two classes and one of those classes is in large excess in my data set.
I would like to use roughly equal numbers of the two classes for my learner, so I wonder whether
there is a way to select only a subset of the examples from the class that is in excess.
I looked at the Sampling operator, but that samples the same fraction from all classes.
Many thanks,
axel
Answers
There is probably a much smarter way of doing this, but I'm too wrecked to think of it ;D, so you'll have to make do with the following... You'd better test it as well, as I haven't!
Have fun...
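For readers who want to see the idea outside RapidMiner, here is a minimal Python sketch of selective downsampling: every class is cut down to the size of the smallest one. It is not a RapidMiner process and not necessarily what the reply above had in mind. The function name `undersample`, the `label_of` callback, and the toy data are all made up for illustration.

```python
import random
from collections import defaultdict


def undersample(examples, label_of, seed=0):
    """Return a shuffled subset in which every class has as many
    examples as the smallest class (hypothetical helper)."""
    # Group the examples by their class label.
    by_label = defaultdict(list)
    for ex in examples:
        by_label[label_of(ex)].append(ex)

    # The rarest class determines how many examples each class keeps.
    target = min(len(group) for group in by_label.values())

    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    balanced = []
    for group in by_label.values():
        # Draw `target` examples without replacement from each class.
        balanced.extend(rng.sample(group, target))

    rng.shuffle(balanced)  # avoid having the classes in contiguous blocks
    return balanced


# Toy data: 95 "neg" examples and 5 "pos" examples.
data = [(i, "neg") for i in range(95)] + [(i, "pos") for i in range(5)]
balanced = undersample(data, label_of=lambda ex: ex[1])
```

With the toy data, `balanced` holds 5 examples of each class. The price is that 90 of the majority-class examples are thrown away, which may or may not be acceptable depending on how much data you have.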
If your learner supports weighted examples, you could use the equal label weighting operator. It assigns example weights so that every label receives the same total weight.
But I guess we should add some sort of balancing operator in the future...
Greetings,
Sebastian
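To illustrate what equal label weighting amounts to, here is a small Python sketch. It is only an assumption about the general idea, not the operator's actual implementation: the function name `equal_label_weights` and the `total_weight` parameter are hypothetical.

```python
from collections import Counter


def equal_label_weights(labels, total_weight=None):
    """Give each example a weight so that the weights of every label
    sum to the same value (hypothetical helper)."""
    counts = Counter(labels)
    if total_weight is None:
        # Assumption: keep the overall weight equal to the number of examples.
        total_weight = float(len(labels))

    # Each label gets an equal share of the total weight...
    per_label = total_weight / len(counts)
    # ...which is then split evenly among that label's examples.
    return [per_label / counts[y] for y in labels]


# Toy data: 95 "neg" examples and 5 "pos" examples.
labels = ["neg"] * 95 + ["pos"] * 5
weights = equal_label_weights(labels)
```

Here each rare "pos" example ends up with a much larger weight than each "neg" example, so both classes contribute equally to a weight-aware learner, and no examples have to be discarded.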
That's not very nice, but it works!
Many thanks,
Axel
P.S. But I think RapidMiner really needs a special operator for this...
Thanks
Alejandro
Well, I think you either have to install RapidMiner 4.x, load the process there, store it, and import the file, or you could take another valid RapidMiner 4.x process file, insert the code there, and import it with RapidMiner 5.0.
Or you simply build the process manually from scratch...
Greetings,
Sebastian