Dealing with Imbalanced Data
I'm studying the consequences of imbalanced data. I'm trying to replicate some earlier papers on the topic (e.g. Japkowicz 2002).
This is what I need to do, but I'm stuck:
1) Take the original dataset
2) Split it according to the value of the label (call the two new example sets Common and Rare).
3) Resample (bootstrap) the Rare ExampleSet until it has the same size as the Common ExampleSet.
4) Join the resampled Rare with the old Common.
I can do it outside Rapid-I, but I was wondering if it can be done with a few operators.
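For reference, here is a minimal Python sketch of the four steps as they could be done outside Rapid-I. It assumes exactly two label values. The function name `oversample_rare`, the `label_of` accessor, and the toy data are hypothetical and only for illustration; this is not a Rapid-I process.

```python
import random
from collections import Counter


def oversample_rare(rows, label_of, seed=0):
    """Bootstrap the rare class up to the size of the common class.

    rows     -- list of examples (step 1: the original dataset)
    label_of -- hypothetical accessor returning the label of one example
    """
    rng = random.Random(seed)

    # Step 2: split the examples by label value.
    groups = {}
    for row in rows:
        groups.setdefault(label_of(row), []).append(row)
    if len(groups) != 2:
        raise ValueError("this sketch assumes exactly two label values")

    # The smaller group is Rare, the larger one is Common.
    rare, common = sorted(groups.values(), key=len)

    # Step 3: draw from Rare with replacement until it matches Common in size.
    resampled_rare = rng.choices(rare, k=len(common))

    # Step 4: put the old Common and the resampled Rare back together.
    return common + resampled_rare


# Toy data: 90 common examples and 10 rare ones.
data = [(i, "common") for i in range(90)] + [(i, "rare") for i in range(90, 100)]
balanced = oversample_rare(data, label_of=lambda row: row[1])
print(Counter(label for _, label in balanced))
```

On the toy data both classes end up with 90 examples, and every resampled Rare example is a copy of one of the original 10.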
Thanks in advance for any help,
\E
Answers
http://rapid-i.com/rapidforum/index.php/topic,1246.msg4786.html#msg4786