Class imbalance & GenerateWeight use
Hi everyone,
I have a highly imbalanced example set and I want to use weighting to increase the performance of the classifiers in my project. The problem is that once I put the Generate Weight operator into my process, some classifier operators like SVM, Logistic Regression, Random Forest, etc. show the message "Input example set has weights, but the learner will ignore them".
Could somebody help me with how to use Generate Weight on imbalanced example sets?
Any other ideas on how to balance an example set using RM are welcome.
Thanks.
Answers
Hi,
yes, many learning methods do not support weighted examples, so weighting is not a way around this problem if you want to use them. You can right-click an operator and select Show Operator Info to see its capabilities, including whether it supports weighted examples.
So you either choose a method that does support example weights, or you cannot use them.
In any case, if you sample, weight or otherwise bias your training data set, be aware that this shifts the class prior probabilities. Say you give each class a 50%-50% weight: the algorithm will then predict roughly 50% of the examples as belonging to the minority class. Not sure if that's what you want?
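To make the prior-shifting point concrete: a common "balanced" weighting heuristic assigns each example a weight of n_total / (n_classes * n_class), so that every class contributes the same total weight mass. A minimal Python sketch (illustrative only; the function name and the 90/10 data are made up, and this is not RapidMiner's implementation):

```python
from collections import Counter

def balanced_weights(labels):
    """Weight each example by n_total / (n_classes * n_class),
    so every class carries the same total weight."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return [n / (k * counts[y]) for y in labels]

labels = ["neg"] * 90 + ["pos"] * 10  # a 90/10 imbalanced set
w = balanced_weights(labels)

# After weighting, each class holds half of the total weight mass,
# which is why a learner trained on these weights behaves as if the
# class priors were 50%-50%.
neg_mass = sum(wi for wi, y in zip(w, labels) if y == "neg")
pos_mass = sum(wi for wi, y in zip(w, labels) if y == "pos")
```

Each minority example ends up counting as much as nine majority examples here, which is exactly the prior shift Sebastian describes.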
I would recommend the following approach:
1. Select a useful performance measure (accuracy is most likely not useful on a highly imbalanced data set)
2. Optimize the classifiers according to that measure
3. Optimize feature selection
4. Find a good threshold on the confidence values to split on. By default the split is at 50% in a two-class problem, but perhaps you want to detect more true positives and accept more false positives in return; then you can shift the split value. If you have weights, you can calculate that threshold on the training data.
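Step 4 can be sketched in a few lines of Python (illustrative only; the confidence values and labels below are made up, not from any real model):

```python
def classify(confidences, threshold=0.5):
    """Label an example positive when its confidence for the
    positive class reaches the threshold."""
    return ["pos" if c >= threshold else "neg" for c in confidences]

# Hypothetical confidences for the positive (minority) class:
conf  = [0.9, 0.6, 0.4, 0.3, 0.2]
truth = ["pos", "pos", "pos", "neg", "neg"]

default = classify(conf)        # 50% split misses the 0.4 positive
shifted = classify(conf, 0.35)  # lower threshold catches it

def recall(pred):
    hits = sum(p == t == "pos" for p, t in zip(pred, truth))
    return hits / truth.count("pos")
```

Lowering the threshold from 0.5 to 0.35 raises recall from 2/3 to 1.0 on this toy data, at the cost of being more willing to flag false positives.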
Greetings,
Sebastian
Here's a good article on how to handle class imbalance:
http://www.ele.uri.edu/faculty/he/PDFfiles/ImbalancedLearning.pdf
Hi @crojasm,
Besides the really good tips already given, you can also try out the SMOTE Upsampling operator from the Operator Toolbox Extension.
It allows you to upsample your minority class.
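For anyone curious what the operator does under the hood: SMOTE creates synthetic minority examples by interpolating between a minority point and one of its nearest minority neighbors. A minimal Python sketch of that core idea (simplified to k=1 neighbor and toy 2-D points; this is not the extension's actual code):

```python
import random

def smote(minority, n_new, k=1, seed=42):
    """Minimal SMOTE sketch: synthesize new minority points on the
    segment between a sample and one of its k nearest neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        a = rng.choice(minority)
        # nearest neighbors of a (excluding a itself), by squared distance
        neighbors = sorted(
            (p for p in minority if p is not a),
            key=lambda p: sum((x - y) ** 2 for x, y in zip(a, p)),
        )[:k]
        b = rng.choice(neighbors)
        gap = rng.random()  # random position along the segment a -> b
        synthetic.append(tuple(x + gap * (y - x) for x, y in zip(a, b)))
    return synthetic

minority = [(1.0, 1.0), (1.2, 0.9), (2.0, 2.1)]  # made-up minority points
new_points = smote(minority, n_new=2)
# Each synthetic point lies between two real minority points.
```

Unlike plain duplication, the synthetic points are new, slightly different examples, which tends to generalize better than copying the minority rows verbatim.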
Best regards,
Fabian