Need help on removing classifier model skew
Hi,
To provide a brief background to my exercise:
My objective is to create an SVM classifier model that classifies customer feedback (attribute) into one of several categories (label). For this I am generating features from the feedback verbatims, which I then pass as attributes to the model.
The issue I am facing is that, judging from the classification errors, the model is highly skewed towards the categories with the highest number of occurrences (for the highest-frequency segment: class precision is low but class recall is high), i.e. the lower-frequency categories are also being predicted as the highest-frequency one. I have tried weighting the lower-frequency segments to compensate for the differences in occurrence counts, but the errors only get magnified. Please let me know if there is any other way this can be controlled.
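The class-weighting step described above can be sketched as follows. This uses scikit-learn purely for illustration (the thread is about RapidMiner, which has its own weighting operators), and the synthetic `X`/`y` below are stand-ins for the generated verbatim features and category labels:

```python
# Sketch of class weighting for an imbalanced SVM (scikit-learn stand-in).
from sklearn.svm import SVC
from sklearn.datasets import make_classification

# Imbalanced toy data: roughly 90% of examples in class 0, 10% in class 1.
X, y = make_classification(n_samples=500, weights=[0.9, 0.1], random_state=0)

# 'balanced' scales each class weight by n_samples / (n_classes * count(class)),
# so errors on the rare class cost proportionally more during training.
clf = SVC(kernel="linear", class_weight="balanced")
clf.fit(X, y)
predictions = clf.predict(X)
```

If weighting alone magnifies the errors, it is worth checking whether the weights are being applied during training (as above) rather than only to the evaluation.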
Many thanks in advance,
Ram
Answers
You might also look at the MetaCost operator to increase the penalty for misclassifying the rarer instances.
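For illustration, the core MetaCost idea (estimate class probabilities with a bagged ensemble, relabel each training example with the class of minimum expected cost, then retrain on the relabeled data) can be sketched like this. The cost-matrix values and the scikit-learn setup are assumptions for the sketch, not the RapidMiner implementation:

```python
# Minimal sketch of the MetaCost relabeling idea (illustrative costs).
import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=500, weights=[0.9, 0.1], random_state=0)

# cost[i][j] = cost of predicting class j when the true class is i;
# here misclassifying the rare class 1 is made 5x as expensive.
cost = np.array([[0.0, 1.0],
                 [5.0, 0.0]])

# Estimate class probabilities with a bagged ensemble, as MetaCost does.
bag = BaggingClassifier(DecisionTreeClassifier(), n_estimators=25, random_state=0)
proba = bag.fit(X, y).predict_proba(X)   # shape (n_samples, n_classes)

# Relabel each example with the class that minimizes its expected cost.
expected_cost = proba @ cost             # column j = E[cost | predict class j]
y_relabel = expected_cost.argmin(axis=1)

# Train the final model on the relabeled data.
final = SVC(kernel="linear").fit(X, y_relabel)
```

Raising the penalty on the rare class shifts borderline examples toward it, which is exactly the counterweight to the skew described in the question.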
Hope that helps. I'm sure other people smarter than me will chime in as well. :-)
Keith
I tried using the MetaCost operator in my modeling flow today; however, I got an error saying that it cannot take numerical attributes, and I am also unable to figure out whether it should go before or after the XValidation operator in the flow. Could you please provide a link where I can find information on this?
Thanks,
Ram
Probably there is an error in your process setup. It seems to me that you have used a learner inside the MetaCost operator that does not support numerical attributes. You should check that.
Greetings,
Sebastian
Please be a little more specific. It would be a great help if you posted the process, for example, and described what you are trying to do.
Greetings,
Sebastian
Although you only sent a small part of the process, I can definitively say that this will not work. The MetaCost operator needs an inner learner to operate; hence it is called MetaCost. It works simply like this:
For performing a cross-validation you need an inner learner. You want to modify the SVM for the imbalanced class set by using the MetaCost operator. So put the SVM directly into the MetaCost operator, and then put the MetaCost operator as the learner inside the SVM.
Greetings,
Sebastian
I tried that and it works. So, I think I am using it correctly. I'm getting the balancing I'm after. Thanks for your help!!
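The nesting that ends up working here (cross-validation outermost, the cost-sensitive wrapper inside it, the SVM innermost) can be sketched in scikit-learn terms. Since scikit-learn ships no MetaCost operator, `class_weight` on the SVM is used below as a stand-in for the cost-sensitive layer; this is an analogy, not the RapidMiner process itself:

```python
# Nesting sketch: cross-validation wraps the cost-sensitive learner,
# which wraps the SVM (class_weight stands in for MetaCost here).
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=500, weights=[0.9, 0.1], random_state=0)

inner_learner = SVC(kernel="linear", class_weight="balanced")  # MetaCost analogue + SVM
scores = cross_val_score(inner_learner, X, y, cv=10)           # outer XValidation analogue
print(scores.mean())
```

The key point matches the thread: the cost-sensitive learner is what the cross-validation trains and tests on each fold, so it must sit inside the validation, not before or after it.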
What confused you was my confusion. Of course I meant it the way you actually did it.
Greetings,
Sebastian