Answers
The confidence represents a scoring or ranking of your items. To estimate the quality of your classification you can look at the ROC curve or the AUC value. The AUC, for instance, cannot be biased by class skew, whereas accuracy can easily be tricked this way (as you mentioned above).
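A minimal sketch of this point (assuming scikit-learn and NumPy, which are not the tool discussed in this thread): on a heavily skewed data set, a degenerate model that always predicts the majority class gets a high accuracy, while its AUC stays at chance level.

```python
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score

rng = np.random.default_rng(0)
y_true = (rng.random(1000) < 0.05).astype(int)   # ~5% positives -> strong class skew

# Degenerate "model": always predicts the negative class with a constant score
y_pred = np.zeros_like(y_true)
y_score = np.zeros(len(y_true), dtype=float)

print("accuracy:", accuracy_score(y_true, y_pred))   # ~0.95, looks great
print("AUC:     ", roc_auc_score(y_true, y_score))   # 0.5, reveals no ranking ability
```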
To obtain the probabilities you mentioned: the calculated confidences are approximations of the probabilities. To improve these approximations you use calibration methods. As far as I know, the only method implemented is Platt scaling, which is the best calibration method for the output of SVM classifiers and a moderately good method for the output of other classifiers.
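For illustration only, a minimal sketch of Platt scaling using scikit-learn (again, an assumption and not the tool discussed here): a sigmoid is fitted to the raw SVM scores on held-out data, mapping them to calibrated probability estimates.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC
from sklearn.calibration import CalibratedClassifierCV

# Synthetic, imbalanced example data
X, y = make_classification(n_samples=2000, weights=[0.8, 0.2], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

svm = LinearSVC()  # produces uncalibrated decision scores, no probabilities

# method="sigmoid" is Platt scaling: fit p(y=1|s) = 1 / (1 + exp(A*s + B))
# on cross-validated decision scores s
calibrated = CalibratedClassifierCV(svm, method="sigmoid", cv=5)
calibrated.fit(X_train, y_train)

proba = calibrated.predict_proba(X_test)[:, 1]   # calibrated confidences
print(proba[:5])
```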
Feel free to ask if anything I explained is unclear.
Hope this was helpful,
kind regards,
Steffen
PS: Indeed, changing the true class distribution can hurt the performance
Strangely, even with Platt scaling my results look incorrect. The system predicts EXACTLY THE SAME 62% confidence for every case to fail. That indicates it didn't learn the model well.
As I said in a PM, Platt scaling can help. Maybe the classification algorithm used is simply not capable of learning the concept, or maybe the set is too small, so that Platt scaling overadjusts the confidences. With the information I currently have about your situation, I cannot tell you more.
regards,
Steffen
PS: A quote that describes the situation of data miners perfectly: