"Final prediction in bagging algorithm"
adrian_crouch
Member Posts: 8 Contributor II
Hello RM community,
I'm not certain whether I'm wrong, but I always thought that the bagging meta algorithm should select the final prediction on the basis of a majority vote (in classification). If instead the numeric confidences generated by the individual models for a label value are averaged, the final confidence may not directly map to the final prediction.
Let's say we have three models that are aggregated, and for a given example in a binominal classification the models predict confidences of 0.4, 0.4 and 0.9 for class 'A' and 0.6, 0.6, 0.1 respectively for class 'B'. When averaging these confidences, class 'A' would get a confidence of 0.567 and class 'B' 0.433. In a majority voting approach, however, I would expect 'B' as the finally predicted class, as it was predicted two times by the three models while class 'A' was predicted only once.
This does not match the implementation in the BaggingModel (version 5.3.008). There, the label value with the highest averaged confidence is finally chosen - which for the example above is 'A', due to its higher averaged confidence of 0.567.
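The discrepancy can be reproduced with a small sketch (plain Python for illustration, not RapidMiner's actual BaggingModel code):

```python
# Compare majority voting with confidence averaging for the three-model
# example from the post. Class names and numbers are taken from the post.

def majority_vote(confidences):
    """Each model votes for its highest-confidence class; the class with
    the most votes wins (ties broken arbitrarily by max())."""
    votes = {}
    for model_conf in confidences:
        winner = max(model_conf, key=model_conf.get)
        votes[winner] = votes.get(winner, 0) + 1
    return max(votes, key=votes.get)

def average_confidence(confidences):
    """Average each class's confidence across models and predict the
    class with the highest average (the BaggingModel-style behavior
    described in the post)."""
    classes = confidences[0].keys()
    avg = {c: sum(m[c] for m in confidences) / len(confidences)
           for c in classes}
    return max(avg, key=avg.get), avg

models = [
    {"A": 0.4, "B": 0.6},
    {"A": 0.4, "B": 0.6},
    {"A": 0.9, "B": 0.1},
]

print(majority_vote(models))            # 'B': two of three models prefer B
pred, avg = average_confidence(models)
print(pred, avg)                        # 'A': averaged confidence 0.567 vs 0.433
```

So both strategies are internally consistent, but they can disagree on individual examples, which is exactly the case constructed above.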
Could someone tell me if I made a mistake with my thinking here?
Many thanks,
Adrian
Answers
It simply comes down to a weighted or unweighted average. I think both are useful. Breiman's original random forest implementation used the unweighted form.
~Martin
Dortmund, Germany
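A minimal sketch of the distinction Martin mentions, applied to the confidences for class 'A' from the example above. The weights below are invented for illustration (e.g. per-model accuracy estimates); they are not part of the original post.

```python
# Unweighted vs weighted averaging of per-model confidences for class 'A'.
confs_A = [0.4, 0.4, 0.9]

# Unweighted average: every model contributes equally.
unweighted = sum(confs_A) / len(confs_A)

# Weighted average with hypothetical per-model weights (assumed numbers).
weights = [0.9, 0.8, 0.5]
weighted = sum(w * c for w, c in zip(weights, confs_A)) / sum(weights)

print(round(unweighted, 4))  # 0.5667
print(round(weighted, 4))    # 0.5136
```

With equal weights the two formulas coincide, so the choice only matters when the ensemble members are trusted to different degrees.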
So I don't exactly get the point. Am I misinterpreting something or is it indeed a bug in the bagging implementation?