prediction confidence
Hello, can someone explain to me how the prediction confidence columns work and how they are calculated when I apply a classification model to a test set? Thanks.
Answers
whoa, that's a pretty broad question with no definite answer for all learning schemes.
In general, the prediction confidences state how sure the model is about each of the possible class values. This is similar to probabilities ("How large is the probability that the class is 'positive'?") but not necessarily the same.
How are they calculated? Well, that differs between model types. For schemes like Naive Bayes and Logistic Regression, the confidences are indeed probabilities estimated from the training data. If you use an SVM and apply a scaling like Platt scaling, the confidences are at least pretty close to probabilities. For other schemes, things can be different. For example, the confidence of a decision tree is the fraction of training cases of the predicted class in the applicable leaf relative to the total number of cases in that leaf.
There are really only two ways to deal with this: simply accept the confidences as a measurement of how sure the model is and believe them, or do it the hard way and read the literature on each model type to learn how its confidences are calculated in detail. The source code might also help here.
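To make the decision-tree case concrete, here is a minimal sketch in Python with scikit-learn (purely illustrative, not a RapidMiner process): for a tree, predict_proba returns the class fractions of the leaf each example lands in, which is exactly the kind of confidence described above.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Train a small tree and inspect its "confidences"
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X_train, y_train)

# For a decision tree, predict_proba is the fraction of training cases
# of each class in the leaf the example falls into, i.e. the confidence.
proba = tree.predict_proba(X_test)
pred = tree.predict(X_test)

for p, conf in list(zip(pred, proba))[:5]:
    print(f"prediction={p}, confidences={conf.round(3)}")
```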
Cheers,
Ingo
This is all fine for me, no need to understand it all in detail, but I would like to put a threshold on the confidence after the model applier is finished, to keep only the examples above or below that threshold. But I can't, since the confidence is a special attribute. Or am I doing something completely wrong?
well, you have several options for this.
You could use
- the operator "Generate Attributes" (you will have to rename the confidence attributes before since the parentheses would cause problems otherwise...)
- one of the discretization operators
- the operator "Drop Uncertain Predictions" (although this one does not exactly divide your data into discrete bins...)
If the fact that the confidence is a special attribute is a problem somewhere, you could either check the setting "include special attributes" or use the operator "Set Role" before the data transformation is applied. Here is an example using the operator "Generate Attributes":
And here is an example using one of the discretization operators:
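The same thresholding idea can be sketched outside RapidMiner in Python (purely illustrative, assuming a scikit-learn model and a hypothetical cutoff of 0.8): keep only the test examples whose confidence for the predicted class is at or above the threshold.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=5000).fit(X_train, y_train)

proba = model.predict_proba(X_test)   # confidence per class
pred = model.predict(X_test)          # predicted class
confidence = proba.max(axis=1)        # confidence of the predicted class

threshold = 0.8                       # hypothetical cutoff
keep = confidence >= threshold        # analogous to filtering after "Generate Attributes"

print(f"kept {keep.sum()} of {len(pred)} predictions with confidence >= {threshold}")
certain_predictions = pred[keep]
```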
Cheers,
Ingo
Hi there,
just to add on this:
Are the values indicative of how sure we are, in the sense that if the confidence value is 0.785, we could say we are 78.5% confident that this prediction falls into this category?
Or is it more along the lines of: 78.5% of entries like this fall into this category too?
Lindon Ventures
Data Science Consulting from Certified RapidMiner Experts
The second reading (the share of similar entries that fall into the category) is related, but it is ultimately not the same as the confidence for an individual prediction (or even a set of predictions), and it is itself subject to skew based on the confidence threshold selected for classification purposes (see the earlier part of this same thread for a discussion of setting thresholds).
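One way to see how close the two readings are for a given model is a reliability check: bin the predictions by confidence and compare each bin's average confidence with the fraction of correct predictions in that bin. A minimal sketch in Python (illustrative only, using scikit-learn's calibration_curve, not part of RapidMiner):

```python
from sklearn.calibration import calibration_curve
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = GaussianNB().fit(X_train, y_train)
conf_positive = model.predict_proba(X_test)[:, 1]   # confidence for the positive class

# Bin predictions by confidence and compare the mean confidence of each bin
# with the observed fraction of positives in that bin.
frac_positive, mean_confidence = calibration_curve(y_test, conf_positive, n_bins=5)

for conf, frac in zip(mean_confidence, frac_positive):
    print(f"mean confidence {conf:.2f} -> observed positive rate {frac:.2f}")
```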
Lindon Ventures
Data Science Consulting from Certified RapidMiner Experts