"leave_one_out_performance_problem"
bojansimoski
Member Posts: 2 Contributor I
Hello guys,
I'm using X-Validation for my analysis and have a question about interpreting the results from the Performance operator. For the accuracy of the classifier I get something like: accuracy: 65.38% +/- 36.08%. My question is about the second value, 36.08%. What is it, and how is it computed? I should mention that I use the leave-one-out technique.
Many Thanks!!
Answers
The first part of the displayed accuracy is the mean accuracy of all N models, and the second part is the standard deviation.
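To illustrate the mean and standard deviation described above, here is a minimal Python sketch, not RapidMiner's actual code. It assumes a hypothetical leave-one-out run with 26 examples, 17 of them classified correctly, and it assumes the population standard deviation (dividing by N) is the one used. The function name `loo_accuracy_summary` is made up for this example.

```python
from statistics import mean, pstdev


def loo_accuracy_summary(correct_flags):
    """Return (mean, population std) of per-fold accuracies.

    In leave-one-out each fold tests a single example, so every
    per-fold accuracy is either 1.0 (correct) or 0.0 (wrong).
    """
    fold_accuracies = [1.0 if flag else 0.0 for flag in correct_flags]
    return mean(fold_accuracies), pstdev(fold_accuracies)


# Hypothetical data (an assumption, not taken from the thread):
# 26 examples, 17 classified correctly.
flags = [True] * 17 + [False] * 9
p, spread = loo_accuracy_summary(flags)

print(f"accuracy: {p:.2%} +/- {spread:.2%}")
```

Under these assumptions the spread equals sqrt(p(1-p)), which is why it becomes so large when every fold holds only one example.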
Best,
Marius
And the results in that situation are strange to interpret.
I got results like
84.26 +/- 36.08 or 63.38 +/- 47.57
In both cases, if I assume that this standard deviation is computed as sqrt(p(1-p)), taking p = accuracy (for instance p = 0.8426, giving sqrt(0.8426(1-0.8426))), I get the value of the standard deviation shown. But I think this is not OK, because accuracy does not follow a Bernoulli distribution. I think the value should be further divided by sqrt(N). So my question is the same as Bojan's: how is this standard deviation computed?
Thank you.
AMT
Best,
Marius
But I do not think that is what was used here. With one test example per fold, each result is either correct or incorrect.
At the end of the n iterations you obtain a count variable with a binomial distribution, since each iteration is a Bernoulli trial.
What I was pointing out is that this standard deviation seems to be estimated with the formula for the standard deviation of a Bernoulli distribution, sqrt(p(1-p)), and I did not find this on the Wikipedia page you pointed to. So how is the standard deviation really estimated?
Another point: how do you interpret a result like the ones I showed, where the performance can have such a large spread, even exceeding 100%?
You can transform: p(1-p) = p - p^2, which is what the standard variance formula reduces to when the values are only 0 or 1; its square root is the standard deviation.
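The transformation can be written out step by step. This is a sketch that assumes the population form of the variance (dividing by N); the symbols x_i (the 0/1 result of fold i) and N (the number of folds) are introduced here for illustration. The key step is that x_i^2 = x_i whenever x_i is 0 or 1.

```latex
\[
\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i = p, \qquad x_i \in \{0, 1\}
\]
\[
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} x_i^2 - \bar{x}^2
         = \frac{1}{N}\sum_{i=1}^{N} x_i - p^2
         = p - p^2 = p(1-p)
\]
\[
\sigma = \sqrt{p(1-p)}
\]
```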
Best,
Marius
But this is the point. I think that to compute the std (standard deviation) of the accuracy you need to further divide by sqrt(n). What do you think?
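The division by sqrt(n) suggested here corresponds to the usual standard error of a proportion. A minimal Python sketch follows, again assuming a hypothetical run with 26 examples and 17 correct; the helper name `standard_error` is made up for this example and is not a RapidMiner function.

```python
from math import sqrt


def standard_error(n_correct: int, n_total: int) -> float:
    """Standard error of an accuracy estimated from n_total 0/1 outcomes."""
    p = n_correct / n_total
    return sqrt(p * (1 - p) / n_total)


# Hypothetical data (an assumption, not taken from the thread).
n_correct, n_total = 17, 26
p = n_correct / n_total

spread = sqrt(p * (1 - p))               # spread of the individual 0/1 outcomes
se = standard_error(n_correct, n_total)  # same spread divided by sqrt(n)

print(f"accuracy {p:.2%}, per-fold spread {spread:.2%}, standard error {se:.2%}")
```

The per-fold spread describes how much the individual 0/1 outcomes vary, while the standard error describes the uncertainty of the mean accuracy itself, so it shrinks as the number of examples grows.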
Greetings
A.M. Tomé
With per-fold accuracy values of only 0 or 1, the usefulness of this value is certainly questionable. The same applies to the +/- notation, since it is not the error of the accuracy.
We will discuss that here at Rapid-I. Thanks for your input!
Best,
~Marius
Any news about this comment?
AMT
Best regards,
Marius