"[SOLVED] Performance Operator Evaluation - Mathematics / Derivation"

jaysunice3401 Member Posts: 6 Contributor II
edited June 2019 in Help
Would someone be able to enlighten me on the mathematics behind some of the performance evaluation metrics and/or point me to a nice resource/website?  Specifically, if I am using a Performance (Classification) Operator, I would like to know how the following are derived:
  • Accuracy: specifically, the +/- %
  • The difference between the mikro percentages and the given percentages
  • Classification Error vs. Relative Error vs. Root Mean Squared Error
  • Why the +/- % values for Accuracy, Weighted Mean Recall, and Classification Error differ from one another
Thank you.

Jason

Answers

  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Hi Jason,

    - the accuracy is an estimate of the probability that a new example is classified correctly. It is calculated as #correctPredictions / #numberOfExamples
    - the classification error is 1 - accuracy
    - the absolute error is calculated via the following formula: sum(1 - confidence(trueClass)) / #numberOfExamples
    - the relative error is the absolute error expressed as a percentage: absolute_error * 100%
    - the root mean squared error is calculated as: sqrt( sum( (1 - confidence(trueClass))^2 ) / #numberOfExamples )
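
As a rough illustration, the formulas above can be sketched in Python. This is not RapidMiner code: the function name `classification_metrics` and its arguments are hypothetical, and it assumes you already have, for every example, the true label, the predicted label, and the confidence the model assigned to the true class.

```python
import math


def classification_metrics(true_labels, predicted_labels, true_class_confidences):
    """Compute the metrics listed above (hypothetical helper, not a RapidMiner API).

    true_class_confidences[i] is the confidence the model gave to the
    TRUE class of example i, i.e. confidence(trueClass).
    """
    n = len(true_labels)
    if n == 0 or n != len(predicted_labels) or n != len(true_class_confidences):
        raise ValueError("inputs must be non-empty and of equal length")

    # accuracy = #correctPredictions / #numberOfExamples
    correct = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    accuracy = correct / n

    # deviation of each example: 1 - confidence(trueClass)
    deviations = [1.0 - c for c in true_class_confidences]
    absolute_error = sum(deviations) / n
    rmse = math.sqrt(sum(d * d for d in deviations) / n)

    return {
        "accuracy": accuracy,
        "classification_error": 1.0 - accuracy,
        "absolute_error": absolute_error,
        "relative_error_percent": absolute_error * 100.0,
        "root_mean_squared_error": rmse,
    }


if __name__ == "__main__":
    # Made-up toy data: 4 examples, 3 of them predicted correctly.
    metrics = classification_metrics(
        true_labels=["yes", "no", "yes", "no"],
        predicted_labels=["yes", "no", "no", "no"],
        true_class_confidences=[0.9, 0.8, 0.4, 0.7],
    )
    for name, value in metrics.items():
        print(f"{name}: {value:.4f}")
```

Note that accuracy and classification error depend only on the predicted labels, while the absolute, relative, and root mean squared errors depend on the confidences, so the two groups can move independently of each other.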

    The +/- and the makro/mikro values are only calculated if the performance is estimated by a Cross Validation. In that case, the accuracy is calculated separately for each fold (iteration) of the validation. The makro performance is the average of the performance values of all folds, and the +/- states the standard deviation of those per-fold values.
    For the mikro average, remember that with 10 folds each fold of the X-Validation uses 10% of the data set as test set and creates predictions on that set. After all 10 folds, predictions exist for the complete data set, and you can calculate the accuracy based on these pooled predictions. The result is the mikro average. Since it is calculated from a single data set, there is no standard deviation.
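
The difference between the two averages can be sketched as follows. Again, this is only an illustration under assumptions: `fold_summary` is a hypothetical helper, each fold is represented as a `(correct, total)` pair, and the population standard deviation (`statistics.pstdev`) is used for the +/- value. Whether RapidMiner uses the population or the sample formula is not stated in this thread.

```python
import statistics


def fold_summary(folds):
    """Summarise cross-validation accuracy (hypothetical helper).

    folds: list of (correct, total) pairs, one per test fold.
    Returns (makro_average, std_dev, mikro_average).
    """
    if not folds:
        raise ValueError("at least one fold is required")

    # makro: compute accuracy per fold, then average the per-fold values
    per_fold = [correct / total for correct, total in folds]
    makro = statistics.mean(per_fold)

    # +/-: spread of the per-fold accuracies
    # (assumption: population standard deviation)
    std_dev = statistics.pstdev(per_fold)

    # mikro: pool all predictions first, then compute one accuracy
    total_correct = sum(correct for correct, _ in folds)
    total_examples = sum(total for _, total in folds)
    mikro = total_correct / total_examples

    return makro, std_dev, mikro


if __name__ == "__main__":
    # Made-up fold results with unequal fold sizes.
    makro, std_dev, mikro = fold_summary([(8, 10), (9, 10), (5, 5)])
    print(f"makro accuracy: {makro:.4f} +/- {std_dev:.4f}")
    print(f"mikro accuracy: {mikro:.4f}")
```

For accuracy, the two averages coincide when all folds have exactly the same size. They drift apart when fold sizes differ, as in the toy data above, or for metrics such as recall that are not simple per-example averages.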

    Hope this helps!

    Best regards,
    Marius