Is accuracy enough for determining model performance....?
hello
I have 3 predictive models... those are
1. Backpropagation-Based
2. C4.5-Based
3. CN2-Based.
I use accuracy to measure predictive model performance... and these were their results:
1. Backpropagation ==> 83.14% on Training, 85.12% on Testing
2. C4.5 ==> 83.72% on Training, 84.04% on Testing.
3. CN2 ==> 82.98% on Training, 84.65% on Testing.
When I look at the accuracy percentage of each algorithm, there is no significant difference between one and the others. My question is: is accuracy enough to determine or judge the performance of a certain algorithm in a certain case? If so, I just wonder which is the best model among those three, because... you know, there is no real significant difference between them ::)... ??? (it could be only 1 or 2 correctly classified vectors...)
thank you for your advice,
regards
Dimas Yogatama...
Answers
Try some ROC or lift charts... but tell us more about your problem...
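A lift chart is simple to compute by hand if your tool doesn't draw one: rank the cases by predicted confidence and check how quickly the positives accumulate. A rough Python sketch (the labels and scores are toy placeholders, not from your data):

```python
import numpy as np

def lift_curve(y_true, y_score, n_bins=10):
    """Cumulative lift: share of positives captured in the top i/n_bins of the
    data when cases are ranked by predicted score (highest first)."""
    order = np.argsort(-np.asarray(y_score, dtype=float))   # rank by score, descending
    y_sorted = np.asarray(y_true)[order]
    total_pos = y_sorted.sum()
    points = []
    for i in range(1, n_bins + 1):
        cutoff = int(round(len(y_sorted) * i / n_bins))      # size of the top slice
        captured = y_sorted[:cutoff].sum() / total_pos       # share of positives in that slice
        baseline = i / n_bins                                # what a random ordering would give
        points.append((baseline, captured / baseline))       # (population fraction, lift)
    return points

# Toy example: a good ranking puts both positives in the first slices,
# so the early lift is well above 1.
print(lift_curve([1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
                 [0.9, 0.8, 0.7, 0.4, 0.3, 0.3, 0.2, 0.2, 0.1, 0.1]))
```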
I partitioned my dataset into 75% for training and 25% for testing using a stratified split. My problem is clear: I want to find the best model among the three models I mentioned earlier.
As for ROC and lift charts, they're kind of new to me, unfortunately :-[, but I will try ROC and lift charts to see the difference. Could you suggest further reading on ROC and lift charts?
One more thing: is there an implementation of CN2 in RM5? I got the implementation from Orange...
Thank you very much.
Could you say what the model is about? That should help to define a good evaluation measure.
It's about predicting benign versus malignant tumors based on age and 4 lab tests.
I have tried the ROC comparison and the result looks like this.
Could you help me read this chart?
Thank you very much
Hmmm... tumor prediction. That is serious stuff, and cost-sensitive prediction is what you are after. You don't want to falsely predict benign if it's malignant, and it's preferable to make many mistakes saying malignant when it's benign rather than saying benign when it's malignant... do I make myself clear?
You should use the cost-sensitive options that come with RapidMiner. I've seen them but I haven't used them, though... I can't tell you what parameters to use... maybe somebody else can?
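I can't speak for the exact RapidMiner operators, but the idea behind cost-sensitive prediction is easy to sketch outside the tool: give a much higher cost to missing a malignant case than to a false alarm, and predict the class with the lowest expected cost. A rough Python illustration (the cost numbers are made up, purely for illustration):

```python
import numpy as np

# Hypothetical cost matrix: rows = true class, columns = predicted class.
# Missing a malignant tumor (true=malignant, predicted=benign) is assumed to be
# far more expensive than a false alarm. These numbers are only examples.
#                     pred benign  pred malignant
cost = np.array([[      0.0,          1.0],    # true benign
                 [     50.0,          0.0]])   # true malignant

def cost_sensitive_predict(p_malignant):
    """Pick the prediction with the lowest expected cost, given the model's
    estimated probability that the case is malignant."""
    p = np.array([1.0 - p_malignant, p_malignant])   # P(benign), P(malignant)
    expected = p @ cost                               # expected cost of each possible prediction
    return ["benign", "malignant"][int(np.argmin(expected))]

# Even a 5% estimated risk of malignancy is enough to flag the case here.
print(cost_sensitive_predict(0.05))   # -> "malignant"
```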
http://en.wikipedia.org/wiki/Confusion_matrix
The problem with an ROC curve is that it's not a number, so you can't directly compare two curves.
The problem with the area under the curve is that it often doesn't reflect anything meaningful.
This is because you are often only interested in a small part of the curve.
There is a corrected area under the curve, but it is not in RapidMiner.
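For reference, scikit-learn's roc_auc_score has a max_fpr argument that computes this kind of corrected partial area (the McClish standardization), in case you want to check it outside RapidMiner. A minimal sketch with toy data:

```python
from sklearn.metrics import roc_auc_score

# Toy example: true labels and the model's confidence for the positive class.
y_true  = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_score = [0.1, 0.2, 0.3, 0.35, 0.4, 0.8, 0.5, 0.6, 0.7, 0.9]

full_auc    = roc_auc_score(y_true, y_score)
partial_auc = roc_auc_score(y_true, y_score, max_fpr=0.1)  # only the low-false-positive region

print(full_auc, partial_auc)
```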
I agree partly with what you say about the area under the curve, because if you are only interested in the best part of the curve (as in this case) then I guess you could just determine a segment and analyze it...
Do you have any link to somewhere explaining the corrected area under the curve? I haven't heard or read of it, thanks!
The confusion matrix is great, get some confusion matrices up, yoga.
Here's the confusion matrix on the training set,
and here's the confusion matrix on the test set.
After reading your comments, I have studied my DM book collection, and I would add precision and recall for measuring model performance. My DM books state that recall and precision can show how well the model identifies the patterns of each class.
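For example, the same numbers can be computed outside RapidMiner like this (toy labels only; I'm assuming 1 = malignant as the positive class):

```python
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Placeholder labels: 1 = malignant, 0 = benign (swap if your coding differs).
y_true = [1, 1, 1, 0, 0, 0, 0, 1, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1, 0, 0]

print(confusion_matrix(y_true, y_pred))   # rows = true class, columns = predicted class
print(precision_score(y_true, y_pred))    # of the cases flagged malignant, how many really are
print(recall_score(y_true, y_pred))       # of the truly malignant cases, how many were found
```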
what do you say?
thanks
I have no experience with it, but I think cost-sensitive learning should help you. Theoretically, I think you can set the costs so that the model prefers a Zero-Rule (all malignant) model if necessary.
I say this because I think that in your case the cost of a false 0 is terrible. Am I right?
Yes, you're right... but I don't have any predefined cost matrix...
My client only wants the model to have the minimum possible misclassification, but I'm just like you, I'm not really used to cost-sensitive methods. So, do you think adding precision and recall to the performance measurement will satisfy my client?
But once again, any false prediction is terrible...
thanks for your suggestion.
What are the positive examples?
What are the negative examples?
What does it mean to falsely classify a positive example?
What does it mean to falsely classify a negative example?
A way to combine performance on both classes into a single, chance-corrected measure is Cohen's kappa:
http://en.wikipedia.org/wiki/Cohen%27s_kappa
The kappa score is in RapidMiner.
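A tiny sketch of why kappa can tell you more than plain accuracy when the classes are skewed (toy labels, scikit-learn used just for illustration):

```python
from sklearn.metrics import cohen_kappa_score, accuracy_score

# A model that always predicts the majority class gets high accuracy but kappa = 0.
y_true      = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
always_zero = [0] * 10

print(accuracy_score(y_true, always_zero))     # 0.8, looks decent
print(cohen_kappa_score(y_true, always_zero))  # 0.0, no better than chance
```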
When you talk about recall, you usually mean positive class recall, also known as sensitivity.
If you want to make your model more sensitive you need to increase the number of positive examples in your training set.
Most applications need a minimum level of recall to be useful.
They often also need a minimum level of precision to be useful. I think what you actually mean here is specificity (also known as negative class recall), but maybe you have your 0 and 1 classes the other way around; it's common to have the true positives in the top-left corner, while you have them in the bottom-right corner :-/
If you use the area under the ROC curve as a measure, you should only measure the area under the useful part of the curve.
Or you should calculate the sensitivity at a fixed level of specificity.
That way you have only a single number and something intuitive to talk about.
Like model A: sensitivity 60%, specificity 99%.
Like model B: sensitivity 70%, specificity 99%.
Both systems have 1% false warnings, so when you act on a positive classification, in some of the cases you waste your money.
(This should be acceptable.)
Model B is clearly better than model A; it finds 10% more of the cases you are interested in.
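If you want to read that number off a ROC curve programmatically, a minimal sketch (the array names are placeholders, not from your process):

```python
from sklearn.metrics import roc_curve

def sensitivity_at_specificity(y_true, y_score, min_specificity=0.99):
    """Best sensitivity (TPR) achievable while keeping specificity >= the target."""
    fpr, tpr, _ = roc_curve(y_true, y_score)
    ok = fpr <= (1.0 - min_specificity)        # specificity = 1 - FPR
    return tpr[ok].max() if ok.any() else 0.0

# Compare two models on the same test set (scores are placeholders):
# sens_a = sensitivity_at_specificity(y_test, scores_model_a)
# sens_b = sensitivity_at_specificity(y_test, scores_model_b)
```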
Thank you for your help!
I'm not sure this works equally well with every classifier.
You either sample the number of negative examples so that you end up somewhere around 99% specificity,
or you apply a threshold on the confidence of the prediction; I guess this is what you want, but I'm not sure how to do this in RM5.
If I find out how to do it I'll post it, but maybe someone else has already done this.
Edit: I think you may lose some information if you fix the specificity at 99%, but it depends on the exact implementation.
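Outside of RM5, one rough way to pick such a threshold is to choose, on validation data, the confidence cut-off that at most 1% of the benign cases exceed, and then apply it to new predictions. A sketch (all names are placeholders):

```python
import numpy as np

def threshold_for_specificity(scores_val, y_val, min_specificity=0.99):
    """Confidence cut-off such that roughly (1 - min_specificity) of the benign
    (negative, label 0) validation cases score above it."""
    neg_scores = np.asarray(scores_val)[np.asarray(y_val) == 0]
    return np.quantile(neg_scores, min_specificity)

# Usage sketch (placeholder names):
# t = threshold_for_specificity(scores_val, y_val)   # choose the cut-off on validation data
# y_new = (scores_new >= t).astype(int)              # 1 = malignant when the confidence is above it
```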