Validation of k-means Clustering
tiramisusann
Hi everybody,
I need to validate my k-means clustering by internal measures, but I am not quite sure how to do that.
First, I want to compute the Davies-Bouldin index for different values of k and compare the results to choose the best k. But why is the DB index negative? Does it mean that a DB of "-5" is better than a DB of "-1", since I have to choose the smallest DB index for an optimal clustering?
What other possibilities do I have in RM to check validity? I read about sum of squares, which I can obtain through "Item Distribution Performance". But I am not sure whether I am getting the total, between-cluster, or within-cluster sum of squares.
Does anybody know? I would really appreciate your help and your ideas.
Best,
tiramisusann
Answers
This link might help...
rapidminernotes.blogspot.com/search/label/Clustering
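The values are negative because some operators work by trying to maximise performance. A negated value that tends towards 0 as the clustering improves fits this requirement, although in reality the absolute value is the one to use. So the smaller the absolute value of the DB index, the better: a DB of "-1" is better than a DB of "-5".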
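To make the sign convention concrete, here is a minimal sketch in plain Python, outside RapidMiner. It computes the standard Davies-Bouldin index (lower is better) for two hand-made partitions of a toy data set and then negates it, as described above. The function names, the toy points, and the two partitions are made up for illustration; this is not RapidMiner's implementation.

```python
import math

def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]

def davies_bouldin(clusters):
    """Davies-Bouldin index for a list of clusters (each a list of points).

    Uses Euclidean distance and the mean point-to-centroid distance as the
    within-cluster scatter. Lower values mean tighter, better separated clusters.
    """
    centers = [centroid(c) for c in clusters]
    scatter = [
        sum(math.dist(p, ctr) for p in c) / len(c)
        for c, ctr in zip(clusters, centers)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        # Worst-case similarity of cluster i to any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in range(k)
            if j != i
        )
    return total / k

# Toy data (hypothetical): two well separated groups of four points each.
group_a = [(0, 0), (0, 1), (1, 0), (1, 1)]
group_b = [(10, 10), (10, 11), (11, 10), (11, 11)]

good = [group_a, group_b]                  # matches the real groups
poor = [                                   # mixes both groups in each cluster
    [(0, 0), (0, 1), (10, 10), (10, 11)],
    [(1, 0), (1, 1), (11, 10), (11, 11)],
]

db_good = davies_bouldin(good)   # roughly 0.1
db_poor = davies_bouldin(poor)   # much larger

# Negated values, as reported by an operator that maximises performance:
# the better clustering is the one whose negated value is closer to 0.
print("good:", db_good, "negated:", -db_good)
print("poor:", db_poor, "negated:", -db_poor)
```

When comparing several values of k, the same rule applies: pick the k whose DB index has the smallest absolute value, which is also the largest (closest to 0) negated value.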
regards,
Andrew