How to evaluate clustering
Hello
I want to compare and evaluate clusterings. Which operators should I use?
Also, how do I find the optimal parameters for each clustering method?
Thanks
Answers
Hi,
Finding optimal settings for clustering is indeed a bit tricky.
But RapidMiner offers performance measures for clustering or segmentation tasks.
In the Operator list under Validation -> Segmentation you'll find the corresponding Operators.
If you have a subset of your data for which you know exactly which cluster each example belongs to, you can also set the cluster Attribute as the prediction and optimize the classification performance instead.
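To illustrate the labeled-subset idea outside of RapidMiner, here is a minimal Python sketch. It assumes each cluster is mapped to its most frequent known label and then scores the result like a classifier. The function name `majority_mapping_accuracy` and the toy data are made up for illustration and are not RapidMiner operators.

```python
from collections import Counter, defaultdict


def majority_mapping_accuracy(cluster_ids, true_labels):
    """Map every cluster to its most frequent true label and return the accuracy."""
    counts_per_cluster = defaultdict(Counter)
    for cluster_id, label in zip(cluster_ids, true_labels):
        counts_per_cluster[cluster_id][label] += 1
    # Examples that carry the majority label of their cluster count as correct.
    correct = sum(c.most_common(1)[0][1] for c in counts_per_cluster.values())
    return correct / len(true_labels)


# Hypothetical labeled subset and two clustering results to compare.
true_labels = ["a", "a", "a", "b", "b", "b"]
clean_run = [0, 0, 0, 1, 1, 1]
mixed_run = [0, 0, 1, 1, 0, 1]

clean_accuracy = majority_mapping_accuracy(clean_run, true_labels)
mixed_accuracy = majority_mapping_accuracy(mixed_run, true_labels)
print(clean_accuracy, mixed_accuracy)
```

Keep in mind that this kind of score rises automatically as the number of clusters grows, so it is best used to compare settings with the same number of clusters.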
Best,
David
Hello
What do these values mean?
avg within centroid distance: -1.0876
davies bouldin: -5.675
I also used Silhouette. What do these results show?
Please guide me.
Thanks
Hi again,
I guess the Silhouette performance comes from a 3rd party extension, so I can't say much about it. But Wikipedia has an entry about it:
https://en.wikipedia.org/wiki/Silhouette_(clustering)
In short, it measures how similar an Example is to its own cluster compared to the other clusters. The value is bounded between -1 and +1, and a higher value indicates a better fit.
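As a rough illustration of that definition, here is a small Python sketch that computes the silhouette value per example from scratch. It assumes Euclidean distance, at least two clusters, and the usual convention of 0 for a cluster with a single example. The function name and the toy data are made up, and this is not the code of the extension.

```python
import math


def silhouette_values(points, labels):
    """Return one silhouette value per point, using Euclidean distance."""
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)

    values = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            values.append(0.0)  # convention for single-example clusters
            continue
        # a: mean distance to the other members of the own cluster
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: mean distance to the members of the nearest other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denominator = max(a, b)
        values.append((b - a) / denominator if denominator > 0 else 0.0)
    return values


# Two well separated groups of three points each (toy data).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good_labels = [0, 0, 0, 1, 1, 1]  # matches the two groups
bad_labels = [0, 1, 0, 1, 0, 1]   # mixes the two groups

good_values = silhouette_values(points, good_labels)
bad_values = silhouette_values(points, bad_labels)
good_score = sum(good_values) / len(good_values)
bad_score = sum(bad_values) / len(bad_values)
print(good_score, bad_score)
```

The labeling that matches the two groups gets a mean silhouette close to +1, while the mixed labeling scores much lower.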
The Davies–Bouldin criterion is also explained quite well on Wikipedia:
https://en.wikipedia.org/wiki/Davies%E2%80%93Bouldin_index
The idea is to maximize the inter-cluster distance (the difference between the different clusters) and minimize the intra-cluster distances (the points within each cluster should be close together). Here a lower index is better.
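To make this more concrete, here is a minimal Python sketch of the index as described on the Wikipedia page. It assumes Euclidean distance, uses the mean distance of a cluster's points to its centroid as the within-cluster scatter, and expects at least two clusters with distinct centroids. The function names and the toy data are made up, and I am not claiming this is exactly how the RapidMiner operator computes its value.

```python
import math


def centroid(cluster_points):
    """Coordinate-wise mean of a list of points."""
    n = len(cluster_points)
    return tuple(sum(coords) / n for coords in zip(*cluster_points))


def davies_bouldin(points, labels):
    """Davies-Bouldin index with Euclidean distance; lower is better."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    centers = {label: centroid(pts) for label, pts in clusters.items()}
    # Scatter: mean distance of the points of a cluster to its centroid.
    scatter = {
        label: sum(math.dist(p, centers[label]) for p in pts) / len(pts)
        for label, pts in clusters.items()
    }

    total = 0.0
    for i in clusters:
        # Worst case ratio of combined scatter to centroid separation.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)


# Two well separated groups of three points each (toy data).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good_index = davies_bouldin(points, [0, 0, 0, 1, 1, 1])
bad_index = davies_bouldin(points, [0, 1, 0, 1, 0, 1])
print(good_index, bad_index)
```

The labeling that matches the two groups yields a small index, and the mixed labeling yields a much larger one.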
Best,
David
Hello
Many thanks
What does this criterion mean?
AVG within centroid distance: -1.043
What does the Silhouette of each cluster show in the first photo?