Problem with hierarchical clustering

elena20 · April 2018

Hello. I used the Process Documents from Data operator with TF-IDF,
then the Top Down Clustering and Agglomerative Clustering operators.
How do I optimize the number of clusters?
And how do I evaluate them?
Can I use the Cluster Distance Performance operator?
Please advise.
Thanks

MartinLiebig · April 2018

Hi @elena20,

please have a look at the operator "Flatten Clustering". This reduces the hierarchy to n leaves. Afterwards you can proceed with the usual cluster performance measures.
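For anyone who wants to see what "flattening" means outside RapidMiner, here is a minimal, dependency-free Python sketch. It is not how the Flatten Clustering operator is implemented; it only assumes average linkage and Euclidean distance, and the function name `flatten_agglomerative` and the toy points are made up for illustration. It merges the two closest clusters until n remain and returns one flat label per example, which is the form ordinary cluster performance measures expect.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def flatten_agglomerative(points, n_clusters):
    """Merge the two closest clusters (average linkage) until n_clusters remain.

    Hypothetical helper for illustration; returns one integer label per point.
    """
    clusters = [[i] for i in range(len(points))]  # start with one cluster per point

    def avg_link(a, b):
        # Average pairwise distance between members of cluster a and cluster b.
        return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # combinations() yields index pairs (a, b) with a < b.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: avg_link(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[a] += clusters[b]
        del clusters[b]  # safe because b > a

    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Toy 2-D data (assumed); in practice these would be TF-IDF vectors.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.5), (10.0, 0.0)]
labels = flatten_agglomerative(points, n_clusters=3)
print(labels)
```

This brute-force version recomputes all pairwise linkages on every merge, so it is only suitable for small examples.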

Best,

Martin

elena20 · April 2018

Thank you very much
But
How can I evaluate hierarchical paraphernalia? Do you send a sample without wounding?
Thank you

Telcontar120 · April 2018

I don't understand your last question at all, but you can use any standard clustering performance metric, such as the Davies-Bouldin (DB) index. However, since clustering is unsupervised, I would say your own use case should guide your evaluation at least as much as any formal metric. What are you clustering, and for what purpose? Based on that purpose, how many clusters are reasonable, and how many are too many?
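To make the DB index concrete, here is a rough Python sketch of the standard definition: for each cluster, take the worst ratio of (scatter of cluster i + scatter of cluster j) to the distance between their centroids, then average over clusters. Lower values indicate more compact, better separated clusters. The function name `davies_bouldin`, the toy points, and the two candidate labelings are assumptions for illustration, not output from RapidMiner.

```python
from math import dist  # Euclidean distance, Python 3.8+


def davies_bouldin(points, labels):
    """Davies-Bouldin index for a flat clustering (lower is better).

    Hypothetical helper for illustration. Assumes at least two clusters
    and that no two cluster centroids coincide.
    """
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)
    if len(groups) < 2:
        raise ValueError("need at least two clusters")

    # Centroid of each cluster: coordinate-wise mean of its members.
    centroids = {
        label: tuple(sum(coord) / len(members) for coord in zip(*members))
        for label, members in groups.items()
    }
    # Scatter: mean distance from each member to its cluster centroid.
    scatter = {
        label: sum(dist(p, centroids[label]) for p in members) / len(members)
        for label, members in groups.items()
    }

    total = 0.0
    for i in groups:
        total += max(
            (scatter[i] + scatter[j]) / dist(centroids[i], centroids[j])
            for j in groups
            if j != i
        )
    return total / len(groups)


# Toy 2-D data (assumed) with two candidate labelings to compare.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.5), (10.0, 0.0)]
sensible = [0, 0, 1, 1, 2]  # nearby points grouped together
mixed = [0, 1, 0, 1, 2]     # distant points forced into the same cluster

print("sensible:", davies_bouldin(points, sensible))
print("mixed:   ", davies_bouldin(points, mixed))
```

The same function can score flattened hierarchies for several values of n, or a k-means result, so the numbers are comparable across methods. It still only measures geometric compactness, so the use case should have the final say.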

elena20 · April 2018

Hello,
Thank you so much.
I want to run hierarchical clustering on Twitter data and then compare it with k-means clustering. Is that reasonable?
Which operator should I use to evaluate the hierarchical results?
The Cluster Distance Performance operator gives an error.
Thanks

Altair RapidMiner Community
