"some questions about hierarchical clustering"
1. When I look into the results of the clustering and choose the dendrogram view, all I see is a number of lines very close to the bottom of the chart pane, with no text or other type of information.
When does this happen and what might be wrong with my model or data? (see below for related information)
2. When I export the model to an XML file (with the Write operator) I see a number of "distance" tags. They enclose numeric values which seem to be the cluster distances, and all of them are either 0.0 or a negative number (preceded by "-"). Is this the cluster distance? If so, what would make the distance negative? I am using the Jaccard distance to generate the model.
3. The flatten operator seems useful to get a simple list of some clusters. Is it possible to obtain the distance for each of the clusters in the output of the flatten operator?
Thanks for the help, in advance.
Answers
After switching from the Jaccard similarity to the cosine similarity, I now get positive distances and the dendrogram makes more sense.
I wonder if there is something wrong with the implementation or use of the Jaccard similarity in this operator.
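To illustrate why the values from question 2 look suspicious, here is a minimal Python sketch outside of the tool. It compares the textbook Jaccard distance (1 minus the similarity, always between 0 and 1) with a negated similarity, which can only be 0.0 or negative. That the operator stores a negated similarity is only a guess that happens to match the exported values, not something confirmed here; the function names and the sample sets are made up for illustration.

```python
def jaccard_similarity(a, b):
    """Return |A ∩ B| / |A ∪ B|; two empty sets are treated as identical."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


def jaccard_distance(a, b):
    """Textbook Jaccard distance, always in [0, 1]."""
    return 1.0 - jaccard_similarity(a, b)


def negated_similarity(a, b):
    """Hypothetical convention: use the negated similarity as a 'distance'.
    Values fall in [-1, 0], so they are never positive."""
    return 0.0 - jaccard_similarity(a, b)


# Made-up example sets: x and y overlap partly, x and z not at all.
x = {"a", "b"}
y = {"b", "c"}
z = {"d"}

for left, right in [(x, y), (x, z), (x, x)]:
    print(jaccard_distance(left, right), negated_similarity(left, right))
```

If the exported "distance" values follow the second pattern, that would also explain the flat dendrogram: every merge height would sit at or below zero, near the bottom of the chart.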
One suggestion: it would be great if clicking on a dendrogram branch showed the nodes that are part of that branch, just as in the graph view.
Question #3 is still open, if anyone knows an answer to it.
Thanks.
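To make question 3 more concrete, this is roughly the output I am after, sketched in plain Python rather than with the flatten operator itself. It runs a small single-linkage agglomerative clustering with the Jaccard distance, records the height of every merge, and then cuts the tree at a threshold so that each flat cluster comes with the distance at which it was formed. All names (`agglomerate`, `flatten`, `max_height`) and the sample data are hypothetical; this is not how the operator is implemented.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Textbook Jaccard distance between two sets, in [0, 1]."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def agglomerate(items):
    """Single-linkage agglomerative clustering over a dict name -> set.
    Returns the merges as (height, members) in the order they happen."""
    clusters = [frozenset([name]) for name in items]
    merges = []
    while len(clusters) > 1:
        # Pick the pair of clusters whose closest members are nearest.
        height, left, right = min(
            (
                (min(jaccard_distance(items[p], items[q]) for p in c1 for q in c2), c1, c2)
                for c1, c2 in combinations(clusters, 2)
            ),
            key=lambda t: t[0],
        )
        merged = left | right
        clusters = [c for c in clusters if c not in (left, right)]
        clusters.append(merged)
        merges.append((height, merged))
    return merges


def flatten(items, merges, max_height):
    """Cut the tree at max_height and return {cluster members: merge height}."""
    flat = {frozenset([name]): 0.0 for name in items}
    for height, members in merges:
        if height > max_height:
            break  # single-linkage merge heights never decrease
        # Drop the sub-clusters that were absorbed by this merge.
        flat = {c: h for c, h in flat.items() if not c <= members}
        flat[members] = height
    return flat


# Made-up example data.
items = {
    "d1": {"apple", "banana"},
    "d2": {"apple", "banana", "cherry"},
    "d3": {"kiwi", "lemon"},
    "d4": {"kiwi", "lemon", "mango", "nectarine"},
}

merges = agglomerate(items)
flat = flatten(items, merges, max_height=0.6)
for members, height in flat.items():
    print(sorted(members), height)
```

In other words, a list of flat clusters with one extra column holding the merge distance of each cluster would be enough.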