Computations for Cluster Distance Performance operator

amitd Member, University Professor Posts: 49 Maven
edited November 2021 in Help
I am having trouble replicating the computations of the "avg. within cluster distance" metrics produced by the Performance (Cluster Distance Performance) operator.

The operator documentation states: "avg._within_centroid_distance: The average within cluster distance is calculated by averaging the distance between the centroid and all examples of a cluster." The name "avg_within_centroid_distance" is confusing to me, because the definition actually describes an "avg_within_cluster_distance", and those are two different concepts. It is also not clear how the overall "avg._within_centroid_distance" is computed, as opposed to the value reported for each individual cluster.

I have attached the sample calculations for the Iris dataset along with the RapidMiner process. I was able to replicate the Davies Bouldin index but not the "avg._within_centroid_distance". Any help would be much appreciated.

On a related note, it is also not clear to me what the Performance (Cluster Density Performance) operator is computing and how. I did read the operator documentation but it did not make sense to me.


Answers

  • amitd Member, University Professor Posts: 49 Maven
    I figured out that "avg._within_centroid_distance" is the average of the squared Euclidean distances between each observation and its cluster centroid, not the average of the plain Euclidean distances.

    If someone could clarify what the Performance (Cluster Density Performance) operator is computing, that would help. Thanks.
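
For anyone trying to reproduce the numbers by hand, the squared-distance finding above can be sketched in plain Python. This is only an illustration, not RapidMiner's code: the function names and the toy data are made up, and the overall value is assumed to be the average over all examples (so larger clusters weigh more), which this thread does not confirm. Any sign or scaling the operator applies when reporting the value is not modeled either.

```python
from math import sqrt


def centroid(points):
    """Component-wise mean of a list of equal-length points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def squared_euclidean(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def avg_within_centroid_distance(clusters, squared=True):
    """Return (per-cluster averages, overall average).

    clusters maps a cluster label to its list of points.
    squared=True averages squared Euclidean distances (the behaviour
    described in the answer above); squared=False averages plain
    Euclidean distances, for comparison.
    ASSUMPTION: the overall value is the mean over all examples.
    """
    per_cluster = {}
    total = 0.0
    count = 0
    for label, points in clusters.items():
        c = centroid(points)
        dists = [squared_euclidean(p, c) for p in points]
        if not squared:
            dists = [sqrt(d) for d in dists]
        per_cluster[label] = sum(dists) / len(dists)
        total += sum(dists)
        count += len(dists)
    return per_cluster, total / count


# Hypothetical toy data, not the Iris dataset.
clusters = {
    "cluster_0": [(0.0, 0.0), (4.0, 0.0)],
    "cluster_1": [(0.0, 0.0), (0.0, 6.0), (0.0, 3.0)],
}

per_cluster_sq, overall_sq = avg_within_centroid_distance(clusters, squared=True)
per_cluster_eu, overall_eu = avg_within_centroid_distance(clusters, squared=False)
print(per_cluster_sq, overall_sq)
print(per_cluster_eu, overall_eu)
```

Running both variants on the same clustering and comparing them with the operator's output is a quick way to see which distance is being averaged. If the overall value still does not match, the example-weighted assumption is the first thing to check against an unweighted mean of the per-cluster values.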