
Measuring clustering quality for previously clustered data

singing_bird_1 Member Posts: 16 Contributor I
edited November 2018 in Help

Hi all, I am new to RapidMiner. I have data that has already been clustered, and I want to load it, together with its cluster labels, into RapidMiner so that it can be evaluated with one of the clustering evaluation measures.

Note: I don't want to recluster my data; I want to evaluate it as it is, with its existing labels.

How can I do this?

Thanks in advance

Answers

  • FBT Member Posts: 106 Unicorn

    Edit: I misread your question. Would you be able to post your data, or parts of it? Measuring the performance should be straightforward, as long as labels and relevant attributes are available. 

  • singing_bird_1 Member Posts: 16 Contributor I

    Thanks for your reply.

    Attached is part of the data, with its cluster assignments.

    There are 3 clusters.

    The problem is that the performance measure (SSE) requires the data, which is not a problem, and also the centroids, which are unknown because the data is already labeled.

    Silhouette requires the data, the model or the centroids, and a similarity measure.

    How can I arrange the nodes in the process to get the quality of the given data? And which nodes should I use?
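If the existing clusters can be treated as centroid-based (an assumption, since the thread does not say how they were produced), the missing centroids can be recovered as the per-cluster means of the labeled points, and SSE follows directly. A minimal sketch in plain Python, outside RapidMiner; the function names and the toy data are hypothetical and not taken from the attached file:

```python
from collections import defaultdict


def centroids_from_labels(points, labels):
    """Return {label: mean vector} computed from the already-labeled points."""
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        groups[label].append(point)
    # zip(*rows) iterates over the columns (attributes) of one cluster
    return {
        label: [sum(column) / len(rows) for column in zip(*rows)]
        for label, rows in groups.items()
    }


def sse(points, labels):
    """Sum of squared Euclidean distances from each point to its cluster mean."""
    centroids = centroids_from_labels(points, labels)
    return sum(
        sum((x - c) ** 2 for x, c in zip(point, centroids[label]))
        for point, label in zip(points, labels)
    )


# Hypothetical toy data: two tight clusters with given labels
points = [(0, 0), (0, 2), (10, 10), (10, 12)]
labels = ["a", "a", "b", "b"]
print(centroids_from_labels(points, labels))
print(sse(points, labels))
```

Lower SSE means tighter clusters, but the value is only comparable between labelings of the same data with the same number of clusters.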

     

  • FBT Member Posts: 106 Unicorn

    Ok, I don't think you can make any meaningful performance evaluations like this, because the data is missing information (e.g. the cluster model). What would you like to achieve? I.e. what is the question about the clusters that you would like to have answered?

  • singing_bird_1 Member Posts: 16 Contributor I

    My question is how to measure the clustering quality despite the missing information, such as the cluster model and the distance measure needed for silhouette.

    If the answer is that it is impossible to measure the clustering quality here in RapidMiner because of the missing information, then please give me a way to measure it with another program, or give me code for SSE and silhouette.
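For the silhouette part of the request, the coefficient can be computed from the points and their labels alone once a distance measure is chosen; no cluster model is needed. The sketch below assumes Euclidean distance, which is a choice made here for illustration and not something stated in the thread. The function name and toy data are hypothetical, and it requires Python 3.8 or later for `math.dist`:

```python
import math
from collections import defaultdict


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance assumed)."""
    members = defaultdict(list)
    for index, label in enumerate(labels):
        members[label].append(index)
    if len(members) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in members[label] if j != i]
        if not own:
            scores.append(0.0)  # common convention: a singleton cluster scores 0
            continue
        # a: mean distance to the other points in the same cluster
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the points of any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in idx) / len(idx)
            for other, idx in members.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: the first labeling matches the structure, the second does not
points = [(0,), (1,), (10,), (11,)]
print(silhouette_score(points, [0, 0, 1, 1]))
print(silhouette_score(points, [0, 1, 0, 1]))
```

Values close to 1 suggest well-separated clusters, and values near 0 or below suggest overlapping or misassigned points. This version compares every pair of points, so it may be slow on large data sets.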
