"Clustering with KMeans"
Hi,
I have a question about the Weka W-SimpleKMeans algorithm. When I use the operator, is there a detailed description of the clusters found during the analysis anywhere in the result mode? In the text mode window there's just the number of clusters and the number of items that belong to each.
When I use SimpleKMeans in Weka, there is a detailed description of each cluster with its attribute values. Is there anything like that in the RapidMiner version?
Thanks in advance, Birger.
Answers
If you replace the Weka version with the RapidMiner KMeans operator, you will get details about your centroids.
Greetings,
Sebastian
The problem with the RapidMiner KMeans is that it can only handle numerical attributes, while the Weka version can also handle nominal ones. Is there any other solution to this problem?
Thanks in advance, Birger
You cannot apply KMeans to nominal values. It will not work correctly, because KMeans implicitly always uses the Euclidean distance between examples, and this distance is simply not defined for nominal values like apples and eggs.
You might switch to KMedoids and use one of the mixed measures for calculating the distance, or you could transform the nominal values into numerical ones in a reasonable manner beforehand. What is reasonable depends mainly on the data and its meaning, so automatic conversions like the one Weka performs cannot always be reasonable.
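To make the idea of a mixed measure concrete, here is a minimal Python sketch of a Gower-style distance and a medoid search. It is only an illustration under assumptions: the names `mixed_distance`, `medoid`, and `ranges` are hypothetical, the data is made up, and this is not how RapidMiner's mixed measures are implemented.

```python
def mixed_distance(a, b, ranges):
    """Gower-style distance between two examples with mixed attribute types.

    ranges[i] is the value range of a numerical attribute,
    or None if attribute i is nominal (hypothetical convention).
    """
    total = 0.0
    for x, y, r in zip(a, b, ranges):
        if r is None:
            # Nominal attribute: only "same" or "different" is meaningful.
            total += 0.0 if x == y else 1.0
        else:
            # Numerical attribute: absolute difference scaled to [0, 1].
            total += abs(x - y) / r if r > 0 else 0.0
    return total / len(ranges)


def medoid(examples, ranges):
    """Return the example with the smallest summed distance to all others."""
    return min(
        examples,
        key=lambda c: sum(mixed_distance(c, o, ranges) for o in examples),
    )


# Made-up data: one nominal attribute, one numerical attribute.
examples = [
    ("apple", 1.0),
    ("apple", 2.0),
    ("apple", 4.0),
    ("egg", 9.0),
    ("egg", 5.0),
]
ranges = [None, 8.0]  # nominal, then numerical with range 9.0 - 1.0

center = medoid(examples, ranges)
print(center)
```

Because a medoid is always one of the actual examples, it stays interpretable even when some attributes are nominal, which is not true of an averaged centroid.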
RapidMiner provides several operators for these transformations, such as Nominal2Binominal or Nominal2Numerical. Take a look at them and think about how to represent your nominal values by numeric values that reflect an ordering or a weighting of importance.
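As a rough illustration of why the choice of encoding matters, the Python sketch below compares plain integer codes with a binary (one-hot) encoding. The helper names `one_hot` and `euclidean` and the integer codes are assumptions made for this example; they are not the API or the exact behaviour of the RapidMiner operators.

```python
import math


def one_hot(values):
    """Encode each nominal value as a 0/1 vector, one position per category."""
    categories = sorted(set(values))
    encoded = [[1.0 if v == c else 0.0 for c in categories] for v in values]
    return encoded, categories


def euclidean(a, b):
    """Euclidean distance between two numeric vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


fruits = ["apple", "egg", "pear"]

# Arbitrary integer codes impose an order: "pear" ends up twice as far
# from "apple" as "egg" does, although the data says nothing of the sort.
codes = {"apple": 0.0, "egg": 1.0, "pear": 2.0}
d_int_apple_egg = abs(codes["apple"] - codes["egg"])
d_int_apple_pear = abs(codes["apple"] - codes["pear"])

# With a binary encoding, all distinct categories are equally far apart.
encoded, categories = one_hot(fruits)
d_bin_apple_egg = euclidean(encoded[0], encoded[1])
d_bin_apple_pear = euclidean(encoded[0], encoded[2])

print(d_int_apple_egg, d_int_apple_pear)
print(d_bin_apple_egg, d_bin_apple_pear)
```

Integer codes are only a sensible choice when the values really are ordered (for example small, medium, large); otherwise the binary form avoids inventing a ranking that KMeans would then treat as real.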
Greetings,
Sebastian
Thanks for the effort. I will take a closer look at the operators and at what I'm trying to analyse.
Birger
Furthermore, I would like to write the description of the cluster medoids to a file, but I haven't found any IO operator for this.
I'm sorry, but I think this isn't possible yet.
Greetings,
Sebastian