input ncluster for k-means

kdamodaran Member Posts: 2 Contributor I
edited November 2018 in Help
Hi all,
I am new to RapidMiner. I am interested in applying k-means clustering to a dataset consisting of a few thousand elements whose attributes are real-valued. So the standard sum of squared distances to the centroid will work as the metric for convergence.

The couple of trials I have run using k-means just partition the data into two clusters, which seems to be the default. How can I specify the number of clusters?

Thanks,
Dam

Answers

  • dan_agape Member Posts: 106 Maven
    Hi,

    Click on the k-Means operator box in the process and set k in the Parameters window to the desired value.

    BTW, the algorithm has converged when the centroids do not change between two consecutive iterations. The sum of squared distances (i.e. the squared error) instead provides a criterion for selecting the best solution among the possibly multiple solutions generated.

    Regards,
    Dan
  • kdamodaran Member Posts: 2 Contributor I
    That's what I was expecting too. But I don't get a Parameters window. Am I missing something that's totally obvious?! The only thing that seems close in the dialog box is "Show Operator Info", which also doesn't have a parameter window.
    On a related note, is it possible to retain the nominal ids of the elements being processed? Sure, we can always drop the clustering output into Excel and match it with the original ids, but...
    Thanks for your help!
    Dam
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    You may have deactivated the corresponding view. Go to the View menu, select Show View, and then Parameters if it is not already selected.
    For more information about RapidMiner's GUI and its concepts in general, I would suggest you take a look at the Manual, which is available in English and German.

    Greetings,
      Sebastian
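
For readers who want to see the points from this thread outside the GUI, here is a minimal sketch in plain Python. It is not RapidMiner's implementation; the names kmeans, best_of, sq_dist and the toy data are hypothetical and chosen only for illustration. It shows k as an explicit parameter, convergence defined as "centroids unchanged between two consecutive iterations", the sum of squared distances used to pick the best of several runs, and the original ids kept next to the cluster labels.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, max_iter=100, seed=0):
    """Plain Lloyd-style k-means sketch; returns (centroids, labels, sse)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # k distinct rows as starting centroids
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # empty cluster: keep old centroid
        # Convergence: centroids did not change in two consecutive iterations.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    # Squared error of this solution (not the stopping rule).
    sse = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, sse

def best_of(points, k, restarts=20):
    """Run k-means from several seeds and keep the run with the lowest SSE."""
    runs = (kmeans(points, k, seed=s) for s in range(restarts))
    return min(runs, key=lambda run: run[2])

# Toy data (made up): nominal ids mapped to real-valued attributes.
rows = {
    "a": (1.0, 1.0), "b": (1.2, 0.8),
    "c": (5.0, 5.1), "d": (5.2, 4.9),
    "e": (9.0, 1.0), "f": (9.1, 1.2),
}
ids = list(rows)
points = [rows[i] for i in ids]

centroids, labels, sse = best_of(points, k=3)

# Keep the ids attached to the result instead of matching them up afterwards.
cluster_of = dict(zip(ids, labels))
for row_id in ids:
    print(row_id, "-> cluster", cluster_of[row_id])
print("SSE of best run:", round(sse, 4))
```

Because the labels come back in the same order as the input rows, zipping them with the ids is enough to retain the nominal ids; no spreadsheet round trip is needed in this sketch. The cluster numbers themselves are arbitrary and may differ from run to run.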