"[v4.4] Support for Clustering based on SOM?"

ruserruser Member Posts: 40 Maven
edited May 2019 in Help
I have RapidMiner v4.4 installed and would like to cluster data using the SOM (Self-Organizing Map) algorithm.
If I search for SOM in the RapidMiner folder, I find many SOM-related files, but I cannot find a SOM operator in the operator list shown in the RapidMiner GUI.
Can somebody help? Thanks!

Answers

  • landland RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    RapidMiner does not support SOM clustering directly. Nevertheless, you can use the SOMDimensionalityReduction operator to perform something equivalent to a clustering.
    To do so, set number_of_dimensions to 1 and set net_size to the target number of clusters. Each node of the net will then attract the most similar examples and can therefore be seen as a centroid. The index of the node is returned as the value of the new dimension, so you can change this attribute's role to cluster using the ChangeAttributeRole operator.

    Greetings,
      Sebastian
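
The idea described above (a one-dimensional net whose node index serves as the cluster label) can be sketched outside RapidMiner in plain Python. This is a minimal illustration of a generic 1-D SOM, not RapidMiner's implementation: only `net_size` mirrors a parameter named in the thread, while `epochs`, `lr0`, `seed`, and the function names are assumptions made for the example.

```python
import math
import random


def best_node(nodes, x):
    """Index of the node whose weight vector is closest to example x."""
    return min(
        range(len(nodes)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(nodes[j], x)),
    )


def train_som_1d(rows, net_size, epochs=50, lr0=0.5, seed=0):
    """Train a one-dimensional SOM and return one weight vector per node."""
    rng = random.Random(seed)
    # Initialise every node from a randomly chosen training example.
    nodes = [list(rng.choice(rows)) for _ in range(net_size)]
    radius0 = max(net_size / 2.0, 1.0)
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                    # learning rate decays linearly
        radius = max(radius0 * (1.0 - frac), 0.5)  # neighbourhood shrinks over time
        order = rows[:]
        rng.shuffle(order)
        for x in order:
            bmu = best_node(nodes, x)
            for j, w in enumerate(nodes):
                # Gaussian neighbourhood on the 1-D grid of node indices.
                h = math.exp(-((j - bmu) ** 2) / (2.0 * radius ** 2))
                for k in range(len(w)):
                    w[k] += lr * h * (x[k] - w[k])
    return nodes


def assign_clusters(nodes, rows):
    """Use the index of the winning node as the cluster label of each row."""
    return [best_node(nodes, x) for x in rows]
```

Calling `assign_clusters(train_som_1d(rows, net_size=9), rows)` yields one integer per row in the range 0 to 8, which plays the same part as the new attribute whose role is changed to cluster in the RapidMiner process.
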
  • ruserruser Member Posts: 40 Maven
    > RapidMiner does not support som clustering directly.
    As I understand it, SOM is a very popular clustering algorithm. I do not understand why it has not yet been implemented or integrated directly in RapidMiner.

    > Nevertheless you could use the SOM DimensionalityReduction operator to perform something equivalent to a clustering.
    > In order to do so, switch the number_of_dimensions to 1 and use the target number of clusters for the net_size. Each node of the net
    > will then attract the most similar examples and hence can be seen as a centroid. The index of the node will be returned as value in
    > the new dimension, so that you can change this attributes
    Which attribute? What do we have to set for this in the configuration of the SOMDimensionalityReduction and ChangeAttributeRole operators?

    > role to be the cluster attribute using the ChangeAttributeRole operator.
    Ok
  • landland RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    since SOM clusterings are very similar to KMeans, we decided that there were more urgent features to add... But this might become a topic again once these features are released.

    Greetings,
      Sebastian
  • ruserruser Member Posts: 40 Maven
    And what do we have to set in the configuration of the SOMDimensionalityReduction and ChangeAttributeRole operators?

    Also, I currently have only the following operators:
    - Operator for taking input
    - SOMDimensionalityReduction
    - ChangeAttributeRole operator

    I assume we don't need to include KMeans here if the intention is to cluster using the SOM algorithm. Or do I have to include it?

  • landland RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    the KMeans operator would probably yield very similar results, but it is not needed to generate a clustering with SOMs.

    Greetings,
      Sebastian
  • ruserruser Member Posts: 40 Maven
    After SOMDimensionalityReduction completes (with number_of_dimensions=1, net_size=9), the data is clustered into at most net_size clusters.
    If the input data has 100000 rows and 100 columns and every row is assigned a cluster, it is very difficult to go through the clustered data manually and see on what basis the clusters were formed.
    Is there a way to get a summary of each cluster?

    e.g. something like
          age:20-30 and salary:10K-20K are in cluster-1, and
          age:31-33 and salary:20K-25K are in cluster-2, and
          .......

    Can RapidMiner give such a summary? Also, in general, is producing such a summary part of the clustering algorithm?
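
The thread does not say whether RapidMiner offers such a summary. As a rough illustration of what is being asked for, the per-cluster value ranges can be computed as a post-processing step once every row has a cluster label. The sketch below is plain Python, independent of RapidMiner; the function name `summarize_clusters` and its return layout are assumptions made for this example.

```python
from collections import defaultdict


def summarize_clusters(rows, labels, attribute_names):
    """Return {cluster: {"size": n, "ranges": {attribute: (min, max)}}}."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)

    summary = {}
    for label, members in sorted(grouped.items()):
        ranges = {}
        for k, name in enumerate(attribute_names):
            values = [member[k] for member in members]
            ranges[name] = (min(values), max(values))
        summary[label] = {"size": len(members), "ranges": ranges}
    return summary


if __name__ == "__main__":
    # Tiny made-up data set in the spirit of the age/salary example above.
    rows = [[25, 15000], [30, 20000], [20, 10000], [31, 20000], [33, 25000]]
    labels = [1, 1, 1, 2, 2]
    for cluster, info in summarize_clusters(rows, labels, ["age", "salary"]).items():
        print(cluster, info["size"], info["ranges"])
```

Min/max ranges are only one possible description; they can overlap between clusters, so per-cluster means or the node weight vectors themselves may be more informative for some data.
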