"Unsupervised Clusteranalysis with rapidminer"

shaihulud Member Posts: 20 Contributor II
edited May 2019 in Help
Hi

I have an instance pool with a couple of attributes, and I want to classify the instances. Which are the "common" unsupervised cluster analysis methods that I can use with RapidMiner? I can't provide the actual outcome (label) for the samples.

On a side note: can somebody recommend readily comprehensible literature on the topic of unsupervised cluster analysis?

greez
Answers

  • el_chief Member Posts: 63 Contributor II
    Hello,

    FYI, classification and clustering are different things. Classification is predicting a known class label from examples that already carry labels (supervised), while clustering is grouping similar observations into a number of groups without any labels (unsupervised).

    What is your data like and what are you trying to accomplish?

    RapidMiner has 9 clustering methods, but the common ones are k-means (good for huge data sets), agglomerative, expectation maximization, and DBSCAN.

    The book Multivariate Data Analysis by Joseph Hair is the most understandable.

    Good luck

    Neil
  • shaihulud Member Posts: 20 Contributor II
    Well, it's like this:

    I want to read data from CSV files. Each line represents an instance with a name and a couple of attributes. The attribute values are mostly strings, and they can be arbitrary. I need to find a way to identify some representatives for each "group" of instances I have in the data. I cannot examine all the instances because I have tens of thousands of them in my files, so I need to narrow it down as best I can.
    Many of them have similar or equal attribute values, so I could put them into a group (cluster) and just choose one of the instances from each group as a representative. (Again: I do not know the values of the instances or the attributes in advance.) My first thought was to use cluster analysis, but maybe I am wrong?!

    I would appreciate any help.

  • IngoRM Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi,

    No, clustering sounds exactly like something that could help you here. Please post in the Problems & Support board of this forum if you have any questions about how this can actually be achieved.

    Cheers,
    Ingo
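
The grouping idea discussed in this thread (put instances with similar or equal string attributes together, then keep one representative per group) can be sketched outside RapidMiner in a few lines of Python. This is only a minimal illustration under stated assumptions: it uses a simple greedy "leader" clustering with a Hamming-style mismatch count, which is not one of the RapidMiner operators named above. The sample data, the column layout (name first, attributes after), and the names `leader_clusters`, `read_rows`, and `max_mismatches` are all hypothetical.

```python
import csv
import io


def hamming(a, b):
    """Count the attribute positions where two rows differ."""
    return sum(x != y for x, y in zip(a, b))


def leader_clusters(rows, max_mismatches=1):
    """Greedy one-pass grouping of (name, attributes) rows.

    Each row joins the first cluster whose representative differs in at
    most `max_mismatches` attributes; otherwise it starts a new cluster
    and becomes that cluster's representative.
    """
    clusters = []  # list of (representative_row, member_names)
    for name, attrs in rows:
        for rep, members in clusters:
            if hamming(rep[1], attrs) <= max_mismatches:
                members.append(name)
                break
        else:
            # No existing representative was close enough.
            clusters.append(((name, attrs), [name]))
    return clusters


def read_rows(csv_file):
    """Read rows where the first column is the name and the rest are attributes."""
    reader = csv.reader(csv_file)
    next(reader)  # skip the header line
    return [(row[0], tuple(row[1:])) for row in reader]


# Hypothetical sample data standing in for the real CSV files.
SAMPLE = """name,color,shape,size
a,red,round,small
b,red,round,large
c,blue,square,small
d,blue,square,small
e,green,flat,huge
"""

rows = read_rows(io.StringIO(SAMPLE))
clusters = leader_clusters(rows, max_mismatches=1)
for (rep_name, _), members in clusters:
    print(rep_name, members)
```

With real files, `io.StringIO(SAMPLE)` would be replaced by an opened CSV file. The result depends on row order and on the chosen threshold, so it is a quick way to narrow down candidates rather than a substitute for a proper clustering method such as k-means or DBSCAN on suitably encoded attributes.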