"Unsupervised cluster analysis with RapidMiner"
Hi
I have a pool of instances with a couple of attributes, and I want to classify the instances. Which are the "common" unsupervised cluster analysis methods that I can use with RapidMiner? I can't provide the actual outcome (label) for the samples.
On a side note: can somebody recommend readily comprehensible literature on the topic of unsupervised cluster analysis?
greez
Answers
FYI, classification and clustering are different things. Classification is predicting a categorical label (a class) from examples whose labels are already known, while clustering is grouping similar observations into a number of groups without any labels.
What is your data like and what are you trying to accomplish?
RapidMiner has 9 clustering methods, but the common ones are k-means (good for huge data sets), agglomerative, expectation maximization, and DBSCAN.
The book Multivariate Data Analysis by Joseph Hair is the most understandable one I know of.
Good luck
Neil
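To make the k-means idea mentioned above a bit more concrete, here is a minimal sketch in plain Python. It is not RapidMiner's implementation, only an illustration of the assign-then-update loop on numeric data. The function name kmeans, the sample points, and the choice of the first k points as starting centroids are all assumptions made for this example.

```python
import math

def kmeans(points, k, max_iterations=100):
    """Plain k-means on numeric tuples; returns (centroids, labels)."""
    # Assumption: the first k points are the starting centroids. This keeps
    # the sketch deterministic; real tools usually start randomly.
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # no point changed cluster, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # leave an empty cluster's centroid where it was
                centroids[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return centroids, labels

# Hypothetical sample data: two obvious groups, no labels given.
points = [(1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (5.2, 4.8), (0.9, 1.1)]
centroids, labels = kmeans(points, k=2)
print(labels)  # [0, 1, 0, 1, 0]
```

Note that no "actual outcome" is supplied anywhere: the group numbers come purely from the distances between the points, which is what makes this unsupervised.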
I want to read data from CSV files. Each line represents an instance with a name and a couple of attributes. The attribute values are mostly strings, and they can be arbitrary. I need to find a way to identify some representatives for each "group" of instances I have in the data. I cannot examine all the instances because I have tens of thousands of them in my files, so I need to narrow it down as best I can.
Many of them have similar or equal attribute values, so I could put them into a group (cluster) and just choose one of the instances from each group as a representative. (Again: I do not know the values of the instances or the attributes in advance.) My first thought was to use cluster analysis, but maybe I am wrong?!
I would appreciate any help.
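The grouping described above (instances with equal or similar string attributes, one representative per group) can be sketched outside RapidMiner in a few lines of Python. This is only an illustration under assumptions: the column names, the sample rows, the function names, and the 0.34 threshold are all made up, and the method is a simple one-pass "leader"-style grouping with a simple-matching distance, not one of the RapidMiner operators named in this thread.

```python
import csv
import io

# Hypothetical CSV content: a name followed by string attributes.
SAMPLE_CSV = """name,color,shape,size
a,red,round,small
b,red,round,small
c,red,round,large
d,blue,square,large
e,blue,square,large
"""

def mismatch_fraction(a, b):
    """Share of attribute positions where two rows differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def pick_representatives(rows, max_mismatch=0.34):
    """One pass over the rows: a row joins the first group whose
    representative is similar enough, otherwise it founds a new group."""
    groups = []  # each entry: (representative_row, [member names])
    for name, *attrs in rows:
        for rep, members in groups:
            if mismatch_fraction(attrs, rep[1:]) <= max_mismatch:
                members.append(name)
                break
        else:
            # No existing group was close enough: this row becomes
            # the representative of a new group.
            groups.append(([name, *attrs], [name]))
    return groups

reader = csv.reader(io.StringIO(SAMPLE_CSV))
next(reader)  # skip the header line
groups = pick_representatives(list(reader))
for rep, members in groups:
    print(rep[0], members)
```

With the sample rows, "a" ends up representing a, b and c, and "d" represents d and e. The result depends on the row order and on the threshold, so for real data it is worth trying a few thresholds and checking the groups by eye.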
No, clustering sounds exactly like something that could help you here. Please post in the Problems & Support board of this forum if you have any questions about how this can actually be achieved.
Cheers,
Ingo