
"Categorical Clustering"

pvelando Member Posts: 5 Contributor II
edited June 2019 in Help
Hi all,

I'm trying to cluster this data, which has both numerical and categorical attributes:

high 177 180 187 180 177 188 177 189 177 166 166 164 170 170 160 164 167 168
weight 86 79 85 83 87 80 78 80 82 72 66 65 79 67 61 61 63 68
Param1 A M V M A M V V A V N M N V A N A M
Param2 H H H H H H H H H M M M M M M M M M

There is no way to convert the categorical attributes to numerical ones.

I would like to know which algorithm would be the right one to cluster this data while taking the non-numerical attributes into consideration; they are certainly relevant in terms of clustering significance (k-means definitely does not work).

Well, thank you very much in advance,

Answers

  • pvelando Member Posts: 5 Contributor II
    After some testing, I've seen that agglomerative clustering might work, although the results are not very handy.
  • awchisholm RapidMiner Certified Expert, Member Posts: 458 Unicorn
    Hello

    K-means will work with this data if you use the distance measure 'mixed euclidean distance'. You will probably have to normalize the numerical attributes to the range 0 to 1 so that all the attributes have an equal influence.

    Regards

    Andrew
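
To make the suggestion above concrete outside RapidMiner, here is a minimal Python sketch of the two steps it describes: min-max normalizing the numerical attributes to the range 0 to 1, then measuring distance with a mixed Euclidean measure. The exact formula is an assumption about how such a measure is commonly defined (squared difference for numerical attributes, 0 for matching and 1 for differing categorical values, square root of the sum); it is not taken from the RapidMiner source. The helper names `min_max` and `mixed_euclidean` are hypothetical, and the data is copied from the question.

```python
import math

# Data copied from the question (attribute names as posted).
high = [177, 180, 187, 180, 177, 188, 177, 189, 177,
        166, 166, 164, 170, 170, 160, 164, 167, 168]
weight = [86, 79, 85, 83, 87, 80, 78, 80, 82,
          72, 66, 65, 79, 67, 61, 61, 63, 68]
param1 = list("AMVMAMVVAVNMNVANAM")
param2 = list("HHHHHHHHHMMMMMMMMM")


def min_max(values):
    """Rescale a numerical attribute linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def mixed_euclidean(a, b):
    """Distance between two rows with numerical and categorical values.

    Assumed definition: numerical attributes contribute their squared
    difference; categorical (string) attributes contribute 0 when the
    values match and 1 when they differ.
    """
    total = 0.0
    for x, y in zip(a, b):
        if isinstance(x, str):
            total += 0.0 if x == y else 1.0
        else:
            total += (x - y) ** 2
    return math.sqrt(total)


# One row per example: normalized numericals plus raw categories.
rows = list(zip(min_max(high), min_max(weight), param1, param2))

# Pairwise distance matrix over all 18 examples.
distances = [[mixed_euclidean(r, s) for s in rows] for r in rows]
```

After normalization, a categorical mismatch and the largest possible numerical difference both contribute at most 1 to the sum, which is why the normalization step matters. The `distances` matrix could then be passed to any clustering method that accepts precomputed distances.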