"Clustering Performance"
Hello :)
I'm trying to perform image segmentation using RapidMiner's clustering algorithms. Except for K-means, which completes execution in approximately 3-4 minutes, the other methods (EM, k-medoids, Kernel k-means) never seem to converge (although on a Q6600 with 2 GB of RAM, RapidMiner never uses more than 30% of my CPU).
My data are simple features derived from the pixels, such as texture, magnitude, gradient, etc., all normalized to 0-1 (for each 300x400 image, a 300x400x3 feature matrix is extracted).
Do I need a more powerful CPU / more memory, or some kind of different normalization/preprocessing specifically for these algorithms?
Thank you & sorry for the long msg (O>o)
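For reference, the kind of example set described above (one row per pixel, three features scaled to 0-1) can be sketched like this in Python/NumPy. The intensity/gradient features here are just stand-ins for the texture, magnitude, and gradient features mentioned in the post:

```python
import numpy as np

def extract_features(image):
    """Build a (H*W, 3) feature matrix from a grayscale image:
    intensity plus horizontal/vertical gradient magnitudes.
    These stand in for the texture/magnitude/gradient features
    described in the post."""
    img = image.astype(float)
    gy, gx = np.gradient(img)
    feats = np.stack([img, np.abs(gx), np.abs(gy)], axis=-1)
    feats = feats.reshape(-1, 3)            # one row per pixel
    # min-max normalize each feature column to [0, 1]
    mins, maxs = feats.min(axis=0), feats.max(axis=0)
    return (feats - mins) / np.maximum(maxs - mins, 1e-12)

# a 300x400 image yields a 120000x3 example set
img = np.random.rand(300, 400)
X = extract_features(img)
print(X.shape)
```

So each image already contributes 120,000 examples to the clustering step, which matters for the runtime discussion below.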
Answers
I don't think the problem is your computer. Compared to K-means, the other flat clustering methods take roughly a factor equal to your number of examples longer to converge, because K-means exploits some neat properties of the Euclidean distance measure to be faster.
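To illustrate the cost difference: a Lloyd-style K-means iteration only needs the n×k distances between points and centroids, while medoid-based methods work with pairwise distances between examples, which is on the order of n². A minimal sketch of the K-means side (not RapidMiner's actual implementation):

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    """Minimal Lloyd's K-means. Each iteration costs O(n*k*d):
    only distances from every point to the k centroids are needed,
    which is what makes K-means cheap compared to O(n^2)
    medoid-based methods."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        # squared Euclidean distances, shape (n, k)
        d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids

# two well-separated blobs cluster in a handful of iterations
X = np.vstack([np.random.rand(100, 3), np.random.rand(100, 3) + 5])
labels, centroids = kmeans(X, 2)
print(labels.shape)
```

With 120,000 pixel examples per image, the n² factor of the other methods is what makes them appear to "never converge".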
So you could buy a faster computer, and that would speed up the calculation a bit, but as you can see from the workload, most of your cores are doing nothing. Instead of buying a faster computer, it would be more efficient to give us the money and let us implement a multithreaded version of the algorithms so that they run in parallel. That would give you a speedup of roughly a factor of 3 on your machine.
Another possibility would be to reduce the dimensionality of the examples, for example using a PCA.
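In RapidMiner this would be the PCA operator; just to show the idea, a PCA projection can be sketched in NumPy via the SVD of the centered data (the sizes below are illustrative, not taken from the post):

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project X onto its top principal components via SVD."""
    Xc = X - X.mean(axis=0)               # center each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T       # scores in reduced space

X = np.random.rand(1000, 10)              # 1000 examples, 10 features
Z = pca_reduce(X, 3)
print(Z.shape)  # (1000, 3)
```

Fewer dimensions per example makes every distance computation cheaper, though it does not change the n² scaling of the medoid-based methods.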
Greetings,
Sebastian