How do I create balanced clusters?
kikikubikova
Member Posts: 3 Learner I
Hi guys,
I'm pretty new to the community, so sorry if my question seems elementary: how do I create balanced clusters (k-means), meaning that each cluster contains the same number of items? Or is there a way to force the minimum cluster size to something other than 1?
(What I am trying to do is create pairs based on some variables. I have a list of villages with their population size, average age, unemployment, etc. For each village in my dataset I am looking for the village that is most similar across all of the variables, i.e. matching the most alike villages. My idea was to use N/2 clusters to create pairs, but as I don't know how to create balanced clusters or how to force the minimum cluster size to 2 items, the output was N/2 clusters that unfortunately did not contain 2 items each: some clusters had, e.g., 3 items and some had only 1.)
Thank you for all of your advice (the simpler the solution, the better)!
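To make the "balanced clusters" idea concrete, here is a minimal sketch in plain Python (not any specific tool's operator). It is a simple heuristic, not standard k-means: in each iteration, points are assigned to their nearest centroid that still has free capacity, so no cluster exceeds ceil(n / k) items. The function name `balanced_kmeans`, the choice of the first k points as starting centroids, and the toy data are all assumptions made for illustration.

```python
import math

def balanced_kmeans(points, k, n_iter=10):
    """Heuristic k-means variant: each cluster holds at most ceil(n / k) points."""
    n = len(points)
    capacity = math.ceil(n / k)
    # Assumption: deterministic start, the first k points act as initial centroids.
    centroids = [list(p) for p in points[:k]]
    labels = [-1] * n
    for _ in range(n_iter):
        # Every (distance, point index, cluster index) candidate, closest first.
        candidates = sorted(
            (math.dist(p, c), i, j)
            for i, p in enumerate(points)
            for j, c in enumerate(centroids)
        )
        labels = [-1] * n
        sizes = [0] * k
        for _, i, j in candidates:
            # Accept the assignment only if the point is free and the cluster has room.
            if labels[i] == -1 and sizes[j] < capacity:
                labels[i] = j
                sizes[j] += 1
        # Move each centroid to the mean of its current members.
        for j in range(k):
            members = [points[i] for i in range(n) if labels[i] == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Toy data: six 2-D points, three clusters, so the capacity is 2 per cluster.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (50, 50)]
toy_labels = balanced_kmeans(toy_points, k=3)
print(toy_labels)
```

When n is divisible by k, every cluster ends up with exactly n / k items; otherwise the sizes are only bounded from above. The price of the size constraint is that some points land in a cluster that is not their nearest one.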
Answers
Dortmund, Germany
Thank you for your comment. For my master's thesis I am analyzing the effect of rainfall on voter turnout. I have turnout and precipitation data for around 600 villages, as well as some basic information such as unemployment, area, population size, etc. My task is to match the villages based on these parameters, finding the most suitable pair for a diff-in-diff model (checking how the difference in turnout between the two villages changes with the difference in precipitation and other independent variables over the years).
Do you have any idea how to fix the number of items in a cluster? Or how to increase the minimum number of items in a cluster to 2?
Thanks,
Kristina
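Since the goal described above is pairs rather than clusters, one option is to skip clustering and pair the villages directly by distance. The sketch below is a hedged illustration in plain Python: it z-scores each variable so that population size does not dominate, then greedily pairs the two closest still-unpaired villages. The function names and the four example villages are hypothetical, and the greedy rule is a heuristic: it does not guarantee the smallest possible total distance (that would be a minimum-weight perfect matching problem), and with an odd number of villages one is left unpaired.

```python
import math
from statistics import mean, pstdev

def standardize(rows):
    """Z-score each column so that no single variable dominates the distance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]

def greedy_pairs(rows):
    """Pair rows greedily: repeatedly take the closest two still-unpaired rows."""
    z = standardize(rows)
    n = len(z)
    # All pairwise Euclidean distances on the standardized data, closest first.
    candidates = sorted(
        (math.dist(z[i], z[j]), i, j)
        for i in range(n)
        for j in range(i + 1, n)
    )
    used, pairs = set(), []
    for _, i, j in candidates:
        if i not in used and j not in used:
            pairs.append((i, j))
            used.update((i, j))
    return pairs

# Hypothetical villages: (population, average age, unemployment rate in %)
villages = [
    (1200, 41.0, 6.5),
    (5400, 38.5, 9.0),
    (1250, 40.5, 6.0),
    (5300, 39.0, 9.5),
]
print(greedy_pairs(villages))  # list of (row index, row index) pairs
```

For around 600 villages this builds roughly 180,000 candidate pairs, which is still cheap to sort. Each returned pair holds the row indices of two matched villages, which could then feed the diff-in-diff comparison.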
Anyway, thank you for your advice!
BR,
Kristina