How do I define the distance measure to be used for clustering methods?
So I'm currently working on different clustering methods to analyse music data.
I'm using RapidMiner as a library and want to use, for example, the k-Means method. I've already initialized everything, but I'm still struggling to define the distance measure to be used. I'd like to choose among all the measures RapidMiner offers for numerical values, but I can't find how to set one, or whether it is possible to get a list of the measures a clustering method supports.
I set the operator and its parameters in my class like this:
Operator clusterer = OperatorService.createOperator(FastKMeans.class);
clusterer.setParameter("k", Integer.toString(k));
...
But the distance measure doesn't seem to be set via a parameter; instead it is initialized from the given example set (e.g. in FastKMeans.class):
DistanceMeasure measure = this.getInitializedMeasure(eSet);
vs.
int k = this.getParameterAsInt("k");
So how do I set the measure?
Best Answer
jczogalla Employee-RapidMiner, Member Posts: 144 RM Engineering
Hi @DiePaupi!
You should be able to find those parameters in all the cluster operators. They should show up when calling getParameterTypes().
In code, you can see that they are added by this line:
types.addAll(getMeasureParameterTypes());
So it should be as easy to set these parameters as with the parameter k as you are already doing.
Cheers
Jan
Answers
You can find information on the parameters regarding measures on GitHub.
The measures are adjusted/initialized based on the given data, but they are provided by parameters, as you can see when looking at the operator's parameters in Studio.
To set the measure to a specific one, you first have to set the measure type (using the constant PARAMETER_MEASURE_TYPES, measure_types), which has the possible values "MixedMeasures", "NominalMeasures", "NumericalMeasures", and "BregmanDivergences".
Secondly, you set the specific measure to use with the corresponding parameter (one of PARAMETER_[NOMINAL|NUMERICAL|MIXED]_MEASURE or PARAMETER_DIVERGENCE) and set the value to one of the possibilities provided in the different type arrays in the above-mentioned class. You can of course just use the correct strings here, but we recommend using the constants where possible.
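The two steps above could be sketched roughly as follows. This is only a stand-in that runs without RapidMiner on the classpath: the MeasureParams class and its methods are hypothetical helpers, not part of the library. The key "measure_types" and the four measure type values come from this thread; the other key strings ("mixed_measure", "nominal_measure", "numerical_measure", "divergence") and the measure name "EuclideanDistance" are assumptions inferred from the constant names, so please check them against the constants and type arrays in the measures class before relying on them.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of RapidMiner): collects the two parameters
// in the order they should be set, measure type first, then the measure.
class MeasureParams {
    // Key string as given in the answer for PARAMETER_MEASURE_TYPES.
    static final String PARAMETER_MEASURE_TYPES = "measure_types";

    // ASSUMPTION: these key strings are guessed from the constant names
    // PARAMETER_[NOMINAL|NUMERICAL|MIXED]_MEASURE and PARAMETER_DIVERGENCE.
    static String measureKeyFor(String measureType) {
        switch (measureType) {
            case "MixedMeasures":      return "mixed_measure";
            case "NominalMeasures":    return "nominal_measure";
            case "NumericalMeasures":  return "numerical_measure";
            case "BregmanDivergences": return "divergence";
            default:
                throw new IllegalArgumentException("Unknown measure type: " + measureType);
        }
    }

    // Builds the ordered key/value pairs; rejects unknown measure types.
    static Map<String, String> build(String measureType, String measureName) {
        String measureKey = measureKeyFor(measureType);
        Map<String, String> params = new LinkedHashMap<>();
        params.put(PARAMETER_MEASURE_TYPES, measureType); // step 1: measure type
        params.put(measureKey, measureName);              // step 2: specific measure
        return params;
    }

    public static void main(String[] args) {
        // ASSUMPTION: "EuclideanDistance" is one of the numerical measure names.
        Map<String, String> params = build("NumericalMeasures", "EuclideanDistance");
        for (Map.Entry<String, String> e : params.entrySet()) {
            // With the operator from the question, each pair would be applied as
            // clusterer.setParameter(e.getKey(), e.getValue());
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

In real code the constants from the library are preferable to these literal strings, as recommended above; the map is only there to make the ordering of the two steps explicit.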
If you have more questions, feel free to ask!
Cheers
Jan
Paupi