Extract Cluster Prototypes component does not show my id attribute
How can I pass an attribute with the id role through Extract Cluster Prototypes?
I need to identify the centers of the clusters (centroids) after clustering with k-medoids. My dataset has an identifier attribute that I assigned the id role using the Set Role operator, but the Extract Cluster Prototypes operator does not show my id attribute. Can someone help me?
Answers
Sorry, I am a bit confused by this question. Extract Cluster Prototypes returns the centroid value of each attribute for each cluster, independent of any label. The operator will carry your label, but it is used only for visualization purposes. If you want the labels and the cluster ID, connect the clustered set output of k-medoids (Clustering.clustered set) to a result port. This will show the attributes, the cluster of each example, and the assigned label.
Varun
https://www.varunmandalapu.com/
Be Safe. Follow precautions and Maintain Social Distancing
If you have used k-medoids, you can use Join to pull the prototype values into your full dataset and then map those centroids back to specific examples with Generate Attributes.
This will show you which individual records match your cluster centroid. See attached example process:
<?xml version="1.0" encoding="UTF-8"?>
<process version="9.2.001">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="9.2.001" expanded="true" name="Process" origin="GENERATED_TUTORIAL">
    <parameter key="logverbosity" value="init"/>
    <parameter key="random_seed" value="2001"/>
    <parameter key="send_mail" value="never"/>
    <parameter key="notification_email" value=""/>
    <parameter key="process_duration_for_mail" value="30"/>
    <parameter key="encoding" value="SYSTEM"/>
    <process expanded="true">
      <operator activated="true" class="retrieve" compatibility="9.2.001" expanded="true" height="68" name="Ripley-Set" origin="GENERATED_TUTORIAL" width="90" x="112" y="34">
        <parameter key="repository_entry" value="//Samples/data/Ripley-Set"/>
      </operator>
      <operator activated="true" class="generate_id" compatibility="9.2.001" expanded="true" height="82" name="Generate ID" width="90" x="246" y="34">
        <parameter key="create_nominal_ids" value="false"/>
        <parameter key="offset" value="0"/>
      </operator>
      <operator activated="true" class="k_medoids" compatibility="9.2.001" expanded="true" height="82" name="Clustering" width="90" x="380" y="85">
        <parameter key="add_cluster_attribute" value="true"/>
        <parameter key="add_as_label" value="false"/>
        <parameter key="remove_unlabeled" value="false"/>
        <parameter key="k" value="2"/>
        <parameter key="max_runs" value="10"/>
        <parameter key="max_optimization_steps" value="100"/>
        <parameter key="use_local_random_seed" value="false"/>
        <parameter key="local_random_seed" value="1992"/>
        <parameter key="measure_types" value="MixedMeasures"/>
        <parameter key="mixed_measure" value="MixedEuclideanDistance"/>
        <parameter key="nominal_measure" value="NominalDistance"/>
        <parameter key="numerical_measure" value="EuclideanDistance"/>
        <parameter key="divergence" value="GeneralizedIDivergence"/>
        <parameter key="kernel_type" value="radial"/>
        <parameter key="kernel_gamma" value="1.0"/>
        <parameter key="kernel_sigma1" value="1.0"/>
        <parameter key="kernel_sigma2" value="0.0"/>
        <parameter key="kernel_sigma3" value="2.0"/>
        <parameter key="kernel_degree" value="3.0"/>
        <parameter key="kernel_shift" value="1.0"/>
        <parameter key="kernel_a" value="1.0"/>
        <parameter key="kernel_b" value="0.0"/>
      </operator>
      <operator activated="true" class="extract_prototypes" compatibility="9.2.001" expanded="true" height="82" name="Extract Cluster Prototypes" origin="GENERATED_TUTORIAL" width="90" x="581" y="34"/>
      <operator activated="true" class="concurrency:join" compatibility="9.2.001" expanded="true" height="82" name="Join" width="90" x="715" y="85">
        <parameter key="remove_double_attributes" value="false"/>
        <parameter key="join_type" value="right"/>
        <parameter key="use_id_attribute_as_key" value="false"/>
        <list key="key_attributes">
          <parameter key="cluster" value="cluster"/>
        </list>
        <parameter key="keep_both_join_attributes" value="false"/>
      </operator>
      <operator activated="true" class="generate_attributes" compatibility="9.2.001" expanded="true" height="82" name="Generate Attributes" width="90" x="849" y="85">
        <list key="function_descriptions">
          <parameter key="Centroid" value="if(att1==att1_from_ES2&amp;&amp;att2==att2_from_ES2,&quot;centroid&quot;,&quot;not&quot;)"/>
        </list>
        <parameter key="keep_all" value="true"/>
      </operator>
      <connect from_op="Ripley-Set" from_port="output" to_op="Generate ID" to_port="example set input"/>
      <connect from_op="Generate ID" from_port="example set output" to_op="Clustering" to_port="example set"/>
      <connect from_op="Clustering" from_port="cluster model" to_op="Extract Cluster Prototypes" to_port="model"/>
      <connect from_op="Clustering" from_port="clustered set" to_op="Join" to_port="right"/>
      <connect from_op="Extract Cluster Prototypes" from_port="example set" to_op="Join" to_port="left"/>
      <connect from_op="Join" from_port="join" to_op="Generate Attributes" to_port="example set input"/>
      <connect from_op="Generate Attributes" from_port="example set output" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="72"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
Lindon Ventures
Data Science Consulting from Certified RapidMiner Experts
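For readers who want to see the join-and-flag logic of the process above outside RapidMiner, here is a minimal Python sketch of the same idea. It is not RapidMiner code: the sample rows, the `cluster_0`/`cluster_1` values, and the helper name `flag_centroids` are all hypothetical and chosen only for illustration. It assumes, as with k-medoids, that each prototype is an actual example from the data, so an exact equality check on the attributes can find it.

```python
# Hypothetical stand-in for the clustered set (id role kept, cluster attribute added).
clustered = [
    {"id": 0, "att1": 0.10, "att2": 0.70, "cluster": "cluster_0"},
    {"id": 1, "att1": 0.15, "att2": 0.65, "cluster": "cluster_0"},
    {"id": 2, "att1": 0.05, "att2": 0.80, "cluster": "cluster_0"},
    {"id": 3, "att1": 0.90, "att2": 0.20, "cluster": "cluster_1"},
    {"id": 4, "att1": 0.85, "att2": 0.30, "cluster": "cluster_1"},
]

# Hypothetical stand-in for the prototype table: one row per cluster, no id.
prototypes = [
    {"cluster": "cluster_0", "att1": 0.10, "att2": 0.70},
    {"cluster": "cluster_1", "att1": 0.90, "att2": 0.20},
]


def flag_centroids(rows, protos, attributes):
    """Join each row to its cluster's prototype and flag exact matches."""
    # Index prototypes by cluster, which plays the role of the join key.
    by_cluster = {p["cluster"]: p for p in protos}
    flagged = []
    for row in rows:
        proto = by_cluster[row["cluster"]]
        # Same test as the Generate Attributes expression: every attribute
        # must equal the prototype's value for the row to be the centroid.
        is_centroid = all(row[a] == proto[a] for a in attributes)
        flagged.append({**row, "Centroid": "centroid" if is_centroid else "not"})
    return flagged


flagged = flag_centroids(clustered, prototypes, ["att1", "att2"])
centroid_ids = [r["id"] for r in flagged if r["Centroid"] == "centroid"]
print(centroid_ids)
```

The exact equality check only works because medoids are copied from real examples. With a centroid that is an average (for example from k-means), no example would normally match exactly, and a nearest-distance lookup would be needed instead.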