"Rank order of attributes to each cluster"

dorona Member Posts: 1 Learner III
edited May 2019 in Help
Hi,
I have just started to play around with clustering, and using RapidMiner I was able to get results. My problem now is how to categorize each cluster. Is there a way to get RapidMiner to output, for each cluster, a rank-ordered list of the attributes that best describe it?
In addition, it would be great to have the actual contribution of each attribute to the model, along with a statistic that measures its significance.

Thanks 
Answers

  • IngoRM Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi,

    yes, this is possible with RapidMiner. After clustering, each example in the input data set has a cluster id assigned. You can then use the new operator "AttributeConstruction" (which will replace the operator FeatureGeneration in future releases) together with the new ValueIterator operator: for each cluster, construct a label that separates that cluster from all the others, then compute attribute weights against this label. The whole setup looks like this:

    <operator name="Root" class="Process" expanded="yes">
        <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
            <parameter key="number_examples" value="200"/>
            <parameter key="number_of_attributes" value="10"/>
            <parameter key="target_function" value="gaussian mixture clusters"/>
        </operator>
        <operator name="IdTagging" class="IdTagging">
        </operator>
        <operator name="KMeans" class="KMeans">
            <parameter key="k" value="5"/>
        </operator>
        <operator name="IOConsumer" class="IOConsumer">
            <parameter key="io_object" value="ClusterModel"/>
        </operator>
        <!-- Loop once per distinct value of the "cluster" attribute; %{loop_value} holds the current value. -->
        <operator name="ValueIterator" class="ValueIterator" expanded="yes">
            <parameter key="attribute" value="cluster"/>
            <operator name="OperatorChain" class="OperatorChain" expanded="yes">
                <!-- Build a two-class attribute: the current cluster versus "other". -->
                <operator name="AttributeConstruction" class="AttributeConstruction">
                    <list key="function_descriptions">
                      <parameter key="inner_label_%{loop_value}" value="if (cluster == &quot;%{loop_value}&quot;, &quot;%{loop_value}&quot;, &quot;other&quot;)"/>
                    </list>
                </operator>
                <!-- Use the constructed attribute as the label. -->
                <operator name="ChangeAttributeRole" class="ChangeAttributeRole">
                    <parameter key="name" value="inner_label_%{loop_value}"/>
                    <parameter key="target_role" value="label"/>
                </operator>
                <!-- Weight the attributes with respect to this label. -->
                <operator name="Relief" class="Relief">
                </operator>
                <operator name="IOConsumer (2)" class="IOConsumer">
                    <parameter key="io_object" value="ExampleSet"/>
                </operator>
            </operator>
        </operator>
    </operator>
    Please note that you will need the latest CVS version of RapidMiner, or you will have to wait for the next release, to get access to both new operators. By the way, this is also possible with older versions, but the process is much more complicated.
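
For readers who want to see the idea outside RapidMiner, the per-cluster loop in the process above can be sketched in plain Python. This is only an illustration under stated assumptions, not the implementation of the RapidMiner Relief operator: `relief_weights` is a basic two-class Relief that visits every example once and uses a range-normalised Manhattan distance, and the function names, attribute names (`a1`, `a2`) and toy data are all hypothetical. Note that Relief weights are relative importance scores, not significance statistics.

```python
from typing import Dict, List, Sequence, Tuple


def relief_weights(rows: Sequence[Sequence[float]], labels: Sequence[str]) -> List[float]:
    """Basic Relief for a two-class label (assumption: every example is visited once)."""
    n, d = len(rows), len(rows[0])
    # Normalise attribute differences by the value range of each attribute.
    spans = []
    for j in range(d):
        column = [r[j] for r in rows]
        spans.append((max(column) - min(column)) or 1.0)

    def diff(a: Sequence[float], b: Sequence[float], j: int) -> float:
        return abs(a[j] - b[j]) / spans[j]

    def distance(a: Sequence[float], b: Sequence[float]) -> float:
        return sum(diff(a, b, j) for j in range(d))

    weights = [0.0] * d
    for i, row in enumerate(rows):
        hits = [r for k, r in enumerate(rows) if k != i and labels[k] == labels[i]]
        misses = [r for k, r in enumerate(rows) if labels[k] != labels[i]]
        if not hits or not misses:
            continue
        hit = min(hits, key=lambda r: distance(row, r))     # nearest example, same class
        miss = min(misses, key=lambda r: distance(row, r))  # nearest example, other class
        for j in range(d):
            # Reward attributes that differ on the miss, penalise those that differ on the hit.
            weights[j] += (diff(row, miss, j) - diff(row, hit, j)) / n
    return weights


def rank_attributes_per_cluster(
    rows: Sequence[Sequence[float]],
    clusters: Sequence[str],
    names: Sequence[str],
) -> Dict[str, List[Tuple[str, float]]]:
    """For each cluster value, rank attributes by Relief weight (cluster vs. "other")."""
    ranking = {}
    for value in sorted(set(clusters)):
        # Same idea as the expression: if (cluster == value, value, "other")
        inner_label = [c if c == value else "other" for c in clusters]
        weights = relief_weights(rows, inner_label)
        ranking[value] = sorted(zip(names, weights), key=lambda p: p[1], reverse=True)
    return ranking


# Hypothetical toy data: cluster_0 stands out on a1, cluster_2 stands out on a2.
names = ["a1", "a2"]
rows = [[1.0, 0.0], [0.9, 0.2], [0.0, 0.0], [0.1, 0.2], [0.0, 1.0], [0.1, 0.8]]
clusters = ["cluster_0", "cluster_0", "cluster_1", "cluster_1", "cluster_2", "cluster_2"]

ranking = rank_attributes_per_cluster(rows, clusters, names)
for cluster, ranked in ranking.items():
    print(cluster, [(name, round(weight, 3)) for name, weight in ranked])
```

Each entry of `ranking` is the ordered attribute list for one cluster, which corresponds to the attribute weights produced in one iteration of the ValueIterator loop.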

    Cheers,
    Ingo