"Decision tree : keep id"
Hello,
I'm trying to use a decision tree in RapidMiner and I can't find how to keep the id during the process. Here is an example of what I get in RapidMiner:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.006">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.1.006" expanded="true" name="Process">
    <process expanded="true" height="501" width="683">
      <operator activated="true" class="retrieve" compatibility="5.1.006" expanded="true" height="60" name="Retrieve" width="90" x="45" y="30">
        <parameter key="repository_entry" value="//Samples/data/Golf"/>
      </operator>
      <operator activated="true" class="generate_id" compatibility="5.1.006" expanded="true" height="76" name="Generate ID" width="90" x="179" y="30"/>
      <operator activated="true" class="generate_attributes" compatibility="5.1.006" expanded="true" height="76" name="Generate Attributes" width="90" x="313" y="30">
        <list key="function_descriptions">
          <parameter key="indic" value="rand()"/>
        </list>
      </operator>
      <operator activated="true" class="numerical_to_polynominal" compatibility="5.1.006" expanded="true" height="76" name="Numerical to Polynominal" width="90" x="447" y="30">
        <parameter key="attribute_filter_type" value="single"/>
        <parameter key="attribute" value="indic"/>
      </operator>
      <operator activated="true" class="discretize_by_bins" compatibility="5.1.006" expanded="true" height="94" name="Discretize" width="90" x="45" y="210">
        <parameter key="attribute_filter_type" value="subset"/>
        <parameter key="attributes" value="|Humidity|Temperature"/>
        <parameter key="number_of_bins" value="3"/>
      </operator>
      <operator activated="true" class="set_role" compatibility="5.1.006" expanded="true" height="76" name="Set Role" width="90" x="179" y="210">
        <parameter key="name" value="Play"/>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="set_role" compatibility="5.1.006" expanded="true" height="76" name="Set Role (2)" width="90" x="313" y="210">
        <parameter key="name" value="indic"/>
        <parameter key="target_role" value="label"/>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="decision_tree" compatibility="5.1.006" expanded="true" height="76" name="Decision Tree" width="90" x="447" y="210"/>
      <connect from_op="Retrieve" from_port="output" to_op="Generate ID" to_port="example set input"/>
      <connect from_op="Generate ID" from_port="example set output" to_op="Generate Attributes" to_port="example set input"/>
      <connect from_op="Generate Attributes" from_port="example set output" to_op="Numerical to Polynominal" to_port="example set input"/>
      <connect from_op="Numerical to Polynominal" from_port="example set output" to_op="Discretize" to_port="example set input"/>
      <connect from_op="Discretize" from_port="example set output" to_op="Set Role" to_port="example set input"/>
      <connect from_op="Set Role" from_port="example set output" to_op="Set Role (2)" to_port="example set input"/>
      <connect from_op="Set Role (2)" from_port="example set output" to_op="Decision Tree" to_port="training set"/>
      <connect from_op="Decision Tree" from_port="model" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="180"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
Result :
Wind = false
| Temperature = range1 [-∞ - 71]: 0.406 {0.038=0, 0.209=0, 0.248=0, 0.406=1, 0.075=1, 0.575=0, 0.578=0, 0.118=0, 0.746=1, 0.397=0, 0.524=0, 0.639=0, 0.716=0, 0.297=0}
| Temperature = range2 [71 - 78]: 0.118 {0.038=0, 0.209=0, 0.248=0, 0.406=0, 0.075=0, 0.575=0, 0.578=0, 0.118=1, 0.746=0, 0.397=1, 0.524=0, 0.639=0, 0.716=0, 0.297=0}
| Temperature = range3 [78 - ∞]: 0.038 {0.038=1, 0.209=0, 0.248=1, 0.406=0, 0.075=0, 0.575=0, 0.578=0, 0.118=0, 0.746=0, 0.397=0, 0.524=0, 0.639=0, 0.716=1, 0.297=0}
Wind = true
| Outlook = overcast: 0.578 {0.038=0, 0.209=0, 0.248=0, 0.406=0, 0.075=0, 0.575=0, 0.578=1, 0.118=0, 0.746=0, 0.397=0, 0.524=0, 0.639=1, 0.716=0, 0.297=0}
| Outlook = rain: 0.575 {0.038=0, 0.209=0, 0.248=0, 0.406=0, 0.075=0, 0.575=1, 0.578=0, 0.118=0, 0.746=0, 0.397=0, 0.524=0, 0.639=0, 0.716=0, 0.297=1}
| Outlook = sunny: 0.209 {0.038=0, 0.209=1, 0.248=0, 0.406=0, 0.075=0, 0.575=0, 0.578=0, 0.118=0, 0.746=0, 0.397=0, 0.524=1, 0.639=0, 0.716=0, 0.297=0}
In the results, each node shows the number of occurrences of each label value, but I need the ids of the examples that fall into it. I mean I need this kind of result:
Wind = false
| Temperature = range1 [-∞ - 71]: 0.406 {4,5,9}
| Temperature = range2 [71 - 78]: 0.118 {8,10}
| Temperature = range3 [78 - ∞]: 0.038 {1,3,13}
Wind = true
| Outlook = overcast: 0.578 {7,12}
| Outlook = rain: 0.575 {6,14}
| Outlook = sunny: 0.209 {2,11}
Is there a simple way to do this kind of thing with a decision tree?
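To make the goal concrete, here is a minimal Python sketch of the grouping I am after, written outside RapidMiner. It is not a RapidMiner operator or API. The rows are copied by hand from the Golf sample set (so treat them as an assumption), the cut points 71 and 78 come from the bins printed in the tree, and the names ROWS, temperature_bin, leaf_of and ids_per_leaf are made up for illustration. The splits are hard-coded from the tree shown in this post rather than read from a trained model.

```python
from collections import defaultdict

# (id, Outlook, Temperature, Wind): hand-copied from the Golf sample set (assumption).
ROWS = [
    (1, "sunny", 85, False), (2, "sunny", 80, True),
    (3, "overcast", 83, False), (4, "rain", 70, False),
    (5, "rain", 68, False), (6, "rain", 65, True),
    (7, "overcast", 64, True), (8, "sunny", 72, False),
    (9, "sunny", 69, False), (10, "rain", 75, False),
    (11, "sunny", 75, True), (12, "overcast", 72, True),
    (13, "overcast", 81, False), (14, "rain", 71, True),
]


def temperature_bin(temperature):
    """Map a raw temperature to the three bins printed in the tree."""
    if temperature <= 71:
        return "range1"
    if temperature <= 78:
        return "range2"
    return "range3"


def leaf_of(outlook, temperature, wind):
    """Follow the splits of the tree in this post and return the leaf path."""
    if wind:
        return f"Wind = true | Outlook = {outlook}"
    return f"Wind = false | Temperature = {temperature_bin(temperature)}"


def ids_per_leaf(rows):
    """Collect the ids of the rows that end up in each leaf."""
    leaves = defaultdict(list)
    for row_id, outlook, temperature, wind in rows:
        leaves[leaf_of(outlook, temperature, wind)].append(row_id)
    return dict(leaves)


if __name__ == "__main__":
    for leaf, ids in sorted(ids_per_leaf(ROWS).items()):
        print(leaf, ids)
```

With these rows the groups come out as the id sets I listed for each leaf, for example 4, 5 and 9 under Wind = false, Temperature = range1. What I am looking for is a way to get the same thing directly from the Decision Tree result in RapidMiner.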