
Entropy and Gain for Decision Tree more than 1

kinglaplace Member Posts: 3 Learner III
edited November 2018 in Help

Hi,

I am new to data mining and would like to use a decision tree for prediction. My label has 9 possible output classes. When I calculate entropy and gain manually, I get values greater than 1. How can I solve this? Also, where can I see the entropy and gain values in RapidMiner, so that I can compare them with my manual calculation?

 

Thank you.

Answers

  • sgenzer Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    Hello @kinglaplace - welcome to the community. Can you please post your XML process (see "Read before Posting" on the right when you reply)? And have you looked at the videos on decision tree modeling (see "Creating a Decision Tree Model" here)?

     

    Scott

     

     

  • kinglaplace Member Posts: 3 Learner III

    Thank you for your help. I have attached the training data. How do I choose the best model for my data?

  • lionelderkrikor RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn

    Hi @kinglaplace,

     

    To choose the best model for your data, I recommend the "Automatic model selection and optimization" tool (Pavithra_Rao).

    This tool helps you choose the best model (the one with the best performance) among several optimized models.

    I ran this tool on your data to benchmark 3 models (Decision Tree, Random Forest, Gradient Boosted Trees).

    Gradient Boosted Trees seems to be the best: accuracy = correct predictions / total predictions = 89.60% (mean), but this is very close to the performance of the Decision Tree.

    NB: You should also consider other performance metrics, such as recall and precision.
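To make these metrics concrete, here is a minimal Python sketch of how accuracy, per-class recall and per-class precision can be computed from a list of true and predicted labels. The labels `F1`, `F2`, `F3` and the two lists are made up for illustration; they are not taken from the training data in this thread, and this is not how RapidMiner computes its values internally.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_recall_precision(y_true, y_pred):
    """Return {label: (recall, precision)} for every label that occurs."""
    result = {}
    for label in sorted(set(y_true) | set(y_pred)):
        # True positives: examples of this class predicted as this class.
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        actual = sum(t == label for t in y_true)      # examples truly in the class
        predicted = sum(p == label for p in y_pred)   # examples predicted as the class
        recall = tp / actual if actual else 0.0
        precision = tp / predicted if predicted else 0.0
        result[label] = (recall, precision)
    return result


# Hypothetical labels, for illustration only.
y_true = ["F1", "F1", "F1", "F1", "F2", "F2", "F3", "F3"]
y_pred = ["F1", "F1", "F1", "F1", "F1", "F2", "F1", "F3"]

print("accuracy:", accuracy(y_true, y_pred))
for label, (recall, precision) in per_class_recall_precision(y_true, y_pred).items():
    print(label, "recall:", round(recall, 3), "precision:", round(precision, 3))
```

In this made-up example the accuracy is 0.75, yet only half of the `F2` and `F3` examples are recognized (recall 0.5). That is the kind of weakness accuracy alone can hide when some classes are rare.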

    Here is the process:

    <?xml version="1.0" encoding="UTF-8"?><process version="8.0.001">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="8.0.001" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="read_csv" compatibility="8.0.001" expanded="true" height="68" name="Read CSV (2)" width="90" x="179" y="136">
    <parameter key="csv_file" value="C:\Users\Lionel\Documents\Formations_DataScience\Rapidminer\Tests_Rapidminer\Data Mining Train Data.csv"/>
    <parameter key="column_separators" value=","/>
    <parameter key="first_row_as_names" value="false"/>
    <list key="annotations">
    <parameter key="0" value="Name"/>
    </list>
    <parameter key="encoding" value="windows-1252"/>
    <list key="data_set_meta_data_information">
    <parameter key="0" value="Gas 1.true.real.attribute"/>
    <parameter key="1" value="Gas 2.true.real.attribute"/>
    <parameter key="2" value="Gas 3.true.real.attribute"/>
    <parameter key="3" value="Gas 4.true.real.attribute"/>
    <parameter key="4" value="Gas 5.true.real.attribute"/>
    <parameter key="5" value="Gas 6.true.real.attribute"/>
    <parameter key="6" value="Gas 7.true.real.attribute"/>
    <parameter key="7" value="Target Fault .true.polynominal.attribute"/>
    <parameter key="8" value="att9.true.attribute_value.attribute"/>
    <parameter key="9" value="att10.true.attribute_value.attribute"/>
    <parameter key="10" value="att11.true.attribute_value.attribute"/>
    <parameter key="11" value="att12.true.attribute_value.attribute"/>
    <parameter key="12" value="att13.true.attribute_value.attribute"/>
    <parameter key="13" value="att14.true.attribute_value.attribute"/>
    </list>
    </operator>
    <operator activated="true" class="set_role" compatibility="8.0.001" expanded="true" height="82" name="Set Role" width="90" x="313" y="136">
    <parameter key="attribute_name" value="Target Fault"/>
    <parameter key="target_role" value="label"/>
    <list key="set_additional_roles"/>
    </operator>
    <operator activated="true" class="optimize_parameters_grid" compatibility="8.0.001" expanded="true" height="145" name="Optimize Parameters (Grid)" width="90" x="581" y="136">
    <list key="parameters">
    <parameter key="Select Subprocess.select_which" value="[1.0;4;3;linear]"/>
    </list>
    <process expanded="true">
    <operator activated="true" class="select_subprocess" compatibility="8.0.001" expanded="true" height="124" name="Select Subprocess" width="90" x="514" y="34">
    <parameter key="select_which" value="4"/>
    <process expanded="true">
    <operator activated="true" class="concurrency:cross_validation" compatibility="8.0.001" expanded="true" height="145" name="Cross Validation (4)" width="90" x="45" y="34">
    <process expanded="true">
    <operator activated="true" class="h2o:gradient_boosted_trees" compatibility="7.6.001" expanded="true" height="103" name="Gradient Boosted Trees" width="90" x="112" y="136">
    <list key="expert_parameters"/>
    </operator>
    <connect from_port="training set" to_op="Gradient Boosted Trees" to_port="training set"/>
    <connect from_op="Gradient Boosted Trees" from_port="model" to_port="model"/>
    <portSpacing port="source_training set" spacing="0"/>
    <portSpacing port="sink_model" spacing="0"/>
    <portSpacing port="sink_through 1" spacing="0"/>
    </process>
    <process expanded="true">
    <operator activated="true" class="apply_model" compatibility="8.0.001" expanded="true" height="82" name="Apply Model (4)" width="90" x="112" y="34">
    <list key="application_parameters"/>
    </operator>
    <operator activated="true" class="performance_classification" compatibility="8.0.001" expanded="true" height="82" name="Performance (4)" width="90" x="313" y="34">
    <parameter key="weighted_mean_recall" value="true"/>
    <parameter key="weighted_mean_precision" value="true"/>
    <parameter key="cross-entropy" value="true"/>
    <list key="class_weights"/>
    </operator>
    <connect from_port="model" to_op="Apply Model (4)" to_port="model"/>
    <connect from_port="test set" to_op="Apply Model (4)" to_port="unlabelled data"/>
    <connect from_op="Apply Model (4)" from_port="labelled data" to_op="Performance (4)" to_port="labelled data"/>
    <connect from_op="Performance (4)" from_port="performance" to_port="performance 1"/>
    <connect from_op="Performance (4)" from_port="example set" to_port="test set results"/>
    <portSpacing port="source_model" spacing="0"/>
    <portSpacing port="source_test set" spacing="0"/>
    <portSpacing port="source_through 1" spacing="0"/>
    <portSpacing port="sink_test set results" spacing="0"/>
    <portSpacing port="sink_performance 1" spacing="0"/>
    <portSpacing port="sink_performance 2" spacing="0"/>
    </process>
    <description align="center" color="transparent" colored="false" width="126">Cross validation subprocess to build the learner model and validate its performance</description>
    </operator>
    <connect from_port="input 1" to_op="Cross Validation (4)" to_port="example set"/>
    <connect from_op="Cross Validation (4)" from_port="model" to_port="output 2"/>
    <connect from_op="Cross Validation (4)" from_port="performance 1" to_port="output 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_output 1" spacing="0"/>
    <portSpacing port="sink_output 2" spacing="0"/>
    <portSpacing port="sink_output 3" spacing="0"/>
    <portSpacing port="sink_output 4" spacing="0"/>
    </process>
    <process expanded="true">
    <operator activated="true" class="multiply" compatibility="8.0.001" expanded="true" height="103" name="Multiply (2)" width="90" x="45" y="340"/>
    <operator activated="true" class="optimize_parameters_grid" compatibility="8.0.001" expanded="true" height="103" name="Optimize Parameters DT" width="90" x="112" y="34">
    <list key="parameters">
    <parameter key="Decision Tree.criterion" value="gain_ratio,information_gain,gini_index,accuracy"/>
    <parameter key="Decision Tree.minimal_gain" value="[0.01;1;100;linear]"/>
    <parameter key="Decision Tree.apply_pruning" value="true,false"/>
    <parameter key="Decision Tree.apply_prepruning" value="true,false"/>
    </list>
    <process expanded="true">
    <operator activated="true" class="concurrency:cross_validation" compatibility="8.0.001" expanded="true" height="145" name="Cross Validation" width="90" x="514" y="34">
    <process expanded="true">
    <operator activated="true" class="concurrency:parallel_decision_tree" compatibility="8.0.001" expanded="true" height="103" name="Decision Tree" width="90" x="179" y="34">
    <parameter key="criterion" value="information_gain"/>
    <parameter key="apply_pruning" value="false"/>
    <parameter key="apply_prepruning" value="false"/>
    <parameter key="minimal_gain" value="0.2674"/>
    </operator>
    <connect from_port="training set" to_op="Decision Tree" to_port="training set"/>
    <connect from_op="Decision Tree" from_port="model" to_port="model"/>
    <portSpacing port="source_training set" spacing="0"/>
    <portSpacing port="sink_model" spacing="0"/>
    <portSpacing port="sink_through 1" spacing="0"/>
    </process>
    <process expanded="true">
    <operator activated="true" class="apply_model" compatibility="8.0.001" expanded="true" height="82" name="Apply Model" width="90" x="112" y="34">
    <list key="application_parameters"/>
    </operator>
    <operator activated="true" class="performance_classification" compatibility="8.0.001" expanded="true" height="82" name="Performance" width="90" x="313" y="34">
    <parameter key="weighted_mean_recall" value="true"/>
    <parameter key="weighted_mean_precision" value="true"/>
    <parameter key="cross-entropy" value="true"/>
    <list key="class_weights"/>
    </operator>
    <connect from_port="model" to_op="Apply Model" to_port="model"/>
    <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
    <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
    <connect from_op="Performance" from_port="performance" to_port="performance 1"/>
    <connect from_op="Performance" from_port="example set" to_port="test set results"/>
    <portSpacing port="source_model" spacing="0"/>
    <portSpacing port="source_test set" spacing="0"/>
    <portSpacing port="source_through 1" spacing="0"/>
    <portSpacing port="sink_test set results" spacing="0"/>
    <portSpacing port="sink_performance 1" spacing="0"/>
    <portSpacing port="sink_performance 2" spacing="0"/>
    </process>
    </operator>
    <connect from_port="input 1" to_op="Cross Validation" to_port="example set"/>
    <connect from_op="Cross Validation" from_port="performance 1" to_port="performance"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_performance" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    </process>
    <description align="center" color="transparent" colored="false" width="126">Optimize the parameters of the model and performance parameters</description>
    </operator>
    <operator activated="true" class="set_parameters" compatibility="8.0.001" expanded="true" height="82" name="Set Parameters (4)" width="90" x="179" y="340">
    <list key="name_map">
    <parameter key="Decision Tree " value="Decision Tree (2)"/>
    </list>
    <description align="center" color="transparent" colored="false" width="126">Picks up the optimized parameters and applies a set of parameters to the specified operators</description>
    </operator>
    <operator activated="true" class="concurrency:parallel_decision_tree" compatibility="8.0.001" expanded="true" height="103" name="Decision Tree (2)" width="90" x="112" y="595"/>
    <connect from_port="input 1" to_op="Multiply (2)" to_port="input"/>
    <connect from_op="Multiply (2)" from_port="output 1" to_op="Optimize Parameters DT" to_port="input 1"/>
    <connect from_op="Multiply (2)" from_port="output 2" to_op="Decision Tree (2)" to_port="training set"/>
    <connect from_op="Optimize Parameters DT" from_port="performance" to_port="output 1"/>
    <connect from_op="Optimize Parameters DT" from_port="parameter" to_op="Set Parameters (4)" to_port="parameter set"/>
    <connect from_op="Set Parameters (4)" from_port="parameter set" to_port="output 3"/>
    <connect from_op="Decision Tree (2)" from_port="model" to_port="output 2"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_output 1" spacing="0"/>
    <portSpacing port="sink_output 2" spacing="0"/>
    <portSpacing port="sink_output 3" spacing="0"/>
    <portSpacing port="sink_output 4" spacing="0"/>
    </process>
    <process expanded="true">
    <operator activated="true" class="multiply" compatibility="8.0.001" expanded="true" height="103" name="Multiply (4)" width="90" x="45" y="34"/>
    <operator activated="true" class="optimize_parameters_grid" compatibility="8.0.001" expanded="true" height="103" name="Optimize Parameters RF" width="90" x="179" y="34">
    <list key="parameters">
    <parameter key="Random Forest (2).number_of_trees" value="[1.0;10;10;linear]"/>
    <parameter key="Random Forest (2).criterion" value="gain_ratio,information_gain,gini_index,accuracy"/>
    <parameter key="Random Forest (2).apply_prepruning" value="true,false"/>
    <parameter key="Random Forest (2).apply_pruning" value="true,false"/>
    </list>
    <process expanded="true">
    <operator activated="true" class="concurrency:cross_validation" compatibility="8.0.001" expanded="true" height="145" name="Cross Validation (2)" width="90" x="514" y="34">
    <process expanded="true">
    <operator activated="true" class="concurrency:parallel_random_forest" compatibility="8.0.001" expanded="true" height="103" name="Random Forest (2)" width="90" x="246" y="34">
    <parameter key="criterion" value="accuracy"/>
    <parameter key="apply_pruning" value="false"/>
    <parameter key="apply_prepruning" value="false"/>
    </operator>
    <connect from_port="training set" to_op="Random Forest (2)" to_port="training set"/>
    <connect from_op="Random Forest (2)" from_port="model" to_port="model"/>
    <portSpacing port="source_training set" spacing="0"/>
    <portSpacing port="sink_model" spacing="0"/>
    <portSpacing port="sink_through 1" spacing="0"/>
    </process>
    <process expanded="true">
    <operator activated="true" class="apply_model" compatibility="8.0.001" expanded="true" height="82" name="Apply Model (2)" width="90" x="112" y="34">
    <list key="application_parameters"/>
    </operator>
    <operator activated="true" class="performance_classification" compatibility="8.0.001" expanded="true" height="82" name="Performance (2)" width="90" x="313" y="34">
    <parameter key="weighted_mean_recall" value="true"/>
    <parameter key="weighted_mean_precision" value="true"/>
    <parameter key="cross-entropy" value="true"/>
    <list key="class_weights"/>
    </operator>
    <connect from_port="model" to_op="Apply Model (2)" to_port="model"/>
    <connect from_port="test set" to_op="Apply Model (2)" to_port="unlabelled data"/>
    <connect from_op="Apply Model (2)" from_port="labelled data" to_op="Performance (2)" to_port="labelled data"/>
    <connect from_op="Performance (2)" from_port="performance" to_port="performance 1"/>
    <connect from_op="Performance (2)" from_port="example set" to_port="test set results"/>
    <portSpacing port="source_model" spacing="0"/>
    <portSpacing port="source_test set" spacing="0"/>
    <portSpacing port="source_through 1" spacing="0"/>
    <portSpacing port="sink_test set results" spacing="0"/>
    <portSpacing port="sink_performance 1" spacing="0"/>
    <portSpacing port="sink_performance 2" spacing="0"/>
    </process>
    </operator>
    <connect from_port="input 1" to_op="Cross Validation (2)" to_port="example set"/>
    <connect from_op="Cross Validation (2)" from_port="performance 1" to_port="performance"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_performance" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    </process>
    </operator>
    <operator activated="true" class="set_parameters" compatibility="8.0.001" expanded="true" height="82" name="Set Parameters (5)" width="90" x="179" y="238">
    <list key="name_map">
    <parameter key="Random Forest (2)" value="Random Forest (3)"/>
    </list>
    </operator>
    <operator activated="true" class="concurrency:parallel_random_forest" compatibility="8.0.001" expanded="true" height="103" name="Random Forest (3)" width="90" x="112" y="595">
    <parameter key="number_of_trees" value="8"/>
    <parameter key="criterion" value="information_gain"/>
    <parameter key="apply_prepruning" value="false"/>
    </operator>
    <connect from_port="input 1" to_op="Multiply (4)" to_port="input"/>
    <connect from_op="Multiply (4)" from_port="output 1" to_op="Optimize Parameters RF" to_port="input 1"/>
    <connect from_op="Multiply (4)" from_port="output 2" to_op="Random Forest (3)" to_port="training set"/>
    <connect from_op="Optimize Parameters RF" from_port="performance" to_port="output 1"/>
    <connect from_op="Optimize Parameters RF" from_port="parameter" to_op="Set Parameters (5)" to_port="parameter set"/>
    <connect from_op="Set Parameters (5)" from_port="parameter set" to_port="output 3"/>
    <connect from_op="Random Forest (3)" from_port="model" to_port="output 2"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_output 1" spacing="0"/>
    <portSpacing port="sink_output 2" spacing="0"/>
    <portSpacing port="sink_output 3" spacing="0"/>
    <portSpacing port="sink_output 4" spacing="0"/>
    </process>
    <process expanded="true">
    <operator activated="true" class="multiply" compatibility="8.0.001" expanded="true" height="103" name="Multiply (5)" width="90" x="45" y="34"/>
    <operator activated="true" class="h2o:gradient_boosted_trees" compatibility="7.6.001" expanded="true" height="103" name="Gradient Boosted Trees (2)" width="90" x="112" y="544">
    <parameter key="number_of_trees" value="6"/>
    <parameter key="learning_rate" value="0.4"/>
    <list key="expert_parameters"/>
    </operator>
    <operator activated="true" class="optimize_parameters_grid" compatibility="8.0.001" expanded="true" height="103" name="Optimize Parameters GBT" width="90" x="179" y="34">
    <list key="parameters">
    <parameter key="Gradient Boosted Trees (3).number_of_trees" value="[1.0;10;10;linear]"/>
    <parameter key="Gradient Boosted Trees (3).learning_rate" value="[0.1;0.9;8;linear]"/>
    </list>
    <process expanded="true">
    <operator activated="true" class="concurrency:cross_validation" compatibility="8.0.001" expanded="true" height="145" name="Cross Validation (5)" width="90" x="514" y="34">
    <process expanded="true">
    <operator activated="true" class="h2o:gradient_boosted_trees" compatibility="7.6.001" expanded="true" height="103" name="Gradient Boosted Trees (3)" width="90" x="246" y="34">
    <parameter key="number_of_trees" value="10"/>
    <parameter key="learning_rate" value="0.9"/>
    <list key="expert_parameters"/>
    </operator>
    <connect from_port="training set" to_op="Gradient Boosted Trees (3)" to_port="training set"/>
    <connect from_op="Gradient Boosted Trees (3)" from_port="model" to_port="model"/>
    <portSpacing port="source_training set" spacing="0"/>
    <portSpacing port="sink_model" spacing="0"/>
    <portSpacing port="sink_through 1" spacing="0"/>
    </process>
    <process expanded="true">
    <operator activated="true" class="apply_model" compatibility="8.0.001" expanded="true" height="82" name="Apply Model (5)" width="90" x="112" y="34">
    <list key="application_parameters"/>
    </operator>
    <operator activated="true" class="performance_classification" compatibility="8.0.001" expanded="true" height="82" name="Performance (5)" width="90" x="313" y="34">
    <parameter key="weighted_mean_recall" value="true"/>
    <parameter key="weighted_mean_precision" value="true"/>
    <parameter key="cross-entropy" value="true"/>
    <list key="class_weights"/>
    </operator>
    <connect from_port="model" to_op="Apply Model (5)" to_port="model"/>
    <connect from_port="test set" to_op="Apply Model (5)" to_port="unlabelled data"/>
    <connect from_op="Apply Model (5)" from_port="labelled data" to_op="Performance (5)" to_port="labelled data"/>
    <connect from_op="Performance (5)" from_port="performance" to_port="performance 1"/>
    <connect from_op="Performance (5)" from_port="example set" to_port="test set results"/>
    <portSpacing port="source_model" spacing="0"/>
    <portSpacing port="source_test set" spacing="0"/>
    <portSpacing port="source_through 1" spacing="0"/>
    <portSpacing port="sink_test set results" spacing="0"/>
    <portSpacing port="sink_performance 1" spacing="0"/>
    <portSpacing port="sink_performance 2" spacing="0"/>
    </process>
    </operator>
    <connect from_port="input 1" to_op="Cross Validation (5)" to_port="example set"/>
    <connect from_op="Cross Validation (5)" from_port="performance 1" to_port="performance"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_performance" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    </process>
    </operator>
    <operator activated="true" class="set_parameters" compatibility="8.0.001" expanded="true" height="82" name="Set Parameters (2)" width="90" x="179" y="289">
    <list key="name_map">
    <parameter key="Gradient Boosted Trees (3)" value="Gradient Boosted Trees (2)"/>
    </list>
    </operator>
    <connect from_port="input 1" to_op="Multiply (5)" to_port="input"/>
    <connect from_op="Multiply (5)" from_port="output 1" to_op="Optimize Parameters GBT" to_port="input 1"/>
    <connect from_op="Multiply (5)" from_port="output 2" to_op="Gradient Boosted Trees (2)" to_port="training set"/>
    <connect from_op="Gradient Boosted Trees (2)" from_port="model" to_port="output 2"/>
    <connect from_op="Optimize Parameters GBT" from_port="performance" to_port="output 1"/>
    <connect from_op="Optimize Parameters GBT" from_port="parameter" to_op="Set Parameters (2)" to_port="parameter set"/>
    <connect from_op="Set Parameters (2)" from_port="parameter set" to_port="output 3"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_output 1" spacing="0"/>
    <portSpacing port="sink_output 2" spacing="0"/>
    <portSpacing port="sink_output 3" spacing="0"/>
    <portSpacing port="sink_output 4" spacing="0"/>
    </process>
    <description align="center" color="transparent" colored="false" width="126">Subprocess to Optimize number of models and its performance</description>
    </operator>
    <connect from_port="input 1" to_op="Select Subprocess" to_port="input 1"/>
    <connect from_op="Select Subprocess" from_port="output 1" to_port="performance"/>
    <connect from_op="Select Subprocess" from_port="output 2" to_port="result 1"/>
    <connect from_op="Select Subprocess" from_port="output 3" to_port="result 2"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_performance" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    <portSpacing port="sink_result 3" spacing="0"/>
    </process>
    <description align="center" color="transparent" colored="false" width="126">Automatically picks the process which produces the optimized model</description>
    </operator>
    <connect from_op="Read CSV (2)" from_port="output" to_op="Set Role" to_port="example set input"/>
    <connect from_op="Set Role" from_port="example set output" to_op="Optimize Parameters (Grid)" to_port="input 1"/>
    <connect from_op="Optimize Parameters (Grid)" from_port="performance" to_port="result 1"/>
    <connect from_op="Optimize Parameters (Grid)" from_port="parameter" to_port="result 4"/>
    <connect from_op="Optimize Parameters (Grid)" from_port="result 1" to_port="result 2"/>
    <connect from_op="Optimize Parameters (Grid)" from_port="result 2" to_port="result 3"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    <portSpacing port="sink_result 3" spacing="0"/>
    <portSpacing port="sink_result 4" spacing="0"/>
    <portSpacing port="sink_result 5" spacing="0"/>
    <description align="center" color="yellow" colored="false" height="74" resized="true" width="764" x="165" y="10">This process automatically picks the optimized model out of the models built inside the Select Subprocess operator&lt;br/&gt;The outer optimize operator optimizes on the Select Subprocess parameter to pick the process (inside the Select Subprocess operator) which has the best optimized model results for the given input data</description>
    </process>
    </operator>
    </process>

    Now you can experiment by yourself with other models and/or other optimization settings for the current models.

     

    Regards,

     

    Lionel

     

     

     

     

     

     

  • kinglaplace Member Posts: 3 Learner III

    Thank you for the information. For the decision tree, I tried to calculate entropy and gain manually, but the values are greater than 1. In every reference I have found, the maximum value of both is 1. How can I display entropy and gain in RapidMiner, so that I can compare them with my manually calculated results? Also, the decision tree examples I find always have two output conditions, but my case has 8. Can a decision tree be used with more than two output conditions?

    Thank you.

  • lionelderkrikor RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn

    Hi @kinglaplace,

     

    It seems to me that RapidMiner does not display the entropy and the gain in the results. There is the "cross-entropy" calculated by the Performance (Classification) operator, but it is a measure of the model's performance and, in my opinion, different from what you are looking for.

    A decision tree can of course be used with 8 output conditions.
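On the values greater than 1: with base-2 logarithms, entropy is bounded by 1 only when there are two classes. For k classes the maximum is log2(k), so with 8 classes it can reach 3, and information gain can exceed 1 as well. The Python sketch below illustrates this with made-up labels (`A` to `H`), not with the data from this thread. It uses the textbook definitions and is not a reproduction of RapidMiner's internal calculation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child subsets."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted


# Hypothetical, perfectly balanced label sets.
two_classes = ["A", "B"] * 4       # 2 classes
eight_classes = list("ABCDEFGH")   # 8 classes, one example each

print(entropy(two_classes))    # maximum for 2 classes: log2(2)
print(entropy(eight_classes))  # maximum for 8 classes: log2(8)

# A split that separates the 8 classes into two groups of 4.
split = [eight_classes[:4], eight_classes[4:]]
print(information_gain(eight_classes, split))

# Dividing by log2(k) rescales the entropy into the range [0, 1].
k = len(set(eight_classes))
print(entropy(eight_classes) / math.log2(k))
```

So a manually calculated entropy above 1 is not necessarily an error when there are more than two classes. If you want values comparable to the two-class references, dividing by log2(k), or equivalently using logarithms of base k, is one option.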

     

    Regards,

     

    Lionel

     
