"Decision tree with one node in spite of low confidence and min gain"

olgakulesza2 Member Posts: 15 Learner III
edited June 2019 in Help

Hello,

I have a problem with my decision tree: it generates only one node. I tried lowering the confidence to 0.1 and the minimal gain to 0.001, but that didn't help. Could you please tell me what to do?

 

<?xml version="1.0" encoding="UTF-8"?><process version="8.1.003">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="8.1.003" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="retrieve" compatibility="8.1.003" expanded="true" height="68" name="Retrieve Books_Ratings_Tags_forUser10" width="90" x="112" y="85">
<parameter key="repository_entry" value="Books_Ratings_Tags_forUser10"/>
</operator>
<operator activated="true" class="split_data" compatibility="8.1.003" expanded="true" height="103" name="Split Data" width="90" x="246" y="85">
<enumeration key="partitions">
<parameter key="ratio" value="0.8"/>
<parameter key="ratio" value="0.2"/>
</enumeration>
</operator>
<operator activated="true" class="concurrency:parallel_decision_tree" compatibility="8.1.003" expanded="true" height="103" name="Decision Tree" width="90" x="447" y="34">
<parameter key="confidence" value="0.1"/>
<parameter key="minimal_gain" value="0.001"/>
</operator>
<operator activated="true" class="apply_model" compatibility="8.1.003" expanded="true" height="82" name="Apply Model" width="90" x="581" y="136">
<list key="application_parameters"/>
</operator>
<connect from_op="Retrieve Books_Ratings_Tags_forUser10" from_port="output" to_op="Split Data" to_port="example set"/>
<connect from_op="Split Data" from_port="partition 1" to_op="Decision Tree" to_port="training set"/>
<connect from_op="Split Data" from_port="partition 2" to_op="Apply Model" to_port="unlabelled data"/>
<connect from_op="Decision Tree" from_port="model" to_op="Apply Model" to_port="model"/>
<connect from_op="Apply Model" from_port="model" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>

Best wishes

Olga

Best Answer

  • rfuentealba RapidMiner Certified Analyst, Member, University Professor Posts: 568 Unicorn
    Solution Accepted

    Hello @olgakulesza2,

     

    I loaded your process, but I don't have your data. However, I noticed that both "apply pruning" and "apply prepruning" are selected in the parameters. You might want to adjust these settings, as they effectively reduce the number of leaves generated in the tree.

     

    What helps me when adjusting a tree with "some" brute force: count how many columns the dataset has and set the maximal depth to the number of columns + 1. If this does not satisfy your needs, start experimenting with the prepruning parameters before turning to pruning. Adjust the settings for leaves and splits, and rerun the model until you are happy with the results. One more piece of advice: Cross-Validation and Optimize Parameters used together can help you create a tree that is good enough for your data.
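
    As a rough sketch of the approach above, the Decision Tree operator from the posted process could be configured along these lines. The column count of 10 (hence a maximal depth of 11) is purely an assumption, since the actual dataset is not available, and the leaf and split values are illustrative starting points rather than recommendations. The parameter keys are the ones the Decision Tree operator uses as far as I recall for RapidMiner 8.x; please check them against the operator's parameter panel.

    ```xml
    <operator activated="true" class="concurrency:parallel_decision_tree" compatibility="8.1.003" expanded="true" height="103" name="Decision Tree" width="90" x="447" y="34">
    <!-- Assumption: the dataset has 10 columns, so maximal depth = 10 + 1 -->
    <parameter key="maximal_depth" value="11"/>
    <!-- Pruning is switched off while the prepruning settings are being tuned -->
    <parameter key="apply_pruning" value="false"/>
    <parameter key="apply_prepruning" value="true"/>
    <!-- Minimal gain as already tried in the original process -->
    <parameter key="minimal_gain" value="0.001"/>
    <!-- Illustrative starting values; adjust and rerun the model -->
    <parameter key="minimal_leaf_size" value="1"/>
    <parameter key="minimal_size_for_split" value="2"/>
    </operator>
    ```

    If the tree grows with these settings, pruning can be switched back on and the prepruning values tightened step by step until the result is acceptable.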

     

    All the best,

     

    Rodrigo.
