Entropy and Gain for Decision Tree more than 1
kinglaplace
Hi,
I am a newbie in data mining and would like to implement a decision tree for my case, which has 9 output classes. When I calculate the entropy and gain manually, the values are greater than 1. How do I resolve this? Also, where can I see the entropy and gain results in RapidMiner, so I can compare them with my manual calculation?
Thank you.
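A quick note on the arithmetic involved: with a base-2 logarithm, the entropy of a node with k classes is bounded by log2(k) rather than by 1, and the information gain of a split is bounded by the parent node's entropy, so values above 1 are expected for a 9-class label. A minimal Python sketch of the entropy calculation, using made-up class counts:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    counts, n = Counter(labels), len(labels)
    return -sum((c / n) * log2(c / n) for c in counts.values())

# Made-up node with 9 classes and arbitrary counts:
labels = (["A"] * 10 + ["B"] * 8 + ["C"] * 7 + ["D"] * 6 + ["E"] * 5
          + ["F"] * 4 + ["G"] * 3 + ["H"] * 2 + ["I"] * 1)
print(entropy(labels))  # ~2.94 bits: above 1, but below the log2(9) ~ 3.17 bound
```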
Answers
Hello @kinglaplace - welcome to the community. Can you please post your XML process (see "Read before Posting" on the right when you reply)? And have you looked at the videos on decision tree modeling (see "Creating a Decision Tree Model" here)?
Scott
Thank you for your help. Here is the training data. How do I choose the best model for my data?
Hi @kinglaplace,
To choose the best model for your data, I recommend the Automatic Model Selection and Optimization tool (@Pavithra_Rao).
This tool helps you choose the best model (the one with the best performance) from among several optimized models.
I ran this tool with your data to benchmark 3 models (Decision Tree, Random Forest, Gradient Boosted Tree).
It seems that Gradient Boosted Tree is the best: accuracy = correct predictions / total predictions = 89.60% (mean), but it is very close to the performance of the Decision Tree.
NB: You should also consider the other performance metrics, like recall and precision.
Here is the process:
Now you can experiment by yourself with other models and/or other optimization settings for the current models.
Regards,
Lionel
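For readers who want to reproduce a comparison like this outside RapidMiner, here is a rough scikit-learn sketch of the same idea. The file name train.csv and the label column name "label" are placeholders, and any categorical attributes would need to be encoded first:

```python
# Cross-validated comparison of the three model types benchmarked above.
import pandas as pd
from sklearn.model_selection import cross_validate
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier

df = pd.read_csv("train.csv")                      # placeholder path
X, y = df.drop(columns="label"), df["label"]       # placeholder label column

models = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Gradient Boosted Tree": GradientBoostingClassifier(random_state=0),
}

# Accuracy plus macro-averaged precision/recall, as suggested above.
scoring = ["accuracy", "precision_macro", "recall_macro"]
for name, model in models.items():
    scores = cross_validate(model, X, y, cv=10, scoring=scoring)
    print(name,
          round(scores["test_accuracy"].mean(), 3),
          round(scores["test_precision_macro"].mean(), 3),
          round(scores["test_recall_macro"].mean(), 3))
```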
Thank you for the information. For the decision tree, I have tried to calculate the entropy and gain manually, but the values are greater than 1, while every reference I have seen gives a maximum value of 1 for both. How can I get the entropy and gain displayed in RapidMiner, so I can compare them with my manual calculation? Also, the decision tree examples I find always have two output conditions, but in my case there are 8 output conditions. Can a decision tree be implemented with more than two output conditions?
Thank you.
Hi @kinglaplace,
It seems to me that RapidMiner does not display the entropy and the gain in the results. There is the "cross-entropy" calculated by the Performance (Classification) operator, but that is a measure of the model's performance and, in my opinion, different from what you are looking for.
A decision tree can of course be built with 8 output conditions.
Regards,
Lionel
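Since RapidMiner does not display these values, the entropy and information gain can be recomputed outside it with a few lines of Python and compared with a manual result. A small sketch with made-up labels; the gain is the parent entropy minus the weighted average of the child entropies, so it can legitimately exceed 1 for a multi-class label:

```python
from collections import Counter
from math import log2

def entropy(labels):
    counts, n = Counter(labels), len(labels)
    return -sum((c / n) * log2(c / n) for c in counts.values())

def information_gain(parent, children):
    """Parent entropy minus the weighted average entropy of the child nodes."""
    n = len(parent)
    return entropy(parent) - sum(len(c) / n * entropy(c) for c in children)

# Made-up example: a balanced 4-class node split into three branches.
parent = ["A"] * 5 + ["B"] * 5 + ["C"] * 5 + ["D"] * 5
children = [["A"] * 5, ["B"] * 5, ["C"] * 5 + ["D"] * 5]
print(entropy(parent))                     # 2.0 bits (four equally likely classes)
print(information_gain(parent, children))  # 1.5 bits, legitimately greater than 1
```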