The Altair Community is migrating to a new platform to provide a better experience for you. In preparation for the migration, the Altair Community is on read-only mode from October 28 - November 6, 2024. Technical support via cases will continue to work as is. For any urgent requests from Students/Faculty members, please submit the form linked here

Twoing Algorithm

ebaybekebaybek Member Posts: 1 Learner III
edited November 2018 in Help

Hi Gurus,

 

You have to help me. We have a projekt for CART(Classification and Regression Trees) (Twoing Algortihm or Gini Algorithm). But ıf we could look for the twoing algorithm in RapidMiner Studion we found nothing. which algorithm is the best to them.

We thought that the bests are Classification by Regression or 'Classification and Decision Trees'. Could you recommend one of them or anything else

 

Thanks in advance

best regards

Eser Baybek

Tagged:

Answers

  • yyhuangyyhuang Administrator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 364 RM Data Scientist

    We have Decision Tree, Random Tree operators for CART. 

     

    When you grow the tree based on some splitting criteria or impurity measurements, you can choose from 

     

    • information_gain: The entropy of all the attributes is calculated. The attribute with minimum entropy is selected for split. This method has a bias towards selecting attributes with a large number of values.
    • gain_ratio: It is a variant of information gain. It adjusts the information gain for each attribute to allow the breadth and uniformity of the attribute values.
    • gini_index: This is a measure of impurity of an ExampleSet. Splitting on a chosen attribute gives a reduction in the average gini index of the resulting subsets.
    • accuracy: Such an attribute is selected for split that maximizes the accuracy of the whole Tree.

     

    According to the definition by IBM ftp://public.dhe.ibm.com/software/analytics/spss/support/Stats/Docs/Statistics/Algorithms/13.0/TREE-CART.pdf

    When the number of classes in your example, C=2, twoing is equivalent to the usual impurity index. A quick check on the classic R library RPART (Recursive Partitioning and Regression Trees) , and twoing is neither a part of RPART. But this paper about rpartOrdinal

    offer some twoing options in R. http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2899711/

    And as you may know it is quite easy to integrate your R codes in rapidminer.

     

    Hope this hepls.

     

     

Sign In or Register to comment.