
c4.5

henilei Member Posts: 2 Contributor I
I have a data set in which one field contains numerical values. How does RapidMiner split a numerical attribute when using C4.5?

Answers

  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Hi,

    RapidMiner's Decision Tree tries all possible split values of a numeric attribute and selects the value which produces the best split with respect to the selected criterion.

    Best,
      Marius
  • wessel Member Posts: 537 Maven
    It uses all values that are in the data set, right?
    An implementation that sorts the values first can probably be even faster,
    because you know an optimal split point is always halfway between two data points.

    For example, assume you have a numerical axis from left to right (attribute x) and labels A and B (class attribute).

    AAAAAAAAAAA|BBBBBBBBBBBBBBBBBBB
    -----------|------------------->  x-axis
               v
      optimal split point

    As far as I'm aware, you cannot draw a picture in which the optimal split point is not halfway between two data points.

    ABABABABABABABABABAB
    -------------------->  x-axis
    There is no good split point here. Any split on x provides little or no information gain.


    AAA|BBA
    ---|---------->  x-axis
       v
    optimal split point

    I hope you get the idea.

    By the way, you can actually create data sets like this and see where it splits!
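
To try the three label patterns above without building a process, here is a minimal Python sketch of the idea described in this thread: sort the values, test the midpoint between each pair of adjacent distinct values, and keep the one with the highest information gain. This is not RapidMiner's actual code. The function names (entropy, best_numeric_split, split_for) are made up for illustration, the points are assumed to sit at x = 0, 1, 2, ..., and plain information gain is assumed as the criterion (the operator lets you select others).

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def best_numeric_split(values, labels):
    """Try every midpoint between adjacent distinct sorted values.

    Returns (threshold, information_gain), or (None, 0.0) if no split helps.
    """
    pairs = sorted(zip(values, labels))
    ys = [y for _, y in pairs]
    base = entropy(ys)
    best_threshold, best_gain = None, 0.0
    for i in range(1, len(pairs)):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no threshold can separate two equal values
        left, right = ys[:i], ys[i:]
        remainder = (len(left) * entropy(left) + len(right) * entropy(right)) / len(ys)
        gain = base - remainder
        if gain > best_gain:
            best_threshold, best_gain = (lo + hi) / 2, gain
    return best_threshold, best_gain


def split_for(pattern):
    """Place the labels of `pattern` at x = 0, 1, 2, ... and find the best split."""
    return best_numeric_split(range(len(pattern)), pattern)


# The three pictures from the post above, written as label strings.
for pattern in ("A" * 11 + "B" * 19, "AB" * 10, "AAABBA"):
    threshold, gain = split_for(pattern)
    print(f"{pattern}: threshold={threshold}, gain={gain:.3f}")
```

Under these assumptions the first pattern splits at 10.5 (between the last A and the first B) and the third at 2.5 (after AAA). For the alternating pattern the best gain the sketch finds is small, roughly 0.05 bits from cutting off a single point at one end, and a cut through the exact middle gives zero.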