
"Performance Operator presents wrong value in the table view and description"

Alexey Member Posts: 3 Contributor I
edited June 2019 in Help
Somehow I ran into the following situation:
after applying a model trained with a decision tree inside a cross validation, the performance result looks very "strange". In the precision and recall view, the rows and columns are suddenly swapped, causing all precision and recall values to become (100% - x).

I attach pictures of the views. This happened with the latest RapidMiner version downloaded from the website, on my MacBook Pro with Mac OS X 10.10.1.

Accuracy View (screenshot attached)

Recall View (screenshot attached)

Description View (screenshot attached)
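For illustration, the pattern in the screenshots is exactly what appears when the internal value mapping of a binominal label flips between training and scoring, so every prediction is read as the opposite class and each class recall turns into (100% - recall). A minimal plain-Python sketch of the effect (not RapidMiner code; labels and counts are made up):

```python
def class_recalls(y_true, y_pred, classes):
    """Recall per class, computed directly from label lists."""
    recalls = {}
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        total = sum(1 for t in y_true if t == c)
        recalls[c] = tp / total
    return recalls

# Toy data: 8 gamma and 2 proton examples.
y_true = ["gamma"] * 8 + ["proton"] * 2
y_pred = ["gamma"] * 6 + ["proton"] * 2 + ["proton", "gamma"]

# Consistent label mapping: recalls look normal.
print(class_recalls(y_true, y_pred, ["gamma", "proton"]))  # gamma: 0.75

# Swapped internal mapping (as when an operator silently reorders the
# nominal value set): every prediction is read as the opposite class,
# and gamma recall becomes 1 - 0.75 = 0.25.
flip = {"gamma": "proton", "proton": "gamma"}
y_pred_flipped = [flip[p] for p in y_pred]
print(class_recalls(y_true, y_pred_flipped, ["gamma", "proton"]))
```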

Answers

  • MartinLiebig Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,533 RM Data Scientist
    Hi Alexey,

    Could you try the following:

    1. a Reorder Attributes operator right in front of Apply Model and the training of the model
    2. a Remap Binominals operator before the cross validation

    Do you do anything special inside the x-val which could change the meta-data (Append, Union,...)?

    By the way - are you working on an IACT like H.E.S.S., MAGIC or VERITAS?

    Best,

    Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • AlexeyAlexey Member Posts: 3 Contributor I
    Hey,

    I've tried both reordering the attributes and remapping the binominals before the cross validation. In neither case did the output change.

    In the cross validation I just train the decision tree, apply the model, select the recall, apply the threshold and then calculate the performance. I suppose the problem is caused by Sample (Bootstrapping). Since we have fewer examples of one class, I tried bootstrapping to get a roughly equal number of examples from both classes before training the model. It was just an attempt and didn't work as well as expected, but never mind. The workflow was as follows: take all examples of one class, apply Sample (Bootstrapping), use Union with the remaining data, and then sample the training data from that unified set. Only when I use bootstrapping do I get this strange result.
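    The rebalancing idea in that workflow can be sketched in plain Python (an illustration only, not the actual RapidMiner process; the class sizes are scaled-down toy numbers):

```python
import random

random.seed(42)

# Toy stand-in for the example set: imbalanced gamma vs proton classes.
data = [("gamma", i) for i in range(100)] + [("proton", i) for i in range(40)]

# Step 1: split off the minority class (Filter Examples in RapidMiner).
protons = [row for row in data if row[0] == "proton"]
gammas  = [row for row in data if row[0] == "gamma"]

# Step 2: bootstrap the minority class up to the majority size,
# i.e. sample with replacement (Sample (Bootstrapping)).
protons_boot = [random.choice(protons) for _ in range(len(gammas))]

# Step 3: Union the bootstrapped protons with the untouched gammas.
balanced = gammas + protons_boot

counts = {"gamma": 0, "proton": 0}
for label, _ in balanced:
    counts[label] += 1
print(counts)  # both classes now at 100
```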

    Yes, I'm working with the FACT data, somewhat based on the work of Marius Helf ;) I'm asking in English, as everything here is in English and someone else might run into the same configuration.
  • MartinLiebig Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,533 RM Data Scientist
    Hi

    The problem is not the Sample but the Union. Union changes the meta data, and the labels may then be switched in their internal representation. You could put a Remap Binominals after the Union and before the x-val and map them by hand to the internal positive/negative values.

    If that does not work, try the simple Sample operator. There you can use "balance classes" and define the ratios for gamma and proton separately. This should do the trick :).
    Did you try using weights? That should work fine for a decision tree.
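    The weighting idea can be sketched in plain Python: give each example a weight inversely proportional to its class frequency, so both classes contribute the same total weight to training (an illustration of the principle, not RapidMiner's exact implementation):

```python
from collections import Counter

# Toy labels mirroring the gamma/proton imbalance.
labels = ["gamma"] * 100 + ["proton"] * 40

# Weight each example by total / (n_classes * class_count), so every
# class ends up with an equal total weight.
counts = Counter(labels)
n_classes = len(counts)
total = len(labels)
weights = [total / (n_classes * counts[y]) for y in labels]

# Verify: each class now contributes the same total weight.
per_class = Counter()
for y, w in zip(labels, weights):
    per_class[y] += w
print(dict(per_class))  # gamma and proton both sum to 70.0
```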

    Ohh, it's FACT :-). I looove the project. As you might know, I did my PhD on IceCube but was "a bit" involved in the data analysis of FACT (because of the coffee machine).
    Are you at Wolfgang's or Katharina's chair? In the physics department it is always useful to talk with Tim about these problems.


    Best,

    Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Hi Alexey, nice to see that my work is finally being reused. So investing in your scholarship finally pays off :)

    Good luck for your thesis and happy mining!
    ~Marius
  • Alexey Member Posts: 3 Contributor I
    Martin Schmitz wrote:

    The problem is not the Sample but the Union. Union changes the meta data, and the labels may then be switched in their internal representation. You could put a Remap Binominals after the Union and before the x-val and map them by hand to the internal positive/negative values.
    I've tried this trick, but this doesn't solve the problem.
    Martin Schmitz wrote:

    If that does not work, try the simple Sample operator. There you can use "balance classes" and define the ratios for gamma and proton separately. This should do the trick :).
    Did you try using weights? That should work fine for a decision tree.
    I've already used the "normal" sampling. Bootstrapping was just an idea for getting a comparable amount of proton data. There are about 100k gamma examples and 40k proton examples, and I was hoping to use more examples while keeping gamma and proton at the same level (50/50). But this seems to lead to problems, and I'm still not sure why it happens.
    Martin Schmitz wrote:

    Ohh, it's FACT :-). I looove the project. As you might know, I did my PhD on IceCube but was "a bit" involved in the data analysis of FACT (because of the coffee machine).
    Are you at Wolfgang's or Katharina's chair? In the physics department it is always useful to talk with Tim about these problems.
    I'm new on this project, but I've heard of it before and found it pretty amazing. :)
    Marius wrote:

    Hi Alexey, nice to see that my work is finally being reused. So investing in your scholarship finally pays off :)
    I'm not the first to reuse your work! ;) The scholarship was definitely a great help, though I wasn't able to manage an internship or anything similar. This is still not my thesis, just work. But who knows how it all ends up!