What are the definitions of profit and gain?
Best Answer
BalazsBarany
Hi Koichi,
you're right, this part of AutoModel isn't documented very well.
The cost matrix is applied like this: in the validation part, we know both the actual outcome and the predicted value for each example. These are compared, and the matching cell of the matrix is taken into account, for example 1 for True Not Worthy and Predicted Not Worthy. The cell values are then summed up. If you have negative values in the matrix, you could even get a negative total.
"Not having a model" is the situation before applying data mining, for example a company that simply executes all orders, even though some of them might be fraudulent. This is the baseline. (You wouldn't randomly guess in this situation and cancel one half of all orders.)
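The lookup and the baseline comparison described above can be sketched in Python. This is only an illustration, not Auto Model's actual implementation: the class names, the matrix values other than the 1 for Not Worthy / Not Worthy, and the function names `total_profit` and `gain_over_baseline` are assumptions made up for this example. The baseline is modelled as "treat every case as the default class", which corresponds to executing all orders.

```python
# Hypothetical cost matrix: keys are (actual, predicted), values are the
# profit (positive) or cost (negative) of that outcome. Values are made up,
# except the 1 for actual "not worthy" / predicted "not worthy".
COST_MATRIX = {
    ("worthy", "worthy"): 1.0,
    ("worthy", "not worthy"): -1.0,
    ("not worthy", "worthy"): -5.0,
    ("not worthy", "not worthy"): 1.0,
}


def total_profit(actual, predicted, cost_matrix=COST_MATRIX):
    """Sum the matrix cell selected by each (actual, predicted) pair."""
    return sum(cost_matrix[(a, p)] for a, p in zip(actual, predicted))


def gain_over_baseline(actual, predicted, default_class, cost_matrix=COST_MATRIX):
    """Profit with the model minus profit when every case gets default_class."""
    baseline = [default_class] * len(actual)
    with_model = total_profit(actual, predicted, cost_matrix)
    without_model = total_profit(actual, baseline, cost_matrix)
    return with_model - without_model


# Made-up validation results: known outcomes and the model's predictions.
actual = ["worthy", "worthy", "not worthy", "not worthy"]
predicted = ["worthy", "not worthy", "not worthy", "worthy"]

print(total_profit(actual, predicted))                  # -4.0
print(gain_over_baseline(actual, predicted, "worthy"))  # 4.0
```

In this made-up case the total is negative because of the negative cells, yet the gain is still positive: treating everything as "worthy" would have been even more expensive.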
Regards,
Balázs
Answers
You can enter a cost matrix when executing Auto Model.
Quite often, the (financial or other) consequences of correct and wrong classifications are different. For example, if you miss a fraud case, you lose 1000 $; if you falsely flag a good case as fraud, somebody has to look at the data (costing you, say, 2 $).
The false positives and negatives and also the correct predictions are weighted with this cost matrix.
The gain is compared to the baseline of not having a model.
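As a rough sketch of this weighting with the example costs above, the counts of a confusion matrix can be multiplied cell by cell with the cost matrix and then added up. The confusion matrix counts, the zero cost for correct predictions, and the names `weighted_total` and `baseline_total` are assumptions for illustration; this is not taken from RapidMiner itself.

```python
# Costs from the example above, as profit per case (negative = loss).
# Rows: actual class, columns: predicted class, both in the order good, fraud.
# Correct predictions are assumed to cost nothing here.
COSTS = [
    [0, -2],     # actual good:  predicted good, predicted fraud (manual check)
    [-1000, 0],  # actual fraud: predicted good (missed), predicted fraud
]

# Made-up confusion matrix counts with the same layout.
CONFUSION = [
    [950, 30],
    [5, 15],
]


def weighted_total(confusion, costs):
    """Multiply each count by the matching cost cell and add everything up."""
    return sum(
        count * cost
        for count_row, cost_row in zip(confusion, costs)
        for count, cost in zip(count_row, cost_row)
    )


def baseline_total(confusion, costs, default_index=0):
    """Total when every case is treated as the default class (no model)."""
    return sum(
        sum(count_row) * cost_row[default_index]
        for count_row, cost_row in zip(confusion, costs)
    )


with_model = weighted_total(CONFUSION, COSTS)
without_model = baseline_total(CONFUSION, COSTS)
print(with_model)                  # -5060
print(without_model)               # -20000
print(with_model - without_model)  # 14940
```

With these invented counts, the model still loses money on missed fraud and manual checks, but much less than accepting every case as good, so the gain over the baseline is positive.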
Best regards,
Balázs
OK. Here I can enter the cost matrix, right? I have just started to learn RapidMiner. It's a little confusing because in machine learning, "cost" usually refers to the cost function. There is no information about this cost matrix in the left panel, and I could not find anything about it in the RapidMiner documentation or tutorial videos.
>The false positives and negatives and also the correct predictions are weighted with this cost matrix.
Does this mean that the confusion matrix is multiplied by the cost matrix cell by cell?
>The gain is compared to the baseline of not having a model.
Is the baseline "Profits for Best Option"?
How does RapidMiner calculate "Profits for Best Option"? Is this the baseline of not having a model, for example random guessing?
Best Regards,
Koichi
Now I understand.
These illustrations should be included in the documentation.
Regards,
Koichi