Predicting rare event with Auto Model
niall_dempster
Member Posts: 1 Learner I
Hi,
I am trying to develop 2 models that predict relatively rare events (the F3/4 column and F4 column in attached file). I am a physician and not too familiar with machine learning so trying to get up to speed. I used Turbo Prep to impute missing data in the attached training/validation database and I have a separate independent Testing database that I would like to use once the models have been generated.
Initially using Auto Model, accuracy seemed to be prioritised (every case was predicted to be index 1, which was almost always correct since index 2 is infrequent). However, for this problem it is important to have a sensitive model so I am picking up cases of the rare event (index 2). Is it possible to optimise the AUC/Youden Index rather than accuracy?
So far I have tried adding in custom settings for costs and benefits, so that predicting range 1 where true range is 2 is penalised, and correctly predicting true range 2 is rewarded. Are there recommended numbers to add in for these costs/benefits?
Many thanks for your help
BW,
Niall
Answers
The costs and benefits are typically based on domain knowledge. Think of it like this: how much profit do you make for every correct prediction, and how much money do you lose for every incorrect one? You can then enter those exact values in the matrix.
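To see how a choice of costs and benefits translates into a decision rule, here is a minimal sketch in plain Python, outside RapidMiner. It is not what Auto Model does internally. The function name and the example values (a missed rare case costing 9 times as much as a false alarm) are hypothetical and only for illustration. It computes the predicted probability of the rare class above which predicting the rare class has the higher expected payoff.

```python
def cost_based_threshold(gain_tp, cost_fn, cost_fp, gain_tn):
    """Return the probability cut-off for predicting the rare class.

    All four arguments are non-negative magnitudes (hypothetical units):
      gain_tp: benefit of correctly predicting the rare class
      cost_fn: loss from predicting the common class when the rare one is true
      cost_fp: loss from predicting the rare class when the common one is true
      gain_tn: benefit of correctly predicting the common class

    Predicting the rare class pays off when
      p * gain_tp - (1 - p) * cost_fp > -p * cost_fn + (1 - p) * gain_tn,
    which rearranges to p > (cost_fp + gain_tn) / (sum of all four values).
    """
    denom = gain_tp + cost_fn + cost_fp + gain_tn
    if denom <= 0:
        raise ValueError("costs and gains must sum to a positive value")
    return (cost_fp + gain_tn) / denom


# Hypothetical example: missing a rare case is 9 times worse than a false alarm.
threshold = cost_based_threshold(gain_tp=0.0, cost_fn=9.0, cost_fp=1.0, gain_tn=0.0)
print(threshold)
```

The larger the penalty for missing a rare case relative to a false alarm, the lower the cut-off, so the model flags more cases as the rare class. Only the ratio between the values matters here, not their absolute size.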
Auto Model's main criterion is set to classification error. However, you can open the process of your best-performing model and change the main criterion. It is in the (4) - SCORING, VALIDATION, EXPLANATIONS, WEIGHTS & SIMULATOR section. There you can open the Validate Model sub-process and evaluate different options with the Performance operator.
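For reference, the Youden index mentioned in the question is sensitivity + specificity - 1. The sketch below, again plain Python rather than a RapidMiner process, shows one way to pick the confidence cut-off that maximises it on a validation set. The function name, the label coding (1 for the rare event, anything else for the common class) and the toy data are assumptions made for illustration.

```python
def youden_threshold(y_true, scores):
    """Return (threshold, J) maximising J = sensitivity + specificity - 1.

    y_true: labels, where 1 marks the rare event (assumed coding)
    scores: model confidence for the rare event, one per example
    An example is predicted as the rare event when its score >= threshold.
    """
    positives = sum(1 for y in y_true if y == 1)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one example of each class")

    best_t, best_j = None, -2.0  # J always lies in [-1, 1]
    for t in sorted(set(scores)):  # every observed score is a candidate
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= t)
        tn = sum(1 for y, s in zip(y_true, scores) if y != 1 and s < t)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:  # ties keep the lower threshold
            best_t, best_j = t, j
    return best_t, best_j


# Toy validation data (made up for illustration).
labels = [0, 0, 0, 1, 1]
confidences = [0.1, 0.2, 0.6, 0.5, 0.9]
print(youden_threshold(labels, confidences))
```

The cut-off should be chosen on the training/validation data and then applied unchanged to the independent testing data, so that the reported sensitivity is not optimistic.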
Harshit