Problem with overfitting
Hello,
I have a problem with overfitting.
It is a classification problem with 8 label values and 6 attributes, with about 5.5 million values each.
With 10-fold cross validation, my decision tree reaches an accuracy of about 93%. Unfortunately, when I apply the model to new data, the test accuracy is only 33%.
Can anyone tell me how to prevent overfitting on the training data?
I have chosen the following parameters for the decision tree:
criterion: information gain
maximum depth: 30
apply pruning: yes
confidence: 0.24
apply prepruning: yes
minimum gain: 0.0
minimum leaf size: 1
minimum size for split: 1
number of prepruning alternatives: 0
Greetings
Simon
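With a minimum gain of 0.0, a minimum leaf size of 1 and a minimum size for split of 1, prepruning hardly restricts the tree, so it can keep splitting until it isolates single examples. The following is only a minimal sketch of that effect, not RapidMiner's implementation: a toy tree on one numeric attribute with two labels and synthetic data with 20% label noise. The names `build`, `predict`, `make_data` and `accuracy` are hypothetical, and whether this effect explains the 93% versus 33% gap cannot be told from the post.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def build(points, min_leaf, max_depth=30, depth=0):
    """Grow a toy tree on (x, label) pairs sorted by x.

    Returns a label (leaf) or a tuple (threshold, left_subtree, right_subtree).
    min_leaf plays the role of the 'minimum leaf size' setting.
    """
    labels = [y for _, y in points]
    majority = Counter(labels).most_common(1)[0][0]
    if depth >= max_depth or len(set(labels)) == 1 or len(points) < 2 * min_leaf:
        return majority
    best = None  # (weighted child entropy, split index)
    for i in range(min_leaf, len(points) - min_leaf + 1):
        if points[i - 1][0] == points[i][0]:
            continue  # cannot split between identical attribute values
        left, right = labels[:i], labels[i:]
        score = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        if best is None or score < best[0]:  # lowest child entropy = highest information gain
            best = (score, i)
    if best is None:
        return majority
    i = best[1]
    threshold = (points[i - 1][0] + points[i][0]) / 2
    return (threshold,
            build(points[:i], min_leaf, max_depth, depth + 1),
            build(points[i:], min_leaf, max_depth, depth + 1))


def predict(tree, x):
    """Walk down the tree until a leaf label is reached."""
    while isinstance(tree, tuple):
        threshold, left, right = tree
        tree = left if x < threshold else right
    return tree


def accuracy(tree, data):
    return sum(predict(tree, x) == y for x, y in data) / len(data)


def make_data(n, rng, noise=0.2):
    """Synthetic data: label is 1 for x >= 50000, flipped with probability `noise`."""
    data = []
    for x in rng.sample(range(100_000), n):  # distinct attribute values
        label = 1 if x >= 50_000 else 0
        if rng.random() < noise:
            label = 1 - label
        data.append((x, label))
    return sorted(data)


if __name__ == "__main__":
    rng = random.Random(0)
    train, new_data = make_data(300, rng), make_data(300, rng)
    for min_leaf in (1, 30):
        tree = build(train, min_leaf=min_leaf)
        print(f"min leaf size {min_leaf}: "
              f"train accuracy {accuracy(tree, train):.2f}, "
              f"new-data accuracy {accuracy(tree, new_data):.2f}")
```

Comparing the two printed rows shows how the gap between training accuracy and accuracy on new data changes when the leaf size is raised. The values 1 and 30 are illustrative only, not recommended settings.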
Answers
Dortmund, Germany
I have now carried out the cross validation with a batch, but with the same result.
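The post does not say how the batch was defined. As a rough sketch of the idea, assuming each example carries a batch identifier and each distinct batch serves as one held-out fold, the split can be written as below. The function name `leave_one_batch_out` and the sample batch values are hypothetical.

```python
def leave_one_batch_out(batch_ids):
    """Yield (held_out_batch, train_indices, test_indices).

    Every example of the held-out batch goes to the test side, so no batch
    is shared between training and testing within a fold.
    """
    for held_out in sorted(set(batch_ids)):
        test = [i for i, b in enumerate(batch_ids) if b == held_out]
        train = [i for i, b in enumerate(batch_ids) if b != held_out]
        yield held_out, train, test


if __name__ == "__main__":
    # Hypothetical batch labels for six examples.
    batches = ["a", "a", "b", "c", "c", "c"]
    for held_out, train, test in leave_one_batch_out(batches):
        print(held_out, train, test)
```

If the batches do not reflect how the new data differs from the training data, this kind of split can report the same accuracy as a random split.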