"AdaBoost performance on new data (test dataset) MUCH worse than without AdaBoost"

miaque · December 2015

Hello,
I have the following problem:
I am working with a dataset for a digit-recognition classification problem.

The dataset has 64 regular attributes plus one class attribute. It contains nearly 5000 examples and is divided into a training set (30 digit-writers) and a test set (14 other, new writers).

For my study project I am required to use the meta-learning operators. The problem is that without the AdaBoost operator, accuracy is approx. 85% on the training set (X-Validation) and approx. 80% on the test set (new data). When I add AdaBoost, the X-Validation accuracy on the training set improves to approx. 90%, but accuracy on the new data becomes MUCH WORSE: only 20%!

Does anyone know what the issue could be here?

Thank you!

MartinLiebig · December 2015

Seems like you are overtraining, right?
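
One way to check for overtraining is to put the cross-validated accuracy and the held-out accuracy side by side, with and without boosting. The thread uses RapidMiner operators, so the following is only a rough scikit-learn sketch of the same comparison, not the original process. Everything in it is an assumption made for illustration: the bundled `load_digits` data (64 attributes, 10 classes) stands in for the poster's dataset, a random 70/30 split stands in for the split by writers, a depth-3 decision tree stands in for the unnamed base learner, and `cv_and_holdout_accuracy` is a hypothetical helper.

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier


def cv_and_holdout_accuracy(model, X_train, y_train, X_test, y_test, folds=5):
    """Return (mean cross-validated accuracy on the training set,
    accuracy on the held-out test set) for one model."""
    # cross_val_score clones the model for each fold, so the original stays unfitted here.
    cv_acc = cross_val_score(model, X_train, y_train, cv=folds).mean()
    # Refit on the full training set, then score once on data never seen in training.
    holdout_acc = model.fit(X_train, y_train).score(X_test, y_test)
    return cv_acc, holdout_acc


# Assumption: stand-in data, not the dataset from the post.
X, y = load_digits(return_X_y=True)
# Assumption: random stratified split; the post splits by writer instead.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Assumption: a shallow decision tree as the base learner.
base = DecisionTreeClassifier(max_depth=3, random_state=0)
# The base learner is passed positionally because its keyword name
# differs between scikit-learn versions.
boosted = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=3, random_state=0),
    n_estimators=50,
    random_state=0,
)

results = {}
for name, model in [("base tree", base), ("AdaBoost", boosted)]:
    results[name] = cv_and_holdout_accuracy(model, X_train, y_train, X_test, y_test)
    cv_acc, holdout_acc = results[name]
    print(f"{name}: CV accuracy = {cv_acc:.3f}, held-out accuracy = {holdout_acc:.3f}")
```

If the cross-validated accuracy rises with boosting while the held-out accuracy falls, that gap is the pattern described in the question. The numbers printed by this sketch say nothing about the poster's data; it only shows how the two estimates can be compared.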
