step by step optimization for multiclass classification

keb1811 · July 2020

Hello everyone,

I am working on a multiclass classification problem with several algorithms (decision tree, KNN, Naïve Bayes, SVM, NN) and I am trying to optimize my results. I want to do this step by step so that the improvement is visible at each stage. At first, I only use each algorithm on its own inside the cross validation operator. The next step should be the optimization with the grid operator (also inside the cross validation).
Now we come to my first problem:

I am not really sure which parameters I should choose in the grid optimization. For decision tree and KNN (Naïve Bayes has no parameters to set) I picked a few parameters and got better results, so that is fine for me.
But if I choose the following parameters for SVM, the process never finishes (it runs for many hours without producing a result):
- SVM.gamma 0-1; 10 steps; linear
- SVM.C 0-1; 10 steps; linear
- SVM.epsilon 0.001-1; 10 steps; linear

I get the same problem with my neural net algorithm:

- learning rate 0.01-0.4; 10 steps; logarithmic
- momentum 0.01-0.4; 10 steps; logarithmic

Is there anything wrong with these settings that keeps my process from finishing?

My next step to optimize my results is to use (in addition to the grid operator) the sample (balance) operator from the marketplace. I placed the operator before the cross validation. This operator upsamples my minority labels, so that the dataset is more balanced. My question here is:

Is it realistic that my recall and precision improve from around 35% to 75%? This is what happened for decision tree, KNN and Naïve Bayes.

So we come to my last question:

Is it a good idea to show the improvement process in this order:
1. Only each algorithm
2. Algorithm + grid
3. Algorithm + grid + sample (balance)
4. Algorithm + grid + sample (balance) + bagging/adaboost

Thank you very much.

Regards,

Kevin

Telcontar120 · July 2020

For your SVM model, I would recommend optimizing gamma and C over a much wider range (like from 0.001 through 1000) using logarithmic steps. Your ranges for both of them are way too small now and linear steps aren't good over orders of magnitude.
Conversely, for your neural net, your range is ok but you should be using linear steps over such a small range and not logarithmic ones.
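
To illustrate the difference between the two kinds of steps, here is a minimal Python sketch (not RapidMiner code). The helper names `linear_grid` and `log_grid` and the choice of 7 values are assumptions made for illustration only; how the grid operator spaces its values internally is not shown in this thread.

```python
import math


def linear_grid(low, high, n_values):
    """Return n_values evenly spaced points from low to high (inclusive)."""
    step = (high - low) / (n_values - 1)
    return [low + step * i for i in range(n_values)]


def log_grid(low, high, n_values):
    """Return n_values points evenly spaced in log10 space (low must be > 0)."""
    lo, hi = math.log10(low), math.log10(high)
    step = (hi - lo) / (n_values - 1)
    return [10 ** (lo + step * i) for i in range(n_values)]


# Hypothetical search range for an SVM parameter such as C or gamma.
c_linear = linear_grid(0.001, 1000, 7)
c_log = log_grid(0.001, 1000, 7)

print("linear:", [round(v, 3) for v in c_linear])
print("log:   ", [round(v, 3) for v in c_log])
```

With linear steps almost every candidate lands in the hundreds, so the small values are barely explored. With logarithmic steps each order of magnitude gets one candidate.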

keb1811 · July 2020

Hi @Telcontar120,

thanks for your answer. I changed the range and also the kind of steps (linear; log.). But my problem still exists: the process does not run to the end. My laptop has been calculating for more than 15 hours, but the optimize grid operator has only reached 1%. Do you have any recommendation?

Regards

Telcontar120 · July 2020

Without knowing the size and nature of your dataset it is hard to diagnose. Here are some suggestions:
- Try running without optimization first (just leave the defaults). Does your process run in a reasonable amount of time? If not, then you should sample down your dataset.
- Try optimizing a much smaller set of combinations. The Optimize grid operator will test the combined product of all your test values. So if you have 3 parameters to optimize and you are trying to search across 10 values of each one, then you are trying to test 1000 combinations. This is typically not feasible even for large machines. Instead, focus on a smaller number of steps and try to keep the total number of combinations down to a reasonable number.
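
The arithmetic behind the second suggestion can be sketched in a few lines of Python (not RapidMiner code). The parameter names, the candidate values and the 10-fold cross validation are assumptions for illustration; the actual fold count of the process is not stated in this thread.

```python
import math


def count_combinations(grid):
    """Number of parameter combinations a full grid search would test."""
    return math.prod(len(values) for values in grid.values())


def count_model_fits(grid, folds):
    """Each combination is trained once per cross-validation fold."""
    return count_combinations(grid) * folds


# Hypothetical grids: 3 parameters x 10 values versus 2 parameters x 4 values.
full_grid = {
    "gamma": list(range(10)),
    "C": list(range(10)),
    "epsilon": list(range(10)),
}
small_grid = {
    "gamma": [0.001, 0.1, 10, 1000],
    "C": [0.001, 0.1, 10, 1000],
}

FOLDS = 10  # assumed number of cross-validation folds

print(count_combinations(full_grid), count_model_fits(full_grid, FOLDS))
print(count_combinations(small_grid), count_model_fits(small_grid, FOLDS))
```

Under these assumptions the full grid needs 10000 model trainings while the small one needs 160, which is why a process that is already slow with default settings may never finish a large grid.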

keb1811 · July 2020

Thanks for your answer, I will try a smaller set of combinations.

Altair RapidMiner Community