Optimized parameters for SVM Linear
Hi friends,
Does anyone know the best way to optimize the C and epsilon parameters of a linear SVM model (libsvm)? I'm classifying about 3000 docs into 11 categories.
Is there a relationship to the number of features, docs, or categories that would give me a starting point? I'm getting an X-Validation accuracy of 70.59% and I want to improve it.
Another question: how is the accuracy value in the performance vector result calculated? Is there a relationship with the precision and recall of the classes at each X-Validation step? Does it use the F1 score?
Many thanks in advance!
Answers
The accuracy is defined as (correctly_classified_examples)/(all_examples), i.e. the ratio of correctly classified examples or, in other words, the probability that an unseen example is classified correctly by the model. (The latter is only true if the class ratio in the test set is the same as in the new examples.)
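To make the definition above concrete, here is a minimal Python sketch that computes it by hand. The function name and the five example labels are made up for illustration; this is not RapidMiner's own code.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of examples whose predicted label equals the true label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("need at least one example")
    # Count positions where the prediction matches the true label.
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Hypothetical labels for five documents (illustration only).
true_labels = ["sports", "politics", "sports", "tech", "tech"]
predicted_labels = ["sports", "sports", "sports", "tech", "politics"]

# Three of the five predictions match, so the ratio is 3/5.
example_accuracy = accuracy(true_labels, predicted_labels)
```

Note that only the count of correct predictions enters the ratio; per-class precision and recall are not part of this formula.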
Concerning the SVM: there is no general rule of thumb for good parameters, but with the Parameter Optimization (Grid) operator you can easily optimize them. For the C value, a good starting point is a range from about 10^-4 to 10^4 on a logarithmic scale. I also suggest trying the radial (RBF) kernel. In that case the gamma parameter must be optimized as well. Try the same value range as for C.
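Outside of RapidMiner, the grid idea described above could be sketched in plain Python roughly as follows. The helper names (`log_grid`, `grid_search`) and the `toy_evaluate` function are assumptions for illustration; in practice `evaluate` would train the SVM with the given C and gamma and return its cross-validated accuracy.

```python
import itertools
import math


def log_grid(min_exp, max_exp):
    """Powers of ten from 10**min_exp to 10**max_exp, inclusive."""
    return [10.0 ** e for e in range(min_exp, max_exp + 1)]


def grid_search(evaluate, c_values, gamma_values):
    """Try every (C, gamma) pair; return the best pair and its score."""
    best_params, best_score = None, float("-inf")
    for c, gamma in itertools.product(c_values, gamma_values):
        score = evaluate(c, gamma)
        if score > best_score:
            best_params, best_score = (c, gamma), score
    return best_params, best_score


# Nine values each, from 10^-4 to 10^4 on a logarithmic scale.
c_values = log_grid(-4, 4)
gamma_values = log_grid(-4, 4)


def toy_evaluate(c, gamma):
    # Placeholder only: stands in for a cross-validated accuracy.
    # It is constructed to peak at C = 10 and gamma = 0.01.
    return -abs(math.log10(c) - 1) - abs(math.log10(gamma) + 2)


best_params, best_score = grid_search(toy_evaluate, c_values, gamma_values)
```

With 9 values per parameter this evaluates 81 combinations, each of which would be a full cross-validation run, so a coarse grid first and a finer grid around the best region afterwards is a common way to keep the cost down.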
Best,
Marius
One more question: my classes are very unbalanced. In your opinion, does this have any influence on my model's accuracy?
Thanks a lot!!
Keep that in mind when you talk about accuracies.
For better comparability, you could change the class ratio to 1:1 in your training set.
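One possible way to get equal class sizes is random undersampling: keep only as many examples per class as the smallest class has. The Python sketch below assumes that approach, which is just one option among several; the function name and the toy data are hypothetical.

```python
import random
from collections import Counter, defaultdict


def balance_by_undersampling(examples, labels, seed=0):
    """Return (example, label) pairs with every class cut to the smallest class size."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for example, label in zip(examples, labels):
        by_class[label].append(example)
    # Every class is reduced to the size of the rarest class.
    target = min(len(items) for items in by_class.values())
    balanced = []
    for label, items in by_class.items():
        for example in rng.sample(items, target):
            balanced.append((example, label))
    rng.shuffle(balanced)
    return balanced


# Toy training set (illustration only): 7 docs of class "a", 3 of class "b".
docs = [f"doc{i}" for i in range(10)]
labels = ["a"] * 7 + ["b"] * 3

balanced = balance_by_undersampling(docs, labels)
class_counts = Counter(label for _, label in balanced)
```

Apply this to the training set only, so that the test set keeps the original class ratio.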
Best,
Marius