Finding good classifier
Legacy User
Hi,
I want to use a machine learning-based classification
algorithm to generate a classifier. The training set
was gathered for supervised learning and consists of about
600 examples, each of which has 30 features and a binary
label (the example either belongs to the class or not).
Which classification algorithm would yield the most accurate
results (in terms of minimal classification error) in
my case?
Support Vector Machines seem to be a good choice, but I
would like to hear your opinion.
If this question cannot be answered in general, I would
appreciate any hints (also papers, websites, books) on how a
professional data miner proceeds in finding a good
algorithm.
Regards,
Stephan
Answers
One cannot say which algorithm will yield the best performance without knowing the distribution behind the data. But if you knew the distribution, you wouldn't need a learner at all...
With two classes, SVMs seem to be a good tool, since they can cover a wide range of model types depending on the chosen kernel. They can produce linear or quadratic decision boundaries with a linear or (degree-2) polynomial kernel, or implicitly model the densities using RBF kernels.
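To make the effect of the kernel concrete, here is a minimal sketch in plain Python. It does not train an SVM: it swaps in a kernel perceptron, a much simpler learner that uses kernels in the same way, so the influence of the kernel on the decision boundary is easy to see. The XOR toy data, the function names, and the parameter values (`degree=2`, `coef0=1.0`, `gamma=1.0`) are assumptions chosen for illustration, not anything from the question.

```python
import math

def linear_kernel(x, z):
    # Plain dot product; note there is no bias term in this sketch.
    return sum(a * b for a, b in zip(x, z))

def polynomial_kernel(x, z, degree=2, coef0=1.0):
    # Degree 2 corresponds to quadratic decision boundaries.
    return (linear_kernel(x, z) + coef0) ** degree

def rbf_kernel(x, z, gamma=1.0):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def decision_score(X, y, alpha, kernel, x):
    # Weighted sum of kernel evaluations against the training examples.
    return sum(a * yj * kernel(xj, x) for a, xj, yj in zip(alpha, X, y))

def train_kernel_perceptron(X, y, kernel, epochs=1000):
    """Return dual coefficients alpha; labels in y must be +1 or -1."""
    alpha = [0] * len(X)
    for _ in range(epochs):
        mistakes = 0
        for i, (xi, yi) in enumerate(zip(X, y)):
            if yi * decision_score(X, y, alpha, kernel, xi) <= 0:
                alpha[i] += 1  # misclassified: give this example more weight
                mistakes += 1
        if mistakes == 0:  # every training example is classified correctly
            break
    return alpha

def training_accuracy(X, y, kernel):
    alpha = train_kernel_perceptron(X, y, kernel)
    hits = sum(
        (1 if decision_score(X, y, alpha, kernel, xi) > 0 else -1) == yi
        for xi, yi in zip(X, y)
    )
    return hits / len(X)

# XOR: a tiny data set that no linear decision boundary can separate.
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
y = [-1, 1, 1, -1]

for name, kernel in [("linear", linear_kernel),
                     ("polynomial", polynomial_kernel),
                     ("rbf", rbf_kernel)]:
    print(name, training_accuracy(X, y, kernel))
```

On this toy data the linear kernel cannot fit all four points, while the polynomial and RBF kernels can. Training accuracy is used here only to show what each kernel is able to represent; it says nothing about how well a model would generalize to new data.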
Unfortunately, how professional data miners work things out cannot be described briefly. If it could, they would no longer be professionals, but unemployed...
Choosing the right algorithm(s) is more of a guided search, in which previous results are taken into account on the path to the one best process. This search is guided especially by experience and a deep understanding of the algorithms themselves.
But now to the good news:
RapidMiner is designed to make it really fast to change processes and exchange learners. You might even do this automatically with OperatorSelection and ParameterOptimization. If you "play" a little with your processes, you will gain your own experience with the algorithms...
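Outside of RapidMiner, the same idea of exchanging learners and trying parameter values automatically can be sketched in a few lines of plain Python. This is not how the RapidMiner operators are implemented; it is only an illustration of the principle: evaluate every candidate with k-fold cross-validation and keep the one with the best estimated accuracy. The toy data generator, the two simple learners (nearest centroid and k-nearest neighbours), and all names below are hypothetical stand-ins.

```python
import random
from collections import Counter

def make_toy_data(n=200, n_features=5, seed=0):
    """Hypothetical stand-in data: two Gaussian blobs labeled 0 and 1."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([rng.gauss(1.5 * label, 1.0) for _ in range(n_features)])
        y.append(label)
    return X, y

def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def knn_learner(k):
    """Return a fit function; calling fit returns a predict function."""
    def fit(X_train, y_train):
        def predict(x):
            nearest = sorted(zip(X_train, y_train),
                             key=lambda pair: sq_dist(pair[0], x))[:k]
            return Counter(label for _, label in nearest).most_common(1)[0][0]
        return predict
    return fit

def centroid_learner():
    def fit(X_train, y_train):
        centroids = {}
        for label in set(y_train):
            rows = [x for x, t in zip(X_train, y_train) if t == label]
            centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        def predict(x):
            return min(centroids, key=lambda label: sq_dist(centroids[label], x))
        return predict
    return fit

def cross_val_accuracy(fit, X, y, folds=5, seed=0):
    """Fraction of examples predicted correctly while held out of training."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    correct = 0
    for f in range(folds):
        test_idx = set(indices[f::folds])  # every index lands in exactly one fold
        X_train = [X[i] for i in indices if i not in test_idx]
        y_train = [y[i] for i in indices if i not in test_idx]
        predict = fit(X_train, y_train)
        correct += sum(predict(X[i]) == y[i] for i in test_idx)
    return correct / len(X)

X, y = make_toy_data()

# Candidate "processes": one learner without parameters, one with several k values.
candidates = {"nearest centroid": centroid_learner()}
for k in (1, 3, 5, 9):
    candidates[f"k-NN (k={k})"] = knn_learner(k)

results = {name: cross_val_accuracy(fit, X, y) for name, fit in candidates.items()}
best = max(results, key=results.get)

for name, acc in sorted(results.items(), key=lambda item: -item[1]):
    print(f"{name}: {acc:.3f}")
print("best:", best)
```

One caveat on this design: the winning score is slightly optimistic, because the same data was used to choose the winner. With about 600 examples it is usually worth setting aside a separate test set, or nesting a second cross-validation around the selection, before reporting a final error estimate.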
Greetings,
Sebastian