Need reference for Optimize Parameters (Evolutionary) [SOLVED]
Hi
I made a model with SVM and Optimize Parameters (Evolutionary) and the results are good, so I decided to publish it, but I could not find any reference for Optimize Parameters (Evolutionary). I do not know which paper was used to develop this operator. If anyone has information about this, please tell me.
Answers
Actually, we don't have more detailed documentation for the guts of that operator. Unfortunately, the only answer I can give you right now is to look at the code - that's also what I would have to do...
Best regards,
Marius
I am new to RapidMiner and I do not know how to trace code. Please tell me how I can trace the code during a run, or where to find the Optimize Parameters (Evolutionary) source code, so I can get my answers.
With best regards
You can cite Thomas Bäck (although he is currently in Leiden):
http://arnetminer.org/person/thomas-back-1509429.html
Alternatively, you can cite his close friend Gusz Eiben:
http://www.cs.vu.nl/~gusz/ecbook/ecbook.html
Or, if the code is CMA-ES rather than ES, you can cite Nikolaus Hansen.
https://www.lri.fr/~hansen/cmsa-versus-cma.html
Look at the three links; you are guaranteed to find a good paper there.
To find the code of a certain operator, open OperatorsDoc.xml, search for the operator name, and then search for the respective key in Operators.xml, which will point you to the underlying Java class.
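If you prefer not to search the XML by hand, a small script can do the lookup. This is just a sketch assuming the layout described above (an Operators.xml that maps an operator key to its implementing Java class); the file path, tag names, and the operator key below are placeholders, so adjust them to your installation:

```python
# Sketch: look up the Java class implementing a RapidMiner operator.
# Assumes Operators.xml contains <operator> entries with <key> and <class>
# children -- adjust the tag names and path to match your RapidMiner version.
import xml.etree.ElementTree as ET

def find_operator_class(operators_xml_path, operator_key):
    """Return the Java class registered for the given operator key, or None."""
    tree = ET.parse(operators_xml_path)
    for op in tree.getroot().iter("operator"):
        if op.findtext("key") == operator_key:
            return op.findtext("class")
    return None

if __name__ == "__main__":
    # Hypothetical path and key -- replace with your own.
    print(find_operator_class("Operators.xml", "optimize_parameters_evolutionary"))
```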
Best regards,
Marius
This is an SVM implementation using an evolution strategy (ES) to solve the dual optimization problem of an SVM. It turns out that on many datasets this simple implementation is as fast and accurate as the usual SVM implementations. In addition, it is capable of learning with kernels which are not positive semi-definite, and it can also be used for multi-objective learning, which makes the selection of C before learning unnecessary.
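For reference, the dual problem referred to here is presumably the standard soft-margin SVM dual, which also matches the formula discussed further down in this thread:

```latex
\max_{a}\; W(a) \;=\; \sum_{i=1}^{n} a_i
  \;-\; \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n}
        y_i \, y_j \, a_i \, a_j \, k(x_i, x_j)
\qquad \text{subject to} \quad 0 \le a_i \le C
```

(In the textbook derivation with a bias term there is additionally the constraint \sum_i a_i y_i = 0.)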
Mierswa, Ingo. "Evolutionary Learning with Kernels: A Generic Solution for Large Margin Problems." In Proceedings of the 8th Annual Conference on Genetic and Evolutionary Computation (GECCO '06), pages 1553-1560, ACM, New York, NY, USA, 2006. ISBN 1-59593-186-4, doi:10.1145/1143997.1144249.
http://dl.acm.org/citation.cfm?id=1144249
http://wing2.ddns.comp.nus.edu.sg/downloads/keyphraseCorpus/89/89.pdf
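This is not the actual EvoSVM code, but a minimal sketch of the idea from the paper: a simple (mu+lambda)-style evolution strategy maximizing the dual objective W(a) under the box constraint 0 <= a_i <= C. All function names and parameter values here are illustrative choices, not RapidMiner's:

```python
# Minimal sketch (not RapidMiner's EvoSVM): maximize the SVM dual
#   W(a) = sum_i a_i - 1/2 * sum_ij y_i y_j a_i a_j k(x_i, x_j)
# under 0 <= a_i <= C, using a simple (mu + lambda) evolution strategy.
import numpy as np

def rbf_kernel(X, gamma=1.0):
    # Gram matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2)
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def dual_objective(a, y, K):
    # W(a) = sum_i a_i - 1/2 * sum_ij y_i y_j a_i a_j K[i, j]
    Q = (y[:, None] * y[None, :]) * K
    return a.sum() - 0.5 * a @ Q @ a

def es_maximize(y, K, C=1.0, mu=5, lam=20, sigma=0.1, generations=200, seed=0):
    # (mu + lambda) ES with Gaussian mutation; offspring are clipped back
    # into the feasible box [0, C]^n to honor the constraint.
    rng = np.random.default_rng(seed)
    n = len(y)
    parents = rng.uniform(0.0, C, size=(mu, n))  # mu random feasible parents
    for _ in range(generations):
        idx = rng.integers(0, mu, size=lam)      # each offspring mutates a parent
        offspring = parents[idx] + rng.normal(0.0, sigma * C, size=(lam, n))
        offspring = np.clip(offspring, 0.0, C)
        pool = np.vstack([parents, offspring])
        fitness = np.array([dual_objective(a, y, K) for a in pool])
        parents = pool[np.argsort(fitness)[-mu:]]  # keep the best mu
    best = parents[-1]  # parents are sorted ascending by fitness
    return best, dual_objective(best, y, K)

if __name__ == "__main__":
    # Tiny toy problem: two Gaussian blobs with labels -1 / +1.
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(-1, 0.3, (10, 2)), rng.normal(1, 0.3, (10, 2))])
    y = np.array([-1.0] * 10 + [1.0] * 10)
    a, w = es_maximize(y, rbf_kernel(X))
    print("best dual value:", round(w, 4),
          "| support vectors:", int((a > 1e-3).sum()))
```

Clipping mutated offspring back into [0, C]^n is the simplest way to handle the box constraint; the actual operator may handle constraints (and the equality constraint, if used) differently.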
http://i.snag.gy/MlUz8.jpg
As far as I understand:
a: constrained real values
n: number of support vectors (a)
y: labels (with values -1 and 1)
k(.,.): a kernel function
So how is n chosen?
Is this maybe the number of data points?
So now I should be able to understand fully what this formula does.
We have a double loop, so we get all possible combinations of two data points in our data set.
y_i * y_j gets a value of 1 when both data points are of the same class and a value of -1 when they are of different classes
a_i * a_j * k(x_i, x_j) also evaluates to a scalar value
Since we are maximizing, and the double sum enters with a minus sign, large values of a_i * a_j * k(x_i, x_j) are penalized when the two points are of the same class (y_i * y_j = 1) and rewarded when they are of different classes (y_i * y_j = -1).
k(x_i, x_j) can maybe be interpreted as a notion of similarity between the two data points, where very similar pairs give a large positive value and very dissimilar pairs a small (or, for some kernels, negative) value.
It is pretty clear to me that I don't fully understand what is going on here.
For me, using an ES to find the a's that maximize this formula is trivial, but why maximizing this formula optimizes the margin is unclear to me.
After the paper mentions "Wolfe dual" I'm lost, but I would like to understand!
As far as I remember, n is the number of examples in the dataset. You are right about your assumptions for the other symbols.
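A concrete two-point case makes the sign behavior visible. This is just an illustrative sketch, assuming a kernel with k(x_i, x_i) = 1 (as for an RBF kernel) and two points of the same class (y_1 = y_2 = 1):

```latex
W(a) = a_1 + a_2
     - \tfrac{1}{2}\left( a_1^2 + 2\, a_1 a_2 \, k(x_1, x_2) + a_2^2 \right)
```

The larger the similarity k(x_1, x_2) between the two same-class points, the smaller the total weight a_1 + a_2 the maximizer assigns: for k(x_1, x_2) = 1 (identical points) the optimum shares a single unit of weight between them, while for k(x_1, x_2) = 0 each point gets a full unit of its own.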
Hastie, Tibshirani and Friedman's The Elements of Statistical Learning contains a good introduction to and mathematical derivation of the SVM and the formula you cite: http://www-stat.stanford.edu/~tibs/ElemStatLearn/index.html
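In a nutshell - a sketch of the idea, not the full derivation - the "Wolfe dual" arises by starting from the soft-margin primal problem, forming the Lagrangian, and eliminating the primal variables:

```latex
% Soft-margin primal problem:
\min_{w, b, \xi}\ \tfrac{1}{2}\lVert w \rVert^2 + C \sum_i \xi_i
\quad \text{s.t.}\quad y_i (w^\top x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0

% Lagrangian with multipliers a_i >= 0 and mu_i >= 0:
L = \tfrac{1}{2}\lVert w \rVert^2 + C \sum_i \xi_i
  - \sum_i a_i \left[ y_i (w^\top x_i + b) - 1 + \xi_i \right]
  - \sum_i \mu_i \xi_i

% Setting the derivatives w.r.t. w, b, and xi to zero yields
%   w = \sum_i a_i y_i x_i,   \sum_i a_i y_i = 0,   a_i = C - \mu_i
% (hence 0 <= a_i <= C); substituting back eliminates w, b, xi
% and leaves the dual:
\max_{a}\ \sum_i a_i
  - \tfrac{1}{2} \sum_{i,j} y_i y_j a_i a_j \, x_i^\top x_j
\quad \text{s.t.}\quad 0 \le a_i \le C, \quad \sum_i a_i y_i = 0

% Replacing the inner product x_i^T x_j by k(x_i, x_j) gives the
% kernelized formula discussed above.
```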
Best regards,
Marius
I appreciate your help.
With best regards
Ali Kavian