"Question about prediction values and applying weights from results"
To begin with, I am very new to data mining and RapidMiner. I have been experimenting with the software for several months now, and it has certainly proven to be a fantastic and powerful learning and exploration tool. Thank you for your wonderful efforts on this software application!
I do have a couple of questions, though, about how to accurately replicate the prediction values I am receiving. I am performing both classification and regression tests on labelled data. Overall, I am running experiments where I:
1. Optimize the set of attributes (either through PCA or a Genetic/Evolutionary "Optimize Selection"). I typically normalize the resulting weights to either 0 or 1 and pass the attributes with a weight of 1 on to the next processing step.
2. Run the same data set, restricted to the "selected" attributes, through the same learner used inside "Optimize Selection" (typically an SVM) to obtain the weights/model for the selected attributes.
3. Apply these weights/this model to a new set of unseen data with just the selected attributes and obtain the performance of the weighted model on that data (a rough sketch of these steps follows below).
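For concreteness, the following is a rough scikit-learn analogue of the three steps above. It is only a sketch under assumptions, not the actual RapidMiner process, and all data, weights, and parameters are hypothetical placeholders.

import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 10))     # labelled training data
y_train = X_train[:, 0] * 2.0 + rng.normal(scale=0.1, size=100)
X_unseen = rng.normal(size=(20, 10))     # new, unseen data

# Step 1: pretend these 0/1 weights came out of "Optimize Selection"
weights = np.array([1, 0, 1, 0, 0, 1, 0, 0, 0, 0])
selected = weights == 1

# Step 2: train the same learner on the selected attributes only
model = SVR(kernel="rbf").fit(X_train[:, selected], y_train)

# Step 3: apply the model to the unseen data, same attributes selected
predictions = model.predict(X_unseen[:, selected])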
When the test is complete, I view the data set, which displays the selected attributes along with the label value and the predicted value. I also view the weights of the selected attributes. In an effort to replicate the predicted value, I basically perform matrix multiplication of the transposed weight matrix with the attribute-value matrix. However, the values I obtain when I do this are usually nowhere near the predicted value which is displayed. I do this for both the classification and regression problems. I also realize there are often biases associated with the various learners, which I add to/subtract from the calculated values I obtain, but these values are still not near the predicted values. It seems like this should be pretty straightforward, but I know I am definitely missing something.
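A minimal sketch of that replication, assuming a purely linear model (e.g. Linear Regression or an SVM with a linear kernel) and hypothetical numbers:

import numpy as np

# Hypothetical coefficients and bias exported from a linear model.
w = np.array([0.8, -1.2, 0.5])   # learned attribute weights
b = 0.3                          # bias / offset term
x = np.array([1.0, 2.0, -0.5])   # one example's attribute values

score = float(w @ x) + b         # regression prediction: w^T x + b
# For a binominal label, the sign of the score picks the class:
label = "first" if score >= 0 else "second"

Note that this reproduces the displayed prediction only when the model really is linear; it also will not work if the weights being multiplied are the importance scores from the feature-selection step rather than the learned model's coefficients.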
Is there anyone who might be able to explain how to apply the weights obtained from the various learners in order to obtain accurate prediction values, especially for binominal classification?
Thanks in advance for anyone able to assist!
David
Answers
I follow what you are doing up to... Could you post the XML of your process so that I can see what you mean?
PS. The "Create Formula" operator can handle binominal SVMs, like this...
I realize now that in my post I tried to sound a lot more educated about data mining than I really am. Sorry about that!
However, you are definitely correct that I want to obtain some formula for getting the predicted values seen in RapidMiner. I added the "Create Formula" operator to my process and had it write out the formula for an SVM regression model. The resulting file listed a different formula, with different attribute coefficients, for every single instance I was testing/evaluating! The file was huge. Is this correct? Simply put, I am assuming the SVM regression model is very similar to a Linear Regression model, where there is a single formula: you plug the attribute values into the formula, multiply them by the corresponding coefficients, add some offset ... and there is your predicted value. Is this not the case?
Below is some of the XML for the process. I didn't include everything because there is a lot. Thanks again for your assistance!
The SVM regression model only collapses to a single formula similar to Linear Regression if you use the linear kernel. Otherwise the prediction depends on the kernel matrix, and hence the factors change with each test example. If they didn't, the SVM wouldn't be so much more flexible than Linear Regression, would it?
Greetings,
Sebastian
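To make this concrete, here is a sketch of how a kernel SVM computes its prediction. The support vectors, dual coefficients, and bias below are hypothetical, and the kernel shown is a common RBF choice rather than anything taken from the process above.

import numpy as np

def rbf_kernel(a, b, gamma=0.5):
    # Gaussian / RBF kernel: exp(-gamma * ||a - b||^2)
    return np.exp(-gamma * np.sum((a - b) ** 2))

# Hypothetical support vectors, signed dual coefficients, and bias.
support_vectors = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
dual_coefs = np.array([0.7, -0.2, -0.5])
bias = 0.1

x_new = np.array([0.5, 0.5])     # one test example

# Kernel SVM prediction: f(x) = sum_i alpha_i * K(sv_i, x) + bias.
# The kernel terms are re-evaluated for every x, which is why
# "Create Formula" writes out a different expansion per instance.
prediction = sum(c * rbf_kernel(sv, x_new)
                 for c, sv in zip(dual_coefs, support_vectors)) + bias

# With a linear kernel K(a, b) = a . b, this collapses to w @ x + bias
# with w = dual_coefs @ support_vectors -- the single-formula case.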