Using an SVM Within a Stacked Model...
Colleagues: Being relatively new to the RM Platform, I've been happily experimenting with various techniques. I created a model that uses Stacking, which worked great until I added an SVM to the Stacking Operator (either as part of the "Model Stack" or as the "Learner"). Now the process throws an error just before completion and stops. The log error messages (all of them severe) and the Excel source data for the process are contained in the attached file "Error_Messages_and_Source_Data.zip", which can be inflated with either WinZip or WinRAR.
The Process I designed is in the attached .rmp file.
When I take the SVM out of the Stacking Operator and replace it with another Operator (such as Deep Learning), everything works great again - so I am led to conclude that either one should simply not use an SVM as part of a Stacking Model, or there is another step that needs to be taken given the requirements / architecture of the SVM Operator within a Stacking Model.
Thanks for considering this and pointing me in the right direction. Best wishes, Michael
Best Answer
M_Martin
Hello Martin:
You're absolutely right - I can only guess that I must have made an error having to do with Grouping Models correctly. Plus I needed to apply your tip re: converting the prediction Nominals on the "Base Learners" side of the Stacked Model to Numericals prior to feeding everything through to the SVM on the "Learner" side of the Stacked Model.
I reconstructed everything from the start, making sure to Group Models very carefully and apply the above mentioned tip from you, and all works as expected. ;-)
My sincere thanks for your patience and advice, very much appreciated.
As far as I'm concerned, it looks like we can close this issue.
Attached is the test version I just put together and tested, which works fine.
All the best - best regards, Michael
Answers
Hey,
I've checked your process and I agree, something is strange there. I cannot track the error down. I will create an internal ticket for our developers to have a look.
Best,
Martin
Dortmund, Germany
Hi Martin,
Why would stacking, or any ensemble method for that matter, be beneficial with SVMs? At least with the datasets I use, I have never seen an improvement in accuracy. How would you approach this?
Many thanks,
Alex
Dear Alex,
I can see some reasoning in combining various learners with a linear SVM. A linear SVM does not catch that many interaction terms, so patterns that a Decision Tree can recognize are invisible to the SVM.
In the end you could argue that the base learners act like a kernel function for the SVM. I am not sure whether it will boost your performance in practice, but I would not exclude it from my list of choices.
~Martin
Dortmund, Germany
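To make that idea concrete, here is a rough scikit-learn sketch (not RapidMiner) of stacking with a linear SVM on top: the base learners' class probabilities become the feature space the SVM sees, so the base models act a bit like a learned kernel / feature map. The dataset and all parameter choices below are illustrative only.
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier, RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score
# synthetic data standing in for the real example set
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
# base learners produce class probabilities; the linear SVM learns on top of them
stack = StackingClassifier(
    estimators=[("nb", GaussianNB()),
                ("rf", RandomForestClassifier(n_estimators=100, random_state=0))],
    final_estimator=SVC(kernel="linear"),
    stack_method="predict_proba",  # feed numeric probabilities, not nominal predictions
)
print(cross_val_score(stack, X, y, cv=5).mean())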
Thanks Martin,
I have looked at the example and I see how it works now. What would you consider to be valid choices for the stacking model learner? Would it always be Naive Bayes?
Alex
Hi Martin,
Never mind, I found the article that you wrote on the subject. Thanks!
Alex
Hello Martin:
Many thanks for your message - I noticed that the SVM Learner within the "Stacking" Model appears to output a Performance Vector (the "est" port going out of the SVM - please see the attached screenshot). Perhaps this is the Performance Measurement that the log error message(s) refer to.
As you can see, I did experiment with various Operators that could perhaps deliver this Performance Vector later on in the Process.
Best regards, Michael
Dear Michael,
Gladly! The Java stack trace you can see in rapidminer-studio.log is showing something Optimize-related. This is somewhat confusing. I will keep you posted when I know more.
BR,
Martin
Dortmund, Germany
Dear Martin:
Many thanks once again - I will wait to hear from you.
Best regards,
Michael
Did you try it with the LibSVM operator? Maybe we can pinpoint the error more easily if we know that it works with another implementation.
Hello - and thanks for your suggestion.
I tried using the operator you suggested, but I still get the same error message I received using the standard SVM Operator:
"The operator expects the inner process to deliver a performance value".
If you have any other suggestions, my thanks, and I'd be happy to try them.
Best wishes, Michael
Hi, and thanks for your suggestion, which I tried. I still get the same error message.
I did some further experimenting and got some interesting results using the standard SVM Operator, and the Operator you suggested I try.
I'll talk you through the attached screenshots:
1. "Revised Model Overview.png" - I took the Optimize Parameters Grid out of the process and moved the "Cross Validation" operator to the top level of the process, following the "Generate weights...." Operator. I tried this because the error logging (attached in my initial post on this topic) suggests that the problem may have had something to do with the "Optimize Parameters - Grid" Operator.
2. "Revised Model Detail Nr 1.png" - a view of what is inside the "Cross Validation" Operator.
3. "Revised Model Detail Nr 2.png" - a view of what is inside the "Stacking" Operator (see the above screenshot).
4. "Revised Model Detail Nr 3_Error_Message.png" - this is the error message I now get when running the process - which is different from the error message I received when the "Optimize Parameters" operator was part of the process (see my initial post).
5. "Revised Model Detail Nr 4_Data_View.png" - this is what the data looks like prior to entering the "Stacking" Operator (i.e. right after the PCA Operator shown in "Revised Model Detail Nr 1.png"). "creditworthy" is the Label field, and it is binomial.
6. "Revised Model Detail Nr 4_Statistics_View.png". We see that all values in the data are numeric, with the exception of "creditworthy", which is binomial.
7. "SVM Operator Information.png" - from the RM Documentation, which states that the SVM Operator can handle numeric attributes and a binomial Label. It would therefore seem that the SVM operator should be able to handle the data.
8. "Customer_Credit_Risk_Solution_Nr_2_MM.rmp" - the revised process file. The data the process uses was included in my initial post on this topic.
My investigations seem to indicate that using an SVM Operator within a "Stacking" Operator can be problematic, but perhaps I am missing a further step. Thanks for considering this; I would be happy to try other suggestions.
Best wishes, Michael
Ha, I think I got it.
Try this:
~Martin
Dortmund, Germany
Thanks, Martin. The Process does work, despite many Warnings (exclamation marks). ;-)
I am able to write the Logging to Excel as long as I use the "Recall" Operator as the last step in the Process (see "Process Flow" screenshot). Writing to Excel right after "Log to Data" causes the Process to fail.
Interesting that you need to use a "Split Validation" if the Stacked Model includes an SVM. This has implications regarding data selection for training.
Also interesting how another conversion is required before the SVM sees the data on the "Learner" side of the "Stacking" Operator. Can you explain why this is needed, as I may need to try this in other situations?
I don't seem to be able to output the Predictions the model generates to a "res" port. I am only able to see training cases before the Model generates Predictions. Is there anything you would suggest I try?
Best wishes, Michael
Dear Michael,
To explain the issue a bit: Stacking adds the predictions of the base learners to the attribute set of the stacked learner. Thus you had 4 additional nominal attributes - baseprediction0, baseprediction1, and so on. An SVM cannot handle nominal attributes and fails.
The bug is that the stack trace is wrong.
For the other points we need to have a closer look. It's Friday evening for me, so I most likely won't get to it before Monday morning.
Dortmund, Germany
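As a quick illustration of that point (scikit-learn rather than RapidMiner; the data below is made up): a plain SVM rejects the nominal baseprediction columns, but trains fine once they are encoded as numbers - which is what the "Nominal to Numerical" step inside the stacking learner takes care of.
import pandas as pd
from sklearn.svm import SVC
# made-up stacking attributes: the base learners' predicted classes (nominal)
base_preds = pd.DataFrame({"baseprediction0": ["yes", "no", "yes", "no"],
                           "baseprediction1": ["no", "no", "yes", "yes"]})
label = pd.Series(["yes", "no", "yes", "no"])
try:
    SVC().fit(base_preds, label)                  # nominal attributes -> the SVM fails
except ValueError as err:
    print("SVM rejects nominal attributes:", err)
# after dummy-encoding (the analogue of "Nominal to Numerical") the SVM trains fine
SVC().fit(pd.get_dummies(base_preds), label)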
Many thanks, Martin, for the explanation - it makes total sense.
Have a nice weekend! ;-)
Best regards,
Michael
Hi Martin:
I took the "Optimize Parameters - Grid" Operator out of the Process you sent me (see below) and replaced the "Split Validation" operator from your process with the "Cross Validation" Operator - and now all works as expected.
So it seems that the problem I originally ran into has to do with using the "Optimize Parameters - Grid" operator when an "SVM" Operator is on the training or learning side of a "Stacking" Operator within a "Cross Validation" Operator.
As there are a number of parameters worth monitoring and logging in the Stacked Model, it would be interesting if it were somehow possible to use "Optimize Parameters" together with a "Cross Validation" Operator (to ensure all data is used for training and testing).
Your suggested Process makes it possible to use "Optimize Parameters" if you use a "Split Validation" Operator - my example (below) allows for the "Cross Validation" Operator, but not "Optimize Parameters".
Best wishes, Michael
<operator activated="true" class="process" compatibility="7.5.001" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="read_excel" compatibility="7.5.001" expanded="true" height="68" name="Read Excel" width="90" x="130" y="34">
<parameter key="excel_file" value="C:\Data\Rapid_Miner_Training\Lab_Assignments\Customer_Credit_Risk_Data.xlsx"/>
<parameter key="imported_cell_range" value="A1:V989"/>
<parameter key="first_row_as_names" value="false"/>
<list key="annotations">
<parameter key="0" value="Name"/>
</list>
<list key="data_set_meta_data_information">
<parameter key="0" value="foreignworker.true.polynominal.attribute"/>
<parameter key="1" value="status.true.polynominal.attribute"/>
<parameter key="2" value="credithistory.true.polynominal.attribute"/>
<parameter key="3" value="purpose.true.polynominal.attribute"/>
<parameter key="4" value="savings.true.polynominal.attribute"/>
<parameter key="5" value="employmentsince.true.polynominal.attribute"/>
<parameter key="6" value="otherdebtors.true.polynominal.attribute"/>
<parameter key="7" value="property.true.polynominal.attribute"/>
<parameter key="8" value="otherinstallments.true.polynominal.attribute"/>
<parameter key="9" value="housing.true.polynominal.attribute"/>
<parameter key="10" value="job.true.polynominal.attribute"/>
<parameter key="11" value="phone.true.polynominal.attribute"/>
<parameter key="12" value="duration.true.integer.attribute"/>
<parameter key="13" value="creditamount.true.integer.attribute"/>
<parameter key="14" value="installmentrate.true.integer.attribute"/>
<parameter key="15" value="residencesince.true.integer.attribute"/>
<parameter key="16" value="age.true.integer.attribute"/>
<parameter key="17" value="numberofexsistingcredits.true.integer.attribute"/>
<parameter key="18" value="numberofliablepeople.true.integer.attribute"/>
<parameter key="19" value="gender.true.polynominal.attribute"/>
<parameter key="20" value="creditworthy.true.polynominal.attribute"/>
<parameter key="21" value="creditamout_per_month.true.numeric.attribute"/>
</list>
</operator>
<operator activated="true" class="set_role" compatibility="7.5.001" expanded="true" height="82" name="Set Role" width="90" x="286" y="34">
<parameter key="attribute_name" value="creditworthy"/>
<parameter key="target_role" value="label"/>
<list key="set_additional_roles"/>
</operator>
<operator activated="true" class="nominal_to_binominal" compatibility="7.5.001" expanded="true" height="103" name="Nominal to Binominal" width="90" x="483" y="36">
<parameter key="attribute_filter_type" value="single"/>
<parameter key="attribute" value="creditworthy"/>
<parameter key="include_special_attributes" value="true"/>
</operator>
<operator activated="true" class="nominal_to_numerical" compatibility="7.5.001" expanded="true" height="103" name="Nom to Numeric" width="90" x="663" y="34">
<parameter key="attribute_filter_type" value="subset"/>
<parameter key="attributes" value="credithistory|employmentsince|foreignworker|gender|housing|job|otherdebtors|otherinstallments|phone|property|purpose|savings|status"/>
<list key="comparison_groups"/>
</operator>
<operator activated="true" class="extract_macro" compatibility="7.5.001" expanded="true" height="68" name="Extract Macro" width="90" x="835" y="48">
<parameter key="macro" value="num_Examples"/>
<list key="additional_macros"/>
</operator>
<operator activated="true" class="generate_weight_stratification" compatibility="7.5.001" expanded="true" height="82" name="Generate Weight (Stratification)" width="90" x="1012" y="42">
<parameter key="total_weight" value="%{num_Examples}"/>
</operator>
<operator activated="true" class="concurrency:cross_validation" compatibility="7.5.001" expanded="true" height="145" name="Cross Validation" width="90" x="1196" y="42">
<process expanded="true">
<operator activated="true" class="normalize" compatibility="7.5.001" expanded="true" height="103" name="z Score Normalize" width="90" x="121" y="186">
<parameter key="attribute_filter_type" value="subset"/>
<parameter key="attributes" value="age|creditamount|creditamout_per_month|duration"/>
</operator>
<operator activated="true" class="principal_component_analysis" compatibility="7.5.001" expanded="true" height="103" name="PCA 32 Attributes" width="90" x="261" y="34">
<parameter key="dimensionality_reduction" value="fixed number"/>
<parameter key="variance_threshold" value="0.9"/>
<parameter key="number_of_components" value="32"/>
</operator>
<operator activated="true" class="stacking" compatibility="7.5.001" expanded="true" height="68" name="Stacked Models with SVM" width="90" x="426" y="347">
<process expanded="true">
<operator activated="true" class="naive_bayes" compatibility="7.5.001" expanded="true" height="82" name="Stacking NB" width="90" x="382" y="177"/>
<operator activated="true" class="adaboost" compatibility="7.5.001" expanded="true" height="82" name="Ada Boost DT" width="90" x="379" y="36">
<parameter key="iterations" value="12"/>
<process expanded="true">
<operator activated="true" class="concurrency:parallel_decision_tree" compatibility="7.5.001" expanded="true" height="82" name="Stacking DT" width="90" x="707" y="75">
<parameter key="maximal_depth" value="15"/>
<parameter key="minimal_gain" value="0.02"/>
<parameter key="minimal_leaf_size" value="6"/>
</operator>
<connect from_port="training set" to_op="Stacking DT" to_port="training set"/>
<connect from_op="Stacking DT" from_port="model" to_port="model"/>
<portSpacing port="source_training set" spacing="0"/>
<portSpacing port="sink_model" spacing="0"/>
</process>
</operator>
<operator activated="true" class="adaboost" compatibility="7.5.001" expanded="true" height="82" name="Ada Boost KNN" width="90" x="375" y="307">
<parameter key="iterations" value="12"/>
<process expanded="true">
<operator activated="true" class="k_nn" compatibility="7.5.001" expanded="true" height="82" name="Stacking KNN" width="90" x="564" y="182">
<parameter key="k" value="10"/>
<parameter key="weighted_vote" value="true"/>
</operator>
<connect from_port="training set" to_op="Stacking KNN" to_port="training set"/>
<connect from_op="Stacking KNN" from_port="model" to_port="model"/>
<portSpacing port="source_training set" spacing="0"/>
<portSpacing port="sink_model" spacing="0"/>
</process>
</operator>
<operator activated="true" class="adaboost" compatibility="7.5.001" expanded="true" height="82" name="Ada Boost Log Regr" width="90" x="366" y="457">
<parameter key="iterations" value="12"/>
<process expanded="true">
<operator activated="true" class="h2o:logistic_regression" compatibility="7.5.000" expanded="true" height="103" name="Stacking Log Regr" width="90" x="543" y="139"/>
<connect from_port="training set" to_op="Stacking Log Regr" to_port="training set"/>
<connect from_op="Stacking Log Regr" from_port="model" to_port="model"/>
<portSpacing port="source_training set" spacing="0"/>
<portSpacing port="sink_model" spacing="0"/>
</process>
</operator>
<connect from_port="training set 1" to_op="Stacking NB" to_port="training set"/>
<connect from_port="training set 2" to_op="Ada Boost DT" to_port="training set"/>
<connect from_port="training set 3" to_op="Ada Boost KNN" to_port="training set"/>
<connect from_port="training set 4" to_op="Ada Boost Log Regr" to_port="training set"/>
<connect from_op="Stacking NB" from_port="model" to_port="base model 2"/>
<connect from_op="Ada Boost DT" from_port="model" to_port="base model 1"/>
<connect from_op="Ada Boost KNN" from_port="model" to_port="base model 3"/>
<connect from_op="Ada Boost Log Regr" from_port="model" to_port="base model 4"/>
<portSpacing port="source_training set 1" spacing="0"/>
<portSpacing port="source_training set 2" spacing="0"/>
<portSpacing port="source_training set 3" spacing="0"/>
<portSpacing port="source_training set 4" spacing="0"/>
<portSpacing port="source_training set 5" spacing="0"/>
<portSpacing port="sink_base model 1" spacing="0"/>
<portSpacing port="sink_base model 2" spacing="0"/>
<portSpacing port="sink_base model 3" spacing="0"/>
<portSpacing port="sink_base model 4" spacing="0"/>
<portSpacing port="sink_base model 5" spacing="0"/>
</process>
<process expanded="true">
<operator activated="true" class="nominal_to_numerical" compatibility="7.5.001" expanded="true" height="103" name="Nominal to Numerical" width="90" x="88" y="210">
<list key="comparison_groups"/>
</operator>
<operator activated="true" class="support_vector_machine" compatibility="7.5.001" expanded="true" height="124" name="SVM" width="90" x="288" y="66">
<parameter key="kernel_type" value="polynomial"/>
<parameter key="C" value="5.0"/>
</operator>
<operator activated="true" class="group_models" compatibility="7.5.001" expanded="true" height="103" name="Group Models Learning Side" width="90" x="524" y="239"/>
<connect from_port="stacking examples" to_op="Nominal to Numerical" to_port="example set input"/>
<connect from_op="Nominal to Numerical" from_port="example set output" to_op="SVM" to_port="training set"/>
<connect from_op="Nominal to Numerical" from_port="preprocessing model" to_op="Group Models Learning Side" to_port="models in 1"/>
<connect from_op="SVM" from_port="model" to_op="Group Models Learning Side" to_port="models in 2"/>
<connect from_op="Group Models Learning Side" from_port="model out" to_port="stacking model"/>
<portSpacing port="source_stacking examples" spacing="0"/>
<portSpacing port="sink_stacking model" spacing="0"/>
</process>
</operator>
<operator activated="true" class="group_models" compatibility="7.5.001" expanded="true" height="124" name="Group Models" width="90" x="624" y="145"/>
<connect from_port="training set" to_op="z Score Normalize" to_port="example set input"/>
<connect from_op="z Score Normalize" from_port="example set output" to_op="PCA 32 Attributes" to_port="example set input"/>
<connect from_op="z Score Normalize" from_port="preprocessing model" to_op="Group Models" to_port="models in 1"/>
<connect from_op="PCA 32 Attributes" from_port="example set output" to_op="Stacked Models with SVM" to_port="training set"/>
<connect from_op="PCA 32 Attributes" from_port="preprocessing model" to_op="Group Models" to_port="models in 2"/>
<connect from_op="Stacked Models with SVM" from_port="model" to_op="Group Models" to_port="models in 3"/>
<connect from_op="Group Models" from_port="model out" to_port="model"/>
<portSpacing port="source_training set" spacing="0"/>
<portSpacing port="sink_model" spacing="0"/>
<portSpacing port="sink_through 1" spacing="0"/>
</process>
<process expanded="true">
<operator activated="true" class="apply_model" compatibility="7.5.001" expanded="true" height="82" name="Apply Stack Model" width="90" x="198" y="34">
<list key="application_parameters"/>
</operator>
<operator activated="true" class="performance_binominal_classification" compatibility="7.5.001" expanded="true" height="82" name="Stacking Perf." width="90" x="427" y="72">
<parameter key="AUC" value="true"/>
<parameter key="precision" value="true"/>
<parameter key="recall" value="true"/>
<parameter key="false_positive" value="true"/>
<parameter key="false_negative" value="true"/>
<parameter key="true_positive" value="true"/>
<parameter key="true_negative" value="true"/>
<parameter key="sensitivity" value="true"/>
</operator>
<connect from_port="model" to_op="Apply Stack Model" to_port="model"/>
<connect from_port="test set" to_op="Apply Stack Model" to_port="unlabelled data"/>
<connect from_op="Apply Stack Model" from_port="labelled data" to_op="Stacking Perf." to_port="labelled data"/>
<connect from_op="Stacking Perf." from_port="performance" to_port="performance 1"/>
<connect from_op="Stacking Perf." from_port="example set" to_port="test set results"/>
<portSpacing port="source_model" spacing="0"/>
<portSpacing port="source_test set" spacing="0"/>
<portSpacing port="source_through 1" spacing="0"/>
<portSpacing port="sink_test set results" spacing="0"/>
<portSpacing port="sink_performance 1" spacing="0"/>
<portSpacing port="sink_performance 2" spacing="0"/>
</process>
</operator>
<connect from_op="Read Excel" from_port="output" to_op="Set Role" to_port="example set input"/>
<connect from_op="Set Role" from_port="example set output" to_op="Nominal to Binominal" to_port="example set input"/>
<connect from_op="Nominal to Binominal" from_port="example set output" to_op="Nom to Numeric" to_port="example set input"/>
<connect from_op="Nom to Numeric" from_port="example set output" to_op="Extract Macro" to_port="example set"/>
<connect from_op="Extract Macro" from_port="example set" to_op="Generate Weight (Stratification)" to_port="example set input"/>
<connect from_op="Generate Weight (Stratification)" from_port="example set output" to_op="Cross Validation" to_port="example set"/>
<connect from_op="Cross Validation" from_port="model" to_port="result 1"/>
<connect from_op="Cross Validation" from_port="test result set" to_port="result 2"/>
<connect from_op="Cross Validation" from_port="performance 1" to_port="result 3"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
<portSpacing port="sink_result 4" spacing="0"/>
</process>
</operator>
</process>
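For the combination asked about above - parameter optimization inside a cross validation, so that all data is still used for training and testing - here is a rough scikit-learn analogue (not RapidMiner): a grid search over the meta-SVM's parameters nested inside an outer cross validation. Learners, grids and data are illustrative placeholders only.
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV, cross_val_score
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
# stacked model with an SVM as the meta-learner
stack = StackingClassifier(
    estimators=[("nb", GaussianNB()),
                ("dt", DecisionTreeClassifier(max_depth=5, random_state=0))],
    final_estimator=SVC(),
)
# inner loop: grid search over the meta-SVM's parameters
grid = GridSearchCV(stack,
                    param_grid={"final_estimator__C": [1, 5],
                                "final_estimator__kernel": ["linear", "poly"]},
                    cv=3)
# outer loop: cross validation around the whole optimization, so every example
# is used for both training and testing at some point
print(cross_val_score(grid, X, y, cv=5).mean())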
Dear Michael,
I've switched to X-Validation, added a Performance operator and pass it out at the end. Seems to work fine?
Best,
Martin
Dortmund, Germany