[Solved] How to read out multiple performance parameters within optimization
Hi,
In the attached process I have two different performance operators which run inside an "optimization" operator.
Now I am looking for a way to get the result of the whole process (both performance values of the optimized/best model) into an example set. This seems a little tricky: a kind of collection is returned, but I cannot use the Append operator because the type is "per" (performance vector) and not "exa" (example set). (Nor is it possible to log within the optimization operator, as that would only return the last run of the SVM and not the optimized/best one.)
Would appreciate any ideas...
Best regards
Sachs
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.3.000">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.3.000" expanded="true" name="Process">
<process expanded="true" height="314" width="413">
<operator activated="true" class="generate_data" compatibility="5.3.000" expanded="true" height="60" name="Generate Data" width="90" x="45" y="30"/>
<operator activated="true" class="optimize_parameters_grid" compatibility="5.3.000" expanded="true" height="94" name="Optimize Parameters (Grid)" width="90" x="179" y="30">
<list key="parameters">
<parameter key="SVM (Linear).C" value="1,100"/>
</list>
<process expanded="true" height="388" width="711">
<operator activated="true" class="series:sliding_window_validation" compatibility="5.3.000" expanded="true" height="112" name="Validation" width="90" x="45" y="30">
<parameter key="training_window_width" value="10"/>
<parameter key="test_window_width" value="10"/>
<process expanded="true" height="388" width="330">
<operator activated="true" class="support_vector_machine_linear" compatibility="5.3.000" expanded="true" height="76" name="SVM (Linear)" width="90" x="45" y="30">
<parameter key="C" value="100"/>
</operator>
<connect from_port="training" to_op="SVM (Linear)" to_port="training set"/>
<connect from_op="SVM (Linear)" from_port="model" to_port="model"/>
<portSpacing port="source_training" spacing="0"/>
<portSpacing port="sink_model" spacing="0"/>
<portSpacing port="sink_through 1" spacing="0"/>
</process>
<process expanded="true" height="388" width="480">
<operator activated="true" class="apply_model" compatibility="5.3.000" expanded="true" height="76" name="Apply Model" width="90" x="45" y="30">
<list key="application_parameters"/>
</operator>
<operator activated="true" class="series:forecasting_performance" compatibility="5.3.000" expanded="true" height="76" name="Performance" width="90" x="179" y="30">
<parameter key="horizon" value="1"/>
</operator>
<operator activated="true" class="performance_regression" compatibility="5.3.000" expanded="true" height="76" name="Performance (2)" width="90" x="313" y="30">
<parameter key="root_mean_squared_error" value="false"/>
<parameter key="absolute_error" value="true"/>
</operator>
<connect from_port="model" to_op="Apply Model" to_port="model"/>
<connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
<connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
<connect from_op="Performance" from_port="performance" to_op="Performance (2)" to_port="performance"/>
<connect from_op="Performance" from_port="example set" to_op="Performance (2)" to_port="labelled data"/>
<connect from_op="Performance (2)" from_port="performance" to_port="averagable 1"/>
<portSpacing port="source_model" spacing="0"/>
<portSpacing port="source_test set" spacing="0"/>
<portSpacing port="source_through 1" spacing="0"/>
<portSpacing port="sink_averagable 1" spacing="0"/>
<portSpacing port="sink_averagable 2" spacing="0"/>
</process>
</operator>
<connect from_port="input 1" to_op="Validation" to_port="training"/>
<connect from_op="Validation" from_port="averagable 1" to_port="performance"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="sink_performance" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
</process>
</operator>
<connect from_op="Generate Data" from_port="output" to_op="Optimize Parameters (Grid)" to_port="input 1"/>
<connect from_op="Optimize Parameters (Grid)" from_port="performance" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>
Answers
Just log performance1 and performance2 of the Validation operator. Please have a look at the attached process.
The results of both performance vectors (the output of a Performance operator) are concatenated. In the final results view you can see the performance: the first entry is the prediction_trend_accuracy, the second one is the absolute_error. If you had a third measure, it would be performance3 of the Validation.
By logging the performanceX values of the validation, you actually get the performance of the complete validation, not just of the last fold.
If anything is unclear, please let us know.
Best regards,
Marius
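Marius's attached process is not reproduced in the thread, but a minimal sketch of the Log setup he describes could look like the snippet below: a Log operator placed inside the Optimize Parameters (Grid) subprocess, wired in after the Validation operator. The column names, the value identifiers (performance1, performance2) and the through-port wiring are assumptions and should be checked against the Log operator's own drop-down lists.
<operator activated="true" class="log" compatibility="5.3.000" expanded="true" name="Log">
  <list key="log">
    <parameter key="C" value="operator.SVM (Linear).parameter.C"/>
    <parameter key="trend_accuracy" value="operator.Validation.value.performance1"/>
    <parameter key="absolute_error" value="operator.Validation.value.performance2"/>
  </list>
</operator>
<connect from_op="Validation" from_port="averagable 1" to_op="Log" to_port="through 1"/>
<connect from_op="Log" from_port="through 1" to_port="performance"/>
With such a setup, each row of the log table holds one tested C value together with both performance measures of the complete Validation run.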
Hi Marius,
Probably my description was not ideal. When I take your process, I receive two rows in the log table, corresponding to the two optimization values set (in this example C=1 and C=100). The model resulting from the "optimization" operator will be built with the "C" value that gives the better performance. But there are two different performance criteria, and "prediction trend" is better when C=1 while "absolute_error" is better when C=100.
So which C value was taken for the final model, and what is the performance of this final model?
Thank you very much & kind regards
Sachs
If you simply want to know the parameter combination which resulted in the best performance, you should connect the "par" output of the optimization operator to the process output. You can even store that output in the repository with Store and apply it to another learning operator with the Set Parameters operator. That prevents you from having to type all those potentially long numbers when optimizing more than one parameter.
Best regards,
Marius
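For illustration, a sketch of the storing and re-applying idea Marius mentions; the repository path and the operator/port names below are assumptions and need to be adapted to your own processes. The parameter set output of Optimize Parameters (Grid) goes into a Store operator, and a later process retrieves the stored entry and feeds it to Set Parameters.
<operator activated="true" class="store" compatibility="5.3.000" expanded="true" name="Store">
  <parameter key="repository_entry" value="//LocalRepository/optimal_parameters"/>
</operator>
<connect from_op="Optimize Parameters (Grid)" from_port="parameter" to_op="Store" to_port="input"/>
<!-- in a later process -->
<operator activated="true" class="retrieve" compatibility="5.3.000" expanded="true" name="Retrieve">
  <parameter key="repository_entry" value="//LocalRepository/optimal_parameters"/>
</operator>
<operator activated="true" class="set_parameters" compatibility="5.3.000" expanded="true" name="Set Parameters">
  <list key="name_map">
    <parameter key="SVM (Linear)" value="SVM (Linear)"/>
  </list>
</operator>
<connect from_op="Retrieve" from_port="output" to_op="Set Parameters" to_port="parameter set"/>
The name_map list maps the operator names recorded in the stored parameter set to the operator names in the process where the parameters should be applied.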
Forgive my ignorance: it is probably easy to solve, but I don't see how to do it... I tried to use the Store operator, but it throws this error:
Cannot store data in repository at entry "//LocalRepository/store". Reason: Cannot store data at entry "C:\users\USERNAME\Documents\Rapidminer\store.ioo": java.io.NotSerializableException: com.rapidminer.operator.performance.ForecastingPerformanceEvaluator.
By the way, would it also be possible to log the "par" output? That would be quite handy for me, even better than using the Store operator.
Thanks again for your support!
Sachs
Logging the parameters object is not possible out of the box, but maybe you can convert it to an example set with the help of the Execute Script operator. For assistance with using that operator, please download the document How to Extend RapidMiner from our website.
Best regards,
Marius
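A rough sketch of this direction (everything below is an assumption, not part of Marius's post): the parameter set output could be wired into an Execute Script operator, whose Groovy script reads the ParameterSet from input[0] and builds a two-column (parameter, value) example set using the ExampleSet construction API described in How to Extend RapidMiner 5. The placeholder script here only passes the object through; the actual conversion still has to be written.
<operator activated="true" class="execute_script" compatibility="5.3.000" expanded="true" name="Execute Script">
  <parameter key="script" value="/* Groovy: read the ParameterSet from input[0] and build a two-column (parameter, value) ExampleSet with MemoryExampleTable and DataRowFactory; this placeholder just passes the object through. */ return input[0];"/>
</operator>
<connect from_op="Optimize Parameters (Grid)" from_port="parameter" to_op="Execute Script" to_port="input 1"/>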
I had a brief look at the document "How to Extend RapidMiner 5", which looks really promising. I will take some time to work through it... let's see.
Best regards
Sachs