Runtime exception: mismatched criterion class when running gridsearch on list of perf parameters
Hi
I'm getting the following runtime error when I try to run an "Optimize Parameters (Grid)" operator with Performance.main_criterion selected as the parameter to grid-search over. The grid search produces this error as soon as it has completed the first run with one parameter setting.
Can anyone shed light on this?
Example: if the first run specifies "root_mean_squared_error", and the second run specifies "absolute_error", then I get the following exception:
- Exception: java.lang.RuntimeException
- Message: java.lang.RuntimeException: Mismatched criterion class:class com.rapidminer.operator.performance.AbsoluteError, class com.rapidminer.operator.performance.RootMeanSquaredError
- Stack trace:
- sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
- sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
- sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
- java.lang.reflect.Constructor.newInstance(Constructor.java:423)
- java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:593)
- java.util.concurrent.ForkJoinTask.get(ForkJoinTask.java:1005)
- com.rapidminer.studio.concurrency.internal.StudioConcurrencyContext.collectResults(StudioConcurrencyContext.java:212)
- com.rapidminer.studio.concurrency.internal.StudioConcurrencyContext.call(StudioConcurrencyContext.java:156)
- com.rapidminer.extension.concurrency.execution.BackgroundExecutionService.executeOperatorTasks(BackgroundExecutionService.java:393)
- com.rapidminer.extension.concurrency.operator.process_control.loops.AbstractLoopOperator.performParallelLoop(AbstractLoopOperator.java:248)
- com.rapidminer.extension.concurrency.operator.process_control.loops.AbstractLoopOperator.doWork(AbstractLoopOperator.java:418)
- com.rapidminer.operator.Operator.execute(Operator.java:1004)
- com.rapidminer.operator.execution.SimpleUnitExecutor.execute(SimpleUnitExecutor.java:77)
- com.rapidminer.operator.ExecutionUnit$3.run(ExecutionUnit.java:812)
- com.rapidminer.operator.ExecutionUnit$3.run(ExecutionUnit.java:807)
- java.security.AccessController.doPrivileged(Native Method)
- com.rapidminer.operator.ExecutionUnit.execute(ExecutionUnit.java:807)
- com.rapidminer.operator.OperatorChain.doWork(OperatorChain.java:428)
- com.rapidminer.operator.Operator.execute(Operator.java:1004)
- com.rapidminer.Process.execute(Process.java:1310)
- com.rapidminer.Process.run(Process.java:1285)
- com.rapidminer.Process.run(Process.java:1176)
- com.rapidminer.Process.run(Process.java:1129)
- com.rapidminer.Process.run(Process.java:1124)
- com.rapidminer.Process.run(Process.java:1114)
- com.rapidminer.gui.ProcessThread.run(ProcessThread.java:65)
- Cause
- Exception: java.lang.RuntimeException
- Message: Mismatched criterion class:class com.rapidminer.operator.performance.AbsoluteError, class com.rapidminer.operator.performance.RootMeanSquaredError
- Stack trace:
- com.rapidminer.operator.performance.PerformanceCriterion.compareTo(PerformanceCriterion.java:104)
- com.rapidminer.operator.performance.PerformanceVector$DefaultComparator.compare(PerformanceVector.java:56)
- com.rapidminer.operator.performance.PerformanceVector.compareTo(PerformanceVector.java:140)
- com.rapidminer.extension.concurrency.operator.optimization.parameters.OptimizeGridOperator.processSingleRun(OptimizeGridOperator.java:163)
- com.rapidminer.extension.concurrency.operator.optimization.parameters.OptimizeGridOperator.processSingleRun(OptimizeGridOperator.java:41)
- com.rapidminer.extension.concurrency.operator.process_control.loops.AbstractLoopOperator.lambda$performParallelLoop$1(AbstractLoopOperator.java:241)
- com.rapidminer.extension.concurrency.execution.BackgroundExecutionService$ExecutionCallable.call(BackgroundExecutionService.java:357)
- java.util.concurrent.ForkJoinTask$AdaptedCallable.exec(ForkJoinTask.java:1424)
- java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
- java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
- java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
- java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
The setup is like this:
<?xml version="1.0" encoding="UTF-8"?><process version="8.0.001">
<operator activated="true" class="retrieve" compatibility="8.0.001" expanded="true" height="68" name="Retrieve" width="90" x="112" y="34">
<parameter key="repository_entry" value="../data/OliTestSetIntraRTG"/>
</operator>
</process>
<?xml version="1.0" encoding="UTF-8"?><process version="8.0.001">
<operator activated="true" class="concurrency:optimize_parameters_grid" compatibility="8.0.001" expanded="true" height="124" name="Optimize Parameters (Grid)" width="90" x="581" y="493">
<list key="parameters">
<parameter key="Performance (2).main_criterion" value="root_mean_squared_error,absolute_error"/>
</list>
<parameter key="error_handling" value="fail on error"/>
<parameter key="log_performance" value="true"/>
<parameter key="log_all_criteria" value="true"/>
<parameter key="synchronize" value="false"/>
<parameter key="enable_parallel_execution" value="true"/>
<process expanded="true">
<operator activated="true" class="split_data" compatibility="8.0.001" expanded="true" height="103" name="Split Data" width="90" x="179" y="340">
<enumeration key="partitions">
<parameter key="ratio" value="0.5"/>
<parameter key="ratio" value="0.5"/>
</enumeration>
<parameter key="sampling_type" value="automatic"/>
<parameter key="use_local_random_seed" value="false"/>
<parameter key="local_random_seed" value="1992"/>
</operator>
<operator activated="true" class="h2o:gradient_boosted_trees" compatibility="7.6.001" expanded="true" height="103" name="Gradient Boosted Trees (2)" width="90" x="380" y="136">
<parameter key="number_of_trees" value="9"/>
<parameter key="reproducible" value="false"/>
<parameter key="maximum_number_of_threads" value="4"/>
<parameter key="use_local_random_seed" value="false"/>
<parameter key="local_random_seed" value="1992"/>
<parameter key="maximal_depth" value="4"/>
<parameter key="min_rows" value="20.0"/>
<parameter key="min_split_improvement" value="0.0"/>
<parameter key="number_of_bins" value="20"/>
<parameter key="learning_rate" value="0.1"/>
<parameter key="sample_rate" value="1.0"/>
<parameter key="distribution" value="AUTO"/>
<parameter key="early_stopping" value="false"/>
<parameter key="stopping_rounds" value="1"/>
<parameter key="stopping_metric" value="AUTO"/>
<parameter key="stopping_tolerance" value="0.001"/>
<parameter key="max_runtime_seconds" value="0"/>
<list key="expert_parameters"/>
</operator>
<operator activated="true" class="apply_model" compatibility="8.0.001" expanded="true" height="82" name="Apply Model" width="90" x="581" y="238">
<list key="application_parameters"/>
<parameter key="create_view" value="false"/>
</operator>
<operator activated="true" class="performance_regression" compatibility="8.0.001" expanded="true" height="82" name="Performance (2)" width="90" x="715" y="34">
<parameter key="main_criterion" value="absolute_error"/>
<parameter key="root_mean_squared_error" value="true"/>
<parameter key="absolute_error" value="true"/>
<parameter key="relative_error" value="true"/>
<parameter key="relative_error_lenient" value="true"/>
<parameter key="relative_error_strict" value="false"/>
<parameter key="normalized_absolute_error" value="true"/>
<parameter key="root_relative_squared_error" value="true"/>
<parameter key="squared_error" value="true"/>
<parameter key="correlation" value="true"/>
<parameter key="squared_correlation" value="true"/>
<parameter key="prediction_average" value="false"/>
<parameter key="spearman_rho" value="false"/>
<parameter key="kendall_tau" value="false"/>
<parameter key="skip_undefined_labels" value="true"/>
<parameter key="use_example_weights" value="false"/>
</operator>
<connect from_port="input 1" to_op="Split Data" to_port="example set"/>
<connect from_op="Split Data" from_port="partition 1" to_op="Gradient Boosted Trees (2)" to_port="training set"/>
<connect from_op="Split Data" from_port="partition 2" to_op="Apply Model" to_port="unlabelled data"/>
<connect from_op="Gradient Boosted Trees (2)" from_port="model" to_op="Apply Model" to_port="model"/>
<connect from_op="Apply Model" from_port="labelled data" to_op="Performance (2)" to_port="labelled data"/>
<connect from_op="Apply Model" from_port="model" to_port="model"/>
<connect from_op="Performance (2)" from_port="performance" to_port="performance"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="sink_performance" spacing="0"/>
<portSpacing port="sink_model" spacing="0"/>
<portSpacing port="sink_output 1" spacing="0"/>
</process>
</operator>
</process>
<?xml version="1.0" encoding="UTF-8"?><process version="8.0.001">
<operator activated="true" class="store" compatibility="8.0.001" expanded="true" height="68" name="Store" width="90" x="849" y="187">
<parameter key="repository_entry" value="../data/RForest1"/>
</operator>
</process>
Comments
Hi,
this is a known issue and will be fixed soon. It happens if you have different performance metrics in different runs of grid search. Just deactivate the automatic logging or ensure same performance measures in each run.
Best,
Martin
Dortmund, Germany
Hi Oliver,
thanks for reporting this. As a quick tip, use the </> to attach your process XML; this way there will be no conversion to smileys etc.
Unfortunately I have to correct Martin, since this is another issue (see here). We will fix it so that it no longer throws a RuntimeException, but in general RapidMiner cannot compare all the different performance criteria with each other, see also this main thing here:
What exactly do you want to achieve with your process? If you want all the different performance criteria for a single run, you can select all the needed criteria in the Performance operator and check "log_all_criteria" in the "Optimize Parameters (Grid)" operator. Or, if you only want to log and not optimize, you can simply use the "Loop Parameters" operator to loop over those criteria (there should be no comparison there).
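To illustrate why the comparison fails, here is a minimal sketch in plain Java. It is not RapidMiner's actual source: the `Criterion` base class, the two subclasses and the `comparable` helper are hypothetical stand-ins, and only the exception message is modelled on the one in the stack trace above. The idea is that a grid search has to rank the result of each run against the best one so far, and a root mean squared error cannot be meaningfully ranked against an absolute error.

```java
public class CriterionMismatchSketch {

    /** Hypothetical stand-in for a performance criterion (not RapidMiner's class). */
    abstract static class Criterion implements Comparable<Criterion> {
        private final double value;

        Criterion(double value) {
            this.value = value;
        }

        double getValue() {
            return value;
        }

        @Override
        public int compareTo(Criterion other) {
            // Values of different criterion types are not on a common scale,
            // so the comparison is refused instead of returning a misleading order.
            if (!getClass().equals(other.getClass())) {
                throw new RuntimeException(
                        "Mismatched criterion class:" + getClass() + ", " + other.getClass());
            }
            // Assumption for this sketch: both types are error measures,
            // so the smaller value is the better one and ranks higher.
            return Double.compare(other.value, value);
        }
    }

    static final class RootMeanSquaredError extends Criterion {
        RootMeanSquaredError(double value) {
            super(value);
        }
    }

    static final class AbsoluteError extends Criterion {
        AbsoluteError(double value) {
            super(value);
        }
    }

    /** Returns true if the two criteria can be ranked against each other. */
    static boolean comparable(Criterion a, Criterion b) {
        try {
            a.compareTo(b);
            return true;
        } catch (RuntimeException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Criterion firstRun = new RootMeanSquaredError(0.42);

        // Same main criterion in both runs: the results can be ranked.
        System.out.println(comparable(firstRun, new RootMeanSquaredError(0.40)));

        // Main criterion changed between runs: the comparison is refused.
        System.out.println(comparable(firstRun, new AbsoluteError(0.31)));
    }
}
```

In this sketch the first call prints `true` and the second prints `false`. That mirrors the situation in the process above: once `main_criterion` itself is the parameter being varied, the second run hands the optimizer a criterion of a different type than the first.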
Best,
Jan