Problem with XValidationParallel / ProcessLog
Hi all,
So here's my issue. I am running Rapid-I Enterprise Edition with the feature selection plugin, and I'm trying to speed the process up by using the XValidationParallel operator. I have a machine with dual quad-core i7 processors, so in theory there are 8 processors to which I should be able to assign a thread each (so I should be able to specify 8 threads, if I understand the operator correctly). With that setup I get an "IllegalThreadStateException: null" error message after about 5 minutes, but none if I either (a) remove the ProcessLog entirely or (b) lower the number of threads to 4 and place the ProcessLog within the first operator chain, which slows the algorithm down exponentially over time. If I put the ProcessLog anywhere else, the algorithm throws that error. Any suggestions on what I'm doing wrong?
<operator name="Root" class="Process" expanded="yes">
<operator name="CSVExampleSource" class="CSVExampleSource">
<parameter key="filename" value="\\MONOLITH\Public\Documents\OP Methylation Machine data\Complete Aggressive datasheet OP tumors 9-4-09.csv"/>
<parameter key="label_column" value="2"/>
<parameter key="id_column" value="1"/>
</operator>
<operator name="ExampleSetTranspose" class="ExampleSetTranspose">
</operator>
<operator name="ChangeAttributeRole" class="ChangeAttributeRole">
<parameter key="name" value="AGGRESSIVE AT 24M"/>
<parameter key="target_role" value="label"/>
</operator>
<operator name="MissingValueReplenishment" class="MissingValueReplenishment">
<parameter key="default" value="zero"/>
<list key="columns">
</list>
</operator>
<operator name="NominalNumbers2Numerical" class="NominalNumbers2Numerical">
</operator>
<operator name="WrapperXValidation" class="WrapperXValidation" expanded="yes">
<parameter key="leave_one_out" value="true"/>
<operator name="AdvancedForwardSelection" class="AdvancedForwardSelection" expanded="yes">
<parameter key="maximal_number_of_attributes" value="500"/>
<parameter key="speculative_rounds" value="10"/>
<parameter key="stopping_behavior" value="without significant increase"/>
<operator name="XValidationParallel" class="XValidationParallel" expanded="yes">
<parameter key="number_of_threads" value="4"/>
<parameter key="leave_one_out" value="true"/>
<parameter key="sampling_type" value="shuffled sampling"/>
<operator name="JMySVMLearner" class="JMySVMLearner">
<parameter key="calculate_weights" value="true"/>
</operator>
<operator name="OperatorChain" class="OperatorChain" expanded="yes">
<operator name="ModelApplier" class="ModelApplier">
<list key="application_parameters">
</list>
<parameter key="create_view" value="true"/>
</operator>
<operator name="ClassificationPerformance" class="ClassificationPerformance">
<parameter key="accuracy" value="true"/>
<list key="class_weights">
</list>
</operator>
<operator name="MinMaxWrapper" class="MinMaxWrapper">
</operator>
</operator>
</operator>
</operator>
<operator name="LibSVMLearner" class="LibSVMLearner">
<list key="class_weights">
</list>
<parameter key="confidence_for_multiclass" value="false"/>
</operator>
<operator name="OperatorChain (2)" class="OperatorChain" expanded="yes">
<operator name="ModelWriter" class="ModelWriter">
<parameter key="model_file" value="C:\Users\RLleras.HNSCC\Documents\Lab Projects\Rapid-I projects\Agressive OP model.mod"/>
</operator>
<operator name="ModelApplier (2)" class="ModelApplier">
<list key="application_parameters">
</list>
<parameter key="create_view" value="true"/>
</operator>
<operator name="ClassificationPerformance (2)" class="ClassificationPerformance">
<parameter key="accuracy" value="true"/>
<parameter key="classification_error" value="true"/>
<parameter key="kappa" value="true"/>
<parameter key="weighted_mean_recall" value="true"/>
<parameter key="weighted_mean_precision" value="true"/>
<parameter key="spearman_rho" value="true"/>
<parameter key="kendall_tau" value="true"/>
<parameter key="absolute_error" value="true"/>
<parameter key="relative_error" value="true"/>
<parameter key="relative_error_lenient" value="true"/>
<parameter key="relative_error_strict" value="true"/>
<parameter key="correlation" value="true"/>
<list key="class_weights">
</list>
</operator>
</operator>
</operator>
</operator>
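A quick way to check the thread-count assumption in the question: the sketch below asks the JVM how many logical processors it sees and clamps a requested thread count to that number. It uses only the standard `Runtime.availableProcessors()` call; the class `ThreadCountCheck` and the helper `clampThreads` are hypothetical names made up for this illustration and are not part of RapidMiner. The reported figure can differ from the physical core count (for example with hyper-threading), so treat it as a starting point rather than a guarantee that 8 threads is the right setting.

```java
// Minimal sketch (not RapidMiner code): inspect what the JVM reports
// before choosing a value for number_of_threads.
final class ThreadCountCheck {

    /** Number of logical processors the JVM reports for this machine. */
    static int logicalProcessors() {
        return Runtime.getRuntime().availableProcessors();
    }

    /** Hypothetical helper: keep a requested thread count between 1 and the reported processor count. */
    static int clampThreads(int requested) {
        return Math.max(1, Math.min(requested, logicalProcessors()));
    }

    public static void main(String[] args) {
        System.out.println("Logical processors reported by the JVM: " + logicalProcessors());
        System.out.println("Thread count to use for a request of 8: " + clampThreads(8));
    }
}
```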
Answers
Sadly this issue has been around for a while....
http://rapid-i.com/rapidforum/index.php/topic,563.0.html
Unfortunately, you have removed the ProcessLog operator from your process, so I cannot take a look at its parameter settings.
Could you please post the crashing process here?
Normally I would try to reproduce the error, but unless someone sponsors me a dual quad-core i7 machine, this setup outperforms my laptop. Since I believe this is a race condition, I will not be able to reproduce it at all. It would be a great help if you could paste the stack trace of the error here.
To do so, you have to enable debug mode: choose Tools / Preferences and, in the General tab, select the "rapidminer.general.debugmode" property. After applying and saving the settings, the error dialog should offer a way to view the error description.
Greetings,
Sebastian
Here's the error I get from the debug mode...
Exception: java.lang.IllegalThreadStateException
Message: null
Stack trace:
java.lang.Thread.start(Thread.java:595)
com.rapidminer.operator.validation.ParallelXValidation.estimatePerformance(ParallelXValidation.java:156)
com.rapidminer.operator.validation.ValidationChain.apply(ValidationChain.java:218)
com.rapidminer.operator.Operator.apply(Operator.java:671)
com.rapidminer.operator.features.selection.ForwardAttributeSelectionOperator.applyInnerLearner(ForwardAttributeSelectionOperator.java:278)
com.rapidminer.operator.features.selection.ForwardAttributeSelectionOperator.apply(ForwardAttributeSelectionOperator.java:181)
com.rapidminer.operator.Operator.apply(Operator.java:671)
com.rapidminer.operator.validation.WrapperValidationChain.useMethod(WrapperValidationChain.java:134)
com.rapidminer.operator.validation.WrapperXValidation.apply(WrapperXValidation.java:114)
com.rapidminer.operator.Operator.apply(Operator.java:671)
com.rapidminer.operator.OperatorChain.apply(OperatorChain.java:424)
com.rapidminer.operator.Operator.apply(Operator.java:671)
com.rapidminer.Process.run(Process.java:735)
com.rapidminer.Process.run(Process.java:704)
com.rapidminer.Process.run(Process.java:694)
com.rapidminer.gui.ProcessThread.run(ProcessThread.java:59)
And here's the code from which I received that message...
<operator name="Root" class="Process" expanded="yes">
<operator name="CSVExampleSource" class="CSVExampleSource">
<parameter key="filename" value="\\MONOLITH\Public\Documents\OP Methylation Machine data\Complete Aggressive datasheet OP tumors 9-4-09.csv"/>
<parameter key="label_column" value="2"/>
<parameter key="id_column" value="1"/>
</operator>
<operator name="ExampleSetTranspose" class="ExampleSetTranspose">
</operator>
<operator name="ChangeAttributeRole" class="ChangeAttributeRole">
<parameter key="name" value="AGGRESSIVE AT 24M"/>
<parameter key="target_role" value="label"/>
</operator>
<operator name="MissingValueReplenishment" class="MissingValueReplenishment">
<parameter key="default" value="zero"/>
<list key="columns">
</list>
</operator>
<operator name="NominalNumbers2Numerical" class="NominalNumbers2Numerical">
</operator>
<operator name="WrapperXValidation" class="WrapperXValidation" expanded="yes">
<parameter key="leave_one_out" value="true"/>
<operator name="AdvancedForwardSelection" class="AdvancedForwardSelection" expanded="yes">
<parameter key="maximal_number_of_attributes" value="500"/>
<parameter key="speculative_rounds" value="5"/>
<operator name="XValidationParallel" class="XValidationParallel" expanded="yes">
<parameter key="number_of_threads" value="8"/>
<operator name="JMySVMLearner" class="JMySVMLearner">
<parameter key="max_iterations" value="10000"/>
</operator>
<operator name="OperatorChain" class="OperatorChain" expanded="yes">
<operator name="ModelApplier" class="ModelApplier">
<list key="application_parameters">
</list>
</operator>
<operator name="ClassificationPerformance" class="ClassificationPerformance">
<parameter key="accuracy" value="true"/>
<list key="class_weights">
</list>
</operator>
</operator>
</operator>
<operator name="ProcessLog" class="ProcessLog">
<list key="log">
<parameter key="number of attributes" value="operator.AdvancedForwardSelection.value.number of attributes"/>
<parameter key="performance" value="operator.AdvancedForwardSelection.value.performance"/>
</list>
</operator>
</operator>
<operator name="JMySVMLearner (2)" class="JMySVMLearner">
</operator>
<operator name="OperatorChain (2)" class="OperatorChain" expanded="yes">
<operator name="ModelWriter" class="ModelWriter">
<parameter key="model_file" value="C:\Users\RLleras.HNSCC\Documents\Lab Projects\Rapid-I projects\Agressive OP model.mod"/>
</operator>
<operator name="ModelApplier (2)" class="ModelApplier">
<list key="application_parameters">
</list>
<parameter key="create_view" value="true"/>
</operator>
<operator name="ClassificationPerformance (2)" class="ClassificationPerformance">
<parameter key="accuracy" value="true"/>
<parameter key="classification_error" value="true"/>
<parameter key="kappa" value="true"/>
<parameter key="weighted_mean_recall" value="true"/>
<parameter key="weighted_mean_precision" value="true"/>
<parameter key="spearman_rho" value="true"/>
<parameter key="kendall_tau" value="true"/>
<parameter key="absolute_error" value="true"/>
<parameter key="relative_error" value="true"/>
<parameter key="relative_error_lenient" value="true"/>
<parameter key="relative_error_strict" value="true"/>
<parameter key="correlation" value="true"/>
<list key="class_weights">
</list>
</operator>
</operator>
</operator>
</operator>
I get this error about 30 seconds in, with the following entry in the log: G Sep 17, 2009 1:08:59 PM: [Fatal] IllegalThreadStateException occured in 9593rd application of ClassificationPerformance (ClassificationPerformance)
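For background on the exception in the stack trace: the trace shows `IllegalThreadStateException` coming out of `java.lang.Thread.start`, which the JDK documents as being thrown when a thread is started more than once. The sketch below reproduces that in isolation and contrasts it with creating a fresh `Thread` object per run. It is a generic illustration only; the class and method names are invented here, and nothing in it is taken from RapidMiner's `ParallelXValidation` source, so it should not be read as a diagnosis of the actual bug.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Generic illustration (not RapidMiner code): a Thread object can be started only once.
final class RestartedThreadDemo {

    /** Waits for a thread to finish, converting interruption into an unchecked exception. */
    private static void joinQuietly(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for worker", e);
        }
    }

    /** Starts the same Thread object twice; returns the exception from the second start, or null if none was thrown. */
    static IllegalThreadStateException startTwice() {
        Thread worker = new Thread(() -> { /* no work needed for the demo */ });
        worker.start();
        joinQuietly(worker);
        try {
            worker.start(); // second start on the same, already finished Thread object
            return null;
        } catch (IllegalThreadStateException e) {
            return e;
        }
    }

    /** Safe variant: a fresh Thread object per run. Returns how many runs completed. */
    static int runWithFreshThreads(int runs) {
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < runs; i++) {
            Thread worker = new Thread(completed::incrementAndGet);
            worker.start();
            joinQuietly(worker);
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("Second start threw: " + startTwice());
        System.out.println("Runs completed with fresh threads: " + runWithFreshThreads(3));
    }
}
```

If reusing worker threads is the goal, a `java.util.concurrent.ExecutorService` is the usual alternative to restarting `Thread` objects by hand.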
I will take a look at this matter as soon as possible.
Greetings,
Sebastian
Here's some more info on my issue that I think might help you. I have found that I have the same issues even if I take out the ProcessLog. However, here's what does change things:
1) Replacing XValidationParallel with XValidation: the algorithm runs slower, but I see no problems at all.
2) If I start from scratch and build the EXACT same process in a new project file, the first time I execute it there is no problem and it churns along fine... UNTIL it reaches a point where the memory usage suddenly goes bonkers and it uses all the RAM allocated to the JVM, which in turn crashes Java, to the point that I have to force-close it with the Task Manager, so I can't even give you the error message from debug mode. What's even more interesting is that if I then restart Rapid-I and load the SAME project that got up to that point the last time, I get the error message I already showed you (IllegalThreadStateException, message: null). Weird! I have reproduced this 3 times; the timing of the thread exception varies, as has always been the case...
Hope that helps!
Roberto
So after doing a little more troubleshooting, I have figured out a few more things.
1) When I forced RapidMiner to shut down, it wasn't always terminating the Java threads that had been opened by the JVM. I found that if I force-close the JVM through the Task Manager and then re-open the same process, I no longer get the subsequent IllegalThreadStateException error when I execute the process.
2) Every time I run that process, I now get the memory exhaustion problem at the exact same point: when the algorithm exits the AdvancedForwardSelection part of the process, before it tests the model on the second JMySVMLearner operator. Or at least that's how it seems, as I still can't get the stack trace because I have to use the Task Manager to force-close the program, as well as Java, in order for it to release the memory. Hope all the extra info helps.
Roberto
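Since the stack trace is hard to capture once the JVM has run out of memory, one option is to record heap usage and the list of live threads periodically before things go wrong. The following is a minimal sketch using standard JDK calls (`Runtime.totalMemory()`, `Runtime.freeMemory()`, `Thread.getAllStackTraces()`); `JvmSnapshot` and its methods are hypothetical names for this example, and how such a check would be hooked into a RapidMiner process is not covered by this thread.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Minimal sketch (not RapidMiner code): report heap usage and live threads of the current JVM.
final class JvmSnapshot {

    /** Heap currently in use, in bytes (allocated heap minus the free part of it). */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Sorted names of all live threads, with daemon threads marked. */
    static List<String> liveThreadNames() {
        List<String> names = new ArrayList<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            names.add(t.getName() + (t.isDaemon() ? " (daemon)" : ""));
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        System.out.println("Used heap: " + (usedHeapBytes() / mib) + " MiB");
        System.out.println("Max heap:  " + (Runtime.getRuntime().maxMemory() / mib) + " MiB");
        for (String name : liveThreadNames()) {
            System.out.println("Live thread: " + name);
        }
    }
}
```

Worker threads that still appear in this list after a run has finished, or a used-heap figure that keeps climbing toward the maximum, would be worth including in a bug report.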
I'm currently attending a conference, so I cannot take a deeper look into this until I'm back in the office, but as I heard from my colleagues, they have found a thread synchronization problem and solved it already. If it turns out that this solves your problem, we will send enterprise customers an updated version.
Greetings,
Sebastian