Write Model
Hi,
I have a problem saving a Neural Net model with the "Write Model" operator.
RapidMiner stops with a severe error but gives no further information about the problem. I tried all available file formats.
Process:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.3.015">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.3.015" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="open_file" compatibility="5.3.015" expanded="true" height="60" name="Open File (2)" width="90" x="45" y="255">
        <parameter key="filename" value="/home/ubuntu/test.csv"/>
      </operator>
      <operator activated="true" class="read_csv" compatibility="5.3.015" expanded="true" height="60" name="Read CSV (2)" width="90" x="45" y="165">
        <parameter key="csv_file" value="/home/ubuntu/test.csv"/>
        <parameter key="trim_lines" value="true"/>
        <parameter key="first_row_as_names" value="false"/>
        <list key="annotations">
          <parameter key="0" value="Name"/>
        </list>
        <parameter key="encoding" value="UTF-8"/>
        <list key="data_set_meta_data_information"/>
      </operator>
      <operator activated="true" class="remove_useless_attributes" compatibility="5.3.015" expanded="true" height="76" name="Remove Useless Attributes" width="90" x="45" y="30"/>
      <operator activated="true" class="remove_correlated_attributes" compatibility="5.3.015" expanded="true" height="76" name="Remove Correlated Attributes" width="90" x="179" y="30"/>
      <operator activated="true" class="filter_examples" compatibility="5.3.015" expanded="true" height="76" name="Filter Examples" width="90" x="246" y="210">
        <parameter key="condition_class" value="missing_attributes"/>
      </operator>
      <operator activated="true" class="set_role" compatibility="5.3.015" expanded="true" height="76" name="Set Role (2)" width="90" x="447" y="255">
        <parameter key="attribute_name" value="label"/>
        <parameter key="target_role" value="label"/>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="filter_examples" compatibility="5.3.015" expanded="true" height="76" name="Filter Examples (2)" width="90" x="380" y="120">
        <parameter key="condition_class" value="no_missing_attributes"/>
      </operator>
      <operator activated="true" class="set_role" compatibility="5.3.015" expanded="true" height="76" name="Set Role" width="90" x="447" y="30">
        <parameter key="attribute_name" value="label"/>
        <parameter key="target_role" value="label"/>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="neural_net" compatibility="5.3.015" expanded="true" height="76" name="Neural Net" width="90" x="581" y="75">
        <list key="hidden_layers"/>
      </operator>
      <operator activated="true" class="apply_model" compatibility="5.3.015" expanded="true" height="76" name="Apply Model" width="90" x="715" y="120">
        <list key="application_parameters"/>
      </operator>
      <operator activated="true" class="write_model" compatibility="5.3.015" expanded="true" height="60" name="Write Model" width="90" x="782" y="255">
        <parameter key="model_file" value="/home/ubuntu/models/model.xml"/>
        <parameter key="output_type" value="XML"/>
      </operator>
      <operator activated="false" class="weka:W-SMO" compatibility="5.3.001" expanded="true" height="76" name="W-SMO" width="90" x="179" y="390"/>
      <connect from_op="Open File (2)" from_port="file" to_op="Read CSV (2)" to_port="file"/>
      <connect from_op="Read CSV (2)" from_port="output" to_op="Remove Useless Attributes" to_port="example set input"/>
      <connect from_op="Remove Useless Attributes" from_port="example set output" to_op="Remove Correlated Attributes" to_port="example set input"/>
      <connect from_op="Remove Correlated Attributes" from_port="example set output" to_op="Filter Examples" to_port="example set input"/>
      <connect from_op="Filter Examples" from_port="example set output" to_op="Set Role (2)" to_port="example set input"/>
      <connect from_op="Filter Examples" from_port="original" to_op="Filter Examples (2)" to_port="example set input"/>
      <connect from_op="Set Role (2)" from_port="example set output" to_op="Apply Model" to_port="unlabelled data"/>
      <connect from_op="Filter Examples (2)" from_port="example set output" to_op="Set Role" to_port="example set input"/>
      <connect from_op="Set Role" from_port="example set output" to_op="Neural Net" to_port="training set"/>
      <connect from_op="Neural Net" from_port="model" to_op="Apply Model" to_port="model"/>
      <connect from_op="Apply Model" from_port="labelled data" to_port="result 1"/>
      <connect from_op="Apply Model" from_port="model" to_op="Write Model" to_port="input"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
Answers
In my example, more than 1000 attributes remain as input for the Neural Net learner. The Neural Net computes a correct model, but the Write Model operator cannot save it, no matter which format I use (XML or binary). All attempts with the "Store" and "Write" operators failed, too.
I think data mining software should be able to handle more than 1000 attributes. It is unreasonable to rerun the Neural Net operator for 30-60 minutes every time just to create a model for new test data.
Unfortunately, there are a variety of strange errors in RapidMiner that make the program unsuitable for practical use.
Unfortunately, the problem still occurs when reading the previously saved model to classify new test data (see code):
Error: "Process failed" (with no further information)
Another interesting fact: it depends on the test data. Sometimes the process fails, sometimes it does not.
The "Read Model" operator, which loads the model built from the training data, throws the error, but the problem lies in the test data!
This is annoying. I have invested a lot in RapidMiner, only to find that the software is unusable for large data sets.
Somehow no one here seems to have an idea.
Process:
I used the "Store" and "Retrieve" operators, too.
But the problem is the same.
That is indeed true.
I have updated the issue.
There is one workaround available: manually increase the Java stack size. To do so, pass the JVM parameter "-Xss16m" as an argument when starting RapidMiner from the RapidMinerGUI.sh/RapidMinerGUI.bat file found in the RapidMiner/scripts folder. You cannot use RapidMiner.exe in that case.
FYI, the 16m stands for 16 MB per thread stack. You could increase that even further if it is not enough, but be aware that this will eat your memory for breakfast.
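To illustrate why a larger stack can help, here is a minimal Java sketch. It rests on an assumption this thread does not confirm: that the failure comes from Java serialization recursing through a deeply linked object graph. The names `StackSizeDemo` and `Node`, the chain length, and the stack sizes are hypothetical and are not part of RapidMiner. The sketch writes the same object chain on a thread with a small stack and on one with a 64 MB stack. Note that the JVM treats a requested per-thread stack size as a hint.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class StackSizeDemo {
    // Hypothetical stand-in for a deeply linked model structure.
    public static final class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        Node next;
    }

    // Builds a singly linked chain with the given number of nodes.
    public static Node chain(int length) {
        Node head = new Node();
        Node current = head;
        for (int i = 1; i < length; i++) {
            current.next = new Node();
            current = current.next;
        }
        return head;
    }

    // Default Java serialization follows each 'next' reference recursively,
    // so the call depth grows with the length of the chain.
    public static boolean serializes(Node head) {
        try {
            new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(head);
            return true;
        } catch (StackOverflowError e) {
            return false; // the thread's stack was too small for this graph
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Runs the serialization on a new thread with the requested stack size.
    // The size is a hint to the JVM, similar in spirit to -Xss per thread.
    public static boolean serializesWithStack(Node head, long stackBytes) {
        boolean[] ok = new boolean[1];
        Thread worker = new Thread(null, () -> ok[0] = serializes(head), "serializer", stackBytes);
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return ok[0];
    }

    public static void main(String[] args) {
        Node head = chain(10_000);
        System.out.println("256 KB stack: " + serializesWithStack(head, 256L * 1024));
        System.out.println("64 MB stack:  " + serializesWithStack(head, 64L * 1024 * 1024));
    }
}
```

The per-thread size here only serves the demonstration. With the "-Xss16m" parameter described above, the larger stack size is set for the JVM as a whole rather than for one hand-made thread.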
Regards,
Marco
Regards,
Marcel