Script task issue with RapidMiner Studio 9.2
Hi all,
Has anyone else noticed any issues with using Script Tasks with the latest version of RapidMiner Studio (RapidMiner Studio 9.2.000 (rev:461351, platform: WIN64))?
If I add a script task to my process, all follow-on tasks seem to lose the ExampleSet, so that e.g. the available attributes list in the "Reorder Attributes" task is empty. The funny thing is that the follow-on tasks still seem to execute OK when I run the process. For example, if I remove the script task from the process, reorder the attributes with "Reorder Attributes", then put the script task back in and run, the output is reordered as per my configuration.
The above used to work for me with no issue in the previous version of RapidMiner Studio.
Regards,
Olli
Best Answer
Marco_Boeck Administrator, Moderator, Employee-RapidMiner, Member, University Professor Posts: 1,996 RM Engineering
Hi,
Unfortunately, that is to be expected. Execute Script can be used to do anything: you can create and return new data, add or remove attributes on existing data, return something else entirely (a model instead of a data set), or return nothing at all.
So to know what really happens in there, we would have to execute the script. And because you are completely free in what you do, the script may even fail when it is not run on the entire data. Right now we have no way to do that while computing meta data, so the output of Execute Script carries no meta data.
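To make this concrete, here is a hypothetical variation of the Execute Script operator from the process posted in this thread. It is only a sketch: the script body is made up for illustration and assumes the incoming data has a `Town` attribute, as in the sample process. The operator element looks just like the one with the default script, yet the output now has one attribute fewer, and Studio could only find that out by running the script.

```xml
<!-- Hypothetical variation, not taken from the original process. -->
<operator activated="true" class="execute_script" compatibility="9.2.000" expanded="true" name="Execute Script">
  <!-- The script below, with line breaks encoded as character references:
         ExampleSet data = input[0];
         data.getAttributes().remove(data.getAttributes().get("Town"));
         return data;
       Assumes the first input is an ExampleSet that contains "Town". -->
  <parameter key="script" value="ExampleSet data = input[0];&#10;data.getAttributes().remove(data.getAttributes().get(&quot;Town&quot;));&#10;return data;"/>
  <parameter key="standard_imports" value="true"/>
</operator>
```

Nothing in the element itself tells the editor that `Town` is gone, which is why operators connected after it show an empty attribute list until the script has actually run.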
But there is a solution:
- Click on "Process" -> "Synchronize Meta Data with Real Data" in the top menu bar and make sure it's checked
- Right-click the Execute Script operator in your process, and select "Breakpoint After"
- Run the process. It will now pause after the script has run, and the meta data will be created based on the actual data that is now there.
- Select Attributes now has the attributes available. You can select them now.
- Resume the process by clicking the run button again. It will resume where it was paused and you will finish the process with the attributes selected in the 2nd Select Attributes operator.
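For reference, a breakpoint set this way also ends up in the process XML. As far as I know, Studio stores it as a `breakpoints` attribute on the operator element, so after the "Breakpoint After" step the Execute Script operator from the process in this thread should look roughly like the fragment below. Treat the exact attribute and its placement as an assumption and compare it with what your own Studio version writes out. The "Synchronize Meta Data with Real Data" option is not shown here.

```xml
<!-- Sketch only: breakpoints="after" is assumed to be how Studio saves "Breakpoint After". -->
<operator activated="true" breakpoints="after" class="execute_script" compatibility="9.2.000" expanded="true" height="82" name="Execute Script" width="90" x="246" y="238">
  <!-- The "script" parameter stays exactly as in the original process and is left out here. -->
  <parameter key="standard_imports" value="true"/>
</operator>
```

Checking the XML view for this attribute is a quick way to confirm that the breakpoint is on the Execute Script operator and not on one of the Select Attributes operators.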
Regards,
Marco
Answers
When you edit your process in the UI and select attributes in the parameters of an operator, the list you are choosing from comes from what we call "metadata". It's a best-effort solution to help you create a process. When the process actually runs on the real data, this metadata is irrelevant and only the actual data is looked at.
We have changed the way this metadata is generated in Studio 9.2, as previously it could freeze your entire Studio UI if you had an operator with large / slow metadata. This was fixed, but as a result some of the metadata may now take a while to appear, or may be missing outright because we overlooked something. Please let us know about these instances and, if possible, share the process with us so we can have a look!
Regards,
Marco
<pre class="CodeBlock"><code>
<?xml version="1.0" encoding="UTF-8"?><process version="9.2.000">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="9.2.000" expanded="true" name="Process">
    <parameter key="logverbosity" value="init"/>
    <parameter key="random_seed" value="2001"/>
    <parameter key="send_mail" value="never"/>
    <parameter key="notification_email" value=""/>
    <parameter key="process_duration_for_mail" value="30"/>
    <parameter key="encoding" value="SYSTEM"/>
    <process expanded="true">
      <operator activated="true" class="read_excel" compatibility="9.2.000" expanded="true" height="68" name="Read Excel" width="90" x="112" y="34">
        <parameter key="excel_file" value="sample data.xlsx"/>
        <parameter key="sheet_selection" value="sheet number"/>
        <parameter key="sheet_number" value="1"/>
        <parameter key="imported_cell_range" value="A1"/>
        <parameter key="encoding" value="SYSTEM"/>
        <parameter key="first_row_as_names" value="true"/>
        <list key="annotations"/>
        <parameter key="date_format" value=""/>
        <parameter key="time_zone" value="SYSTEM"/>
        <parameter key="locale" value="English (United States)"/>
        <parameter key="read_all_values_as_polynominal" value="false"/>
        <list key="data_set_meta_data_information">
          <parameter key="0" value="Town.true.polynominal.attribute"/>
          <parameter key="1" value="District.true.polynominal.attribute"/>
        </list>
        <parameter key="read_not_matching_values_as_missings" value="false"/>
        <parameter key="datamanagement" value="double_array"/>
        <parameter key="data_management" value="auto"/>
      </operator>
      <operator activated="true" class="select_attributes" compatibility="9.2.000" expanded="true" height="82" name="Select Attributes (2)" width="90" x="246" y="85">
        <parameter key="attribute_filter_type" value="subset"/>
        <parameter key="attribute" value=""/>
        <parameter key="attributes" value="District|Town"/>
        <parameter key="use_except_expression" value="false"/>
        <parameter key="value_type" value="attribute_value"/>
        <parameter key="use_value_type_exception" value="false"/>
        <parameter key="except_value_type" value="time"/>
        <parameter key="block_type" value="attribute_block"/>
        <parameter key="use_block_type_exception" value="false"/>
        <parameter key="except_block_type" value="value_matrix_row_start"/>
        <parameter key="invert_selection" value="false"/>
        <parameter key="include_special_attributes" value="false"/>
      </operator>
      <operator activated="true" class="execute_script" compatibility="9.2.000" expanded="true" height="82" name="Execute Script" width="90" x="246" y="238">
        <parameter key="script" value="/* &#10; * You can use both Java and Groovy syntax in this script.&#10; * &#10; * Note that you have access to the following two predefined variables:&#10; * 1) input (an array of all input data)&#10; * 2) operator (the operator instance which is running this script)&#10; */&#10;&#10;// Take first input data and treat it as generic IOObject&#10;// Alternatively, you could treat it as an ExampleSet if it is one:&#10;// ExampleSet inputData = input[0];&#10;IOObject inputData = input[0];&#10;&#10;// You can add any code here&#10;&#10;&#10;// This line returns the first input as the first output&#10;return inputData;"/>
        <parameter key="standard_imports" value="true"/>
      </operator>
      <operator activated="true" class="select_attributes" compatibility="9.2.000" expanded="true" height="82" name="Select Attributes" width="90" x="447" y="136">
        <parameter key="attribute_filter_type" value="subset"/>
        <parameter key="attribute" value=""/>
        <parameter key="attributes" value=""/>
        <parameter key="use_except_expression" value="false"/>
        <parameter key="value_type" value="attribute_value"/>
        <parameter key="use_value_type_exception" value="false"/>
        <parameter key="except_value_type" value="time"/>
        <parameter key="block_type" value="attribute_block"/>
        <parameter key="use_block_type_exception" value="false"/>
        <parameter key="except_block_type" value="value_matrix_row_start"/>
        <parameter key="invert_selection" value="false"/>
        <parameter key="include_special_attributes" value="false"/>
      </operator>
      <connect from_port="input 1" to_op="Read Excel" to_port="file"/>
      <connect from_op="Read Excel" from_port="output" to_op="Select Attributes (2)" to_port="example set input"/>
      <connect from_op="Select Attributes (2)" from_port="example set output" to_op="Execute Script" to_port="input 1"/>
      <connect from_op="Execute Script" from_port="output 1" to_op="Select Attributes" to_port="example set input"/>
      <connect from_op="Select Attributes" from_port="example set output" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="source_input 2" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
</code></pre>