How can I, after I've split my dataset, join the parts together again in the same order as before?
Hi people,
How can I, after I've split my dataset, join the parts together again in the same order as before?
Below is an example. I just want the ID to go from 1 to 150 again in ascending order, but just sorting by the ID doesn't work.
Also, how can I change the order of attributes in the results?
<?xml version="1.0" encoding="UTF-8"?>
<process version="9.2.001">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="9.2.001" expanded="true" name="Process">
    <parameter key="logverbosity" value="init"/>
    <parameter key="random_seed" value="2001"/>
    <parameter key="send_mail" value="never"/>
    <parameter key="notification_email" value=""/>
    <parameter key="process_duration_for_mail" value="30"/>
    <parameter key="encoding" value="SYSTEM"/>
    <process expanded="true">
      <operator activated="true" class="retrieve" compatibility="9.2.001" expanded="true" height="68" name="Retrieve Iris" width="90" x="45" y="34">
        <parameter key="repository_entry" value="//Samples/data/Iris"/>
      </operator>
      <operator activated="true" class="split_data" compatibility="9.2.001" expanded="true" height="103" name="Split Data" width="90" x="179" y="34">
        <enumeration key="partitions">
          <parameter key="ratio" value="0.66"/>
          <parameter key="ratio" value="0.34"/>
        </enumeration>
        <parameter key="sampling_type" value="automatic"/>
        <parameter key="use_local_random_seed" value="false"/>
        <parameter key="local_random_seed" value="1992"/>
      </operator>
      <operator activated="true" class="union" compatibility="9.2.001" expanded="true" height="82" name="Union" width="90" x="313" y="34"/>
      <connect from_op="Retrieve Iris" from_port="output" to_op="Split Data" to_port="example set"/>
      <connect from_op="Split Data" from_port="partition 1" to_op="Union" to_port="example set 1"/>
      <connect from_op="Split Data" from_port="partition 2" to_op="Union" to_port="example set 2"/>
      <connect from_op="Union" from_port="union" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
Thanks
-Prentice
Best Answer
jmergler Employee-RapidMiner, RapidMiner Certified Analyst, Member, University Professor Posts: 41 Guru
Hi Prentice,
There are many ways you could do this. Depending on the situation, you might use linear sampling to keep the examples in order in the first place, or use Multiply to keep a copy of the original. Another approach is to add something to sort by, before or after the union.
Sorting by ID doesn't work because it's not numeric and doesn't have leading zeros, so the values sort as text (id_10 comes before id_2). So you could do a text transformation, either to add leading zeros or to remove the prefix and convert the rest to a numeric attribute. Here's one example, which strips the id_ prefix, parses what is left into a numeric attribute called ordering, and sorts by it:
<?xml version="1.0" encoding="UTF-8"?>
<process version="9.2.001">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="9.2.001" expanded="true" name="Process">
    <parameter key="logverbosity" value="init"/>
    <parameter key="random_seed" value="2001"/>
    <parameter key="send_mail" value="never"/>
    <parameter key="notification_email" value=""/>
    <parameter key="process_duration_for_mail" value="30"/>
    <parameter key="encoding" value="SYSTEM"/>
    <process expanded="true">
      <operator activated="true" class="retrieve" compatibility="9.2.001" expanded="true" height="68" name="Retrieve Iris" width="90" x="45" y="34">
        <parameter key="repository_entry" value="//Samples/data/Iris"/>
      </operator>
      <operator activated="true" class="split_data" compatibility="9.2.001" expanded="true" height="103" name="Split Data" width="90" x="179" y="34">
        <enumeration key="partitions">
          <parameter key="ratio" value="0.66"/>
          <parameter key="ratio" value="0.34"/>
        </enumeration>
        <parameter key="sampling_type" value="automatic"/>
        <parameter key="use_local_random_seed" value="false"/>
        <parameter key="local_random_seed" value="1992"/>
      </operator>
      <operator activated="true" class="union" compatibility="9.2.001" expanded="true" height="82" name="Union" width="90" x="313" y="34"/>
      <operator activated="true" class="generate_attributes" compatibility="9.2.001" expanded="true" height="82" name="Generate Attributes" width="90" x="447" y="34">
        <list key="function_descriptions">
          <parameter key="ordering" value="parse(replaceAll(id, &quot;id_&quot;, &quot;&quot;))"/>
        </list>
        <parameter key="keep_all" value="true"/>
      </operator>
      <operator activated="true" class="sort" compatibility="9.2.001" expanded="true" height="82" name="Sort" width="90" x="581" y="34">
        <parameter key="attribute_name" value="ordering"/>
        <parameter key="sorting_direction" value="increasing"/>
      </operator>
      <connect from_op="Retrieve Iris" from_port="output" to_op="Split Data" to_port="example set"/>
      <connect from_op="Split Data" from_port="partition 1" to_op="Union" to_port="example set 1"/>
      <connect from_op="Split Data" from_port="partition 2" to_op="Union" to_port="example set 2"/>
      <connect from_op="Union" from_port="union" to_op="Generate Attributes" to_port="example set input"/>
      <connect from_op="Generate Attributes" from_port="example set output" to_op="Sort" to_port="example set input"/>
      <connect from_op="Sort" from_port="example set output" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
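For anyone who wants to see the sorting issue outside of RapidMiner, here is a rough Python sketch of the same idea. It is only an illustration, not part of the process above: the id list, the shuffle used to imitate the split and union, and the helper names `ordering` and `padded` are all made up for the example. It shows that sorting the text ids puts id_10 right after id_1, and that either stripping the prefix and parsing the number or adding leading zeros restores the 1 to 150 order.

```python
import random

# Hypothetical stand-in for the Iris id column: "id_1" ... "id_150".
ids = [f"id_{n}" for n in range(1, 151)]

# Imitate a shuffled split followed by a union: shuffle the ids,
# cut them into two parts, then append the parts again.
rng = random.Random(2001)
shuffled = ids[:]
rng.shuffle(shuffled)
cut = round(len(shuffled) * 0.66)
unioned = shuffled[:cut] + shuffled[cut:]

# Sorting the text values compares them character by character,
# so "id_10" and "id_100" land before "id_2".
text_sorted = sorted(unioned)


def ordering(id_value: str) -> int:
    """Drop the "id_" prefix and parse the remainder as an integer."""
    return int(id_value.replace("id_", ""))


# Option 1: sort by the parsed number (the approach of the example process).
numeric_sorted = sorted(unioned, key=ordering)


def padded(id_value: str, width: int = 3) -> str:
    """Rewrite "id_7" as "id_007" so that text order matches numeric order."""
    prefix, number = id_value.split("_")
    return f"{prefix}_{int(number):0{width}d}"


# Option 2: add leading zeros, after which a plain text sort works.
padded_sorted = sorted(padded(value) for value in unioned)

print(text_sorted[:4])     # ['id_1', 'id_10', 'id_100', 'id_101']
print(numeric_sorted[:4])  # ['id_1', 'id_2', 'id_3', 'id_4']
print(padded_sorted[:4])   # ['id_001', 'id_002', 'id_003', 'id_004']
```

The numeric key leaves the original id values untouched, which is probably what you want if the id is used elsewhere; the zero-padding route changes the values themselves.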
Answers
Yes, this works, thanks!
Do you perhaps also know how to change the order of attributes? It looks like the special attributes are always first, but is it also possible to put a regular attribute in front?