How to invert the example order?
Hey everyone!
I'm new to RapidMiner and currently trying to use it on my first data set. I am desperately looking for a way to invert the order of my examples, i.e. put the first row last, the second row second-to-last, and so on. The Sort operator refuses to work on the row number (which kind of makes sense, since this isn't a real attribute). It's quite a big data set, and my current workarounds take way too much time. Any ideas?
For context: I actually want to do this to remove certain duplicates. The Remove Duplicates operator seems to keep the first example and delete every duplicate after it. I would like to keep the last example and remove all duplicates before it (I'm filtering on a subset of attributes for the Remove Duplicates operator). So my idea was to invert the order of the examples to achieve this.
Thank you for your help!
Best Answer
varunm1 Member Posts: 1,207 Unicorn
Hello @Newbie
You can use the Generate ID operator, which generates an ID for every example in your data set. Then sort on the ID column in decreasing order, which inverts the examples. Sample XML code is below. To run it, open a blank process and go to View --> Show Panel --> XML. Paste the code into the XML window and click the green tick mark, which loads the process into the process window. Run it to see how the sample is inverted.
<?xml version="1.0" encoding="UTF-8"?>
<process version="9.2.000">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="9.2.000" expanded="true" name="Process">
    <parameter key="logverbosity" value="init"/>
    <parameter key="random_seed" value="2001"/>
    <parameter key="send_mail" value="never"/>
    <parameter key="notification_email" value=""/>
    <parameter key="process_duration_for_mail" value="30"/>
    <parameter key="encoding" value="SYSTEM"/>
    <process expanded="true">
      <operator activated="true" class="retrieve" compatibility="9.2.000" expanded="true" height="68" name="Retrieve Titanic Training" width="90" x="179" y="34">
        <parameter key="repository_entry" value="//Samples/data/Titanic Training"/>
      </operator>
      <operator activated="true" class="generate_id" compatibility="9.2.000" expanded="true" height="82" name="Generate ID" width="90" x="380" y="34">
        <parameter key="create_nominal_ids" value="false"/>
        <parameter key="offset" value="0"/>
      </operator>
      <operator activated="true" class="sort" compatibility="9.2.000" expanded="true" height="82" name="Sort" width="90" x="581" y="34">
        <parameter key="attribute_name" value="id"/>
        <parameter key="sorting_direction" value="decreasing"/>
      </operator>
      <connect from_op="Retrieve Titanic Training" from_port="output" to_op="Generate ID" to_port="example set input"/>
      <connect from_op="Generate ID" from_port="example set output" to_op="Sort" to_port="example set input"/>
      <connect from_op="Sort" from_port="example set output" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
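If it helps to see the logic outside RapidMiner, here is a rough Python sketch of the same idea applied to the "keep the last duplicate" goal from the question. It is only an illustration, not RapidMiner code: the function name `keep_last_duplicates`, the sample rows, and the field names are made up, and it assumes (as the question describes) that duplicate removal keeps the first example it encounters.

```python
def keep_last_duplicates(rows, key_fields):
    """Keep the LAST row for each key by inverting the order first.

    Hypothetical helper for illustration; rows are plain dicts.
    """
    # Step 1 (like Generate ID): tag every row with its original position.
    indexed = list(enumerate(rows))

    # Step 2 (like Sort, decreasing): invert the order by sorting on the id.
    indexed.sort(key=lambda pair: pair[0], reverse=True)

    # Step 3 (like Remove Duplicates on a subset): keep the first row seen
    # per key. Because the order is inverted, that is the last original row.
    seen = set()
    kept = []
    for row_id, row in indexed:
        key = tuple(row[field] for field in key_fields)
        if key not in seen:
            seen.add(key)
            kept.append((row_id, row))

    # Step 4: sort on the id again to restore the original order,
    # then drop the id (like Select Attributes).
    kept.sort(key=lambda pair: pair[0])
    return [row for _, row in kept]


# Made-up sample data: "a" appears twice, and the later row should survive.
rows = [
    {"name": "a", "value": 1},
    {"name": "b", "value": 2},
    {"name": "a", "value": 3},
]
result = keep_last_duplicates(rows, ["name"])
print(result)
```

In the RapidMiner process, the equivalent would be to place Remove Duplicates after the Sort, and optionally sort on the ID in increasing order afterwards if the original order matters.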
There might be other solutions as well. Hope this helps.
PS: Once they are inverted, you can use the Select Attributes operator to remove the ID column.
Regards,
Varun
https://www.varunmandalapu.com/
Be Safe. Follow precautions and Maintain Social Distancing