Filtering rows and columns
Dr_Van_Nostrand
Member Posts: 1 Learner III
Hi,
Maybe this belongs under a different thread, but anyway: is there an operator for selectively filtering out (i.e. removing) one or more rows or columns from a table of data? This is possible in Knime, but Knime lacks other features that RM has.
Feedback is much appreciated!
Best Answer
MariusHelf
RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
There is Filter Examples for examples (rows), and Select Attributes for attributes (columns).
Best, Marius
Answers
Hi Marius,
First, I want to say THANK YOU for your fast and competent help with all my issues over the last few days!
I'm really trying to find answers in this forum and the documentation/wiki first.
I checked the "Filter Examples" operator you mentioned. The "attribute_value_filter" condition in particular offers many ways to remove rows.
However, I still found no way to remove all rows but the last one with this operator.
You can count the examples into a macro and use that as your range start and end, like this. HTH
Hi haddock,
Thanks for sharing. Indeed, this solution works quite effectively.
To be somewhat more flexible in my process, I tried another setup that should provide the last x rows of an example set by doing a calculation with the macro's value. Unfortunately, this doesn't work in the "Filter Examples" operator (e.g. "%{exs}-5" as the "first example" value).
Oh dear... I thought it was so simple - now it drives me nuts
Sincerely
Sachs
It's quite simple, I'll leave it to you, think loop counters.
This process delivers x times the (same) last row and provides a collection instead of a single example set...
Bye for now
Sachs
I should have been more explicit about the count. We know the last example; if you take five from the number of examples, you'll filter to six examples (think about the loop count), so you need to add one back, like this. The more you use RM, the more those pesky macros show up!
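The off-by-one described above can be checked outside RapidMiner with a few lines of plain Python. This is only a sketch of the arithmetic, not a RapidMiner process: the function names `last_rows_range` and `filter_range` are made up for illustration, and the range is assumed to be 1-based and inclusive at both ends, as the discussion of "first example" and the example count suggests.

```python
def last_rows_range(n_examples, x):
    """Return a 1-based, inclusive (first, last) range covering the last x rows.

    Hypothetical helper: it only mirrors the macro arithmetic discussed above.
    """
    if not 1 <= x <= n_examples:
        raise ValueError("x must be between 1 and the number of examples")
    # n - x alone would span x + 1 rows, so one is added back.
    first = n_examples - x + 1
    return first, n_examples


def filter_range(rows, first, last):
    """Keep rows first..last, counting from 1 and including both ends."""
    return rows[first - 1:last]


rows = list(range(1, 11))            # ten examples, numbered 1..10
first, last = last_rows_range(len(rows), 5)
last_five = filter_range(rows, first, last)

# Without the correction, the same range holds six rows instead of five.
uncorrected = filter_range(rows, len(rows) - 5, len(rows))
```

With ten examples and x = 5, the corrected range starts at example 6, while subtracting 5 alone would start at example 5 and keep six rows.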
Wow, that's it! Indeed simple - once you know how it works ;D
Thank you so much!
The biggest mistake that I see being made on this Forum is that people underestimate RapidMiner. It is a deep and mature environment with many man-years behind it. And these are seriously qualified, enthusiastic man-years, so the chances are that most likely tasks are covered, as you say, if only you know how. It's a big toolkit, and it takes time.
Stick with it!
I've got a similar problem, and after two days of work, no solution...
I need to remove the attributes whose sum (over the column) is < 10. How can I do this? I only have numeric values under these attributes...
Any feedback is very welcome! :-[
I would not necessarily call this a similar problem, but try the process below. It calculates the sum of each column using Aggregate with a default aggregation, then cleans the attribute names to match the original names, then selects the attributes which sum up to at least 0 in this example, and finally the filtered data is used as reference data to filter the original data in Reorder Attributes.
I suggest using breakpoints after each operator to understand what's going on.
Best regards,
Marius
Here is a link to this process in the Community Repository.
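For readers who cannot open the linked process, the logic of the column-sum filter described in the reply above can be sketched in plain Python. This is not the RapidMiner process itself: the function name `drop_low_sum_columns`, the dict-of-lists table layout, and the sample data are all assumptions made for illustration, and the threshold of 10 comes from the question rather than from the process (which uses 0 as its example).

```python
def drop_low_sum_columns(table, threshold):
    """Keep only the columns whose values sum to at least `threshold`.

    `table` is assumed to map column name -> list of numeric values.
    """
    # Step 1: one sum per column (the role Aggregate plays in the description).
    sums = {name: sum(values) for name, values in table.items()}
    # Step 2: names of the columns that pass the threshold, in original order.
    keep = [name for name in table if sums[name] >= threshold]
    # Step 3: use those names as the reference to select from the original data.
    return {name: table[name] for name in keep}


# Made-up sample data: column sums are a=6, b=15, c=11.
table = {"a": [1, 2, 3], "b": [5, 5, 5], "c": [0, 4, 7]}
filtered = drop_low_sum_columns(table, 10)
```

Here column `a` is dropped because its sum is below 10, while `b` and `c` are kept with their original values untouched.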