"[SOLVED] Filter data from examples set"
Hi,
I'm a beginner with RapidMiner, so as a first step I am trying to extract some data from an Access database, do some operations on it, and display the result at the end.
I'm stuck on how to select some of the data from the data set.
What I do: create a repository with data from MS Access, use Select Attributes to keep two text columns, A and B, then use Generate Attributes to create column C, which joins the strings from A and B. All columns contain words (text). For example, column A: "Gurund", column B: "Corporation" and column C: "Gurund Corporation". Of course, column B does not only contain "Corporation"; there are many other values as well.
Next I would like to keep only the rows that contain the word "Corporation" and display them. I tried different operators, such as Filter Documents and Filter Examples, but I did not find one that helps. Do you have any suggestions?
Answers
condition class: attribute_value_filter
parameter string: B="Corporation"
I tried this operator, but the problem is that the value in column B (or A) may consist of one or more words. For example, column B may contain "Corporation Europe" or "Corp.", which mean the same thing to me. I think the best solution would be an operator that supports regular expressions, but I can't find anything like Filter Examples with regexp support. Or maybe I just don't know how to write the correct expression for the Filter Examples operator.
A rework of the Filter Examples operator is planned. Until then you have to use a workaround with Generate Attributes: it checks a condition and creates a new indicator attribute, on which you can then apply Filter Examples.
Please have a look at the attached process.
Best, Marius
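The attached process is not reproduced in this thread, so here is only a rough illustration of the same two-step idea (first generate an indicator attribute, then filter on it), written in plain Python rather than as a RapidMiner process. The sample rows, the regular expression, and the names add_indicator, filter_examples and is_corp are all assumptions made up for this sketch; they are not RapidMiner functions.

```python
import re

# Hypothetical sample rows standing in for the example set (columns A and B).
rows = [
    {"A": "Gurund", "B": "Corporation"},
    {"A": "Alpha", "B": "Corporation Europe"},
    {"A": "Beta", "B": "Corp."},
    {"A": "Gamma", "B": "Limited"},
]

# Column C joins A and B, as described in the question.
for row in rows:
    row["C"] = f'{row["A"]} {row["B"]}'

# Assumed pattern: the whole word "Corporation" or the abbreviation "Corp.".
CORP_PATTERN = re.compile(r"\bCorp(?:oration\b|\.)")


def add_indicator(rows, attribute, pattern, name="is_corp"):
    """Step 1: return copies of the rows with a boolean indicator attribute."""
    return [{**row, name: bool(pattern.search(row[attribute]))} for row in rows]


def filter_examples(rows, name="is_corp"):
    """Step 2: keep only the rows whose indicator attribute is true."""
    return [row for row in rows if row[name]]


flagged = add_indicator(rows, "B", CORP_PATTERN)
matches = filter_examples(flagged)
print([row["C"] for row in matches])
# prints ['Gurund Corporation', 'Alpha Corporation Europe', 'Beta Corp.']
```

Keeping the indicator as a separate attribute mirrors the workaround: the regular expression is evaluated once per row, and the filter itself stays a simple true/false check.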
I think I have used filtering with regular expressions before to filter examples CONTAINING a word.
The RapidMiner regular expressions are described here:
http://rapid-i.com/wiki/index.php?title=Regular_expressions
I am not sure whether regular expressions work in the Filter Examples attribute_value_filter; give it a try.
If not, they definitely work in Generate Attributes, as Marius suggested.
Good luck
Thanks, Marius, for your suggestion. I tried it, played with the Generate Attributes operator, and got the desired result.
This is easily done with the Filter Examples operator in Studio 6.3: you just specify the words you want, then choose at the bottom whether ALL of them must be included or whether ANY occurrence is sufficient.
Regards,
Marco
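For readers who want to see the ALL versus ANY distinction spelled out, here is a small Python sketch of the logic. It is not how Studio implements the filter; the function name contains_words, the sample values, and the choice of case-insensitive substring matching are assumptions for illustration only.

```python
def contains_words(text, words, match_all=False):
    """Check whether text contains the given words.

    Assumption for this sketch: matching is a case-insensitive substring test.
    match_all=True requires every word; match_all=False accepts any one word.
    """
    lowered = text.lower()
    hits = (word.lower() in lowered for word in words)
    return all(hits) if match_all else any(hits)


# Hypothetical values of column C.
values = ["Gurund Corporation", "Alpha Corporation Europe", "Gamma Limited"]
words = ["Corporation", "Europe"]

any_hits = [v for v in values if contains_words(v, words)]
all_hits = [v for v in values if contains_words(v, words, match_all=True)]

print(any_hits)  # prints ['Gurund Corporation', 'Alpha Corporation Europe']
print(all_hits)  # prints ['Alpha Corporation Europe']
```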