How to compare two sets of attribute values?
Hi All!
I have a dataset with 7 columns. One column is "no", and the other 6 columns form two sets.
Set-1: three columns, "attr1_1", "attr1_2", "attr1_3".
Set-2: the other three columns, "attr2_1", "attr2_2", "attr2_3".
I want to compare these two sets of columns: if any column in the first set matches any column in the second set, I need to set a flag value of "1".
sample Input & Output:
Input:
no attr1_1 attr1_2 attr1_3 attr2_1 attr2_2 attr2_3
234 "klo","12","78" "jkl","13","78" "jkl","14","89" "klo","12","78" "hj","31","4" "kl","9","0"
456 "klo","12","78" "klo","12","78" "ko","12","78" "jkl","13","78" "jkl","13","78" "hj","31","4"
Output:
no attr1_1 attr1_2 attr1_3 attr2_1 attr2_2 attr2_3 flag
234 "klo","12","78" "jkl","13","78" "jkl","14","89" "klo","12","78" "hj","31","4" "kl","9","0" 1
456 "klo","12","78" "klo","12","78" "ko","12","78" "jkl","13","78" "jkl","13","78" "hj","31","4" 0
In the first row ("234"), attr1_1 ("klo","12","78") matches attr2_1 ("klo","12","78"), so the output flag value becomes "1".
In the second row ("456"), none of the set-1 columns (attr1) match any of the set-2 columns (attr2), so the flag is "0".
Could anyone help me in solving this?
Thanks in Advance!
Best Answer
Edin_Klapic (Employee-RapidMiner, RMResearcher, Member, Posts: 299, RM Data Scientist)
Hi @Anusha,
I did not execute the process but could directly see a problem with your expression.
if(%{loop_attribute}==%{loop_attribute1},1,0)
needs to be
if(#{loop_attribute}==#{loop_attribute1},1,Flag)
Explanation:
- A macro with % is just the String value. A macro with # means that this is supposed to be an Attribute name.
- If you use 1,0, a Flag that was previously set to 1 can be overwritten. That is why you need 1,Flag.
Happy Mining,
Edin
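The difference between % and # described above can be pictured outside RapidMiner with a small Python analogy. This is only a sketch, not how RapidMiner evaluates expressions: the dictionary `row` stands in for one example, and the variables `loop_attribute` and `loop_attribute1` stand in for the two loop macros.

```python
# One example (row) taken from the question; the dict is a stand-in
# for a RapidMiner example, not a RapidMiner data structure.
row = {
    "attr1_1": '"klo","12","78"',
    "attr2_1": '"klo","12","78"',
}

# The loop macros hold attribute *names*.
loop_attribute = "attr1_1"
loop_attribute1 = "attr2_1"

# Analogue of %{...}: compares the names themselves, which differ.
names_equal = loop_attribute == loop_attribute1

# Analogue of #{...}: compares the values stored under those names.
values_equal = row[loop_attribute] == row[loop_attribute1]

print(names_equal)   # False
print(values_equal)  # True
```

In this analogy the %-style comparison can never match, because the two loops always hold different attribute names, which fits the symptom of the flag staying at 0.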
Answers
You are also doing a Cartesian join here (matching every row with every other row from the second example set). Do I understand this correctly?
Are you relying on the last number (attr1_1 = attr2_1) here or are you accepting matches in different attribute "numbers"?
I would use Cartesian Product first and then one or two Loop Attributes operators depending on the comparison logic.
For the "attribute number is relevant" case, I'd loop over the attr1_.+ (regular expression) attributes, use Generate Macro to change attr1 to attr2, and compare the current macro value with the generated matching comparison value. Then use Generate Attributes with flag = if (%{attr} == %{comparison}, 1, max(flag, 0)). (You would pre-create the flag attribute with the value 0.)
Regards,
Balázs
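The "attribute number is relevant" logic above can be sketched in Python to make the control flow concrete. This is an illustration under assumptions, not a RapidMiner process: `flag_same_number` is a hypothetical helper, each row is a plain dict, and the rename from attr1 to attr2 plays the role of the Generate Macro step.

```python
import re

def flag_same_number(row):
    """Return 1 if any attr1_N equals the attr2_N with the same N, else 0."""
    flag = 0  # the pre-created flag attribute, starting at 0
    for name in row:
        # Loop only over attributes matching the regular expression attr1_.+
        if re.fullmatch(r"attr1_.+", name):
            # Stand-in for Generate Macro: turn attr1_N into attr2_N.
            comparison = name.replace("attr1", "attr2", 1)
            if comparison in row:
                # Same shape as if(a == b, 1, max(flag, 0)).
                flag = 1 if row[name] == row[comparison] else max(flag, 0)
    return flag

# The two rows from the question.
row_234 = {
    "no": 234,
    "attr1_1": '"klo","12","78"', "attr1_2": '"jkl","13","78"', "attr1_3": '"jkl","14","89"',
    "attr2_1": '"klo","12","78"', "attr2_2": '"hj","31","4"', "attr2_3": '"kl","9","0"',
}
row_456 = {
    "no": 456,
    "attr1_1": '"klo","12","78"', "attr1_2": '"klo","12","78"', "attr1_3": '"ko","12","78"',
    "attr2_1": '"jkl","13","78"', "attr2_2": '"jkl","13","78"', "attr2_3": '"hj","31","4"',
}

print(flag_same_number(row_234))  # 1
print(flag_same_number(row_456))  # 0
```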
Thanks for the response.
For your information:
you are also doing a cartesian join here (matching every row with every other row from the second example set), do I understand this correctly?
I'm not doing cartesian join here.
Are you relying on the last number (attr1_1 = attr2_1) here or are you accepting matches in different attribute "numbers"?
No, any column value from set-1 matches with any column value from set-2.
Why do you want to use Cartesian Product here? I don't understand this.
I have used 2 Loop Attributes operators; in each one I selected a column set using a regular expression. I haven't used Generate Macro because one of the Loop Attributes parameters is already "attribute name macro". Inside the 2nd Loop Attributes, I used Generate Attributes with an if condition like flag = if (%{attr1} == %{attr2}, 1, 0). "attr1" is the attribute name macro of the 1st Loop Attributes operator and "attr2" is the attribute name macro of the 2nd. But it's not working as I need.
I'm getting the output in the IOObjectCollection folder. There are 2 example sets in each folder. Every example set has only these 5 columns: attr1_1, attr1_2, attr2_1, attr2_2, flag, and the values in all example sets are the same.
In the final output I need all columns together with the flag, not as a collection of example sets. How can I get this?
Thanks in Advance.
In my example above there are 3 attributes in set-1 and 3 attributes in set-2, so I use if(attr1_1==attr2_1 || attr1_1==attr2_2 || attr1_1==attr2_3 || attr1_2==attr2_1 || attr1_2==attr2_2 || attr1_2==attr2_3 || attr1_3==attr2_1 || attr1_3==attr2_2 || attr1_3==attr2_3, "1","0").
It works fine, but I don't want this static condition. I may have more columns in each set. How can I achieve this?
Can anyone help me, please?
Nested Loop Attributes is a good way to achieve what you want. Just be careful how you set it up. You'll need to check "reuse results" if you're working on the current example set instead of generating new ones. If it is not working, set a breakpoint after the Generate Attributes that sets your "flag" attribute and check step by step.
I assumed Cartesian Product because you are combining two sets into one and then comparing each element of Set 1 with each element of Set 2. This is the main use case for a Cartesian product (Cartesian join).
If you want to keep the data separate, you can use nested Loop Examples, always filter for the current example (e.g. with Filter Example Range), do the comparison there (but that will also need Loop Attributes if it needs to be generic), and set up the resulting example set.
Regards,
Balázs
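As a rough picture of what the nested Loop Attributes setup computes, here is a Python sketch of the any-to-any comparison for an arbitrary number of columns per set. It is an assumption-laden illustration, not RapidMiner code: `flag_any_match` and `add_flags` are hypothetical helpers, rows are plain dicts, and the two regular expressions play the role of the attribute filters on the outer and inner loop.

```python
import re

def flag_any_match(row, pattern1=r"attr1_.+", pattern2=r"attr2_.+"):
    """Return 1 if any set-1 column equals any set-2 column in this row."""
    set1 = [name for name in row if re.fullmatch(pattern1, name)]
    set2 = [name for name in row if re.fullmatch(pattern2, name)]
    flag = 0
    for a in set1:          # outer loop over set-1 attributes
        for b in set2:      # inner loop over set-2 attributes
            if row[a] == row[b]:
                flag = 1    # once set, nothing below resets it to 0
    return flag

def add_flags(rows):
    """Keep every original column and append the flag, one row in, one row out."""
    return [{**row, "flag": flag_any_match(row)} for row in rows]

rows = [
    {"no": 234,
     "attr1_1": '"klo","12","78"', "attr1_2": '"jkl","13","78"', "attr1_3": '"jkl","14","89"',
     "attr2_1": '"klo","12","78"', "attr2_2": '"hj","31","4"', "attr2_3": '"kl","9","0"'},
    {"no": 456,
     "attr1_1": '"klo","12","78"', "attr1_2": '"klo","12","78"', "attr1_3": '"ko","12","78"',
     "attr2_1": '"jkl","13","78"', "attr2_2": '"jkl","13","78"', "attr2_3": '"hj","31","4"'},
]

for flagged in add_flags(rows):
    print(flagged["no"], flagged["flag"])
# 234 1
# 456 0
```

The result is a single table with all original columns plus flag, which is the shape asked for earlier in the thread; adding more attr1_ or attr2_ columns needs no change to the condition.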
Thanks for the reply.
Even after using Loop Attributes and Generate Attributes, I'm not getting the required answer. The flag value is "0" even though there is a match between the set-1 and set-2 attributes.
I have followed the same procedure, but I'm not getting a flag value of "1" even when values of set-1 attributes match values of set-2 attributes.
Please find the process below.
I don't get a valid process when pasting your XML into RapidMiner, but I looked into the parameter values.
You are using this expression: isdocup = if(%{loop_attribute}==%{loop_attribute1},1,0)
However, if isdocup was already 1 but is reset later to 0, that's what you get as the end result.
Try something like: max(isdocup, if(%{loop_attribute}==%{loop_attribute1},1,0))
So if isdocup was ever 1 in the current row, it stays that way.
Regards,
Balázs
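The overwrite problem and the max() fix can be reproduced in a few lines of Python. This is a sketch of the logic only; the function names and the `pairs` list are made up for illustration, with each pair standing for one (set-1 value, set-2 value) comparison inside the nested loops.

```python
def last_comparison_wins(pairs):
    """Mimics if(a == b, 1, 0): every comparison overwrites the flag."""
    flag = 0
    for left, right in pairs:
        flag = 1 if left == right else 0
    return flag

def sticky_flag(pairs):
    """Mimics max(flag, if(a == b, 1, 0)): a 1 is never reset to 0."""
    flag = 0
    for left, right in pairs:
        flag = max(flag, 1 if left == right else 0)
    return flag

# The first comparison matches, the later ones do not.
pairs = [("klo", "klo"), ("klo", "hj"), ("klo", "kl")]

print(last_comparison_wins(pairs))  # 0
print(sticky_flag(pairs))           # 1
```

With the overwriting version, only the last comparison in the loop decides the result, so an early match is lost.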