Comparing every row of an exampleset with all the rows in another

Pradyumna_26 Member Posts: 7 Learner II
I have two example sets, say A and B, with the same set of attribute names. Each individual row from A needs to be compared with all rows in B and categorized according to a criterion on a particular attribute. My initial thought was to use a Loop Examples operator to iterate over the rows of A, and within the loop (at every iteration) to retrieve B and apply the Filter Examples operator. The problem is that I couldn't find a way to use macros to set the filter parameter (the attribute value from A in that particular row iteration). This has been a hurdle for my task for quite a few days now, and any help/insight/suggestion would be greatly appreciated!

Answers

  • BalazsBarany Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert Posts: 955 Unicorn
    Hi @Pradyumna_26,

    If the example sets aren't too large, you could use a Cartesian Product (a kind of join that pairs every row of one set with every row of the other), then use Generate Attributes for the necessary comparisons, and finally Filter Examples to keep only what you need.

    If they are too large, you can process A in batches, e.g. of 100 or 1000 rows, joining each batch with the entire B.

    If you want to go with Loop Examples, use Extract Macro inside the loop with the macro type set to data_value and %{example} as the example index.

    Regards,

    Balázs
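
For readers who want to see the logic of the three suggestions above outside of RapidMiner, here is a minimal Python sketch. It is not RapidMiner code and not necessarily how the operators work internally. The data, the attribute names `id` and `value`, and the criterion "A's value is greater than B's value" are all assumptions made for illustration, as are the function names.

```python
from itertools import islice, product

# Hypothetical stand-ins for example sets A and B (same attribute names).
A = [{"id": "a1", "value": 5}, {"id": "a2", "value": 12}, {"id": "a3", "value": 1}]
B = [{"id": "b1", "value": 4}, {"id": "b2", "value": 10}]


def matches(row_a, row_b):
    # Assumed criterion for illustration: A's value exceeds B's value.
    return row_a["value"] > row_b["value"]


def compare_all(rows_a, rows_b):
    """Cartesian product, then comparison, then filter (first suggestion)."""
    return [(a["id"], b["id"]) for a, b in product(rows_a, rows_b) if matches(a, b)]


def batches(rows, size):
    """Yield consecutive chunks of at most `size` rows."""
    it = iter(rows)
    while chunk := list(islice(it, size)):
        yield chunk


def compare_in_batches(rows_a, rows_b, size):
    """Process A in batches, pairing each batch with all of B (second suggestion)."""
    result = []
    for chunk in batches(rows_a, size):
        result.extend(compare_all(chunk, rows_b))
    return result


def compare_by_loop(rows_a, rows_b):
    """Row-by-row loop over A (third suggestion)."""
    result = []
    for a in rows_a:
        # Plays the role of the macro extracted from the current row of A.
        threshold = a["value"]
        result.extend((a["id"], b["id"]) for b in rows_b if threshold > b["value"])
    return result


print(compare_all(A, B))
```

All three functions return the same pairs. They differ only in how many combined rows are held in memory at once, which is the reason batching is suggested for large example sets.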