
Filtering examples based on number of occurrences in attribute

Baski Member Posts: 1 Learner III
edited November 2018 in Help

Hi,

For example, I have examples that contain information about visits. Every visit is assigned to a visitor_id. I want to filter the examples (rows) where the visitor_id occurs more than 5 times. So there will be no more than 4 rows for every visitor_id. I tried Filter Examples, but that was not helpful.

Any idea how to do this in RapidMiner?
Thanks.


Answers

  • IngoRM Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder

    Hi,

     

    While I am pretty sure that the answer to this question will involve the operators "Aggregate", "Pivot", and "Filter Examples", I am unfortunately not sure I fully understood the problem. Can you give us a small data sample (original data) and show what the desired output for this sample should look like?

     

    Merci,

    Ingo

  • sgenzer Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    Hi... no, the Filter Examples operator on its own is not going to help you here (as you saw). The way I see it, you first need to create an attribute that holds the number of occurrences of each visitor_id, and then you can filter for n > 5 or whatever threshold you need. Personally, I would use the Aggregate operator, grouping by visitor_id and aggregating visitor_id with a count. Then join the result with your original data set on the visitor_id attribute.

     

    Scott
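
For readers who want to check the logic outside RapidMiner, the count, join, and filter steps described above can be sketched in plain Python. This is only an illustration of the idea, not a RapidMiner process: the function name `filter_by_occurrences`, the `count` column, and the sample rows are all made up for this example, and the threshold follows the "more than 5 times" wording of the question.

```python
from collections import Counter


def filter_by_occurrences(rows, key="visitor_id", more_than=5):
    """Keep the rows whose key value occurs more than `more_than` times."""
    # Step 1 (like Aggregate): count how often each key value occurs.
    counts = Counter(row[key] for row in rows)
    # Step 2 (like Join): attach the count to every original row.
    joined = [{**row, "count": counts[row[key]]} for row in rows]
    # Step 3 (like Filter Examples): keep rows above the threshold.
    return [row for row in joined if row["count"] > more_than]


# Hypothetical sample data: visitor "a" has 6 visits, visitor "b" has 2.
visits = [{"visitor_id": "a", "visit": i} for i in range(6)]
visits += [{"visitor_id": "b", "visit": i} for i in range(2)]

frequent = filter_by_occurrences(visits)
print(len(frequent))  # 6: only the rows of visitor "a" remain
```

If the goal is instead to drop the frequent visitors and keep the rest, flipping the comparison in step 3 to `<=` should give the opposite selection.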
