How to remove non-duplicate values?
A RapidMiner user wants to know the answer to this question: "Hey! I have a data set of over 42000 records that has several duplicate and unique values. However, I would like to clean it up and remove only non-duplicate values and leave duplicate records. I know the “remove duplicates” operator removes duplicates but in my case, I want to do the opposite. Any idea how I could do this? Thank you."
Answers
Does this help?
Scott
You have 42000 records.
Some are duplicate.
Some are unique.
If you need the non-unique records, the "dup" output port of the Remove Duplicates operator delivers the records that aren't unique.
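For anyone who wants to check the logic outside RapidMiner, here is a minimal sketch in plain Python. It is only an illustration, not how the operator is implemented: the function name `keep_duplicated` and the sample rows are made up, and each record is assumed to be a hashable tuple. This version keeps every copy of a repeated record, including its first occurrence.

```python
from collections import Counter


def keep_duplicated(rows):
    """Return only the rows that occur more than once, in input order."""
    counts = Counter(rows)  # how many times each distinct row appears
    return [row for row in rows if counts[row] > 1]


# Hypothetical sample data: (name, city) records.
sample = [
    ("Ann", "Dortmund"),
    ("Bob", "Boston"),
    ("Ann", "Dortmund"),
    ("Cy", "Berlin"),
]

duplicated = keep_duplicated(sample)
print(duplicated)
```

Rows that appear exactly once ("Bob" and "Cy" in the sample) are dropped, which is the opposite of what a remove-duplicates step returns as its main result.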
Sorry, I got lost in translation and had to reorganize the question because I read it three different ways. Yes, @sgenzer's question is fine. If what is required is an aggregation (for example, the count of duplicated events), what @mschmitz says helps, too.
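If the aggregation reading is the right one, the count of duplicated events can be sketched like this in plain Python. Again, this is only an assumption-laden illustration: `duplicate_counts` and the sample events are hypothetical names and data, and records are assumed to be hashable tuples.

```python
from collections import Counter


def duplicate_counts(rows):
    """Map each row that occurs more than once to its number of occurrences."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


# Hypothetical sample data: (event, day) records.
events = [
    ("login", "mon"),
    ("login", "mon"),
    ("logout", "mon"),
    ("login", "mon"),
]

print(duplicate_counts(events))
```

Records that occur only once do not appear in the result at all, so the output doubles as a list of the distinct duplicated records.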