Keep samples based on preferred attribute value

aileenzhou Member Posts: 12 Contributor II
I have a dataset in which some DOIs are duplicated. For each duplicated DOI I must keep exactly one row, chosen by the 'Source' attribute with the preference B > C > A, and delete the rest.

For example, in the data below I want to keep rows 1261 and 643 and delete the rest.
Row     DOI             Source
18      10.1002/67      A
1261    10.1002/67      B
1400    10.1002/67      C
... ...
643     10.102/et.67    C
1428    10.102/et.67    A

Thank you in advance.

Answers

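  • MartinLiebig Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,533 RM Data Scientist
    Since Remove Duplicates always keeps the first example it finds, I think you can sort the data so that the preferred source comes first and then use Remove Duplicates on the DOI. Note that B > C > A is not alphabetical, so you would need to sort on a helper attribute that encodes the preference (for example B = 1, C = 2, A = 3) rather than on Source directly.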

    Best,
    Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
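
The sort-then-deduplicate idea from the answer above can be sketched outside RapidMiner in plain Python. This is only an illustration of the logic, not a RapidMiner process: the sample rows copy the question's example, and the names `PREFERENCE` and `keep_preferred` are hypothetical helpers introduced here.

```python
# Sample rows copied from the question: (row id, DOI, source).
rows = [
    (18, "10.1002/67", "A"),
    (1261, "10.1002/67", "B"),
    (1400, "10.1002/67", "C"),
    (643, "10.102/et.67", "C"),
    (1428, "10.102/et.67", "A"),
]

# Assumed encoding of the preference B > C > A: lower rank is preferred.
PREFERENCE = {"B": 0, "C": 1, "A": 2}


def keep_preferred(rows, preference=PREFERENCE):
    """Sort by source preference, then keep the first row seen per DOI."""
    # sorted() is stable, so rows with the same rank keep their input order.
    ranked = sorted(rows, key=lambda row: preference[row[2]])
    seen = set()
    kept = []
    for row in ranked:
        doi = row[1]
        if doi not in seen:  # first occurrence is the most preferred source
            seen.add(doi)
            kept.append(row)
    return kept


result = keep_preferred(rows)
print(sorted(row[0] for row in result))  # [643, 1261]
```

The rank dictionary plays the role of the helper attribute you would sort on, and the "first seen per DOI" loop mirrors what a keep-first duplicate removal does after sorting.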