Dynamic Attribute Filter
fstarsinic
Member Posts: 20 Contributor II
When testing, I read data from a CSV. I'd like to limit the samples to a set of categories that is dynamically generated from a training set.
The training set might only have 20 categories but the test set could have 200. I only want to test on the 20.
The rest of the samples will be filtered out.
I read in the training set and extract the category list.
I remove duplicates to now have a unique list of categories.
This is what I want to filter my test set on.
I save the list to a file for later lookup if needed.
Now I'd like to read in the test data, filter on that list of categories, and press on with testing.
How would I do such a thing?
Thanks.
Best Answers
-
fstarsinic
Member Posts: 20 Contributor II
I realized I could solve this by taking the unique list of categories and performing an inner join (operator) with the test set, using the category column as the key attribute. That removes all the unwanted samples. Easy!
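For readers working outside the RapidMiner GUI, the same idea can be sketched in plain Python. This is only an illustration under stated assumptions, not the RapidMiner process itself: the sample data, the column names `category` and `text`, and the helper names `read_rows`, `unique_categories`, and `inner_join_on_key` are all hypothetical. Because the category list is unique and has no other columns, an inner join on the category key reduces to keeping the test rows whose category appears in that list.

```python
import csv
import io

# Hypothetical sample data; the column names "category" and "text" are assumptions.
TRAIN_CSV = """category,text
spam,buy now
ham,see you at lunch
spam,limited offer
"""

TEST_CSV = """category,text
spam,cheap pills
ham,meeting moved to 3pm
promo,new catalog
social,friend request
"""


def read_rows(csv_text):
    """Parse CSV text into a list of dicts, one per sample."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def unique_categories(rows, key="category"):
    """Return the distinct category values, in first-seen order."""
    return list(dict.fromkeys(row[key] for row in rows))


def inner_join_on_key(left_rows, right_keys, key="category"):
    """Keep only left rows whose key value appears in right_keys.

    right_keys is unique and carries no extra columns, so this matches
    an inner join on the key: no row is duplicated, none gains columns.
    """
    lookup = set(right_keys)
    return [row for row in left_rows if row[key] in lookup]


train_rows = read_rows(TRAIN_CSV)
test_rows = read_rows(TEST_CSV)

# Step 1: unique category list from the training set.
categories = unique_categories(train_rows)

# Step 2: join the test set against that list; unseen categories drop out.
filtered_test = inner_join_on_key(test_rows, categories)

for row in filtered_test:
    print(row["category"], "|", row["text"])
```

With this sample data, the `promo` and `social` rows are dropped because those categories never occur in the training set. With real files, `read_rows` could be swapped for `csv.DictReader` over an opened file.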
-
MartinLiebig
Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,533 RM Data Scientist
Hi @fstarsinic,
this is a great solution, and hopefully I would have recommended it too if I had seen this earlier! Beautiful!
Best,
Martin
- Sr. Director Data Solutions, Altair RapidMiner -
Dortmund, Germany