Perform text classification with separate test/train splits?
kashif_khan
Member Posts: 19 Contributor II
Hi,
I am new to RapidMiner and am working on text classification. I have separate train and test splits, and I want to select the top k features by information gain (i.e., the features with the highest information gain). Without feature selection, the word list output of the "Process Documents From Files" operator used for the training set is passed to the word list input of the "Process Documents From Files" operator that loads the test set. How can I do the same when feature selection is applied to the training set, so that the reduced feature set is used as the vocabulary for the test split?
Please help. I have searched a lot online, but every example I found uses n-fold cross-validation, and I could not figure out how to do this with dedicated train/test splits.
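To make the intended workflow concrete outside RapidMiner, here is a minimal sketch in plain Python. It is not a RapidMiner process: the function names (`select_top_k`, `vectorize`, and so on) are hypothetical, tokenization is a simple whitespace split, and information gain is computed on binary term presence, all of which are assumptions made for illustration. The point it shows is that the vocabulary is scored and reduced on the training split only, and the same reduced vocabulary is then applied to the test split.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split stands in for real preprocessing.
    return text.lower().split()


def entropy(counts):
    # Shannon entropy (in bits) of a collection of class counts.
    counts = list(counts)
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(docs, labels, term):
    # Information gain of the binary feature "term occurs in the document".
    with_term, without_term = Counter(), Counter()
    for tokens, label in zip(docs, labels):
        (with_term if term in tokens else without_term)[label] += 1
    h_class = entropy(Counter(labels).values())
    h_cond = 0.0
    for part in (with_term, without_term):
        size = sum(part.values())
        if size:
            h_cond += (size / len(docs)) * entropy(part.values())
    return h_class - h_cond


def select_top_k(train_texts, train_labels, k):
    # Score every training term and keep the k best (ties broken alphabetically).
    docs = [set(tokenize(t)) for t in train_texts]
    vocab = set().union(*docs)
    ranked = sorted(vocab, key=lambda t: (-information_gain(docs, train_labels, t), t))
    return ranked[:k]


def vectorize(texts, vocabulary):
    # Term-count vectors restricted to the given vocabulary; unseen terms are dropped.
    vectors = []
    for text in texts:
        counts = Counter(tokenize(text))
        vectors.append([counts[term] for term in vocabulary])
    return vectors


# Toy data, invented purely for illustration.
train_texts = [
    "cheap pills buy now",
    "buy cheap watches",
    "meeting agenda for monday",
    "monday project meeting notes",
]
train_labels = ["spam", "spam", "ham", "ham"]
test_texts = ["buy cheap tickets", "notes from the meeting"]

# 1) Select features on the training split only.
vocabulary = select_top_k(train_texts, train_labels, k=4)
# 2) Build both splits from the same reduced vocabulary.
train_vectors = vectorize(train_texts, vocabulary)
test_vectors = vectorize(test_texts, vocabulary)
```

Because the test split never contributes to the scoring step, no information leaks from test to train, and both splits end up with identical columns.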
Answers