IDF Calculation for Test Set

smjsmj1 Member Posts: 3 Contributor I
edited November 2018 in Help
Can anyone explain how the IDF value is calculated for test sets?
Is it based on the IDF of the training set?
I see that the test set takes only the word list used by the training set, and the IDF is calculated solely from the test set. So, if the test set contains only one document, there is a chance that the IDF becomes 0, correct?

Answers

  • fras Member Posts: 93 Contributor II
    If you are using TF-IDF, you must store the model _and_ the wordlist after training.
    To test or score unseen data, you have to preprocess it with exactly the same
    "Process Documents" operator that you used for training, including the wordlist.
  • smjsmj1 Member Posts: 3 Contributor I
    Thank you for the reply
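
To illustrate the point made in the answer, here is a minimal plain-Python sketch of fitting IDF weights on the training documents and reusing them to score a test document. It is not RapidMiner's implementation: the formula idf = log(N / df), the function names `fit_idf` and `tfidf`, and the toy documents are all assumptions made for illustration. The last line also shows the case from the question, where IDF is computed from a single test document on its own.

```python
import math
from collections import Counter

def fit_idf(train_docs):
    """Build a word list with IDF weights from tokenized documents.

    Assumes idf = log(N / df); the tool's exact formula may differ.
    """
    n_docs = len(train_docs)
    doc_freq = Counter()
    for tokens in train_docs:
        doc_freq.update(set(tokens))  # count each term once per document
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}

def tfidf(tokens, idf):
    """Weight one document with a stored IDF table; words not in the table are dropped."""
    counts = Counter(t for t in tokens if t in idf)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {term: (count / total) * idf[term] for term, count in counts.items()}

# Hypothetical toy data, already tokenized.
train_docs = [
    ["cheap", "pills", "offer"],
    ["meeting", "agenda", "offer"],
    ["cheap", "flights"],
]
test_doc = ["cheap", "offer", "unseen"]

# Training: fit and store the word list with its IDF weights.
train_idf = fit_idf(train_docs)

# Scoring: reuse the stored weights, so a single test document still gets non-zero values.
test_vector = tfidf(test_doc, train_idf)

# For comparison: IDF fitted on the lone test document is log(1 / 1) = 0 for every term.
self_idf = fit_idf([test_doc])
```

Under these assumptions, `test_vector` keeps only the words that were in the training word list, with weights taken from the training set, while every value in `self_idf` is 0. That is why the stored wordlist has to be passed along when scoring.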