Process Documents from Data
Hello,
I have a question regarding the "Process Documents from Data" and "Generate TF-IDF" operators.
What is the difference between the vector created by "Process Documents from Data" with vector creation set to TF-IDF and the one produced by "Generate TF-IDF"? They give different end values, although both should give the TF-IDF. So if I want the TF-IDF, should I use the designated operator or the vector creation set to TF-IDF?
Thanks
-Prentice
Best Answer
MartinLiebig Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,533 RM Data Scientist
Hi,
I would opt for Process Documents over Generate TF-IDF. Generate TF-IDF does not normalize the vector, which is why the results differ.
Best,
Martin
- Sr. Director Data Solutions, Altair RapidMiner -
Dortmund, Germany
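To illustrate the point about normalization, here is a minimal Python sketch. It does not reproduce either RapidMiner operator: the thread does not state the exact formulas they use, so the textbook definitions (relative term frequency, idf = log(N / df), and scaling each document vector to unit Euclidean length) are assumptions, and the function name `tfidf_vectors` is hypothetical. It only shows that the same TF-IDF weights come out with different end values once the vector is normalized.

```python
import math

def tfidf_vectors(docs, normalize):
    """Return one {term: weight} dict per tokenized document.

    Assumed textbook formulas, not RapidMiner's internals:
    tf = term count / document length, idf = log(N / df).
    """
    n_docs = len(docs)

    # Document frequency: number of documents that contain each term.
    df = {}
    for doc in docs:
        for term in set(doc):
            df[term] = df.get(term, 0) + 1

    vectors = []
    for doc in docs:
        vec = {}
        for term in set(doc):
            tf = doc.count(term) / len(doc)
            idf = math.log(n_docs / df[term])
            vec[term] = tf * idf
        if normalize:
            # Scale the document vector to unit Euclidean length.
            length = math.sqrt(sum(w * w for w in vec.values()))
            if length > 0:
                vec = {t: w / length for t, w in vec.items()}
        vectors.append(vec)
    return vectors

docs = [
    ["data", "mining", "text"],
    ["text", "mining", "mining"],
    ["data", "science"],
]

raw = tfidf_vectors(docs, normalize=False)   # unnormalized weights
unit = tfidf_vectors(docs, normalize=True)   # unit-length document vectors

for r, u in zip(raw, unit):
    print(sorted(r.items()))
    print(sorted(u.items()))
```

Within each document the relative ordering of the terms is the same in both outputs; only the scale differs, which is consistent with the two operators reporting different end values for the same text.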
Answers