Can Process Documents calculate term occurrences of all words without having to give it a word list?
JeremyMTMD
Member Posts: 2 Learner I
in Help
I want Process Documents to calculate term occurrences for ALL the words in the document I send it, but I don't want to have to write them all out manually. If someone has a solution, I would gladly take it!
Answers
There is a text mining extension in the Marketplace for that, named Text Processing.
The RapidMiner Academy has good learning materials about it:
https://academy.rapidminer.com/learn/course/text-and-web-mining-with-rapidminer/text-and-web-mining/lets-get-started
Best,
Cesar
I'm already using this extension. My problem is that I can't seem to use the Process Documents operator to calculate term occurrences. It tells me I need a word list, but I don't know how to create one or where to find one.
I would like an operator that can just calculate the term frequency of a tokenized, stemmed, and filtered text so I can see which words occur the most. If someone knows of a way to do something like this, I would like to learn about it!
Thanks!
Jérémy
You get a wordlist from the Process Documents ... operators.
See the attached example:
The "wor" output of Process Documents is a wordlist. It is a special data structure: you can, for example, store it and apply it to future Process Documents operations. If you just need the data (word occurrences), use WordList to Data.
Regards,
Balázs
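For readers who want to see what such a word list amounts to conceptually, here is a minimal Python sketch. It is not RapidMiner code and does not reproduce the exact output of WordList to Data; the function names, the tiny stopword set, and the sample documents are all made up for illustration, and stemming is left out to keep it short. The point is only that the vocabulary is discovered from the text itself, so no word list has to be supplied up front.

```python
import re
from collections import Counter

# Hypothetical stopword set, for illustration only (not RapidMiner's built-in list).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_word_list(documents):
    """Count term occurrences over all documents.

    The vocabulary comes from the documents themselves, so no
    predefined word list is required.
    """
    total = Counter()      # total occurrences of each term
    doc_count = Counter()  # number of documents containing each term
    for doc in documents:
        tokens = [t for t in tokenize(doc) if t not in STOPWORDS]
        total.update(tokens)
        doc_count.update(set(tokens))
    # Rows sorted by descending total occurrences, then alphabetically.
    return sorted(
        ((term, total[term], doc_count[term]) for term in total),
        key=lambda row: (-row[1], row[0]),
    )


docs = ["The cat sat on the mat.", "A cat and a dog sat in the sun."]
for term, occurrences, in_docs in build_word_list(docs):
    print(term, occurrences, in_docs)
```

Each row holds a term, its total number of occurrences, and the number of documents it appears in, which is roughly the kind of table you would sort to find the most frequent words.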