"pdf tokenization (?)"
Hello guys,
I am totally new here and to RapidMiner!
I have an assignment to get done, so there is not much time for me to explore RapidMiner. I will post my question here and hope to find the answer. It might be trivial; I apologise for that.
I have several PDF files. I want to tokenize them, i.e. to see every word that appears and how many times each word appears.
For example, let's assume that a PDF contains the word "process". I want to see how many times this word appears, and I want to do the same for all the words in the PDF file. Is tokenization what I need to do? If yes, how do I do it? If not, what do you propose?
Thank you in advance!
Answers
To get the aforementioned operators, you have to install the Text Processing extension.
Best, Marius
And what should the process output be?
Sorry for the stupid questions. I am completely new to this.
These are very important concepts which are rather easy to understand, but hard to explain here in text form. I would like to point you to the video tutorials on our website; there is one complete section about text processing.
You'll find the link to the tutorials in the post linked in my signature.
Happy Mining!
-Marius
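For readers who just want to see what "tokenize and count" means for the question above, here is a minimal sketch in plain Python. It is not the RapidMiner process discussed in this thread and does not use the Text Processing extension. It assumes the text has already been extracted from each PDF by some other tool; the file names, the sample sentences, and the helper names `tokenize` and `word_counts` are all made up for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens (runs of letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def word_counts(documents):
    """Map each document name to a Counter of token -> number of occurrences."""
    return {name: Counter(tokenize(text)) for name, text in documents.items()}


# Hypothetical stand-in for text already extracted from two PDF files.
documents = {
    "a.pdf": "The process starts. Each process reads a file.",
    "b.pdf": "A second file describes the same process.",
}

counts = word_counts(documents)

# Counts summed over all documents.
total = sum(counts.values(), Counter())

print(counts["a.pdf"]["process"])  # occurrences of "process" in a.pdf
print(total.most_common(3))        # the three most frequent words overall
```

The result is one table of word counts per document plus a combined table, which is roughly the kind of output the original question asks for: each distinct word and how often it appears.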