Sentiment analysis on multiple PDFs
I am a master's student in Business Economics and I'm new to RapidMiner.
For my thesis, I have to pre-process multiple PDF files by tokenizing, stemming, transforming cases, etc.
When I do this for one file, I get the outcome I want: the processed text. But when I use the loop function to process multiple PDFs, the output is never text, only tables of word counts.
How do I pre-process multiple PDF files and get all the processed texts?
Thank you for helping!
Answers
You should start by using the Process Documents from Files operator.
It reads the PDFs in your folders and outputs an example set containing all the pre-processed files, after applying every text-mining step you place inside it.
Here is a link to the course on the Academy:
https://academy.rapidminer.com/learn/course/text-and-web-mining-with-rapidminer/text-and-web-mining/comparison-classification-and-clustering?page=3
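To make the idea concrete outside RapidMiner, here is a rough Python sketch of what "one row per document, keeping the processed text" looks like. It is only an analogy, not how the operator is implemented. Everything in it is an assumption for illustration: the file names, the sample sentences, the function names, and the very naive suffix stripper that stands in for a real stemmer. It also assumes the text has already been extracted from each PDF.

```python
import re

# Hypothetical suffix list for a deliberately naive stemmer (not Porter).
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(token):
    """Strip at most one common suffix; a stand-in for a real stemmer."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Transform case, tokenize on letters, stem, and rejoin as text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return " ".join(naive_stem(t) for t in tokens)


def process_documents(raw_texts):
    """Return one row per document, keeping the processed text itself
    instead of collapsing it into a word-count table."""
    return [
        {"file": name, "text": preprocess(text)}
        for name, text in raw_texts.items()
    ]


# Hypothetical input: text already extracted from each PDF.
raw_texts = {
    "report_a.pdf": "Markets reacted strongly.",
    "report_b.pdf": "Investors were worried about rising prices.",
}
rows = process_documents(raw_texts)
for row in rows:
    print(row["file"], "->", row["text"])
```

The point of the sketch is the shape of the result: each document stays a separate row with its processed text attached, so a sentiment step can run on the text afterwards rather than on word counts alone.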