How to read a PDF file in RapidMiner
KanikaAg15
Member Posts: 19 Learner II
in Help
Hi,
I have a PDF file that contains both text and tabular content. I would like to build a pipeline that reads only specified tables from the PDF. Can anyone recommend a process for this?
The first constraint is reading the PDF into RapidMiner.
The second constraint is extracting the information from the PDF.
Best Answer
MarcoBarradas Administrator, Employee-RapidMiner, RapidMiner Certified Analyst, Member Posts: 272 Unicorn
Hi @KanikaAg15,
You'll need to add the Text Processing extension, which will let you extract the data from the PDF.
There is another extension that might be useful: PDF Table Extraction.
This course will also be useful: https://academy.rapidminer.com/learn/course/text-and-web-mining-with-rapidminer/text-and-web-mining/lets-get-started
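
For the second constraint, here is a rough sketch of how one specified table could be picked out of text that has already been extracted from the PDF. It is plain Python and does not use either extension's API. The function name `extract_table`, the sample text, and the assumption that columns are separated by two or more spaces (or a tab) are all hypothetical and will likely need adjusting to the real layout of your file.

```python
import re


def extract_table(text, header_keyword, min_columns=2):
    """Return the rows of the first table whose header line contains header_keyword.

    Assumption: columns in the extracted text are separated by runs of
    two or more whitespace characters, or by a tab.
    """
    rows = []
    in_table = False
    for line in text.splitlines():
        # Split the line into cells on wide gaps; drop empty fragments.
        cells = [c.strip() for c in re.split(r"\s{2,}|\t", line.strip()) if c.strip()]
        if not in_table:
            # The table starts at the first multi-column line naming the header.
            if header_keyword in line and len(cells) >= min_columns:
                in_table = True
                rows.append(cells)
        elif len(cells) >= min_columns:
            rows.append(cells)
        else:
            # The first line that is not multi-column ends the table.
            break
    return rows


# Hypothetical text, standing in for what a PDF reader would return.
sample_text = """Quarterly report
Some introductory paragraph.

Region    Units    Revenue
North     120      3400
South     95       2750

Closing remarks follow the table.
"""

table = extract_table(sample_text, "Region")
for row in table:
    print(row)
```

Selecting the table by a keyword in its header row keeps the rule simple. Tables with merged cells, or with rows that wrap across lines, would need a more careful approach.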