How to loop through pictures for text recognition
Hi everyone,
I am new to RapidMiner and would appreciate any help you can provide. I have a database with a field of URLs, and every URL points to a picture. I need a process that extracts the text from the image behind each URL, for every row in my dataset, without my having to click on the URLs manually. My dataset has hundreds of thousands of rows.
Answers
One possible workflow would be to use RapidMiner to loop over all of your database records -> the Web Mining extension to download each image and store it locally -> Python, using for instance OpenCV, to read the image -> pytesseract to run the OCR and get the text -> return the text to RapidMiner and continue with the next image.
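For the Python part of that workflow, a rough sketch could look like the one below. It assumes the images are already stored locally and that the opencv-python and pytesseract packages (plus the Tesseract engine itself) are installed. The function names `tesseract_ocr` and `ocr_images` and the row layout are made up for illustration and are not part of RapidMiner or either library.

```python
from typing import Callable, Dict, Iterable, List


def tesseract_ocr(image_path: str) -> str:
    """Read one local image with OpenCV and return the text pytesseract finds."""
    # Imported lazily so the module still loads where these packages are missing.
    import cv2
    import pytesseract

    image = cv2.imread(image_path)
    if image is None:
        # cv2.imread returns None instead of raising when a file cannot be read.
        raise ValueError(f"could not read image: {image_path}")
    # Converting to grayscale is a common preprocessing step before OCR.
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    return pytesseract.image_to_string(gray).strip()


def ocr_images(
    paths: Iterable[str],
    ocr: Callable[[str], str] = tesseract_ocr,
) -> List[Dict[str, str]]:
    """Return one row per image with its path, extracted text, and any error."""
    rows = []
    for path in paths:
        try:
            rows.append({"path": str(path), "text": ocr(str(path)), "error": ""})
        except Exception as exc:
            # With hundreds of thousands of images, record the failure and move on
            # rather than stopping the whole run on one bad file.
            rows.append({"path": str(path), "text": "", "error": str(exc)})
    return rows
```

Inside RapidMiner's Execute Python operator, something like `ocr_images(data["path"])` could be called from the script's `rm_main(data)` function and the resulting text written back into a column of the data frame before returning it. The column name `path` is an assumption here.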
Hi Kayman, thank you for your help! Can you be more specific about how to download the images? I used the Get Pages operator and I don't see any option to download the images from the URLs.