Web Scraping - dynamic content
Hello folks,
We're trying to create a text mining project for our university class. Our goal is to scrape our university's course descriptions and search Udemy for the best-matching courses. So far so good, but I've realized that a saved HTML file of a Udemy course is missing relevant information such as the "price" tag. Do you have any idea how to scrape this missing information?
Best regards
Patrick
Best Answer
JEdward
To save you time and effort, I recommend using ParseHub to build the scraper for Udemy.
The instructions are really simple, and the free version allows you to pull in 200 pages per run (which should be enough for your task, but you can also contact them to ask about their academic program).
Best of all, once you build your scraper it has a REST API, which means you can call it from your RapidMiner process and get the results back directly.
https://www.parsehub.com/
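In case it helps, here is a rough sketch of what fetching the scraper's results over that REST API might look like from a script. It is written in Python with only the standard library. The endpoint layout (`/api/v2/projects/<token>/last_ready_run/data` with `api_key` and `format` parameters) is my understanding of ParseHub's public API and should be checked against their current documentation. The project token, API key, and the `courses`, `title`, and `price` field names are hypothetical placeholders that depend on how the scraper is set up.

```python
import gzip
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: base URL and path layout of ParseHub's API v2; verify before use.
API_BASE = "https://www.parsehub.com/api/v2"


def build_results_url(project_token: str, api_key: str) -> str:
    """Build the URL for the data of the most recent finished run."""
    query = urlencode({"api_key": api_key, "format": "json"})
    return f"{API_BASE}/projects/{project_token}/last_ready_run/data?{query}"


def decode_body(raw: bytes) -> dict:
    """Decode a JSON response body, unzipping it first if it is gzip-compressed."""
    if raw[:2] == b"\x1f\x8b":  # gzip magic number
        raw = gzip.decompress(raw)
    return json.loads(raw.decode("utf-8"))


def fetch_results(project_token: str, api_key: str) -> dict:
    """Download and decode the scraped data (needs network access and real credentials)."""
    request = Request(build_results_url(project_token, api_key))
    with urlopen(request, timeout=30) as response:
        return decode_body(response.read())


def extract_prices(data: dict, list_key: str = "courses") -> dict:
    """Map course title to price.

    The keys "courses", "title" and "price" are hypothetical; they depend on
    the names given to the selections when the scraper was built.
    """
    return {row["title"]: row.get("price") for row in data.get(list_key, [])}
```

With real credentials, `extract_prices(fetch_results(token, key))` would give a title-to-price mapping that can be joined with the university course descriptions. Courses for which the scraper found no price come back as `None` rather than raising an error.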
Answers