Crawl web not fetching any data from IMDB

kartikss7 Member Posts: 1 Learner III
edited July 2019 in Help
Hello All,
I am using the Crawl Web operator in RapidMiner to fetch the reviews of a specific movie. The main process contains a Loop operator, and inside the loop I use Generate Macro and Crawl Web.

Loop -
Iterations: 74 (there are 74 pages of reviews for http://www.imdb.com/title/tt0454876/reviews?start=1)
Limit Time: Checked
Timeout: 60

Inside the Loop operator, Generate Macro and Log are connected from input to output, and Crawl Web runs in parallel, connected to the output.

Generate Macro -
Function descriptions: macro name - pagePos, function expression - %{pagePos} + 10 (I have initialized this macro beforehand)

Crawl Web -
URL: http://www.imdb.com/title/tt0454876/reviews?start=%{pagePos}
Crawling rules -
follow_link_with_matching_url - .+reviews.+
store_with_matching_url - .+reviews.+
write pages into file: checked
add pages as attribute: checked
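
To check the setup outside RapidMiner, the loop and the crawling rule can be reproduced in a few lines of Python. This is only a rough sketch, not the RapidMiner process itself. It assumes pagePos was initialized to 1 (the post does not state the initial value), that the URL is built before the macro is incremented, and that the rule has to match the whole URL. The names `page_urls`, `BASE_URL` and `RULE` are made up for illustration.

```python
import re

# URL template and crawling rule copied from the Crawl Web settings above.
BASE_URL = "http://www.imdb.com/title/tt0454876/reviews?start={}"
RULE = re.compile(r".+reviews.+")


def page_urls(iterations=74, start=1, step=10):
    """Yield the URL each loop iteration would request.

    start=1 is an assumption; step=10 mirrors %{pagePos} + 10.
    """
    page_pos = start
    for _ in range(iterations):
        yield BASE_URL.format(page_pos)
        page_pos += step


urls = list(page_urls())
# Assumption: the rule must match the entire URL, hence fullmatch.
matches = [bool(RULE.fullmatch(u)) for u in urls]

print(urls[0])
print(urls[-1])
print(all(matches))
```

If every generated URL matches the rule, the rule itself is probably not what is filtering out the pages, and the problem is more likely in how the pages are fetched.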


When I run this process I get no observations in output. Please help me.

Answers

  • MartinLiebig Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,533 RM Data Scientist
    Probably a broken link? I would wrap the Crawl Web operator in a Handle Exception operator.

    ~Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
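
For readers who want to see the idea behind Handle Exception in code form, here is a minimal Python sketch of the same pattern: try to fetch a page, and if the request fails, log it and carry on instead of aborting the whole loop. It is an analogy, not RapidMiner's implementation. The function name `fetch_or_none`, the `opener` parameter and the 30-second timeout are assumptions chosen for illustration.

```python
from urllib.error import URLError
from urllib.request import urlopen


def fetch_or_none(url, opener=urlopen, timeout=30):
    """Return the page body as bytes, or None if the request fails.

    `opener` is injectable (hypothetical parameter) so the function can be
    exercised without network access.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.read()
    except (URLError, OSError, ValueError) as exc:
        # A broken link ends up here; the caller can skip this page.
        print(f"skipped {url}: {exc}")
        return None
```

A loop over the review pages can then skip any URL for which `fetch_or_none` returns None, which also makes it visible which links are failing.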