
Crawl web operator does not return any results

Amiro Member Posts: 3 Learner I
edited January 2019 in Help
Hi, I am having issues with the simplest possible web crawler: the Crawl Web operator does not return any results at all. I have not defined any rules; it is just a simple web crawl out of the box. What am I doing wrong here?



Answers

  • Amiro Member Posts: 3 Learner I
    I have attached my .rmp file. Any help would be appreciated. Many thanks.
  • sgenzer Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager
    Hi @Amiro, that website does not have much content and it is very heavily stylized. My guess is that the site is blocking very basic crawlers like "Crawl Web". You're better off using Get Page, grabbing the URLs from the result, and so on.

    Scott

  • Telcontar120 RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    Or it could be related to the https problem that came up in the other thread as well. At this point I think the Crawl Web operator is not a totally reliable option. Using Get Page within some kind of loop is probably a better alternative.
    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
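
The "fetch one page, grab its URLs, loop" approach suggested in the answers can be sketched outside RapidMiner as well. The Python below is only an illustration of that idea, not the Get Page operator itself and not anything taken from the attached process. The names `LinkExtractor`, `fetch`, and `crawl`, the User-Agent string, and the `max_pages` limit are all assumptions made for this example. Sending a browser-like User-Agent is one guess at how to get past a site that rejects very basic crawlers; it may not be enough for every site.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import Request, urlopen


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch(url, timeout=10):
    """Download one page as text (hypothetical stand-in for a "get page" step)."""
    # Assumption: a browser-like User-Agent helps with sites that block basic crawlers.
    request = Request(url, headers={"User-Agent": "Mozilla/5.0 (compatible; example-crawler)"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def crawl(start_url, max_pages=10, fetch_page=fetch):
    """Breadth-first crawl that stays on the start URL's host.

    Returns a dict mapping each visited URL to its HTML text.
    """
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        try:
            html = fetch_page(url)
        except Exception as error:  # network failures, HTTP errors, TLS problems
            print(f"skipping {url}: {error}")
            continue
        pages[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" parts before comparing.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return pages
```

Calling `crawl("https://example.com/", max_pages=5)` would visit at most five pages on that host. Passing a different `fetch_page` function makes it possible to try the loop against canned HTML before pointing it at a real site.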