"Web Mining Get pages"

AB26511 Member Posts: 11 Contributor II
edited June 2019 in Help
Hi All,

I am trying to extract content from the web using the Get Pages and Crawl Web operators, but I get the following warning:
"Jun 26, 2013 11:09:15 AM WARNING: Failed to get HTTP input stream, trying error stream."

While importing the links from Excel, I made all the necessary settings, such as selecting the "file path" option for attributes and setting the link attribute of "Get Pages" to the column heading from the Excel file.


Please advise.

Regards,

AB

Answers

  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Hey AB,

    Is your internet connection stable? Are the sites you are trying to crawl well connected, and do they respond without timing out from time to time?

    If the answer to either question is no, then you should try to do something about your connection.

    Otherwise, are there some URLs for which this error can be reliably reproduced?

    Best regards,
    Marius
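
One way to check whether the warning can be reliably reproduced for specific URLs is to probe each link several times outside RapidMiner. The following is only a minimal sketch in Python using the standard library; the function names `probe` and `find_failing` are hypothetical, and it assumes (without confirmation from the thread) that the warning appears when a server answers with an HTTP error status or does not answer at all.

```python
import urllib.request
import urllib.error


def probe(url, timeout=10):
    """Return (status, detail); status is None if no HTTP response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, "ok"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 403 or 404.
        return exc.code, str(exc.reason)
    except (urllib.error.URLError, OSError, ValueError) as exc:
        # DNS failure, refused connection, timeout, or malformed URL.
        return None, str(exc)


def find_failing(urls, attempts=3, probe_fn=probe):
    """Probe each URL several times and collect every failed attempt."""
    failing = {}
    for url in urls:
        results = [probe_fn(url) for _ in range(attempts)]
        bad = [r for r in results if r[0] is None or r[0] >= 400]
        if bad:
            failing[url] = bad
    return failing
```

Calling `find_failing` with the links from the Excel column returns a dictionary of the URLs that failed at least once. A URL that fails on every attempt points to a problem with that site or link, while one that fails only occasionally points more toward a connection or timeout issue.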