
"Web Crawl wikipedia"

maccten Member Posts: 28 Contributor II
edited June 2019 in Help
Hi All

I hope you can help me.
I have a page on Wikipedia that I would like to crawl: https://en.wikipedia.org/wiki/2012_in_film
I want to extract the details that determined whether each film was a success or a bust, i.e. text mine the wiki page of each film.
Based on this information, I would like to predict whether the movies due to be released in 2013 are going to be a success.

My problem is that I'm unable to crawl the top-level page and the pages it links to (i.e. each movie): the crawl returns no records or files.
I can crawl wikipedia.org itself without any issues.
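
To make the goal concrete, here is a minimal Python sketch of the link-following step, written outside RapidMiner. It is only an illustration under stated assumptions, not the RapidMiner process in question: the names `WikiLinkParser`, `extract_article_links` and `NAMESPACE_PREFIXES` are made up for this example, and it assumes that the film pages are linked from the list page as relative `/wiki/...` hrefs. It parses HTML that has already been downloaded and does no fetching itself.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# The list page from the question; relative links are resolved against it.
BASE_URL = "https://en.wikipedia.org/wiki/2012_in_film"

# Hypothetical, incomplete list of non-article namespaces to skip.
NAMESPACE_PREFIXES = (
    "File:", "Category:", "Help:", "Special:",
    "Wikipedia:", "Template:", "Talk:", "Portal:",
)


class WikiLinkParser(HTMLParser):
    """Collects absolute URLs of article links found in <a href="/wiki/..."> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Keep only site-relative article links.
        if not href or not href.startswith("/wiki/"):
            return
        title = href[len("/wiki/"):]
        # Skip namespaced pages such as images and categories.
        if title.startswith(NAMESPACE_PREFIXES):
            return
        url = urljoin(BASE_URL, href)
        # Preserve first-seen order and drop duplicates.
        if url not in self.links:
            self.links.append(url)


def extract_article_links(html):
    """Return the de-duplicated article URLs linked from the given HTML string."""
    parser = WikiLinkParser()
    parser.feed(html)
    return parser.links
```

The returned list is the set of pages a crawler would have to visit next, so comparing it with what the crawl actually follows may help narrow down where the links are being dropped.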

Does anyone know what the problem is?

Thanks for your time

Answers

  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Hi,

    As (almost) always, we can't help you if we don't know how you configured your operators. Please post your process XML as described in my signature.

    Best regards,
    Marius