"Web Crawl Wikipedia"
Hi all,
I hope you can help me.
I have a page on Wikipedia that I would like to crawl: https://en.wikipedia.org/wiki/2012_in_film
I want to extract the details that determined whether each film was a success or a bust, i.e. text mine the wiki page of each film.
Based on this information, I would like to predict whether the movies due to be released in 2013 will be a success.
My problem is that I'm unable to crawl this top-level page and the pages it links to (i.e. each movie): the crawl returns no records or files.
I can crawl wikipedia.org without any issues.
Does anyone know what the problem is?
Thanks for your time.
Answers
As (almost) always, we can't help you if we don't know how you configured your operators. Please post your process XML as described in my signature.
Best regards,
Marius
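
While the process XML is pending, one way to sanity-check the crawl outside RapidMiner is to list the links a page actually exposes and see which of them a URL rule would accept. The sketch below is only an illustration and is not taken from the thread: the names `LinkCollector`, `article_links`, and `SAMPLE_HTML`, the sample markup, and the URL pattern are all assumptions. It relies on the fact that Wikipedia article links are written as relative paths such as `/wiki/Some_Title`, so they have to be resolved against the page URL before a rule on absolute URLs can match them. It is not a diagnosis of the problem described above.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects every <a href="..."> on a page as an absolute URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Relative links such as /wiki/Title are resolved against the page URL.
            self.links.append(urljoin(self.base_url, href))


def article_links(html, base_url,
                  pattern=r"https://en\.wikipedia\.org/wiki/[^:#?]+"):
    """Return the unique links that fully match the (assumed) article URL rule."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    rule = re.compile(pattern)
    kept = []
    for url in collector.links:
        if rule.fullmatch(url) and url not in kept:
            kept.append(url)
    return kept


# Made-up markup for illustration; it is not copied from the real page.
SAMPLE_HTML = """
<ul>
  <li><a href="/wiki/Example_Film">Example Film</a></li>
  <li><a href="/wiki/File:Poster.jpg">poster</a></li>
  <li><a href="#cite_note-1">[1]</a></li>
  <li><a href="https://example.org/">external site</a></li>
</ul>
"""

if __name__ == "__main__":
    for url in article_links(SAMPLE_HTML, "https://en.wikipedia.org/wiki/2012_in_film"):
        print(url)
```

The assumed pattern deliberately rejects namespaced pages (anything with a colon, such as `File:`), in-page anchors, and external sites, so only article pages would be followed. If a comparable rule in the crawl configuration matches nothing, the crawl would come back empty, which is one reason the configured rules are worth posting.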