"Web crawling a newspaper search engine that uses a POST method form and a JavaScript variable"
thankyourapid-i
Member Posts: 1 Learner III
Dear Rapid-I,
I have succeeded in populating a database for subsequent analysis using your amazing data mining tool (RapidMiner!).
Now, for a scientific research project, I need to collect earthquake-related Italian articles from a freely available newspaper archive search engine:
http://sitesearch.corriere.it/siteSearchEngine?q=terremoto%20scosse
Searching for the words "terremoto scosse" returns 670 articles.
The pagination system uses a JavaScript script to set the pageNumber variable.
The form submits via the POST method with hidden input variables rather than via GET, which complicates crawling the articles.
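To make the problem concrete, here is a rough Python sketch (outside RapidMiner) of the requests I think the pagination ends up sending. It only builds the POST requests and does not send them. Several things are my assumptions and not confirmed by the site: that the search term is posted in a field named `q` (I only know it from the URL), that `pageNumber` starts at 1, that there are 10 results per page, and that any other hidden inputs can be passed through the hypothetical `hidden_fields` argument.

```python
import math
from urllib.parse import urlencode
from urllib.request import Request

SEARCH_URL = "http://sitesearch.corriere.it/siteSearchEngine"


def build_page_request(query, page_number, hidden_fields=None):
    """Build (but do not send) a POST request for one results page."""
    # hidden_fields is a placeholder for whatever hidden inputs the real
    # form carries; their names are unknown to me.
    fields = dict(hidden_fields or {})
    fields["q"] = query  # assumed field name, taken from the URL
    fields["pageNumber"] = str(page_number)  # normally set by the JavaScript
    body = urlencode(fields).encode("utf-8")
    return Request(SEARCH_URL, data=body, method="POST")


def page_count(total_results, results_per_page):
    """Number of pages needed to list all results."""
    return math.ceil(total_results / results_per_page)


if __name__ == "__main__":
    n_pages = page_count(670, 10)  # 10 results per page is a guess
    for page in range(1, n_pages + 1):  # starting at 1 is also a guess
        req = build_page_request("terremoto scosse", page)
        # urllib.request.urlopen(req) would actually send the request.
        if page <= 3:
            print(req.get_method(), req.full_url, req.data.decode())
```

What I am looking for is the equivalent of this loop inside RapidMiner: the same form fields sent once per page, with only pageNumber changing.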
This may be a simple question for you, but I am a newbie in the data mining field, so please explain how I can proceed.
Which RapidMiner operators do I have to use?
How can I set the JavaScript pageNumber variable to loop over the article extraction?
You could also write a new article about web crawling online archive search engines that use POST method forms and JavaScript, because it seems to be a non-trivial topic.
I look forward to your kind answer and wish Rapid-I logarithmic success!
Have a good day,
Alex