
Web crawling a difficult webpage (Airbnb)

21763289 Member Posts: 3 Learner III
edited March 2020 in Help

Hello,

 

I need to web-scrape the Airbnb website. I need to get all the ratings for all the accommodations in a city ("Veracidad":5, "Comunicacion":5, etc., i.e. the Accuracy and Communication scores).

First, I thought about getting the URLs of all the accommodations in a city, for example . Then I would have the web crawler scrape all those links and get the individual ratings.

But when I use a max crawl depth of 1 with the URL in the example link, I don't get the accommodations' URLs...

 

Could you help me, please?

Answers

  • sgenzer Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    Hello @21763289, please note that web scraping commercial websites is generally illegal and/or violates the Terms of Service of these companies. Here is the specific language from airbnb.com:

     

    14.1 You are solely responsible for compliance with any and all laws, rules, regulations, and Tax obligations that may apply to your use of the Airbnb Platform. In connection with your use of the Airbnb Platform, you will not and will not assist or enable others to:
    ...
    use any robots, spider, crawler, scraper or other automated means or processes to access, collect data or other content from or otherwise interact with the Airbnb Platform for any purpose;

    (source: https://www.airbnb.com/terms)

     

    I STRONGLY advise any RapidMiner users to please check the Terms of Service of any website when using our software or any other means of webscraping.
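
    As a rough illustration of an automated pre-check, here is a minimal Python sketch that reads a site's robots.txt rules with the standard library's urllib.robotparser. This is only an assumption about how one might screen URLs before crawling: robots.txt is not the Terms of Service and does not replace reading them. The helper name is_allowed, the sample rules, the bot name, and the example.com URLs are all hypothetical.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url.

    Hypothetical helper: it parses rules passed in as a string, so it needs
    no network access. A real check would download the site's own robots.txt.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Sample rules invented for this sketch (not taken from any real site).
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /
"""

if __name__ == "__main__":
    for candidate in ("https://example.com/public/page",
                      "https://example.com/private/data"):
        verdict = is_allowed(SAMPLE_ROBOTS_TXT, "my-research-bot", candidate)
        print(candidate, "->", "allowed" if verdict else "disallowed")
```

    Even when such a check reports a URL as allowed, the Terms of Service still apply and may prohibit automated access entirely, as in the excerpt quoted above.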

     

    Scott

     

  • 21763289 Member Posts: 3 Learner III

    Ok, thanks, I understand.

     

    So, if someone would like to reply to me privately about how to do it hypothetically... It is just for research at my university.

  • rfuentealba RapidMiner Certified Analyst, Member, University Professor Posts: 568 Unicorn

    Hi @21763289,

     

    Have you checked whether you can do it legally through the Airbnb API? It looks like they do have one:

    https://www.airbnb.com/partner?c=tumblr&af=746240

     

    I haven't worked with it, but this might be a good beginning.

     

    All the best,

     

    Rodrigo.

  • 21763289 Member Posts: 3 Learner III

    Nice idea. Thanks!! 
