How to Scrape Airbnb Data: Pricing, Ratings, Amenities (Code & No code)


July 27, 2022 - 5 min read

Nicolae Rotaru

Introduction

Airbnb is an online marketplace that connects people who want to rent out their homes with people who are looking for accommodations in specific locales.


In this article, you will read about the easiest way to scrape Airbnb listings with Page2API.


You will find code examples for Ruby, Python, PHP, NodeJS, cURL, and a No-Code solution that will import Airbnb listings into Google Sheets.


You can scrape Airbnb data such as amenities, prices, descriptions, photos, and URLs to perform:

  • price monitoring
  • trends analysis
  • competitor analysis
  • or simply to find the perfect vacation spot


In this article, we will learn how to:

  • Scrape Airbnb listings
  • Scrape Airbnb listing details

Prerequisites

To start scraping Airbnb, you will need the following things:


  • A Page2API account
  • A location in which to search for listings; as an example, we will use Amsterdam, Netherlands

How to scrape Airbnb listings

First, we need to open the Airbnb search page with the desired location.

In our case it will be:

  
    https://www.airbnb.com/s/Amsterdam--Netherlands/homes


This URL is the first parameter we need to start scraping the listings page.


The page should look similar to the following one:

Airbnb listings page

From the search page, we will scrape the following attributes:

  • URL
  • Title
  • Name
  • Beds
  • Rating
  • Price

Each listing container is wrapped in a div element with the following attribute: itemprop="itemListElement".

Now, let's define the selectors for each attribute.

  
    /* Parent: */
    [itemprop=itemListElement]

    /* URL: */
    a[target*='listing_']

    /* Title: */
    [id*='title']

    /* Name: */
    [itemprop=name]

    /* Beds: */
    [aria-label*='bed']

    /* Rating: */
    [aria-label*='Rating']

    /* Price: */
    [style*='--pricing'] > div > span > div > span
  


Now, let's handle the pagination.

If we take a look at the URL structure when switching between pages, we see that they look like this:

  
    // Page 1
    https://www.airbnb.com/s/Amsterdam--Netherlands/homes?pagination_search=true&items_offset=0

    // Page 2
    https://www.airbnb.com/s/Amsterdam--Netherlands/homes?pagination_search=true&items_offset=20

    // Page 3
    https://www.airbnb.com/s/Amsterdam--Netherlands/homes?pagination_search=true&items_offset=40
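
The offsets above advance by 20 per page, so the page URLs can be generated instead of typed out. The following is a minimal sketch, assuming 20 listings per page as the three URLs suggest; `search_page_urls` is a hypothetical helper and not part of Page2API.

```ruby
require 'uri'

BASE_URL = 'https://www.airbnb.com/s/Amsterdam--Netherlands/homes'
PAGE_SIZE = 20 # assumption: inferred from the offsets 0, 20, 40 above

# Hypothetical helper: builds the search URL for each of the first `pages` pages.
def search_page_urls(pages, base_url: BASE_URL, page_size: PAGE_SIZE)
  (0...pages).map do |page|
    query = URI.encode_www_form({ pagination_search: true, items_offset: page * page_size })
    "#{base_url}?#{query}"
  end
end

puts search_page_urls(3)
```

Keeping the page size in a constant makes it easy to adjust if Airbnb changes the number of listings per page.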
  

This looks like a great scenario to use the batch scraping approach.

Now let's build the request that will scrape all listings that the search page returned.

The following examples will show how to scrape 3 pages of listings from Airbnb.com

The payload for our scraping request is shown below. The batch URL ends in [0, 40, 20], which covers the offsets 0, 20, and 40 listed above:

  
    {
      "api_key": "YOUR_PAGE2API_KEY",
      "batch": {
        "urls": "https://www.airbnb.com/s/Amsterdam--Netherlands/homes?pagination_search=true&items_offset=[0, 40, 20]",
        "concurrency": 1,
        "merge_results": true
      },
      "parse": {
        "listings": [
          {
            "url": "a[target*='listing_'] >> href",
            "beds": "[aria-label*='bed'] >> text",
            "name": "[itemprop=name] >> content",
            "price": "[style*='--pricing'] > div > span > div > span >> text",
            "title": "[id*='title'] >> text",
            "rating": "[aria-label*='Rating'] >> text",
            "_parent": "[itemprop=itemListElement]"
          }
        ]
      },
      "wait_for": "[itemprop=itemListElement]",
      "real_browser": true,
      "premium_proxy": "us"
    }
  

Code examples (batch scraping approach)

      
    require 'rest_client'
    require 'json'

    api_url = "https://www.page2api.com/api/v1/scrape"
    payload = {
      api_key: "YOUR_PAGE2API_KEY",
      batch: {
        urls: "https://www.airbnb.com/s/Amsterdam--Netherlands/homes?pagination_search=true&items_offset=[0, 40, 20]",
        concurrency: 1,
        merge_results: true
      },
      parse: {
        listings: [
          {
            url: "a[target*='listing_'] >> href",
            beds: "[aria-label*='bed'] >> text",
            name: "[itemprop=name] >> content",
            price: "[style*='--pricing'] > div > span > div > span >> text",
            title: "[id*='title'] >> text",
            rating: "[aria-label*='Rating'] >> text",
            _parent: "[itemprop=itemListElement]"
          }
        ]
      },
      wait_for: "[itemprop=itemListElement]",
      real_browser: true,
      premium_proxy: "us"
    }

    response = RestClient::Request.execute(
      method: :post,
      payload: payload.to_json,
      url: api_url,
      headers: { "Content-type" => "application/json" },
    ).body

    result = JSON.parse(response)

    puts(result)
      
    

The result

  
    {
      "result": {
        "listings": [
          {
            "url": "https://www.airbnb.com/rooms/3982031?adults=1&children=0&infants=0&check_in=2022-08-20&check_out=2022-08-27&previous_page_section_name=1000&federated_search_id=b79f5e55-f2f6-40db-baae-db124484bd1e",
            "beds": "1 double bed",
            "name": "Top location, quiet guesthouse, 2p",
            "price": "$113",
            "title": "Guesthouse in Stadsdeel Centrum",
            "rating": "4.93 (147)"
          },
          {
            "url": "https://www.airbnb.com/rooms/51760555?adults=1&children=0&infants=0&check_in=2022-11-02&check_out=2022-11-09&previous_page_section_name=1000&federated_search_id=b79f5e55-f2f6-40db-baae-db124484bd1e",
            "beds": "1 double bed",
            "name": "Amsterdam city garden",
            "price": "$87",
            "title": "Apartment in Stadsdeel West",
            "rating": "4.88 (8)"
          },
          ...
        ]
      }, ...
    }
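
The values in the result are plain strings, so price monitoring or trend analysis usually needs a small clean-up step first. The sketch below is one possible approach, not part of Page2API: `normalize_listing` is a hypothetical helper, and the sample rows are copied from the result above with the URL query strings shortened.

```ruby
require 'uri'

# Sample rows copied from the result above (query strings shortened).
listings = [
  { 'url' => 'https://www.airbnb.com/rooms/3982031?adults=1&children=0', 'price' => '$113', 'rating' => '4.93 (147)' },
  { 'url' => 'https://www.airbnb.com/rooms/51760555?adults=1&children=0', 'price' => '$87', 'rating' => '4.88 (8)' }
]

# Hypothetical helper: converts the scraped strings into numbers
# and strips the tracking parameters from the listing URL.
def normalize_listing(listing)
  uri = URI.parse(listing['url'])
  # "4.93 (147)" -> ["4.93", "147"]; both stay nil when the rating is missing.
  rating, reviews = listing['rating'].to_s.match(/([\d.]+)\s*\((\d+)\)/)&.captures
  {
    id: uri.path.split('/').last.to_i,
    url: "#{uri.scheme}://#{uri.host}#{uri.path}",
    price: listing['price'].to_s.delete('^0-9').to_i,
    rating: rating&.to_f,
    reviews: reviews&.to_i
  }
end

normalized = listings.map { |listing| normalize_listing(listing) }
puts normalized
```

Listings without a rating keep `nil` in the `rating` and `reviews` fields instead of raising an error.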
  

How to scrape Airbnb listing details

From the 'Search' page, we click on any listing.


This will change the browser URL to something similar to:

  
    https://www.airbnb.com/rooms/3163509


The page will look like the following one:

Airbnb property overview page

From this page, we will scrape the following attributes:

  • Title
  • Superhost
  • Guests
  • Bedrooms
  • Beds
  • Baths
  • Reviews
  • Rating
  • Price
  • Description
  • Amenities
  • Images

Let's define the selectors for each attribute.

  
    /* Title: */
    h1

    /* Superhost: */
    //span[contains(text(),'Superhost')]

    /* Guests: */
    //span[contains(text(),'guests')]

    /* Bedrooms: */
    //span[contains(text(),'bedrooms')]

    /* Beds: */
    //span[contains(text(),'beds')]

    /* Baths: */
    //span[contains(text(),'baths')]

    /* Price: */
    [style*='--pricing'] > div > span > div > span


    /* Description: */
    [data-plugin-in-point-id=DESCRIPTION_DEFAULT]

    /* Images: */
    picture img

    /* Reviews / Rating */
    // we will encode it to base64 later
    var ratingObject = document.querySelector("[aria-label*='Rated']").attributes['aria-label'].nodeValue.match(/Rated (?<rating>[\d?\.]+).+ from\s(?<reviews>[\d]+)/).groups;

    // reviews js selector
    ratingObject['reviews']

    // rating js selector
    ratingObject['rating']

    /* Amenities */
    document.querySelector('[data-section-id=AMENITIES_DEFAULT] section button').click()

    // amenities js selector
    var amnts = [];
    document.querySelectorAll("[id*='pdp_v3'][aria-disabled='false']").forEach(e => amnts.push(e.innerText));
    amnts;
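
The rating snippet is passed to the API as a base64 string, which can be produced in Ruby. The sketch below encodes the snippet with its named groups `rating` and `reviews`, then tries the same pattern on a sample label. The sample label wording is an assumption made for illustration; the real aria-label text may differ.

```ruby
require 'base64'

# The rating/reviews snippet, kept in a non-interpolating heredoc
# so the backslashes in the regular expression survive untouched.
RATING_JS = <<~'JS'
  var ratingObject = document.querySelector("[aria-label*='Rated']").attributes['aria-label'].nodeValue.match(/Rated (?<rating>[\d?\.]+).+ from\s(?<reviews>[\d]+)/).groups;
JS

encoded = Base64.strict_encode64(RATING_JS)
puts encoded

# The same pattern in Ruby, tried against a made-up label (assumed wording).
sample_label = 'Rated 4.95 out of 5 from 42 reviews.'
match = sample_label.match(/Rated (?<rating>[\d?\.]+).+ from\s(?<reviews>[\d]+)/)
puts match[:rating]
puts match[:reviews]
```

Decoding the string with `Base64.strict_decode64` is a quick way to check what a base64 `execute_js` value actually runs.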
  

The payload for our scraping request will be:

  
    {
      "api_key": "YOUR_PAGE2API_KEY",
      "url": "https://www.airbnb.com/rooms/3163509",
      "parse": {
        "title": "h1 >> text",
        "superhost": "//span[contains(text(),'Superhost')] >> text",
        "guests": "//span[contains(text(),'guests')] >> text",
        "bedrooms": "//span[contains(text(),'bedrooms')] >> text",
        "beds": "//span[contains(text(),'beds')] >> text",
        "baths": "//span[contains(text(),'baths')] >> text",
        "reviews": "js >> ratingObject['reviews']",
        "rating": "js >> ratingObject['rating']",
        "price": "[style*='--pricing'] > div > span > div > span >> text",
        "description": "[data-plugin-in-point-id=DESCRIPTION_DEFAULT] >> text",
        "amenities": "js >> var amnts = []; document.querySelectorAll(\"[id*='pdp_v3'][aria-disabled='false']\").forEach(e => amnts.push(e.innerText)); amnts;",
        "images": ["picture img >> data-original-uri"]
      },
      "scenario": [
        { "wait": 3 },
        {
          "execute_js": "dmFyIHJhdGluZ09iamVjdCA9IGRvY3VtZW50LnF1ZXJ5U2VsZWN0b3IoIlthcmlhLWxhYmVsKj0nUmF0ZWQnXSIpLmF0dHJpYnV0ZXNbJ2FyaWEtbGFiZWwnXS5ub2RlVmFsdWUubWF0Y2goL1JhdGVkICg/PHJhdGluZz5bXGQ/XC5dKykuKyBmcm9tXHMoPzxyZXZpZXdzPltcZF0rKS8pLmdyb3VwczsK"
        },
        { "wait_for": "[data-section-id=AMENITIES_DEFAULT]" },
        { "execute_js": "document.querySelector('[data-section-id=AMENITIES_DEFAULT] section button').click()" },
        { "wait_for": "[id*='pdp_v3']" },
        { "execute": "parse" }
      ],
      "premium_proxy": "us",
      "real_browser": true
    }
  

Running the scraping request

      
    require 'rest_client'
    require 'json'

    api_url = "https://www.page2api.com/api/v1/scrape"
    payload = {
      api_key: 'YOUR_PAGE2API_KEY',
      url: "https://www.airbnb.com/rooms/3163509",
      parse: {
        title: "h1 >> text",
        superhost: "//span[contains(text(),'Superhost')] >> text",
        guests: "//span[contains(text(),'guests')] >> text",
        bedrooms: "//span[contains(text(),'bedrooms')] >> text",
        beds: "//span[contains(text(),'beds')] >> text",
        baths: "//span[contains(text(),'baths')] >> text",
        reviews: "js >> ratingObject['reviews']",
        rating: "js >> ratingObject['rating']",
        price: "[style*='--pricing'] > div > span > div > span >> text",
        description: "[data-plugin-in-point-id=DESCRIPTION_DEFAULT] >> text",
        amenities: "js >> var amnts = []; document.querySelectorAll(\"[id*='pdp_v3'][aria-disabled='false']\").forEach(e => amnts.push(e.innerText)); amnts;",
        images: ["picture img >> data-original-uri"]
      },
      scenario: [
        { wait: 3 },
        {
          execute_js: "dmFyIHJhdGluZ09iamVjdCA9IGRvY3VtZW50LnF1ZXJ5U2VsZWN0b3IoIlthcmlhLWxhYmVsKj0nUmF0ZWQnXSIpLmF0dHJpYnV0ZXNbJ2FyaWEtbGFiZWwnXS5ub2RlVmFsdWUubWF0Y2goL1JhdGVkICg/PHJhdGluZz5bXGQ/XC5dKykuKyBmcm9tXHMoPzxyZXZpZXdzPltcZF0rKS8pLmdyb3VwczsK"
        },
        { wait_for: "[data-section-id=AMENITIES_DEFAULT]" },
        { execute_js: "document.querySelector('[data-section-id=AMENITIES_DEFAULT] section button').click()" },
        { wait_for: "[id*='pdp_v3']" },
        { execute: "parse" }
      ],
      premium_proxy: "us",
      real_browser: true
    }

    response = RestClient::Request.execute(
      method: :post,
      payload: payload.to_json,
      url: api_url,
      headers: { "Content-type" => "application/json" },
    ).body

    result = JSON.parse(response)

    puts(result)
      
    

The result

  
    {
      "result": {
        "title": "Family Houseboat in City Center",
        "superhost": "Superhost",
        "guests": "2 guests",
        "bedrooms": "2 bedrooms",
        "beds": "3 beds",
        "baths": "1.5 baths",
        "reviews": "42",
        "rating": "4.95",
        "price": "$138",
        "description": "Fully equipped houseboat located on the Amstel River in the center of Amsterdam. Across the famous Amstel Hotel. Walking distance to all highlights: Museum Area, Albert Cuyp Market, De Pijp quarter. But also Nightlife, Restaurants, Shopping Areas.Show more",
        "amenities": [
          "Canal view",
          "Bathtub",
          "Hair dryer",
          "Cleaning products",
          "Hot water",
          "Essentials\nTowels, bed sheets, soap, and toilet paper",
          "Hangers",
          "Bed linens\nCotton linens",
          "Room-darkening shades",
          "Clothing storage: dresser",
          "TV with Netflix, standard cable",
          "Sound system with aux",
          "Window guards",
          "Board games",
          "Heating",
          "Smoke alarm",
          "Fire extinguisher",
          "Fast wifi – 76 Mbps\nVerified by speed test. Stream 4K videos and join video calls on multiple devices.",
          "Kitchen\nSpace where guests can cook their own meals",
          "Refrigerator",
          "Cooking basics\nPots and pans, oil, salt and pepper",
          "Dishes and silverware\nBowls, chopsticks, plates, cups, etc.",
          "Dishwasher",
          "Gas stove",
          "Stainless steel oven",
          "Hot water kettle",
          "Coffee maker",
          "Wine glasses",
          "Toaster",
          "Coffee",
          "Dining table",
          "Waterfront\nRight next to a body of water",
          "Private entrance\nSeparate street or building entrance",
          "Laundromat nearby",
          "Private patio or balcony",
          "Outdoor furniture",
          "Outdoor dining area",
          "BBQ grill: charcoal",
          "Boat slip",
          "Paid parking off premises",
          "Luggage dropoff allowed\nFor guests' convenience when they have early arrival or late departure",
          "Host greets you"
        ],
        "images": [
          "https://a0.muscache.com/pictures/40535051/7c49b7a2_original.jpg",
          "https://a0.muscache.com/pictures/40309400/2ed629ab_original.jpg",
          "https://a0.muscache.com/pictures/40309387/abf52119_original.jpg",
          "https://a0.muscache.com/pictures/miso/Hosting-3163509/original/871a0673-e4ea-4afa-a90d-c3fa946ab491.jpeg",
          "https://a0.muscache.com/pictures/miso/Hosting-3163509/original/d2fb3e01-a9e4-4374-b722-9d6952356734.jpeg"
        ]
      }
    }
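
Fields such as "1.5 baths" and the amenity entries containing a line break are easier to analyze once they are split into numbers and name/description pairs. The following sketch shows one way to do that; `leading_number` and `split_amenity` are hypothetical helpers, and the sample hash reuses values from the result above with the amenities list shortened.

```ruby
# Sample values copied from the result above (amenities list shortened).
details = {
  'guests' => '2 guests',
  'bedrooms' => '2 bedrooms',
  'beds' => '3 beds',
  'baths' => '1.5 baths',
  'reviews' => '42',
  'rating' => '4.95',
  'price' => '$138',
  'amenities' => ['Canal view', "Essentials\nTowels, bed sheets, soap, and toilet paper"]
}

# Hypothetical helper: pulls the first number out of strings such as "1.5 baths".
def leading_number(text)
  value = text.to_s[/\d+(?:\.\d+)?/]
  return nil if value.nil?

  value.include?('.') ? value.to_f : value.to_i
end

# Hypothetical helper: splits "Name\nDescription" into its two parts.
# The description is nil when the amenity has no second line.
def split_amenity(text)
  name, description = text.split("\n", 2)
  { name: name, description: description }
end

normalized = {
  guests: leading_number(details['guests']),
  bedrooms: leading_number(details['bedrooms']),
  beds: leading_number(details['beds']),
  baths: leading_number(details['baths']),
  price: leading_number(details['price']),
  rating: details['rating'].to_f,
  reviews: details['reviews'].to_i,
  amenities: details['amenities'].map { |amenity| split_amenity(amenity) }
}

puts normalized
```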
  

How to export Airbnb listings to Google Sheets

To export our Airbnb listings to a Google Spreadsheet, we need to slightly modify our request so that it returns the data in CSV format instead of JSON.

According to the documentation, we need to add the following parameters to our payload:
  
    "raw": {
      "key": "listings", "format": "csv"
    }
  

Now our payload will look like:

    {
      "api_key": "YOUR_PAGE2API_KEY",
      "batch": {
        "urls": [
          "https://www.airbnb.com/s/Amsterdam--Netherlands/homes?pagination_search=true&items_offset=0",
          "https://www.airbnb.com/s/Amsterdam--Netherlands/homes?pagination_search=true&items_offset=20",
          "https://www.airbnb.com/s/Amsterdam--Netherlands/homes?pagination_search=true&items_offset=40"
        ],
        "concurrency": 1,
        "merge_results": true
      },
      "raw": {
        "key": "listings", "format": "csv"
      },
      "parse": {
        "listings": [
          {
            "url": "a[target*='listing_'] >> href",
            "beds": "[aria-label*='bed'] >> text",
            "name": "[itemprop=name] >> content",
            "price": "[style*='--pricing'] > div > span > div > span >> text",
            "title": "[id*='title'] >> text",
            "rating": "[aria-label*='Rating'] >> text",
            "_parent": "[itemprop=itemListElement]"
          }
        ]
      },
      "wait_for": "[itemprop=itemListElement]",
      "real_browser": true,
      "premium_proxy": "us"
    }

Now, edit the payload above if needed, and encode it to get the URL with the encoded payload.

The final part is adding the IMPORTDATA function, and we are ready to import our Airbnb listings into a Google Spreadsheet.
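
Encoding the payload and wrapping the URL in IMPORTDATA can also be done in a few lines of Ruby. This is only a sketch under stated assumptions: the endpoint path in `ENCODED_ENDPOINT` and the use of the standard Base64 alphabet are assumptions, not confirmed here, so check both against the Page2API documentation before relying on them.

```ruby
require 'base64'
require 'json'

# Assumption: the path of the encoded-payload endpoint. Confirm it in the
# Page2API documentation before using the formula in a spreadsheet.
ENCODED_ENDPOINT = 'https://www.page2api.com/api/v1/scrape/encoded'

search_url = 'https://www.airbnb.com/s/Amsterdam--Netherlands/homes?pagination_search=true&items_offset='

payload = {
  api_key: 'YOUR_PAGE2API_KEY',
  batch: {
    urls: [0, 20, 40].map { |offset| "#{search_url}#{offset}" },
    concurrency: 1,
    merge_results: true
  },
  raw: { key: 'listings', format: 'csv' },
  parse: {
    listings: [
      {
        url: "a[target*='listing_'] >> href",
        beds: "[aria-label*='bed'] >> text",
        name: '[itemprop=name] >> content',
        price: "[style*='--pricing'] > div > span > div > span >> text",
        title: "[id*='title'] >> text",
        rating: "[aria-label*='Rating'] >> text",
        _parent: '[itemprop=itemListElement]'
      }
    ]
  },
  wait_for: '[itemprop=itemListElement]',
  real_browser: true,
  premium_proxy: 'us'
}

# Assumption: the API expects the standard Base64 alphabet.
encoded_payload = Base64.strict_encode64(payload.to_json)
import_url = "#{ENCODED_ENDPOINT}/#{encoded_payload}"

# The formula to paste into a Google Sheets cell.
formula = %(=IMPORTDATA("#{import_url}"))
puts formula
```

Because the encoded payload contains the api_key, the resulting URL should be shared only with people who are allowed to use that key.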

The result should look like the following one:

Airbnb listings import to Google Sheets

Final thoughts

Collecting Airbnb data manually can be a bit overwhelming and hard to scale.
However, a Web Scraping API can easily help you overcome this challenge and scrape the data in no time.
With Page2API you can quickly get access to the data you need, and use the time you saved on more important things!
