
How to scrape Google Maps Reviews with Node JS?

 2 years ago
source link: https://serpdog.io/blog/how-to-scrape-google-maps-reviews-with-node-js.html

In this post, we will learn to scrape Google Maps Reviews with Node JS.


Requirements:

Before we begin, we have to install everything we may need in this tutorial to move forward.

1. Node JS
2. Unirest JS
3. Cheerio JS

So before starting, make sure you have set up your Node JS project and installed both packages, Unirest JS and Cheerio JS. Both are available on npm (npm i unirest cheerio).

We will use Unirest JS to fetch the raw HTML and Cheerio JS to parse it.

Target:

Eiffel Tower Google Maps Results

We will scrape the user reviews for the Eiffel Tower.

Process:

Now that everything is set up, we can prepare our scraper. We will use the npm library Unirest JS to make a GET request to our target URL and get the raw HTML. Then we will use Cheerio JS to parse that HTML.
We will target a URL of this form:

                         
`https://www.google.com/async/reviewDialog?hl=en_us&async=feature_id:${data_ID},next_page_token:${next_page_token},sort_by:qualityScore,start_index:,associated_topic:,_fmt:pc`
                         
                        

Where,
data_ID - Data ID is used to uniquely identify a place in Google Maps.
next_page_token - The next_page_token is used to get the next page results.
sort_by - It is used for sorting and filtering results.

The possible values of sort_by are:
1. qualityScore - the most relevant reviews first.
2. newestFirst - the most recent reviews first.
3. ratingHigh - the highest-rated reviews first.
4. ratingLow - the lowest-rated reviews first.
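
As a minimal sketch of how these parameters fit together, the helper below assembles the target URL from a Data ID, an optional page token, and a sort_by value. The function name buildReviewsUrl and its option names are our own choices for illustration, not part of any library, and the token is inserted as-is, the same way the URL template above does.

```javascript
// Allowed sort_by values, as listed in this post.
const SORT_OPTIONS = ["qualityScore", "newestFirst", "ratingHigh", "ratingLow"];

// Build the reviewDialog URL for one page of reviews.
// NOTE: buildReviewsUrl is a hypothetical helper written for this sketch.
function buildReviewsUrl({ dataId, nextPageToken = "", sortBy = "qualityScore" }) {
  if (!dataId) {
    throw new Error("dataId is required");
  }
  if (!SORT_OPTIONS.includes(sortBy)) {
    throw new Error(`Unknown sort_by value: ${sortBy}`);
  }
  // The async parameter is a comma-separated list of key:value pairs.
  const asyncParams = [
    `feature_id:${dataId}`,
    `next_page_token:${nextPageToken}`,
    `sort_by:${sortBy}`,
    "start_index:",
    "associated_topic:",
    "_fmt:pc",
  ].join(",");
  return `https://www.google.com/async/reviewDialog?hl=en_us&async=${asyncParams}`;
}

// First page of the most relevant reviews for a given Data ID.
console.log(buildReviewsUrl({ dataId: "0x47e66e2964e34e2d:0x8ddca9ee380ef7e0" }));
```

Leaving nextPageToken empty requests the first page; passing sortBy: "newestFirst" would request the most recent reviews instead.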

So how do we get the Data ID of a place? Look at the Google Maps URL for the Eiffel Tower:

                            
https://www.google.com/maps/place/Eiffel+Tower/@48.8583701,2.2922926,17z/data=!4m7!3m6!1s0x47e66e2964e34e2d:0x8ddca9ee380ef7e0!8m2!3d48.8583701!4d2.2944813!9m1!1b1
                            
                           

In this URL, the part after !4m7!3m6!1s and before !8m2! is the Data ID.
So, the Data ID in this case is 0x47e66e2964e34e2d:0x8ddca9ee380ef7e0
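
If you would rather not copy the ID by hand, a regular expression can pull it out. This is a rough sketch that assumes the Data ID always follows !1s and has the 0x...:0x... shape seen in the URL above. The extractDataId name is hypothetical, and the prefix before !1s may differ for other places.

```javascript
// Extract the Data ID (two hex values joined by a colon) that follows "!1s"
// in a Google Maps place URL. Returns null when no such segment is found.
// NOTE: extractDataId is a hypothetical helper written for this sketch.
function extractDataId(mapsUrl) {
  const match = /!1s(0x[0-9a-f]+:0x[0-9a-f]+)/i.exec(mapsUrl);
  return match ? match[1] : null;
}

const placeUrl =
  "https://www.google.com/maps/place/Eiffel+Tower/@48.8583701,2.2922926,17z/" +
  "data=!4m7!3m6!1s0x47e66e2964e34e2d:0x8ddca9ee380ef7e0!8m2!3d48.8583701!4d2.2944813!9m1!1b1";

console.log(extractDataId(placeUrl));
```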
Our target URL should look like this:

                            
https://www.google.com/async/reviewDialog?hl=en_us&async=feature_id:0x47e66e2964e34e2d:0x8ddca9ee380ef7e0,next_page_token:,sort_by:qualityScore,start_index:,associated_topic:,_fmt:pc
                            
                           

Paste this URL into your browser and press Enter. The browser will download a text file. Rename it with a .html extension and open it in your code editor. We will then search this HTML for the tags of the elements we want to extract.

We will first parse the location information of the place, which contains the location name, address, average rating, and total number of reviews.

Eiffel Tower Google Maps HTML

From the above image, the selector for the location name is .P5Bobd, for the address .T6pBCe, for the average rating span.Aq14fc, and for the total number of reviews span.z5jxId.

That covers the location information. Next, we will parse the Data ID and the next_page_token.

Search for the tag .lcorif. In the above image you can find the .lcorif tag in the second line. Under this tag, the Data ID is on the .loris element and the next_page_token is on the .gws-localreviews__general-reviews-block element.
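
The selectors above could be wired up roughly as follows. This is a sketch, not the code from the original post: the function names are ours, and the attribute names data-fid and data-next-page-token are assumptions that you should verify against your downloaded HTML file. The functions take the $ returned by cheerio.load, so they stay independent of how the HTML was fetched.

```javascript
// `$` is the function returned by cheerio.load(html), for example:
//   const cheerio = require("cheerio");
//   const $ = cheerio.load(rawHtml);

// Read the trimmed text of the first element matching a selector.
const firstText = ($, selector) => $(selector).first().text().trim();

// Location summary, using the selectors named in this post.
// NOTE: parseLocationInfo is a hypothetical helper written for this sketch.
function parseLocationInfo($) {
  return {
    title: firstText($, ".P5Bobd"),
    address: firstText($, ".T6pBCe"),
    avgRating: firstText($, "span.Aq14fc"),
    totalReviews: firstText($, "span.z5jxId"),
  };
}

// Data ID and token for the next page, both found under .lcorif.
// ASSUMPTION: the attribute names below are not stated in this post;
// inspect the downloaded HTML to confirm them.
function parsePaging($) {
  return {
    dataId: $(".lcorif .loris").first().attr("data-fid") || null,
    nextPageToken:
      $(".lcorif .gws-localreviews__general-reviews-block")
        .first()
        .attr("data-next-page-token") || null,
  };
}
```

Both functions return null or an empty string for anything they cannot find, which makes a changed selector easier to spot than a thrown error deep inside the parser.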

Now, we will search for the tags that contain data about each user and their review.
Search for the tag .gws-localreviews__google-review

Eiffel Tower Google reviews HTML

This tag contains all the information about the user and their review.
We will parse the extracted HTML for the user's name, link, thumbnail, number of reviews, rating, review text, and the images posted by the user.
This makes our whole code look like this:

Google Maps reviews scraper

You can copy this code from the following link: https://github.com/Darshan972/GoogleScrapingBlogs/blob/main/GoogleMapsReviewsScraper.js

Result:

Google Maps Reviews Result

Our result should look like this 👆🏻.
These are the first ten reviews. To get the next ten, take the token returned by our code and put it into the URL below in place of tokenFromResponse:

    
https://www.google.com/async/reviewDialog?hl=en_us&async=feature_id:0x47e66e2964e34e2d:0x8ddca9ee380ef7e0,next_page_token:tokenFromResponse,sort_by:qualityScore,start_index:,associated_topic:,_fmt:pc
    

In this case, the token is CAESBkVnSUlDZw==.
You can fetch each subsequent page using the token returned with the previous page.
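
The paging idea can be sketched as a loop that keeps requesting pages until no token comes back. Here collectReviews and fetchPage are hypothetical names: fetchPage stands for whatever function you write to request a URL (for example with Unirest JS) and parse the response (for example with Cheerio JS), and it is assumed to resolve to an object with reviews and nextPageToken. The maxPages cap is our own safety limit.

```javascript
// Follow next_page_token from page to page until no token is returned
// or maxPages is reached.
// ASSUMPTION: fetchPage(url) is supplied by the caller and resolves to
// { reviews: [...], nextPageToken: string | null }.
async function collectReviews(dataId, fetchPage, maxPages = 5) {
  const allReviews = [];
  let token = ""; // an empty token requests the first page

  for (let page = 0; page < maxPages; page++) {
    const url =
      "https://www.google.com/async/reviewDialog?hl=en_us&async=" +
      `feature_id:${dataId},next_page_token:${token},` +
      "sort_by:qualityScore,start_index:,associated_topic:,_fmt:pc";

    const { reviews, nextPageToken } = await fetchPage(url);
    allReviews.push(...reviews);

    if (!nextPageToken) break; // no token means this was the last page
    token = nextPageToken;
  }
  return allReviews;
}
```

Passing fetchPage in as an argument keeps the loop free of network code, so you can try it with a stub before pointing it at real requests.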

With Google Maps Reviews API:

Serpdog API offers you 100 free requests on sign-up.
Building and maintaining a scraper can take a lot of time, and an API that returns ready-made structured JSON can save you that effort.

Statue Of Liberty Node JS request

Result:

Statue Of Liberty Google Reviews

Conclusion:

In this tutorial, we learned how to scrape Google Maps Reviews using the reverse API method. Feel free to email me with any questions, and follow me on Twitter. Thanks for reading!

Additional Resources:

How to scrape Google Organic Search Results using Node JS?

Author:

My name is Darshan and I am the founder of serpdog.io.

