Plotting Google Trends with Bokeh and Python for multiple comparisons (part 1/2) - Andrea Ciufo

One of my recent tasks at work was to compare different keywords on Google Trends.

At the time of writing, it is possible to compare only 5 keywords with each other.

If you want to compare more than 5 search terms, you cannot do it directly; you have to manipulate the data a little.

Take these 9 Italian cities as an example. I plotted the data extracted from Google Trends with the Python library Bokeh. At the end of the post you can see the code, or you can go directly to my GitHub profile.

[Two interactive Bokeh plots of the Google Trends data for the 9 Italian cities]

What I wanted to understand was how to compare more than 5 trends simultaneously.

I am assuming that you already know what Google Trends is and why it is a powerful tool for any entrepreneur.

I would just highlight one key aspect: data on Google Trends can take a value from 0 to 100.

100 is the maximum level of search interest for a given:

  • Keyword or topic
  • Time range
  • Location

This has a huge implication: the same value of 100 means something different if we look at different locations or different time ranges.

So the same value of 100 for a query with the term “Carbonara” means something different depending on whether we select the USA or Italy.

The 6th of April is Carbonara Day, which is why you see a spike.
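To make the point about relative values concrete, here is a minimal sketch. It assumes a simplified view of the index, where each series is rescaled so that its own peak becomes 100; the function name and the numbers are made up for illustration and are not Google's actual computation.

```python
def to_trends_index(raw_interest):
    """Rescale a series so that its own peak becomes 100."""
    peak = max(raw_interest)
    if peak == 0:
        return [0 for _ in raw_interest]
    return [round(100 * value / peak) for value in raw_interest]


# Hypothetical weekly search interest for "Carbonara" (made-up numbers).
italy = [40, 44, 80, 40]
usa = [2, 2, 4, 2]

italy_index = to_trends_index(italy)
usa_index = to_trends_index(usa)

print(italy_index)
print(usa_index)
```

Both series reach 100 in the third week, even though the underlying interest in the hypothetical Italian data is 20 times higher. That is why a 100 in one location cannot be compared directly with a 100 in another.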

That said, I want to discuss how I decided to compare more than 5 search terms.

I started by reading this post, whose author runs 10 queries on two different Google Trends dashboards, with 2 queries in common between the two dashboards. He searched for trends over time for different types of fruit, such as pear, apple, pineapple, durian, orange, etc.

Apple was in the first and second group of queries.

Then he compared the values from the two dashboards through a normalization procedure.

Keep in mind that in this example apples are scaled differently in the first and second group.

Based on that, he calculates the average of the interest index for each term and uses those averages to equalize the data and compare them.
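Here is a minimal sketch of how I read that procedure. It is my interpretation, not necessarily the author's exact method: the ratio between the averages of the shared term in the two groups is used as a scaling factor for the second group. The function name and all the numbers are hypothetical.

```python
from statistics import mean


def rescale_to_reference(reference_group, other_group, shared_term):
    """Put other_group on the scale of reference_group via the shared term."""
    # Ratio of the shared term's averages in the two dashboards.
    # Note: this fails if the shared term averages 0 in other_group.
    factor = mean(reference_group[shared_term]) / mean(other_group[shared_term])
    return {
        term: [value * factor for value in series]
        for term, series in other_group.items()
    }


# Made-up values: "apple" appears in both groups but on different scales.
group_1 = {"apple": [100, 80, 90], "pear": [20, 10, 15]}
group_2 = {"apple": [50, 40, 45], "durian": [100, 60, 80]}

group_2_rescaled = rescale_to_reference(group_1, group_2, "apple")
print(group_2_rescaled)
```

After rescaling, the apple series of the second group matches the one in the first group, and the other terms of the second group can exceed 100, because they are now expressed on the first group's scale.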

I think it is a good and robust strategy, but some questions arose in my mind.

Since the average is sensitive to spikes, I asked myself whether I was missing a downside. For example, why not transform the data through standardization?

The difference between normalization and standardization is really simple, but the two can lead to different outputs.

Normalization (min-max scaling) rescales the data into the range [0, 1] using the minimum and maximum values.

Standardization rescales the data through the mean and the standard deviation. Standardized data have a mean of 0 and a standard deviation of 1 (unit variance).
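The two transformations can be sketched in a few lines of plain Python. The function names and the sample values are mine, chosen only to show a series with one spike; the population standard deviation is used here, which is one of two common conventions.

```python
from statistics import mean, pstdev


def min_max_normalize(values):
    """Rescale values into [0, 1] using the minimum and the maximum."""
    lowest, highest = min(values), max(values)
    span = highest - lowest
    if span == 0:
        return [0.0 for _ in values]
    return [(value - lowest) / span for value in values]


def standardize(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(value - mu) / sigma for value in values]


# Made-up interest values with one spike at the end.
interest = [10, 20, 30, 100]

normalized = min_max_normalize(interest)
standardized = standardize(interest)

print(normalized)
print(standardized)
```

The normalized series always stays between 0 and 1, while the standardized one is centered on 0 and has no fixed bounds, so a spike shows up as a large positive value.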

I wanted to go deeper, so I started a short experiment that is still ongoing.

Basically I:

  • Took Google Trends data for 9 Italian cities
  • Exported this data to Google Sheets
  • Created a Colab notebook
  • Read the data from the Google Sheet through a Google Sheets API for Python
  • Cleaned the data and cast it to the correct data types
  • Plotted it with Bokeh
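The last two steps above can be sketched as follows. This is not the notebook code: a small CSV string stands in for the Google Sheet, and the column name "Week", the "<1" marker for very low interest, the city values, and the function names are all assumptions made for illustration.

```python
import csv
import io
from datetime import datetime


def clean_rows(csv_text):
    """Parse exported rows into dates and integer interest values per city."""
    reader = csv.DictReader(io.StringIO(csv_text))
    dates, series = [], {}
    for row in reader:
        # Assumption: the date column is called "Week" and uses ISO dates.
        dates.append(datetime.strptime(row.pop("Week"), "%Y-%m-%d"))
        for city, raw in row.items():
            # Assumption: very low interest is exported as the string "<1".
            value = 0 if raw.strip() == "<1" else int(raw)
            series.setdefault(city, []).append(value)
    return dates, series


def build_plot(dates, series):
    """Draw one line per city on a datetime axis."""
    # Imported here so the cleaning step can run without Bokeh installed.
    from bokeh.palettes import Category10
    from bokeh.plotting import figure

    plot = figure(x_axis_type="datetime", title="Google Trends interest by city")
    colors = Category10[10]
    for index, (city, values) in enumerate(series.items()):
        plot.line(dates, values, legend_label=city,
                  color=colors[index % len(colors)], line_width=2)
    plot.legend.click_policy = "hide"  # click a legend entry to toggle its line
    return plot


# Made-up sample standing in for the data read from the Google Sheet.
sample = """Week,Roma,Milano
2019-03-31,80,<1
2019-04-07,100,45
"""

dates, series = clean_rows(sample)
```

To display the result in a notebook or browser, pass the output of build_plot(dates, series) to show from bokeh.plotting.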

After that, in the following days, I am going to:

  • Normalize Data
  • Plot Normalized Data
  • Standardize Data
  • Plot Standardized Data
  • See how these values change

I will keep you updated.

Below you can find the code for:

Thank you for reading the article!

If you liked it or think it is useful, feel free to share.

With just a tweet or a like on LinkedIn new opportunities can arise.
