How to enhance keyword exploration in R with the googleSuggestQueriesR package

The Google Suggest Queries API is one of several tools that can improve both NLP models and Google Ads strategy. I am happy to share that I have recently created and published googleSuggestQueriesR, an open-source R package that simplifies using this API to extract keyword recommendations.

The main goal of this post is to show how this API can be used to enhance keyword exploration for content marketing, search ad campaigns and NLP modelling.

API

The Google Suggest API seems to be pretty old and has changed multiple times. At least three endpoints are known to have worked since 2011 (feel free to try them):

  1. google.com/complete/search?client=chrome&q=cheap%20shoes
  2. clients1.google.com/complete/search?output=firefox&q=cheap%20shoes
  3. http://suggestqueries.google.com/complete/search?output=toolbar&q=cheap%20shoes

All of them work (see this thread on Stack Overflow). The 2nd and the 3rd return only keyword suggestions, while the 1st returns both suggestions and additional metadata (e.g. a relevance metric). However, the 1st may need an API key and a project setup.

I decided to use the 3rd endpoint because it is the newest one and requires no setup or authorization management.
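To make the endpoint choice a bit more concrete, here is a minimal base R sketch of how a request URL for the 3rd endpoint could be assembled and how suggestions could be pulled out of a response. Treat it as an illustration under assumptions: `build_suggest_url()` and `parse_toolbar_xml()` are hypothetical helpers (not functions from the package), the `hl` language parameter is my assumption, and the XML string is a hand-written sample of the shape I would expect from `output=toolbar`, not a captured response.

```r
# Hypothetical helper: build a request URL for the 'toolbar' endpoint.
# The 'hl' (language) parameter is an assumption, not documented behaviour.
build_suggest_url <- function(query, lang = NULL) {
  url <- paste0(
    "http://suggestqueries.google.com/complete/search?output=toolbar&q=",
    utils::URLencode(query, reserved = TRUE)
  )
  if (!is.null(lang)) {
    url <- paste0(url, "&hl=", lang)
  }
  url
}

# Hypothetical helper: extract the 'data' attribute of every <suggestion> tag.
# A regex is enough for a sketch; a real XML parser is safer in practice.
parse_toolbar_xml <- function(xml_text) {
  matches <- regmatches(
    xml_text,
    gregexpr('<suggestion data="[^"]*"', xml_text)
  )[[1]]
  sub('^<suggestion data="([^"]*)"$', "\\1", matches)
}

# Hand-written sample in the assumed response shape (not real API output)
sample_xml <- paste0(
  "<toplevel>",
  '<CompleteSuggestion><suggestion data="cheap shoes"/></CompleteSuggestion>',
  '<CompleteSuggestion><suggestion data="cheap shoes online"/></CompleteSuggestion>',
  "</toplevel>"
)

request_url <- build_suggest_url("cheap shoes", lang = "en")
suggestions <- parse_toolbar_xml(sample_xml)
```

The package takes care of these steps for you; the sketch only shows that nothing more than a plain GET request and a bit of parsing should be involved.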

The package

I believe that this Suggest Queries API is a simple yet pretty powerful tool that may help in various ways. That’s why I decided to create a package (link to GitHub) to help people to use it.

My previous package (pagespeedParseR, available on GitHub) suffered from serious feature bloat (I still have to clean it up), so I decided to keep the new package as simple as possible.

I released the first version of {googleSuggestQueriesR} on GitHub on 24 May 2020. I plan to release it on CRAN once it is stable and ready.

Before we begin

The suggestqueries.google.com endpoint doesn’t require an API key or project setup. If you use it as is, you will send requests to Google from your own IP (unless you use a proxy or other backend dark magic).

Please remember that sending too many requests per minute may get you blocked by Google, so be careful. There is no documentation of this API (at least I couldn’t find any for this particular endpoint), so the rate limits are not known.

Use with caution – be gentle, don’t upset the servers and pause between requests to avoid problems.
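If you ever loop over requests yourself, the "pause between requests" advice can be sketched in a few lines of base R. This is only an illustration: `throttled_lapply()` and `fake_request()` are hypothetical names, and the fake function stands in for a real HTTP call so the example runs offline.

```r
# Hypothetical helper: call 'fun' on each item, sleeping between calls
throttled_lapply <- function(items, fun, interval = 1) {
  results <- vector("list", length(items))
  for (i in seq_along(items)) {
    results[[i]] <- fun(items[[i]])
    # pause after every call except the last one
    if (i < length(items)) {
      Sys.sleep(interval)
    }
  }
  results
}

# Stand-in for a real request function, so the sketch runs offline
fake_request <- function(query) toupper(query)

out <- throttled_lapply(c("shoes", "boots"), fake_request, interval = 0.1)
```

Since the real limits are unknown, a longer interval is the safer choice for bigger jobs.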

Installation

Right now, to start using {googleSuggestQueriesR} you must install it from GitHub (a CRAN version should be released later this year):

# CRAN version:
# not yet available - to be announced

# GitHub version:
install.packages("devtools")
devtools::install_github("Leszek-Sieminski/googleSuggestQueriesR")

Use case #1: Broad keywords exploration

Let’s first consider a simple use case. Imagine we have only a single idea/product that we want to find out more about, like “shoes”. All we need to do is send that query to the API for the chosen language:

# declare our chosen query
example_query <- "shoes"

# this can take some time due to the number of variations
keyword_suggestions <- suggest_keywords(
  queries = example_query,
  lang = "en",
  interval = 1,
  enhanced = TRUE)

# notice that I set 'enhanced' to TRUE, this way I will create more suggestions

length(keyword_suggestions)
# [1] 358

str(keyword_suggestions)
# chr [1:358] "shoes 0-3 months" "shoes 0.5 size too big" "shoes 0 drop" "shoes 0nline" "shoes 00s" "shoes 009" ..

keyword_suggestions
#   [1] "shoes 0-3 months"                 "shoes 0.5 size too big"           "shoes 0 drop"
#   [4] "shoes 0nline"                     "shoes 00s"                        "shoes 009"
#   [7] "shoes 06"                         "shoes-01"                         "shoes 0 heel drop"
#  [10] "0ffice shoes"                     "shoes 1916"                       "shoes 1"
#  [13] "shoes 10"                         "shoes 11"                         "shoes 1 year old"     ...

Enhance or not enhance, that’s the question

If you checked the links provided earlier, you may have noticed that {googleSuggestQueriesR} returns many more results in the case above than a single API request made via those links.

This is possible because the ‘enhanced’ parameter creates multiple combinations of the input query ‘shoes’ under the hood:

  • shoes a
  • shoes b
  • shoes c
  • shoes z
  • shoes 0
  • shoes 1
  • shoes 9

… and then sends a request for every combination. This makes the process much slower, but it also yields many more results.
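The combinations listed above are easy to reproduce in base R. The snippet below is an illustrative sketch only: `expand_query()` is a hypothetical name, and the package's internal implementation may differ.

```r
# Illustrative only: append every letter a-z and every digit 0-9 to the query
expand_query <- function(query) {
  paste(query, c(letters, 0:9))
}

variations <- expand_query("shoes")

head(variations, 3)
# [1] "shoes a" "shoes b" "shoes c"

length(variations)
# [1] 36
```

With 36 variations and around 10 suggestions per request, a few hundred results for a single word seem plausible, which is roughly in line with the 358 keywords returned earlier.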

What if you don’t want to use it? Simple, set ‘enhanced’ to FALSE:

keyword_suggestions <- suggest_keywords(
  queries = example_query,
  lang = "en",
  interval = 1,
  enhanced = FALSE) # enhancing switched off

length(keyword_suggestions)
# [1] 10

str(keyword_suggestions)
#  chr [1:10] "shoes" "shoes for men" "shoes online" "shoes with wheels" "shoes size" "shoes shop" "shoes 2020" ...

keyword_suggestions
#  [1] "shoes"               "shoes for men"       "shoes online"        "shoes with wheels"   "shoes size"
#  [6] "shoes shop"          "shoes 2020"          "shoes online poland" "shoestring"          "shoeske"

Use case #2: Specific keyword exploration

OK, so we have our keyword explored. What if I want to find something more specific about ‘cheap shoes’ but for multiple locations, like Bristol, New York and Warsaw?

No problem. I have added two simple helper functions for that: ‘create_enhanced_keywords’ and ‘create_custom_enhanced_keywords’.

First option: we only want suggestions for our idea and three cities

# first, create the input queries
input_queries_specific_1 <- create_custom_enhanced_keywords(
  queries = "cheap shoes",
  suffix_vec = c("Bristol", "New York", "Warsaw"))

input_queries_specific_1
# [1] "cheap shoes Bristol"  "cheap shoes New York" "cheap shoes Warsaw"

custom_suggestions <- suggest_keywords(
  queries = input_queries_specific_1,
  lang = "en",
  interval = 1,
  enhanced = FALSE)

length(custom_suggestions)
# [1] 10

str(custom_suggestions)
#  chr [1:10] "cheap shoes bristol" "buy shoes bristol" "cheap shoes new york" "cheap nike shoes new york" ...

custom_suggestions
#  [1] "cheap shoes bristol"               "buy shoes bristol"                 "cheap shoes new york"
#  [4] "cheap nike shoes new york"         "cheap running shoes new york"      "cheap shoes store in new york"
#  [7] "where to buy cheap shoes new york" "buy cheap nike shoes new york"     "cheap shoes warsaw"
# [10] "buy shoes warsaw"
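If you are curious what a helper like this boils down to, the query/suffix pairing can be approximated in base R. This is a sketch under assumptions: `combine_keywords()` is a hypothetical stand-in, not the package's code, and the ordering of results for multiple queries may differ from what the real helper returns.

```r
# Hypothetical stand-in for the helper, written with base R only
combine_keywords <- function(queries, suffix_vec) {
  # outer() pairs every query with every suffix; as.vector() flattens the matrix
  as.vector(outer(queries, suffix_vec, paste))
}

city_queries <- combine_keywords("cheap shoes", c("Bristol", "New York", "Warsaw"))
city_queries
# [1] "cheap shoes Bristol"  "cheap shoes New York" "cheap shoes Warsaw"
```

Passing three queries and the 26 letters to the same function would give 3 x 26 = 78 combinations, so the number of requests grows quickly.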

Second option: we want suggestions for our idea and three cities but enhanced with more combinations

input_queries_specific_2 <- create_custom_enhanced_keywords(
   queries = c("cheap shoes Bristol", "cheap shoes New York", "cheap shoes Warsaw"),
   suffix_vec = letters)

custom_suggestions2 <- suggest_keywords(
  queries = input_queries_specific_2,
  lang = "en",
  interval = 1,
  enhanced = FALSE)

length(custom_suggestions2)
# [1] 27

str(custom_suggestions2)
#  chr [1:27] "shoes new york and company" "buy cheap nike shoes new york" "where to buy cheap shoes new york" ...

custom_suggestions2
#  [1] "shoes new york and company"           "buy cheap nike shoes new york"
#  [3] "where to buy cheap shoes new york"    "discount shoes new york city"
#  [5] "cheap shoes in new york city"         "buy merrell shoes new york city"
#  [7] "affordable shoes in new york city"    "buy salomon shoes new york city"
#  [9] "shoes new york city"                  "shoes new york company"
# [11] "shoes new york city designer"         "cheap designer shoes new york"
# [13] "shoes new york fashion week"          "shoes new york gucci"
# [15] "cheap shoes in new york"              "cheap shoes store in new york"
# [17] "cheap nike shoes in new york"         "where to buy cheap shoes in new york"
# [19] "shoes new york marathon"              "cheap nike shoes new york"
# [21] "new york print show"                  "cheap running shoes new york"
# [23] "shoes new york transit"               "shoes new york times square"
# [25] "shoes new york winter"                "shoes new york yankees"
# [27] "buy shoes in warsaw"
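Not every suggestion will be on topic (see "new york print show" above), so a bit of post-processing may help before the keywords go anywhere else. Below is a small base R sketch of one possible clean-up; the `city_of()` helper and the filtering rule are my own illustrative choices, not part of the package.

```r
# A few suggestions copied from the output above
suggestions <- c(
  "shoes new york and company",
  "buy cheap nike shoes new york",
  "new york print show",
  "buy shoes in warsaw"
)

# keep only suggestions that mention the product
relevant <- suggestions[grepl("shoes", suggestions, fixed = TRUE)]

# tag each remaining suggestion with the first city it mentions (NA if none)
cities <- c("bristol", "new york", "warsaw")
city_of <- function(x) {
  hit <- cities[vapply(cities, grepl, logical(1), x = x, fixed = TRUE)]
  if (length(hit) == 0) NA_character_ else hit[1]
}
by_city <- vapply(relevant, city_of, character(1), USE.NAMES = FALSE)

# count how many relevant suggestions each city produced
table(by_city)
```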

Summary

Thanks for reading! I hope that you find googleSuggestQueriesR useful. If you would like a specific feature or have any questions, feel free to comment below 🙂 In case of bugs, please let me know via Issues on the package’s GitHub.