Docs

gsvi.connection

Holds GoogleConnection class.

This module provides the interface to Google Trends via the GoogleConnection class. It interacts with GT’s time series widget via the get_timeseries() method and with the related queries widget via get_related_queries().

class gsvi.connection.GoogleConnection(language='en-US', timezone=0, timeout=5.0, verbose=False)

Connection to Google Trends.

Offers the interface to Google Trends. For now, it connects to the time-series widget and related queries widget.

language

The language, defaults to ‘en-US’

timezone

The timezone in minutes, defaults to 0

timeout

The timeout for the GET-requests.

verbose

Print request URLs?

Raises: requests.exceptions.RequestException

get_related_queries(queries, category=<CategoryCodes.NONE: 0>)

Makes the related-queries request to Google Trends for the specified queries. This method only does very basic input checks as this is handled by the objects using the connection.

Parameters:
  • queries

    The queries as a list of dicts with ranges as tuples of datetime objects. A maximum of 5 queries is supported. Example:

    [{'key': 'apple', 'geo': 'US',
      'range': (start, end)},
     {'key': 'orange', 'geo': 'US',
      'range': (start, end)}]
    
  • category – The category for the query, defaults to CategoryCodes.NONE.
Returns:

A dict that maps each keyword to pd.DataFrames holding its top and rising queries.

Raises:
  • ValueError
  • requests.exceptions.RequestException
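
The query structure described above can be assembled from plain dicts. The following is a minimal sketch: the helper name build_queries and its checks are hypothetical and not part of gsvi. Only the 'key', 'geo' and 'range' fields and the limit of five queries are taken from the parameter description.

```python
import datetime

MAX_QUERIES = 5  # limit stated in the parameter description


def build_queries(keywords, geo, start, end):
    """Hypothetical helper: build the list of query dicts described above."""
    if not 1 <= len(keywords) <= MAX_QUERIES:
        raise ValueError('between 1 and 5 queries are supported')
    if start >= end:
        raise ValueError('start must lie before end')
    # Every query shares the same region and date range in this sketch.
    return [{'key': key, 'geo': geo, 'range': (start, end)} for key in keywords]


start = datetime.datetime(2019, 1, 1)
end = datetime.datetime(2019, 6, 30)
queries = build_queries(['apple', 'orange'], 'US', start, end)
```

The resulting list has the shape expected by the queries parameter; the connection methods themselves are not called here.
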
get_timeseries(queries: List[Dict[str, Union[str, Tuple[datetime.datetime, datetime.datetime]]]], category=<CategoryCodes.NONE: 0>, granularity='DAY') → List[pandas.core.series.Series]

Makes the timeseries request to Google Trends for the specified queries. This method only does very basic input checks as this is handled by the objects using the connection.

Parameters:
  • queries

    The queries as a list of dicts with ranges as tuples of datetime objects. A maximum of 5 queries is supported. Example:

    [{'key': 'apple', 'geo': 'US',
      'range': (start, end)},
     {'key': 'orange', 'geo': 'US',
      'range': (start, end)}]
    
  • category – The category for the query, defaults to CategoryCodes.NONE.
  • granularity – The step length of the requested series, one of ‘DAY’, ‘MONTH’ or ‘HOUR’. Defaults to ‘DAY’. Depending on the query ranges, the granularity returned by GT might differ. Check the SVSeries docs for details.
Returns:

A list of pd.Series, one for each query. Trends normalizes the values jointly over all queries, so that the overall maximum is set to 100.

Raises:
  • ValueError
  • requests.exceptions.RequestException
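
To illustrate what normalizing over the maximal value over all queries means, here is a small sketch with made-up numbers. The function normalize_jointly is hypothetical and only mimics the idea; the actual scaling is done by Google Trends, and its exact rounding is an assumption here.

```python
def normalize_jointly(series_list):
    """Scale several series by one shared factor so the overall maximum is 100."""
    overall_max = max(max(values) for values in series_list)
    if overall_max == 0:
        # Nothing to scale if all values are zero.
        return [list(values) for values in series_list]
    return [[round(100 * value / overall_max) for value in values]
            for values in series_list]


# Made-up raw volumes for two queries over the same three days.
apple = [20, 40, 80]
orange = [4, 8, 20]
scaled = normalize_jointly([apple, orange])
```

Because both series share one scaling factor, only the series containing the overall maximum reaches 100; the other series keeps its relative size.
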

gsvi.timeseries

Holds time series request structure for Google Trends.

The SVSeries class implements an algorithm to get arbitrary-length time series with values in [0, 100] from GT in the get_data() method. This algorithm ensures that GT itself handles the normalization, thus making the series easier to compare. It can fetch uni- and multivariate queries.

Example usage:

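import datetime

from gsvi.connection import GoogleConnection
from gsvi.timeseries import SVSeries

gc = GoogleConnection(timeout=10)
start = datetime.datetime(year=2017, month=1, day=1)
end = datetime.datetime(year=2019, month=9, day=30)
# granularity is a keyword argument of SVSeries.multivariate()
series = SVSeries.multivariate(gc,
            [{'key': 'apple', 'geo': 'US'},
            {'key': 'microsoft', 'geo': 'US'}],
            start, end, granularity='DAY')
data = series.get_data()
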
class gsvi.timeseries.SVSeries(connection: gsvi.connection.GoogleConnection, queries: List[Dict[str, str]], bounds: Tuple[datetime.datetime, datetime.datetime], **kwargs)

Container for uni- or multivariate google search volume time series.

The main purpose of this class is to get arbitrary-length time series data from Google Trends for one or more keywords.

connection

The connection to Google Trends.

queries

The user-specified queries dicts as list [{‘key’: ‘word’, ‘geo’: ‘country’}, …].

bounds

The date range for the time series. Depending on the location of the maximum and the granularity, the lower bound may not hold (see get_data()).

category

The category for the search volume. Possible categories are in the CategoryCodes enum.

granularity

The series granularity, either ‘DAY’, ‘HOUR’ or ‘MONTH’.

data

The search volume data after the get_data() call.

request_structure

The query fragments in levels after the get_data() call, showing how the optimum was obtained.

is_consistent

Flag indicating if the data is still consistent with the other attributes of the instance. This is set to True when get_data() runs successfully.

CAUTION: One has to take care when specifying certain time span/granularity combinations. Google Trends switches from returning weekly to monthly data when the span is >= 1890 days (63 months). SVSeries can handle this by extending the lower boundary date if necessary. The same happens with daily data when the span is longer than 269 days AND not a multiple of 269 days. For hourly data, the switch to minute data happens at < 3 days. This weird behavior has changed in the past and might change again in the future! See get_data() for more on this problem.
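
The thresholds in the caution note can be checked before building a series. The sketch below is one literal reading of that note, not the library's code: the function name may_extend_lower_bound, the use of (end - start).days as the span, and the handling of the ‘MONTH’ and ‘HOUR’ cases are all assumptions.

```python
import datetime

# Thresholds quoted in the caution note above.
DAILY_WINDOW = 269       # days
MONTHLY_MIN_SPAN = 1890  # days (63 months)
HOURLY_MIN_SPAN = 3      # days


def may_extend_lower_bound(start, end, granularity):
    """Hypothetical check: could the lower bound be moved for this request?"""
    span = (end - start).days
    if granularity == 'MONTH':
        # Assumption: monthly data is only returned from 1890 days upwards.
        return span < MONTHLY_MIN_SPAN
    if granularity == 'DAY':
        return span > DAILY_WINDOW and span % DAILY_WINDOW != 0
    if granularity == 'HOUR':
        # Assumption: below 3 days, minute data would be returned instead.
        return span < HOURLY_MIN_SPAN
    raise ValueError("granularity must be 'DAY', 'HOUR' or 'MONTH'")


start = datetime.datetime(2019, 1, 1)
exact_fit = may_extend_lower_bound(
    start, start + datetime.timedelta(days=538), 'DAY')
odd_span = may_extend_lower_bound(
    start, start + datetime.timedelta(days=300), 'DAY')
```

Since the note warns that this behavior has changed before, the constants should be treated as a snapshot rather than a guarantee.
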

get_data(delay=10, force_truncation=False) → Union[pandas.core.frame.DataFrame, pandas.core.series.Series]

Builds the request structure for the queries and builds requests to Google Trends such that the resulting time series values are normalized to [0, 100]. The returned data might be extended beyond the lower bound specified in the query. This is necessary because GT returns data in different intervals depending on the specified range and granularity. One can enforce the correct length but might get data not in [0, 100] in case the maximum falls into the part that gets truncated.

Parameters:
  • delay – Put delay seconds between requests to avoid getting banned.
  • force_truncation – Truncate to the specified bounds even if the maximal volume (100) falls into the part that gets truncated. By default, the series is not truncated in that case.
Returns:

The normalized time series as pd.Series (univariate) or pd.DataFrame (multivariate).

Raises:

requests.exceptions.RequestException

Warning

UserWarning: issued if truncation is not forced and the maximum lies in the part that would be truncated.
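
The effect of forced truncation on the normalization can be seen with a toy series. This is an illustrative sketch with made-up values and a hypothetical truncate helper; it does not call gsvi and uses plain lists instead of pandas objects.

```python
import datetime


def truncate(series, lower_bound):
    """Drop all (day, value) pairs that lie before the requested lower bound."""
    return [(day, value) for day, value in series if day >= lower_bound]


first_day = datetime.date(2019, 1, 1)
values = [100, 60, 30, 45, 50]  # made-up, already normalized to [0, 100]
series = [(first_day + datetime.timedelta(days=offset), value)
          for offset, value in enumerate(values)]

# Suppose the series was extended by one day beyond the requested start.
requested_start = datetime.date(2019, 1, 2)
truncated = truncate(series, requested_start)

peak_before = max(value for _, value in series)
peak_after = max(value for _, value in truncated)
```

Here the maximum sits in the extended part, so the truncated series no longer reaches 100. This is the situation in which get_data() keeps the extended series by default.
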

classmethod multivariate(connection: gsvi.connection.GoogleConnection, queries: List[Dict[str, str]], start: datetime.datetime, end: datetime.datetime, **kwargs)

Builds a multivariate search volume series. Initially, the series holds no data. Call get_data() to fill it.

Parameters:
  • connection – The GoogleConnection to use for the requests.
  • queries – A list of query dicts.
  • start – The start of the series >= 2004/01/01.
  • end – The end of the series <= now
Keyword Arguments:
 
  • granularity – The granularity of the series (‘DAY’, ‘HOUR’ or ‘MONTH’). Defaults to ‘DAY’ if not given.
  • category – Volume for a specific search category (see gsvi.catcodes). Defaults to CategoryCodes.NONE if not given.
Returns:

A SVSeries with empty data.

Raises:

ValueError

classmethod univariate(connection: gsvi.connection.GoogleConnection, query: Dict[str, str], start: datetime.datetime, end: datetime.datetime, **kwargs)

Builds a univariate search volume series. Initially, the series holds no data. Call get_data() to fill it.

Parameters:
  • connection – The GoogleConnection to use for the requests.
  • query – The query dict.
  • start – The start of the series >= 2004/01/01.
  • end – The end of the series <= now
Keyword Arguments:
 
  • granularity – The granularity of the series (‘DAY’, ‘HOUR’ or ‘MONTH’). Defaults to ‘DAY’ if not given.
  • category – Volume for a specific search category (see gsvi.catcodes). Defaults to CategoryCodes.NONE if not given.
Returns:

A SVSeries with empty data.

Raises:

ValueError

gsvi.related

Holds related queries for Google Trends searches.

RelatedQueries lets you specify one or multiple queries and fetches the related queries from Google Trends. The returned data contains the related queries for each query, the corresponding value and the link to the Google Trends search for the related query.

Example usage:

import datetime

from gsvi.connection import GoogleConnection
from gsvi.related import RelatedQueries

gc = GoogleConnection(timeout=10)
start = datetime.datetime(year=2017, month=1, day=1)
end = datetime.datetime(year=2019, month=9, day=30)
related = RelatedQueries.multiple(gc,
            [{'key': 'apple', 'geo': 'US'},
            {'key': 'microsoft', 'geo': 'US'}],
            start, end)
data = related.get_data()

class gsvi.related.RelatedQueries(connection: gsvi.connection.GoogleConnection, queries: List[Dict[str, str]], bounds: Tuple[datetime.datetime, datetime.datetime], **kwargs)

Container for Google Trends related queries.

The main purpose of this class is to get and hold related query data for one or multiple user-specified queries (i.e. keyword and region).

connection

The connection to Google Trends.

queries

The user-specified queries dicts as list [{‘key’: ‘word’, ‘geo’: ‘country’}, …].

bounds

The date range for the request.

category

The category for the search volume. Possible categories are in the CategoryCodes enum.

data

The related-queries data after the get_data() call.

is_consistent

Flag indicating if the data is still consistent with the other attributes of the instance. This is set to True when get_data() runs successfully.

get_data() → Dict[str, Dict[str, pandas.core.frame.DataFrame]]

Gets the related queries for the specified queries from Google Trends. For each key, the returned data contains the list of related queries, their values and links to Google Trends. For information on how to interpret the values, please refer to Google Trends. A call to Google Trends is only made if it is the first call or the cached data is inconsistent with the other fields of the object.

Returns:

A dict of dicts with top and rising related queries for each passed query.

Raises:

requests.exceptions.RequestException

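
The caching behavior described for get_data() can be pictured with a small stand-in class. This is a sketch of the general pattern only: CachedRequest, fake_fetch and the way the is_consistent flag is reset through a property setter are assumptions for illustration, not the RelatedQueries implementation.

```python
class CachedRequest:
    """Hypothetical stand-in for an object that caches fetched data."""

    def __init__(self, fetch, keyword):
        self._fetch = fetch          # callable that performs the expensive request
        self._keyword = keyword
        self.data = None
        self.is_consistent = False   # nothing has been fetched yet

    @property
    def keyword(self):
        return self._keyword

    @keyword.setter
    def keyword(self, value):
        self._keyword = value
        self.is_consistent = False   # cached data no longer matches the fields

    def get_data(self):
        # Only fetch on the first call or after the fields have changed.
        if not self.is_consistent:
            self.data = self._fetch(self._keyword)
            self.is_consistent = True
        return self.data


calls = []


def fake_fetch(keyword):
    """Record each request instead of contacting a server."""
    calls.append(keyword)
    return {'keyword': keyword}


request = CachedRequest(fake_fetch, 'apple')
request.get_data()
request.get_data()          # served from the cache, no second request
request.keyword = 'orange'  # invalidates the cached data
request.get_data()
```

In this sketch, three get_data() calls lead to only two requests, because the second call finds the cached data still consistent.
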
classmethod multiple(connection: gsvi.connection.GoogleConnection, queries: List[Dict[str, str]], start: datetime.datetime, end: datetime.datetime, **kwargs)

Builds a RelatedQueries object for multiple queries. Initially, the object holds no data. Call get_data() to fill it.

Parameters:
  • connection – The GoogleConnection to use for the requests.
  • queries – A list of query dicts.
  • start – The start of the series >= 2004/01/01.
  • end – The end of the series <= now
Keyword Arguments:
 
  • category – Volume for a specific search category (see gsvi.catcodes). Defaults to CategoryCodes.NONE if not given.

Returns:

A RelatedQueries object with empty data.

Raises:

ValueError

classmethod single(connection: gsvi.connection.GoogleConnection, query: Dict[str, str], start: datetime.datetime, end: datetime.datetime, **kwargs)

Builds a RelatedQueries object for a single query. Initially, the object holds no data. Call get_data() to fill it.

Parameters:
  • connection – The GoogleConnection to use for the requests.
  • query – The query dict.
  • start – The start of the series >= 2004/01/01.
  • end – The end of the series <= now
Keyword Arguments:
 
  • category – Volume for a specific search category (see gsvi.catcodes). Defaults to CategoryCodes.NONE if not given.

Returns:

A RelatedQueries object with empty data.

Raises:

ValueError

gsvi.catcodes

Holds Google Trends category codes.

class gsvi.catcodes.CategoryCodes

Holds Google Trends category codes.