Data sources

How to reveal and leverage selected date range of Google Data Studio reports

How to get Google Data Studio selected date range and use it

Need to access to selected date range and related time repository

Most of data sources and related APIs provide rows of dated data only if the requested data are available (in other words, when the combination of all requested fields gets values), otherwise no row is returned. It is quite obvious but it also drives the way to manage data when temporality is key inside Google Data Studio reports.

On the other hand, Data Studio doesn’t provide for the moment a way to access to contextual information of the live state of the report, including which date range has been selected (or automatically applied). Only data sources can provide data to leverage for time data…but, as said before, all cases cannot be solved with data dates, especially when no data as been collected for all dates or other required time granularity.

So, we need sometimes a consistent time and calendar management, a dates repository without dependency with visualized data. An absolute reference of time.


Google BigQuery for a trivial small query

Google BigQuery can address nicely complex needs about cloud data storage, scalability, speed, everything catalyzed by complementary amazing tools from Google Cloud Platform and proposed with an attractive pricing plan based on queried and stored data.

But it is not the subject of this article.

Maybe you already know how I like finding solutions – out of the box – for data viz/reporting requirements not yet directly covered by Data Studio built-in features.

Let’s use BigQuery and its related implementation of Structured Query Language (SQL) as an easy and free way to get a service providing absolute time and calendar data. We will not store and query dynamic data, so we will not be charged for it. We will just take advantage of the SQL engine to query calculated rows about dates & time dimensions. Just that 😇

So, this time, we will hijack Google BigQuery to use 0,001% of its capabilities, just to get more context about dates and use it to improve displayed data. 😋



Example and Ingredients

Let’s assume that you’ve got already your main data source, providing dated data but not returning rows for every day. For this article, I use the example of daily sold quantity of a specific product category from Google Analytics. I would like to calculate the average of sold quantity by day, even if there is no sale (and any data) collected for some specific days.

So, of course, count distinct values on data date field or date difference between min and max data date field values are not reliable options to calculate the number of day, so the average.

In addition to our main data source, we will need very few elements:

  • An BigQuery project (an empty project in BigQuery sandbox is enough)
  • A BigQuery data source with a custom query
  • Blended data joining BigQuery time data with your main data
  • Some calculated fields on our charts powered by blended data

BigQuery (sandbox) project

First, we need a BigQuery project inside Google Cloud Platform (GCP).

Don’t worry, our GCP needs are quite simple and it doesn’t require any billing information. If you have never configured a BigQuery project, you can use the BigQuery sandbox. We will not be blocked by the sandbox limitations. You will find how to do in this quick start documentation.

So, create your project using the sandbox or use your existing environment, then go back to your Data Studio report. And of course, no need to create a BigQuery dataset and the related tables. We will not store data inside BigQuery.


BigQuery sandbox project for Data Studio data source


BigQuery data source inside Google Data Studio

To configure our BigQuery data source, we will code a custom query, enabling date range parameters. This feature allows you to provide start and end date from report date range to built your custom query.

Here is the main trick: Data Studio cannot provide us directly the selected date range of the report to allow us to compute data regarding time context. So we will take advantage of the query language of BigQuery, not to use date range parameters to filter data stored on BigQuery table. Nope. But to build a simple query to get dates list of selected range 😈

So, add and save your new BigQuery data source as described below:

  • Name in the example here: “BQ – Calendar”
  • Connector: BigQuery from Google
  • Type: Custom query
  • Project: Your (sandbox) project (here “My First Project”)
  • Custom query: Copy/paste the query below
  • Check “Enable date range parameters”

Through this query, we request all the dates between the start and the end of the date range, named “calendar_date”.

SQL custom query
SELECT
   *
FROM 
   UNNEST(
      GENERATE_DATE_ARRAY(
         PARSE_DATE('%Y%m%d',@DS_START_DATE)
        ,PARSE_DATE('%Y%m%d',@DS_END_DATE)
        ,INTERVAL 1 DAY
   )
) AS calendar_date

The screenshot below sums up the full configuration of the BigQuery data source.


BigQuery data source for dates data

2. Blended data

The data source corresponding to the date range repository is ready. Let’s blend it with the main data source.

  • As left table (table 1)
    • Data source: BigQuery data source (“BQ – Calendar”)
    • Dimension : calendar_date, renamed as “Date”
  • As right table (table 2)
    • Data source: Main data source, here Google Analytics data source
    • Dimension: Date, renamed as “Data date”
    • Other options: to adapt, here quantity as metric
  • Join configuration
    • Operator: Left outer join
    • Join keys: Date and Data date

Data studio blended data source for dates without data

3. Chart configuration

At last, adapt your reporting to use the blended data instead of your main data source and create chart calculated fields to take advantage of our dates repository.

Here 2 examples of calculated fields:

  • Quantity: A simple nullity test to replace absent values by zero
  • Average of daily quantity to divide the total quantity by the real number of days

Quantity
IFNULL(Quantity,0)

Daily quantity
SUM(Quantity)/COUNT_DISTINCT(Date)


Configuration review

At last, let’s see how everything is configured and let’s play with it. Voilà.



Final thoughts

This is just a first example describing how we can use BigQuery as a service provider of absolute or contextual data and the possibilities are endless, and not limited to dates & time. I’ve got in mind a visualisation use case to work around some limitations of Data Studio, probably a topic for another blog post.

On the other hand, As an extension of this article, I’m wondering if delegating some additional calculation or formatting about dates would be advantageous for performance purposes, instead of doing it inside Data Studio with calculated fields engine.

At last, I let you imagine how it would be amazing to get dynamic parameter feature, and so to be able to provide to BigQuery, not just date range data, but also data from other data sources, and so use BigQuery as a computing engine regardless of the data source. Just saying…


Il faut laisser le temps au temps