Interactive custom date ranges comparison with Looker Studio and BigQuery
Two years ago, I started working on this interactive date range comparison solution within Data Studio, to fill a gap in the tool. It required using BigQuery as a data source and the context at the time was not like today, even though the affinities between Looker Studio and BigQuery were already strong.
Today, they are even stronger. In general, BigQuery is much more used in the digital analytics world, and the Google Cloud BI team working on Looker Studio continues to advance and enrich the features between the two softwares. The new native BigQuery integration for LS is another additional proof of this.
During these 2 years, I tested and improved this solution through internal projects and reports for clients. During these 2 years, BigQuery has also become an almost systematic ally as a data preparation layer for sustainable Looker Studio reports, integrated into a complete analytics device, regardless of the data collection software, with or without a real data warehouse for ingestion and processing.
Therefore, the solution I describe in this article is not perfect but it allows addressing one of the most requested unsatisfied demands, while waiting for the functionality to be directly supported by Looker Studio…
So, let’s go.
The result
In the video below, I created a simple report presenting sales data by country. You can display them over a primary date range and compare them with another customized period. The purpose of the tutorial will be to reproduce this report.
I chose to offer the following comparison options, namely that the user can compare the data of the primary date range with the data of:
- The previous year
- The previous year, the same days of the week
- The preceding period (the same number of days preceding the number of days selected for the primary period)
- A custom period, entered with 2 date fields respecting the format “YYYY-MM-DD”
The comparison is made only if a box is checked to activate the request (“compare”). This avoids loading unwanted data while entering the different fields.
Ingredients
To create this report, we will need the following:
- A BigQuery data source through a custom query
- Parameters to inject comparison preferences into the BigQuery query
- Calculated fields at the data source level
- Looker Studio controls reflecting the user’s comparison preference and directly corresponding to the valued parameters of the query.
- Charts covering the main comparative representations typically needed in a custom report: time series, tables with variation, comparative bar charts, scorecard with variation
- Calculated fields at the chart level
- Conditional formatting for variations
The list may seem long, but you will see that the function of each element is very easy to understand, for a simple configuration.
0. Example of data set
For this tutorial, I used a very basic dataset of daily sales by country. Let’s quickly look at the schema of the the prepared BigQuery table:
- date (Date with daily partition)
- continent (String)
- country (String)
- ales (Integer)
Of course, you can apply this tutorial to any dataset from BigQuery, with the number of dimensions and metrics required for your interactive report.
For example, in the digital analytics world, you could prepare a table containing dimensions and metrics compatible with a visit scope (traffic source data, user devices and technologies, landing page, visits, visits with conversions, page views, cumulative visit time…).
Let’s continue with my simple dataset example.
1. Data source: Principle of the BigQuery query
First, let’s go to our Looker Studio report and add a BigQuery data source with “Custom Query” type, using the Google Cloud project in which your table is stored.
Two periods, two sub-queries with a union
The query we will use consists of 2 sub-queries joined by a union (UNION ALL) and corresponding to the 2 desired date ranges:
- The primary period, chosen in the report with the date range selector
- The comparison period, defined by the combination of parameters we will detail later
Two additional context fields to facilitate the comparison
In addition to the previously listed fields (date, continent, country, sales), we will add 2 additional data to the schemas shared by these 2 sub-queries:
- The concerned period
- The “date_range” field is fed with the value “current” for the primary period and with “compared” for the comparison period
- The temporal order of the date for each subset (row_number field): The first date of the primary period and the comparison period are valued with the integer 1. This field will align the dates to compare, especially for time series.
2. Parameters of the custom query
Before writing the custom query, we will create the related parameters.
Main date range
Enable the main date range
Custom parameters
- “Compared to”
- Identifier “date_range_for_comparison”
- Type “Text”, with list of values:
- Previous year
- Previous year the same day of week
- Previous period
- Custom
- Cardinality : Single select
- Default value “Previous year”
- “Compared date range start”
- Identifier “compared_date_range_start”
- Type “Text”, any value
- Default value “2023-12-01” ou any other value respecting the chosen format for dates
- “Compared date range end”
- Identifier “compared_date_range_end”
- Type “Text”, any value
- Default value “2023-12-31”
- “Compare”
- Identifier “compare”
- Type “Boolean”
- Checked by default (valeur true by default)
3. Data source: BigQuery SQL custom query
Now let’s move to the query itself and specify that:
- We will unite the sub-query of the main data range data and that of the comparison period data with a “UNION ALL”
- The second sub-query is primarily conditioned by the activation of the comparison with the Boolean “Compare”
- The conditions related to each type of comparison are also conditions in the second sub-query
- Comparing with data from the previous year on the same days of the week is equivalent to comparing the same period by subtracting 52 weeks.
So here is the query to enter as a data source, on the same screen where you entered the previous parameters. Simply adapt:
- the project, the dataset, and the table in the “FROM” clauses
- the field names in the “SELECT” clauses
SELECT
date, continent, country, sales
,"current" as date_range
,DATE_DIFF( date, PARSE_DATE('%Y%m%d', @DS_START_DATE), DAY)+1 as row_number
FROM your-project.your_dataset.viz_sales_by_country
WHERE TRUE
AND
date between PARSE_DATE('%Y%m%d', @DS_START_DATE) AND PARSE_DATE('%Y%m%d', @DS_END_DATE)
UNION ALL
SELECT
date, continent, country, sales
,"compared" as date_range
,DATE_DIFF( date,
CASE @date_range_for_comparison
WHEN 'Custom' THEN PARSE_DATE('%Y-%m-%d', @compared_date_range_start)
WHEN 'Previous year' THEN date_sub(PARSE_DATE('%Y%m%d', @DS_START_DATE),INTERVAL 1 YEAR)
WHEN 'Previous year the same day of week' THEN date_sub(PARSE_DATE('%Y%m%d', @DS_START_DATE),INTERVAL 52 WEEK)
WHEN 'Previous period' THEN date_sub(PARSE_DATE('%Y%m%d', @DS_START_DATE),INTERVAL DATE_DIFF(PARSE_DATE('%Y%m%d', @DS_END_DATE),PARSE_DATE('%Y%m%d', @DS_START_DATE),DAY)+1 DAY)
END, DAY)+1 as row_number
FROM your-project.your_dataset.viz_sales_by_country
WHERE @compare
AND
(
(
@date_range_for_comparison = 'Previous year'
AND date between date_sub(PARSE_DATE('%Y%m%d', @DS_START_DATE),INTERVAL 1 YEAR) AND date_sub(PARSE_DATE('%Y%m%d', @DS_END_DATE),INTERVAL 1 YEAR)
)
OR
(
@date_range_for_comparison = 'Previous year the same day of week'
AND date between date_sub(PARSE_DATE('%Y%m%d', @DS_START_DATE),INTERVAL 52 WEEK) AND date_sub(PARSE_DATE('%Y%m%d', @DS_END_DATE),INTERVAL 52 WEEK)
)
OR
(
@date_range_for_comparison = 'Previous period'
AND date between date_sub(PARSE_DATE('%Y%m%d', @DS_START_DATE),INTERVAL DATE_DIFF(PARSE_DATE('%Y%m%d', @DS_END_DATE),PARSE_DATE('%Y%m%d', @DS_START_DATE),DAY)+1 DAY) AND date_sub(PARSE_DATE('%Y%m%d', @DS_START_DATE),INTERVAL 1 DAY)
)
OR
(
@date_range_for_comparison = 'Custom'
AND date between PARSE_DATE('%Y-%m-%d', @compared_date_range_start) AND PARSE_DATE('%Y-%m-%d', @compared_date_range_end)
)
)
4. Calculated fields of the data source
Now, let’s add 2 calculated fields to the data source – 2 metrics that will help us simplify the configuration of the charts.
Sales
SUM(IF(date_range ='current', sales ,null))
Compared sales
SUM(IF(date_range ='compared', sales ,null))
5. Looker Studio controls for the comparison data widget
Add a classic date range control and the following controls for the widget block dedicated to the date range to compare:
- A dropdown list with the “Compared to” parameter as a control field
- 2 input boxes using the “Compared date range start” and “Compared date range end” parameters as control fields, for the custom date range.
- In the example below, I also added a text element “Custom dates – From” before the first field and another text “To” between the 2 fields.
- A checkbox with the “Compare” parameter as a control field
In order to present the comparison period resulting from the entries made with the fields described above, I also added a single-line and single-cell Table chart using the following calculated field formula.
CONCAT( "Vs " ,
FORMAT_DATETIME("%b %e, %Y",
PARSE_DATE("%Y/%m/%d" , SUBSTR(MIN(IF(date_range= "compared",date,null)), 1, 10)))
, " - " ,
FORMAT_DATETIME("%b %e, %Y",
PARSE_DATE("%Y/%m/%d" , SUBSTR(MAX(IF(date_range= "compared",date,null)), 1, 10)))
)
6. Charts configuration
At last, let’s start the configuration of the different charts on the page.
6.1. Scorecard with percentage variation (scorecard)
For the scorecard above the time series:
- Data Source: Our BigQuery query
- Use the “Sales” metric (blue box, with a capital S)
- For the variation below the main metric
- Add a single-line and single-cell Table chart using the formula below as the metric
- Apply 2 conditional formatting rules to display the percentage variation in green if it contains a “+” and in red if it contains a “-“
- Center the content of the cell and place the chart below the scorecard
CONCAT(
IF (SUM(IF (date_range = 'current', sales ,NULL))>= SUM(IF(date_range = 'compared', sales,NULL)) , "+","")
,
CAST(ROUND(ROUND((SUM(IF(date_range = 'current', sales ,NULL))/ SUM(IF(date_range = 'compared', sales,NULL)))-1,3) *100,1) AS TEXT)
, "%")
6.2. Sales table chart by continent and country
This table chart is configured as follows:
- “Setup” tab
- Data Source: Our BigQuery query
- Dimensions: continent, country
- Enable “Drill down” and select “country” as the Default drill down level
- Metrics:
- Sales (blue box, with a capital S)
- Compared sales
- A calculated field at the chart level, named “% Δ.”, using the same formula as the one indicated for the variation of the scorecard above. We apply exactly the same principle on a breakdown by country and by continent
- Sort by Sales in descending order
- “Style” tab
- Apply 2 conditional formatting rules to display the variation (% Δ.) in percentage in green if it contains a “+” and in red if it contains a “-“
6.3. Bar chart with Sales by Country
Apply the configuration details below to create a “bar chart”:
- “Setup” tab
- Data Source: Our BigQuery query
- Dimension: country
- Breakdown dimension: date_range
- Metric: sales (green box with a lowercase s)
- Sort by sales in descending order
- Secondary sort: Formula below in ascending order with a AVG aggregation
- “Style” tab
- Apply colors by dimension values: In the example, dark red for “current” and light red for “compared”
- Adjust the other style options
You can also recreate your own legend by using formulas for date range calculations to single-cell table charts (see Chapter 5).
For the secondary sort
IF (date_range='current', 1, 2)
6.4. Sales Trends
Finally, we will create the time series chart by relying on a data blending to align the dots of the 2 periods. Remember, the data set returns the row_number field, correctly valued following the comparison type.
So, blend the data as described below and name the result “DB – For trends”.
- Left Table
- Data Source: Our BigQuery query
- Dimensions: row_number, date
- Metrics: Sales (blue box, with a capital S)
- Right Table
- Data Source: Our BigQuery query
- Dimensions: row_number
- Metrics: Compared sales
- Join between the 2 tables
- Left outer
- Join condition: row_number for the 2 tables
Then, create the time series chart:
- “Setup” tab
- Data Source: DB – For trends (our combination)
- Dimension: date
- Metrics: Sales, Compared Sales
- “Style” tab
- Series #1: Use the color corresponding to the primary period, here dark red
- Series #2: Use the color corresponding to the comparison period, here light red
- Adjust the other style options
Like for the bar chart, you can also recreate your own legend using formulas for date range calculations to single-cell table charts (see Chapter 5).
Voilà, we have finished the configuration. I concede that this might seem tedious, but it’s very useful on an interactive report, often used to compare performance over time.
Handling your data queries through BigQuery and Looker Studio
Finally, you may legitimately wonder how BigQuery and Looker Studio manage the custom queries generated by the interface we have just created.
- The data is cached for the user. Back and forth on 2 comparison options allows you to notice it
- On the other hand, yes, new data will be loaded if the user customizes the main and/or the comparison date range
- But using prepared tables like in this example allows you to scan/load a small volume of data
- And in fact, the new native integration of BigQuery being gradually deployed offers many new capabilities that facilitate this type of usage, its monitoring thanks to enriched metadata, and many other advanced features.