Features

Data blending with Looker Studio – How to blend data using left, right, inner, outer and cross join

Data blending with Looker Studio - How to blend data using left, right, inner, full outer & cross join types

The Looker Studio Google Data Studio team has just released a highly anticipated new feature paving the way for a lot of amazing possibilities about data manipulation & preparation:


A new data blending engine with join types (inner, outer, right and cross join types joining 😛 legacy left join) and the related new user interface!


Data blending is not just about mixing data from various sources, it is a solution for a lot of data manipulation cases. These welcomed features will even more multiply possibilities.

In my opinion, it is the most important release since the launching of parameters feature.

So now, let’s review the released new features and scratch the surface of a new huge data playground!


Table of contents


Video of features review

Excited and eager to see the new interface of the related amazing possibilities? I totally understand.

Nothing better than a small video to discover these new features and go through the following additions & modifications:




1. What is data blending inside Looker Studio

Before jumping into new features, let’s make a quick recap of core features of data blending, what you know or should know, and what had not changed.


1.1. The most important

Data blending lets us creating charts based on multiple data sources, called a blended data source or a data view.

Data blending is possible :

  • up to 5 data sources
  • joining them using from 0 to 10 join keys
  • also, with several instances of the same data source reflecting different configurations, different data sets

You can blend data using 3 main kinds of access on the Looker Studio interface:

  • Academic and data model centric access
    Menu Resource > Manage blended data, creating a new data view from scratch.
  • Streamlined & data centric access
    Clicking on “blend data” link (Data tab > Data source section) on a specific chart on which you need to widen the data set with additional data from another data source. In this case, the configuration of data blending will be initialized with the data set of the chart.
  • Visual & chart centric access
    Selecting 2 or more charts, right-clicking & selecting the contextual menu “blend data”. Regarding the selected charts, it will create a blended data & use it directly on a new generated chart. With this way to create blended data, selected data, configuration and chart type is all automatic, so to adapt, at least to replace the default name by something more user friendly, often to apply also a more consistent configuration.

Each instance of data source, with a specific configuration for the wished data set (join keys, dimensions, metrics, date range, filters and segment for Google Analytics) is named a table.


1.2. Advanced usages

It is possible to create calculated fields on each table to use it as a specific configuration storage for the blending. But local to the table, no way to use fields from other tables, so other data source.
So, most of (all) calculated fields are created on the chart level as a dimension, a metric or sorting field. Not ideal for complicated formula with business rules…

Blending data is also a nice way to reaggregate data: Output metrics from blended data are no more automatically aggregated contrary to their state from the original data source. So, they can be used as dimension or metric with the wished new aggregation method.


1.3. At last, for the first & previous season of data blending…

Until now, only left outer join was available as join type when you blend data…until now…

Let’s join me 😉 to review the new features to blend data!


2. The main new interface for data blending configuration

At first, let’s see how the user interface has changed to create and update blended data.

  • Joins configuration get a dedicated popin, accessible between each table clicking through a recap box summarizing the join type and the number of join keys
  • This panel contains the join keys and join types
    • Left outer join
    • Right outer join (new)
    • Inner join (new)
    • Full outer join (new)
    • Cross join (more explicit than before)
  • The join keys are no more common to all tables (up to 5 like said before) but specific to the wished relation between two neighboring table (new)
  • In spite of the change described above, the data blending designer keeps linear, so 1 table can be joined to 2 other tables

Google Data Studio data blending panel

Data Studio blended data - Configuration of tables join

3. Tables management

Creating blended data corresponds to build relations between tables. Because of the addition of the new join types, dimensions used on each table as join keys are now available also as final dimensions for the blended data, to use for your charts.

Indeed, previously, left joined blended data didn’t require to repeat join keys because the equality was always true by definition regarding the value of the left table. Only the join key(s) at the left table was added as dimension(s).

Now, because the relation is not strict, keys are all available.

At last, to distinguish fields sharing the same dimension name, the table name is used as a field name suffix. This suffix – between parenthesis – is displayed only if a blended data view contains the same field name from different fields & tables.

Consider this opportunity to give understandable and meaning names to your blended data (too often kept with a default name “Blended Data (X)”) at the same time that taking care of this new naming for tables.

At last, thanks to an option above the “save” button, fields repeated because of join keys can be hidden from the list fields of the final data view (for left, right, inner joins).


Google Data Studio - Blend data - Table name


4. Join types

Four join types are available, giving you a nice freedom to blend data regarding data sources models, the required final data and your preferred data manipulation tactic.



4.1. Left join (left outer join)

Data blending left join (Left outer join)

Left join type returns all rows from the left table, widened with matched rows from the table on the right side of the join. The only option for the first 4 years of existence of data blending feature inside Google Looker Studio.

Understanding & tactics
The master table is at the left. You don’t need more than widening master table – playing the role of data set reference – with additional columns. Nonexistent/missing values from the right table are considered as an information you wish to keep in your final blended data set.

Google Data Studio - Example of data blending with left join type

Examples of use cases

  • Stats on all products: Your product information system as left table and your stats data source by product at the right, the Product SKU as the join key. Products without stats needs to be visible
  • CTA Click rate: Your views or impressions at the left, functionally mandatory before the clicks at the right
  • Transcoding: The identifiers with data at the left and the corresponding labels at the right


4.2. Right join (right outer join)

Data blending right join (right outer join)

Right join type returns all rows from the right table, widened with matched rows from the table on the left side of the join. The same principle as left join, with opposite table positions.

Understanding & tactics
Why would we need right join if it is the same than left join against a change of tables positions? Inside Looker Studio, it is a way to combine both on the same blending, to widen a central table thanks to 2 other tables. One at the left with right join type, and another at the right with left join type.

Google Data Studio - Example of data blending with right join type

Example of use case

  • Expand statistics with 2 complementary data sources: Let’s say we need to expand product sales with product information on one side and user segments with CRM data on the other side. See below the example of data blending taking advantage of right join type
Example of Data Studio blended data with left and right joins


4.3. Inner join

Data blending - Inner join

Inner join type returns only rows that match on both tables with join condition.

Understanding & tactics
…so, all other rows of each data set are removed. It helps to focus on the overlap of data. It is also an easy way to filter your data set with dynamic data.

Google Data Studio - Example of data blending with inner join type

Examples of use cases

  • Calculate of the headcount of an intersection of two population segments
  • Isolate Google Analytics users with a cross devices journey
  • Filter analytics conversions claimed by affiliation platform


4.4. Full outer join

Data blending - Outer join (full outer join)

Full outer join type returns all rows from both tables, even if join condition is not met.

Understanding & tactics
Rows meeting join condition(s) are linked on the same final merged rows and the final data set contains also isolated rows of each source table with null values for fields of the other source table. Everything is kept and you can consider the 3 kinds of data in your reporting or for your data analysis.

Google Data Studio - Example of data blending with full outer join type

Example of use case

  • Online Sales from the back office on one side, and Google Analytics transactions on the other side, blended with transaction id as join key. You need to get sales common to both systems, but also which sales have been cancelled from the back office, and which orders have not been tracked on Google Analytics


4.5. Cross join

Data blending - Outer join (full outer join)

Cross join type returns a cartesian product of all tables rows.

Understanding & tactics
In other words, all rows of the left data set will be multiplied by all rows of right data set, with the related data repetition.

This is the only join type that doesn’t require a join key. It was already available with the previous version of data blending.

Google Data Studio - Example of data blending with cross join type

Example of use case

I wrote some articles – like the one about YoY metric comparison – taking advantage of this to access to all data on each row.


Final thoughts

No doubt, this new feature is a game changer, opening a lot of possibilities. We only scratch the surface through this feature review.

I remember how the Looker Studio users community was enthusiastic when parameter feature was released, how real life applications were plentiful.

We could except the same thing with this new data blending capabilities.

We need also to be careful, taking care to recognize the related limits (performance, maintenance …) to use this feature for data preparation Vs do it outside of Looker Studio. This kind of concerns will come practicing on use cases. So let’s play first!

Practicing is also finding tricky and out of the box specific usages, but that is another story 😉


Je peux me joindre à vous ?