Data Join
Data join is term Coupler.io used to describe the process of combining two or more sources using LEFT JOIN (from SQL terminology) - meaning, all the elements from the left set were kept while only matching elements from the right set is included.
Feature description
Let's take this example to understand this feature further. We have these source tables:
Contacts:
Department:
Goal: We want to get a list of "Contacts" under the "Partner sales" department. Currently, the "Contacts" table only has the IDs.
How to join data from multiple sources?
1. After adding the needed sources, go to the "Transform data" step and select " CREATE JOIN":
2. Select the sources to join and the columns that will be used as the key:
3. Click the Join button and we will get the following results:
4. After joining the data, you can apply other transformations such as: hide and reorder columns, filter and sort data, add a formula column, etc. to achieve the desired result:
Important notes
1. Order of key source matters
Let's review 2 sources
Source1: Users data
Source2: Additional user data
As we can see, both of the tables contain "Id" column representing the user id.
But, together with that, amount of rows is different: 10 and 3 respectively
- In case "Users data" is chosen as the Key source, we will receive the next result (10 rows as in the key source):
- In case "Additional user data" is chosen as the Key source, we will receive the next result (3 rows as in the key source):
2. Join by multiple fields
Along with Join by 2 sources, it is possible to join these sources by multiple fields
Source1: Users data | Source2: Additional user data |
First, let's join these 2 sources by the "Id" field. The result will look like this (part of the data from "Additional user data" was cut due to missing references in the key source):
Next, we can join these 2 sources by one more column "Name".
The result of both joins will look like this (part of the data from "Additional user data" was cut due to missing references in the key source):
As a next step, we can clean up (filter out) results a bit and leave only rows that contain data. Also, we can hide some columns to have no duplicates.
And receive beautified results:
3. Join by multiple sources
Goal: We wanted to get a list of user names from the "Users" table, the amount of sold products from the "Transaction data" table, and product categories from the "Product data" table.
Source1: Users data | Source2: Transaction data | Source3: Product data |
Next data join setup will produce the desired result:
After applying data join along with the current transformation capabilities, the goal was achieved: