Data Join
Coupler.io allows you to fetch data from several sources at the same time, and combine it in the destination into a unified dataset - all with a single importer.
Coupler.io has three data-combining modes:
- Join
- Append
- Aggregation
Data JOIN is a term Coupler.io uses to describe the process of combining two or more sources using LEFT JOIN (from SQL terminology) - meaning, all the elements from the left set are kept while only matching elements from the right set are included
To learn more about APPEND data combining mode, please visit this article: Data Append
To learn more about AGGREGATION data combining mode, please visit this article: Data Aggregation
Feature description
Let's take this example to understand this feature further. We have these source tables:
Contacts:
Department:
Goal: We want to get a list of "Contacts" under the "Partner sales" department. Currently, the "Contacts" table only has the IDs.
How to join data from multiple sources?
1. After adding the needed sources, go to the "Transformations" step and either select Join data from the data preview window, or hit + Add transformation >> Create Join option from the left sidebar:
2. Select the sources to join and the columns that will be used as the key:
3. Click the "Join data" button and we will get the following results:
4. After joining the data, you can apply other transformations such as: hide and reorder columns, filter and sort data, add a formula column, etc. to achieve the desired result:
Important notes
1. Order of key source matters
Let's review 2 sources
Source1: Users data
Source2: Additional user data
As we can see, both of the tables contain "Id" column representing the user id.
But, together with that, amount of rows is different: 10 and 3 respectively
- In case "Users data" is chosen as the Key source, we will receive the next result (10 rows as in the key source):
- In case "Additional user data" is chosen as the Key source, we will receive the next result (3 rows as in the key source):
2. Join by multiple fields
Along with Join by 2 sources, it is possible to join these sources by multiple fields
Source1: Users data |
Source2: Additional user data |
First, let's join these 2 sources by the "Id" field. The result will look like this (part of the data from "Additional user data" was cut due to missing references in the key source):
Next, we can join these 2 sources by one more column "Name".
The result of both joins will look like this (part of the data from "Additional user data" was cut due to missing references in the key source):
As a next step, we can clean up (filter out) results a bit and leave only rows that contain data. Also, we can hide some columns to have no duplicates.
And receive beautified results:
3. Join by multiple sources
Goal: We wanted to get a list of user names from the "Users" table, the amount of sold products from the "Transaction data" table, and product categories from the "Product data" table.
Source1: Users data |
Source2: Transaction data |
Source3: Product data |
Next data join setup will produce the desired result:
After applying data join along with the current transformation capabilities, the goal was achieved: