How to make a data backup in BigQuery?

Use Coupler.io to back up your data by fetching it from any of the supported sources and saving it to BigQuery set up as the destination.

BigQuery is a cloud data warehouse that allows you to manage large volumes of data with ease. You can use it to back up any of your data and access it later if, at some point, you need to restore your information.

Coupler.io doesn't allow you to back up and restore all of the data in your source application. However, if you have lost some data and are ready to restore it manually, Coupler.io can give you access to a source of truth so you can understand which part of your information is missing.

We believe that BigQuery is one of the most suitable destinations for data backup. Therefore, this article provides a detailed guide on how to use this specific destination for backups.

Table of contents

  1. How to back up source data into BigQuery
  2. How to access your backup data

1. How to back up source data into BigQuery

For this purpose, set up a regular BigQuery importer and complete the following steps:

  1. Fill out the Title field.
  2. Select and set up your source.
  3. Set up the BigQuery destination. Tip: use your source name as a dataset and entity name as a table (e.g. HubSpot_backup: Deals).
  4. In the next step, select the "Append" mode and enable the "Last updated column" option.
  5. Set up the desired backup schedule.

After this, Coupler.io will start backing up your data at the frequency specified in this step.
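
To confirm that backups are accumulating, you can run a quick check directly in the BigQuery console. The sketch below is only an illustration: it assumes the dataset and table names from the tip above (HubSpot_backup and Deals), uses YOUR_PROJECT_NAME as a placeholder for your project, and relies on Row_Updated_At, the column created by the "Last updated column" option:

    -- Show when the latest backup ran and how many rows the table holds in total
    SELECT
      MAX(Row_Updated_At) AS Last_backup,
      COUNT(*) AS Total_rows
    FROM `YOUR_PROJECT_NAME.HubSpot_backup.Deals`

Since every scheduled run appends a fresh copy of the source data, the total row count should grow after each backup.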

2. How to access your backup data

If you need to access your backup data to check how it looked on the desired date, or to compare it with the current data and fix possible issues, you have two options:

  1. Open the desired table in the BigQuery interface and look for the required information there.
  2. Import the required data from BigQuery to a Google spreadsheet or Microsoft Excel file.

We will describe the second option, as it is the most common one.

First of all, you need to understand what backups you have and for what dates. In order to do this, set up a BigQuery importer with the following parameters:

  • Source: BigQuery
  • Connection: the same key that you used to back up your data
  • SQL:
    SELECT
      Row_Updated_At AS Backup_date,
      COUNT(*) AS Records_count
    FROM `YOUR_PROJECT_NAME.HubSpot_backup.Deals`
    GROUP BY Row_Updated_At


  • Proceed to the destination and set it up.

As a result, you will receive a list of available backups with the date and time when each one was created and the corresponding number of records in it.
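
If many backups have already accumulated, a small variation of the same query (with the same assumed project, dataset, and table names) lists the most recent ones first:

    -- List the ten most recent backups, newest first
    SELECT
      Row_Updated_At AS Backup_date,
      COUNT(*) AS Records_count
    FROM `YOUR_PROJECT_NAME.HubSpot_backup.Deals`
    GROUP BY Row_Updated_At
    ORDER BY Backup_date DESC
    LIMIT 10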

After that, you can pull the backup you need. Let's say we need the backup for July 5th. In this case, we set up a BigQuery importer with the following parameters:

  • Source: BigQuery
  • Connection: the same key that you used to back up your data
  • SQL:
    SELECT *
    FROM `anything-importer-staging.HubSpot_backup.Deals`
    WHERE Row_Updated_At >= 'DATE AND TIME OF THE BACKUP'
      AND Row_Updated_At <= 'DATE AND TIME OF THE BACKUP + 1 sec'

    Note that BigQuery stores date and time values with millisecond precision, and we do not know the exact backup time. That is why we filter within a 1-second interval around the backup date returned by the previous query.

  • Proceed to the destination and set it up.

As a result, you will have your data imported into a spreadsheet/workbook/table.
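
If you simply need the most recent backup rather than one from a specific date, you can skip the manual timestamp lookup. The sketch below is an illustration using the same project, dataset, and table names as above; as the grouping in the first query suggests, rows appended in one backup run share the same Row_Updated_At value, so filtering on its maximum returns the latest backup only:

    -- Pull only the rows from the most recent backup run
    SELECT *
    FROM `anything-importer-staging.HubSpot_backup.Deals`
    WHERE Row_Updated_At = (
      SELECT MAX(Row_Updated_At)
      FROM `anything-importer-staging.HubSpot_backup.Deals`
    )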
