DoiT Wins Google Cloud Global Sales Partner of the Year Award – Learn more

How to Copy Data Inside BigQuery

1 smme5va lamemrx1uerh9w

Copying data between your own tables in BigQuery has become much easier in recent years. But there are still many ways to do it, each with different limitations that can get confusing. I’ll describe each, with pluses and minuses.

1 smme5va lamemrx1uerh9w

Copying across data regions used to be complicated.

Copy through Storage

You could copy from BigQuery to Cloud Storage, then back again to BigQuery in the other region. This not only has that additional complicating step, but it can get expensive too. You also had to orchestrate it yourself, for example, with Composer for scheduling, buffering, and parallelization.

Dataset Copying

Recently, the Dataset Copying feature has made this much easier. The feature works across regions and is free, except for network costs. Scheduling and related features are built-in.

Intra-region copying with bq, the Job API, or Copy Tables

Copying inside a region has always been easier, and best of all, free. Within the region, you can run bq cp from the command line, code against the Job API, or “Copy Tables” in the Cloud Console lets you do this as well.

Scheduled Queries

The Scheduled Queries feature is another way to copy inside a region. Though not free, it is quite flexible, allowing you to write any SELECT statement for insertion to a target table. And as the name suggests, it has scheduling built-in.

Of the approaches I mentioned, only bq cp and Scheduled Queries support one common use case, daily incremental backup of an ingestion-time partitioned table. Only these can copy just the last day’s partition and preserve this as a partition in the target table.

Here are your choices in table form:

You need to populate your Google Spreadsheet with data.

Approach Across Regions? Preserve ingestion-time partitions with a daily copy? Scheduling built-in? Free?
Copy Job API (various languages) No No No Yes
Dataset Copying Yes No Yes Yes (except network costs)
Scheduled Queries No Yes Yes No
Copy Table (Console) No No No Yes
bq cp No Yes No Yes
BQ to Storage to BQ Yes No No No

 

Subscribe to updates, news and more.