Member-only story

Bigquery SQL optimization

Weekly Dev Blog
7 min readJan 4, 2021

--

BigQuery is a fully-managed, serverless data warehouse that enables scalable analysis over petabytes of data. Despite its scalable nature, there exist tips and tricks in writing Bigquery SQLto further improve the query performance

1. Use standard SQL instead of legacy SQL

2. Best practices for query structure

2.1. select necessary columns

Original code

select
*
from
dataset.table

Optimized

select
dim1,
metric1
from
dataset.table

Reason

  • Only select the columns necessary, especially in inner queries. SELECT * is cost-inefficient and may also hurt performance
  • If the number of columns to return is large, consider using SELECT * EXCEPT to exclude unneeded columns

2.2 ORDER BY with LIMIT

Original code

select
t.dim1,
t.dim2,
t.metric1
from
dataset.table
order by metric1 desc
  • Optimized
select
t.dim1,
t.dim2,
t.metric1
from
dataset.table
order by metric1 desc
limit 1000

Reason

  • Writing results for a query with an ORDER BY clause can result in Resources Exceeded errors. Because the final sorting must be done on a single slot, if you are attempting to order a…

--

--

Responses (3)