Member-only story
Bigquery SQL optimization
7 min readJan 4, 2021
BigQuery is a fully-managed, serverless data warehouse that enables scalable analysis over petabytes of data. Despite its scalable nature, there exist tips and tricks in writing Bigquery SQLto further improve the query performance
1. Use standard SQL instead of legacy SQL
2. Best practices for query structure
2.1. select necessary columns
Original code
select
*
from
dataset.table
Optimized
select
dim1,
metric1
from
dataset.table
Reason
- Only select the columns necessary, especially in inner queries. SELECT * is cost-inefficient and may also hurt performance
- If the number of columns to return is large, consider using SELECT * EXCEPT to exclude unneeded columns
2.2 ORDER BY with LIMIT
Original code
select
t.dim1,
t.dim2,
t.metric1
from
dataset.table
order by metric1 desc
- Optimized
select
t.dim1,
t.dim2,
t.metric1
from
dataset.table
order by metric1 desc
limit 1000
Reason
- Writing results for a query with an ORDER BY clause can result in Resources Exceeded errors. Because the final sorting must be done on a single slot, if you are attempting to order a…