

BigQuery gives you two pricing models: on-demand (bytes) and slots (capacity). Teams often treat this as an either/or choice, but in practice, you can — and often should — use both. The trick is knowing when each model is cheaper, and how to route workloads accordingly.
Before you can decide, you need numbers. BigQuery makes it possible to estimate both on-demand and slot costs for any query.
On-demand is simple. Run a dry-run and get the total_bytes_processed:
bq query --dry_run 'select ...'

Query successfully validated. Assuming the tables are not modified,
running this query will process upper bound of 2556289 bytes of data.
Multiply that by $6.25 per TiB (docs) and you have your expected on-demand charge.
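If you'd rather script the estimate, the Python client exposes the same dry run. A minimal sketch; the query string is a placeholder for your own SQL:

from google.cloud import bigquery

client = bigquery.Client()

# A dry run validates the query and returns the byte estimate without running it.
job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
job = client.query("select ...", job_config=job_config)  # your SQL here

tib = job.total_bytes_processed / 1024**4
print(f"{job.total_bytes_processed} bytes -> ${tib * 6.25:.6f} on-demand")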
Slots (aka capacity-based pricing) take a bit more work. Every query records total_slot_ms in the INFORMATION_SCHEMA.JOBS* views. Divide slot-milliseconds by 3,600,000 to convert to slot-hours, then multiply by the slot-hour price for your edition.
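In code, the conversion is a one-liner. A minimal sketch, assuming Standard edition pay-as-you-go at $0.04 per slot-hour:

def slot_cost_usd(total_slot_ms: int, price_per_slot_hour: float = 0.040) -> float:
    """Marginal slot cost: slot-ms -> slot-hours -> USD."""
    return total_slot_ms / 3_600_000 * price_per_slot_hour

# A job that consumed 45 slot-minutes (2,700,000 slot-ms) costs about $0.03:
print(slot_cost_usd(2_700_000))  # 0.03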
But slot pricing has some quirks. Billing comes in 50-slot increments when autoscaling or reserving, and each reservation carries a one-minute minimum warm-up charge, even if the job runs for a second (docs). So an autoscaled Enterprise job that needs a single slot-second still bills at least 50 slots × 60 s ≈ 0.83 slot-hours, roughly $0.05. And idle baseline slots? Still billed (Medium). We'll go over examples of when slot-based pricing makes sense further down.
Here is a basic SQL query that scans your INFORMATION_SCHEMA.JOBS and works out, for each job, which pricing model would have been cheaper. As an exercise, try grouping the result by project, query, or hash/fingerprint to quickly find workloads that could benefit from switching models.
DECLARE LOOKBACK_DAYS INT64 DEFAULT 7;          -- how far back to look
DECLARE PRICE_PER_TIB NUMERIC DEFAULT 6.25;      -- on-demand $/TiB
DECLARE STD_SLOT_PRICE NUMERIC DEFAULT 0.040;    -- $/slot-hour for STANDARD
DECLARE FLOOR_SLOTS INT64 DEFAULT 50;           -- autoscaler/reservation increment
DECLARE FLOOR_SECONDS INT64 DEFAULT 60;          -- 1-minute minimum
WITH jobs AS (
  SELECT
    project_id,
    job_id,
    creation_time,
    user_email,
    statement_type,
    cache_hit,
    total_bytes_processed,
    total_slot_ms,
    reservation_id,                -- null means on-demand
    state
  FROM `region-eu`.INFORMATION_SCHEMA.JOBS_BY_PROJECT -- adjust to your region
  WHERE
    creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL LOOKBACK_DAYS DAY)
    AND job_type = 'QUERY'
    AND state = 'DONE'
    AND error_result IS NULL
    AND cache_hit = FALSE  -- cache hits are free anyway
),
costs AS (
  SELECT
    project_id,
    job_id,
    creation_time,
    user_email,
    statement_type,
    reservation_id,
    total_bytes_processed,
    total_slot_ms,
    -- on-demand cost: bytes -> TiB -> $
    SAFE_DIVIDE(total_bytes_processed, POW(1024, 4)) * PRICE_PER_TIB AS on_demand_cost_usd,
    -- slots (marginal): slot-ms -> slot-hours -> $
    SAFE_DIVIDE(total_slot_ms, 3600000) * STD_SLOT_PRICE AS slots_cost_marginal_usd,
    -- slots (with billing floor): max(actual, 50 slots * 60s)
    GREATEST(
      CAST(total_slot_ms AS INT64),
      FLOOR_SLOTS * FLOOR_SECONDS * 1000
    ) AS slots_ms_with_floor,
    SAFE_DIVIDE(
      GREATEST(CAST(total_slot_ms AS INT64), FLOOR_SLOTS * FLOOR_SECONDS * 1000),
      3600000
    ) * STD_SLOT_PRICE AS slots_cost_with_floor_usd
  FROM jobs
)
SELECT
  project_id,
  job_id,
  creation_time,
  user_email,
  statement_type,
  reservation_id,
  total_bytes_processed,
  total_slot_ms,
  on_demand_cost_usd,
  slots_cost_marginal_usd,
  slots_cost_with_floor_usd,
  -- which would be cheaper ignoring floors
  IF(on_demand_cost_usd < slots_cost_marginal_usd, 'ON_DEMAND', 'SLOTS_STD') AS cheaper_marginal,
  -- which would be cheaper including floors (often more realistic for spiky loads)
  IF(on_demand_cost_usd < slots_cost_with_floor_usd, 'ON_DEMAND', 'SLOTS_STD') AS cheaper_with_floor
FROM costs
ORDER BY creation_time DESC
Once you know the economics, you can route workloads either to on-demand or to slots. At the coarsest level, you assign whole projects: projects tied to a reservation run on slots; projects without one run on-demand.
But perhaps more interestingly, you can also set the reservation_id per job when you submit it (docs). Setting it to “none” forces on-demand billing.
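Here is a sketch of what per-job routing can look like with the Python client, assuming the job-level reservation field from the assignment docs. The reservation path and the cost threshold are stand-ins, and the field is set on the raw job configuration because first-class support varies by client version:

from google.cloud import bigquery

RESERVATION = "projects/my-admin-project/locations/EU/reservations/etl"  # yours here

def run_routed(client: bigquery.Client, sql: str, est_slot_cost_usd: float):
    # Price the query on-demand first via a dry run.
    dry = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    bytes_scanned = client.query(sql, job_config=dry).total_bytes_processed
    on_demand_usd = bytes_scanned / 1024**4 * 6.25

    cfg = bigquery.QueryJobConfig()
    # "none" forces on-demand billing; a reservation path pins the job to slots.
    # Set on the underlying JobConfiguration; newer client versions may expose
    # a first-class property for this field.
    cfg._properties["reservation"] = (
        "none" if on_demand_usd < est_slot_cost_usd else RESERVATION
    )
    return client.query(sql, job_config=cfg).result()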
Edition choice also matters. The Standard Edition is cost-efficient for steady ETL operations. The Enterprise Edition adds slot commitments and governance features at a higher price, useful for variable or sensitive workloads. Enterprise Plus is the compliance-heavy tier. (Read the docs for more details.)
So, you could send jobs that need Enterprise features to an Enterprise reservation, most other jobs to a Standard Edition reservation, and the odd ad-hoc query to on-demand.
The main point: you can deliberately decide which queries go where. It’s a powerful, albeit manual, tool in your toolbox.
Take the first archetype: a tiny, frequent query, say one that scans 10 GiB and uses about 117 slot-seconds (0.0325 slot-hours) per run. Cost/run per model:
On-Demand → (10 / 1024) * 6.25 ≈ $0.061
Standard Slots → 0.0325 * 0.040 ≈ $0.0013
Enterprise Slots → 0.0325 * 0.060 ≈ $0.0019
Slots look far cheaper, until you hit the one-minute warmup penalty. A cold start of 200 slots for one minute adds about $0.20 on Enterprise (200 slots × 60 s ≈ 3.33 slot-hours × $0.06). For tiny queries, warmup can outweigh the per-run slot savings, so you need to verify your assumptions case by case.
Now the second archetype: a heavy, repeatable job, say a transformation that scans 4 TiB and consumes 12.5 slot-hours per run. Cost/run per model:
On-Demand → 4 * 6.25 = $25.00
Standard Slots → 12.5 * 0.040 = $0.50
Enterprise Slots → 12.5 * 0.060 = $0.75
For heavy, repeatable jobs, slots usually crush on-demand.
The third archetype is the inverse: a query that barely scans data but burns a lot of slots. Think wide joins on small tables, complex UDFs, or gnarly aggregations. Here, on-demand pricing shines because billing is tied only to bytes, not to how much compute you use. Say the query scans just 5 GiB but consumes 5 slot-hours. Cost/run per model:
On-Demand → (5 / 1024) * 6.25 ≈ $0.030
Standard Slots → 5 * 0.040 = $0.20
Enterprise Slots → 5 * 0.060 = $0.30
Why on-demand wins: billing follows the 5 GiB scanned, so the 5 slot-hours of compute cost nothing extra; on slots, that compute is exactly what you pay for.
The catch: you only realize these savings if you also downsize your reservation. If you keep the same slot pool, those idle slots are still billed whether this job uses them or not. This is true even when using the slots autoscaler — a problem my colleague Marcello will discuss in more detail in another article.
One more fact to consider: Google enforces a compute-to-bytes ratio on on-demand jobs. If a query burns too much CPU relative to the bytes it scans, it fails with a resource-limit error and must run on slots instead. (That said, more often than not that means you did something stupid.) There's also a ceiling of roughly 2,000 concurrent slots per project for on-demand queries, so very large or highly parallel workloads can hit that wall, and then you're back to slot reservations.
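If you want to sanity-check the archetype arithmetic yourself, it fits in a few lines. A sketch using list prices, ignoring floors, warmup, and discounts:

ON_DEMAND_PER_TIB = 6.25
SLOT_HOUR_PRICE = {"standard": 0.040, "enterprise": 0.060}

def cost_per_run(gib_scanned: float, slot_hours: float) -> dict:
    """Marginal per-run cost under each model; no floors, warmup, or discounts."""
    costs = {"on_demand": gib_scanned / 1024 * ON_DEMAND_PER_TIB}
    for edition, price in SLOT_HOUR_PRICE.items():
        costs[edition] = slot_hours * price
    return costs

print(cost_per_run(10, 0.0325))      # tiny query:    ~$0.061 vs ~$0.0013 / ~$0.0019
print(cost_per_run(4 * 1024, 12.5))  # heavy job:      $25.00 vs  $0.50   /  $0.75
print(cost_per_run(5, 5))            # compute-heavy: ~$0.030 vs  $0.20   /  $0.30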
Looking at the three archetypes makes the trade-offs clear. It's absolutely worth trying this yourself; many teams shave real dollars off their bill with a bit of manual routing. But now imagine repeating that analysis for hundreds or thousands of queries across dozens of teams, each with different schedules and data volumes. It's not uncommon to see dbt or Dataform projects with 5,000+ models. Hand-picking the cheapest pricing model per job in that environment quickly becomes impossible.
This kind of hands-on FinOps breaks down fast. Workloads shift over time, slot utilization drifts, new pipelines appear. Every change means recalculating dry-run bytes, checking slot-ms, and re-writing job configs. And all of this must be done with the quirks of slot increments and warmup charges in mind. Fun for a handful of queries; unsustainable at scale.
The good news: the same core principles do scale if you wrap them in automation. You can codify the dry-run vs slot-ms comparison, apply routing rules consistently, and keep a ledger of which jobs ran cheaper where.
We’ve worked with customers who built homegrown routers and later decided to hand off the problem. They were happy to let a system manage it automatically, and they saw better results once every query was fingerprinted, compared, and routed consistently.
That’s where Alvin helps. When manual isn’t enough, Alvin automates the mix, accounts for the quirks, and keeps a transparent savings ledger. It’s the same principle you’d apply yourself, just at scale, with the depth that comes from observing countless usage patterns across different organizations.
Bottom line: Don’t choose between pricing models. Mix them, measure them, and if the mix gets messy, automate it.