What's the Concept?
GCP's pricing for the data stack is generally fair, but a few configurations turn cheap into ruinous in a hurry:
- A BigQuery query that scans 1 TB instead of 1 GB because clustering wasn't honored.
- An embedding refresh that re-embeds 10 million unchanged rows nightly because the incremental logic was wrong.
- A streaming Dataflow job that auto-scaled to 50 workers because the input was malformed.
- A Composer environment idling 24/7 for a 2-hour-per-day workload.
Each of these is a 100× to 1000× multiplier over the optimal cost. Almost all of them are preventable with a small number of policies.
How It Works
The five highest-leverage cost controls, in priority order:
1. maximum_bytes_billed on every interactive query. Every BigQuery query the agent's tools issue should set an explicit cost ceiling. A misconfigured cluster or a missing filter then errors out instead of silently scanning a terabyte:
```python
from google.cloud import bigquery

job_config = bigquery.QueryJobConfig(
    query_parameters=[...],  # the tool's parameterized filters (elided)
    maximum_bytes_billed=10_000_000,  # 10 MB cap on this tool's queries
)
```
This is the single most impactful control. It costs nothing to set and prevents the most common cost incident.
2. Partition and cluster every silver/gold table. Already covered in Module 04, but worth repeating. A 1 TB table with no partition costs 1 TB per query; the same table partitioned by date and clustered by entity ID costs ~5 GB per typical query. The 200× saving compounds with usage.
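The 200× claim above is simple arithmetic. A minimal sketch, assuming the current on-demand rate of roughly $6.25/TiB (check the pricing page; the rate and the 5 GiB pruned-scan figure are the text's working numbers, not measurements):

```python
# Back-of-envelope: per-query cost of an unpartitioned full scan vs a
# partition- and cluster-pruned scan, at an assumed rate of ~$6.25/TiB.
PRICE_PER_TIB_USD = 6.25

def scan_cost_usd(gib_scanned: float) -> float:
    """On-demand cost of a single query scanning `gib_scanned` GiB."""
    return gib_scanned / 1024 * PRICE_PER_TIB_USD

full_scan = scan_cost_usd(1024)  # no partition filter: 1 TiB  -> $6.25
pruned = scan_cost_usd(5)        # date partition + cluster: ~5 GiB -> ~$0.03
ratio = full_scan / pruned       # ~205x cheaper per query
```

Because the saving is per query, it multiplies by query volume: the same pruning on a table queried 1,000 times a day is the difference between $6,250/day and ~$30/day.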
3. Project-level + dataset-level budget alerts. Cloud Billing's budget alerts fire at 50% / 75% / 90% / 100% of a configured monthly cap. The 50% alert catches problems while you can still react.
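The threshold semantics are worth internalizing. A minimal sketch of the alert logic only — the real alerts are configured in Cloud Billing, not in application code:

```python
# Sketch of budget-alert semantics: which thresholds has spend crossed?
THRESHOLDS = (0.50, 0.75, 0.90, 1.00)

def alerts_fired(month_to_date_spend: float, monthly_cap: float) -> list[float]:
    """Thresholds the current spend has crossed, in firing order."""
    return [t for t in THRESHOLDS if month_to_date_spend >= t * monthly_cap]

alerts_fired(520, 1_000)    # the 50% alert has fired: investigate now
alerts_fired(1_050, 1_000)  # all four have fired: the cap is blown
```

Note that budget alerts notify; they do not stop spending. Hard stops come from `maximum_bytes_billed` and quotas, which is why the controls are layered.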
4. Reservations for predictable workloads. Once BigQuery on-demand spend passes ~$2,000/month for stable workloads, capacity-based pricing via BigQuery Editions (Standard / Enterprise) usually costs less. The break-even depends on usage patterns; check the Recommender every quarter.
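The break-even can be estimated directly. A rough sketch, assuming ~$6.25/TiB on-demand and ~$0.04/slot-hour for the Standard edition (both are assumptions to verify against the current pricing page):

```python
# Rough break-even between on-demand and capacity-based Editions pricing.
ON_DEMAND_PER_TIB_USD = 6.25   # assumed on-demand rate
SLOT_HOUR_USD = 0.04           # assumed Standard-edition slot-hour rate
HOURS_PER_MONTH = 730

def reservation_monthly_usd(slots: int) -> float:
    """Monthly cost of an always-on baseline reservation."""
    return slots * SLOT_HOUR_USD * HOURS_PER_MONTH

def breakeven_tib_per_month(slots: int) -> float:
    """On-demand scan volume at which the reservation starts winning."""
    return reservation_monthly_usd(slots) / ON_DEMAND_PER_TIB_USD

reservation_monthly_usd(100)   # ~$2,920/month for a 100-slot baseline
breakeven_tib_per_month(100)   # ~467 TiB scanned/month
```

In other words: below a few hundred TiB of monthly scans, stay on-demand with caps; above it, price out a reservation.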
5. Cache embedding outputs. Embedding API calls are the second-largest line item. If a row's text hasn't changed, the embedding hasn't either. Compute a content hash on input; only re-call the API when the hash differs. Most production pipelines re-embed under 5% of rows per day after this is done — versus 100% if you forget.
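The same idea in application code: a hash-keyed cache in front of the embedding call. A minimal sketch — `embed_fn` is a hypothetical stand-in for whatever embedding API the pipeline uses, and the in-memory dict stands in for a persisted cache table:

```python
import hashlib

def content_hash(subject: str, body: str) -> str:
    """Mirror of the SQL SHA256(subject || '\n\n' || body) hash."""
    return hashlib.sha256(f"{subject}\n\n{body}".encode()).hexdigest()

class EmbeddingCache:
    """Calls the embedding API only when a row's content hash changes."""

    def __init__(self, embed_fn):
        self.embed_fn = embed_fn
        self.store = {}      # row_id -> (content_hash, embedding)
        self.api_calls = 0   # track spend-driving calls

    def get(self, row_id, subject: str, body: str):
        h = content_hash(subject, body)
        cached = self.store.get(row_id)
        if cached and cached[0] == h:
            return cached[1]  # unchanged content: no API call, no cost
        self.api_calls += 1
        emb = self.embed_fn(f"{subject}\n\n{body}")
        self.store[row_id] = (h, emb)
        return emb
```

Re-running the refresh over unchanged rows then costs zero API calls, which is exactly the property the MERGE-based SQL version needs as well.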
```sql
-- Embedding refresh, with caching: rows whose content hash is unchanged
-- are filtered out BEFORE ML.GENERATE_EMBEDDING, so the embedding model
-- is invoked only for tickets whose text actually changed.
MERGE `myco.gold.support_tickets` AS target
USING (
  WITH changed AS (
    SELECT
      t.ticket_id,
      t.subject,
      t.body,
      t.updated_at,
      SHA256(t.subject || '\n\n' || t.body) AS content_hash,
      t.subject || '\n\n' || t.body AS content
    FROM `myco.silver.tickets` t
    LEFT JOIN `myco.gold.support_tickets` g USING (ticket_id)
    WHERE t.updated_at >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
      -- only already-present tickets whose content actually changed
      AND g.content_hash != SHA256(t.subject || '\n\n' || t.body)
  )
  SELECT
    ticket_id, subject, body, updated_at, content_hash,
    ml_generate_embedding_result AS embedding
  FROM ML.GENERATE_EMBEDDING(
    MODEL `myco.embedding_models.text_embedding_005`,
    (SELECT * FROM changed)
  )
) AS source
ON target.ticket_id = source.ticket_id
WHEN MATCHED THEN
  UPDATE SET
    content_hash = source.content_hash,
    embedding = source.embedding,
    body = source.body,
    subject = source.subject,
    updated_at = source.updated_at;
```
Why It Matters
- Costs scale super-linearly with usage if left alone. A 10× growth in users can become a 1000× growth in bill if every retrieval triggers a re-embedding pass.
- Cost incidents surface slowly. A misconfigured query that runs hourly takes a few days to become noticeable in billing; the 50% budget alert is your safety net.
- Cheap cost control is the boring kind. Partitioning, parameter ceilings, content hashing — none are clever. All are reliable.
Key Technical Details
- BigQuery on-demand queries: ~$6.25 per TiB scanned. Storage: $0.020/GiB-month for active, $0.010/GiB-month for long-term (auto-applied after 90 days untouched).
- Vertex AI text-embedding-005: roughly $0.00005 to embed a ticket. A full pass over 10M tickets is ~$500; the 1–5% daily churn left after content-hash caching costs ~$5–$25/day.
- Composer environment: starts at ~$300/month for the smallest configuration. For pipelines doing <10 hours of work per day, Workflows + Cloud Run + Cloud Scheduler is usually cheaper than Composer.
- Dataflow streaming: ~$140/month for the smallest always-on worker, before storage.
- Use BigQuery reservations (Editions) with workload management to cap concurrency. A 100-slot baseline reservation costs roughly $2,000–$3,000/month depending on edition and gives predictable performance for that budget.
Common Misconceptions
"BigQuery is cheap." Per byte scanned, it is. Per careless query, it's expensive in a unique way: a query that takes 30 seconds but scans 5 TB costs ~$30. You pay per byte scanned, not per second of runtime.
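The billing model in one function — a sketch, with the ~$6.25/TiB on-demand rate assumed (verify against the pricing page):

```python
# On-demand BigQuery billing depends on bytes scanned, never on runtime.
PRICE_PER_TIB_USD = 6.25  # assumed on-demand rate

def query_cost_usd(tib_scanned: float, runtime_seconds: float) -> float:
    # runtime_seconds is deliberately unused: it never enters the bill
    return tib_scanned * PRICE_PER_TIB_USD

cost = query_cost_usd(5, runtime_seconds=30)  # 30-second, 5 TiB query -> $31.25
```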
"We don't have enough volume to worry." Three of the five cost controls above (caps, partitioning, caching) are also clarity tools. They make the system easier to reason about, regardless of volume.
"Reservations are for large teams." Reservations are right whenever the bill is predictable and >$2,000/month, regardless of team size. The Recommender tab in BigQuery flags this automatically.
Connections to Other Concepts
- 01-observability-and-data-quality-monitoring — Cost dashboards are part of the operational view.
- Course 04-refinement-in-bigquery/04-incremental-and-idempotent-pipelines — Idempotent incrementals are the foundation that makes caching feasible.
- Course 05-serving-data-to-agents/01-structured-retrieval-bigquery-as-a-tool — Where maximum_bytes_billed lives.
Further Reading
- Google Cloud, "BigQuery pricing" + "Vertex AI pricing" pages — Single source of truth, but check often (prices change).
- Lakshmanan & Tigani, "BigQuery: The Definitive Guide" — chapters on cost control and reservations.
- "Reducing BigQuery costs" — Google Cloud blog series with worked examples.