What's the Concept?
When you wire an agent to a tool that queries BigQuery, somewhere a service account is making that query. Which service account? With which permissions? Against which datasets? If those answers aren't deliberate, you've created a security gap — the agent can potentially see, and through its responses leak, data it has no business reading.
The principle: every distinct "consumer role" gets its own service account, scoped to exactly what it needs. The agent's account has read access to the gold tables it needs and nothing else: no silver, no bronze, no other agents' gold, no PII columns it doesn't need.
How It Works
The standard service-account topology for an agent stack:
                 Cloud Run
              agent-runtime-sa
           (read-only, gold-only)
                     │
            ┌────────┴─────────┐
            ▼                  ▼
        BigQuery           Vertex AI
       gold.* only       Embeddings API
    (specific tables)

    pipeline-orchestrator-sa ──────▶ Composer/Cloud Run + BQ silver/gold rw
    ingestion-stripe-sa      ──────▶ GCS bronze write, Secret Manager read
    data-engineer-sa         ──────▶ All datasets read; silver/gold write
                                     (human, via group membership)
    pii-reviewer-sa          ──────▶ Granted time-bounded for PII review

The agent's agent-runtime-sa has only:
- bigquery.tables.getData on the specific gold tables it queries (the tables, not the dataset).
- bigquery.jobs.create on its project.
- aiplatform.endpoints.predict on the Vertex AI embedding model.
- Nothing else.
If the agent is compromised — prompt injection, a leaked tool call, a misbehaving plugin — the blast radius is "the gold tables the agent already could read." It cannot escalate to silver, bronze, or sibling agents' data.
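The grant list above can be treated as data and checked mechanically. The following is a minimal sketch, not a Google Cloud API: the `Grant` record, the resource naming (`dataset.table`), and the sample grants are all hypothetical stand-ins for whatever your IAM export looks like. It computes the agent's blast radius and flags any grant that strays outside the allowlist or outside table-level gold access.

```python
from dataclasses import dataclass

# Hypothetical in-memory model of IAM grants; this is not a Google Cloud client.
@dataclass(frozen=True)
class Grant:
    member: str      # e.g. "serviceAccount:agent-runtime-sa"
    permission: str  # e.g. "bigquery.tables.getData"
    resource: str    # assumed naming: "dataset.table", or a bare dataset name

# The three permissions the lesson allows for the agent's account.
ALLOWED_AGENT_PERMISSIONS = {
    "bigquery.tables.getData",
    "bigquery.jobs.create",
    "aiplatform.endpoints.predict",
}

def is_gold_table(resource):
    """True only for a table inside gold; a bare 'gold' means the whole dataset."""
    dataset, _, table = resource.partition(".")
    return dataset == "gold" and table != ""

def blast_radius(grants, member):
    """Sorted list of resources whose table data `member` can read."""
    return sorted({
        g.resource for g in grants
        if g.member == member and g.permission == "bigquery.tables.getData"
    })

def least_privilege_violations(grants, member):
    """Grants for `member` that are off the allowlist or read beyond gold tables."""
    problems = []
    for g in grants:
        if g.member != member:
            continue
        if g.permission not in ALLOWED_AGENT_PERMISSIONS:
            problems.append(g)
        elif g.permission == "bigquery.tables.getData" and not is_gold_table(g.resource):
            problems.append(g)
    return problems

AGENT = "serviceAccount:agent-runtime-sa"
GRANTS = [
    Grant(AGENT, "bigquery.tables.getData", "gold.billing_agent_context"),
    Grant(AGENT, "bigquery.jobs.create", "project"),
    Grant(AGENT, "aiplatform.endpoints.predict", "vertex-embedding-model"),
    Grant("serviceAccount:pipeline-orchestrator-sa", "bigquery.tables.updateData", "silver.orders"),
]

if __name__ == "__main__":
    print(blast_radius(GRANTS, AGENT))
    print(least_privilege_violations(GRANTS, AGENT))
```

A check like this fits naturally in CI next to the infrastructure code, so a widened grant fails review before it reaches production.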
Within a gold table, column-level security via Data Catalog policy tags can hide individual columns even from accounts with table read access:
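    gold.billing_agent_context columns:
      customer_id     [public]
      email           [pii:contact]    ← policy tag
      plan_name       [public]
      spend_last_90d  [public]
      payment_method  [pii:financial]  ← policy tag

The agent's service account is granted the Fine-Grained Reader role on the pii:contact tag (so it can quote emails) but not on pii:financial (so it can never see payment methods). Note that SELECT * does not silently drop protected columns: with plain column-level access control and no data masking rule, a query that references a column the caller cannot read fails with an access-denied error. The agent's query templates must therefore name their columns explicitly or use SELECT * EXCEPT (payment_method).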
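One way to keep the agent's queries aligned with its tag grants is to derive the column list from the tag map instead of writing SELECT *. The sketch below is illustrative only: the `COLUMN_TAGS` dictionary mirrors the example table above, while the helper names and the idea of keeping the tag map in application code are assumptions, not part of any BigQuery client. Explicit column lists are the safer habit however protected columns are handled.

```python
# Column -> policy tag for the example table; None means untagged (public).
COLUMN_TAGS = {
    "customer_id": None,
    "email": "pii:contact",
    "plan_name": None,
    "spend_last_90d": None,
    "payment_method": "pii:financial",
}

def readable_columns(column_tags, granted_tags):
    """Columns that are untagged or whose tag the caller has been granted."""
    return [
        column for column, tag in column_tags.items()
        if tag is None or tag in granted_tags
    ]

def build_select(table, column_tags, granted_tags):
    """Build a SELECT that names only the columns the caller can read."""
    columns = readable_columns(column_tags, granted_tags)
    if not columns:
        raise ValueError(f"no readable columns in {table}")
    return f"SELECT {', '.join(columns)} FROM `{table}`"

if __name__ == "__main__":
    # The agent holds pii:contact but not pii:financial.
    print(build_select("gold.billing_agent_context", COLUMN_TAGS, {"pii:contact"}))
```

The tag map here is hand-maintained for illustration; in practice it would need to be generated from the table's schema so it cannot drift from the real policy tags.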
Why It Matters
- Prompt injection becomes a contained risk. A malicious user prompt can manipulate the agent's reasoning, but it can't widen the IAM grants. The data the agent can access is bounded by IAM, not by prompt rules.
- Tenant isolation in multi-tenant SaaS. A query template that forgets to add WHERE tenant_id = ? is a critical bug. Row-level security policies backstop that mistake by blocking cross-tenant reads at the storage layer.
- Audit logs become useful. Cloud Audit Logs record every BigQuery query with the calling service account. "Which queries touched table X in the last hour?" is a single log filter.
- Rotation is mechanical. Service-account keys (when used) rotate on schedule. Workload Identity (preferred) means no keys at all — credentials come from the runtime environment.
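The "single log filter" mentioned above might look like the output of this sketch. It only assembles a filter string; it does not call Cloud Logging. The field names (`resource.type="bigquery_dataset"`, `protoPayload.metadata.tableDataRead`, `protoPayload.resourceName`) reflect my understanding of the BigQueryAuditMetadata log format and should be verified against your own log entries. The function name and the project, dataset, and table values are hypothetical.

```python
from datetime import datetime, timedelta, timezone

def table_read_filter(project, dataset, table, since, principal=None):
    """Assemble a Cloud Logging filter for reads of one BigQuery table.

    Field names are assumed from the BigQueryAuditMetadata format; check them
    against real entries before relying on this.
    """
    if since.tzinfo is None:
        raise ValueError("since must be timezone-aware")
    since_utc = since.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    lines = [
        'resource.type="bigquery_dataset"',
        f'protoPayload.resourceName="projects/{project}/datasets/{dataset}/tables/{table}"',
        "protoPayload.metadata.tableDataRead:*",
        f'timestamp>="{since_utc}"',
    ]
    if principal is not None:
        # Narrow to a single caller, e.g. the agent's service account email.
        lines.append(f'protoPayload.authenticationInfo.principalEmail="{principal}"')
    # Cloud Logging treats newline-separated comparisons as an implicit AND.
    return "\n".join(lines)

if __name__ == "__main__":
    one_hour_ago = datetime.now(timezone.utc) - timedelta(hours=1)
    print(table_read_filter("my-project", "gold", "billing_agent_context", one_hour_ago))
```

The resulting text can be pasted into the Logs Explorer query box or passed to whichever logging client you use.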
Key Technical Details
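- Prefer keyless authentication over long-lived service-account keys whenever possible. On Cloud Run, attach the service account as the service's identity so the agent authenticates as the SA without a key file ever existing; use Workload Identity Federation for workloads that run outside Google Cloud.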
- BigQuery row-level security policies filter rows transparently. Define them at the table level:
    CREATE ROW ACCESS POLICY tenant_isolation ON gold.foo GRANT TO ('group:tenant-a@myco.com') FILTER USING (tenant_id = 'a');
- For high-regulation workloads, VPC Service Controls add a network perimeter: even with valid credentials, BigQuery queries from outside the perimeter are blocked. This carries significant operational overhead and is only worth it where compliance demands.
- Enable Data Access audit logs once per service. They are off by default for most services (BigQuery's are always on), and they're necessary for SOC 2 and most security reviews.
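With more than a handful of tenants, row access policies are easier to generate than to hand-write. This sketch emits one statement per tenant in the same shape as the gold.foo example above. The per-tenant policy naming scheme, the helper name, and the tenant-to-group mapping are assumptions made for illustration; the validation is deliberately strict because the values are interpolated into SQL text.

```python
import re

# Accept only simple identifiers and plain email addresses, since both are
# interpolated directly into the DDL string.
_TENANT_ID = re.compile(r"^[A-Za-z0-9_]+$")
_EMAIL = re.compile(r"^[A-Za-z0-9._-]+@[A-Za-z0-9.-]+$")

def row_access_policy_ddl(table, tenant_id, group_email):
    """Return a CREATE ROW ACCESS POLICY statement isolating one tenant."""
    if not _TENANT_ID.match(tenant_id):
        raise ValueError(f"unsafe tenant id: {tenant_id!r}")
    if not _EMAIL.match(group_email):
        raise ValueError(f"unsafe group email: {group_email!r}")
    # Hypothetical naming scheme: one policy per tenant on each table.
    policy = f"tenant_{tenant_id}_isolation"
    return (
        f"CREATE ROW ACCESS POLICY {policy} ON {table} "
        f"GRANT TO ('group:{group_email}') "
        f"FILTER USING (tenant_id = '{tenant_id}')"
    )

if __name__ == "__main__":
    # Hypothetical tenant -> group mapping.
    tenants = {"a": "tenant-a@myco.com", "b": "tenant-b@myco.com"}
    for tenant_id, group in tenants.items():
        print(row_access_policy_ddl("gold.foo", tenant_id, group) + ";")
```

Generating the statements from one mapping keeps the policies and the tenant list from drifting apart as tenants are added.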
Common Misconceptions
"IAM at the project level is fine." It's the most common bug. Project-level grants give too much access for the convenience saved. Spend the extra 10 minutes to scope at dataset or table level.
"The agent runs as a user, so it has the user's permissions." That's one pattern (impersonation), and it's appropriate sometimes — a customer-support agent might act as the support rep. But the default should be a dedicated service account; impersonation needs explicit thought.
"PII filtering happens at retrieval." It can, but defense in depth means PII is also masked at the storage layer (policy tags), and audited at the log layer. Three lines of defense, not one.
Connections to Other Concepts
- Course 03-the-raw-data-lake/04-data-governance-from-day-one: Labels and ownership feed into IAM grants.
- 04-handling-pii-and-redaction-pipelines: Active redaction on top of static IAM.
- Course 05-serving-data-to-agents/04-the-retrieval-contract-between-pipeline-and-agent: The contract enforces what the agent can ask for; IAM enforces what it can actually receive.
Further Reading
- Google Cloud, "IAM overview" + "BigQuery access control" docs.
- "BigQuery column-level security" docs — Policy tag walkthrough.
- "VPC Service Controls overview" — For when network perimeter matters.
- NIST SP 800-207 "Zero Trust Architecture" — The conceptual framework; the patterns above are the GCP implementation.