What's the Concept?
When freshness has to be measured in seconds, not hours, batch polling breaks down. The producer (a webhook, a microservice, an SDK in your mobile app) needs to push events the moment they happen, and you need something durable to catch them before any pipeline picks them up. That something, on GCP, is Pub/Sub.
Pub/Sub is a managed message bus. Producers publish messages to a topic; consumers create subscriptions to that topic and read at their own pace. Messages are durable for up to seven days by default. Delivery is at-least-once — meaning duplicates are possible — which determines how downstream consumers have to behave.
How It Works
The minimal streaming ingest:
             ┌─────────────────────────┐
webhook ───▶ │  Cloud Run /webhook     │ ── publish ──▶ Pub/Sub topic
             │  validates + signs      │                "raw.orders"
             └─────────────────────────┘                      │
                                                              │ subscribe
                                                              ▼
                                              ┌───────────────────────┐
                                              │  Pub/Sub → BigQuery   │
                                              │  subscription writes  │
                                              │  rows directly into   │
                                              │  a bronze BQ table    │
                                              └───────────────────────┘

Three patterns worth knowing:
1. Pub/Sub → BigQuery subscription. Newest and simplest. You configure a subscription with a destination BigQuery table, and Pub/Sub writes messages there with no code in the middle. Great for "land the raw event, transform later" scenarios.
2. Pub/Sub → Dataflow. When you need real per-event transformation, deduplication windows, or enrichment from another table. Dataflow is the canonical streaming framework on GCP; it reads from Pub/Sub, processes, and writes to BigQuery or GCS.
3. Pub/Sub → Cloud Run. Lightweight push subscriptions deliver each message to an HTTP endpoint. Right for sparse / low-volume events where Dataflow is overkill.
For agent-facing pipelines, pattern #1 is the default for raw bronze landing. Pattern #2 shows up once you start refining in flight.
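Whichever pattern delivers the message, the publisher's job is the same: serialize the event as bytes and attach string attributes. A minimal sketch of that envelope, assuming a hypothetical "raw.orders" event shape (the field names, `event_id` dedup key, and attribute values are all illustrative, not a GCP-mandated schema):

```python
import json
import uuid
from datetime import datetime, timezone

def build_order_message(order: dict) -> tuple[bytes, dict]:
    """Build the (data, attributes) pair a publisher would send.
    Pub/Sub message data is bytes; attributes are string key/value
    pairs usable for subscription filtering."""
    envelope = {
        "event_id": str(uuid.uuid4()),  # stable ID for downstream dedup
        "ingested_at": datetime.now(timezone.utc).isoformat(),
        "payload": order,
    }
    data = json.dumps(envelope).encode("utf-8")
    attributes = {"source": "webhook", "schema_version": "1"}
    return data, attributes

data, attrs = build_order_message({"order_id": "A-1001", "total_cents": 4250})
# With the google-cloud-pubsub client, the actual publish would then be:
#   publisher.publish(topic_path, data, **attrs)
```

Generating the `event_id` at publish time matters: it gives every downstream consumer a stable key to deduplicate on, independent of Pub/Sub's own message IDs.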
Why It Matters
- Decouples producer and consumer. A webhook can fire 10,000 events in a burst. Pub/Sub buffers them; consumers drain at a sustainable rate. You don't need to scale receivers to peak load.
- Replay is built in. Within the retention window, you can rewind a subscription and reprocess. This is the streaming equivalent of "replay from bronze."
- At-least-once + idempotent processing = exactly-once effects. At-least-once delivery means every message reaches the consumer one or more times: duplicates are possible, but the system never silently drops a message. Idempotent processing means running the same operation twice produces the same end state as running it once, so retries are safe. Together, these are the actual production guarantee. You don't get exactly-once delivery; you get the freedom to design consumers that tolerate duplicates.
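The "idempotent consumer" half of that equation can be sketched in a few lines. This is a toy in-memory model, assuming the dedup key travels with the message; in production the store would be a database with a unique constraint or a MERGE into BigQuery:

```python
class IdempotentConsumer:
    """Make at-least-once delivery yield exactly-once effects by
    keying writes on a stable message ID."""

    def __init__(self):
        self.rows = {}  # message_id -> row, standing in for the table

    def handle(self, message_id: str, row: dict) -> None:
        # Upsert keyed on message_id: a redelivered duplicate overwrites
        # with identical data, so the end state is unchanged.
        self.rows[message_id] = row

consumer = IdempotentConsumer()
msg = ("m-001", {"order_id": "A-1001", "total_cents": 4250})
consumer.handle(*msg)
consumer.handle(*msg)  # duplicate redelivery: no double-counting
```

The design choice to notice: idempotency lives in the write path (keyed upsert), not in trying to filter duplicates at the door.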
Key Technical Details
- Default message retention: 7 days. Configurable up to 31 days.
- Maximum message size: 10 MB. For larger payloads, write the payload to GCS and publish only the GCS URI.
- Pub/Sub charges per message + per data volume; rough order of magnitude is $40 per TB published.
- Latency target: end-to-end p50 under 1 second; p99 under 5 seconds. Bursts can spike higher.
- Ordering is opt-in via an ordering key. Without it, message order across the topic is not guaranteed.
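The 10 MB cap is usually handled with a claim-check pattern: inline small payloads, publish only a pointer for anything bigger. A stdlib-only sketch (the GCS write itself is assumed to have already happened; the bucket path and envelope fields are illustrative):

```python
import base64
import json

MAX_INLINE_BYTES = 10 * 1024 * 1024  # Pub/Sub's 10 MB message-size cap

def prepare_message(payload: bytes, gcs_uri: str) -> bytes:
    """Claim-check sketch: inline the payload when it fits under the
    cap, otherwise publish only the GCS URI of the stored object.
    (Envelope overhead is ignored here for simplicity.)"""
    if len(payload) <= MAX_INLINE_BYTES:
        return json.dumps({
            "kind": "inline",
            "body": base64.b64encode(payload).decode("ascii"),
        }).encode("utf-8")
    return json.dumps({"kind": "gcs_pointer", "uri": gcs_uri}).encode("utf-8")

small = prepare_message(b'{"order_id": "A-1001"}', "gs://raw-payloads/a-1001.json")
large = prepare_message(b"x" * (MAX_INLINE_BYTES + 1), "gs://raw-payloads/a-1002.json")
```

Consumers then branch on `kind`: inline messages are decoded directly, pointer messages trigger a GCS fetch.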
Common Misconceptions
"Pub/Sub is a queue." It's a topic-based pub/sub bus, which means multiple subscriptions can read the same stream independently. A queue has one reader per message; Pub/Sub has one reader per (subscription, message) — different model.
"At-least-once means it might lose messages." No — at-least-once means it might duplicate messages. The system never silently drops. The duplicate handling is your responsibility downstream.
"Stream means real-time." End-to-end p50 of about 1 second is "real-time" for almost any agent use case. If your model takes 2s to respond anyway, sub-second pipeline freshness is plenty.
Connections to Other Concepts
- 03-change-data-capture-from-databases — CDC events flow through Pub/Sub in most production setups.
- 06-pipeline-orchestration/02-dataflow-for-heavy-transforms — The streaming-transform tier.
- 07-operating-the-system/01-observability-and-data-quality-monitoring — How you monitor a stream you can't see.
Further Reading
- Tyler Akidau, Slava Chernyak, Reuven Lax, Streaming Systems (O'Reilly, 2018) — Written by the team that built Pub/Sub and Dataflow at Google. The canonical conceptual reference for windowing, watermarks, and exactly-once semantics.
- Google Cloud, "Pub/Sub overview" — Authoritative product reference.
- Google Cloud, "Pub/Sub to BigQuery direct subscriptions" (GA 2023) — No-code streaming ingest, now the default for raw bronze landing of event streams.
- Google Cloud, "BigQuery Continuous Queries" (GA 2024) — Streaming SQL inside BigQuery, no Dataflow required for many use cases; the newest meaningful change to the GCP streaming stack.
- Martin Kleppmann, Designing Data-Intensive Applications (O'Reilly, 2017), chapter 11 — The cleanest treatment of stream processing semantics; pairs well with Akidau for the systems-level view.
- Tyler Akidau, "The world beyond batch: Streaming 101 / 102" (O'Reilly blog, 2015) — The two-part essay that became the Streaming Systems book. Free; under an hour to read.