Step 7: What's Next — Multi-Agent A2A System on Cloud Run

What You Have

Three independent Cloud Run services. One protocol. A pattern you can repeat for any specialist you want to add.

You can stop here and have a useful production stack — but A2A has a much bigger surface area you can grow into.

Add Signed Agent Cards

A2A v1.0 introduced signed Agent Cards. The specialist signs its card with its deployment's private key. The orchestrator verifies the signature before trusting the card.

This is the right move when:

You operate across multiple cloud providers (Cloud Run IAM won't help across GCP and AWS)
The specialists are run by different teams or organizations
You want a portable auth story that does not depend on the cloud platform

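The private signing key never leaves the deployment. The matching public key is published at a JWKS endpoint, /.well-known/jwks.json, so the orchestrator can fetch it and verify the signature. ADK has helpers for both signing and verification. See the A2A v1.0 spec on signed cards.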
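The sign-then-verify flow can be sketched with the standard library alone. This is a minimal illustration, not the A2A wire format: it uses HMAC-SHA256 with a shared secret as a stand-in for the asymmetric signature a real deployment would use (private key on the specialist, public key fetched from the JWKS endpoint), and the "signatures", "kid", and "signature" field names are hypothetical.

```python
import base64
import hashlib
import hmac
import json


def canonical_bytes(card: dict) -> bytes:
    """Serialize the card deterministically, excluding any attached signatures."""
    unsigned = {k: v for k, v in card.items() if k != "signatures"}
    return json.dumps(unsigned, sort_keys=True, separators=(",", ":")).encode("utf-8")


def b64url(data: bytes) -> str:
    """Base64url without padding, the encoding JWS-style signatures use."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign_card(card: dict, key: bytes, key_id: str) -> dict:
    """Return a copy of the card with a signature entry attached.

    HMAC is a stand-in here; a real signed card would use an asymmetric key.
    """
    sig = hmac.new(key, canonical_bytes(card), hashlib.sha256).digest()
    signed = dict(card)
    signed["signatures"] = [{"kid": key_id, "signature": b64url(sig)}]
    return signed


def verify_card(card: dict, keys: dict[str, bytes]) -> bool:
    """Accept the card only if some signature matches a key we already trust."""
    for entry in card.get("signatures", []):
        key = keys.get(entry.get("kid"))
        if key is None:
            continue  # signed by a key we do not know
        expected = b64url(hmac.new(key, canonical_bytes(card), hashlib.sha256).digest())
        if hmac.compare_digest(expected, entry.get("signature", "")):
            return True
    return False
```

The point of the sketch is the order of operations: canonicalize, sign, attach, and on the orchestrator side refuse any card whose contents no longer match the signature.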

Add a Third Specialist

The Researcher → Writer pipeline is two steps. Real research workflows often need:

Fact-Checker — takes findings and re-verifies the top claims against a second source
Editor — takes the Writer's draft and polishes voice/style to match a brand
Translator — produces the brief in multiple languages

Each new specialist is one more package with a to_a2a() wrapper. The Orchestrator's sub_agents=[…] list grows by one entry. The instruction grows by one sentence.

You will notice the pattern: adding a specialist never modifies the existing specialists. That separation is the entire reason to use A2A over in-process sub-agents.
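The additive pattern can be sketched as a registry that the Orchestrator's configuration is derived from. This is a plain-Python illustration rather than ADK code: the Specialist class, the example URLs, the instruction sentences, and the Agent Card path are all assumptions made for the example.

```python
from dataclasses import dataclass

# Assumed well-known path for Agent Cards; check the A2A version you deploy.
AGENT_CARD_PATH = "/.well-known/agent-card.json"


@dataclass(frozen=True)
class Specialist:
    name: str         # agent name the orchestrator refers to
    base_url: str     # Cloud Run service URL (hypothetical values below)
    instruction: str  # the one sentence this specialist adds to the prompt

    @property
    def card_url(self) -> str:
        """Where the orchestrator would look for this specialist's Agent Card."""
        return self.base_url.rstrip("/") + AGENT_CARD_PATH


SPECIALISTS = [
    Specialist("researcher", "https://researcher.example.run.app",
               "Send the topic to researcher to gather findings."),
    # The new entry: nothing about researcher or writer changes.
    Specialist("fact_checker", "https://fact-checker.example.run.app",
               "Send the findings to fact_checker to re-verify the top claims."),
    Specialist("writer", "https://writer.example.run.app",
               "Send the verified findings to writer to draft the brief."),
]


def build_instruction(specialists: list[Specialist]) -> str:
    """Assemble the orchestrator instruction: one numbered sentence per specialist."""
    steps = [f"{i}. {s.instruction}" for i, s in enumerate(specialists, start=1)]
    return "You coordinate a research brief.\n" + "\n".join(steps)
```

In a real ADK project each registry entry would correspond to one remote agent in sub_agents, and build_instruction(SPECIALISTS) would supply the instruction text.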

Cross-Language Teams

ADK 1.0 ships in Python, Go, Java, and TypeScript. A2A is the wire format — it does not care about the language on either end.

A common shape:

Python Researcher (you have it)
TypeScript Writer (frontend team owns it, ships with their UI)
Go Fact-Checker (data team owns it, runs near their warehouse)

The orchestrator does not change. As long as each specialist publishes a valid Agent Card, the parent can call them all.
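Because the Agent Card is the only contract between languages, a small check on the card catches most integration mistakes early. The sketch below assumes a minimum field set; the authoritative list is the AgentCard schema of the A2A version you target, and the card_problems helper is hypothetical.

```python
import json

# Assumed minimum field set, for illustration only.
REQUIRED_FIELDS = ("name", "description", "url", "version", "capabilities", "skills")


def card_problems(raw: str) -> list[str]:
    """Return human-readable problems; an empty list means the card passed."""
    try:
        card = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(card, dict):
        return ["card must be a JSON object"]
    problems = [f"missing field: {field}" for field in REQUIRED_FIELDS if field not in card]
    if "skills" in card and not isinstance(card["skills"], list):
        problems.append("skills must be a list")
    return problems
```

The same check works whether the card was produced by a Python, TypeScript, or Go service, which is the property the cross-language setup depends on.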

Observability

A2A traces are first-class in OpenTelemetry's GenAI semantic conventions. Each inter-agent call shows up as a span with attributes for the agent name, model, and message ID.

Drop an OTel exporter into each agent and you get end-to-end traces of every brief — Orchestrator → Researcher → google_search → Researcher response → Writer → Writer response — in tools like Langfuse, Honeycomb, or Grafana Tempo.

ADK's runtime emits OTel spans automatically. You only need to wire the exporter:

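from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter
provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))  # swap in your exporter of choice
trace.set_tracer_provider(provider)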

Long-Running Tasks

A2A supports streaming and push notifications for tasks that take longer than a single request can stay open, such as a research brief that needs to run for ten minutes.

The protocol's tasks/send and tasks/get methods let you start a task, get a task ID, and poll for status. Push notifications cover the other half: you register a webhook and the agent calls it when the task completes. ADK exposes these via the RemoteA2aAgent when you set streaming=True.

This is the right move when a brief takes minutes rather than seconds, for example because the Researcher wants to read fifty sources.
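The polling half of that flow can be sketched as a small loop. The transport is left out on purpose: get_state is a hypothetical callable that would wrap the protocol's task lookup and return the task's state string, and the set of terminal state names is an assumption to check against the A2A version you use.

```python
import time
from typing import Callable

# Assumed terminal task states; any other state means "keep waiting".
TERMINAL_STATES = {"completed", "failed", "canceled", "rejected"}


def wait_for_task(
    get_state: Callable[[str], str],
    task_id: str,
    interval_s: float = 5.0,
    timeout_s: float = 900.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until the task reaches a terminal state, or raise TimeoutError."""
    deadline = clock() + timeout_s
    while True:
        state = get_state(task_id)
        if state in TERMINAL_STATES:
            return state
        if clock() >= deadline:
            raise TimeoutError(f"task {task_id} still {state!r} after {timeout_s}s")
        sleep(interval_s)
```

The sleep and clock parameters exist so the loop can be tested without waiting. When the caller can expose a public webhook, a push notification avoids the polling loop entirely.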

Sessions and Memory

Right now each request is stateless. The Orchestrator forgets the conversation between calls. ADK supports sessions out of the box:

# Create a session service; the Runner that serves the agent takes it as session_service
from google.adk.sessions import InMemorySessionService
# (Or VertexAiSessionService for cloud-backed sessions)
session_service = InMemorySessionService()

Sessions are the natural place to keep prior brief history, user preferences, and project context. With sessions enabled, the Orchestrator can carry context across briefs.

Move to Vertex AI Agent Engine

Cloud Run is great. Vertex AI Agent Engine is Google's managed runtime specifically for ADK agents, with built-in:

Session storage
Tracing
Auth
Versioned deployments

For production at scale, swap the gcloud run deploy step for vertexai.agent_engines.create(). The agent code does not change; only the deployment surface does.

See the Agent Engine docs for the migration path.

Plug In MCP for Tools

The Researcher uses Gemini's built-in google_search. For more sophisticated tools — your own database, your CRM, your CI — wire them via MCP (Model Context Protocol).

The 2026 stack is:

A2A for inter-agent communication
MCP for tool calls and data retrieval
ADK as the SDK that uses both

If you have built an MCP server (see the Build Your Own MCP Server blueprint), the Researcher can use it the same way it uses google_search.

Key Takeaways

Signed Agent Cards (v1.0) are the portable auth story across clouds and teams
Adding a specialist is additive — you never have to modify existing agents
ADK ships in four languages; A2A means your team can mix them freely
For production scale, move from Cloud Run to Vertex AI Agent Engine
The 2026 agentic stack is A2A + MCP + ADK — three layers, each with a single concern