# A2A Protocol Interaction Specification
!!! note "Note"
    This document defines the extended implementation specification for the A2A protocol within the trpc-agent-go framework. Regular users do not need to read this document when using the A2A Client/Server — the framework automatically handles all protocol conversion details. You only need to refer to this specification when developing a non-trpc-agent-go A2A Client/Server that interoperates with this framework.
## Background
The A2A (Agent-to-Agent) protocol defines the basic data models (Message, Task, Part, etc.) and operation interfaces (SendMessage, StreamMessage, etc.) for inter-Agent communication. The A2A specification states its design goals at the very beginning:
> The Agent2Agent (A2A) Protocol is an open standard designed to facilitate communication and interoperability between independent, potentially opaque AI agent systems.
>
> Its primary goal is to enable agents to:
>
> - Discover each other's capabilities.
> - Negotiate interaction modalities (text, files, structured data).
> - Manage collaborative tasks.
> - Securely exchange information to achieve user goals without needing access to each other's internal state, memory, or tools.
As shown above, A2A envisions agents collaborating as "black boxes" — discovering capabilities, negotiating interaction modalities, and managing collaborative tasks, but without needing access to each other's Tools, Memory, or Internal State. Based on this design philosophy, trace data from an Agent's execution process — such as tool call chains (Function Call / Response), code execution steps, and model reasoning steps (Reasoning) — falls outside the scope of the A2A protocol, which does not define how such data should be transmitted.
However, in real-world multi-Agent orchestration scenarios, some users want to see the execution path of a remote Agent for debugging, auditing, or finer-grained coordination. To address this need, trpc-agent-go leverages the extension mechanisms reserved by the A2A protocol (DataPart, Message.metadata, etc.) to support the transmission of such data without violating the protocol specification.
This document defines the interaction specification of trpc-agent-go on top of the A2A protocol, serving as the standard reference for Client and Server implementations. This document will be updated as the A2A protocol evolves.
!!! info "Future Plans"
    To better align with the A2A specification's design philosophy, the cross-Agent transmission of execution process data such as tool calls will be designed as an independent extension, allowing users to decide whether to enable it through configuration. If you prefer to strictly follow A2A's black-box collaboration model without exposing internal execution details, simply disable it — only final results will be transmitted.
For the complete A2A protocol specification, see: https://a2a-protocol.org/latest/specification/
For the framework usage guide, see: A2A Integration Guide
## Version Identifier

Current interaction specification version: `0.1`
trpc-agent-go declares the interaction specification version through the standard A2A AgentCard.capabilities.extensions field, allowing the Client to detect the specification version supported by the remote Agent during discovery, enabling compatible handling during future upgrades.
This extension identifies the Agent interaction specification — the overall protocol-level conventions such as message encoding formats, streaming control, and metadata field definitions. The cross-Agent transmission of execution process data like tool call traces is an independent capability that will be declared and controlled through a separate extension in the future, and is not within the scope of this extension.
### Declaration in AgentCard

The AgentCard automatically generated by the A2A Server includes the following extension:

| Field | Description |
|---|---|
| `uri` | Extension identifier, fixed as `trpc-a2a-version` |
| `required` | Omitted (defaults to `false`), indicating this extension is declarative and does not require the Client to support it. Standard A2A Clients that do not recognize this extension can still perform basic interactions |
| `params.version` | Interaction specification version number, following semantic versioning (currently `0.1`) |
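For illustration, the declaration might appear in a generated AgentCard as follows (a sketch: the agent name and description are placeholders, and the surrounding structure follows the public A2A AgentCard schema):

```json
{
  "name": "example-agent",
  "capabilities": {
    "streaming": true,
    "extensions": [
      {
        "uri": "trpc-a2a-version",
        "description": "trpc-agent-go interaction specification version",
        "params": { "version": "0.1" }
      }
    ]
  }
}
```

Because `required` is omitted, a standard A2A Client can ignore this entry entirely and still interact with the Agent.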
### Version Declaration in Requests

When the A2A Client sends a request (`message/send` or `message/stream`), it includes the interaction specification version it supports in the Message metadata:

| Field | Description |
|---|---|
| `interaction_spec_version` | The interaction specification version supported by the Client; the Server can use this to determine the response encoding format |
The Server can determine the Client's capabilities based on this field:

- Field present: the Client is a trpc-agent-go client that supports the corresponding version of the interaction specification (e.g., tool call, reasoning content, and other extended encodings)
- Field absent: the Client is a standard A2A client or a client from another framework; the Server should respond using basic A2A protocol behavior
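For illustration, a minimal `message/send` request declaring the version might look like this (a sketch: the part field names follow the A2A JSON-RPC schema, and the text is a placeholder):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "message/send",
  "params": {
    "message": {
      "role": "user",
      "parts": [{ "kind": "text", "text": "Hello" }],
      "metadata": { "interaction_spec_version": "0.1" }
    }
  }
}
```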
### Compatibility Strategy

- Major version change (e.g., `1.0` → `2.0`): indicates incompatible protocol changes; the Client should check the version and degrade or reject
- Minor version change (e.g., `1.0` → `1.1`): indicates backward-compatible extensions; the Client can safely ignore unknown fields
- When the Client does not detect this extension, it should fall back to basic A2A protocol behavior
- When the Server receives a request without `interaction_spec_version`, it should respond using basic A2A protocol behavior
## Overall Conversion Flow

```mermaid
sequenceDiagram
    participant C as A2AAgent (Client)
    participant S as A2A Server
    participant R as Agent Runner
    participant A as Agent
    Note over C,A: Non-streaming (message/send)
    C->>C: Invocation → protocol.Message (TextPart/FilePart)
    C->>S: HTTP POST JSON-RPC: message/send
    S->>S: protocol.Message → model.Message
    S->>R: runner.Run(userID, ctxID, message)
    R->>A: agent.Run(ctx, invocation)
    A-->>R: event.Event (ToolCall)
    A-->>R: event.Event (ToolResponse)
    A-->>R: event.Event (Content)
    R-->>S: <-chan event.Event
    S->>S: Convert each event → protocol.Message
    S->>S: Single message returns Message / multiple messages wrapped in Task
    S-->>C: JSON-RPC Response (Message or Task)
    C->>C: Message/Task → event.Event sequence
    Note over C,A: Streaming (message/stream)
    C->>C: Invocation → protocol.Message (TextPart/FilePart)
    C->>S: HTTP POST JSON-RPC: message/stream
    S->>S: protocol.Message → model.Message
    S-->>C: SSE: TaskStatusUpdateEvent (submitted)
    S->>R: runner.Run(userID, ctxID, message)
    R->>A: agent.Run(ctx, invocation)
    loop Agent produces events
        A-->>R: event.Event (Delta/ToolCall/...)
        R-->>S: event.Event
        S->>S: event → TaskArtifactUpdateEvent
        S-->>C: SSE: TaskArtifactUpdateEvent
        C->>C: ArtifactUpdate → event.Event
    end
    S-->>C: SSE: TaskArtifactUpdateEvent (lastChunk=true)
    S-->>C: SSE: TaskStatusUpdateEvent (completed)
```
SendMessage waits for the complete response before returning it all at once, while StreamMessage pushes incremental events in real time via SSE. The framework automatically handles format conversion on both ends.
## Event Types and A2A Mapping

| Agent Event | A2A Part Type | Part Metadata | Message Metadata |
|---|---|---|---|
| Text Reply | `TextPart` | — | `object_type: chat.completion` |
| Reasoning Content | `TextPart` | `thought: true` | `object_type: chat.completion` |
| Tool Call | `DataPart` (id/name/args) | `type: function_call` | `object_type: chat.completion` |
| Tool Response | `DataPart` (id/name/response) | `type: function_response` | `object_type: tool.response` |
| Executable Code | `DataPart` (code/language) | `type: executable_code` | `tag: code_execution_code` |
| Code Execution Result | `DataPart` (output/outcome) | `type: code_execution_result` | `tag: code_execution_result` |
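As an example of this mapping, a Tool Call event might be encoded as the following `DataPart` (a sketch: the `data` keys mirror the id/name/args fields from the table above, and all values are placeholders):

```json
{
  "kind": "data",
  "data": {
    "id": "call_001",
    "name": "get_weather",
    "args": { "city": "Beijing" }
  },
  "metadata": { "type": "function_call" }
}
```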
## Tool Call Transmission Flow

### Non-streaming (message/send)

The Server collects all Agent events and returns them at once. A single message is returned directly as `protocol.Message`; multiple messages are wrapped in `protocol.Task` (intermediate steps in `history`, final reply in `artifacts`).
```mermaid
sequenceDiagram
    participant C as A2AAgent (Client)
    participant S as A2A Server
    participant R as Agent Runner
    participant A as Agent + LLM
    C->>C: Invocation → protocol.Message
    C->>S: HTTP POST message/send (TextPart: "What's the weather in Beijing?")
    S->>S: protocol.Message → model.Message
    S->>R: runner.Run(userID, ctxID, message)
    R->>A: agent.Run(ctx, invocation)
    A->>A: Call LLM
    Note over A,R: LLM decides to call a tool
    A-->>R: event(ToolCall: get_weather, args: {city:"Beijing"})
    R->>R: Execute tool get_weather
    R-->>A: Tool result: {temp:"20°C"}
    A-->>R: event(ToolResponse: {temp:"20°C"})
    Note over A,R: LLM generates final reply
    A->>A: Call LLM again (with tool result)
    A-->>R: event(Content: "The current temperature in Beijing is 20°C")
    A-->>R: event(RunnerCompletion)
    R-->>S: channel closed
    Note over S,C: Server converts and wraps
    S->>S: ToolCall event → DataPart (function_call)
    S->>S: ToolResponse event → DataPart (function_response)
    S->>S: Content event → TextPart
    S->>S: 3 messages → wrapped as Task
    S-->>C: JSON-RPC Response: Task
    Note left of C: history: [function_call msg, function_response msg]<br/>artifacts: [TextPart "The current temperature in Beijing is 20°C"]
    C->>C: Parse history → ToolCall/ToolResponse event
    C->>C: Parse artifacts → Content event
    C->>C: Output event.Event sequence downstream
```
### Streaming (message/stream)

The Server converts each Agent event in real time to an SSE `TaskArtifactUpdateEvent` and pushes it to the Client. Tool calls and tool responses are sent as complete events, while text content is pushed incrementally.
```mermaid
sequenceDiagram
    participant C as A2AAgent (Client)
    participant S as A2A Server
    participant R as Agent Runner
    participant A as Agent + LLM
    C->>C: Invocation → protocol.Message
    C->>S: HTTP POST message/stream (TextPart: "What's the weather in Beijing?")
    S->>S: protocol.Message → model.Message
    S->>S: BuildTask → taskID
    S-->>C: SSE: TaskStatusUpdateEvent (submitted) [Client filters]
    S->>R: runner.Run(userID, ctxID, message)
    R->>A: agent.Run(ctx, invocation)
    Note over A,S: Tool call (complete event)
    A-->>R: event(ToolCall: get_weather)
    R-->>S: event
    S->>S: ToolCall → DataPart (function_call)
    S-->>C: SSE: ArtifactUpdate {DataPart, llm_response_id: "chatcmpl-1"}
    C->>C: DataPart → ToolCall event
    R->>R: Execute tool get_weather
    A-->>R: event(ToolResponse: {temp:"20°C"})
    R-->>S: event
    S->>S: ToolResponse → DataPart (function_response)
    S-->>C: SSE: ArtifactUpdate {DataPart, llm_response_id: "chatcmpl-1"}
    C->>C: DataPart → ToolResponse event
    Note over A,S: Final reply (incremental push)
    A-->>R: event(Delta: "The current")
    R-->>S: event
    S-->>C: SSE: ArtifactUpdate {TextPart: "The current", llm_response_id: "chatcmpl-2"}
    C->>C: TextPart → Delta event
    A-->>R: event(Delta: " temperature is 20°C")
    R-->>S: event
    S-->>C: SSE: ArtifactUpdate {TextPart: " temperature is 20°C", llm_response_id: "chatcmpl-2"}
    C->>C: Same llm_response_id → aggregate into same message
    Note over S,C: Stream ends
    A-->>R: event(RunnerCompletion)
    R-->>S: channel closed
    S-->>C: SSE: ArtifactUpdate (lastChunk=true) [Client filters]
    S-->>C: SSE: TaskStatusUpdateEvent (completed) [Client filters]
    S->>S: CleanTask(taskID)
```
### Client-Side Filtering Rules

- `TaskStatusUpdateEvent` (submitted/completed): Task lifecycle signals, no user content
- `TaskArtifactUpdateEvent` with `lastChunk=true`: Stream end signal or aggregated result
### Role of `llm_response_id`
The Server includes llm_response_id in the Metadata of each response (from the ID returned by the LLM API, such as OpenAI's chatcmpl-xxx). All events produced by the same LLM call share the same llm_response_id. When the Agent makes a second LLM call (e.g., the final reply after a tool call), the llm_response_id changes. The Client uses this to determine that multiple incremental events belong to the same message.
This mechanism is primarily used for message aggregation in AG-UI scenarios — AG-UI's translator uses rsp.ID to decide when to emit TextMessageStart/TextMessageEnd events, and a change in llm_response_id indicates a new message has started.
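For illustration, two consecutive streaming chunks carrying the same `llm_response_id` would be aggregated into one message by the Client (a sketch with placeholder IDs; the events are abbreviated to the relevant fields):

```json
{
  "artifact": {
    "parts": [{ "kind": "text", "text": "The current" }],
    "metadata": { "llm_response_id": "chatcmpl-2" }
  }
}
```

```json
{
  "artifact": {
    "parts": [{ "kind": "text", "text": " temperature is 20°C" }],
    "metadata": { "llm_response_id": "chatcmpl-2" }
  }
}
```

A subsequent event with a different `llm_response_id` would start a new message.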
## Reasoning Content Transmission

Model reasoning processes (e.g., DeepSeek R1) are marked via `TextPart.metadata.thought`:
| Direction | ReasoningContent | Content |
|---|---|---|
| Agent → A2A | `TextPart` + `metadata: {thought: true}` | `TextPart` (no thought marker) |
| A2A → Agent | `thought=true` → restored as `ReasoningContent` | No marker → restored as `Content` |
A single Message can contain both reasoning content and formal reply as two TextParts.
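For illustration, such a message might look like this (a sketch; the texts are placeholders and the part field names follow the A2A schema):

```json
{
  "kind": "message",
  "role": "agent",
  "parts": [
    {
      "kind": "text",
      "text": "The user is asking about the weather, so I should call the weather tool first...",
      "metadata": { "thought": true }
    },
    { "kind": "text", "text": "The current temperature in Beijing is 20°C." }
  ],
  "metadata": { "object_type": "chat.completion" }
}
```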
## Metadata Specification

### Request Direction (Client → Server)

| Carrier | Field | Description |
|---|---|---|
| HTTP Header | `X-User-ID` | User identifier (primary source) |
| HTTP Header | `traceparent` | W3C Trace Context (auto-injected by OpenTelemetry) |
| Message.Metadata | `invocation_id` | Client-side invocation ID for trace correlation |
| Message.Metadata | `user_id` | User identifier (supplementary) |
### Response Direction (Server → Client)

| Carrier | Field | Description |
|---|---|---|
| Message/Artifact Metadata | `object_type` | Event business type (`chat.completion`, `tool.response`, etc.) |
| Message/Artifact Metadata | `tag` | Event tag (distinguishes code execution vs. code execution result) |
| Message/Artifact Metadata | `llm_response_id` | LLM response ID used for Client-side message aggregation (e.g., OpenAI's `chatcmpl-xxx`) |
| Part Metadata | `type` | Data semantic type (`function_call`, `function_response`, `executable_code`, `code_execution_result`) |
| Part Metadata | `thought` | Whether this is reasoning/thinking content |
## Network Packet Examples

### Non-streaming: Request and Response with Tool Call
Request:
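An illustrative `message/send` request (a sketch: IDs and user text are placeholders, and the field names follow the A2A JSON-RPC schema and the metadata tables above):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "message/send",
  "params": {
    "message": {
      "role": "user",
      "parts": [{ "kind": "text", "text": "What's the weather in Beijing?" }],
      "metadata": {
        "interaction_spec_version": "0.1",
        "invocation_id": "inv-001",
        "user_id": "user-123"
      }
    }
  }
}
```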
Response (Task, with tool call intermediate process):
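An illustrative response (a sketch: IDs are placeholders, and the envelope follows the A2A Task schema; the tool call intermediate steps sit in `history`, the final reply in `artifacts`):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "kind": "task",
    "id": "task-001",
    "contextId": "ctx-001",
    "status": { "state": "completed" },
    "history": [
      {
        "kind": "message",
        "role": "agent",
        "parts": [
          {
            "kind": "data",
            "data": { "id": "call_001", "name": "get_weather", "args": { "city": "Beijing" } },
            "metadata": { "type": "function_call" }
          }
        ],
        "metadata": { "object_type": "chat.completion", "llm_response_id": "chatcmpl-1" }
      },
      {
        "kind": "message",
        "role": "agent",
        "parts": [
          {
            "kind": "data",
            "data": { "id": "call_001", "name": "get_weather", "response": { "temp": "20°C" } },
            "metadata": { "type": "function_response" }
          }
        ],
        "metadata": { "object_type": "tool.response" }
      }
    ],
    "artifacts": [
      {
        "artifactId": "artifact-001",
        "parts": [{ "kind": "text", "text": "The current temperature in Beijing is 20°C" }],
        "metadata": { "object_type": "chat.completion", "llm_response_id": "chatcmpl-2" }
      }
    ]
  }
}
```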
### Streaming: SSE Event Sequence with Tool Call
Request:
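An illustrative `message/stream` request (a sketch; identical in shape to the non-streaming request except for the method name):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "message/stream",
  "params": {
    "message": {
      "role": "user",
      "parts": [{ "kind": "text", "text": "What's the weather in Beijing?" }],
      "metadata": { "interaction_spec_version": "0.1" }
    }
  }
}
```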
SSE Response:
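An illustrative SSE event sequence (a sketch: each `data:` line is abbreviated to its `result` payload, and all IDs are placeholders):

```text
data: {"result":{"kind":"status-update","taskId":"task-001","status":{"state":"submitted"},"final":false}}

data: {"result":{"kind":"artifact-update","taskId":"task-001","artifact":{"parts":[{"kind":"data","data":{"id":"call_001","name":"get_weather","args":{"city":"Beijing"}},"metadata":{"type":"function_call"}}],"metadata":{"llm_response_id":"chatcmpl-1"}}}}

data: {"result":{"kind":"artifact-update","taskId":"task-001","artifact":{"parts":[{"kind":"data","data":{"id":"call_001","name":"get_weather","response":{"temp":"20°C"}},"metadata":{"type":"function_response"}}],"metadata":{"llm_response_id":"chatcmpl-1"}}}}

data: {"result":{"kind":"artifact-update","taskId":"task-001","artifact":{"parts":[{"kind":"text","text":"The current"}],"metadata":{"llm_response_id":"chatcmpl-2"}}}}

data: {"result":{"kind":"artifact-update","taskId":"task-001","artifact":{"parts":[{"kind":"text","text":" temperature is 20°C"}],"metadata":{"llm_response_id":"chatcmpl-2"}}}}

data: {"result":{"kind":"artifact-update","taskId":"task-001","lastChunk":true}}

data: {"result":{"kind":"status-update","taskId":"task-001","status":{"state":"completed"},"final":true}}
```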
### Non-streaming: Response with Reasoning Content
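An illustrative response for this case (a sketch: a single reply is returned directly as a Message, with the reasoning text and the final answer as two `TextPart`s; texts are placeholders):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "kind": "message",
    "role": "agent",
    "parts": [
      {
        "kind": "text",
        "text": "The user is asking a factual question; let me reason it through first...",
        "metadata": { "thought": true }
      },
      { "kind": "text", "text": "The answer is 42." }
    ],
    "metadata": { "object_type": "chat.completion" }
  }
}
```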
### Streaming: Parallel Tool Calls
Multiple tool calls are placed in the parts array of the same artifact-update:
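An illustrative `artifact-update` event carrying two parallel calls (a sketch; IDs are placeholders):

```json
{
  "kind": "artifact-update",
  "taskId": "task-001",
  "artifact": {
    "parts": [
      {
        "kind": "data",
        "data": { "id": "call_001", "name": "get_weather", "args": { "city": "Beijing" } },
        "metadata": { "type": "function_call" }
      },
      {
        "kind": "data",
        "data": { "id": "call_002", "name": "get_weather", "args": { "city": "Shanghai" } },
        "metadata": { "type": "function_call" }
      }
    ],
    "metadata": { "llm_response_id": "chatcmpl-1" }
  }
}
```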
## Distributed Tracing

### Trace Context Propagation
Distributed tracing is propagated through HTTP Headers, following the W3C Trace Context standard:
- Client side: automatically injects the `traceparent` header into HTTP requests via OpenTelemetry's `TextMapPropagator`
- Server side: automatically extracts trace context from the `traceparent` HTTP request header and injects it into `context.Context`, making it available throughout the entire call chain
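A `traceparent` header follows the W3C format `version-traceid-parentid-traceflags`, for example (the canonical example value from the W3C Trace Context specification):

```text
traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
```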
### Application-Layer Tracing Fields
In addition to HTTP-layer trace context, supplementary tracing information is passed through Message.Metadata:
- `invocation_id` (Request Metadata): identifies a single Agent invocation on the Client side; the Server can use it for log correlation
- `llm_response_id` (Response Metadata): identifies the original LLM response ID; the Client uses it for message aggregation