Runner Component User Guide
Overview
Runner provides the interface to run Agents, responsible for session management and event stream processing. The core responsibilities of Runner are: obtain or create sessions, generate an Invocation ID, call the Agent (via agent.RunWithPlugins), process the returned event stream, and append non-partial response events to the session.
Key Features
- Session Management: Obtain/create sessions via sessionService, using inmemory.NewSessionService() by default.
- Event Handling: Receive Agent event streams and append non-partial response events to the session.
- ID Generation: Automatically generate Invocation IDs and event IDs.
- Observability Integration: Integrates telemetry/trace to automatically record spans.
- Completion Event: Generates a runner-completion event after the Agent event stream ends.
- Plugins: Register once on a Runner to apply global hooks across agent, tool, and model lifecycles.
Architecture
Quick Start
Requirements
- Go 1.21 or later.
- Valid LLM API key (OpenAI-compatible interface).
- Redis (optional, for distributed session management).
Minimal Example
Run the Example
Interactive Features
After running the example, the following special commands are supported:
- /history - Ask the AI to show the conversation history.
- /new - Start a new session (reset the conversation context).
- /exit - End the conversation.
When the AI uses tools, the details of each tool invocation are displayed:
Core API
Create Runner
Plugins
Runner plugins are global, runner-scoped hooks. Register plugins once and they will apply automatically to all agents, tools, and model calls executed by that Runner.
Notes:
- Plugin names must be unique per Runner.
- Plugins run in the order they are registered.
- If a plugin implements plugin.Closer, Runner will call it in Close().
Run Conversation
Request ID (requestID) and Run Control
Each call to Runner.Run is a run. If you want to cancel a run or query
its status, you need a request identifier (requestID).
You can provide your own requestID (recommended) via agent.WithRequestID
(for example, a Universally Unique Identifier (UUID)). Runner injects it into
every emitted event.Event (event.RequestID).
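Go's standard library has no UUID package, so a caller that wants a UUID-shaped requestID without a third-party dependency has to build one. The following is a minimal sketch of a version 4 (random) UUID generator; `newRequestID` is a hypothetical helper name, not part of the framework, and the resulting string would then be passed via agent.WithRequestID.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// newRequestID returns a random (version 4) UUID string.
// Hypothetical helper for illustration; it is not a framework API.
func newRequestID() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set the version nibble to 4
	b[8] = (b[8] & 0x3f) | 0x80 // set the RFC 4122 variant bits
	return fmt.Sprintf("%x-%x-%x-%x-%x",
		b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

func main() {
	id, err := newRequestID()
	if err != nil {
		fmt.Println("could not generate request ID:", err)
		return
	}
	fmt.Println("requestID:", id)
}
```

Storing the generated ID next to the user and session IDs makes it possible to correlate a later cancel or status query with the run it belongs to.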
Detached Cancellation (background execution)
In Go, context.Context (often named ctx) carries both cancellation and a
deadline. By default, Runner stops when ctx is cancelled.
If you want the run to continue after a parent cancellation, enable detached cancellation and use a timeout to bound the total runtime:
Runner enforces the earlier of:
- the parent context deadline (if any)
- MaxRunDuration (if set)
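The "earlier of" rule can be modeled with the standard library alone. The sketch below is not the Runner's implementation; `detach` and `demo` are hypothetical names, and the code only assumes Go 1.21+ for context.WithoutCancel. It drops the parent's cancellation signal, then re-applies whichever limit comes first: the parent's deadline or the maximum run duration.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// detach returns a context that ignores cancellation of parent but still
// expires at the earlier of parent's deadline and now+maxRun (if maxRun > 0).
// Hypothetical helper; it only models the behavior described above.
func detach(parent context.Context, maxRun time.Duration) (context.Context, context.CancelFunc) {
	// WithoutCancel keeps the values but drops cancellation and deadline.
	base := context.WithoutCancel(parent)

	var deadline time.Time
	if d, ok := parent.Deadline(); ok {
		deadline = d
	}
	if maxRun > 0 {
		limit := time.Now().Add(maxRun)
		if deadline.IsZero() || limit.Before(deadline) {
			deadline = limit
		}
	}
	if deadline.IsZero() {
		return context.WithCancel(base) // no time bound at all
	}
	return context.WithDeadline(base, deadline)
}

// demo cancels the parent, then reports whether the detached context was
// still alive at that point and whether it later expired on its own budget.
func demo() (aliveAfterParentCancel, expiredAfterBudget bool) {
	parent, cancelParent := context.WithCancel(context.Background())
	runCtx, cancelRun := detach(parent, 50*time.Millisecond)
	defer cancelRun()

	cancelParent()
	aliveAfterParentCancel = runCtx.Err() == nil

	<-runCtx.Done() // unblocks once the 50ms budget elapses
	expiredAfterBudget = runCtx.Err() == context.DeadlineExceeded
	return
}

func main() {
	alive, expired := demo()
	fmt.Println("alive after parent cancel:", alive)
	fmt.Println("expired after budget:", expired)
}
```

Without a time bound, a detached run has nothing left to stop it, which is why detached cancellation is paired with a timeout.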
Resume Interrupted Runs (tools-first resume)
In long-running conversations, users may interrupt the agent while it is still
in a tool-calling phase (for example, the last message in the session is an
assistant message with tool_calls, but no tool result has been written yet).
When you later reuse the same sessionID, you can ask the Runner to resume
from that point instead of asking the model to repeat the tool calls:
When WithResume(true) is set:
- Runner inspects the latest persisted session event.
- If the last event is an assistant response that contains tool_calls and there is no later tool result, Runner will execute those pending tools first (using the same tool set and callbacks as a normal step) and persist the tool results into the session.
- After tools finish, the normal LLM cycle continues using the updated session history, so the model sees both the original tool calls and their results.
If the last event is a user or tool message (or a plain assistant reply
without tool_calls), WithResume(true) is a no-op and the flow behaves like
a regular Run call.
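The decision described above can be sketched with a simplified model of the session history. The `Message` type and the `pendingToolCalls` and `resume` functions below are hypothetical stand-ins, not framework types; they only illustrate the rule "run pending tools first if the last event is an assistant message with tool calls, otherwise do nothing".

```go
package main

import "fmt"

// Message is a hypothetical, simplified stand-in for a persisted session event.
type Message struct {
	Role      string   // "user", "assistant", or "tool"
	Content   string   // text content or tool result
	ToolCalls []string // IDs of tool calls requested by an assistant message
	ToolID    string   // for Role == "tool": the call this result answers
}

// pendingToolCalls returns the tool-call IDs that still need to be executed.
// It is non-empty only when the latest message is an assistant message with
// tool calls, which means no tool result has been written after it.
func pendingToolCalls(history []Message) []string {
	if len(history) == 0 {
		return nil
	}
	last := history[len(history)-1]
	if last.Role != "assistant" || len(last.ToolCalls) == 0 {
		return nil
	}
	return last.ToolCalls
}

// resume executes pending tools (if any) and appends their results, so the
// next model call sees both the tool calls and their results.
func resume(history []Message, exec func(id string) string) []Message {
	for _, id := range pendingToolCalls(history) {
		history = append(history, Message{Role: "tool", ToolID: id, Content: exec(id)})
	}
	return history
}

func main() {
	history := []Message{
		{Role: "user", Content: "What is the weather in Paris?"},
		{Role: "assistant", ToolCalls: []string{"call_1"}},
	}
	history = resume(history, func(id string) string { return "result for " + id })
	fmt.Println(len(history), history[len(history)-1].Role)
}
```

Calling `resume` a second time on the returned history changes nothing, because the last message is now a tool result.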
Tool Call Arguments Auto Repair
Some models may emit non-strict JSON arguments for tool_calls (for example, unquoted object keys or trailing commas), which can break tool execution or external parsing.
When agent.WithToolCallArgumentsJSONRepairEnabled(true) is passed to runner.Run, the framework makes a best-effort attempt to repair toolCall.Function.Arguments before the arguments are used. For detailed usage, see Tool Call Arguments Auto Repair.
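To make "best-effort repair" concrete, here is a deliberately naive sketch that handles only the two defects named above: unquoted object keys and trailing commas. It is an illustration written for this guide, not the framework's repair logic, and `repairJSON` is a hypothetical name. Input that is already valid is returned unchanged, and the result is re-validated so callers can tell whether the repair worked.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

func isSpace(c byte) bool { return c == ' ' || c == '\n' || c == '\t' || c == '\r' }

func isIdentStart(c byte) bool {
	return c == '_' || c == '$' || (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
}

func isIdentChar(c byte) bool { return isIdentStart(c) || (c >= '0' && c <= '9') }

// repairJSON quotes bare object keys and drops trailing commas, leaving the
// contents of string literals untouched. The bool reports whether the
// returned text is valid JSON. Naive by design: illustration only.
func repairJSON(s string) (string, bool) {
	if json.Valid([]byte(s)) {
		return s, true
	}
	var out strings.Builder
	inString := false
	for i := 0; i < len(s); i++ {
		c := s[i]
		switch {
		case inString:
			out.WriteByte(c)
			if c == '\\' && i+1 < len(s) {
				i++ // copy the escaped character verbatim
				out.WriteByte(s[i])
			} else if c == '"' {
				inString = false
			}
		case c == '"':
			inString = true
			out.WriteByte(c)
		case c == ',':
			j := i + 1
			for j < len(s) && isSpace(s[j]) {
				j++
			}
			if j < len(s) && (s[j] == '}' || s[j] == ']') {
				continue // trailing comma: drop it
			}
			out.WriteByte(c)
		case isIdentStart(c):
			j := i
			for j < len(s) && isIdentChar(s[j]) {
				j++
			}
			k := j
			for k < len(s) && isSpace(s[k]) {
				k++
			}
			word := s[i:j]
			if k < len(s) && s[k] == ':' {
				out.WriteString(strconv.Quote(word)) // bare key: quote it
			} else {
				out.WriteString(word) // true, false, null, exponent, ...
			}
			i = j - 1
		default:
			out.WriteByte(c)
		}
	}
	fixed := out.String()
	return fixed, json.Valid([]byte(fixed))
}

func main() {
	fixed, ok := repairJSON(`{city: "Paris", days: 3,}`)
	fmt.Println(fixed, ok)
}
```

Because the second return value comes from json.Valid, a caller can fall back to the original arguments (or surface an error) when the repair does not produce valid JSON.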
Provide Conversation History (auto-seed + session reuse)
If your upstream service maintains the conversation and you want the agent to
see that context, you can pass a full history ([]model.Message) directly. The
runner will seed an empty session with that history automatically and then
merge in new session events.
Option A: Use the convenience helper runner.RunWithMessages
Example: examples/runwithmessages (uses RunWithMessages; runner auto-seeds and
continues reusing the session)
Option B: Pass via RunOption explicitly (same philosophy as ADK Python)
When []model.Message is provided, the runner persists that history into the
session on first use (if empty). The content processor does not read this
option; it only derives messages from session events (or falls back to the
single invocation.Message if the session has no events). RunWithMessages
still sets invocation.Message to the latest user turn so graph/flow agents
that inspect it continue to work.
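The seeding rule above ("persist the history on first use, only if the session is empty") can be sketched with toy types. `Session`, `Message`, `seedIfEmpty`, and `latestUserMessage` are hypothetical names used only for this illustration; they are not the framework's types.

```go
package main

import "fmt"

// Message and Session are hypothetical, simplified stand-ins.
type Message struct {
	Role    string // "user" or "assistant"
	Content string
}

type Session struct {
	Events []Message
}

// seedIfEmpty copies the caller-provided history into the session only when
// the session has no events yet. It reports whether seeding happened.
func seedIfEmpty(s *Session, history []Message) bool {
	if len(s.Events) > 0 || len(history) == 0 {
		return false
	}
	s.Events = append(s.Events, history...)
	return true
}

// latestUserMessage returns the most recent user turn, which is what the
// invocation message would be set to in this model.
func latestUserMessage(history []Message) (Message, bool) {
	for i := len(history) - 1; i >= 0; i-- {
		if history[i].Role == "user" {
			return history[i], true
		}
	}
	return Message{}, false
}

func main() {
	history := []Message{
		{Role: "user", Content: "Hi"},
		{Role: "assistant", Content: "Hello! How can I help?"},
		{Role: "user", Content: "Summarize our chat."},
	}
	sess := &Session{}
	fmt.Println("seeded:", seedIfEmpty(sess, history)) // first use
	fmt.Println("seeded:", seedIfEmpty(sess, history)) // session reused
	if msg, ok := latestUserMessage(history); ok {
		fmt.Println("invocation message:", msg.Content)
	}
}
```

The second call does not duplicate the history, which mirrors why passing the full history on every request is safe when the session is reused.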
Detecting End-of-Run and Reading Final Output (Graph-friendly)
When driving a GraphAgent workflow, the LLM's "final response" is not the end of
the workflow; nodes like output may still be pending. Instead of checking
Response.IsFinalResponse(), always stop on the Runner's terminal completion
event:
For convenience, Runner now propagates the graphโs final snapshot into this last
event. You can extract the final textual output via graph.StateKeyLastResponse:
This keeps application code simple and consistent across Agent types while still preserving detailed graph events for advanced use.
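The consumption pattern can be sketched with local stand-in types. Everything below is an assumption made for illustration: the `Event` struct, the `"runner.completion"` object string used to recognize the terminal event, the `"last_response"` key standing in for graph.StateKeyLastResponse, and the choice to store the value as a JSON-encoded string. Check the framework's event and graph packages for the real definitions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical stand-ins for the framework's constants.
const (
	objectRunnerCompletion = "runner.completion"
	stateKeyLastResponse   = "last_response"
)

// Event is a simplified stand-in for an emitted event.
type Event struct {
	Object     string
	StateDelta map[string][]byte
}

// finalOutput drains the stream until it is closed and returns the final
// text carried by the completion event, if one was seen.
func finalOutput(events <-chan Event) (string, bool) {
	var (
		text  string
		found bool
	)
	for e := range events {
		if e.Object != objectRunnerCompletion {
			continue // intermediate event: not the end of the run
		}
		raw, ok := e.StateDelta[stateKeyLastResponse]
		if !ok {
			continue
		}
		if err := json.Unmarshal(raw, &text); err == nil {
			found = true
		}
	}
	return text, found
}

func main() {
	ch := make(chan Event, 2)
	ch <- Event{Object: "chat.completion.chunk"}
	ch <- Event{
		Object:     objectRunnerCompletion,
		StateDelta: map[string][]byte{stateKeyLastResponse: []byte(`"All done."`)},
	}
	close(ch)

	if text, ok := finalOutput(ch); ok {
		fmt.Println("FINAL:", text)
	}
}
```

The loop keeps reading until the channel closes instead of returning at the first match, so the producer is never left blocked on a send.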
Option: Emit Final Graph LLM Responses
Graph-based agents (for example, GraphAgent) can call a Large Language Model (LLM) many times inside a single run. Each model call can produce a stream of events:
- Partial chunks: IsPartial=true, Done=false, incremental text in choice.Delta.Content
- Final message: IsPartial=false, Done=true, full text in choice.Message.Content
By default, graph LLM nodes only emit the partial chunks. This avoids treating intermediate node outputs as normal assistant replies (for example, persisting them into the Session by Runner or showing them to end users).
To opt into the newer behavior (emit the final Done=true assistant message
events from graph LLM nodes), enable this RunOption:
Behavior summary:
First, one key idea: this option controls whether each graph Large Language
Model (LLM) node emits an extra final Done=true assistant message event. It
does not mean the Runner completion event will always have (or not have)
Response.Choices.
Assume your graph is llm1 -> llm2 -> llm3, and llm3 produces the final
answer:
- Case 1: agent.WithGraphEmitFinalModelResponses(false) (default)
  - llm1/llm2/llm3: emit only partial chunks (Done=false), no final Done=true assistant message events.
  - Runner completion event: to keep the "read only the last event" pattern working, Runner echoes llm3's final output into completion Response.Choices (when the graph provides final choices). The final text is also always available via StateDelta[graph.StateKeyLastResponse].
- Case 2: agent.WithGraphEmitFinalModelResponses(true)
  - llm1/llm2/llm3: in addition to partial chunks, each node emits a final Done=true assistant message event (so intermediate nodes may now produce complete assistant messages, and Runner may persist those non-partial events into the Session).
  - Runner completion event: to avoid duplicating the final message, Runner deduplicates by response identifier (ID). When it can confirm the final message already appeared earlier, it omits the echo, so completion Response.Choices may be empty. The final text should still be read from StateDelta[graph.StateKeyLastResponse].
Recommendation: for GraphAgent workflows, always read the final output from the
Runner completion event's StateDelta (for example,
graph.StateKeyLastResponse). Treat Response.Choices on the completion event
as optional when this option is enabled.
Option: StreamMode
Runner can filter the event stream before it reaches your application code.
This provides a single, run-level switch to select which categories of events
are forwarded to your eventChan.
Use agent.WithStreamMode(...):
Supported modes (graph workflows):
- messages: model output events (for example, chat.completion.chunk)
- updates: graph.state.update / graph.channel.update / graph.execution
- checkpoints: graph.checkpoint.*
- tasks: task lifecycle events (graph.node.*, graph.pregel.*)
- debug: same as checkpoints + tasks
- custom: node-emitted events (graph.node.custom)
Notes:
- When agent.StreamModeMessages is selected, graph-based Large Language Model (LLM) nodes enable final model response events automatically for that run. To override it, call agent.WithGraphEmitFinalModelResponses(false) after agent.WithStreamMode(...).
- StreamMode only affects what Runner forwards to your eventChan. Runner still processes and persists events internally.
- For graph workflows, some event types (for example, graph.checkpoint.*) are emitted only when their corresponding mode is selected.
- Runner always emits a final runner.completion event.
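The mode table above amounts to a filter keyed by event type. The sketch below models only the forwarding decision; `modePatterns`, `matches`, and `forward` are hypothetical names, the `chat.completion*` pattern for messages is an assumption generalized from the single example given, and the real Runner may classify events differently.

```go
package main

import (
	"fmt"
	"strings"
)

// modePatterns maps each stream mode to event-type patterns. A trailing "*"
// means "prefix match". The table mirrors the list above; it is a model,
// not the framework's internal table.
var modePatterns = map[string][]string{
	"messages":    {"chat.completion*"},
	"updates":     {"graph.state.update", "graph.channel.update", "graph.execution"},
	"checkpoints": {"graph.checkpoint.*"},
	"tasks":       {"graph.node.*", "graph.pregel.*"},
	"debug":       {"graph.checkpoint.*", "graph.node.*", "graph.pregel.*"},
	"custom":      {"graph.node.custom"},
}

// matches reports whether eventType satisfies a single pattern.
func matches(pattern, eventType string) bool {
	if strings.HasSuffix(pattern, "*") {
		return strings.HasPrefix(eventType, strings.TrimSuffix(pattern, "*"))
	}
	return pattern == eventType
}

// forward reports whether an event of the given type would be passed on to
// the application for the selected modes. The completion event always is.
func forward(eventType string, modes ...string) bool {
	if eventType == "runner.completion" {
		return true
	}
	for _, mode := range modes {
		for _, pattern := range modePatterns[mode] {
			if matches(pattern, eventType) {
				return true
			}
		}
	}
	return false
}

func main() {
	for _, t := range []string{
		"chat.completion.chunk",
		"graph.checkpoint.created",
		"graph.node.start",
		"runner.completion",
	} {
		fmt.Printf("%-26s messages=%v debug=%v\n",
			t, forward(t, "messages"), forward(t, "debug"))
	}
}
```

Because filtering happens only at the forwarding step, selecting a narrow mode in this model does not change what is processed or persisted, which matches the second note above.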
Session Management
In-memory Session (Default)
Redis Session (Distributed)
Session Configuration
Agent Configuration
Runner's core responsibility is to manage the Agent execution flow. A created Agent needs to be executed via Runner.
Basic Agent Creation
Switch Agents Per Request
Runner can register multiple optional agents at construction time and pick one per Run:
- runner.NewRunner("my-app", agent): Set the default agent when creating the Runner.
- runner.WithAgent("agentName", agent): Pre-register an agent by name so later requests can switch via name.
- agent.WithAgentByName("agentName"): Choose a registered agent by name for a single request without changing the default.
- agent.WithAgent(agent): Provide an agent instance directly for a single request; highest priority and no pre-registration needed.
Agent selection priority: agent.WithAgent > agent.WithAgentByName > default agent set at construction.
The selected agent name is used as the event author and is recorded via appid.RegisterRunner for observability.
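The selection priority can be expressed as a small resolution function. The types and names below (`Agent`, `runOptions`, `selectAgent`) are hypothetical, and returning an error for an unregistered name is an assumption of this sketch rather than documented behavior.

```go
package main

import (
	"errors"
	"fmt"
)

// Agent is a hypothetical stand-in for an agent instance.
type Agent struct{ Name string }

// runOptions models the two per-request overrides.
type runOptions struct {
	agent     *Agent // per-request instance override (highest priority)
	agentName string // per-request lookup by registered name
}

// selectAgent resolves the agent for one request:
// instance override > registered name > default agent.
func selectAgent(def *Agent, registry map[string]*Agent, o runOptions) (*Agent, error) {
	if o.agent != nil {
		return o.agent, nil
	}
	if o.agentName != "" {
		a, ok := registry[o.agentName]
		if !ok {
			// Assumption: unknown names are reported instead of falling back.
			return nil, fmt.Errorf("agent %q is not registered", o.agentName)
		}
		return a, nil
	}
	if def == nil {
		return nil, errors.New("no default agent configured")
	}
	return def, nil
}

func main() {
	def := &Agent{Name: "assistant"}
	registry := map[string]*Agent{"math": {Name: "math"}}

	a, _ := selectAgent(def, registry, runOptions{})
	fmt.Println("default:", a.Name)

	a, _ = selectAgent(def, registry, runOptions{agentName: "math"})
	fmt.Println("by name:", a.Name)

	a, _ = selectAgent(def, registry, runOptions{agent: &Agent{Name: "adhoc"}, agentName: "math"})
	fmt.Println("instance wins:", a.Name)
}
```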
Generation Configuration
Runner passes generation configuration to the Agent:
Tool Integration
Tool configuration is done inside the Agent, while Runner is responsible for running the Agent with tools:
Tool invocation flow: Runner itself does not directly handle tool invocation. The flow is as follows:
- Pass tools: Runner passes context to the Agent via Invocation.
- Agent processing: Agent.Run handles the tool invocation logic.
- Event forwarding: Runner receives the event stream returned by the Agent and forwards it.
- Session recording: Append non-partial response events to the session.
Multi-Agent Support
Runner can execute complex multi-Agent structures (see multiagent.md for details):
Event Processing
Event Types
Complete Event Handling Example
Execution Context Management
Runner creates and manages the Invocation structure:
Best Practices
Error Handling
Stopping a Run Safely
- Cancel the context: Wrap runner.Run with context.WithCancel. Call cancel() when the turn count or token budget is hit. llmflow treats context.Canceled as a graceful exit and closes the agent event channel, so the runner loop finishes cleanly without blocking writers.
- Emit a stop event: Inside custom processors or tools, return agent.NewStopError("reason"). llmflow converts it into a stop_agent_error event and stops the flow. Still pair it with context cancellation for hard cutoffs.
- Avoid breaking the runner loop directly: Breaking out of the event-loop reader leaves the agent goroutine running, and it can then block on channel writes. Prefer context cancellation or StopError.
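The reason cancellation is preferred over breaking out of the loop can be shown with a generic producer/consumer pair that uses only the standard library. This models the pattern, not the framework: `produce` and `consume` are hypothetical names, and integers stand in for events. The producer checks ctx.Done() on every send, and the consumer cancels and then keeps draining until the channel is closed.

```go
package main

import (
	"context"
	"fmt"
)

// produce emits up to n values and stops as soon as ctx is cancelled.
// Selecting on ctx.Done() for every send means the goroutine can never be
// stuck on a channel write after the reader asks it to stop.
func produce(ctx context.Context, n int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out) // closing lets the reader's range loop end
		for i := 0; i < n; i++ {
			select {
			case out <- i:
			case <-ctx.Done():
				return
			}
		}
	}()
	return out
}

// consume reads events until the budget is reached, then cancels and keeps
// draining until the channel closes. It returns how many events it saw,
// which may be slightly more than the budget.
func consume(budget int) int {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	seen := 0
	for range produce(ctx, 1000) {
		seen++
		if seen == budget {
			cancel() // ask the producer to stop; do not break here
		}
	}
	return seen
}

func main() {
	fmt.Println("events seen before the stream closed:", consume(3))
}
```

If `consume` used break instead of cancel(), a producer without the ctx.Done() case would block forever on its next send; with cancellation, both sides exit and no goroutine is leaked.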
Resource Management
Closing Runner (Important)
You MUST call Close() when the Runner is no longer needed to prevent goroutine leaks (trpc-agent-go >= v0.5.0).
Runner Only Closes Resources It Created
When a Runner is created without providing a Session Service, it automatically creates a default inmemory Session Service. This service starts background goroutines internally (for asynchronous summary processing, TTL-based session cleanup, etc.). Runner only manages the lifecycle of this self-created inmemory Session Service. If you provide your own Session Service via WithSessionService(), you are responsible for managing its lifecycle; Runner won't close it.
If you don't call Close() on a Runner that owns an inmemory Session Service, the background goroutines will run forever, causing resource leaks.
Recommended Practice:
When You Provide Your Own Session Service:
Long-Running Services:
Important Notes:
- Close() is idempotent; calling it multiple times is safe.
- Runner only closes the inmemory Session Service it creates by default.
- If you provide your own Session Service via WithSessionService(), Runner won't close it (you manage it yourself).
- Not calling Close() when Runner owns an inmemory Session Service will cause goroutine leaks.
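The ownership and idempotency rules can be modeled in a few lines. This is a toy sketch, not the Runner's source: `memService`, `toyRunner`, and `newToyRunner` are hypothetical names, and sync.Once is one common way to make a Close method safe to call repeatedly.

```go
package main

import (
	"fmt"
	"sync"
)

// memService is a toy service that owns one background goroutine.
type memService struct {
	stop   chan struct{}
	done   chan struct{}
	closed bool
}

func newMemService() *memService {
	s := &memService{stop: make(chan struct{}), done: make(chan struct{})}
	go func() {
		defer close(s.done)
		<-s.stop // a real service would run periodic cleanup until stopped
	}()
	return s
}

// Close stops the goroutine and waits for it. It is not idempotent on its
// own: a second call would panic on the already-closed channel.
func (s *memService) Close() error {
	close(s.stop)
	<-s.done
	s.closed = true
	return nil
}

// toyRunner closes the service only if it created the service itself.
type toyRunner struct {
	svc       *memService
	ownsSvc   bool
	closeOnce sync.Once
	closeErr  error
}

// newToyRunner uses the given service, or creates (and owns) a default one.
func newToyRunner(svc *memService) *toyRunner {
	if svc == nil {
		return &toyRunner{svc: newMemService(), ownsSvc: true}
	}
	return &toyRunner{svc: svc}
}

// Close is idempotent: the cleanup body runs at most once.
func (r *toyRunner) Close() error {
	r.closeOnce.Do(func() {
		if r.ownsSvc {
			r.closeErr = r.svc.Close()
		}
	})
	return r.closeErr
}

func main() {
	owned := newToyRunner(nil)
	_ = owned.Close()
	_ = owned.Close() // safe: second call is a no-op
	fmt.Println("owned service closed:", owned.svc.closed)

	external := newMemService()
	r := newToyRunner(external)
	_ = r.Close()
	fmt.Println("external service closed by runner:", external.closed)
	_ = external.Close() // the caller remains responsible for it
}
```

In the second half of `main`, the externally provided service keeps running after the runner is closed, which is why the caller must close it explicitly.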
Context Lifecycle Control
Health Check
Summary
The Runner component is a core part of the tRPC-Agent-Go framework, providing complete conversation management and Agent orchestration capabilities. By properly using session management, tool integration, and event handling, you can build powerful intelligent conversational applications.