Agent Usage Documentation
Agent is the core execution unit of the tRPC-Agent-Go framework, responsible for processing user input and generating corresponding responses. Each Agent implements a unified interface, supporting streaming output and callback mechanisms.
The framework provides multiple types of Agents, including LLMAgent, ChainAgent, ParallelAgent, CycleAgent, and GraphAgent. This document focuses on LLMAgent. For detailed information about other Agent types and multi-Agent systems, please refer to Multi-Agent.
Quick Start
Recommended Usage: Runner
We strongly recommend using Runner to execute Agents instead of directly calling Agent interfaces. Runner provides a more user-friendly interface, integrating services like Session and Memory, making usage much simpler.
📖 Learn More: For detailed usage methods, please refer to Runner
This example uses OpenAI's GPT-4o-mini model. Before starting, please ensure you have prepared the corresponding OPENAI_API_KEY and exported it through environment variables:
Additionally, the framework supports OpenAI API-compatible models, which can be configured through environment variables:
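For example, a minimal environment setup might look like the following. The `OPENAI_BASE_URL` variable name is an assumption of this sketch for pointing the client at a compatible endpoint; check your framework version for the exact variable it reads:

```shell
# Required: API key read by the OpenAI model client.
export OPENAI_API_KEY="your-api-key"

# Optional (assumed name): base URL of an OpenAI API-compatible service.
export OPENAI_BASE_URL="https://your-compatible-endpoint/v1"
```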
Creating Model Instance
First, you need to create a model instance. Here we use OpenAI's GPT-4o-mini model:
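A minimal sketch, assuming the framework's published module path (trpc.group/trpc-go/trpc-agent-go); adjust the import to your version:

```go
package main

import (
	// Assumed module path; see the framework repository for your version.
	"trpc.group/trpc-go/trpc-agent-go/model/openai"
)

func main() {
	// Reads OPENAI_API_KEY from the environment.
	modelInstance := openai.New("gpt-4o-mini")
	_ = modelInstance
}
```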
Configuring Generation Parameters
Set the model's generation parameters, including maximum tokens, temperature, and whether to use streaming output:
If you do not explicitly pass llmagent.WithGenerationConfig(...),
LLMAgent forwards the zero-value model.GenerationConfig{} by default,
so the default behavior is non-streaming (Stream=false). If you need
streaming output, set Stream: true explicitly, or override it per
request with agent.WithStream(true). Higher-level wrappers may choose
their own explicit defaults. For example, OpenClaw enables streaming by
default.
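As an illustration, an explicit generation config might look like this sketch. The pointer fields for MaxTokens and Temperature follow the framework's examples but are an assumption here; verify against your version of model.GenerationConfig:

```go
// Optional numeric parameters are pointers in this sketch.
maxTokens := 2000
temperature := 0.7

genConfig := model.GenerationConfig{
	MaxTokens:   &maxTokens,
	Temperature: &temperature,
	Stream:      true, // explicit opt-in to streaming output
}
```

Pass this config to the Agent via llmagent.WithGenerationConfig(genConfig).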
Creating LLMAgent
Use the model instance and configuration to create an LLMAgent, while setting the Agent's Description and Instruction.
Description is used to describe the basic functionality and characteristics of the Agent, while Instruction defines the specific instructions and behavioral guidelines that the Agent should follow when executing tasks.
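A hedged sketch combining the options named above (modelInstance and genConfig are the values created in the previous sections):

```go
llmAgent := llmagent.New(
	"assistant",
	llmagent.WithModel(modelInstance),
	// Description: what the Agent is, for discovery and delegation.
	llmagent.WithDescription("A helpful general-purpose assistant."),
	// Instruction: behavioral guidelines followed at execution time.
	llmagent.WithInstruction("Answer concisely and cite tool results when used."),
	llmagent.WithGenerationConfig(genConfig),
)
```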
Placeholder Variables (Session State Injection)
LLMAgent automatically injects session state into Instruction and the optional SystemPrompt via placeholder variables. Supported patterns:
- {key}: Replaced with the string value for key in the session state (write via invocation.Session.SetState("key", ...) or the SessionService).
- {key?}: Optional; if the key is missing, it is replaced with an empty string.
- {user:subkey} / {app:subkey} / {temp:subkey}: Use user/app/temp scoped keys (session services merge app/user state into the session under these prefixes).
- {invocation:subkey}: Replaced with the value of fmt.Sprintf("%+v", invocation.state["subkey"]). (The state can be set via invocation.SetState(k, v).)
Notes:
- If a non-optional key is not found, the original {key} is preserved (this helps the LLM notice missing context).
- Values are read from session state (Runner + SessionService set/merge this automatically).
Example:
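A sketch of an instruction template using the placeholder patterns above (the key names are illustrative):

```go
// Placeholders are resolved from session state on each model request.
researchAgent := llmagent.New(
	"research-assistant",
	llmagent.WithModel(modelInstance),
	llmagent.WithInstruction(
		// {user:user_name?} is optional: empty string if missing.
		// {research_topic} is required: left as-is if missing, so the
		// LLM can notice the absent context.
		"You are helping user {user:user_name?}. Focus on the topic: {research_topic}.",
	),
)
// Elsewhere, write the value the placeholder reads, e.g. via the
// SessionService or invocation.Session.SetState("research_topic", ...).
```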
See also:
- Examples: examples/placeholder, examples/outputkey
- Session API: docs/mkdocs/en/session/index.md
Using Runner to Execute Agent
Use Runner to execute the Agent, which is the recommended usage:
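A self-contained sketch, assuming the framework's published module path and the API shapes described in this document (runner.NewRunner, Runner.Run returning an event channel); adjust to your version:

```go
package main

import (
	"context"
	"fmt"
	"log"

	// Assumed module path; see the framework repository for your version.
	"trpc.group/trpc-go/trpc-agent-go/agent/llmagent"
	"trpc.group/trpc-go/trpc-agent-go/model"
	"trpc.group/trpc-go/trpc-agent-go/model/openai"
	"trpc.group/trpc-go/trpc-agent-go/runner"
)

func main() {
	modelInstance := openai.New("gpt-4o-mini")
	llmAgent := llmagent.New("assistant",
		llmagent.WithModel(modelInstance),
		llmagent.WithInstruction("You are a helpful assistant."),
	)
	r := runner.NewRunner("chat-app", llmAgent)

	// Run one turn; events stream back on the returned channel.
	eventChan, err := r.Run(context.Background(),
		"user-001", "session-001", model.NewUserMessage("Hello!"))
	if err != nil {
		log.Fatalf("run failed: %v", err)
	}
	for event := range eventChan {
		if event.Error != nil {
			log.Printf("error: %s", event.Error.Message)
			continue
		}
		if event.Response != nil && len(event.Response.Choices) > 0 {
			fmt.Print(event.Response.Choices[0].Delta.Content)
		}
		if event.IsFinalResponse() {
			break
		}
	}
}
```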
Stopping an Agent Run (Cancellation)
In Go, context.Context (often named ctx) can carry:
- A cancellation signal (someone calls cancel())
- A deadline (a time limit)
The framework uses ctx to stop a running agent safely.
How to stop a running agent
Cancel the same ctx you passed into Runner.Run. Do not just stop reading the
event channel.
Ctrl+C (terminal programs)
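For terminal programs, a typical pattern is to cancel the run's context from a signal handler. A sketch (r, userID, sessionID, and query come from your surrounding program):

```go
// Cancel the run's context when the user presses Ctrl+C.
ctx, cancel := context.WithCancel(context.Background())
defer cancel()

sigCh := make(chan os.Signal, 1)
signal.Notify(sigCh, os.Interrupt)
go func() {
	<-sigCh
	cancel() // stops the running agent cooperatively
}()

// Pass the cancellable ctx into Runner.Run.
eventChan, err := r.Run(ctx, userID, sessionID, model.NewUserMessage(query))
```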
If you implement a custom Agent or Tool
Cancellation is cooperative: your code must check ctx.Done() and return.
- Custom agents should select on ctx.Done() in long loops.
- Tools should pass ctx into network/DB requests (so those calls can be canceled too).
For the full run-control guide (requestID cancel, StopError, timeouts), see
docs/mkdocs/en/runner.md.
Message Visibility Options
The Agent can dynamically manage the visibility of messages generated by other Agents and historical session messages based on different scenarios. This is configurable through relevant options. When interacting with the model, only the visible content is passed as input.
TIPS:
- Messages from different sessionIDs are never visible to each other under any circumstances. The following control strategies apply only to messages sharing the same sessionID.
- Invocation.Message is always visible regardless of the configuration.
- When the option is not configured, the default value is FullContext.
Config:
- llmagent.WithMessageFilterMode(MessageFilterMode):
- FullContext: Includes historical messages and messages generated in the current request, filtered by prefix matching with the filterKey.
- RequestContext: Only includes messages generated in the current request, filtered by prefix matching with the filterKey.
- IsolatedRequest: Only includes messages generated in the current request, filtered by exact matching with the filterKey.
- IsolatedInvocation: Only includes messages generated in the current Invocation context, filtered by exact matching with the filterKey.
Recommended Usage Examples (simplified configurations built on the advanced options below):
Advanced Usage Examples:
You can independently control the visibility of historical messages and messages
generated by other Agents for the current agent using
WithMessageTimelineFilterMode and WithMessageBranchFilterMode.
When the current agent interacts with the model, only messages satisfying both
conditions are input to the model (invocation.Message is always visible).
- WithMessageTimelineFilterMode: Controls visibility from a temporal dimension
- TimelineFilterAll: Includes historical messages and messages generated in the current request.
- TimelineFilterCurrentRequest: Only includes messages generated in the current request (one runner.Run counts as one request).
- TimelineFilterCurrentInvocation: Only includes messages generated in the current invocation context.
- WithMessageBranchFilterMode: Controls visibility by FilterKey hierarchy.
- BranchFilterModePrefix (default): Hierarchical match between Event.FilterKey
and Invocation.eventFilterKey (ancestors/self/descendants all count).
- BranchFilterModeSubtree: Only include Invocation.eventFilterKey and its
descendants (no ancestors).
- BranchFilterModeExact: Only include events where
Event.FilterKey == Invocation.eventFilterKey.
- BranchFilterModeAll: Include all events regardless of FilterKey.
Reasoning Content Mode (DeepSeek Thinking Mode)
When using models with thinking/reasoning capabilities (such as DeepSeek), the model outputs both reasoning_content (thinking chain) and content (final answer). According to DeepSeek API documentation, in multi-turn conversations, you should not send the previous turn's reasoning_content to the model.
LLMAgent provides WithReasoningContentMode to control how reasoning_content is handled in conversation history:
Available Modes:
| Mode | Constant | Description |
|---|---|---|
| Discard Previous Turns | ReasoningContentModeDiscardPreviousTurns | Discard reasoning_content from previous request turns; keep it for the current request. (Default, recommended) |
| Keep All | ReasoningContentModeKeepAll | Keep all reasoning_content in history (for debugging). |
| Discard All | ReasoningContentModeDiscardAll | Discard all reasoning_content from history for maximum bandwidth savings. |
Usage Example:
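A sketch using the mode constants listed above (deepseekModel stands in for a reasoning-capable model instance):

```go
agent := llmagent.New(
	"deepseek-assistant",
	llmagent.WithModel(deepseekModel),
	// Default behavior, shown explicitly for clarity: drop
	// reasoning_content from previous turns, keep it within
	// the current request's tool-call loop.
	llmagent.WithReasoningContentMode(
		llmagent.ReasoningContentModeDiscardPreviousTurns),
)
```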
How It Works:
- keep_all: All reasoning_content is preserved in session history. Use this if you need to retain thinking chains for debugging or analysis.
- discard_previous_turns: When building the message list for a new request, reasoning_content from messages belonging to previous requests is cleared. Messages within the current request (e.g., during tool call loops) retain their reasoning_content. This follows DeepSeek's recommendation.
- discard_all: All reasoning_content is stripped from historical messages before sending to the model.
Note: This option only affects how historical messages are processed before sending to the model. The current response's reasoning_content is always captured and stored in session events.
Delegation Visibility Options
When building multi‑Agent systems (task delegation between Agents), LLMAgent provides a unified fallback option for delegation events. Transfer events always include announcement text and are tagged transfer so UIs (User Interfaces) can filter them if desired.
- llmagent.WithDefaultTransferMessage(string): configures the default message used when a model calls a SubAgent without a message. Pass an empty string to disable injecting a default message; pass a non-empty string to enable and override it.
Usage example:
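A sketch using the option described above. The WithSubAgents call is an assumed way to mount sub-Agents; see the Multi-Agent document for your version's exact API:

```go
coordinator := llmagent.New(
	"coordinator",
	llmagent.WithModel(modelInstance),
	// Assumed sub-agent mounting API for this sketch.
	llmagent.WithSubAgents([]agent.Agent{researchAgent, writerAgent}),
	// Fallback text when the model transfers without a message;
	// pass "" instead to disable the injected default entirely.
	llmagent.WithDefaultTransferMessage("Handing this task over to a specialist."),
)
```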
Notes:
- These options do not change the actual handoff logic; they only affect user-visible texts or whether a fallback message is injected.
- Transfer announcements are emitted as Events with Response.Object == "agent.transfer". If your UI should not display system-level notices, filter this object type at the renderer/service layer.
Post-tool Prompt Injection
When the model calls tools, the tool outputs are added to the conversation as
role=tool messages. Some models may respond with meta commentary like “based
on the tool result…”, or reveal internal process details.
To make the assistant respond more naturally after tool calls, LLMAgent injects a short “post-tool” dynamic prompt into the system message only when tool results are present.
- Default: enabled, using the built-in prompt.
- Customize the injected text: llmagent.WithPostToolPrompt("...").
- Disable injection entirely: llmagent.WithEnablePostToolPrompt(false).
Example:
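A sketch using the two options named above (weatherTool is an illustrative tool instance):

```go
agent := llmagent.New(
	"tool-assistant",
	llmagent.WithModel(modelInstance),
	llmagent.WithTools([]tool.Tool{weatherTool}),
	// Replace the built-in post-tool prompt with custom text...
	llmagent.WithPostToolPrompt(
		"Answer directly from the tool results without describing the tool call."),
	// ...or disable the injection entirely:
	// llmagent.WithEnablePostToolPrompt(false),
)
```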
Call Count Limits (Safety Mechanism)
To prevent Agents from entering infinite loops or consuming excessive resources, LLMAgent provides two optional call count limit configurations:
Available Configurations:
| Configuration | Description |
|---|---|
| llmagent.WithMaxLLMCalls(n) | Limits the maximum number of LLM calls per invocation. Takes effect when n > 0; no limit when n <= 0 (default). |
| llmagent.WithMaxToolIterations(n) | Limits the maximum number of tool-call iterations per invocation. Takes effect when n > 0; no limit when n <= 0 (default). |
Usage Example:
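A sketch applying both limits from the table above:

```go
agent := llmagent.New(
	"bounded-assistant",
	llmagent.WithModel(modelInstance),
	llmagent.WithMaxLLMCalls(10),      // at most 10 model calls per invocation
	llmagent.WithMaxToolIterations(5), // at most 5 tool-call iterations per invocation
)
```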
Behavior:
- WithMaxLLMCalls: When the LLM call count exceeds the limit, a StopError is returned and the current invocation terminates.
- WithMaxToolIterations: When the tool iteration count exceeds the limit, a flow_error response event is emitted and the invocation ends. It does not return a StopError.
- Both limits are independent and can be used separately or together.
- These limits are per-invocation; different runner.Run() calls maintain independent counts.
Recommended Usage:
- In production environments, setting reasonable limits is recommended to prevent unexpected scenarios.
- Set limit values based on task complexity and expected behavior.
- See examples/max_limits for a complete example.
Tool Call Retry
If you want LLMAgent to retry a tool call automatically after a failure, configure llmagent.WithToolCallRetryPolicy(...).
Notes:
- The retry applies only to the current tool call. It does not rerun the whole Agent.
- It is disabled by default, so existing behavior stays unchanged unless you opt in.
- It currently applies only to CallableTool; StreamableTool is not retried yet.
- The default retry rule covers common transient raw errors such as io.EOF, io.ErrUnexpectedEOF, and network timeouts.
- If you also want to retry result-level failures, customize tool.RetryPolicy.RetryOn.
Runnable example:
Handling Event Stream
The eventChan returned by runner.Run() is an event channel. The Agent continuously sends Event objects to this channel during execution.
Each Event contains execution state information at a specific moment: LLM-generated content, tool call requests and results, error messages, etc. By iterating through the event channel, you can get real-time execution progress (see Event section below for details).
Receive execution results through the event channel:
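A sketch of the consuming loop, assuming the event fields shown in the framework's examples (eventChan comes from runner.Run):

```go
for event := range eventChan {
	if event.Error != nil {
		log.Printf("error: %s", event.Error.Message)
		continue
	}
	// Print streamed content deltas as they arrive.
	if event.Response != nil && len(event.Response.Choices) > 0 {
		fmt.Print(event.Response.Choices[0].Delta.Content)
	}
	// Exit once the current reply is complete.
	if event.IsFinalResponse() {
		break
	}
}
```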
This example uses event.IsFinalResponse() because it only cares about when
the current reply has been fully printed. If you need to wait until the whole
Runner.Run is truly over, such as when tool.response may be followed by
extra processing or when using GraphAgent, use
event.IsRunnerCompletion() as the loop exit condition instead.
The complete code for this example can be found at examples/runner
Why is Runner recommended?
- Simpler Interface: No need to create complex Invocation objects
- Integrated Services: Automatically integrates Session, Memory and other services
- Better Management: Unified management of Agent execution flow
- Production Ready: Suitable for production environment use
💡 Tip: Want to learn more about Runner's detailed usage and advanced features? Please check Runner
Advanced Usage: Direct Agent Usage
If you need more fine-grained control, you can also use the Agent interface directly, but this requires creating Invocation objects:
Core Concepts
Invocation (Advanced Usage)
Invocation is the context object for Agent execution flow, containing all information needed for a single call. Note: This is advanced usage, we recommend using Runner to simplify operations.
When to use direct calls?
- Need complete control over execution flow
- Custom Session and Memory management
- Implement special invocation logic
- Debugging and testing scenarios
Invocation State
Invocation provides a general-purpose state storage mechanism for sharing data within the lifecycle of a single invocation. This is useful for callbacks, middleware, or any scenario that requires storing temporary data at the invocation level.
Core Methods:
Features:
- Invocation-scoped: State is automatically scoped to a single invocation
- Thread-safe: Built-in RWMutex protection for concurrent access
- Lazy initialization: Memory allocated only on first use
- General-purpose: Can be used for callbacks, middleware, custom logic, and more
Usage Example:
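A sketch of invocation-scoped state inside a callback. The GetState return shape (value, ok) is an assumption of this sketch; only SetState is named in this document:

```go
// Inside a callback or custom agent that holds the invocation.
invocation.SetState("custom:request-start", time.Now())

// Later in the same invocation (assumed accessor shape).
if v, ok := invocation.GetState("custom:request-start"); ok {
	start := v.(time.Time)
	log.Printf("elapsed: %v", time.Since(start))
}
```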
Version Requirement
The structured callback API (recommended) requires trpc-agent-go >= 0.6.0.
Recommended Key Naming Convention:
- Agent callbacks: "agent:xxx"
- Model callbacks: "model:xxx"
- Tool callbacks: "tool:toolName:xxx"
- Middleware: "middleware:xxx"
- Custom logic: "custom:xxx"
For detailed usage and more examples, please refer to Callbacks.
Event
Event is the real-time feedback generated during Agent execution, reporting execution progress in real-time through Event streams.
Events mainly include the following types:
- Model conversation events
- Tool call and response events
- Agent transfer events
- Error events
The streaming nature of Events allows you to see the Agent's working process in real-time, just like having a natural conversation with a real person. You only need to iterate through the Event stream, check the content and status of each Event, and you can completely handle the Agent's execution results.
Agent Interface
The Agent interface defines the core behaviors that all Agents must implement. This interface allows you to uniformly use different types of Agents while supporting tool calls and sub-Agent management.
The framework provides multiple types of Agent implementations, including LLMAgent, ChainAgent, ParallelAgent, CycleAgent, and GraphAgent. For detailed information about different types of Agents and multi-Agent systems, please refer to Multi-Agent.
Callbacks
Callbacks provide a rich callback mechanism that allows you to inject custom logic at key points during Agent execution.
Version Requirement
The structured callback API (recommended) requires trpc-agent-go >= 0.6.0.
Callback Types
The framework provides three types of callbacks:
Agent Callbacks: Triggered before and after Agent execution
Model Callbacks: Triggered before and after model calls
Tool Callbacks: Triggered before and after tool calls
Usage Example
The callback mechanism allows you to precisely control the Agent's execution process and implement more complex business logic.
Structured Output
Structured output ensures that agent responses conform to a predefined format, making them easier to parse and process programmatically. The framework provides multiple methods for structured output, each suited for different use cases.
Comparison of Structured Output Methods
| Feature | WithStructuredOutputJSONSchema | WithStructuredOutputJSON | WithOutputSchema | WithOutputKey |
|---|---|---|---|---|
| Tool Usage | ✅ Allowed | ✅ Allowed | ❌ Disabled | ✅ Enabled |
| Schema Type | User-provided JSON Schema | Auto-generated from Go struct | User-provided JSON Schema | N/A |
| Output Type | Untyped (map/interface{}) | Typed (Go struct) | Untyped (map/interface{}) | String/Bytes |
| Schema Validation | ✅ By LLM | ✅ By LLM | ✅ By LLM | ❌ None |
| Data Location | Event.StructuredOutput | Event.StructuredOutput | Model response content | Session State |
| Primary Use Case | Flexible schema with tools | Type-safe structured output | Simple structured responses | State storage & flow control |
WithStructuredOutputJSONSchema
Provides a user-defined JSON schema for structured output while allowing tool usage. This is the most flexible option for agents that need both structured output and tool capabilities.
Notes:
- “Tools are allowed” means the agent can still make tool calls (including Skills tools like skill_load and workspace_exec).
- When the model needs tools, it may emit tool call events first and only produce the final JSON later. Only the final answer must conform to the schema and must be a single JSON object.
Example:
Best for:
- Complex agents requiring both structured output and tool usage
- Working with external JSON schemas (from APIs, databases, config files)
- Prototyping with dynamic schemas
- Gradual typing scenarios
WithStructuredOutputJSON
Auto-generates JSON schema from a Go struct and returns typed output. Provides compile-time type safety.
Notes:
- When the model needs tools, it may emit tool call events first and only produce the final JSON later. Only the final answer must conform to the schema and must be a single JSON object.
Example:
Best for:
- Type-safe applications with well-defined Go structs
- Clean code integration
- Compile-time type checking
WithOutputSchema (Legacy)
Similar to WithStructuredOutputJSONSchema but disables all tools. This is a legacy method kept for backward compatibility.
Example:
Limitations:
- ❌ Cannot use tools, function calling, or RAG
- ❌ Response in model content (needs parsing)
Migration Tip: If you need tool capabilities, migrate to WithStructuredOutputJSONSchema:
WithOutputKey
Stores agent output in session state under a specific key, useful for agent workflows where output needs to be accessed by downstream agents.
Example:
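A sketch of a two-agent handoff through session state, combining WithOutputKey with the placeholder mechanism described earlier:

```go
// The first agent stores its final reply under "research_result".
research := llmagent.New(
	"researcher",
	llmagent.WithModel(modelInstance),
	llmagent.WithOutputKey("research_result"),
)

// A downstream agent reads it back via a placeholder.
writer := llmagent.New(
	"writer",
	llmagent.WithModel(modelInstance),
	llmagent.WithInstruction("Write a summary based on: {research_result}"),
)
```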
Best for:
- Multi-agent workflows with data passing
- Session state management
- Placeholder variable access in downstream agents
Choosing the Right Method
| Scenario | Recommended Method |
|---|---|
| Need tools + structured output | WithStructuredOutputJSONSchema or WithStructuredOutputJSON |
| Type safety is critical | WithStructuredOutputJSON |
| Working with external schemas | WithStructuredOutputJSONSchema |
| Simple structured responses (no tools) | WithOutputSchema |
| Multi-agent workflows | WithOutputKey |
| Rapid prototyping | WithStructuredOutputJSONSchema |
Examples:
- examples/structuredoutput/ - Demonstrates WithStructuredOutputJSON (typed)
- examples/outputschema/ - Demonstrates WithOutputSchema (legacy)
- examples/outputkey/ - Demonstrates WithOutputKey (session state)
Advanced Usage
The framework provides advanced features like Runner, Session, and Memory for building more complex Agent systems.
Runner is the recommended usage, responsible for managing Agent execution flow, connecting Session/Memory Service capabilities, and providing a more user-friendly interface.
Session Service is used to manage session state, supporting conversation history and context maintenance.
Memory Service is used to record user preference information, supporting personalized experiences.
Recommended Reading Order:
- Runner - Learn the recommended usage
- Session - Understand session management
- Multi-Agent - Learn multi-Agent systems
Runtime Instruction Updates
You can update an Agent’s behavior-defining text at runtime without rebuilding or restarting the Agent.
What changes dynamically
- Instruction: the behavior guideline appended to the system message.
- Global Instruction (system prompt): the system-level preface prepended to the request.
Both can be updated on an existing LLMAgent instance and take effect on subsequent model requests.
Example
Notes
- Thread‑safe: the setters are concurrency‑safe and can be called while the service is handling requests.
- Mid‑turn behavior: if an Agent’s current turn triggers more than one model request (e.g., due to tool calls), updates may apply to subsequent requests in the same turn. If you need per‑run stability, set or freeze the text at the start of the run.
- Per‑run override: pass agent.WithInstruction(...) and/or agent.WithGlobalInstruction(...) to Runner.Run(...) to override prompts for a single request without mutating the Agent instance.
- Model‑specific prompts: if an Agent can switch models, use llmagent.WithModelInstructions / llmagent.WithModelGlobalInstructions (or the corresponding setters) to override prompts by model.Info().Name, falling back to the Agent defaults when no mapping exists.
- Per‑session personalization: for per‑user or per‑session data, prefer placeholders in the instruction and session state injection (see the “Placeholder Variables” section above).
Model-specific Prompts
If a single Agent can switch between different models, you can define a different Instruction/system prompt for each model.
The key used for matching is the model name returned by model.Info().Name.
Example
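A sketch assuming a map-style signature keyed by model.Info().Name; the exact signature of WithModelInstructions may differ in your version, so check examples/model/promptmap:

```go
agent := llmagent.New(
	"multi-model-assistant",
	llmagent.WithModel(defaultModel),
	// Fallback when no model-specific mapping exists.
	llmagent.WithInstruction("Default instruction."),
	// Assumed map signature, keyed by model.Info().Name.
	llmagent.WithModelInstructions(map[string]string{
		"gpt-4o-mini":   "You are running on a compact model; keep answers short.",
		"deepseek-chat": "You may use extended reasoning before answering.",
	}),
)
```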
See also: examples/model/promptmap.
Alternative: Placeholder‑Driven Dynamic System Prompts
If you’d rather not call setters, you can make the instruction itself a template and feed values via session state. The instruction processor replaces placeholders using session/app/user state on each turn.
Patterns
- Persistent per‑user value: store under user:* and reference {user:key}.
- Persistent per‑app value: store under app:* and reference {app:key}.
- Session-scoped temporary value: write into the session’s temp:* namespace and reference {temp:key} (not user:*/app:*).
Example: per‑user dynamic instruction
Example: per‑turn temp value via a before‑agent callback
Version Requirement
The structured callback API (recommended) requires trpc-agent-go >= 0.6.0.
Caveats
- In-memory UpdateUserState intentionally forbids temp:* updates; write temp:* via invocation.Session.SetState (e.g., via a callback) when you need session-scoped temporary values.
- Placeholders are resolved at request time; changing the stored value updates behavior on the next model request without recreating the agent.
Static Structure Export
The framework provides static structure export for agents. This is useful for structure inspection, visualization, configuration tools, and diagnostics that need a stable snapshot of nodes, edges, and editable surfaces.
Use agent/structure to export a normalized snapshot:
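A sketch of exporting and inspecting a snapshot. The Export signature and the Nodes field names are assumptions here; consult examples/graph/structure_export for the exact API:

```go
// Assumed signature: Export takes the root agent and returns a snapshot.
snapshot, err := structure.Export(rootAgent)
if err != nil {
	log.Fatalf("structure export failed: %v", err)
}
for _, node := range snapshot.Nodes {
	// Stable nodeIDs; usable later with WithSurfacePatchForNode.
	fmt.Println(node.ID)
}
```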
The exported snapshot contains:
- Nodes: stable static nodes in the current agent structure
- Edges: static possible connections between nodes
- Surfaces: stable editable baselines such as instruction, model, tool, and skill
For a complete example, see examples/graph/structure_export.
Override Runtime Surfaces by nodeID
In addition to run-scoped overrides such as agent.WithInstruction(...) and
agent.WithGlobalInstruction(...), the framework also supports overriding
runtime surfaces for a specific node by stable nodeID within a single
runner.Run(...) call.
This is useful when you want to:
- Temporarily change the instruction of one graph node without affecting the entire agent.
- Apply different instructions, few-shot examples, or tools to multiple nodes in the same runner.Run(...).
- Precisely target nested nodes inside chain, parallel, cycle, graph, team, and swarm structures.
Entry API:
Recommended Workflow
Use the feature in this order:
- Export a static structure snapshot with structure.Export(...).
- Find the stable target nodeID from snapshot.EntryNodeID or snapshot.Nodes.
- Build an agent.SurfacePatch and set only the surfaces you want to override for this run.
- Pass one or more agent.WithSurfacePatchForNode(...) options to runner.Run(...).
Example:
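A sketch following the workflow above. The zero-value construction of agent.SurfacePatch and the argument order of WithSurfacePatchForNode are assumptions of this sketch; verify against your version:

```go
// Build a patch and set only the surfaces to override for this run.
var patch agent.SurfacePatch
patch.SetInstruction("For this run only: respond in formal English.")
patch.SetModel(altModel)

// nodeID was taken from a previous structure.Export(...) snapshot.
eventChan, err := r.Run(
	ctx, userID, sessionID, model.NewUserMessage("Summarize the report."),
	agent.WithSurfacePatchForNode(nodeID, patch),
)
```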
Notes:
- Prefer nodeID values exported by structure.Export(...) instead of handwritten paths.
- The override only applies to the current runner.Run(...) call and does not mutate the static agent definition.
- agent.WithInstruction(...) and agent.WithGlobalInstruction(...) still work and are appropriate when you want one run-scoped override for the root agent. Use WithSurfacePatchForNode(...) when you need per-node control.
Patchable Surfaces
SurfacePatch currently provides these setters:
- SetInstruction(string): overrides the node instruction for this run.
- SetGlobalInstruction(string): overrides the node global instruction for this run.
- SetFewShot([][]model.Message): overrides the node few-shot examples for this run.
- SetModel(model.Model): overrides the model instance used by the node for this run.
- SetTools([]tool.Tool): overrides the node tool surface for this run. This means replacing the tool set visible at runtime, not merely changing tool descriptions.
- SetSkillRepository(skill.Repository): overrides the node skill repository for this run. Passing nil explicitly disables the node skill surface.
Not every node supports every surface. Common cases are:
- Root LLMAgent: supports instruction, global_instruction, few_shot, model, tool, and skill.
- Graph LLM node: supports instruction, few_shot, model, and tool.
- Graph Tools node: supports tool.
- Child nodes inside graph sub-agents, chain, parallel, cycle, team, and swarm: support depends on the actual target node you patch.
Override Multiple Nodes in One Run
To patch multiple nodes in the same runner.Run(...), pass multiple
agent.WithSurfacePatchForNode(...) options. There is no separate batch API.
When WithSurfacePatchForNode(...) is passed multiple times for the same
nodeID, the framework merges by surface type:
- Different surface types are combined.
- For the same surface type, the later option overrides the earlier one.
Composite and Nested Structures
The same pattern applies beyond a single LLMAgent:
- graph: patch graph LLM nodes, Tools nodes, and the root nodes of child agents mounted inside the graph.
- chain, parallel, cycle: patch any exported child node.
- team and swarm: patch the coordinator, member nodes, and members reached after transfer.
The most robust workflow is still:
- Export with structure.Export(...).
- Select the exported nodeID.
- Pass the patch through WithSurfacePatchForNode(...).
This keeps one rule easy to remember: nodeID comes from structure export, and the runtime patch is supplied through WithSurfacePatchForNode(...).
Execution Trace
The framework can export an execution trace for a single runner.Run call. This is useful for understanding which nodes actually ran, how steps depend on each other, and what each step saw as input and output.
Enable it explicitly when you start a run:
Execution traces are attached to the runner completion event as an in-memory artifact. They are not serialized by default.
Each recorded step carries stable fields such as:
- NodeID: the static node path for the executed node
- PredecessorStepIDs: the direct step dependencies in this run
- Input and Output: stable text snapshots captured for the step
- Error: the terminal step error, when the step fails
For a complete example, see examples/graph/execution_trace.