Session Management
Overview
tRPC-Agent-Go provides powerful session management capabilities to maintain conversation history and context information during Agent-user interactions. Through automatic persistence of conversation records, intelligent summary compression, and flexible storage backends, session management offers complete infrastructure for building stateful intelligent Agents.
Positioning
A Session manages the context of the current conversation, with isolation dimensions <appName, userID, SessionID>. It stores user messages, Agent responses, tool call results, and brief summaries generated based on this content within the conversation, supporting multi-turn question-and-answer scenarios.
Within the same conversation, it allows for seamless transitions between multiple turns of question-and-answer, preventing users from restating the same question or providing the same parameters in each turn.
🎯 Key Features
- Context Management: Automatically load conversation history for true multi-turn dialogues
- Session Summary: Automatically compress long conversation history using LLM while preserving key context and significantly reducing token consumption
- Event Limiting: Control maximum number of events stored per session to prevent memory overflow
- TTL Management: Support automatic expiration and cleanup of session data
- Multiple Storage Backends: Support Memory, Redis, PostgreSQL, MySQL, ClickHouse storage
- Concurrency Safety: Built-in read-write locks ensure safe concurrent access
- Automatic Management: Automatically handle session creation, loading, and updates after Runner integration
- Soft Delete Support: PostgreSQL/MySQL support soft delete with data recovery capability
Quick Start
Integration with Runner
tRPC-Agent-Go's session management integrates with Runner through runner.WithSessionService. Runner automatically handles session creation, loading, updates, and persistence.
Supported Storage Backends: Memory, Redis, PostgreSQL, MySQL, ClickHouse
Default Behavior: If runner.WithSessionService is not configured, Runner defaults to using memory storage (Memory), and data will be lost after process restarts.
Basic Example
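The original code sample for this section did not survive extraction; below is a minimal sketch of wiring a session service into a Runner. Import paths, the `inmemory` package name, and the model constructor are assumptions based on a typical trpc-agent-go layout; only `runner.WithSessionService` is taken from the text above.

```go
package main

import (
	"context"
	"fmt"

	// Assumed import paths; adjust to your actual module layout.
	"trpc.group/trpc-go/trpc-agent-go/agent/llmagent"
	"trpc.group/trpc-go/trpc-agent-go/model"
	"trpc.group/trpc-go/trpc-agent-go/model/openai"
	"trpc.group/trpc-go/trpc-agent-go/runner"
	"trpc.group/trpc-go/trpc-agent-go/session/inmemory"
)

func main() {
	ctx := context.Background()

	// In-memory session service: fine for development, lost on restart.
	sessService := inmemory.NewSessionService()

	// An LLM agent backed by an OpenAI-compatible model.
	agent := llmagent.New("assistant",
		llmagent.WithModel(openai.New("gpt-4o-mini")))

	// Runner with session management: history is loaded and saved automatically.
	r := runner.NewRunner("demo-app", agent,
		runner.WithSessionService(sessService))

	// Reusing the same userID/sessionID across calls preserves multi-turn context.
	events, err := r.Run(ctx, "user-1", "session-1",
		model.NewUserMessage("Hello!"))
	if err != nil {
		panic(err)
	}
	for ev := range events {
		fmt.Printf("event from %s\n", ev.Author)
	}
}
```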
Runner Automatic Capabilities
After integrating Session Service, Runner automatically provides the following capabilities without needing to manually call any Session API:
- Automatic Session Creation: Automatically create session on first conversation (generates UUID if SessionID is empty)
- Automatic Session Loading: Automatically load historical context at the start of each conversation
- Automatic Session Updates: Automatically save new events after conversation ends
- Context Continuity: Automatically inject conversation history into LLM input for multi-turn dialogues
- Automatic Summary Generation (Optional): Generate summaries asynchronously in background when trigger conditions are met, no manual intervention required
Core Capabilities
1️⃣ Context Management
The core function of session management is to maintain conversation context, ensuring the Agent can remember historical interactions and provide intelligent responses based on history.
How it Works:
- Automatically save user input and AI responses from each conversation round
- Automatically load historical events when new conversations begin
- Runner automatically injects historical context into LLM input
Default Behavior: After Runner integration, context management is fully automated without manual intervention.
2️⃣ Session Summary
As conversations continue to grow, maintaining complete event history can consume significant memory and may exceed LLM context window limits. The session summary feature uses LLM to automatically compress historical conversations into concise summaries, significantly reducing memory usage and token consumption while preserving important context.
Core Features:
- Automatic Triggering: Automatically generate summaries based on event count, token count, or time thresholds
- Incremental Processing: Only process new events since the last summary, avoiding redundant computation
- LLM-Driven: Use configured LLM model to generate high-quality, context-aware summaries
- Non-Destructive: Original events are fully preserved, summaries stored separately
- Asynchronous Processing: Execute asynchronously in background without blocking conversation flow
- Flexible Configuration: Support custom trigger conditions, prompts, and word limits
Quick Configuration:
Summary hooks (pre/post)
You can inject hooks to tweak summary input or output:
Notes:
- Pre-hook can mutate `ctx.Text` (preferred) or `ctx.Events`; post-hook can mutate `ctx.Summary`.
- Default behavior ignores hook errors; enable abort with `WithSummaryHookAbortOnError(true)`.
Context Injection Mechanism:
After enabling summary, the framework prepends the summary as a system message to the LLM input, while including all incremental events after the summary timestamp to ensure complete context:
Summary Format Customization
By default, session summaries are formatted with context tags and a note about preferring current conversation information:
Default Format:
You can customize the summary format using `WithSummaryFormatter` (available in `llmagent` and `graphagent`) to better match your specific use cases or model requirements.
Custom Format Example:
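The original example is omitted; a hedged sketch follows. Per the notes below, the formatter receives the raw summary text from the session and returns the string injected into context; the exact option signature is an assumption.

```go
// Sketch (setup omitted): simplified summary format to cut token overhead.
// mdl is your configured model; the formatter signature is assumed.
agent := llmagent.New("assistant",
	llmagent.WithModel(mdl),
	llmagent.WithAddSessionSummary(true),
	llmagent.WithSummaryFormatter(func(summary string) string {
		// Keep the summary clearly distinguishable from normal messages.
		return "[Conversation summary]\n" + summary +
			"\n(Prefer information from the current conversation.)"
	}),
)
```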
Use Cases:
- Simplified Format: Reduce token usage by using concise headings and minimal context notes
- Language Localization: Translate context notes to target language (e.g., Chinese, Japanese)
- Role-Specific Formatting: Different formats for different agent roles (assistant, researcher, coder)
- Model Optimization: Tailor format for specific model preferences (some models respond better to certain prompt structures)
Important Notes:
- The formatter function receives the raw summary text from the session and returns the formatted string
- Custom formatters should ensure the summary is clearly distinguishable from other messages
- The default format is designed to be compatible with most models and use cases
- When `WithAddSessionSummary(false)` is used, the formatter is never invoked
Important Note: When `WithAddSessionSummary(true)` is enabled, the `WithMaxHistoryRuns` parameter is ignored, and all events after the summary are fully retained.
For detailed configuration and advanced usage, see the Session Summary section.
3️⃣ Event Limiting (EventLimit)
Control the maximum number of events stored per session to prevent memory overflow from long conversations.
How it Works:
- Automatically evict oldest events (FIFO) when limit is exceeded
- Only affects storage, not business logic
- Applies to all storage backends
Configuration Example:
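The original snippet is omitted; a sketch using the in-memory backend (package path and constructor are assumptions; `WithSessionEventLimit` is documented in the Memory Storage section):

```go
// Sketch (setup omitted): cap stored events per session at 200.
// Beyond the limit, the oldest events are evicted (FIFO).
sessService := inmemory.NewSessionService(
	inmemory.WithSessionEventLimit(200),
)
```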
Recommended Configuration:
| Scenario | Recommended Value | Description |
|---|---|---|
| Short-term conversations | 100-200 | Customer service, single tasks |
| Medium-term sessions | 500-1000 | Daily assistant, multi-turn collaboration |
| Long-term sessions | 1000-2000 | Personal assistant, ongoing projects (use with summary) |
| Debug/testing | 50-100 | Quick validation, reduce noise |
4️⃣ TTL Management (Auto-Expiration)
Support setting Time To Live (TTL) for session data, automatically cleaning expired data.
Supported TTL Types:
- SessionTTL: Expiration time for session state and events
- AppStateTTL: Expiration time for application-level state
- UserStateTTL: Expiration time for user-level state
Configuration Example:
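The original snippet is omitted; a sketch of TTL configuration on the in-memory backend (package path and constructor are assumptions; option names are from the Memory Storage section):

```go
// Sketch (setup omitted): session data expires 30 minutes after last update;
// app- and user-level state lives 24 hours.
sessService := inmemory.NewSessionService(
	inmemory.WithSessionTTL(30*time.Minute),
	inmemory.WithAppStateTTL(24*time.Hour),
	inmemory.WithUserStateTTL(24*time.Hour),
	inmemory.WithCleanupInterval(5*time.Minute), // periodic expired-data scan
)
```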
Expiration Behavior:
| Storage Type | Expiration Mechanism | Auto Cleanup |
|---|---|---|
| Memory | Periodic scanning + access-time checking | Yes |
| Redis | Redis native TTL | Yes |
| PostgreSQL | Periodic scanning (soft delete or hard delete) | Yes |
| MySQL | Periodic scanning (soft delete or hard delete) | Yes |
Storage Backend Comparison
tRPC-Agent-Go provides five session storage backends to meet different scenario requirements:
| Storage Type | Use Case |
|---|---|
| Memory | Development/testing, small-scale |
| Redis | Production, distributed |
| PostgreSQL | Production, complex queries |
| MySQL | Production, complex queries |
| ClickHouse | Production, massive logs |
Memory Storage
Suitable for development environments and small-scale applications, no external dependencies required, ready to use out of the box.
Configuration Options
- `WithSessionEventLimit(limit int)`: Set maximum number of events stored per session. Default is 1000; evicts old events when exceeded.
- `WithSessionTTL(ttl time.Duration)`: Set TTL for session state and event list. Default is 0 (no expiration).
- `WithAppStateTTL(ttl time.Duration)`: Set TTL for application-level state. Default is 0 (no expiration).
- `WithUserStateTTL(ttl time.Duration)`: Set TTL for user-level state. Default is 0 (no expiration).
- `WithCleanupInterval(interval time.Duration)`: Set interval for automatic cleanup of expired data. Default is 0 (auto-determined); if any TTL is configured, the default cleanup interval is 5 minutes.
- `WithSummarizer(s summary.SessionSummarizer)`: Inject session summarizer.
- `WithAsyncSummaryNum(num int)`: Set number of summary processing workers. Default is 3.
- `WithSummaryQueueSize(size int)`: Set summary task queue size. Default is 100.
- `WithSummaryJobTimeout(timeout time.Duration)`: Set timeout for a single summary task. Default is 60 seconds.
Basic Configuration Example
Use with Summary
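The original snippet is omitted; a sketch combining the memory backend with a summarizer. The `summary` package path and `NewSummarizer` signature are assumptions; the options are those listed above.

```go
// Sketch (setup omitted): LLM-backed summarizer, triggered after 20 new
// events, summaries capped at roughly 200 words.
sum := summary.NewSummarizer(openai.New("gpt-4o-mini"),
	summary.WithEventThreshold(20),
	summary.WithMaxSummaryWords(200),
)

sessService := inmemory.NewSessionService(
	inmemory.WithSummarizer(sum),
	inmemory.WithAsyncSummaryNum(3),    // summary worker goroutines
	inmemory.WithSummaryQueueSize(100), // pending summary jobs
)
```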
Redis Storage
Suitable for production environments and distributed applications, provides high performance and auto-expiration capabilities.
Configuration Options
- `WithRedisClientURL(url string)`: Create Redis client via URL. Format: `redis://[username:password@]host:port[/database]`.
- `WithRedisInstance(instanceName string)`: Use pre-configured Redis instance. Note: `WithRedisClientURL` has higher priority than `WithRedisInstance`.
- `WithSessionEventLimit(limit int)`: Set maximum number of events stored per session. Default is 1000.
- `WithSessionTTL(ttl time.Duration)`: Set TTL for session state and events. Default is 0 (no expiration).
- `WithAppStateTTL(ttl time.Duration)`: Set TTL for application-level state. Default is 0 (no expiration).
- `WithUserStateTTL(ttl time.Duration)`: Set TTL for user-level state. Default is 0 (no expiration).
- `WithSummarizer(s summary.SessionSummarizer)`: Inject session summarizer.
- `WithAsyncSummaryNum(num int)`: Set number of summary processing workers. Default is 3.
- `WithSummaryQueueSize(size int)`: Set summary task queue size. Default is 100.
- `WithExtraOptions(extraOptions ...interface{})`: Set extra options for Redis client.
Basic Configuration Example
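The original snippet is omitted; a sketch of the Redis backend (the constructor name `NewService` and package path are assumptions; option names are from the list above):

```go
// Sketch (setup omitted): Redis-backed sessions, shared across instances,
// with native TTL-based expiry.
sessService, err := redis.NewService(
	redis.WithRedisClientURL("redis://127.0.0.1:6379/0"),
	redis.WithSessionEventLimit(1000),
	redis.WithSessionTTL(24*time.Hour),
)
if err != nil {
	log.Fatalf("create redis session service: %v", err)
}
```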
Configuration Reuse
If multiple components need to use the same Redis instance, you can register and reuse:
Use with Summary
Storage Structure
PostgreSQL Storage
Suitable for production environments and applications requiring complex queries, provides full relational database capabilities.
Configuration Options
Connection Configuration:
- `WithPostgresClientDSN(dsn string)`: PostgreSQL DSN connection string (recommended). Supports two formats:
  - Key-Value format: `host=localhost port=5432 user=postgres password=secret dbname=mydb sslmode=disable`
  - URL format: `postgres://user:password@localhost:5432/dbname?sslmode=disable`
- `WithHost(host string)`: PostgreSQL server address. Default is `localhost`.
- `WithPort(port int)`: PostgreSQL server port. Default is `5432`.
- `WithUser(user string)`: Database username. Default is `postgres`.
- `WithPassword(password string)`: Database password. Default is empty string.
- `WithDatabase(database string)`: Database name. Default is `postgres`.
- `WithSSLMode(sslMode string)`: SSL mode. Default is `disable`. Options: `disable`, `require`, `verify-ca`, `verify-full`.
- `WithPostgresInstance(name string)`: Use pre-configured PostgreSQL instance.

Priority: `WithPostgresClientDSN` > direct connection settings (`WithHost`, etc.) > `WithPostgresInstance`
Session Configuration:
- `WithSessionEventLimit(limit int)`: Maximum events per session. Default is 1000.
- `WithSessionTTL(ttl time.Duration)`: Session TTL. Default is 0 (no expiration).
- `WithAppStateTTL(ttl time.Duration)`: App state TTL. Default is 0 (no expiration).
- `WithUserStateTTL(ttl time.Duration)`: User state TTL. Default is 0 (no expiration).
- `WithCleanupInterval(interval time.Duration)`: TTL cleanup interval. Default is 5 minutes.
- `WithSoftDelete(enable bool)`: Enable or disable soft delete. Default is `true`.
Async Persistence Configuration:
- `WithEnableAsyncPersist(enable bool)`: Enable async persistence. Default is `false`.
- `WithAsyncPersisterNum(num int)`: Number of async persistence workers. Default is 10.
Summary Configuration:
- `WithSummarizer(s summary.SessionSummarizer)`: Inject session summarizer.
- `WithAsyncSummaryNum(num int)`: Number of summary processing workers. Default is 3.
- `WithSummaryQueueSize(size int)`: Summary task queue size. Default is 100.
- `WithSummaryJobTimeout(timeout time.Duration)`: Timeout for a single summary task. Default is 60 seconds.
Schema and Table Configuration:
- `WithSchema(schema string)`: Specify schema name.
- `WithTablePrefix(prefix string)`: Table name prefix.
- `WithSkipDBInit(skip bool)`: Skip automatic table creation.
Basic Configuration Example
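The original snippet is omitted; a sketch of the PostgreSQL backend (the constructor name `NewService` and package path are assumptions; option names and DSN formats are from the list above):

```go
// Sketch (setup omitted): PostgreSQL-backed sessions with soft delete.
sessService, err := postgres.NewService(
	// URL-format DSN; the key-value format is also supported (see above).
	postgres.WithPostgresClientDSN(
		"postgres://user:password@localhost:5432/mydb?sslmode=disable"),
	postgres.WithSessionTTL(7*24*time.Hour),
	postgres.WithSoftDelete(true), // deleted rows stay recoverable
)
if err != nil {
	log.Fatalf("create postgres session service: %v", err)
}
```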
Configuration Reuse
Schema and Table Prefix
PostgreSQL supports schema and table prefix configuration for multi-tenant and multi-environment scenarios:
Table Naming Rules:
| Schema | Prefix | Final Table Name |
|---|---|---|
| (none) | (none) | `session_states` |
| (none) | `app1_` | `app1_session_states` |
| `my_schema` | (none) | `my_schema.session_states` |
| `my_schema` | `app1_` | `my_schema.app1_session_states` |
Soft Delete and TTL Cleanup
Soft Delete Configuration:
Delete Behavior Comparison:
| Configuration | Delete Operation | Query Behavior | Data Recovery |
|---|---|---|---|
| `softDelete=true` | `UPDATE SET deleted_at = NOW()` | Queries include `WHERE deleted_at IS NULL`, returning only non-soft-deleted rows | Recoverable |
| `softDelete=false` | `DELETE FROM ...` | Queries return all records | Not recoverable |
TTL Auto Cleanup:
Use with Summary
Storage Structure
PostgreSQL uses relational table structure with JSON data stored using JSONB type.
For complete table definitions, see session/postgres/schema.sql
MySQL Storage
Suitable for production environments and applications requiring complex queries, MySQL is a widely used relational database.
Configuration Options
Connection Configuration:
- `WithMySQLClientDSN(dsn string)`: MySQL DSN connection string.
- `WithInstanceName(name string)`: Use pre-configured MySQL instance.
Session Configuration:
- `WithSessionEventLimit(limit int)`: Maximum events per session. Default is 1000.
- `WithSessionTTL(ttl time.Duration)`: Session TTL. Default is 0 (no expiration).
- `WithAppStateTTL(ttl time.Duration)`: App state TTL. Default is 0 (no expiration).
- `WithUserStateTTL(ttl time.Duration)`: User state TTL. Default is 0 (no expiration).
- `WithCleanupInterval(interval time.Duration)`: TTL cleanup interval. Default is 5 minutes.
- `WithSoftDelete(enable bool)`: Enable or disable soft delete. Default is `true`.
Async Persistence Configuration:
- `WithEnableAsyncPersist(enable bool)`: Enable async persistence. Default is `false`.
- `WithAsyncPersisterNum(num int)`: Number of async persistence workers. Default is 10.
Summary Configuration:
- `WithSummarizer(s summary.SessionSummarizer)`: Inject session summarizer.
- `WithAsyncSummaryNum(num int)`: Number of summary processing workers. Default is 3.
- `WithSummaryQueueSize(size int)`: Summary task queue size. Default is 100.
- `WithSummaryJobTimeout(timeout time.Duration)`: Timeout for a single summary task. Default is 60 seconds.
Table Configuration:
- `WithTablePrefix(prefix string)`: Table name prefix.
- `WithSkipDBInit(skip bool)`: Skip automatic table creation.
Basic Configuration Example
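The original snippet is omitted; a sketch of the MySQL backend (the constructor name `NewService` and package path are assumptions; the DSN follows the standard go-sql-driver format):

```go
// Sketch (setup omitted): MySQL-backed sessions with a table prefix.
sessService, err := mysql.NewService(
	// Standard go-sql-driver DSN format.
	mysql.WithMySQLClientDSN("user:password@tcp(127.0.0.1:3306)/mydb?parseTime=true"),
	mysql.WithSessionEventLimit(1000),
	mysql.WithTablePrefix("app1_"), // tables become app1_session_states, etc.
)
if err != nil {
	log.Fatalf("create mysql session service: %v", err)
}
```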
Configuration Reuse
Table Prefix
MySQL supports table prefix configuration for multi-application shared database scenarios:
Soft Delete and TTL Cleanup
Soft Delete Configuration:
Delete Behavior Comparison:
| Configuration | Delete Operation | Query Behavior | Data Recovery |
|---|---|---|---|
| `softDelete=true` | `UPDATE SET deleted_at = NOW()` | Queries include `WHERE deleted_at IS NULL`, returning only non-soft-deleted rows | Recoverable |
| `softDelete=false` | `DELETE FROM ...` | Queries return all records | Not recoverable |
TTL Auto Cleanup:
Use with Summary
Storage Structure
MySQL uses relational table structure with JSON data stored using JSON type.
For complete table definitions, see session/mysql/schema.sql
Database Migration
Migrating from Older Versions
Affected Versions: v1.2.0 and earlier
Fixed in Version: v1.2.0 and later
Background: Earlier versions of the session_summaries table had index design issues:
- The earliest version used a unique index that included the `deleted_at` column. However, in MySQL `NULL != NULL`, which means multiple records with `deleted_at = NULL` would not trigger the unique constraint.
- Later versions changed to a regular lookup index (non-unique), which also could not prevent duplicate data.

Both situations could lead to duplicate data.

Old Index (one of the following):

- `idx_*_session_summaries_unique_active (app_name, user_id, session_id, filter_key, deleted_at)` — unique index, but includes `deleted_at`
- `idx_*_session_summaries_lookup (app_name, user_id, session_id, deleted_at)` — regular index

New Index: `idx_*_session_summaries_unique_active (app_name, user_id, session_id, filter_key)` — unique index without `deleted_at`
Migration Steps:
Notes:
- If you configured `WithTablePrefix("trpc_")`, table and index names will have a prefix:
  - Table name: `trpc_session_summaries`
  - Old index name: `idx_trpc_session_summaries_lookup` or `idx_trpc_session_summaries_unique_active`
  - New index name: `idx_trpc_session_summaries_unique_active`
  - Adjust the table and index names in the SQL above according to your actual configuration.
- The new index does not include the `deleted_at` column, which means soft-deleted summary records will block new records with the same business key. Since summary data is regenerable, it is recommended to hard delete soft-deleted records during migration (Step 3). If you skip this step, you will need to handle conflicts manually.
ClickHouse Storage
Suitable for production environments and massive data scenarios, leveraging ClickHouse's powerful write throughput and data compression capabilities.
Configuration Options
Connection Configuration:
- `WithClickHouseDSN(dsn string)`: ClickHouse DSN connection string (recommended). Format: `clickhouse://user:password@host:port/database?dial_timeout=10s`
- `WithClickHouseInstance(name string)`: Use pre-configured ClickHouse instance.
- `WithExtraOptions(opts ...any)`: Set extra options for ClickHouse client.
Session Configuration:
- `WithSessionEventLimit(limit int)`: Maximum events per session. Default is 1000.
- `WithSessionTTL(ttl time.Duration)`: Session TTL. Default is 0 (no expiration).
- `WithAppStateTTL(ttl time.Duration)`: App state TTL. Default is 0 (no expiration).
- `WithUserStateTTL(ttl time.Duration)`: User state TTL. Default is 0 (no expiration).
- `WithDeletedRetention(retention time.Duration)`: Retention period for soft-deleted data. Default is 0 (disables application-level physical cleanup). When enabled, soft-deleted data is periodically cleaned up via `ALTER TABLE DELETE`. Not recommended for production environments; prefer ClickHouse table-level TTL.
- `WithCleanupInterval(interval time.Duration)`: Cleanup task interval.
Async Persistence Configuration:
- `WithEnableAsyncPersist(enable bool)`: Enable async persistence. Default is `false`.
- `WithAsyncPersisterNum(num int)`: Number of async persistence workers. Default is 10.
- `WithBatchSize(size int)`: Batch write size. Default is 100.
- `WithBatchTimeout(timeout time.Duration)`: Batch write timeout. Default is 100ms.
Summary Configuration:
- `WithSummarizer(s summary.SessionSummarizer)`: Inject session summarizer.
- `WithAsyncSummaryNum(num int)`: Number of summary processing workers. Default is 3.
- `WithSummaryQueueSize(size int)`: Summary task queue size. Default is 100.
- `WithSummaryJobTimeout(timeout time.Duration)`: Timeout for a single summary task.
Schema Configuration:
- `WithTablePrefix(prefix string)`: Table name prefix.
- `WithSkipDBInit(skip bool)`: Skip automatic table creation.
Hook Configuration:
- `WithAppendEventHook(hooks ...session.AppendEventHook)`: Add hooks for event appending.
- `WithGetSessionHook(hooks ...session.GetSessionHook)`: Add hooks for session retrieval.
Basic Configuration Example
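The original snippet is omitted; a sketch of the ClickHouse backend (the constructor name `NewService` and package path are assumptions; option names and the DSN format are from the lists above):

```go
// Sketch (setup omitted): ClickHouse-backed sessions with batched async writes.
sessService, err := clickhouse.NewService(
	clickhouse.WithClickHouseDSN(
		"clickhouse://default:password@127.0.0.1:9000/default?dial_timeout=10s"),
	clickhouse.WithEnableAsyncPersist(true),
	clickhouse.WithBatchSize(100),                     // rows per batch insert
	clickhouse.WithBatchTimeout(100*time.Millisecond), // flush interval
)
if err != nil {
	log.Fatalf("create clickhouse session service: %v", err)
}
```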
Configuration Reuse
Storage Structure
ClickHouse implementation uses ReplacingMergeTree engine to handle data updates and deduplication.
Key Features:
- ReplacingMergeTree: Uses the `updated_at` column as the version for background deduplication, keeping the latest version.
- FINAL Query: All read operations use the `FINAL` keyword (e.g., `SELECT ... FINAL`) to ensure data consistency by merging parts at query time.
- Soft Delete: Deletion is implemented by inserting a new record with a `deleted_at` timestamp. Queries filter with `deleted_at IS NULL`.
Advanced Usage
Hook Capabilities (Append/Get)
- AppendEventHook: Intercept/modify/abort events before they are stored. Useful for content safety or auditing (e.g., tagging `violation=<word>`), or for short-circuiting persistence. For filterKey usage, see the "Session Summarization / FilterKey with AppendEventHook" section below.
- GetSessionHook: Intercept/modify/filter sessions after they are read. Useful for removing tagged events or dynamically augmenting the returned session state.
- Chain-of-responsibility: Hooks call `next()` to continue; returning early short-circuits later hooks, and errors bubble up.
- Backend parity: Memory, Redis, MySQL, and PostgreSQL share the same hook interface; inject hook slices when constructing the service.
- Example: See `examples/session/hook` (code)
Direct Use of Session Service API
In most cases, you should use session management through Runner, which automatically handles all details. However, in some special scenarios (such as session management backend, data migration, statistical analysis, etc.), you may need to directly operate the Session Service.
Note: The following APIs are intended for special scenarios only; for day-to-day use, Runner is sufficient.
Query Session List
Manually Delete Session
Manually Get Session Details
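The original snippets for these operations are omitted; a sketch of reading a session directly (the `session.Key` shape and the `GetSession` signature are assumptions):

```go
// Sketch (setup omitted): load one session and walk its stored events.
sess, err := sessService.GetSession(ctx, session.Key{
	AppName:   "demo-app",
	UserID:    "user-1",
	SessionID: "session-1",
})
if err != nil {
	log.Fatalf("get session: %v", err)
}
for _, ev := range sess.Events {
	fmt.Printf("%s: %v\n", ev.Author, ev.Timestamp)
}
```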
Directly Append Events to Session
In some scenarios, you may want to directly append events to a session without invoking the model. This is useful for:
- Pre-loading conversation history from external sources
- Inserting system messages or context before the first user query
- Recording user actions or metadata as events
- Building conversation context programmatically
Important: An Event can represent both user requests and model responses. When you use Runner.Run(), the framework automatically creates events for both user messages and assistant responses.
Example: Append a User Message
Example: Append a System Message
Example: Append an Assistant Message
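The three example snippets above are omitted but share the same shape; a hedged sketch of appending a user message. Field names follow the "Event Required Fields" list; the `AppendEvent` signature and `model.RoleUser` constant are assumptions.

```go
// Sketch (setup omitted): build a minimal response and append it as an event.
rsp := &model.Response{
	Choices: []model.Choice{{
		Index:   0,
		Message: model.Message{Role: model.RoleUser, Content: "My order ID is 12345."},
	}},
}
// Parameters: invocationID, author, response (see required fields below).
evt := event.NewResponseEvent(uuid.New().String(), "user", rsp)
if err := sessService.AppendEvent(ctx, sess, evt); err != nil {
	log.Fatalf("append event: %v", err)
}
```

For a system or assistant message, the same shape applies with a different `Role` and `author`.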
Event Required Fields
When creating an event using `event.NewResponseEvent()`, the following fields are required:
- Function parameters:
  - `invocationID` (string): Unique identifier, typically `uuid.New().String()`
  - `author` (string): Event author (`"user"`, `"system"`, or agent name)
  - `response` (`*model.Response`): Response object with Choices
- Response fields:
  - `Choices` (`[]model.Choice`): At least one Choice with `Index` and `Message`
  - `Message`: Must have `Content` or `ContentParts`
- Auto-generated fields (by `event.NewResponseEvent()`):
  - `ID`: Auto-generated UUID
  - `Timestamp`: Auto-set to current time
  - `Version`: Auto-set to `CurrentVersion`
- Persistence requirements:
  - `Response != nil`
  - `!IsPartial` (or has `StateDelta`)
  - `IsValidContent()` returns `true`
How It Works with Runner
When you later use Runner.Run() with the same session:
- Runner automatically loads the session (including all appended events)
- Converts session events to messages
- Includes all messages (appended + current) in the conversation context
- Sends everything to the model together
All appended events become part of the conversation history and are available to the model in subsequent interactions.
Example: See examples/session/appendevent (code)
Session Summarization
Overview
As conversations grow longer, maintaining full event history can become memory-intensive and may exceed LLM context windows. The session summarization feature automatically compresses historical conversation content into concise summaries using LLM-based summarization, reducing memory usage while preserving important context for future interactions.
Key Features
- Automatic summarization: Automatically trigger summaries based on configurable conditions such as event count, token count, or time threshold.
- Incremental summarization: Only new events since the last summary are processed, avoiding redundant computation.
- LLM-powered: Uses any configured LLM model to generate high-quality, context-aware summaries.
- Non-destructive: Original events remain unchanged; summaries are stored separately.
- Asynchronous processing: Summary jobs are processed asynchronously to avoid blocking the main conversation flow.
- Customizable prompts: Configure custom summarization prompts and word limits.
Basic Usage
Configure Summarizer
Create a summarizer with an LLM model and configure trigger conditions:
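The original snippet is omitted; a sketch of summarizer creation (the `summary` package path and `NewSummarizer` signature are assumptions; the trigger options are documented below):

```go
// Sketch (setup omitted): summarizer with all three trigger types configured.
sum := summary.NewSummarizer(openai.New("gpt-4o-mini"),
	summary.WithEventThreshold(20),           // 20+ new events since last summary
	summary.WithTokenThreshold(4000),         // or 4000+ new tokens
	summary.WithTimeThreshold(5*time.Minute), // or 5 minutes of inactivity
	summary.WithMaxSummaryWords(200),
)
```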
Integrate with Session Service
Attach the summarizer to your session service (in-memory or Redis):
Automatic Summarization in Runner
Once configured, the Runner automatically triggers summarization. You can also configure the LLM agent to use summaries in context:
How it works:
The framework provides two distinct modes for managing conversation context sent to the LLM:
Mode 1: With Summary (WithAddSessionSummary(true))
- The session summary is inserted as a separate system message after the first existing system message (or prepended if no system message exists).
- All incremental events after the summary timestamp are included (no truncation).
- This ensures complete context: condensed history (summary) + all new conversations since summarization.
- `WithMaxHistoryRuns` is ignored in this mode.
Mode 2: Without Summary (WithAddSessionSummary(false))
- No summary is prepended.
- Only the most recent `MaxHistoryRuns` conversation turns are included.
- When `MaxHistoryRuns=0` (default), no limit is applied and all history is included.
- Use this mode for short sessions or when you want direct control over context window size.
Context Construction Details:
Best Practices:
- For long-running sessions, use `WithAddSessionSummary(true)` to maintain full context while managing token usage.
- For short sessions or when testing, use `WithAddSessionSummary(false)` with an appropriate `MaxHistoryRuns`.
- The Runner automatically enqueues async summary jobs after appending events to the session.
Configuration Options
Summarizer Options
Configure the summarizer behavior with the following options:
Trigger Conditions:
- `WithEventThreshold(eventCount int)`: Trigger summarization when the number of new events since the last summary exceeds the threshold. Example: `WithEventThreshold(20)` triggers when 20+ new events have occurred since the last summary.
- `WithTokenThreshold(tokenCount int)`: Trigger summarization when the new token count since the last summary exceeds the threshold. Example: `WithTokenThreshold(4000)` triggers when 4000+ new tokens have been added since the last summary.
- `WithTimeThreshold(interval time.Duration)`: Trigger summarization when the time elapsed since the last event exceeds the interval. Example: `WithTimeThreshold(5*time.Minute)` triggers after 5 minutes of inactivity.
Composite Conditions:
- `WithChecksAll(checks ...Checker)`: Require all conditions to be met (AND logic). Use with `Check*` functions (not `With*`).
- `WithChecksAny(checks ...Checker)`: Trigger if any condition is met (OR logic). Use with `Check*` functions (not `With*`).
Note: Use `Check*` functions (like `CheckEventThreshold`) inside `WithChecksAll` and `WithChecksAny`. Use `With*` functions (like `WithEventThreshold`) as direct options to `NewSummarizer`. The `Check*` functions create checker instances, while `With*` functions are option setters.
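The composite-condition examples were lost; a hedged sketch of OR-composed triggers. `CheckEventThreshold` is named in the note above; `CheckTokenThreshold` is inferred by analogy and may differ in the actual API.

```go
// Sketch (setup omitted): summarize when EITHER 50 new events OR 8000 new
// tokens have accumulated since the last summary. mdl is your model.
sum := summary.NewSummarizer(mdl,
	summary.WithChecksAny(
		summary.CheckEventThreshold(50),
		summary.CheckTokenThreshold(8000), // assumed name, by analogy
	),
)
```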
Summary Generation:
- `WithMaxSummaryWords(maxWords int)`: Limit the summary to a maximum word count. The limit is included in the prompt to guide the model's generation. Example: `WithMaxSummaryWords(150)` requests summaries within 150 words.
- `WithPrompt(prompt string)`: Provide a custom summarization prompt. The prompt must include the placeholder `{conversation_text}`, which will be replaced with the conversation content. Optionally include `{max_summary_words}` for word limit instructions.
- `WithSkipRecent(skipFunc SkipRecentFunc)`: Skip the most recent events during summarization using a custom function. The function receives all events and returns how many tail events to skip; return 0 to skip none. Useful for avoiding summarizing very recent or incomplete conversations, or for applying time- or content-based skipping strategies.
Tool Call Formatting:
By default, the summarizer includes tool calls and tool results in the conversation text sent to the LLM for summarization. The default format is:
- Tool calls: `[Called tool: toolName with args: {"arg": "value"}]`
- Tool results: `[toolName returned: result content]`
You can customize how tool calls and results are formatted using these options:
- `WithToolCallFormatter(f ToolCallFormatter)`: Customize how tool calls are formatted in the summary input. The formatter receives a `model.ToolCall` and returns a formatted string. Return an empty string to exclude the tool call.
- `WithToolResultFormatter(f ToolResultFormatter)`: Customize how tool results are formatted in the summary input. The formatter receives the `model.Message` containing the tool result and returns a formatted string. Return an empty string to exclude the result.
Example with custom tool formatters:
Example with custom prompt:
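The original snippet is omitted; a sketch of a custom prompt using the placeholders documented above (`{conversation_text}` required, `{max_summary_words}` optional):

```go
// Sketch (setup omitted): domain-focused summarization prompt.
const prompt = `Summarize the customer-support conversation below,
focusing on the customer's issue and its resolution,
within {max_summary_words} words.

{conversation_text}`

sum := summary.NewSummarizer(mdl, // mdl is your configured model
	summary.WithPrompt(prompt),
	summary.WithMaxSummaryWords(150),
)
```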
Session Service Options
Configure async summary processing in session services:
- `WithSummarizer(s summary.SessionSummarizer)`: Inject the summarizer into the session service.
- `WithAsyncSummaryNum(num int)`: Set the number of async worker goroutines for summary processing. Default is 3. More workers allow higher concurrency but consume more resources.
- `WithSummaryQueueSize(size int)`: Set the size of the summary job queue. Default is 100. Larger queues allow more pending jobs but consume more memory.
- `WithSummaryJobTimeout(timeout time.Duration)`: Set the timeout for processing a single summary job. Default is 60 seconds.
Manual Summarization
You can manually trigger summarization using the session service APIs:
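The original snippet is omitted; a hedged sketch of both APIs. The method names, `SummaryFilterKeyAllContents`, and the force semantics come from the description below; the exact parameter shapes (a loaded session vs. a key) are assumptions.

```go
// Sketch (setup omitted): sess is a previously loaded session.

// Async (recommended): enqueue a background job; force=true skips trigger checks.
if err := sessService.EnqueueSummaryJob(ctx, sess,
	session.SummaryFilterKeyAllContents, true); err != nil {
	log.Printf("enqueue summary: %v", err)
}

// Sync: block until the summary is generated (debugging / immediate results).
if err := sessService.CreateSessionSummary(ctx, sess,
	session.SummaryFilterKeyAllContents, true); err != nil {
	log.Printf("create summary: %v", err)
}
```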
API Description:
- `EnqueueSummaryJob`: Asynchronous summarization (recommended)
  - Background processing, non-blocking
  - Automatic fallback to sync processing on failure
  - Suitable for production environments
- `CreateSessionSummary`: Synchronous summarization
  - Immediate processing, blocks the current operation
  - Returns the result directly
  - Suitable for debugging or scenarios requiring immediate results
Parameter Description:
- filterKey: `session.SummaryFilterKeyAllContents` indicates generating a summary for the complete session
- force parameter:
  - `false`: Respects configured trigger conditions (event count, token count, time threshold, etc.); only generates a summary when conditions are met
  - `true`: Forces summary generation, ignoring all trigger condition checks; executes regardless of session state
Usage Scenarios:
| Scenario | API | force | Description |
|---|---|---|---|
| Normal auto-summary | Automatically called by Runner | `false` | Auto-generates when trigger conditions are met |
| Session end | `EnqueueSummaryJob` | `true` | Force-generate final complete summary |
| User requests view | `CreateSessionSummary` | `true` | Immediately generate and return |
| Scheduled batch processing | `EnqueueSummaryJob` | `false` | Batch-check and process qualifying sessions |
| Debug/testing | `CreateSessionSummary` | `true` | Immediate execution, convenient for verification |
Retrieve Summary
Get the latest summary text from a session:
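The original snippet is omitted; a sketch of retrieving summary text. `GetSessionSummaryText` and `WithSummaryFilterKey` are named below; the return shape (text plus a found flag) is an assumption.

```go
// Sketch (setup omitted): sess is a previously loaded session.

// Full-session summary (SummaryFilterKeyAllContents by default).
text, ok := sessService.GetSessionSummaryText(ctx, sess)
if ok {
	fmt.Println("summary:", text)
}

// Summary for a specific filter key; falls back to the full-session
// summary if that key has no summary of its own.
branchText, _ := sessService.GetSessionSummaryText(ctx, sess,
	session.WithSummaryFilterKey("branch-1"))
_ = branchText
```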
Filter Key Support:
The `GetSessionSummaryText` method supports an optional `WithSummaryFilterKey` option to retrieve summaries for specific event filters:
- When no option is provided, returns the full-session summary (`SummaryFilterKeyAllContents`)
- When a specific filter key is provided but not found, falls back to the full-session summary
- If neither exists, returns any available summary as a last resort
How It Works
- Incremental Processing: The summarizer tracks the last summarization time for each session. On subsequent runs, it only processes events that occurred after the last summary.
- Delta Summarization: New events are combined with the previous summary (prepended as a system event) to generate an updated summary that incorporates both the old context and the new information.
- Trigger Evaluation: Before generating a summary, the summarizer evaluates the configured trigger conditions (based on incremental event count, token count, and time elapsed since the last summary). If the conditions aren't met and `force=false`, summarization is skipped.
- Async Workers: Summary jobs are distributed across multiple worker goroutines using hash-based distribution. This ensures jobs for the same session are processed sequentially while different sessions are processed in parallel.
- Fallback Mechanism: If async enqueueing fails (queue full, context cancelled, or workers not initialized), the system automatically falls back to synchronous processing.
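The worker-distribution and fallback behaviors described above can be sketched in plain Go. This is a conceptual illustration, not the framework's actual internals: `summaryJob`, `workerIndex`, and `enqueueOrFallback` are simplified, hypothetical stand-ins.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// summaryJob is a hypothetical, simplified stand-in for the framework's
// internal summary job type.
type summaryJob struct{ sessionID string }

// workerIndex hashes a session ID onto one of n workers. Because the same
// session ID always maps to the same worker, jobs for one session are
// processed sequentially, while different sessions run in parallel.
func workerIndex(sessionID string, n int) int {
	h := fnv.New32a()
	h.Write([]byte(sessionID))
	return int(h.Sum32() % uint32(n))
}

// enqueueOrFallback models the fallback mechanism: try a non-blocking send
// to the worker's queue; if the queue is full, process synchronously.
func enqueueOrFallback(queue chan summaryJob, job summaryJob, sync func(summaryJob)) bool {
	select {
	case queue <- job:
		return true // accepted for async processing
	default:
		sync(job) // queue full: fall back to synchronous processing
		return false
	}
}

func main() {
	const numWorkers = 3
	// The same session always lands on the same worker.
	fmt.Println(workerIndex("session-42", numWorkers) == workerIndex("session-42", numWorkers))

	queue := make(chan summaryJob, 1)
	synced := 0
	sync := func(summaryJob) { synced++ }
	enqueueOrFallback(queue, summaryJob{"a"}, sync) // fits in the queue
	enqueueOrFallback(queue, summaryJob{"b"}, sync) // queue full: sync fallback
	fmt.Println(synced)
}
```

The non-blocking `select` with a `default` branch is the idiomatic Go pattern for "enqueue if possible, otherwise degrade gracefully", which matches the fallback semantics described above.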
Best Practices
- Choose appropriate thresholds: Set event/token thresholds based on your LLM's context window and conversation patterns. For GPT-4 (8K context), consider `WithTokenThreshold(4000)` to leave room for responses.
- Use async processing: Always use `EnqueueSummaryJob` instead of `CreateSessionSummary` in production to avoid blocking the conversation flow.
- Monitor queue sizes: If you see frequent "queue is full" warnings, increase `WithSummaryQueueSize` or `WithAsyncSummaryNum`.
- Customize prompts: Tailor the summarization prompt to your application's needs. For example, a customer support agent should focus on key issues and resolutions.
- Balance word limits: Set `WithMaxSummaryWords` to balance preserving context against reducing token usage. Typical values range from 100 to 300 words.
- Test trigger conditions: Experiment with different combinations of `WithChecksAny` and `WithChecksAll` to find the right balance between summary frequency and cost.
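The any-of versus all-of semantics behind `WithChecksAny` and `WithChecksAll` can be illustrated with plain predicate combinators. This is a conceptual sketch: the `check` type and the threshold closures here are hypothetical, and the real options operate on the framework's own check types.

```go
package main

import "fmt"

// check is a hypothetical predicate over session state; the framework's
// trigger checks (event count, token count, time threshold) have this
// shape conceptually.
type check func(eventCount, tokenCount int) bool

// anyOf fires when at least one condition is met (ChecksAny semantics).
func anyOf(checks ...check) check {
	return func(e, t int) bool {
		for _, c := range checks {
			if c(e, t) {
				return true
			}
		}
		return false
	}
}

// allOf fires only when every condition is met (ChecksAll semantics).
func allOf(checks ...check) check {
	return func(e, t int) bool {
		for _, c := range checks {
			if !c(e, t) {
				return false
			}
		}
		return true
	}
}

func main() {
	eventThreshold := func(e, t int) bool { return e >= 10 }
	tokenThreshold := func(e, t int) bool { return t >= 4000 }

	relaxed := anyOf(eventThreshold, tokenThreshold) // summarizes more often
	strict := allOf(eventThreshold, tokenThreshold)  // summarizes more rarely

	fmt.Println(relaxed(12, 500)) // event count alone is enough
	fmt.Println(strict(12, 500))  // token count not yet reached
}
```

Any-of combinations trade higher LLM cost for fresher summaries; all-of combinations do the opposite, which is the frequency-versus-cost balance the practice above refers to.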
Summarizing by Event Type
In real-world applications, you may want to generate separate summaries for different types of events. For example:
- User Message Summary: Summarize user needs and questions
- Tool Call Summary: Record which tools were used and their results
- System Event Summary: Track system state changes
To achieve this, you need to set the FilterKey field on events to identify their type.
Setting FilterKey with AppendEventHook
The recommended approach is to use AppendEventHook to automatically set FilterKey before events are persisted:
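The pattern can be sketched as follows. Note that `Event` and the hook signature here are simplified, hypothetical stand-ins for the framework's types, shown only to illustrate stamping `FilterKey` before an event is persisted.

```go
package main

import "fmt"

// Event is a simplified, hypothetical stand-in for the framework's event
// type; only the fields needed for the illustration are shown.
type Event struct {
	Author    string // e.g. "user", "tool", "system"
	FilterKey string
}

type appendEventHook func(e *Event)

// filterKeyHook returns a hook that sets FilterKey based on the event's
// author, always including the required appName + "/" prefix.
func filterKeyHook(appName string) appendEventHook {
	return func(e *Event) {
		if e.FilterKey != "" {
			return // respect an explicitly set key
		}
		switch e.Author {
		case "tool":
			e.FilterKey = appName + "/tool"
		case "system":
			e.FilterKey = appName + "/system"
		default:
			e.FilterKey = appName + "/user"
		}
	}
}

func main() {
	hook := filterKeyHook("chat-app") // "chat-app" is a placeholder appName
	e := &Event{Author: "tool"}
	hook(e)
	fmt.Println(e.FilterKey) // chat-app/tool
}
```

Centralizing the stamping logic in one hook keeps every persisted event consistently prefixed, so downstream filtering and per-type summarization see a uniform key scheme.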
Once FilterKey is set, you can generate independent summaries for different event types:
FilterKey Prefix Convention
⚠️ Important: `FilterKey` must include the `appName + "/"` prefix.
Why: The Runner uses `appName + "/"` as the filter prefix when filtering events. If your `FilterKey` lacks this prefix, events will be filtered out, causing:
- The LLM cannot see the conversation history and may repeatedly trigger tool calls
- Summary content is incomplete and loses important context
Example:
Technical Details: The framework uses prefix matching (`strings.HasPrefix`) to determine which events should be included in the context. See the `ContentRequestProcessor` filtering logic for details.
Complete Examples
See the following examples for complete FilterKey usage scenarios:
- examples/session/hook - Hook basics
- examples/summary/filterkey - Summarizing by FilterKey
Performance Considerations
- LLM costs: Each summary generation calls the LLM. Monitor your trigger conditions to balance cost and context preservation.
- Memory usage: Summaries are stored in addition to events. Configure appropriate TTLs to manage memory in long-running sessions.
- Async workers: More workers increase throughput but consume more resources. Start with 2-4 workers and scale based on load.
- Queue capacity: Size the queue based on your expected concurrency and summary generation time.
Complete Example
Here's a complete example demonstrating all components together:
References
By using session management together with the session summarization mechanism, you can build stateful intelligent Agents that maintain conversation context while efficiently managing memory, delivering continuous, personalized interactions to users and keeping the system sustainable over the long term.