Session Management
Overview
tRPC-Agent-Go provides session management capabilities to maintain conversation history and context during interactions between Agents and users. The session management module supports multiple storage backends, including in-memory and Redis storage, providing flexible state persistence for Agent applications.
Key Features
- Session persistence: Save complete conversation history and context.
- Multiple storage backends: Support in-memory storage and Redis storage.
- Event tracking: Fully record all interaction events within a session.
- Multi-level storage: Support application-level, user-level, and session-level data storage.
- Concurrency safety: Built-in read-write locks ensure safe concurrent access.
- Automatic management: After specifying the Session Service in Runner, sessions are automatically created, loaded, and updated.
Core Concepts
Session Hierarchy
Data Levels
- App Data: Global shared data, such as system configuration and feature flags.
- User Data: User-level data shared across all sessions of the same user, such as user preferences.
- Session Data: Session-level data storing the context and state of a single conversation.
Usage Examples
Integrate Session Service
Use runner.WithSessionService to provide complete session management for the Agent runner; if it is not specified, in-memory session management is used by default. The Runner automatically handles session creation, loading, and updates, so no extra operations are needed and the internal details can be ignored.
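A minimal sketch of the wiring; the import paths, the agent.Agent parameter type, and the exact constructor signatures are assumptions that may differ in your version of the framework:

```go
package main

import (
	// Import paths are assumptions; adjust them to your module layout.
	"trpc.group/trpc-go/trpc-agent-go/agent"
	"trpc.group/trpc-go/trpc-agent-go/runner"
	"trpc.group/trpc-go/trpc-agent-go/session/inmemory"
)

// buildRunner wires a session service into the Runner. If
// runner.WithSessionService is omitted, an in-memory service is used by default.
func buildRunner(myAgent agent.Agent) runner.Runner {
	sessionService := inmemory.NewSessionService()
	return runner.NewRunner(
		"my-app", // application name, part of every session key
		myAgent,
		runner.WithSessionService(sessionService),
	)
}
```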
After integrating session management, the Agent gains automatic session capabilities, including:
- Automatic session persistence: Each AI interaction is automatically saved to the session.
- Context continuity: Automatically load historical conversation context to enable true multi-turn conversations.
- State management: Maintain three levels of state data: application, user, and session.
- Event stream processing: Automatically record all interaction events such as user input, AI responses, and tool calls.
Basic Session Operations
If you need to manage sessions manually (for example, to query statistics for existing sessions), you can use the APIs provided by the Session Service.
Create and Manage Sessions
GetSession
- Function: Retrieve an existing session based on AppName, UserID, and SessionID.
- Params:
  - key: the session key; must include the complete AppName, UserID, and SessionID.
  - options: optional parameters, such as session.WithEventNum(10) to limit the number of returned events.
- Returns:
  - If the session does not exist, returns nil, nil.
  - If the session exists, returns the complete session object (including merged app, user, and session state).
Usage:
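A sketch of retrieving a session; the session.Service interface name, the session.Key field names, and the Events field are assumptions based on the parameter description above:

```go
// Assumes imports: "context", "log", and the framework's session package.
func showGetSession(ctx context.Context, svc session.Service) {
	key := session.Key{ // field names are assumptions
		AppName:   "my-app",
		UserID:    "user-123",
		SessionID: "session-456",
	}
	// Limit the returned session to its 10 most recent events.
	sess, err := svc.GetSession(ctx, key, session.WithEventNum(10))
	if err != nil {
		log.Fatalf("get session: %v", err)
	}
	if sess == nil {
		log.Println("session does not exist") // GetSession returns nil, nil here
		return
	}
	log.Printf("loaded session with %d events", len(sess.Events))
}
```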
DeleteSession
- Function: Remove the specified session from storage. If the user has no other sessions, the user record is automatically cleaned up.
- Characteristics:
- Deleting a non-existent session does not produce an error.
- Automatically cleans up empty user-session mappings.
- Thread-safe operations.
Usage:
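A sketch of deleting a session, assuming the same imports and key type as the GetSession example:

```go
func showDeleteSession(ctx context.Context, svc session.Service) {
	key := session.Key{
		AppName:   "my-app",
		UserID:    "user-123",
		SessionID: "session-456",
	}
	// Deleting a non-existent session is not an error; empty user-session
	// mappings are cleaned up automatically.
	if err := svc.DeleteSession(ctx, key); err != nil {
		log.Fatalf("delete session: %v", err)
	}
}
```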
ListSessions
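The detailed description of ListSessions is not reproduced here. As a rough sketch, assuming it takes a user-scoped key (AppName and UserID) and returns all of that user's sessions:

```go
// session.UserKey and the exact ListSessions signature are assumptions.
func showListSessions(ctx context.Context, svc session.Service) {
	userKey := session.UserKey{AppName: "my-app", UserID: "user-123"}
	sessions, err := svc.ListSessions(ctx, userKey)
	if err != nil {
		log.Fatalf("list sessions: %v", err)
	}
	log.Printf("user has %d sessions", len(sessions))
}
```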
State Management
Storage Backends
In-memory Storage
Suitable for development environments and small-scale applications. Configuration examples follow the option list below.
In-memory Configuration Options
- WithSessionEventLimit(limit int): Sets the maximum number of events stored per session. Default is 1000. When the limit is exceeded, older events are evicted.
- WithSessionTTL(ttl time.Duration): Sets the TTL for session state and the event list. Default is 0 (no expiration); sessions never expire automatically.
- WithAppStateTTL(ttl time.Duration): Sets the TTL for application-level state. Default is 0 (no expiration); app state never expires automatically.
- WithUserStateTTL(ttl time.Duration): Sets the TTL for user-level state. Default is 0 (no expiration); user state never expires automatically.
- WithCleanupInterval(interval time.Duration): Sets the interval for automatic cleanup of expired data. Default is 0 (auto-determined from the TTL configuration); when any TTL is configured, the default cleanup interval is 5 minutes.
Example with full configuration:
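A sketch using every in-memory option; the inmemory import path and the session.Service return type are assumptions:

```go
// Assumes imports: "time", the framework's session and session/inmemory packages.
func newInMemorySessionService() session.Service {
	return inmemory.NewSessionService(
		inmemory.WithSessionEventLimit(500),          // keep at most 500 events per session
		inmemory.WithSessionTTL(30*time.Minute),      // expire session state/events after 30 minutes
		inmemory.WithAppStateTTL(24*time.Hour),       // expire app-level state after 24 hours
		inmemory.WithUserStateTTL(12*time.Hour),      // expire user-level state after 12 hours
		inmemory.WithCleanupInterval(10*time.Minute), // sweep expired data every 10 minutes
	)
}
```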
Default configuration example:
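With no options, the defaults described above apply:

```go
// Defaults: event limit 1000, no TTLs (nothing expires), and no background
// cleanup until a TTL is configured.
sessionService := inmemory.NewSessionService()
```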
Redis Storage
Suitable for production environments and distributed applications. A full configuration example follows the option list below.
Redis Configuration Options
- WithSessionEventLimit(limit int): Sets the maximum number of events stored per session. Default is 1000. When the limit is exceeded, older events are evicted.
- WithRedisClientURL(url string): Creates a Redis client from a URL. Format: redis://[username:password@]host:port[/database].
- WithRedisInstance(instanceName string): Uses a preconfigured Redis instance from storage. Note: WithRedisClientURL has higher priority than WithRedisInstance.
- WithExtraOptions(extraOptions ...interface{}): Sets extra options for the Redis session service; mainly used with customized Redis client builders and passed through to the builder.
- WithSessionTTL(ttl time.Duration): Sets the TTL for session state and the event list. Default is 0 (no expiration).
- WithAppStateTTL(ttl time.Duration): Sets the TTL for application-level state. Default is 0 (no expiration).
- WithUserStateTTL(ttl time.Duration): Sets the TTL for user-level state. Default is 0 (no expiration).
Example with full configuration:
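A sketch combining the Redis options; the redis package path and the NewService constructor name are assumptions:

```go
// Assumes imports: "log", "time", and the framework's session/redis package.
sessionService, err := redis.NewService(
	redis.WithRedisClientURL("redis://user:password@127.0.0.1:6379/0"),
	redis.WithSessionEventLimit(1000),       // keep at most 1000 events per session
	redis.WithSessionTTL(7*24*time.Hour),    // expire session state/events after 7 days
	redis.WithAppStateTTL(30*24*time.Hour),  // expire app-level state after 30 days
	redis.WithUserStateTTL(30*24*time.Hour), // expire user-level state after 30 days
)
if err != nil {
	log.Fatalf("create redis session service: %v", err)
}
```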
Configuration Reuse
If multiple components need Redis, you can configure a Redis instance and reuse the configuration across components.
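A rough sketch of the reuse pattern: register a named Redis instance once, then reference it by name via WithRedisInstance. The registration helper shown here (storage.RegisterRedisInstance and its URL option) is hypothetical and stands in for however your build registers shared Redis instances:

```go
// Hypothetical registration of a shared, named Redis instance.
storage.RegisterRedisInstance("shared-redis",
	storage.WithRedisURL("redis://127.0.0.1:6379/0"))

// Components then reuse the registered configuration by name.
sessionService, err := redis.NewService(redis.WithRedisInstance("shared-redis"))
if err != nil {
	log.Fatalf("create redis session service: %v", err)
}
```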
Redis Storage Structure
Session Summarization
Overview
As conversations grow longer, maintaining full event history can become memory-intensive and may exceed LLM context windows. The session summarization feature automatically compresses historical conversation content into concise summaries using LLM-based summarization, reducing memory usage while preserving important context for future interactions.
Key Features
- Automatic summarization: Automatically trigger summaries based on configurable conditions such as event count, token count, or time threshold.
- Incremental summarization: Only new events since the last summary are processed, avoiding redundant computation.
- LLM-powered: Uses any configured LLM model to generate high-quality, context-aware summaries.
- Non-destructive: Original events remain unchanged; summaries are stored separately.
- Asynchronous processing: Summary jobs are processed asynchronously to avoid blocking the main conversation flow.
- Customizable prompts: Configure custom summarization prompts and word limits.
Basic Usage
Configure Summarizer
Create a summarizer with an LLM model and configure trigger conditions:
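A sketch of summarizer construction. The summary package path, the NewSummarizer argument order, and the openai.New model constructor are assumptions; the option names come from the configuration section below:

```go
// Assumes imports: the framework's model/openai and session summary packages.
func newSummarizer() summary.SessionSummarizer {
	summaryModel := openai.New("gpt-4o-mini") // any configured LLM model works here

	return summary.NewSummarizer(
		summaryModel,
		summary.WithEventThreshold(20),   // summarize once 20 new events accumulate
		summary.WithMaxSummaryWords(200), // keep summaries within ~200 words
	)
}
```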
Integrate with Session Service
Attach the summarizer to your session service (in-memory or Redis):
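A sketch of attaching the summarizer to an in-memory service; the option names are described under "Session Service Options" below, and the Redis-backed service accepts the equivalent options:

```go
// In-memory session service with async summarization enabled.
sessionService := inmemory.NewSessionService(
	inmemory.WithSummarizer(newSummarizer()), // summarizer from the previous sketch
	inmemory.WithAsyncSummaryNum(2),          // 2 async worker goroutines
	inmemory.WithSummaryQueueSize(100),       // up to 100 pending summary jobs
)
```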
Automatic Summarization in Runner
Once configured, the Runner automatically triggers summarization. You can also configure the LLM agent to use summaries in context:
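A sketch of the agent- and runner-side wiring. Apart from WithAddSessionSummary and WithMaxHistoryRuns, which are discussed below, the llmagent package and the model.Model parameter type are assumptions:

```go
// Assumes chatModel is a configured model instance and sessionService is the
// summarizer-enabled service from the previous sketch.
func newAppRunner(chatModel model.Model, sessionService session.Service) runner.Runner {
	llmAgent := llmagent.New(
		"assistant",
		llmagent.WithModel(chatModel),
		llmagent.WithAddSessionSummary(true), // prepend the session summary as condensed history
		// llmagent.WithMaxHistoryRuns(10),   // only relevant when summaries are disabled
	)
	// The Runner enqueues async summary jobs automatically after appending events.
	return runner.NewRunner("my-app", llmAgent,
		runner.WithSessionService(sessionService),
	)
}
```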
How it works:
The framework provides two distinct modes for managing conversation context sent to the LLM:
Mode 1: With Summary (WithAddSessionSummary(true))
- The session summary is prepended as a system message.
- All incremental events after the summary timestamp are included (no truncation).
- This ensures complete context: condensed history (the summary) plus all new conversation since summarization.
- WithMaxHistoryRuns is ignored in this mode.
Mode 2: Without Summary (WithAddSessionSummary(false))
- No summary is prepended.
- Only the most recent MaxHistoryRuns conversation turns are included.
- When MaxHistoryRuns=0 (default), no limit is applied and all history is included.
- Use this mode for short sessions or when you want direct control over the context window size.
Context Construction Details:
Best Practices:
- For long-running sessions, use WithAddSessionSummary(true) to maintain full context while managing token usage.
- For short sessions or when testing, use WithAddSessionSummary(false) with an appropriate MaxHistoryRuns.
- The Runner automatically enqueues async summary jobs after appending events to the session.
Configuration Options
Summarizer Options
Configure the summarizer behavior with the following options:
Trigger Conditions:
- WithEventThreshold(eventCount int): Trigger summarization when the number of events exceeds the threshold. Example: WithEventThreshold(20) triggers after 20 events.
- WithTokenThreshold(tokenCount int): Trigger summarization when the total token count exceeds the threshold. Example: WithTokenThreshold(4000) triggers after 4000 tokens.
- WithTimeThreshold(interval time.Duration): Trigger summarization when the time elapsed since the last event exceeds the interval. Example: WithTimeThreshold(5*time.Minute) triggers after 5 minutes of inactivity.
Composite Conditions:
- WithChecksAll(checks ...Checker): Require all conditions to be met (AND logic). Use with Check* functions (not With*). See the combined example below.
- WithChecksAny(checks ...Checker): Trigger if any condition is met (OR logic). Use with Check* functions (not With*). See the combined example below.
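A sketch combining checkers (see the note that follows); CheckEventThreshold appears in that note, while the other Check* names are assumed to follow the same pattern:

```go
// Assumes summaryModel is a configured LLM model.
func newCompositeSummarizers(summaryModel model.Model) (allOf, anyOf summary.SessionSummarizer) {
	// AND logic: summarize only when both thresholds are exceeded.
	allOf = summary.NewSummarizer(summaryModel,
		summary.WithChecksAll(
			summary.CheckEventThreshold(20),
			summary.CheckTokenThreshold(4000), // assumed name, same pattern as CheckEventThreshold
		),
	)
	// OR logic: summarize when either condition is met.
	anyOf = summary.NewSummarizer(summaryModel,
		summary.WithChecksAny(
			summary.CheckEventThreshold(20),
			summary.CheckTimeThreshold(10*time.Minute), // assumed name
		),
	)
	return allOf, anyOf
}
```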
Note: Use Check* functions (like CheckEventThreshold) inside WithChecksAll and WithChecksAny. Use With* functions (like WithEventThreshold) as direct options to NewSummarizer. The Check* functions create checker instances, while the With* functions are option setters.
Summary Generation:
- WithMaxSummaryWords(maxWords int): Limit the summary to a maximum word count. The limit is included in the prompt to guide the model's generation. Example: WithMaxSummaryWords(150) requests summaries within 150 words.
- WithPrompt(prompt string): Provide a custom summarization prompt. The prompt must include the placeholder {conversation_text}, which is replaced with the conversation content. Optionally include {max_summary_words} for word-limit instructions.
Example with custom prompt:
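A sketch with a custom prompt; the prompt text and the model.Model parameter are illustrative assumptions:

```go
// The prompt must contain {conversation_text}; {max_summary_words} is optional.
const supportPrompt = `Summarize the following customer-support conversation in at most {max_summary_words} words.
Focus on the customer's issue, the steps already attempted, and the current resolution status.

{conversation_text}`

func newSupportSummarizer(summaryModel model.Model) summary.SessionSummarizer {
	return summary.NewSummarizer(summaryModel,
		summary.WithPrompt(supportPrompt),
		summary.WithMaxSummaryWords(150),
	)
}
```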
Session Service Options
Configure async summary processing in session services:
- WithSummarizer(s summary.SessionSummarizer): Inject the summarizer into the session service.
- WithAsyncSummaryNum(num int): Set the number of async worker goroutines for summary processing. Default is 2. More workers allow higher concurrency but consume more resources.
- WithSummaryQueueSize(size int): Set the size of the summary job queue. Default is 100. Larger queues allow more pending jobs but consume more memory.
- WithSummaryJobTimeout(timeout time.Duration) (in-memory only): Set the timeout for processing a single summary job. Default is 30 seconds.
Manual Summarization
You can manually trigger summarization using the session service APIs:
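A sketch of both calls; the exact method signatures are assumptions inferred from the parameter description below (sess is the target session object):

```go
// Asynchronous (recommended): enqueue a job for the worker pool.
// force=false respects the configured trigger conditions.
if err := sessionService.EnqueueSummaryJob(ctx, sess, session.SummaryFilterKeyAllContents, false); err != nil {
	log.Printf("enqueue summary job: %v", err)
}

// Synchronous: generate the summary immediately, ignoring trigger conditions.
if err := sessionService.CreateSessionSummary(ctx, sess, session.SummaryFilterKeyAllContents, true); err != nil {
	log.Printf("create session summary: %v", err)
}
```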
API Description:
- EnqueueSummaryJob: asynchronous summarization (recommended).
  - Background processing, non-blocking.
  - Automatic fallback to synchronous processing on failure.
  - Suitable for production environments.
- CreateSessionSummary: synchronous summarization.
  - Immediate processing, blocks the current operation.
  - Returns the result directly.
  - Suitable for debugging or scenarios requiring immediate results.
Parameter Description:
- filterKey: session.SummaryFilterKeyAllContents indicates generating a summary for the complete session.
- force:
  - false: respects the configured trigger conditions (event count, token count, time threshold, etc.); a summary is generated only when the conditions are met.
  - true: forces summary generation, ignoring all trigger condition checks regardless of session state.
Usage Scenarios:
| Scenario | API | force | Description |
|---|---|---|---|
| Normal auto-summary | Automatically called by Runner | false | Auto-generated when trigger conditions are met |
| Session end | EnqueueSummaryJob | true | Force-generate the final complete summary |
| User requests a view | CreateSessionSummary | true | Immediately generate and return |
| Scheduled batch processing | EnqueueSummaryJob | false | Batch-check and process qualifying sessions |
| Debugging/testing | CreateSessionSummary | true | Immediate execution, convenient for verification |
Retrieve Summary
Get the latest summary text from a session:
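A sketch of reading the latest summary back; the accessor name and its (text, ok) return shape are assumptions:

```go
// Hypothetical accessor on the session service.
text, ok := sessionService.GetSessionSummaryText(ctx, sess)
if !ok {
	log.Println("no summary has been generated for this session yet")
} else {
	log.Printf("session summary: %s", text)
}
```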
How It Works
- Incremental Processing: The summarizer tracks the last summarization time for each session. On subsequent runs, it only processes events that occurred after the last summary.
- Delta Summarization: New events are combined with the previous summary (prepended as a system event) to generate an updated summary that incorporates both old context and new information.
- Trigger Evaluation: Before generating a summary, the summarizer evaluates the configured trigger conditions (event count, token count, time threshold). If the conditions aren't met and force=false, summarization is skipped.
- Async Workers: Summary jobs are distributed across multiple worker goroutines using hash-based distribution. This ensures jobs for the same session are processed sequentially while different sessions can be processed in parallel.
- Fallback Mechanism: If async enqueueing fails (queue full, context cancelled, or workers not initialized), the system automatically falls back to synchronous processing.
Best Practices
- Choose appropriate thresholds: Set event/token thresholds based on your LLM's context window and conversation patterns. For GPT-4 (8K context), consider WithTokenThreshold(4000) to leave room for responses.
- Use async processing: Always use EnqueueSummaryJob instead of CreateSessionSummary in production to avoid blocking the conversation flow.
- Monitor queue sizes: If you see frequent "queue is full" warnings, increase WithSummaryQueueSize or WithAsyncSummaryNum.
- Customize prompts: Tailor the summarization prompt to your application's needs. For example, a customer support agent should focus on key issues and resolutions.
- Balance word limits: Set WithMaxSummaryWords to balance preserving context against reducing token usage. Typical values range from 100 to 300 words.
- Test trigger conditions: Experiment with different combinations of WithChecksAny and WithChecksAll to find the right balance between summary frequency and cost.
Performance Considerations
- LLM costs: Each summary generation calls the LLM. Monitor your trigger conditions to balance cost and context preservation.
- Memory usage: Summaries are stored in addition to events. Configure appropriate TTLs to manage memory in long-running sessions.
- Async workers: More workers increase throughput but consume more resources. Start with 2-4 workers and scale based on load.
- Queue capacity: Size the queue based on your expected concurrency and summary generation time.
Complete Example
Here's a complete example demonstrating all components together:
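A sketch of how the pieces fit together. Import paths, constructor names, and the Run/event-stream handling are assumptions that will need adjusting to your version of the framework:

```go
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	// Import paths are assumptions; adjust them to your module layout.
	"trpc.group/trpc-go/trpc-agent-go/agent/llmagent"
	"trpc.group/trpc-go/trpc-agent-go/model"
	"trpc.group/trpc-go/trpc-agent-go/model/openai"
	"trpc.group/trpc-go/trpc-agent-go/runner"
	"trpc.group/trpc-go/trpc-agent-go/session/inmemory"
	"trpc.group/trpc-go/trpc-agent-go/session/summary"
)

func main() {
	ctx := context.Background()

	// 1. Model used both by the agent and for summarization.
	chatModel := openai.New("gpt-4o-mini") // constructor is an assumption

	// 2. Summarizer: summarize after 20 events or 5 minutes of inactivity.
	summarizer := summary.NewSummarizer(chatModel,
		summary.WithEventThreshold(20),
		summary.WithTimeThreshold(5*time.Minute),
		summary.WithMaxSummaryWords(200),
	)

	// 3. Session service with async summarization workers.
	sessionService := inmemory.NewSessionService(
		inmemory.WithSessionEventLimit(1000),
		inmemory.WithSummarizer(summarizer),
		inmemory.WithAsyncSummaryNum(2),
		inmemory.WithSummaryQueueSize(100),
	)

	// 4. LLM agent that consumes summaries as condensed history.
	assistant := llmagent.New("assistant",
		llmagent.WithModel(chatModel),
		llmagent.WithAddSessionSummary(true),
	)

	// 5. Runner wires everything together; sessions are managed automatically.
	r := runner.NewRunner("demo-app", assistant,
		runner.WithSessionService(sessionService),
	)

	// 6. Multi-turn conversation within one session.
	userID, sessionID := "user-123", "session-456"
	for _, text := range []string{
		"Hi, my name is Alice.",
		"What did I just tell you my name was?",
	} {
		events, err := r.Run(ctx, userID, sessionID, model.NewUserMessage(text))
		if err != nil {
			log.Fatalf("run: %v", err)
		}
		for ev := range events { // drain this turn's event stream
			_ = ev // print or otherwise handle responses here
		}
		fmt.Printf("turn completed: %q\n", text)
	}
}
```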
By combining session management with session summarization, you can build stateful Agents that maintain conversation context while efficiently managing memory, delivering continuous, personalized interactions to users and keeping the system sustainable over the long term.