Memory Usage Guide
Overview
The Memory module is the memory management system in the tRPC-Agent-Go framework, providing Agents with persistent memory and context management capabilities. By integrating memory services, session management, and memory tools, the Memory system helps Agents remember user information, maintain dialog context, and provide personalized response experiences across multiple conversations.
Positioning
Memory manages long-term user information, isolated by the `<appName, userID>` dimension. It can be understood as a "personal profile" gradually accumulated for a single user.
In cross-session scenarios, Memory enables the system to retain key user information, avoiding repetitive information gathering in each session.
It is suitable for recording stable, reusable facts such as "user name is John", "occupation is backend engineer", "prefers concise answers", "commonly used language is English", and directly using this information in subsequent interactions.
Two Memory Modes
Memory supports two modes for creating and managing memories. Choose based on your scenario:
Auto Mode is available when an Extractor is configured and is recommended as the default choice.
| Aspect | Agentic Mode (Tools) | Auto Mode (Extractor) |
|---|---|---|
| How it works | Agent decides when to call memory tools | System extracts memories automatically from conversations |
| User experience | Visible - user sees tool calls | Transparent - memories created silently in background |
| Control | Agent has full control over what to remember | Extractor decides based on conversation analysis |
| Available tools | All 6 tools | memory_search by default; configurable memory_load; enabled write tools can be exposed |
| Processing | Synchronous - during response generation | Asynchronous - background workers after response |
| Best for | Precise control, user-driven memory management | Natural conversations, hands-off memory building |
Selection Guide:
- Agentic Mode: Agent automatically decides when to call memory tools based on conversation content (e.g., when user mentions personal information or preferences), user sees tool calls, suitable for scenarios requiring precise control over memory content
- Auto Mode (recommended): Natural conversation flow, system passively learns about users, simplified UX
Core Values
- Context Continuity: Maintain user history across sessions, avoiding repetitive questioning and input.
- Personalized Service: Provide customized responses and suggestions based on long-term user profiles and preferences.
- Knowledge Accumulation: Transform facts and experiences from conversations into reusable knowledge.
- Persistent Storage: Support multiple storage backends to ensure data safety and reliability.
Use Cases
The Memory module is suitable for scenarios requiring cross-session user information and context retention:
Use Case 1: Personalized Customer Service Agent
Requirement: Customer service Agent needs to remember user information, historical issues, and preferences for consistent service.
Implementation:
- First conversation: Agent uses `memory_add` to record name, company, and contact info
- Record user preferences such as "prefers concise answers" and "technical background"
- Subsequent sessions: Agent uses `memory_load` to load user info, so no repeated questions are needed
- After resolving issues: Use `memory_update` to update issue status
Use Case 2: Learning Companion Agent
Requirement: Educational Agent needs to track student learning progress, knowledge mastery, and interests.
Implementation:
- Use `memory_add` to record mastered knowledge points
- Use topic tags for categorization: `["math", "geometry"]`, `["programming", "Python"]`
- Use `memory_search` to query related knowledge and avoid repeated teaching
- Adjust teaching strategies based on memories to provide personalized learning paths
Use Case 3: Project Management Agent
Requirement: Project management Agent needs to track project information, team members, and task progress.
Implementation:
- Record key project info: `memory_add("Project X uses Go language", ["project", "tech-stack"])`
- Record team member roles: `memory_add("John Doe is backend lead", ["team", "role"])`
- Use `memory_search` to quickly find relevant information
- After project completion: Use `memory_clear` to clear temporary information
Quick Start
Requirements
- Go 1.21 or later.
- A valid LLM API key (OpenAI-compatible endpoint).
- Redis service (optional; needed only for the Redis backend in production).
Environment Variables
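The examples in this guide expect an OpenAI-compatible endpoint. The variable names below are conventional placeholders, not names mandated by the framework; check your model provider's and deployment's documentation.

```shell
# Hypothetical variable names; adjust to your provider and setup.
export OPENAI_API_KEY="sk-..."                       # LLM API key (required)
export OPENAI_BASE_URL="https://api.openai.com/v1"   # OpenAI-compatible endpoint
export REDIS_ADDR="127.0.0.1:6379"                   # optional, only for the Redis backend
```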
Agentic Mode Configuration (Optional)
In Agentic mode, the Agent automatically decides when to call memory tools based on conversation content to manage memories. Configuration involves three steps:
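The three steps can be sketched as follows. This is a hedged sketch, not the authoritative example: the import paths and constructor signatures are assumptions based on the framework layout described in this guide, while the option names (`WithTools`, `WithMemoryService`) come from the text itself.

```go
package main

import (
	// Import paths are assumptions; adjust to your tRPC-Agent-Go version.
	"trpc.group/trpc-go/trpc-agent-go/agent/llmagent"
	memoryinmemory "trpc.group/trpc-go/trpc-agent-go/memory/inmemory"
	"trpc.group/trpc-go/trpc-agent-go/runner"
)

func main() {
	// Step 1: create a memory service (in-memory backend for brevity).
	memoryService := memoryinmemory.NewMemoryService()

	// Step 2: register the memory tools so the Agent can call them.
	agent := llmagent.New("assistant",
		llmagent.WithTools(memoryService.Tools()),
	)

	// Step 3: attach the memory service to the Runner.
	r := runner.NewRunner("my-app", agent,
		runner.WithMemoryService(memoryService),
	)
	_ = r
}
```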
Conversation example:
Auto Mode Configuration (Recommended)
In Auto mode, an LLM-based extractor analyzes conversations and automatically creates memories. The only difference from Agentic mode is in Step 1: add an Extractor.
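The Step 1 difference can be sketched like this. The `extractor.New(extractorModel)` constructor is hypothetical, shown only to illustrate where the extractor plugs in; the `WithExtractor` option name comes from this guide.

```go
// Auto mode differs from Agentic mode only in Step 1: attach an extractor
// when creating the service. The extractor constructor below is hypothetical.
ext := extractor.New(extractorModel) // hypothetical LLM-based extractor constructor
memoryService := memoryinmemory.NewMemoryService(
	memoryinmemory.WithExtractor(ext),
)
// Steps 2 and 3 are identical to Agentic mode:
//   llmagent.WithTools(memoryService.Tools())
//   runner.WithMemoryService(memoryService)
```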
Conversation example:
Configuration Comparison
| Step | Agentic Mode | Auto Mode |
|---|---|---|
| Step 1 | `NewMemoryService()` | `NewMemoryService(WithExtractor(ext))` |
| Step 2 | `WithTools(memoryService.Tools())` | `WithTools(memoryService.Tools())` |
| Step 3 | `WithMemoryService(memoryService)` | `WithMemoryService(memoryService)` |
| Available tools | add/update/delete/clear/search/load | search by default; load configurable; enabled write tools can be exposed |
| Memory creation | Agent actively calls tools | Background auto extraction |
Core Concepts
The memory module is the core of tRPC-Agent-Go's memory management. It provides complete memory storage and retrieval capabilities with a modular design that supports multiple storage backends and memory tools.
Usage Guide
Integrate with Agent
Use a two-step approach to integrate the Memory Service with an Agent:
- Register tools: Use `llmagent.WithTools(memoryService.Tools())` to register memory tools with the Agent
- Set service: Use `runner.WithMemoryService(memoryService)` to set the memory service in the Runner
Memory Service
Configure the memory service in code. Supported backends are in-memory, SQLite, Redis, MySQL, and PostgreSQL, plus three vector-search backends: sqlitevec, mysqlvec, and pgvector.
Configuration Example
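A hedged sketch of choosing a backend follows. The constructor names (`NewMemoryService`, `NewService`) and package paths are assumptions based on this guide's terminology; the option names come from the backend sections below.

```go
// In-memory backend: zero config, no persistence (dev/test).
memSvc := memoryinmemory.NewMemoryService(
	memoryinmemory.WithMemoryLimit(100),
)

// Redis backend: persistent, suitable for production.
redisSvc, err := memoryredis.NewService(
	memoryredis.WithRedisClientURL("redis://127.0.0.1:6379"),
)
if err != nil {
	log.Fatalf("create redis memory service: %v", err)
}
defer redisSvc.Close()
_ = memSvc
```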
Memory Tool Configuration
The memory service provides 6 tools. Common tools are enabled by default, while dangerous operations require manual enabling.
Tool List
| Tool | Function | Agentic Mode | Auto Extraction Mode | Description |
|---|---|---|---|---|
| `memory_add` | Add new memory | ✅ Default | ⚙️ Hidden by default | Create new memory entry |
| `memory_update` | Update memory | ✅ Default | ⚙️ Hidden by default | Modify existing memory |
| `memory_search` | Search memory | ✅ Default | ✅ Default | Find by keywords |
| `memory_load` | Load memories | ✅ Default | ⚙️ Configurable | Load recent memories |
| `memory_delete` | Delete memory | ⚙️ Configurable | ⚙️ Hidden by default | Delete single memory |
| `memory_clear` | Clear memories | ⚙️ Configurable | ⚙️ Disabled by default | Delete all memories |
Notes:
- Agentic Mode: Agent actively calls tools to manage memory; all tools are configurable.
  - Default enabled tools: `memory_add`, `memory_update`, `memory_search`, `memory_load`
  - Default disabled tools: `memory_delete`, `memory_clear`
- Auto Mode: the LLM extractor handles write operations in the background. `Tools()` exposes `memory_search` by default; `memory_load` can be enabled; `WithAutoMemoryExposedTools()` can selectively expose enabled write tools for hybrid usage.
  - Default enabled tools: `memory_add`, `memory_update`, `memory_delete`, `memory_search`
  - Default disabled tools: `memory_load`, `memory_clear`
  - Hidden by default (enabled for the extractor but not exposed via `Tools()`): `memory_add`, `memory_update`, `memory_delete`
- Default: available immediately when the service is created; no extra configuration needed.
- Configurable: can be enabled/disabled via `WithToolEnabled()`; in Auto mode, enabled write tools can be exposed via `WithAutoMemoryExposedTools()`.
Enable/Disable Tools
Note: WithToolEnabled() controls whether a memory operation is available at
all. WithAutoMemoryExposedTools() controls which enabled tools are returned
from Tools() for the Agent to call in Auto mode. Write tools remain hidden by
default unless you expose them explicitly.
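A hedged sketch of toggling tools follows. Tool names are passed as strings matching the tool list above; your framework version may expose constants for them instead, and the constructor/package names are assumptions.

```go
memoryService := memoryinmemory.NewMemoryService(
	// Turn on a dangerous tool that is disabled by default.
	memoryinmemory.WithToolEnabled("memory_delete", true),
	// Turn off a default tool you do not want the Agent to use.
	memoryinmemory.WithToolEnabled("memory_load", false),
)
```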
Overwrite Semantics (IDs and duplicates)
- Memory IDs are generated from memory content + sorted topics + appName + userID. Adding the same content and topics for the same user is idempotent and overwrites the existing entry (not append). UpdatedAt is refreshed.
- If you need append semantics or different duplicate-handling strategies, you can implement custom tools or extend the service with policy options (e.g. allow/overwrite/ignore).
Custom Tool Implementation
Note: In Auto mode, Tools() exposes memory_search by default, memory_load
when enabled, and any additional enabled tools you explicitly expose with
WithAutoMemoryExposedTools(). Dangerous operations like memory_clear should usually
stay application-controlled.
You can override default tools with custom implementations. See
memory/tool/tool.go for reference on how to implement custom tools.
Full Example
Below is a complete interactive chat example demonstrating memory capabilities in action.
Run the Example
Interactive Demo
Code Example
For full code, see examples/memory. Core implementation:
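A condensed sketch of the interactive loop is shown below. The authoritative version lives in examples/memory; the import paths, the `Run` signature, and the event type here are assumptions and may differ in your framework version.

```go
package main

import (
	"bufio"
	"context"
	"fmt"
	"os"

	// Import paths are assumptions; adjust to your tRPC-Agent-Go version.
	"trpc.group/trpc-go/trpc-agent-go/agent/llmagent"
	memoryinmemory "trpc.group/trpc-go/trpc-agent-go/memory/inmemory"
	"trpc.group/trpc-go/trpc-agent-go/model"
	"trpc.group/trpc-go/trpc-agent-go/runner"
)

func main() {
	// Memory service with default tools enabled (Agentic mode).
	memoryService := memoryinmemory.NewMemoryService()

	agent := llmagent.New("memory-chat",
		llmagent.WithTools(memoryService.Tools()),
	)
	r := runner.NewRunner("memory-demo", agent,
		runner.WithMemoryService(memoryService),
	)

	scanner := bufio.NewScanner(os.Stdin)
	fmt.Print("> ")
	for scanner.Scan() {
		// One conversation turn; the Agent may call memory tools while responding.
		events, err := r.Run(context.Background(), "user-1", "session-1",
			model.NewUserMessage(scanner.Text()))
		if err != nil {
			fmt.Println("run error:", err)
			continue
		}
		for ev := range events {
			_ = ev // stream or print events as needed
		}
		fmt.Print("> ")
	}
}
```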
Storage Backends
In-Memory Storage
Use case: Development, testing, rapid prototyping
Configuration options:
- `WithMemoryLimit(limit int)`: Set memory limit per user
- `WithCustomTool(toolName, creator)`: Register custom tool implementation
- `WithToolEnabled(toolName, enabled)`: Enable/disable specific tool
Features: Zero config, high performance, no persistence
SQLite Storage
Use case: Local persistence, single-node deployments, demos
SQLite stores data in a single file. It is useful when you want persistence without operating MySQL/PostgreSQL/Redis.
Configuration options:
- `WithTableName(name)`: Table name (default "memories")
- `WithSoftDelete(enabled)`: Enable soft delete (default false)
- `WithMemoryLimit(limit)`: Memory limit per user
- `WithSkipDBInit(skip)`: Skip table initialization
- Auto mode: `WithExtractor`, `WithAsyncMemoryNum`, `WithMemoryQueueSize`, `WithMemoryJobTimeout`
- Tools: `WithCustomTool`, `WithToolEnabled`
Notes:
- This backend uses `github.com/mattn/go-sqlite3` and requires CGO.
- `NewService` owns the `*sql.DB` and closes it in `Close()`.
SQLiteVec (sqlite-vec) Storage
Use case: Local persistence + semantic memory search on a single node
SQLiteVec stores memories in a SQLite file and uses sqlite-vec to do
vector similarity search (semantic search). Compared to the plain SQLite
backend, it requires an embedder to generate embeddings.
Configuration options:
- `WithTableName(name)`: Table name (default "memories")
- `WithEmbedder(embedder)`: Text embedder for vector generation (required)
- `WithIndexDimension(dim)`: Vector dimension (default is the embedder dimension)
- `WithMaxResults(limit)`: Max search results (default 10)
- `WithSoftDelete(enabled)`: Enable soft delete (default false)
- `WithMemoryLimit(limit)`: Memory limit per user
- `WithSkipDBInit(skip)`: Skip table initialization
- Auto mode: `WithExtractor`, `WithAsyncMemoryNum`, `WithMemoryQueueSize`, `WithMemoryJobTimeout`
- Tools: `WithCustomTool`, `WithToolEnabled`
Notes:
- This backend uses `github.com/mattn/go-sqlite3` and requires CGO.
- The `sqlite-vec` extension is compiled and registered in-process via Go bindings (no external `.so`/`.dylib` download at runtime).
Redis Storage
Use case: Production, high concurrency, distributed deployment
Configuration options:
- `WithRedisClientURL(url)`: Redis connection URL (recommended)
- `WithRedisInstance(name)`: Use a pre-registered Redis instance
- `WithMemoryLimit(limit)`: Memory limit per user
- `WithKeyPrefix(prefix)`: Set a prefix for all Redis keys. When set, every key is prefixed with `prefix:`. For example, if `prefix` is `"myapp"`, the key `mem:{app:user}` becomes `myapp:mem:{app:user}`. Default is empty (no prefix). Useful for sharing a single Redis instance across multiple environments or services
- `WithCustomTool(toolName, creator)`: Register custom tool
- `WithToolEnabled(toolName, enabled)`: Enable/disable tool
- `WithExtraOptions(...options)`: Extra options passed to the Redis client
Note: WithRedisClientURL takes priority over WithRedisInstance
Key prefix example:
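A hedged sketch of using a key prefix follows; the constructor name and package path are assumptions, while the option names come from the list above.

```go
// With WithKeyPrefix("myapp"), the key mem:{app:user} is stored as
// myapp:mem:{app:user}, so several services can share one Redis instance.
memoryService, err := memoryredis.NewService(
	memoryredis.WithRedisClientURL("redis://127.0.0.1:6379"),
	memoryredis.WithKeyPrefix("myapp"),
)
if err != nil {
	log.Fatalf("create memory service: %v", err)
}
```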
MySQL Storage
Use case: Production, ACID guarantees, complex queries
Configuration options:
- `WithMySQLClientDSN(dsn)`: MySQL DSN connection string (recommended; requires `parseTime=true`)
- `WithMySQLInstance(name)`: Use a pre-registered MySQL instance
- `WithSoftDelete(enabled)`: Enable soft delete (default false)
- `WithTableName(name)`: Custom table name (default "memories")
- `WithMemoryLimit(limit)`: Memory limit per user
- `WithCustomTool(toolName, creator)`: Register custom tool
- `WithToolEnabled(toolName, enabled)`: Enable/disable tool
- `WithExtraOptions(...options)`: Extra options passed to the MySQL client
- `WithSkipDBInit(skip)`: Skip table initialization (for users without DDL permissions)
DSN example:
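The DSN follows the standard `go-sql-driver/mysql` format; `parseTime=true` is required as noted above. The constructor and package names in this sketch are assumptions, and the database name is a placeholder.

```go
// Standard go-sql-driver/mysql DSN; parseTime=true is required.
dsn := "user:password@tcp(127.0.0.1:3306)/agent_memory?parseTime=true&charset=utf8mb4"
memoryService, err := memorymysql.NewService(memorymysql.WithMySQLClientDSN(dsn))
if err != nil {
	log.Fatalf("create memory service: %v", err)
}
// Release the database connection when the service is no longer needed.
defer memoryService.Close()
```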
Table schema (auto-created):
Resource cleanup: Call Close() method to release database connection:
MySQL Vector (mysqlvec) Storage
Use case: Production, vector similarity search with MySQL + native VECTOR type
MySQL Vector stores memories in MySQL with embedding vectors for semantic similarity
search. It uses MySQL 9.0+ native VECTOR type when available, and automatically
falls back to BLOB storage with Go-side cosine similarity for older versions (8.x).
Configuration options:
- `WithMySQLClientDSN(dsn)`: MySQL DSN connection string (recommended; requires `parseTime=true`)
- `WithMySQLInstance(name)`: Use a pre-registered MySQL instance
- `WithEmbedder(embedder)`: Text embedder for vector generation (required)
- `WithSoftDelete(enabled)`: Enable soft delete (default false)
- `WithTableName(name)`: Custom table name (default "memories")
- `WithIndexDimension(dim)`: Vector dimension (default 1536)
- `WithMaxResults(limit)`: Max search results (default 15)
- `WithMemoryLimit(limit)`: Memory limit per user
- `WithCustomTool(toolName, creator)`: Register custom tool
- `WithToolEnabled(toolName, enabled)`: Enable/disable tool
- `WithExtraOptions(...options)`: Extra options passed to the MySQL client
- `WithSkipDBInit(skip)`: Skip table initialization (for users without DDL permissions)
Note: Requires MySQL 5.7.8+ (for JSON column type). Uses native VECTOR on MySQL 9.0+; falls back to BLOB + Go-side cosine similarity on MySQL 5.7/8.x. No additional vector library required.
Table schema (auto-created, MySQL 9.0+):
Resource cleanup: Call Close() method to release database connection:
PostgreSQL Storage
Use case: Production, advanced JSONB features
Configuration options:
- `WithHost` / `WithPort` / `WithUser` / `WithPassword` / `WithDatabase`: Connection parameters
- `WithSSLMode(mode)`: SSL mode (default "disable")
- `WithPostgresInstance(name)`: Use a pre-registered PostgreSQL instance
- `WithSoftDelete(enabled)`: Enable soft delete (default false)
- `WithTableName(name)`: Custom table name (default "memories")
- `WithSchema(schema)`: Specify database schema (default "public")
- `WithMemoryLimit(limit)`: Memory limit per user
- `WithCustomTool(toolName, creator)`: Register custom tool
- `WithToolEnabled(toolName, enabled)`: Enable/disable tool
- `WithExtraOptions(...options)`: Extra options passed to the PostgreSQL client
- `WithSkipDBInit(skip)`: Skip table initialization (for users without DDL permissions)
Note: Direct connection parameters take priority over WithPostgresInstance
Table schema (auto-created):
Resource cleanup: Call Close() method to release database connection:
pgvector Storage
Use case: Production, vector similarity search with PostgreSQL + pgvector
Configuration options:
- `WithHost` / `WithPort` / `WithUser` / `WithPassword` / `WithDatabase`: Connection parameters
- `WithSSLMode(mode)`: SSL mode (default "disable")
- `WithPostgresInstance(name)`: Use a pre-registered PostgreSQL instance
- `WithEmbedder(embedder)`: Text embedder for vector generation (required)
- `WithSoftDelete(enabled)`: Enable soft delete (default false)
- `WithTableName(name)`: Custom table name (default "memories")
- `WithSchema(schema)`: Specify database schema (default "public")
- `WithIndexDimension(dim)`: Vector dimension (default 1536)
- `WithMaxResults(limit)`: Max search results (default 10)
- `WithMemoryLimit(limit)`: Memory limit per user
- `WithCustomTool(toolName, creator)`: Register custom tool
- `WithToolEnabled(toolName, enabled)`: Enable/disable tool
- `WithExtraOptions(...options)`: Extra options passed to the PostgreSQL client
- `WithSkipDBInit(skip)`: Skip table initialization (for users without DDL permissions)
- `WithHNSWIndexParams(params)`: HNSW index parameters for vector search
Note: Direct connection parameters take priority over WithPostgresInstance. Requires pgvector extension to be installed in PostgreSQL.
Table schema (auto-created):
Resource cleanup: Call Close() method to release database connection:
Backend Comparison
| Feature | InMemory | SQLite | SQLiteVec | Redis | MySQL | MySQL Vec | PostgreSQL | pgvector |
|---|---|---|---|---|---|---|---|---|
| Persistence | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Distributed | ❌ | ❌ | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Transactions | ❌ | ✅ ACID | ✅ ACID | Partial | ✅ ACID | ✅ ACID | ✅ ACID | ✅ ACID |
| Queries | Simple | SQL | SQL + Vector | Medium | SQL | SQL + Vector | SQL | SQL + Vector |
| JSON | ❌ | Basic | Basic | Basic | JSON | JSON | JSONB | JSONB |
| Performance | Very High | Med-High | Med-High | High | Med-High | Med-High | Med-High | Med-High |
| Configuration | Zero | Simple | Medium | Simple | Medium | Medium | Medium | Medium |
| Soft Delete | ❌ | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | ✅ |
| Monitoring Tools | None | None | None | Rich | Very Rich | Very Rich | Very Rich | Very Rich |
| Use Case | Dev/Test | Local Persistence | Local Vector | High Concurrency | Enterprise | MySQL Vector Search | Advanced Features | Vector Search |
Register PostgreSQL Instance (Optional):
Selection Guide:
- Development/Testing: Use in-memory storage for fast iteration
- Local Development (Persistent): Use SQLite when you want persistence without operating an external database
- Local Development (Vector Search): Use SQLiteVec when you want semantic search in a single-file SQLite DB
- Production (High Performance): Use Redis storage for high concurrency scenarios
- Production (Data Integrity): Use MySQL storage when ACID guarantees and complex queries are needed
- Production (PostgreSQL): Use PostgreSQL storage when JSONB support and advanced PostgreSQL features are needed
- Production (Vector Search): Use pgvector storage when similarity search with embeddings is needed
- Hybrid Deployment: Choose different storage backends based on different application scenarios
FAQ
Difference between Memory and Session
Memory and Session solve different problems:
| Dimension | Memory | Session |
|---|---|---|
| Purpose | Long-term user profile | Temporary conversation context |
| Isolation | `<appName, userID>` | `<appName, userID, sessionID>` |
| Lifecycle | Persists across sessions | Valid within a single session |
| Content | User profile, preferences, facts | Conversation history, messages |
| Data Size | Small (tens to hundreds) | Large (tens to thousands of messages) |
| Use Case | "Remember who the user is" | "Remember what was said" |
Example:
Memory ID Idempotency
Memory ID is generated from SHA256 hash of "content + sorted topics + appName + userID". Same content produces the same ID for the same user:
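The scheme can be illustrated with a self-contained sketch. This is not the framework's actual implementation: the separator and field order inside the hash are assumptions; only the inputs (content, sorted topics, appName, userID) and the SHA256 choice come from the text above.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strings"
)

// memoryID derives a deterministic ID from content, sorted topics, appName,
// and userID. The "|" separator and field order are illustrative assumptions.
func memoryID(appName, userID, content string, topics []string) string {
	sorted := append([]string(nil), topics...)
	sort.Strings(sorted) // topic order must not change the ID
	payload := strings.Join(
		[]string{content, strings.Join(sorted, ","), appName, userID}, "|")
	sum := sha256.Sum256([]byte(payload))
	return hex.EncodeToString(sum[:])
}

func main() {
	a := memoryID("app", "u1", "prefers concise answers", []string{"style", "pref"})
	b := memoryID("app", "u1", "prefers concise answers", []string{"pref", "style"})
	fmt.Println(a == b) // prints true: same user + content + topics → same ID
}
```

Because the ID is a pure function of these four inputs, re-adding the same fact is idempotent: it overwrites the existing entry instead of appending a duplicate.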
Implications:
- ✅ Natural deduplication: Avoids redundant storage
- ✅ Idempotent operations: Repeated additions don't create multiple records
- ⚠️ Overwrite update: Cannot append same content (add timestamp or sequence number if append is needed)
Search Behavior Notes
Search behavior depends on the backend:
- For `inmemory` / `redis` / `mysql` / `postgres`: `SearchMemories` uses token matching (not semantic search).
- For `pgvector` / `mysqlvec` / `sqlitevec`: `SearchMemories` uses vector similarity search and requires an embedder.
Token matching details (non-vector backends):
- English tokenization: lowercase → filter stopwords (a, the, is, etc.) → split by spaces
- Chinese tokenization: prefers gse word segmentation, with a low-weight CJK character trigram fallback
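The English pipeline can be sketched with a minimal self-contained example. The stopword list here is a tiny illustrative subset, and the framework's real tokenizer (including CJK handling and BM25-style scoring) is more elaborate.

```go
package main

import (
	"fmt"
	"strings"
)

// tokenize sketches the English pipeline: lowercase, split on whitespace,
// drop stopwords. The stopword set below is an illustrative subset only.
func tokenize(query string) []string {
	stop := map[string]bool{
		"a": true, "an": true, "the": true,
		"is": true, "are": true, "of": true,
	}
	var tokens []string
	for _, w := range strings.Fields(strings.ToLower(query)) {
		if !stop[w] {
			tokens = append(tokens, w)
		}
	}
	return tokens
}

func main() {
	fmt.Println(tokenize("The user is a backend Engineer"))
	// prints [user backend engineer]
}
```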
Limitations (non-vector backends):
- These backends perform filtering and sorting in the application layer (O(n) complexity)
- Performance affected by data volume
- Not semantic similarity search
- Ranking uses BM25-style lexical scoring + query coverage + ordered phrase bonus, not vector semantics
Recommendations:
- Use explicit keywords and topic tags to improve hit rate
- If you need semantic similarity search, use the pgvector, mysqlvec, or sqlitevec backend
Soft Delete Considerations
Support status:
- ✅ SQLite, SQLiteVec, MySQL, MySQL Vec, PostgreSQL, pgvector: support soft delete
- ❌ InMemory, Redis: not supported (hard delete only)
Soft delete configuration:
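A hedged sketch of enabling soft delete on a SQL-backed service follows; constructor and package names are assumptions, and `dsn` is a placeholder connection string.

```go
memoryService, err := memorymysql.NewService(
	memorymysql.WithMySQLClientDSN(dsn),
	// Deletes set deleted_at instead of removing rows; queries auto-filter
	// with WHERE deleted_at IS NULL.
	memorymysql.WithSoftDelete(true),
)
if err != nil {
	log.Fatalf("create memory service: %v", err)
}
```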
Behavior differences:
| Operation | Hard Delete | Soft Delete |
|---|---|---|
| Delete | Immediate removal | Set deleted_at field |
| Query | Not visible | Auto-filtered (WHERE deleted_at IS NULL) |
| Recovery | Cannot recover | Can manually clear deleted_at |
| Storage | Saves space | Occupies space |
Migration trap:
Best Practices
Production Environment Configuration
Error Handling
Tool Enabling Strategy
Advanced Configuration
Auto Mode Configuration Options
| Option | Description | Default |
|---|---|---|
| `WithExtractor(extractor)` | Enable auto mode with LLM extractor | nil (disabled) |
| `WithAsyncMemoryNum(n)` | Number of background worker goroutines | 1 |
| `WithMemoryQueueSize(n)` | Size of memory job queue | 10 |
| `WithMemoryJobTimeout(d)` | Timeout for each extraction job | 30s |
Extraction Checkers
Checkers control when memory extraction should be triggered. By default, extraction happens on every conversation turn. Use checkers to optimize extraction frequency and reduce LLM costs.
Available Checkers
| Checker | Description | Example |
|---|---|---|
| `CheckMessageThreshold` | Triggers when accumulated messages exceed threshold | `CheckMessageThreshold(5)` - when messages > 5 |
| `CheckTimeInterval` | Triggers when time since last extraction exceeds interval | `CheckTimeInterval(3*time.Minute)` - every 3 min |
| `ChecksAll` | Combines checkers with AND logic | All checkers must pass |
| `ChecksAny` | Combines checkers with OR logic | Any checker passing triggers extraction |
Checker Configuration Examples
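A hedged sketch of combining checkers follows. The checker names come from the table above; the `extractor.New` constructor, the `WithChecker` option, and the package layout are assumptions to be verified against your version.

```go
// Extract when more than 5 messages have accumulated OR 3 minutes have
// passed since the last extraction, whichever comes first.
checker := extractor.ChecksAny(
	extractor.CheckMessageThreshold(5),
	extractor.CheckTimeInterval(3*time.Minute),
)
ext := extractor.New(extractorModel, // hypothetical constructor
	extractor.WithChecker(checker),  // hypothetical option name
)
memoryService := memoryinmemory.NewMemoryService(
	memoryinmemory.WithExtractor(ext),
)
```

Using `ChecksAny` this way reduces LLM extraction costs on chatty sessions while still bounding how stale memories can become.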
Model callbacks (before/after)
The extractor also supports injecting before/after model callbacks via model.Callbacks (structured only). This is useful for tracing, request rewriting, or short-circuiting the model call in tests.
ExtractionContext
The ExtractionContext provides information for checker decisions:
Note: Messages contains all accumulated messages since the last successful extraction. When a checker returns false, messages are accumulated and will be included in the next extraction. This ensures no conversation context is lost when using turn-based or time-based checkers.
Tool Control
In auto extraction mode, WithToolEnabled controls whether each tool is
available. memory_search is exposed through Tools() by default,
memory_load is exposed once enabled, and WithAutoMemoryExposedTools
selectively exposes enabled write tools for hybrid usage.
Front-end Tools (exposed via Tools() for agent to call):
| Tool | Default | Description |
|---|---|---|
| `memory_search` | ✅ On | Search memories by query |
| `memory_load` | ❌ Off | Load all or recent N memories |
Back-end Tools (used by extractor in background by default):
| Tool | Default | Description |
|---|---|---|
| `memory_add` | ✅ On | Add new memories (extractor uses this) |
| `memory_update` | ✅ On | Update existing memories |
| `memory_delete` | ✅ On | Delete memories |
| `memory_clear` | ❌ Off | Clear all user memories (dangerous) |
Configuration Examples:
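A hedged sketch of hybrid tool exposure in Auto mode follows. The option names come from this guide; the constructor, package names, and the assumption that `WithAutoMemoryExposedTools` takes variadic tool-name strings should be verified against your version.

```go
memoryService := memoryinmemory.NewMemoryService(
	memoryinmemory.WithExtractor(ext),                   // auto extraction in background
	memoryinmemory.WithToolEnabled("memory_load", true), // expose load alongside search
	// Let the Agent also add memories directly (hybrid usage); the
	// extractor keeps using memory_add in the background as well.
	memoryinmemory.WithAutoMemoryExposedTools("memory_add"),
)
```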
Note: WithToolEnabled and WithAutoMemoryExposedTools can be called before or after WithExtractor; the order does not matter.
Comparison: Agentic Mode vs Auto Mode
| Tool | Agentic Mode (no extractor) | Auto Mode (with extractor) |
|---|---|---|
| `memory_add` | ✅ Agent calls via `Tools()` | ⚙️ Agent calls via `Tools()` if exposed; extractor uses in background |
| `memory_update` | ✅ Agent calls via `Tools()` | ⚙️ Agent calls via `Tools()` if exposed; extractor uses in background |
| `memory_search` | ✅ Agent calls via `Tools()` | ✅ Agent calls via `Tools()` |
| `memory_load` | ✅ Agent calls via `Tools()` | ⚙️ Agent calls via `Tools()` if enabled |
| `memory_delete` | ⚙️ Agent calls via `Tools()` if enabled | ⚙️ Agent calls via `Tools()` if exposed; extractor uses in background |
| `memory_clear` | ⚙️ Agent calls via `Tools()` if enabled | ⚙️ Agent calls via `Tools()` if exposed; extractor uses in background if enabled |
Memory Preloading
Both modes support preloading memories into the system prompt:
When preloading is enabled, memories are automatically injected into the system prompt, giving the Agent context about the user without explicit tool calls.
When WithPreloadMemory(N) uses a positive value, the framework first probes
how many memories the user has. If the count is at most N, it injects all
memories. If the count is larger than N, it switches to query-aware
memory_search behavior internally and injects only the top N relevant
results for the current user message. If query extraction is empty, the
search fails, or the search returns no matches, it falls back to directly
loading up to N memories.
Injection Mechanism: Preloaded memories are merged into the existing system prompt rather than inserted as a separate system message. This ensures the request always contains a single system message, maintaining compatibility with models that have limited support for multiple system messages (e.g., Qwen3.5 series may return "System message must be at the beginning" error).
⚠️ Important Note: Setting the configuration to -1 loads all memories,
which may significantly increase Token Usage and API Costs. By default,
preloading is disabled (0), and we recommend using positive budgets (e.g., 10-50)
to balance performance and cost.
Hybrid Approach
You can combine both approaches:
- Use Auto mode for passive learning (background extraction)
- Enable search tool for explicit memory queries
- Preload memories for immediate context
External Long-Term Memory Integration (mem0)
memory/mem0 integrates mem0, an externally hosted long-term memory platform. It is suitable when you want mem0 to handle memory extraction and storage, while the Agent continues to query memories through standard tools.
Unlike the built-in backends above, memory/mem0 is not a full memory.Service implementation. It uses an ingest-first pattern: Runner forwards session transcripts to mem0 after each turn, mem0 performs extraction on the service side, and the Agent uses read-oriented tools to query the results.
Use case: Hosted long-term memory, background extraction after each turn, and no local CRUD write path.
Configuration Example
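A hedged sketch of the mem0 integration follows. The option names come from the configuration table below and the integration points above; the `mem0.NewService` constructor name, its error return, and the import paths are assumptions.

```go
mem0Svc, err := mem0.NewService(
	mem0.WithAPIKey(os.Getenv("MEM0_API_KEY")),
	mem0.WithTimeout(10*time.Second),
)
if err != nil {
	log.Fatalf("create mem0 service: %v", err)
}
defer mem0Svc.Close() // shut down background ingest workers cleanly

agent := llmagent.New("assistant",
	llmagent.WithTools(mem0Svc.Tools()), // read-oriented tools (search, optional load)
)
r := runner.NewRunner("my-app", agent,
	// Forward session transcripts to mem0 after each turn.
	// Note: WithSessionIngestor, not WithMemoryService.
	runner.WithSessionIngestor(mem0Svc),
)
_ = r
```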
Integration points:
- Register tools with `llmagent.WithTools(mem0Svc.Tools())`
- Use `runner.WithSessionIngestor(mem0Svc)` to send session transcripts to mem0
- Do not use `runner.WithMemoryService(...)` with this integration
Why WithSessionIngestor(...) Instead of WithMemoryService(...)
runner.WithMemoryService(...) is designed for built-in memory backends that implement the full memory.Service contract. In addition to read APIs, that contract includes framework-owned write semantics such as AddMemory, UpdateMemory, DeleteMemory, ClearMemories, and EnqueueAutoMemoryJob(...).
memory/mem0 has a different boundary. It does not expose the full CRUD lifecycle to the framework. Instead, it accepts a completed session transcript, forwards it to mem0 for hosted extraction, and then exposes read-oriented tools for retrieval.
Using runner.WithSessionIngestor(...) makes that boundary explicit:
- Runner sends the completed session transcript after each turn
- mem0 performs extraction and storage on the service side
- Per-request ingest fields such as `metadata`, `agent_id`, and `run_id` can be passed through `session.IngestOption`
- The integration is not mistaken for a built-in backend that supports full framework-side CRUD or preload behavior
In short, MemoryService means "the framework manages memories directly", while SessionIngestor means "the framework hands the transcript to an external memory system". mem0 matches the second model.
Configuration Options
| Option | Purpose | Default |
|---|---|---|
| `WithAPIKey(key)` | mem0 API key. Required for all requests. | required |
| `WithHost(url)` | Override the mem0 API host/base URL. | https://api.mem0.ai |
| `WithOrgProject(orgID, projectID)` | Add mem0 `org_id` / `project_id` to ingest and retrieval requests. | empty |
| `WithAsyncMode(bool)` | Controls mem0's `async_mode` flag on ingest requests. | true |
| `WithVersion(v)` | Sets the mem0 ingestion API version field. | v2 |
| `WithTimeout(d)` | HTTP timeout used by the client. | 10s |
| `WithLoadToolEnabled(bool)` | Expose `memory_load` from `Tools()`. | false |
| `WithAsyncMemoryNum(n)` | Number of background ingest workers. | 1 |
| `WithMemoryQueueSize(n)` | Queue size per ingest worker. | 10 |
| `WithMemoryJobTimeout(d)` | Timeout for queued jobs and synchronous fallback ingest. | 30s |
Notes:
- `Tools()` exposes `memory_search` by default; `memory_load` can be enabled explicitly.
- All reads remain scoped to the current `<appName, userID>`.
- Runner automatically passes session context into ingest. Custom callers can also use `session.WithIngestMetadata`, `session.WithIngestAgentID`, and `session.WithIngestRunID` when needed.
- When mem0 metadata is available, search results can still carry structured fields such as `Topics`, `Kind`, `EventTime`, `Participants`, and `Location`.
- Call `Close()` on the service so background workers shut down cleanly.
- If you need full CRUD tools or framework-side preload, use one of the built-in memory backends instead.