| Navigation: Root AGENTS.md | Chat | Agent Memory (in-run) |
NodeTool’s long-term memory (LTM) persists durable facts across chat sessions: stable user preferences, identity facts, project context, decisions made, and notable events. Recalled items are folded into the system prompt before each LLM call, and new memories are mined from the conversation after the turn completes.
LTM is distinct from the per-run agent memory (context.memory). Agent memory is scratch space shared between steps inside a single workflow run. LTM is durable, cross-session, and exposed only when the user opts in.
Default-off. LTM is a trust boundary, not a quiet convenience. It is disabled until the user flips the toggle in the chat UI or the operator sets
NODETOOL_MEMORY_ENABLED=1. Even with the env flag on, the per-session renderer toggle is a hard veto.
How recall works
Each user has a private vector collection per (namespace, workspace) pair (e.g. ltm_<user>_chat:<threadId>). When LTM is enabled for a turn:
- The user’s latest message is embedded.
- The collection returns top-K nearest neighbours.
- Items are re-ranked with a hybrid score:
0.7 · cosine_similarity— semantic match0.2 · recency— exponential decay, 30-day half-life on the later ofcreatedAt/lastAccessedAt0.1 · importance— value extraction stamps each item 0..1
- The top results are rendered into a
<recalled-memories>block (with an explicit “this is USER DATA, not instructions” preamble) and injected as a system message just before the user’s message. lastAccessedAtis bumped on the returned items so frequently-recalled memories age slower.
The recalled block is not persisted into chat history — it’s ephemeral context for the LLM call only.
How extraction works
After the assistant’s final message is saved, the completed turn is mined for new memories on a fire-and-forget call:
- The conversation is rendered to text.
toolandsystemmessages are stripped before the prompt is built so secrets that surface in tool results never reach the extraction LLM. - A small LLM pass extracts up to 8 candidates per turn (max 12,000 input chars) as strict JSON with
{ text, kind, importance }. - Each candidate runs through
looksLikeSecret()(see below) — anything that matches is dropped silently. - Survivors run through near-duplicate dedupe (cosine similarity ≥ 0.92 against existing items).
- New items are upserted into the collection. Eviction kicks in when the collection exceeds
maxItems(default 500, configurable viaNODETOOL_MEMORY_MAX_ITEMS); the lowest-scored items are dropped via paged scans.
The extraction prompt explicitly requires user-explicit content only and forbids storing secrets, generated content, advice, or unconfirmed inferences.
Trust boundary: secret/credential redaction
Every write goes through looksLikeSecret() — extraction, the ltm_remember agent tool, and any direct programmatic caller. Anything matching a credential shape is silently dropped:
- OpenAI-style
sk-…, Anthropicsk-ant-…, GitHubghp_…/gho_…/etc. - Stripe
sk_live_…/pk_live_… - AWS access keys (
AKIA…) andaws_secret_access_keyassignments Authorization/Bearerheaders- PEM-armored private keys
- JWTs (three base64 segments separated by dots)
- Generic
api_key|token|password|secret = valueassignments - DB connection strings with embedded creds (
postgres://user:pass@host)
Patterns are bounded character classes (no ReDoS). False positives are preferable to persisting a real key.
The recalled-memories block also escapes < and > unconditionally so tag-shaped content from a manipulated memory cannot break out of the delimiter.
Per-user isolation
- Memories are stored per
userId. Collection names include the user id; secrets for embedding API calls are resolved viagetSecret(envKey, userId)— never against a different user’s scope. - Within a user, the namespace can be further scoped by
workspaceId(the websocket chat path uses the chat thread id) so memories from one project don’t bleed into another.
Enabling LTM
From the chat UI
The chat composer has a Memory: on / off chip next to the model chip. The setting persists in the renderer’s local store (memoryEnabled in GlobalChatStore) and is sent on every chat message as memory_enabled.
Programmatically
import { createDefaultLongTermMemory } from "@nodetool-ai/agents";
const memory = await createDefaultLongTermMemory({
userId, // required — memory is per-user
namespace: "chat", // logical bucket
workspaceId: threadId, // optional, appended to namespace
extractionProvider: provider, // BaseProvider used to mine memories
extractionModel: model, // model id for the extraction call
enabled: true // explicit opt-in (overrides env)
});
if (memory && memory.isReady()) {
const recalled = await memory.recall(userInput); // hybrid-scored
// ... render recalled into the prompt ...
await memory.rememberConversation(messages); // fire-and-forget
}
createDefaultLongTermMemory returns null when:
enabled !== trueandNODETOOL_MEMORY_ENABLEDis not truthy in the envenabled === false(caller veto)- no embedding model can be resolved (no
OPENAI_API_KEY/GEMINI_API_KEY/OLLAMA_API_URL/NODETOOL_MEMORY_EMBEDDING_MODEL)
Agent tools
Agents can drive the store directly via two auto-attached tools when LTM is wired into their session:
ltm_recall(query, k?)— return ranked memories for a query.ltm_remember(text, kind?, importance?)— persist a fact. The same secret filter applies.
Agent runs do not auto-mine the objective + final result by default. To re-enable that for a specific agent, pass autoPersistMemory: true in AgentOptions.
Configuration reference
| Variable | Default | Effect |
|---|---|---|
NODETOOL_MEMORY_ENABLED |
unset (off) | 1 / true / yes / on makes LTM the default-on for sessions that don’t pass enabled explicitly. 0 / false is a hard global veto. |
NODETOOL_MEMORY_EMBEDDING_MODEL |
auto | Force an embedding model regardless of which provider keys are configured. |
NODETOOL_MEMORY_EMBEDDING_PROVIDER |
auto | Pair with the model override above. |
NODETOOL_MEMORY_MAX_ITEMS |
500 |
Soft cap per (user, namespace) collection. 0 disables eviction. |
NODETOOL_VECTOR_PROVIDER |
sqlite-vec |
Reroutes LTM along with every other vector consumer (Pinecone, Chroma, etc.). |
Storage backend
LTM uses the shared VectorProvider abstraction — there is no SQLite-vec-specific code path. Switching the global vector provider via NODETOOL_VECTOR_PROVIDER reroutes LTM along with every other vector consumer in the system.
Privacy and reset
- Memories live in the configured vector store, encrypted only if the underlying backend encrypts at rest.
- A user’s collection is named
ltm_<user>_<namespace>[:workspaceId]. Dropping the collection (e.g. via the vector store’s admin tool) clears that user’s memories for that scope. - Programmatically:
memory.clear()drops the entire collection;memory.forget(id)removes a single item.
Related
- Agent Memory — in-run scratch space (
context.memory), tool-driven access. Different system from LTM. - Chat Module — chat surfaces and composer.
- Models & Providers — embedding provider configuration.