Vector Storage

NodeTool stores embeddings through a pluggable provider abstraction in @nodetool-ai/vectorstore. The same VectorProvider / VectorCollection interface backs the local SQLite-vec store, Supabase/pgvector, and (stubbed) Pinecone, so every caller — base nodes, the tRPC collections router, the file-upload REST handler, and agent tools — works identically against any backend.

Choosing a Backend

| Backend | When to use | Setup |
| --- | --- | --- |
| `sqlite-vec` (default) | Local-first, single-machine deployments. Embedded, zero-config. | None. First use creates `vectorstore.db` in the NodeTool data dir. |
| `supabase` | Hosted multi-user deployments backed by a Supabase / Postgres+pgvector project. | Install `packages/vectorstore/sql/supabase-migration.sql` once on the project, then set `SUPABASE_URL` + `SUPABASE_KEY`. |
| `pinecone` | Stub. Surfaces a clear `notImplemented` error today. | n/a |

Selection happens in createVectorProviderFromEnv() (packages/vectorstore/src/provider-factory.ts), which is called by getDefaultVectorProvider() on first use.
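
The selection step can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the contents of provider-factory.ts: `resolveProviderKind`, `ProviderKind`, and the choice to throw on unknown values are hypothetical; only the `NODETOOL_VECTOR_PROVIDER` variable, the three backend names, and the `sqlite-vec` default come from this page.

```typescript
// Hypothetical sketch of env-based backend selection. The real
// createVectorProviderFromEnv() may differ in naming and error handling.
type ProviderKind = "sqlite-vec" | "supabase" | "pinecone";

const KNOWN_KINDS: readonly ProviderKind[] = ["sqlite-vec", "supabase", "pinecone"];

function resolveProviderKind(
  env: Record<string, string | undefined>
): ProviderKind {
  // Unset or empty falls back to the documented default.
  const raw = (env["NODETOOL_VECTOR_PROVIDER"] ?? "").trim();
  if (raw === "") return "sqlite-vec";

  const match = KNOWN_KINDS.find((kind) => kind === raw);
  if (match === undefined) {
    // Assumption: unknown values fail loudly instead of falling back.
    throw new Error(`Unknown NODETOOL_VECTOR_PROVIDER: ${raw}`);
  }
  return match;
}
```

In a Node process this would be called as `resolveProviderKind(process.env)`; taking the environment as a parameter keeps the function easy to test.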

Configuration

The active backend is chosen by the NODETOOL_VECTOR_PROVIDER environment variable, which defaults to sqlite-vec when unset.

`NODETOOL_VECTOR_PROVIDER=sqlite-vec` (default)

| Env var | Purpose |
| --- | --- |
| `VECTORSTORE_DB_PATH` | Override the SQLite database file. Defaults to `vectorstore.db` in the NodeTool data dir (`getDefaultVectorstoreDbPath()` in `@nodetool-ai/config`). |

The provider re-uses the process-wide getDefaultStore() so any code that touches the lower-level SqliteVecStore API and any code that goes through getDefaultVectorProvider() share one connection.
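
The sharing described here is the familiar lazy-singleton pattern. The sketch below illustrates it with a stand-in `FakeStore` class; the `getDefaultStore` name comes from this page, but its body, the `FakeSqliteVecProvider` wrapper, and the open counter are assumptions made purely for illustration.

```typescript
// Stand-in for SqliteVecStore. The real class opens a SQLite connection;
// this one only counts how many times it was constructed.
class FakeStore {
  static opened = 0;
  readonly path: string;

  constructor(path: string) {
    this.path = path;
    FakeStore.opened += 1;
  }
}

let defaultStore: FakeStore | undefined;

// Create the store on first use and hand the same instance to every caller.
function getDefaultStore(path: string = "vectorstore.db"): FakeStore {
  if (defaultStore === undefined) {
    defaultStore = new FakeStore(path);
  }
  return defaultStore;
}

// Hypothetical provider wrapper: it reuses the shared store rather than
// opening a second connection of its own.
class FakeSqliteVecProvider {
  readonly store = getDefaultStore();
}
```

Because both entry points resolve to the same object, low-level writes are visible to provider-level reads without any cross-connection coordination.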

`NODETOOL_VECTOR_PROVIDER=supabase`

| Env var | Purpose |
| --- | --- |
| `SUPABASE_URL` | Supabase project URL (e.g. `https://xyz.supabase.co`). |
| `SUPABASE_KEY` or `SUPABASE_SERVICE_ROLE_KEY` | API key. Service-role is recommended server-side; anon-key requires RLS policies that grant the role access to the registry/records tables. |
| `NODETOOL_VECTOR_SCHEMA` | Optional Postgres schema; defaults to `public`. |

These match the env vars the Supabase storage backend already uses, so a single Supabase project can back both file storage and vector data.
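
As a rough illustration of how these variables might be resolved, here is a small sketch. `readSupabaseVectorConfig` and `SupabaseVectorConfig` are hypothetical names, and the precedence between the two key variables is an assumption; this page does not say which one wins when both are set.

```typescript
// Hypothetical helper; only the env var names and the `public` default
// come from the documentation.
interface SupabaseVectorConfig {
  url: string;
  key: string;
  schema: string;
}

function readSupabaseVectorConfig(
  env: Record<string, string | undefined>
): SupabaseVectorConfig {
  const url = env["SUPABASE_URL"];
  // Assumption: prefer the service-role key when both variables are set.
  const key = env["SUPABASE_SERVICE_ROLE_KEY"] || env["SUPABASE_KEY"];
  if (!url || !key) {
    throw new Error(
      "supabase backend needs SUPABASE_URL and SUPABASE_KEY (or SUPABASE_SERVICE_ROLE_KEY)"
    );
  }
  // An unset or empty schema falls back to `public`.
  return { url, key, schema: env["NODETOOL_VECTOR_SCHEMA"] || "public" };
}
```

Failing at startup when the URL or key is missing surfaces a misconfiguration before the first query is attempted.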

One-time SQL install

The provider does no DDL itself; it expects the schema to exist already. Run packages/vectorstore/sql/supabase-migration.sql once, either in the Supabase SQL editor or as a migration through the Supabase CLI. It creates:

- nodetool_vec_collections: registry table (name PK, metadata jsonb, dimension int, metric text).
- nodetool_vec_records: shared records table (collection FK, id, document, embedding vector, uri, metadata jsonb). The FK has ON UPDATE CASCADE, so renaming a collection propagates to its records automatically, with no record rewrite.
- nodetool_vec_match: RPC for similarity search. Required because PostgREST cannot use pgvector operators (<=>, <->, <#>) directly.
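
For orientation, the three pgvector operators named above compute cosine distance (`<=>`), Euclidean distance (`<->`), and negative inner product (`<#>`). The sketch below reproduces those formulas in TypeScript only to show what a similarity RPC has to order by; it is not code from the migration file, and the function names are illustrative.

```typescript
// Inner product of two equal-length vectors.
function dot(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  return a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
}

// `<=>` : cosine distance = 1 - cosine similarity.
function cosineDistance(a: number[], b: number[]): number {
  return 1 - dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
}

// `<->` : Euclidean (L2) distance.
function l2Distance(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  const diff = a.map((x, i) => x - (b[i] ?? 0));
  return Math.sqrt(dot(diff, diff));
}

// `<#>` : negative inner product, so that "smaller is closer" holds
// for all three operators.
function negativeInnerProduct(a: number[], b: number[]): number {
  return -dot(a, b);
}
```

All three return smaller values for closer vectors, which is why a query can sort ascending by the operator result regardless of metric.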

Once the dimension is fixed for a collection, you can add an ivfflat index for faster search — the migration file documents the exact statement.

Filter Support Matrix

VectorFilter is a MongoDB-style predicate language. Adapters translate the subset they can support and must throw UnsupportedFilterError for anything else, rather than silently dropping conditions.

| Operator | sqlite-vec | supabase |
| --- | --- | --- |
| `{field: scalar}` (equality) | no | yes (via jsonb `@>`) |
| `{field: { $eq: v }}` | no | yes |
| `{field: { $ne \| $gt \| $gte \| $lt \| $lte: v }}` | no | no |
| `{field: { $in: [...] }}` | no | no |
| `{$and: [...]}` | no | yes (flattens into the metadata predicate) |
| `{$or: [...]}` | no | no |
| `{$document: { $contains: "text" }}` | yes | yes |
| `{$document: { $or: [{$contains}, ...] }}` | yes | no |

If you need a richer filter on Supabase, extend the SQL function and the splitFilter() translator in packages/vectorstore/src/supabase-provider.ts together — the JS-side translator is the contract.
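
To make the "translate what you can, throw for the rest" rule concrete, here is a minimal sketch of a translator covering the Supabase column of the matrix. It is not the actual `splitFilter()` from supabase-provider.ts: the types are simplified stand-ins, the `SplitResult` shape is hypothetical, and rejecting conflicting equalities or repeated `$document` clauses is an assumption.

```typescript
// Simplified stand-ins so the sketch runs on its own. The real VectorFilter
// type and UnsupportedFilterError are exported by the package and may differ.
type Scalar = string | number | boolean | null;
type VectorFilter = Record<string, unknown>;

class UnsupportedFilterError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "UnsupportedFilterError";
  }
}

interface SplitResult {
  metadata: Record<string, Scalar>; // candidate argument for a jsonb `@>` match
  contains?: string;                // substring to look for in the document text
}

function isScalar(v: unknown): v is Scalar {
  return v === null || typeof v === "string" || typeof v === "number" || typeof v === "boolean";
}

function isObject(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

function splitFilter(filter: VectorFilter, out: SplitResult = { metadata: {} }): SplitResult {
  for (const key of Object.keys(filter)) {
    const value = filter[key];
    if (key === "$and") {
      if (!Array.isArray(value)) throw new UnsupportedFilterError("$and expects an array");
      for (const clause of value) {
        if (!isObject(clause)) throw new UnsupportedFilterError("$and clauses must be objects");
        splitFilter(clause, out); // flatten into the same predicate
      }
    } else if (key === "$document") {
      const needle =
        isObject(value) && Object.keys(value).length === 1 ? value["$contains"] : undefined;
      if (typeof needle !== "string" || out.contains !== undefined) {
        throw new UnsupportedFilterError("only a single $document.$contains is supported");
      }
      out.contains = needle;
    } else if (key.startsWith("$")) {
      throw new UnsupportedFilterError(`operator ${key} is not supported`);
    } else {
      // Accept `{field: scalar}` and `{field: {$eq: scalar}}`; reject the rest.
      const eq =
        isObject(value) && Object.keys(value).length === 1 && "$eq" in value ? value["$eq"] : value;
      if (!isScalar(eq)) throw new UnsupportedFilterError(`unsupported predicate on "${key}"`);
      if (key in out.metadata && out.metadata[key] !== eq) {
        throw new UnsupportedFilterError(`conflicting equality on "${key}"`);
      }
      out.metadata[key] = eq;
    }
  }
  return out;
}
```

Every branch either records a condition or throws, so no predicate can be dropped without the caller noticing.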

Programmatic API

```typescript
import {
  getDefaultVectorProvider,
  type VectorCollection
} from "@nodetool-ai/vectorstore";

const provider = getDefaultVectorProvider();

// Create / get a collection.
const docs = await provider.getOrCreateCollection({
  name: "support_articles",
  dimension: 1536,
  metric: "cosine",
  metadata: { embedding_model: "text-embedding-3-small" }
});

// Upsert records. Embeddings can be pre-computed or generated by the
// collection's embedding function.
await docs.upsert([
  { id: "kb-101", document: "How to reset your password", uri: "https://…" }
]);

// Query: vector similarity plus an optional filter (a document-text
// filter here).
const matches = await docs.query({
  text: "I forgot my password",
  topK: 5,
  filter: { $document: { $contains: "password" } }
});
```

For tests or migrations, swap the active provider with setDefaultVectorProvider(...) (and resetDefaultVectorProvider() to release it). When constructing SupabaseProvider directly, pass a pre-built client to mock the network layer:

```typescript
import { SupabaseProvider } from "@nodetool-ai/vectorstore";

const provider = new SupabaseProvider({
  client: fakeSupabaseClient   // any object satisfying SupabaseClient
});
```
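
When swapping the default provider in tests, it helps to guarantee the reset even if the test body throws. The sketch below shows one way to do that with `try`/`finally`. The three provider functions are local stand-ins that reuse the real export names so the example runs on its own; their bodies, the `Provider` shape, and the `withVectorProvider` helper are all assumptions.

```typescript
// Local stand-ins for the package exports; the real functions work with
// full VectorProvider instances.
type Provider = { kind: string };

let current: Provider | undefined;

function setDefaultVectorProvider(p: Provider): void {
  current = p;
}

function resetDefaultVectorProvider(): void {
  current = undefined;
}

function getDefaultVectorProvider(): Provider {
  // Assumption: after a reset, the next call rebuilds the default provider.
  if (current === undefined) current = { kind: "sqlite-vec" };
  return current;
}

// Hypothetical test helper: install a fake, run the body, always reset.
async function withVectorProvider<T>(
  fake: Provider,
  body: () => Promise<T>
): Promise<T> {
  setDefaultVectorProvider(fake);
  try {
    return await body();
  } finally {
    resetDefaultVectorProvider();
  }
}
```

Wrapping each test body this way keeps a fake from leaking into later tests that expect the environment-derived provider.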

Where It’s Used

- Workflow nodes: every node under vector.* in @nodetool-ai/base-nodes (packages/base-nodes/src/nodes/vector.ts) goes through the default provider.
- Collections REST/tRPC API: packages/websocket/src/trpc/routers/collections.ts and packages/websocket/src/collection-api.ts.
- Agent tools: VecTextSearchTool, VecHybridSearchTool, VecRecursiveSplitAndIndexTool, etc. in packages/agents/src/tools/vector-tools.ts.

Adding a New Backend

1. Implement VectorProvider and VectorCollection (packages/vectorstore/src/provider.ts) in a new <backend>-provider.ts.
2. Translate VectorFilter to the backend’s query language. Throw UnsupportedFilterError for predicates you can’t express; never drop them silently.
3. Wire it into createVectorProviderFromEnv() in provider-factory.ts under a new kind.
4. Add a test file under tests/ that injects a mock client (no live backend).
5. Re-export the provider class and its options interface from src/index.ts.

The existing SqliteVecProvider and SupabaseProvider are good references — keep adapter state mutable so modify({ name, metadata }) is observable on the same handle, and surface uri on every read path (it’s part of the VectorRecord / VectorMatch contract).
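
As a rough starting point, the skeleton below shows the two properties called out above, mutable handle state and `uri` on read paths, using an in-memory backend. The interfaces are deliberately simplified stand-ins, not the real contracts in provider.ts, and the inner-product scoring is an arbitrary choice for the sketch.

```typescript
// Simplified stand-ins; the real VectorProvider / VectorCollection
// interfaces have more methods and options.
interface VectorRecord {
  id: string;
  document: string;
  embedding: number[];
  uri?: string;
}
interface VectorMatch extends VectorRecord {
  score: number;
}
interface CollectionPatch {
  name?: string;
  metadata?: Record<string, unknown>;
}
interface VectorCollection {
  name: string;
  metadata: Record<string, unknown>;
  upsert(records: VectorRecord[]): Promise<void>;
  query(q: { embedding: number[]; topK: number }): Promise<VectorMatch[]>;
  modify(patch: CollectionPatch): Promise<void>;
}

class MemoryCollection implements VectorCollection {
  name: string;
  metadata: Record<string, unknown>;
  private records = new Map<string, VectorRecord>();

  constructor(name: string, metadata: Record<string, unknown> = {}) {
    this.name = name;
    this.metadata = metadata;
  }

  async upsert(records: VectorRecord[]): Promise<void> {
    for (const r of records) this.records.set(r.id, { ...r });
  }

  async query(q: { embedding: number[]; topK: number }): Promise<VectorMatch[]> {
    const score = (v: number[]) =>
      v.reduce((sum, x, i) => sum + x * (q.embedding[i] ?? 0), 0);
    return Array.from(this.records.values())
      .map((r) => ({ ...r, score: score(r.embedding) })) // spread keeps uri on the match
      .sort((a, b) => b.score - a.score)
      .slice(0, q.topK);
  }

  // Mutate in place so callers holding this handle observe the change.
  // A real adapter would also update its backend registry here.
  async modify(patch: CollectionPatch): Promise<void> {
    if (patch.name !== undefined) this.name = patch.name;
    if (patch.metadata !== undefined) this.metadata = patch.metadata;
  }
}

class MemoryProvider {
  private collections = new Map<string, MemoryCollection>();

  async getOrCreateCollection(opts: {
    name: string;
    metadata?: Record<string, unknown>;
  }): Promise<VectorCollection> {
    let collection = this.collections.get(opts.name);
    if (collection === undefined) {
      collection = new MemoryCollection(opts.name, opts.metadata);
      this.collections.set(opts.name, collection);
    }
    return collection;
  }
}
```

A filter translator that throws for unsupported predicates and a branch in the provider factory would complete the remaining steps.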
