NodeTool stores embeddings through a pluggable provider abstraction in @nodetool-ai/vectorstore. The same VectorProvider / VectorCollection interface backs the local SQLite-vec store, Supabase/pgvector, and (stubbed) Pinecone, so every caller — base nodes, the tRPC collections router, the file-upload REST handler, and agent tools — works identically against any backend.

Choosing a Backend

| Backend | When to use | Setup |
| --- | --- | --- |
| sqlite-vec (default) | Local-first, single-machine deployments. Embedded, zero-config. | None; first use creates vectorstore.db in the NodeTool data dir. |
| supabase | Hosted multi-user deployments backed by a Supabase / Postgres+pgvector project. | Install packages/vectorstore/sql/supabase-migration.sql once on the project, then set SUPABASE_URL + SUPABASE_KEY. |
| pinecone | Stub: surfaces a clear notImplemented error today. | |

Selection happens in createVectorProviderFromEnv() (packages/vectorstore/src/provider-factory.ts), which is called by getDefaultVectorProvider() on first use.

Configuration

The active backend is chosen by the NODETOOL_VECTOR_PROVIDER environment variable, which defaults to sqlite-vec when unset.
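The selection rule can be sketched as follows. This is only an illustration of the documented behavior, not the code in provider-factory.ts: the function name resolveProviderKind, the ProviderKind type, the treatment of an empty value as unset, and the error raised for unknown values are all assumptions.

```typescript
// Hypothetical sketch; the real logic lives in createVectorProviderFromEnv().
type ProviderKind = "sqlite-vec" | "supabase" | "pinecone";

const KNOWN_KINDS: readonly ProviderKind[] = ["sqlite-vec", "supabase", "pinecone"];

// The environment is passed in explicitly so the function is easy to test.
function resolveProviderKind(env: Record<string, string | undefined>): ProviderKind {
  const raw = env["NODETOOL_VECTOR_PROVIDER"]?.trim();

  // Unset falls back to the documented default.
  // Treating an empty string the same way is an assumption.
  if (!raw) return "sqlite-vec";

  const match = KNOWN_KINDS.find((kind) => kind === raw);
  if (match === undefined) {
    // Assumption: unknown values fail loudly instead of falling back.
    throw new Error(
      `Unknown NODETOOL_VECTOR_PROVIDER "${raw}"; expected one of: ${KNOWN_KINDS.join(", ")}`
    );
  }
  return match;
}
```

Calling resolveProviderKind(process.env) at startup would surface a misspelled provider name early rather than on the first query.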

NODETOOL_VECTOR_PROVIDER=sqlite-vec (default)

| Env var | Purpose |
| --- | --- |
| VECTORSTORE_DB_PATH | Override the SQLite database file. Defaults to vectorstore.db in the NodeTool data dir (getDefaultVectorstoreDbPath() in @nodetool-ai/config). |

The provider re-uses the process-wide getDefaultStore() so any code that touches the lower-level SqliteVecStore API and any code that goes through getDefaultVectorProvider() share one connection.

NODETOOL_VECTOR_PROVIDER=supabase

| Env var | Purpose |
| --- | --- |
| SUPABASE_URL | Supabase project URL (e.g. https://xyz.supabase.co). |
| SUPABASE_KEY or SUPABASE_SERVICE_ROLE_KEY | API key. Service-role is recommended server-side; anon-key requires RLS policies that grant the role access to the registry/records tables. |
| NODETOOL_VECTOR_SCHEMA | Optional Postgres schema; defaults to public. |

These match the env vars the Supabase storage backend already uses, so a single Supabase project can back both file storage and vector data.
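As a rough sketch, the three settings above could be resolved like this. The names resolveSupabaseConfig and SupabaseVectorConfig are hypothetical, and the precedence between SUPABASE_KEY and SUPABASE_SERVICE_ROLE_KEY when both are set is an assumption; the source does not say which one wins.

```typescript
// Hypothetical config shape; not the provider's actual options interface.
interface SupabaseVectorConfig {
  url: string;
  key: string;
  schema: string;
}

function resolveSupabaseConfig(env: Record<string, string | undefined>): SupabaseVectorConfig {
  const url = env["SUPABASE_URL"];

  // Assumption: SUPABASE_KEY is preferred when both keys are present.
  const key = env["SUPABASE_KEY"] ?? env["SUPABASE_SERVICE_ROLE_KEY"];

  if (!url) throw new Error("SUPABASE_URL is required for the supabase backend");
  if (!key) throw new Error("Set SUPABASE_KEY or SUPABASE_SERVICE_ROLE_KEY");

  // The schema is optional and falls back to "public", as documented.
  return { url, key, schema: env["NODETOOL_VECTOR_SCHEMA"] ?? "public" };
}
```

Validating both required values up front keeps a missing key from showing up later as an opaque authentication failure.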

One-time SQL install

The provider does no DDL itself — it expects the schema to exist already. Run packages/vectorstore/sql/supabase-migration.sql once via the Supabase SQL editor or supabase migration. It creates:

  • nodetool_vec_collections — registry table (name PK, metadata jsonb, dimension int, metric text).
  • nodetool_vec_records — shared records table (collection FK, id, document, embedding vector, uri, metadata jsonb). The FK has ON UPDATE CASCADE so renaming a collection follows automatically — no record rewrite.
  • nodetool_vec_match — RPC for similarity search. Required because PostgREST cannot use pgvector operators (<=>, <->, <#>) directly.
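
For orientation, the three pgvector operators named above compute cosine distance (<=>), Euclidean (L2) distance (<->), and negative inner product (<#>). The sketch below reproduces those quantities in plain TypeScript so results returned by the RPC can be sanity-checked by hand. The function names are illustrative and are not part of @nodetool-ai/vectorstore.

```typescript
// Inner product of two equal-length vectors.
function dot(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i]! * b[i]!;
  return sum;
}

// pgvector `<->`: Euclidean (L2) distance.
function l2Distance(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += (a[i]! - b[i]!) ** 2;
  return Math.sqrt(sum);
}

// pgvector `<#>`: negative inner product, so smaller still means closer.
function negativeInnerProduct(a: number[], b: number[]): number {
  return -dot(a, b);
}

// pgvector `<=>`: cosine distance, i.e. 1 minus cosine similarity.
function cosineDistance(a: number[], b: number[]): number {
  const norms = Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b));
  if (norms === 0) throw new Error("cosine distance is undefined for a zero vector");
  return 1 - dot(a, b) / norms;
}
```

All three are distances in the sense that lower values rank first, which is why the inner-product operator is negated.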

Once the dimension is fixed for a collection, you can add an ivfflat index for faster search — the migration file documents the exact statement.

Filter Support Matrix

VectorFilter is a MongoDB-style predicate language. Adapters translate the subset they can support and must throw UnsupportedFilterError for anything else, rather than silently dropping conditions.

Operator sqlite-vec supabase
{field: scalar} (equality) yes (via jsonb @>)
{field: { $eq: v }} yes
{field: { $ne | $gt | $gte | $lt | $lte: v }}
{field: { $in: [...] }}
{$and: [...]} yes (flattens into the metadata predicate)
{$or: [...]}
{$document: { $contains: "text" }} yes yes
{$document: { $or: [{$contains}, ...] }} yes

If you need a richer filter on Supabase, extend the SQL function and the splitFilter() translator in packages/vectorstore/src/supabase-provider.ts together — the JS-side translator is the contract.
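
The "translate what you can, throw for the rest" rule might look like the sketch below. It is not the splitFilter() implementation: the function toEqualityPredicate, the local UnsupportedFilterError class, and the choice to reject conflicting equality conditions are assumptions made for illustration. It handles only scalar equality, $eq, and $and, and it produces a flat object of the kind a jsonb containment check could consume.

```typescript
// Local stand-in for the package's error class (assumed shape).
class UnsupportedFilterError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "UnsupportedFilterError";
  }
}

type Scalar = string | number | boolean | null;

function isScalar(v: unknown): v is Scalar {
  return v === null || typeof v === "string" || typeof v === "number" || typeof v === "boolean";
}

function isPlainObject(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Refuses to overwrite a field with a different value, because merging
// {a: 1} AND {a: 2} into one object would silently change the meaning.
function setOnce(out: Record<string, Scalar>, field: string, value: Scalar): void {
  if (field in out && out[field] !== value) {
    throw new UnsupportedFilterError(`conflicting equality conditions on "${field}"`);
  }
  out[field] = value;
}

function toEqualityPredicate(filter: Record<string, unknown>): Record<string, Scalar> {
  const out: Record<string, Scalar> = {};
  for (const [key, value] of Object.entries(filter)) {
    if (key === "$and") {
      if (!Array.isArray(value)) throw new UnsupportedFilterError("$and expects an array");
      for (const clause of value) {
        if (!isPlainObject(clause)) throw new UnsupportedFilterError("$and clauses must be objects");
        // Flatten each nested clause into the same predicate object.
        for (const [f, v] of Object.entries(toEqualityPredicate(clause))) setOnce(out, f, v);
      }
    } else if (key.startsWith("$")) {
      throw new UnsupportedFilterError(`operator ${key} is not supported`);
    } else if (isScalar(value)) {
      setOnce(out, key, value);
    } else if (isPlainObject(value) && Object.keys(value).length === 1 && isScalar(value["$eq"]) && "$eq" in value) {
      setOnce(out, key, value["$eq"]);
    } else {
      // Anything else ($gt, $in, ...) is rejected, never dropped.
      throw new UnsupportedFilterError(`unsupported condition on field "${key}"`);
    }
  }
  return out;
}
```

For example, { lang: "en", $and: [{ tier: { $eq: "pro" } }] } becomes { lang: "en", tier: "pro" }, while a filter containing $or or $gt raises UnsupportedFilterError.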

Programmatic API

import {
  getDefaultVectorProvider,
  type VectorCollection
} from "@nodetool-ai/vectorstore";

const provider = getDefaultVectorProvider();

// Create / get a collection.
const docs = await provider.getOrCreateCollection({
  name: "support_articles",
  dimension: 1536,
  metric: "cosine",
  metadata: { embedding_model: "text-embedding-3-small" }
});

// Upsert records — embeddings can be pre-computed or generated by the
// collection's embedding function.
await docs.upsert([
  { id: "kb-101", document: "How to reset your password", uri: "https://…" }
]);

// Vector + metadata + document filtering.
const matches = await docs.query({
  text: "I forgot my password",
  topK: 5,
  filter: { $document: { $contains: "password" } }
});

For tests or migrations, swap the active provider with setDefaultVectorProvider(...) (and resetDefaultVectorProvider() to release it). When constructing SupabaseProvider directly, pass a pre-built client to mock the network layer:

import { SupabaseProvider } from "@nodetool-ai/vectorstore";

const provider = new SupabaseProvider({
  client: fakeSupabaseClient   // any object satisfying SupabaseClient
});
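
One possible shape for such a fake is sketched below. Which client methods SupabaseProvider actually calls is not stated here, so covering only rpc() is an assumption, as are the helper name createRecordingRpcFake and the argument name used in the usage line. The result shape { data, error } follows what supabase-js resolves to for rpc calls; a real test double would likely also need from().

```typescript
// Hypothetical recording fake; not part of @nodetool-ai/vectorstore.
interface RpcCall {
  fn: string;
  args: Record<string, unknown>;
}

interface RpcResult<T> {
  data: T | null;
  error: { message: string } | null;
}

function createRecordingRpcFake<T>(canned: T) {
  const calls: RpcCall[] = [];
  return {
    // Every invocation is recorded so a test can assert on it afterwards.
    calls,
    async rpc(fn: string, args: Record<string, unknown> = {}): Promise<RpcResult<T>> {
      calls.push({ fn, args });
      // Always succeeds with the canned payload; no network is touched.
      return { data: canned, error: null };
    }
  };
}
```

A test could build const fake = createRecordingRpcFake([{ id: "kb-101" }]), hand it to the provider (cast to the client type if needed), run a query, and then inspect fake.calls to check that nodetool_vec_match was invoked.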

Where it’s used

  • Workflow nodes: every node under vector.* in @nodetool-ai/base-nodes (packages/base-nodes/src/nodes/vector.ts) goes through the default provider.
  • Collections REST/tRPC API: packages/websocket/src/trpc/routers/collections.ts and packages/websocket/src/collection-api.ts.
  • Agent tools: VecTextSearchTool, VecHybridSearchTool, VecRecursiveSplitAndIndexTool, etc. in packages/agents/src/tools/vector-tools.ts.

Adding a New Backend

  1. Implement VectorProvider and VectorCollection (packages/vectorstore/src/provider.ts) in a new <backend>-provider.ts.
  2. Translate VectorFilter to the backend’s query language. Throw UnsupportedFilterError for predicates you can’t express — never silently drop.
  3. Wire it into createVectorProviderFromEnv() in provider-factory.ts under a new kind.
  4. Add a tests file under tests/ that injects a mock client (no live backend).
  5. Re-export the provider class and its options interface from src/index.ts.

The existing SqliteVecProvider and SupabaseProvider are good references — keep adapter state mutable so modify({ name, metadata }) is observable on the same handle, and surface uri on every read path (it’s part of the VectorRecord / VectorMatch contract).
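
The point about mutable adapter state can be illustrated with a small in-memory handle. This is a sketch only: InMemoryCollectionHandle and CollectionInfo are hypothetical names, and the real VectorCollection interface in provider.ts has more members than shown here.

```typescript
// Hypothetical, trimmed-down view of a collection's identity.
interface CollectionInfo {
  name: string;
  metadata: Record<string, unknown>;
}

class InMemoryCollectionHandle {
  // Kept mutable on purpose: modify() updates it in place.
  private info: CollectionInfo;

  constructor(info: CollectionInfo) {
    this.info = { name: info.name, metadata: { ...info.metadata } };
  }

  get name(): string {
    return this.info.name;
  }

  get metadata(): Record<string, unknown> {
    return this.info.metadata;
  }

  async modify(changes: Partial<CollectionInfo>): Promise<void> {
    // A real adapter would persist the change to its backend first,
    // then update local state only after that succeeds.
    this.info = {
      name: changes.name ?? this.info.name,
      metadata: changes.metadata ?? this.info.metadata
    };
  }
}
```

Because the getters read from the mutable field, a caller that holds the handle sees the new name immediately after await handle.modify({ name: "renamed" }), without fetching the collection again.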