Date: 2026-06-08

Status: Approved design, ready for implementation plan

Scope of this iteration: Author a HuggingFace worker Docker image (layered on the
existing `nodetool-core` image), provide local build/run ergonomics, add a minimal
authentication handshake to the worker, and validate the full UI → TS server → remote
worker path locally (CPU-only). Cloud/GPU deployment is out of scope here and is
captured as recorded constraints for a follow-up iteration.

NodeTool runs Python nodes through a worker process. Today the TypeScript server most
commonly spawns a local worker over a stdio bridge (`PythonStdioBridge`,
length-prefixed msgpack over stdin/stdout).

There is a second, already-built topology: the worker can run as a long-lived
WebSocket server, and the TS server attaches to it remotely. The pieces that already
exist:

- `nodetool-core/Dockerfile` builds a Python worker image (micromamba / Ubuntu
  22.04 / Python 3.11) that installs `nodetool-core` from PyPI and runs
  `python -m nodetool.worker --host 0.0.0.0 --port 7777`: a WebSocket server, with a
  ws-handshake healthcheck. `EXPOSE 7777`, `CMD` (not `ENTRYPOINT`).
- `nodetool2` runtime: `createPythonBridge()` returns a `WebsocketPythonBridge`
  (reconnect + exponential backoff) whenever `NODETOOL_WORKER_URL=ws://host:port` is
  set, instead of spawning a local worker. The TS server (`server.ts`, `http-api.ts`,
  `mcp-server.ts`) calls `createPythonBridge()`, so setting that env var is sufficient
  to switch the whole server onto a remote worker.

What is missing, and what this iteration delivers:

This is a packaging + wiring + auth task. It is not a protocol change: the msgpack
WebSocket protocol and the bridge already exist.

## Goals

- A `nodetool-huggingface` Docker image that layers on the core worker image and
  installs the released `nodetool-huggingface` package from PyPI.
- An `Authorization: Bearer` token handshake on the worker, honored by
  the TS bridge on connect and reconnect.

## Non-goals (this iteration)

- No changes to `packages/deploy` (no first-class deploy target for the worker).

| Decision | Choice |
|---|---|
| Primary first target | Local Docker (CPU-only validation on macOS) |
| What the iteration is about | The HF worker image is the goal; core is the stepping stone |
| HF image base / CUDA strategy | Layer on the core image + PyPI CUDA wheels (torch wheel bundles CUDA; host needs only the NVIDIA driver + container runtime) |
| “Done” boundary | Dockerfile + local run + validation (no packages/deploy changes) |
| Package source | Released PyPI packages (version via build-arg) |
| Worker auth | Build NODETOOL_WORKER_TOKEN handshake now (worker + bridge) |
| Build wrappers | Makefile in nodetool-huggingface |
| Validation | Committed smoke DSL (SentenceSimilarity / all-MiniLM-L6-v2) |

```
mambaorg/micromamba:jammy
        │   (nodetool-core/Dockerfile, exists)
        ▼
nodetool-core:local   FROM base; uv pip install nodetool-core==<NODETOOL_VERSION>
                      EXPOSE 7777 / HEALTHCHECK / CMD ["python","-m","nodetool.worker","--host","0.0.0.0","--port","7777"]
        │   (nodetool-huggingface/Dockerfile, new)
        ▼
nodetool-hf:local     FROM ${CORE_IMAGE}; uv pip install nodetool-huggingface==<HF_VERSION>
                      (EXPOSE / HEALTHCHECK / CMD inherited from core; HF has no worker module of its own)
```

The HF image adds only Python packages on top of core. `python -m nodetool.worker`
comes from the inherited `nodetool-core` dependency, so the HF image needs no new `CMD`.

```
web UI ──ws──► TS server (host :7777) ──► createPythonBridge()
                                              │  NODETOOL_WORKER_URL set?
                                              ▼  yes → WebsocketPythonBridge
                          ws://localhost:8787 ──► hf-worker container (:7777 internal)
                          Authorization: Bearer <NODETOOL_WORKER_TOKEN>
```

The TS dev server already binds host port 7777, so the worker container must publish on a
different host port. This design uses 8787, giving `NODETOOL_WORKER_URL=ws://localhost:8787`.
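
The routing decision in the diagram can be summarized in a few lines. The sketch below is a Python illustration of that decision only; the real logic lives in the TypeScript `createPythonBridge()`, and the helper names `select_bridge` and `worker_url`, as well as the choice to treat an empty value as unset, are assumptions.

```python
from typing import Mapping, Optional, Tuple


def worker_url(host: str, host_port: int) -> str:
    """Build the WebSocket URL for a worker published on a given host port."""
    return f"ws://{host}:{host_port}"


def select_bridge(env: Mapping[str, str]) -> Tuple[str, Optional[str]]:
    """Return ("websocket", url) when NODETOOL_WORKER_URL is set, else ("stdio", None).

    Assumption: an empty or whitespace-only value counts as unset.
    """
    url = env.get("NODETOOL_WORKER_URL", "").strip()
    if url:
        return ("websocket", url)
    return ("stdio", None)


# The container listens on 7777 internally but is published on host port 8787,
# because the TS dev server already occupies host port 7777.
remote = select_bridge({"NODETOOL_WORKER_URL": worker_url("localhost", 8787)})
local = select_bridge({})
```

With the variable set, every caller of the bridge factory attaches to the remote worker; without it, the server falls back to spawning a local stdio worker.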

All new files live in the `nodetool-huggingface` sibling repo, except the bridge
change (`nodetool2`) and the worker auth change (`nodetool-core`).

### nodetool-huggingface/Dockerfile (new)

```dockerfile
# Layer the HuggingFace node stack on top of the core worker image.
# CORE_IMAGE: locally-built `nodetool-core:local`, or a published
# ghcr.io/nodetool-ai/nodetool:<tag>. HF_VERSION pins the PyPI release.
ARG CORE_IMAGE=nodetool-core:local
FROM ${CORE_IMAGE}

ARG HF_VERSION=0.7.1
USER root

# torch 2.9 + the rest pull CUDA-enabled wheels with a bundled CUDA runtime;
# no nvidia/cuda base needed. The host supplies the NVIDIA driver at run time.
RUN uv pip install --python $VIRTUAL_ENV --index-url https://pypi.org/simple \
      "nodetool-huggingface==${HF_VERSION}" \
    && rm -rf /root/.cache/uv /root/.cache/pip /tmp/* /var/tmp/*

# EXPOSE 7777, HEALTHCHECK, CMD all inherited from the core image.
```

Notes:

- Optional extras (`ocr`, `hunyuan3d`, `triposg`, …) are
  not installed; if needed later they become additional build-args / image variants.

### nodetool-huggingface/docker-compose.yaml (new)

- Service `hf-worker`:
  - `build: { context: ., args: { CORE_IMAGE, HF_VERSION } }`
  - `ports: ["8787:7777"]`
  - `volumes: ["hf-cache:/app/huggingface"]` (named volume; `HF_HOME=/app/huggingface`)
  - `environment`: pass through `NODETOOL_WORKER_TOKEN` (and any provider/runtime env
    the worker needs).
- A `gpu` compose profile adds `deploy.resources.reservations.devices` (NVIDIA) /
  `gpus: all`. Not activated on macOS; the default (CPU) service is used locally.
- Named volume `hf-cache` declared at the bottom.

### nodetool-huggingface/Makefile (new)

| Target | Action |
|---|---|
| `build-core` | `docker build -t nodetool-core:local ../nodetool-core` (or document pulling `ghcr.io/nodetool-ai/nodetool:<tag>`) |
| `build-hf` | `docker build -t nodetool-hf:local --build-arg CORE_IMAGE=nodetool-core:local .` |
| `up` | `docker compose up` (CPU): runs the worker on `localhost:8787` |
| `down` | `docker compose down` |

### nodetool-huggingface/docs/worker-deployment.md (new)

Build → run → wire → validate steps; the cloud-constraints section (§8); the
CPU-only-on-macOS and host-port-8787 caveats; the `NODETOOL_WORKER_TOKEN` setup.

### nodetool-huggingface/examples/hf-worker-smoke.{ts,json} (new)

A minimal workflow using the HuggingFace `SentenceSimilarity` node with
`sentence-transformers/all-MiniLM-L6-v2` (~80 MB, CPU-fast). The node is a
feature-extraction node that returns an `np_array` embedding per input string.
Committed for repeatable validation: the `.ts` DSL form documents the graph; the
exported `.json` form is the artifact loaded into the running TS server (UI import or
the server's run API).

Fallback node: `FillMask` / `distilbert-base-uncased`.

Wiring note (updated): originally the remote `WebsocketPythonBridge` was created
only by the websocket server, so the CLI/DSL local path couldn't run Python nodes.
That gap is now closed: `connectPythonBridgeForGraph` / `resolvePythonNodeExecutor` in
`@nodetool-ai/runtime` wire the bridge into the in-process runners, so both
`nodetool workflows run <graph.json>` and `nodetool run <file.ts>` execute Python (incl.
HuggingFace worker) nodes: remote when `NODETOOL_WORKER_URL` is set, else a local stdio
worker. The validation graph can therefore be driven by either a running server
(web UI) or the CLI directly.
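
Because the smoke node returns one embedding per input string rather than a score, anyone who wants a similarity number has to compute it from the embeddings. The sketch below shows one common way to do that, cosine similarity, in plain Python. The helper name `cosine_similarity` and the toy 3-dimensional vectors are illustrative assumptions; real `all-MiniLM-L6-v2` embeddings have 384 dimensions, and the smoke graph itself does not perform this step.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for the np_array outputs of the smoke workflow.
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
```

Vectors pointing the same way score close to 1 and unrelated (orthogonal) vectors score 0, which makes a quick sanity check on the embeddings the worker returns.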

## Worker auth (`NODETOOL_WORKER_TOKEN`)

A shared-secret bearer token, opt-in, identical env name on both ends.

### Worker (`nodetool-core`)

- In `start_server` (`server.py`), pass a `process_request` hook to
  `websockets.asyncio.server.serve(...)`.
- The hook reads `NODETOOL_WORKER_TOKEN`. When it is unset, no check is performed (the
  token is opt-in).
- When a token is set, the handshake must carry `Authorization: Bearer <token>`. Compare with
  `hmac.compare_digest` (constant time). On mismatch/absence, return an HTTP 401
  `Response` from `process_request`, which aborts the handshake before any frame.
- `--host` / `--port` default from env (`NODETOOL_WORKER_HOST`,
  `NODETOOL_WORKER_PORT`) so the >70000 identity-port trick is possible on cloud hosts.
  Optional; not required for local.

### Bridge (`nodetool2`, `python-websocket-bridge.ts`)

- The `ws` `WebSocket` constructor at the connect site already takes an options object
  (`{ maxPayload }`). Add `headers: { Authorization: 'Bearer ' + token }` when a token
  is configured.
- Token source: `options.workerToken ?? process.env.NODETOOL_WORKER_TOKEN`, threaded
  through `PythonBridgeOptions` and `createPythonBridge()` (alongside `wsUrl`).
- Library support: `ws` `headers` option; `websockets`
  `process_request` + `request.headers`.

Performed end-to-end on macOS Docker (CPU):

1. `make build-core` then `make build-hf` succeed.
2. `make up`; the container's ws-handshake healthcheck reports
   healthy.
3. The worker runs with `NODETOOL_WORKER_TOKEN=secret`.
4. Start the TS server with
   `NODETOOL_WORKER_URL=ws://localhost:8787 NODETOOL_WORKER_TOKEN=secret npm run dev:server`;
   the server logs a successful `discover` + `worker.status`; HF node metadata is present.
5. Run the smoke workflow, either from the CLI
   (`NODETOOL_WORKER_URL=ws://localhost:8787 NODETOOL_WORKER_TOKEN=secret nodetool workflows run examples/hf-worker-smoke.json`)
   or by loading it into the running server (web UI). The HF node runs on the
   container and returns an embedding (`np_array`). `SentenceSimilarity` is a
   feature-extraction node (one embedding per input string), not a cross-string scorer,
   so the smoke graph previews the embedding(s). First run downloads the model into the
   `hf-cache` volume; subsequent runs reuse it.

CUDA-only nodes (most diffusers/3D) are expected to fail locally and are deferred to the GPU iteration.
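
The validation run depends on the worker accepting only connections that present the right token. A minimal sketch of that decision, using only the standard library, is shown below. The function name `is_authorized` is hypothetical, and the wiring into a `process_request` hook is left out; under this design a `False` result would become the HTTP 401 response that aborts the handshake.

```python
import hmac
from typing import Optional


def is_authorized(authorization_header: Optional[str], expected_token: Optional[str]) -> bool:
    """Decide whether a handshake may proceed.

    authorization_header: raw value of the Authorization header, or None if absent.
    expected_token: value of NODETOOL_WORKER_TOKEN, or None/empty if unset.
    """
    # Opt-in: with no token configured the worker stays open.
    if not expected_token:
        return True
    if authorization_header is None:
        return False
    scheme, _, presented = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # Constant-time comparison; encoding to bytes avoids the ASCII-only
    # restriction that compare_digest places on str arguments.
    return hmac.compare_digest(
        presented.strip().encode("utf-8"), expected_token.encode("utf-8")
    )


accepted = is_authorized("Bearer secret", "secret")
rejected_wrong = is_authorized("Bearer wrong", "secret")
rejected_missing = is_authorized(None, "secret")
open_worker = is_authorized(None, None)
```

Keeping the check as a pure function makes it easy to unit-test separately from the WebSocket server, and the same function covers both the first connect and every reconnect.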

Validated against current (2026) RunPod and Vast.ai docs. The image design holds on
both (PyPI CUDA wheels + host driver; `CMD` runs on boot). The deltas below are
exposure/ops concerns that the deploy iteration must satisfy.

- RunPod's HTTP proxy (`*.proxy.runpod.net`) is Cloudflare-fronted: 100 MB body
  cap, 100 s timeout, WS disconnects observed at ~1.8 MiB. Unusable for large
  binary frames. → Use Expose TCP Ports (direct, public IP, random external port).
- Vast.ai `-p` is direct TCP passthrough (good). Their Caddy auth-proxy (when
  external≠internal port) inserts an HTTP hop → use an identity port map (e.g. a
  port >70000) for raw passthrough.
- Move large payloads through `ASSET_BUCKET` / S3 (the worker
  already supports asset env) so WebSocket frames stay small. This both relieves the
  frame-size limits and reduces egress. The 256 MB bridge frame ceiling should be
  treated as a backstop, not a routine payload size.
- Both providers map `7777` to a random external port on a public IP, known
  only after boot (RunPod: `RUNPOD_PUBLIC_IP` / `RUNPOD_TCP_PORT_*`; Vast: "IP Port
  Info" / API). `NODETOOL_WORKER_URL` must therefore be resolved at runtime from the
  provider API, not hardcoded.
- `WebsocketPythonBridge` reconnects to a fixed URL. For cloud,
  reconnect must re-resolve host:port (the instance/IP can change). This is
  deploy-iteration work, explicitly out of scope now.
- The `NODETOOL_WORKER_TOKEN` handshake (built this iteration) is the app-level gate. For
  public exposure it must be combined with TLS (`wss`), terminated by you, since
  neither platform's usable path gives free TLS at these frame sizes, or, preferably, a
  private tunnel (Tailscale / WireGuard / SSH) with the worker bound to the tunnel
  interface. Never expose the raw token-only port over plain `ws://` on the public
  internet.
- Pin the CUDA/driver version (RunPod `allowedCudaVersions` / UI filter;
  Vast `cuda_vers>=… driver_version>=…`), else the container can land on an old-driver
  host and fail to use the GPU.
- Launch mode: `CMD` runs as PID 1; SSH/Jupyter modes replace it → use `onstart`.
- Storage: RunPod volume at `/workspace` (set
  `HF_HOME=/workspace/.cache/huggingface`); Vast Volume at `/data` (host-pinned, lost if
  the host vanishes); Vast container disk is wiped on destroy. Treat the HF cache as a
  rebuildable warm cache.

| Risk | Mitigation |
|---|---|
| HF image is very large; slow first build | Expected; layer on core for cache reuse; document build time. |
| macOS Docker can’t exercise GPU paths | Validate CPU-capable nodes only; GPU validation deferred to the cloud iteration. |
| First-run model download latency | Healthcheck start-period; hf-cache volume persists models across restarts. |
| Token mistakenly left unset in a cloud deploy | Document that unset = open; the cloud iteration must require it (and a tunnel/TLS). |
| Bridge reconnects to a stale URL after instance change | Known gap; re-resolution is deploy-iteration work, recorded in §8.2. |
| `nodetool run` couldn't route to the worker | The CLI/DSL runners now wire the Python bridge (`connectPythonBridgeForGraph`), so `nodetool workflows run` / `nodetool run` execute Python nodes against a remote (or local stdio) worker (§5.5). |

- First-class worker deploy target in `packages/deploy` (build → push → run → wire).
- Runtime `NODETOOL_WORKER_URL` resolution + reconnect-with-re-resolution.
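
As a rough illustration of the runtime URL resolution listed above, the sketch below builds a worker URL from the RunPod variables named in the cloud constraints. It is a sketch under stated assumptions, not the planned implementation: the function name `resolve_runpod_worker_url` is hypothetical, the mapping passed in stands in for whatever the provider API or pod environment reports, and the `RUNPOD_TCP_PORT_<internal port>` suffix and the `wss` default are assumptions.

```python
from typing import Mapping


def resolve_runpod_worker_url(
    env: Mapping[str, str], internal_port: int = 7777, scheme: str = "wss"
) -> str:
    """Build the worker URL from the public IP and the external port mapping.

    Assumption: the external port for the worker is reported under
    RUNPOD_TCP_PORT_<internal_port>.
    """
    host = env.get("RUNPOD_PUBLIC_IP", "").strip()
    port_text = env.get(f"RUNPOD_TCP_PORT_{internal_port}", "").strip()
    if not host or not port_text:
        # Both values are only known after the instance has booted.
        raise LookupError("public IP or external port not available yet")
    if not port_text.isdigit() or not 0 < int(port_text) < 65536:
        raise ValueError(f"unexpected external port: {port_text!r}")
    return f"{scheme}://{host}:{int(port_text)}"


# Example values only: 203.0.113.7 is a documentation address (TEST-NET-3).
reported = {"RUNPOD_PUBLIC_IP": "203.0.113.7", "RUNPOD_TCP_PORT_7777": "40123"}
url = resolve_runpod_worker_url(reported)
```

Calling a resolver like this on every reconnect attempt, instead of caching the first result, is one way to address the stale-URL risk, since the instance and its port mapping can change between connections.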