Run models locally, through cloud APIs with your own keys, or both in the same graph.
Local vs. cloud
Local
- Data stays on your disk
- Free after download
- Works offline
- 4–20 GB per model; needs a capable GPU or Apple Silicon
Cloud (BYOK)
- No download
- Runs on the provider’s hardware
- Latest model releases
- Billed by the provider, at the provider’s price — no NodeTool markup
- Requires internet; data goes to the provider
Mixed
Pick the best provider per node:
- ASR (Whisper) — local for sensitive audio
- Image generation — Flux locally for control, FAL/KIE cloud for speed
- Document processing — local for confidential files
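The per-node choice above can be pictured as a small routing rule. The sketch below is illustrative only: `NodePlan`, the task names, and `pick_provider` are hypothetical and not part of NodeTool's API. It just encodes the rule of thumb that sensitive inputs stay local and that cloud is chosen when speed matters most.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodePlan:
    """Hypothetical description of one node in a mixed graph."""
    task: str                   # e.g. "asr", "image_generation"
    sensitive: bool             # True if the input must not leave the machine
    prefer_speed: bool = False  # True if turnaround matters more than control


def pick_provider(node: NodePlan) -> str:
    """Return "local" or "cloud" for a node, following the rules of thumb above."""
    if node.sensitive:
        return "local"   # sensitive audio and confidential files stay on disk
    if node.prefer_speed:
        return "cloud"   # e.g. FAL/KIE for fast image generation
    return "local"       # default to local for control


graph = [
    NodePlan("asr", sensitive=True),
    NodePlan("image_generation", sensitive=False, prefer_speed=True),
    NodePlan("document_processing", sensitive=True),
]
placement = {node.task: pick_provider(node) for node in graph}
print(placement)
```

In NodeTool itself this decision is made by picking a provider in each node's model dropdown; the code only makes the reasoning explicit.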
Cloud models
Available through provider nodes:
Top 3D Generation Models
| Model | Provider | Capabilities | Key Features |
|---|---|---|---|
| Hunyuan3D V2/3.0 | Hunyuan | T2M/I2M | High-quality 3D meshes and textures |
| Trellis 2 | Trellis | T2M/I2M | Consistent geometry with textured output |
| TripoSR | Tripo | I2M | Fast image-to-3D reconstruction |
| Shap-E | OpenAI | T2M/I2M | Text or image prompt to 3D assets |
| Point-E | OpenAI | T2M | Point cloud generation |
| Meshy AI | Meshy | T2M/I2M | Textured mesh generation |
| Rodin AI | Rodin | T2M/I2M | High fidelity 3D creation |
Top Video Generation Models
| Model | Provider | Capabilities | Key Features |
|---|---|---|---|
| OpenAI Sora 2 Pro | OpenAI | T2V/I2V up to 15s | Realistic motion, refined physics, synchronized native audio, 1080p output |
| Google Veo 3.1 | Google | T2V/I2V with references | Upgraded realistic motion, extended clip length, multi-image references, native 1080p with synced audio |
| ByteDance Seedance 2.0 | ByteDance | T2V/I2V | High-quality cinematic video with stable characters and smooth motion |
| Runway Gen-3 Alpha / Aleph | Runway | T2V/I2V/Extend | Professional-grade video generation with precise motion control and high fidelity |
| Luma | Luma AI | Video editing/modification | AI-powered video modification and creative editing |
| xAI Grok Imagine | xAI | T2V/I2V/T2I | Multimodal text/image to short video with coherent motion and synchronized audio; also text-to-image |
| Alibaba Wan 2.6 | Alibaba | Multi-shot T2V/I2V | Affordable 1080p with stable characters and native audio; reference-guided generation |
| MiniMax Hailuo 2.3 | MiniMax | High-fidelity T2V/I2V | Expressive characters, complex motion and lighting effects |
| Kling 3.0 | Kuaishou | T2V/I2V with audio | Text/image to synchronized video with speech, ambient sound, and effects; strong audio-visual coherence |
Top Image Generation Models
| Model | Provider | Capabilities | Key Features |
|---|---|---|---|
| Black Forest Labs FLUX.2 | Black Forest Labs | T2I with control | Photoreal images, multi-reference consistency, accurate text rendering, flexible control |
| Google Nano Banana 2.0 | Google | High-res T2I/Edit | Sharper 2K output, 4K upscaling, improved text rendering, better character consistency |
| GPT Image 2 | OpenAI | T2I/Edit | High-quality photorealistic generation and instruction-based editing |
| Ideogram V3 | Ideogram | T2I/Edit | Exceptional typography rendering and artistic style control |
| Z-Image Turbo | Z-AI | T2I | Fast high-quality text-to-image with strong prompt adherence |
| Seedream 4.5 | ByteDance | T2I/Edit | High-fidelity image generation and instruction-based editing |
Top Music & Audio Generation Models
| Model | Provider | Capabilities | Key Features |
|---|---|---|---|
| Suno | Suno | Music generation/extension | Full song creation from text prompts, with extend, cover, and remix features |
| ElevenLabs V3 Dialogue | ElevenLabs | Text-to-dialogue | Multi-speaker dialogue generation with emotional control |
Using These Models
Access these models through NodeTool’s generic nodes:
- For Video: Use `nodetool.video.TextToVideo` or `nodetool.video.ImageToVideo`
- For Images: Use `nodetool.image.TextToImage`
- For 3D: Use `nodetool.3d.TextTo3D` or `nodetool.3d.ImageTo3D`
- For Music: Use kie.ai-backed Suno nodes (Suno Generate, Extend, Cover)
- Select Provider: Click the model dropdown in the node properties
- Configure API: Add provider API keys in Settings → Providers
Access via kie.ai (recommended for broad model support): Many of these models are available through kie.ai, an AI provider aggregator that often offers competitive or lower pricing compared to upstream providers.
- Configure using `KIE_API_KEY` in Settings → Providers
Access via fal.ai:
- Configure using `FAL_API_KEY` in Settings → Providers
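As a quick sanity check before running a cloud-backed workflow, it can help to see which keys are actually set. The sketch below takes the key names `KIE_API_KEY` and `FAL_API_KEY` from the text above, but it assumes they are also exported as environment variables with the same names, which this page does not state. `configured_providers` is a hypothetical helper, not a NodeTool function.

```python
import os

# Key names come from the text above; reading them from the environment
# (rather than from Settings → Providers) is an assumption of this sketch.
PROVIDER_KEYS = {"kie.ai": "KIE_API_KEY", "fal.ai": "FAL_API_KEY"}


def configured_providers(env=None):
    """Return the providers whose API key is present and non-empty."""
    env = os.environ if env is None else env
    return [
        name
        for name, var in PROVIDER_KEYS.items()
        if env.get(var, "").strip()  # treat blank values as missing
    ]


print(configured_providers())
```

Passing a plain dictionary instead of `os.environ` makes the helper easy to try without touching real credentials.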
Cost Considerations: Cloud models typically charge per generation. Check each provider’s pricing before extensive use. Local models are free after download but require capable hardware.
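Because billing is per generation, a rough budget is a single multiplication. The sketch below uses a made-up price purely for illustration; it is not a real quote from any provider, and `estimate_cloud_cost` is a hypothetical helper.

```python
def estimate_cloud_cost(generations: int, price_per_generation: float) -> float:
    """Estimate total spend for a batch of cloud generations."""
    if generations < 0 or price_per_generation < 0:
        raise ValueError("generations and price must be non-negative")
    # Round to cents so floating-point noise does not leak into the result.
    return round(generations * price_per_generation, 2)


# Hypothetical price: 0.05 per generation. Check the provider's pricing page
# for the actual figure before relying on an estimate like this.
print(estimate_cloud_cost(generations=200, price_per_generation=0.05))
```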
Getting Started
Option 1: Start with Local Models (Recommended)
- Open Models → Model Manager in NodeTool
- Install these starter models:
- GPT-OSS (~4 GB) – Text generation and chat
- Flux (~12 GB) – High-quality image generation
- Wait for downloads to complete
- Run templates – they’ll work offline!
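Before starting the downloads, it may be worth confirming there is enough free disk space. The sketch below uses the approximate sizes listed above (about 4 GB and 12 GB). The checked path `"."` and the 2 GB headroom are assumptions, since this page does not say where NodeTool stores models; point `path` at the actual model directory on your system.

```python
import shutil

# Approximate sizes from the list above, in gigabytes.
STARTER_MODELS_GB = {"GPT-OSS": 4, "Flux": 12}


def has_room_for(models_gb, path=".", headroom_gb=2.0):
    """Return True if `path` has space for all models plus some headroom."""
    needed_bytes = (sum(models_gb.values()) + headroom_gb) * 1024**3
    return shutil.disk_usage(path).free >= needed_bytes


total_gb = sum(STARTER_MODELS_GB.values())
print(f"Starter models need about {total_gb} GB")
print("Enough free space:", has_room_for(STARTER_MODELS_GB))
```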
Option 2: Start with Cloud Providers
- Get an API key from a provider such as OpenAI, Anthropic, or Google
- In NodeTool, go to Settings → Providers
- Paste your API key
- Select the provider when using AI nodes
Detailed Guides
General
- Model Manager – Download and manage AI models
- Getting Started – First workflow
Local AI
- Supported Models – List of local models (llama.cpp, MLX, Whisper, Flux)
Cloud AI
- Providers Guide – Set up OpenAI, Anthropic, Google
- HuggingFace Integration – Access 500,000+ models
Advanced
- Proxy & Self-Hosted – Secure deployments
- Deployment Guide – Cloud infrastructure