NE
NeuralWatt
Serverless
Hosted, OpenAI-compatible inference with energy-based pricing: pay a flat $5.00/kWh for actual GPU energy consumed (up to 95% cheaper on efficient models) instead of per-token, with real per-request energy metrics on every response. Standard per-token pricing and kWh-based monthly subscriptions ($20/$50/$100) are also available. Powered by Neuralwatt Optimize, also available for self-hosted vLLM via Neuralwatt Deploy.
Features
OpenAI Compatible
Streaming
Batching
Fine-tuning
Embeddings
Vision
Audio
Function Calling
JSON Mode
Compliance
SOC 2
HIPAA
GDPR
Pricing Model
per tokenDetails
Models 5
API Base
https://api.neuralwatt.com/v1 LLM Inference
Model Catalog (5)
| Model | Type | Input $/1M | Output $/1M | Context | Speed | Status |
|---|---|---|---|---|---|---|
| GLM 5.2 Z.ai | llm | $1.45 | $4.50 | 1.0M | — | |
| Kimi K2.6 Moonshot AI | llm | $0.690 | $3.22 | 262k | — | |
| Kimi K2.7 Code Moonshot AI | code | $0.950 | $4.00 | 262k | — | |
| Qwen3.5 397B A17B Alibaba | llm | $0.690 | $4.14 | 262k | — | |
| Qwen3.6 35B A3B Alibaba | llm | $0.290 | $1.15 | 131k | — |