NE

NeuralWatt

Serverless

Hosted, OpenAI-compatible inference with energy-based pricing: pay a flat $5.00/kWh for actual GPU energy consumed (up to 95% cheaper on efficient models) instead of per-token, with real per-request energy metrics on every response. Standard per-token pricing and kWh-based monthly subscriptions ($20/$50/$100) are also available. Powered by Neuralwatt Optimize, also available for self-hosted vLLM via Neuralwatt Deploy.

Features

OpenAI Compatible
Streaming
Batching
Fine-tuning
Embeddings
Vision
Audio
Function Calling
JSON Mode

Compliance

SOC 2
HIPAA
GDPR

Pricing Model

per token

Details

Models 5
API Base https://api.neuralwatt.com/v1
LLM Inference

Model Catalog (5)

Model Type Input $/1M Output $/1M Context Speed Status
GLM 5.2
Z.ai
llm $1.45 $4.50 1.0M
Kimi K2.6
Moonshot AI
llm $0.690 $3.22 262k
Kimi K2.7 Code
Moonshot AI
code $0.950 $4.00 262k
Qwen3.5 397B A17B
Alibaba
llm $0.690 $4.14 262k
Qwen3.6 35B A3B
Alibaba
llm $0.290 $1.15 131k