NeuralWatt

Serverless

Hosted, OpenAI-compatible inference with energy-based pricing: pay a flat $5.00/kWh for actual GPU energy consumed (up to 95% cheaper on efficient models) instead of per-token, with real per-request energy metrics on every response. Standard per-token pricing and kWh-based monthly subscriptions ($20/$50/$100) are also available. Powered by Neuralwatt Optimize, also available for self-hosted vLLM via Neuralwatt Deploy.

Features

OpenAI Compatible

Streaming

Batching

Fine-tuning

Embeddings

Vision

Audio

Function Calling

JSON Mode

Compliance

SOC 2

HIPAA

GDPR

Pricing Model

per token

Details

Models 5

API Base https://api.neuralwatt.com/v1

LLM Inference

Model Catalog (5)

Model	Type	Input $/1M	Output $/1M	Context	Speed
GLM 5.2 Z.ai	llm	$1.45	$4.50	1.0M	—
Kimi K2.6 Moonshot AI	llm	$0.690	$3.22	262k	—
Kimi K2.7 Code Moonshot AI	code	$0.950	$4.00	262k	—
Qwen3.5 397B A17B Alibaba	llm	$0.690	$4.14	262k	—
Qwen3.6 35B A3B Alibaba	llm	$0.290	$1.15	131k	—