Question 1

What is Thermal9?

Accepted Answer

Thermal9 is a real-time HTTP service that sits inline with a GPU scheduler (Slurm, Kubernetes, or a neocloud stack). Before each workload placement, the scheduler asks Thermal9 whether the placement is thermally safe. The engine returns ALLOW or DENY, with a full decision proof, at a median latency under 1.2 milliseconds. It uses no machine learning and makes no probabilistic safety calls; every decision is fully auditable, and the service is fail-closed.

Question 2

How fast is Thermal9 in production?

Accepted Answer

Measured on a 32-node DGX H100 SuperPOD reference profile, with a single-threaded HTTP service on commodity x86: 0.917 ms minimum, 1.17 ms median, 2.14 ms p90, 2.32 ms p95, 3.53 ms p99, and 1.37 ms mean. A typical scheduling decision has a 50 ms end-to-end latency budget; Thermal9's p99 uses about 7 percent of that budget (3.53 ms of 50 ms).
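For readers who want to run the same arithmetic on their own measurements, here is a minimal Python sketch. It summarizes a list of latency samples and computes the share of the 50 ms budget used by one call. The function names are illustrative, and the percentile method (the standard library's "inclusive" interpolation) is an assumption; the method behind the published figures is not stated here.

```python
import statistics

BUDGET_MS = 50.0  # end-to-end scheduling budget quoted in the answer above


def budget_share(latency_ms: float, budget_ms: float = BUDGET_MS) -> float:
    """Percentage of the latency budget consumed by one call."""
    return 100.0 * latency_ms / budget_ms


def summarize(samples_ms: list[float]) -> dict[str, float]:
    """Min, median, p90, p95, p99 and mean of measured latencies, in ms."""
    # quantiles(n=100) returns 99 cut points; index k-1 is the k-th percentile.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {
        "min": min(samples_ms),
        "median": statistics.median(samples_ms),
        "p90": cuts[89],
        "p95": cuts[94],
        "p99": cuts[98],
        "mean": statistics.fmean(samples_ms),
    }


# Share of the budget used by the quoted p99 of 3.53 ms.
print(round(budget_share(3.53), 1))
```

Feeding `summarize` the per-call timings from your own benchmark run gives figures in the same shape as the ones quoted above.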

Question 3

How does Thermal9 verify against public NVIDIA data?

Accepted Answer

Reference profiles are derived directly from public NVIDIA datasheets and cross-checked against the independent mlco2/impact public GPU TDP dataset (MIT license, 49 GPU entries citing NVIDIA datasheets). On DGX H100, H200, and A100 SuperPOD profiles, the per-GPU TDP deviation is 0.0 percent against the public reference. The result is reproducible by anyone with the bundled kit and one CLI command.
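The deviation check itself is simple arithmetic. The Python sketch below shows one way to compute a per-GPU TDP deviation in percent. It is not the bundled kit's CLI, and the wattages and dictionary keys are illustrative placeholders, not values read from the kit or from the mlco2/impact dataset.

```python
def tdp_deviation_percent(profile_w: float, reference_w: float) -> float:
    """Absolute deviation of a profile TDP from the reference TDP, in percent."""
    if reference_w <= 0:
        raise ValueError("reference TDP must be positive")
    return 100.0 * abs(profile_w - reference_w) / reference_w


# Placeholder values for illustration only (watts per GPU).
profile_tdp_w = {"H100 SXM": 700.0, "A100 SXM4": 400.0}
reference_tdp_w = {"H100 SXM": 700.0, "A100 SXM4": 400.0}

for gpu, watts in profile_tdp_w.items():
    deviation = tdp_deviation_percent(watts, reference_tdp_w[gpu])
    print(f"{gpu}: {deviation:.1f} percent")
```

A deviation of 0.0 percent means the profile and the reference list the same TDP for that GPU.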

Question 4

What thermal events does Thermal9 prevent?

Accepted Answer

On a 256-GPU DGX H100 SuperPOD with a 1-hour synthetic workload trace replayed under industry-baseline scheduling, the workload produced 1,104 thermal limit violations and 6 throttle events. Replayed under Thermal9 admission control, the same workload produced 0 thermal limit violations and 0 throttle events: a 100 percent reduction across all 1,110 events (1,104 violations plus 6 throttle events).

Question 5

What is the resource footprint?

Accepted Answer

Approximately 20 MB RAM, one CPU core, no GPU required. Thousands of placement decisions per second per core. A small VM or container is plenty. Restart safety: profile loads in under 100 ms at startup. Network: deploy on the same datacenter network as the scheduler; not exposed to the public internet.

Question 6

How does Thermal9 integrate with my scheduler?

Accepted Answer

Thermal9 is an HTTP service. Your scheduler sends a POST /evaluate request with the workload and candidate node, the engine runs in under 1.2 ms median, and returns a JSON response with decision (ALLOW or DENY), gate trace, admission score, and minimum headroom. Drop-in for Slurm, Kubernetes, and neocloud stacks.
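As a rough illustration of the call from the scheduler's side, here is a minimal Python sketch using only the standard library. The base URL, the request field names (`workload`, `node`), and the response key `decision` are assumptions made for this example; check the actual API reference for the real schema. Treating every error as a denial is one way to keep the caller fail-closed.

```python
import json
import urllib.request

# Hypothetical address; the real host and port depend on your deployment.
BASE_URL = "http://thermal9.internal:8080"


def build_evaluate_request(workload: dict, node: str) -> urllib.request.Request:
    """Build the POST /evaluate request (field names are assumed)."""
    body = json.dumps({"workload": workload, "node": node}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/evaluate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_decision(raw: bytes) -> bool:
    """Return True only for an explicit ALLOW; anything else counts as DENY."""
    reply = json.loads(raw)
    return isinstance(reply, dict) and reply.get("decision") == "ALLOW"


def is_placement_allowed(workload: dict, node: str, timeout_s: float = 0.05) -> bool:
    """Ask the engine; timeouts, network errors and bad JSON are denials."""
    request = build_evaluate_request(workload, node)
    try:
        with urllib.request.urlopen(request, timeout=timeout_s) as response:
            return parse_decision(response.read())
    except (OSError, ValueError):
        # URLError and socket timeouts are OSError; JSON errors are ValueError.
        return False
```

The 50 ms timeout is an arbitrary choice for the sketch; pick a value that fits your scheduler's own latency budget.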

Question 7

Is there a way to evaluate Thermal9 before deployment?

Accepted Answer

Yes. We offer a free 30-day shadow assessment to qualified facilities. You capture 14 to 30 days of nvidia-smi or DCGM telemetry, with under 5 minutes of setup per node, no agents, and no scheduler integration. We deliver a full HTML and JSON report identifying every thermally unsafe placement during the window, with stranded-capacity dollar figures based on published cloud rates. The assessment is read-only; Thermal9 does not sit in your scheduling path during the assessment window.
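To give a sense of what agentless capture can look like, the Python sketch below polls nvidia-smi once and parses its CSV output. The `--query-gpu` and `--format=csv,noheader,nounits` options are standard nvidia-smi flags, but the choice of fields and the `GpuSample` record are assumptions for this example; the assessment may ask for a different field set or format.

```python
import csv
import io
import subprocess
from typing import NamedTuple

# Assumed field set; adjust to whatever the assessment instructions request.
QUERY_FIELDS = "timestamp,index,temperature.gpu,power.draw"


class GpuSample(NamedTuple):
    timestamp: str
    gpu_index: int
    temperature_c: float
    power_w: float


def parse_samples(csv_text: str) -> list[GpuSample]:
    """Parse nvidia-smi CSV rows, skipping malformed or non-numeric rows."""
    samples = []
    for row in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if len(row) != 4:
            continue
        ts, idx, temp, power = (cell.strip() for cell in row)
        try:
            samples.append(GpuSample(ts, int(idx), float(temp), float(power)))
        except ValueError:
            continue  # a field was not numeric, e.g. an unsupported sensor
    return samples


def capture_once() -> list[GpuSample]:
    """Run nvidia-smi once and return one sample per GPU on this node."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY_FIELDS}",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_samples(result.stdout)
```

Calling `capture_once()` on a schedule (cron, a systemd timer, or nvidia-smi's own `-l` loop option) and appending the rows to a file is enough to build up a telemetry window without installing an agent.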

Thermal9

Numbers, with a way to check them.

1,000 sequential evaluate calls. Single-threaded. Commodity x86.

Same workload trace. Same SuperPOD. Played twice.

Validated against public NVIDIA data.

One HTTP call per placement decision.
