LLM Gateways, Gateway Layer, NuClide Stack

What it is

An LLM gateway is a reverse proxy for model APIs. The operator wires up keys for OpenAI, Anthropic, Google, Mistral, their own Ollama box, and a handful of fine-tunes; the gateway exposes a single OpenAI-compatible endpoint and handles routing, rate-limiting, fallback, observability, and cost accounting. LiteLLM is the Python-native one (most common in research); OneAPI is the Go/Chinese-ecosystem one (most common in commercial deployments). Portkey, Helicone-Proxy, and APISIX-AI sit in the same niche.
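To make the routing-and-fallback idea concrete, here is a minimal sketch, not any particular gateway's implementation. The `Route` type, the `complete` function, and the two stub providers are hypothetical names introduced for illustration; a real gateway would call upstream HTTP APIs where the stubs sit.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


class ProviderError(Exception):
    """Raised by a backend when it cannot serve the request."""


@dataclass
class Route:
    alias: str            # model name the client asks for
    backends: List[str]   # provider names to try, in order


def complete(route: Route, prompt: str,
             providers: Dict[str, Callable[[str], str]]) -> str:
    """Try each backend for the alias in order; return the first success."""
    errors = []
    for name in route.backends:
        try:
            return providers[name](prompt)
        except ProviderError as exc:
            errors.append(f"{name}: {exc}")
    raise ProviderError("all backends failed: " + "; ".join(errors))


# Stub providers standing in for real upstream API calls (hypothetical).
def flaky_primary(prompt: str) -> str:
    raise ProviderError("429 rate limited")


def local_fallback(prompt: str) -> str:
    return f"[local] echo: {prompt}"


providers = {"primary": flaky_primary, "local": local_fallback}
route = Route(alias="chat-default", backends=["primary", "local"])
result = complete(route, "hello", providers)
print(result)
```

The client only ever sees the alias `chat-default`; which provider actually answered is a gateway-side decision, which is also why the gateway ends up holding every upstream key.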

What goes wrong

The gateway holds the operator’s entire AI billing relationship. If it’s exposed without auth, an attacker can route arbitrary prompts through any of the configured providers: burning the operator’s quota, exfiltrating embedded prompts that may contain customer data, and racking up usage charges on premium models. Worse: the admin panel typically lists every model alias, the keys behind them, and the per-user/per-team budget. The attacker learns the operator’s whole AI org chart before issuing a single request.

How we test

We confirm the gateway by its /v1/models response shape (LiteLLM’s is distinct from a vanilla OpenAI proxy), then check /health/readiness and /key/info to see whether admin-level endpoints answer without a key. The key endpoint, when unauthenticated, returns the operator’s full virtual-key inventory including budget caps and team assignments. We do not issue paid completions. The model catalogue and key inventory are enough to demonstrate the quota-drain risk and identify the operator.
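The triage logic can be sketched as a pure function over already-collected responses, which is also a convenient way for an operator to check their own deployment. This is an illustrative sketch under stated assumptions: `Observation`, `assess`, and the finding labels are hypothetical names, and the shape check only recognises the generic OpenAI-style model listing. It does not attempt the LiteLLM-specific fingerprint, since the text does not spell out what distinguishes it.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Read-only paths named in the text; no completion endpoint is included.
PROBE_PATHS = ("/v1/models", "/health/readiness", "/key/info")


@dataclass
class Observation:
    status: int                   # HTTP status code returned
    body: Optional[dict] = None   # parsed JSON body, if any


def looks_like_model_list(body: Optional[dict]) -> bool:
    """OpenAI-style listing: {"object": "list", "data": [{"id": ...}, ...]}."""
    if not isinstance(body, dict):
        return False
    data = body.get("data")
    return isinstance(data, list) and all(
        isinstance(m, dict) and "id" in m for m in data
    )


def assess(observations: Dict[str, Observation]) -> str:
    """Classify a host from unauthenticated, read-only responses."""
    models = observations.get("/v1/models")
    if models is None:
        return "unrecognised"
    if models.status in (401, 403):
        return "auth-enforced"
    if models.status != 200 or not looks_like_model_list(models.body):
        return "unrecognised"
    keys = observations.get("/key/info")
    if keys is not None and keys.status == 200:
        return "key-inventory-exposed"
    return "models-exposed"
```

Keeping the classifier separate from the HTTP client means the same rules can be replayed over stored survey data without touching the host again.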

Research

Cross-cloud surveys

Cat-05: LiteLLM Gateway Survey — Open Proxies Exposing Commercial LLM API Keys

AI Gateways Population Survey: Cat-32 (2026-06-01)

LLM Safety / Guardrail / Policy Engine population survey

LLM gateway / proxy population survey, 2026-05-17

VisorBishop iter-5: LiteLLM Proxy + Argilla + Promptfoo (gateway + annotation + eval tiers)

VisorBishop iter-6: Full LiteLLM 5,391-host population sweep (283 unauth LLMjacking primitives)

VisorBishop Phase 5: Three primitives that turn 492 critical hosts into an impact narrative

LLM Gateways / OpenAI-Compatible Proxies: Cross-Cloud Survey (2026-05)

Field cases

Chinese commercial Claude-reseller ecosystem: 32 pooled Anthropic accounts across six relays, ~13.92B tokens served via claude-relay-service OSS

Hetzner LiteLLM proxy fronting Ollama-cpu + 4 RunPod GPU pods, fully unauth (65.108.197.157)

Coordinated disclosures

1. Enable LiteLLM's master-key auth (one env var):

LiteLLM example: set master key + virtual keys per consumer

In LiteLLM config or env:
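The remediation named above can be sketched as follows. This is a hedged illustration rather than the text of any disclosure: it assumes LiteLLM reads its master key from the `LITELLM_MASTER_KEY` environment variable and expects an `sk-` prefix, and that per-consumer virtual keys are requested from the proxy's `/key/generate` endpoint with fields such as `key_alias`, `models`, and `max_budget`. Verify those names against the documentation for the deployed version. The helper functions are hypothetical, and nothing here sends a request.

```python
import secrets
from typing import Dict, List


def new_master_key(nbytes: int = 32) -> str:
    """Random admin key with the sk- prefix (assumed LiteLLM convention)."""
    return "sk-" + secrets.token_urlsafe(nbytes)


def env_line(key: str) -> str:
    """Line to place in the proxy's environment file."""
    return f"LITELLM_MASTER_KEY={key}"


def virtual_key_request(consumer: str, models: List[str],
                        max_budget_usd: float) -> Dict[str, object]:
    """JSON body for a per-consumer key request (field names assumed)."""
    return {
        "key_alias": consumer,
        "models": models,
        "max_budget": max_budget_usd,
    }


key = new_master_key()
print(env_line(key))
print(virtual_key_request("team-search", ["chat-default"], 50.0))
```

Issuing one virtual key per consumer, each with its own model list and budget, limits what a single leaked key can spend and keeps the master key out of client configuration entirely.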

Other categories in this layer

RAG Frameworks

Rerankers