24. LLM Safety / Guardrails / Policy Engines / Moderation, NuClide Reference

Section created: 2026-05-19. Companion to §23 (eval / red-team self-hosted). This section covers the guardrail and policy layer that sits in front of / after LLM calls, plus content-moderation platforms used as conversational safety filters.

The category is split by deployment mode:

Subclass	Examples	Deployment mode	Shodan visibility
LLM-native guardrails (self-hostable)	Guardrails AI, NeMo Guardrails, Lakera Guard self-hosted, LlamaGuard (deployed via TGI/vLLM/Ollama), Garak REST	Self-hosted HTTP server	Direct, T1/T2
General-purpose policy engines	Open Policy Agent (OPA) on port 8181, Styra DAS Edge agent	Self-hosted HTTP server	Direct, T2
LLMOps / observability with safety dimension	W&B Weave, Humanloop, Gantry (now shut down), LangSmith (caller-side)	SaaS-mostly	Indirect (caller-side dorks against apps using them)
Content moderation (pre-LLM-era, now used as filters)	Spectrum Labs, ActiveFence, Two Hat (Microsoft), Hive Moderation	SaaS-only	Indirect (caller-side dorks against apps using them)
AI governance / red-team commercial	CalypsoAI, Protect AI, HiddenLayer, Robust Intelligence	SaaS-mostly	Indirect (caller-side dorks against customer-deployed apps)

Methodology lesson from §23 (carried forward): Single-word substring matching on response bodies ("garak", "guardrails", "weave") fires on Japanese anime filenames, common English words, and unrelated platforms at population scale, so conjunctive matching is required. Queries below are field-scoped (http.html, http.title, hostname, ssl.cert.subject.cn, port) wherever possible; bare-string dorks are documented but marked (noisy).
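The conjunctive rule can be applied to already-harvested response bodies as well as to queries. The following is a minimal sketch, assuming bodies are available as strings; the function name `matches_conjunctively` is hypothetical, and the marker pair is taken from the Lakera conjunct listed later in this section.

```python
from typing import Iterable


def matches_conjunctively(body: str, required: Iterable[str]) -> bool:
    """Return True only when every marker occurs in the body (case-insensitive).

    An empty marker list never matches, so a misconfigured rule cannot
    silently accept every response.
    """
    lowered = body.lower()
    markers = [m.lower() for m in required if m]
    return bool(markers) and all(m in lowered for m in markers)


# Marker pair mirrors the dork http.html:"lakera-guard" http.html:"flagged".
LAKERA_MARKERS = ("lakera-guard", "flagged")

if __name__ == "__main__":
    sample = '{"model": "lakera-guard", "flagged": false}'
    print(matches_conjunctively(sample, LAKERA_MARKERS))
```

A single marker such as "weave" would match any page that happens to contain the word; requiring a second, product-specific marker is what removes that noise class.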

1. LLM-Native Guardrails (self-hostable)

Guardrails AI (`guardrails serve`)

The open-source guardrails-ai package ships a server mode. Default port 8000. Validates LLM output against operator-defined “guards” (Pydantic-style contracts).

Shodan Query	Notes
`http.html:"guardrails ai"`	Product-name body match. Verified 2026-05-19: 6 hits.
`http.html:"/guards"`	Endpoint path. 1,048 hits, noisy (matches any `/guards/` UI route in any app). Use as candidate set; verify via `/api/guards` response shape (JSON array).
`http.html:"validate_using_guards"`	Package-specific helper string.
`http.html:"guardrails-api"`	Alternative package identifier.
`http.html:"guardrails-ai"`	Hyphenated variant.
`http.html:"guardrails_server"`	Server-mode identifier.
`http.html:"guardrails-server"`	Alternate.
`http.html:"from guardrails import"`	Python import surfaced in code-display routes.
`http.html:"@guardrails.com"`	Guardrails Hub email convention.
`http.html:"hub.guardrailsai.com"`	Guardrails Hub URL in customer apps.
`http.html:"validate-many"`	Server route.
`http.html:"GuardrailsValidator"`	Validator class name.
`port:8000 "guardrails"`	Port + bare string (noisy).
`port:8000 http.html:"validators"`	OPA-similar route on default port.
`port:8080 http.html:"guardrails"`	Alt port.
`port:5000 http.html:"guardrails"`	Alt port.
`port:443 http.html:"guardrails"`	TLS-fronted.
`hostname:"guardrails"`	rDNS pattern.
`hostname:"guardrails-ai"`	Hyphenated rDNS.
`ssl.cert.subject.cn:"guardrails"`	TLS cert CN.
`ssl.cert.subject.cn:"guardrails-ai"`	TLS cert CN variant.
`ssl.cert.subject.cn:"guardrailsai"`	Vendor CN.
`org:"Guardrails AI"`	Shodan ORG-tag if assigned.
`http.headers.x-powered-by:"guardrails"`	X-Powered-By header.

Stage 2 verify probe: GET /api/guards returns JSON array of guard definitions when present. GET /openapi.json returns FastAPI schema with /guards route family. Both confirm Guardrails AI server vs the /guards/ noise class.
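The two Stage 2 checks can be expressed as pure functions over response bodies that have already been fetched. This is a sketch under stated assumptions: the function names are hypothetical, "guard definitions" are assumed to be JSON objects, and the OpenAPI check only looks for path keys beginning with /guards, as described above.

```python
import json
from typing import Any, Optional


def _load(text: str) -> Optional[Any]:
    """Parse JSON, returning None for HTML or otherwise invalid bodies."""
    try:
        return json.loads(text)
    except (json.JSONDecodeError, TypeError):
        return None


def is_guard_list(guards_body: str) -> bool:
    """True when the /api/guards body is a JSON array of objects."""
    parsed = _load(guards_body)
    return isinstance(parsed, list) and all(isinstance(g, dict) for g in parsed)


def has_guards_route_family(openapi_body: str) -> bool:
    """True when the OpenAPI schema declares /guards or /guards/... paths."""
    parsed = _load(openapi_body)
    paths = parsed.get("paths") if isinstance(parsed, dict) else None
    if not isinstance(paths, dict):
        return False
    return any(p == "/guards" or p.startswith("/guards/") for p in paths)
```

An HTML page from an unrelated app with a /guards/ UI route fails both checks, because neither body parses as the expected JSON shape.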

NeMo Guardrails (NVIDIA)

The `nemoguardrails server` command listens on port 8000 by default. The ecosystem is CLI-dominant; HTTP-server deployments are rare.

Shodan Query	Notes
`http.html:"nemo-guardrails"`	Package identifier in source. Verified 2026-05-19: 3 hits.
`http.html:"nemoguardrails"`	No-hyphen variant.
`http.html:"NeMo Guardrails"`	Product-name body match.
`http.html:"NVIDIA NeMo"`	NVIDIA family banner.
`http.html:"/v1/rails/configs"`	NeMo rails-config endpoint path. Was 0 hits 2026-05-19 (rare deployment).
`http.html:"/v1/rails/generate"`	NeMo rails-generate endpoint.
`http.html:"/v1/rails"`	Parent path.
`http.html:"colang"`	NeMo’s policy DSL name.
`http.html:".co"`	colang file extension reference.
`http.html:"jailbreak_detection"`	NeMo rail class.
`http.html:"facts.co"`	NeMo example rail filename.
`http.html:"hallucination_check"`	NeMo rail.
`port:8000 http.html:"rails"`	Default port + rails marker.
`port:8000 http.html:"colang"`	Port + DSL.
`port:8080 http.html:"nemo"`	Alt port + vendor.
`port:8443 http.html:"nemo"`	TLS-fronted.
`"NeMo Guardrails"`	Product-name any-field (noisy).
`hostname:"guardrails"`	rDNS pattern (shared with Guardrails AI; verify with platform-specific probe).
`hostname:"nemo"`	NVIDIA-NeMo rDNS.
`ssl.cert.subject.cn:"nemo"`	TLS cert CN.
`ssl.cert.subject.cn:"nemoguardrails"`	TLS cert CN exact.
`org:"NVIDIA"`	NVIDIA-deployed (broad).

Stage 2 verify probe: GET /v1/rails/configs returns JSON array of rail names. aimap fingerprint already present.

Lakera Guard (self-hosted variant)

Lakera’s commercial product is API-only; the self-hosted variant ships a Server: lakera header. Caller-side dorks find customer apps integrating the SaaS.

Shodan Query	Notes
`Server: lakera`	Header-based; high precision when matched. Verified 2026-05-19: 1 hit.
`http.html:"lakera-guard"`	Body marker. Verified 2026-05-19: 8 hits.
`http.html:"lakera"`	Vendor-name bare (broad, includes caller-side).
`http.html:"lakera.ai"`	Vendor domain in customer HTML.
`http.html:"api.lakera.ai"`	API URL in customer apps.
`http.html:"lakera-chrome"`	Lakera browser-extension reference.
`http.html:"/v1/guard"`	Lakera guard endpoint path.
`http.html:"/v2/guard"`	Lakera v2 API path.
`http.html:"/v1/prompt_injection"`	Lakera-specific endpoint.
`http.html:"prompt-injection-attack"`	Detection category name in Lakera responses.
`http.html:"jailbreak_attempt"`	Lakera category.
`http.html:"unknown_links"`	Lakera category.
`http.html:"relevant_language"`	Lakera category.
`http.html:"pii"`	Lakera category (broad; combine).
`http.html:"lakera-guard" http.html:"flagged"`	Body marker + Lakera response shape.
`http.html:"lakera_guard"`	Snake-case variant.
`port:8000 "lakera"`	Default port + vendor.
`port:8443 "lakera"`	TLS-fronted.
`port:443 http.html:"lakera"`	HTTPS.
`ssl.cert.subject.cn:"lakera"`	TLS cert CN.
`ssl.cert.subject.cn:"lakera-guard"`	TLS cert CN exact.
`ssl.cert.subject.cn:"lakera.ai"`	Vendor cert CN.
`hostname:"lakera"`	rDNS pattern.
`org:"Lakera"`	Shodan ORG.
`http.headers.x-powered-by:"lakera"`	Powered-by header.

Stage 2 verify probe: POST /v1/guard with empty body should return Lakera-specific error response. aimap fingerprint already present.

LlamaGuard (Meta: deployed via TGI / vLLM / Ollama)

LlamaGuard is a model, not a server, so discovery goes through the underlying inference server’s /v1/models response.

Shodan Query	Notes
`http.html:"Llama-Guard"`	Model name in HTML response. Was 0 hits 2026-05-19 (Shodan indexes JSON `/v1/models` responses sparsely).
`http.html:"meta-llama/Llama-Guard-3"`	Hugging Face model ID variant.
`http.html:"LlamaGuard"`	Camel-case variant.
`http.html:"unsafe_categories"`	LlamaGuard taxonomy term.

Side-channel discovery (recommended): re-query the /v1/models outputs from past LLM-Gateway surveys for the Llama-Guard model name. The model is server-agnostic, so its deployment population surfaces more efficiently in already-harvested LLM-Gateway corpora than via Shodan.
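The side-channel pass can be sketched as a filter over a harvested corpus. This assumes each stored /v1/models response follows the OpenAI-compatible shape (an object with a `data` list whose entries carry an `id`); the corpus layout (host mapped to parsed JSON) and the function names are hypothetical.

```python
import re
from typing import Any, Dict, List

# Covers the "Llama-Guard", "LlamaGuard" and "llama_guard" spellings.
_LLAMA_GUARD = re.compile(r"llama[-_ ]?guard", re.IGNORECASE)


def llama_guard_models(models_response: Any) -> List[str]:
    """Return model IDs in one /v1/models response that look like LlamaGuard."""
    if not isinstance(models_response, dict):
        return []
    entries = models_response.get("data")
    if not isinstance(entries, list):
        return []
    ids = [e.get("id") for e in entries if isinstance(e, dict)]
    return [i for i in ids if isinstance(i, str) and _LLAMA_GUARD.search(i)]


def hosts_serving_llama_guard(corpus: Dict[str, Any]) -> Dict[str, List[str]]:
    """Map each host in the corpus to its LlamaGuard model IDs, skipping misses."""
    hits: Dict[str, List[str]] = {}
    for host, response in corpus.items():
        found = llama_guard_models(response)
        if found:
            hits[host] = found
    return hits
```

Because the filter runs on stored responses, it adds no new traffic to the surveyed hosts.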

Garak REST (NVIDIA adversarial harness)

See §23. CLI-dominant; 0 confirmed at population scale 2026-05-04.

2. General-Purpose Policy Engines

Open Policy Agent (OPA)

The dominant general-purpose policy engine. `opa run -s` starts an HTTP server on port 8181 by default. Used as the central policy layer in K8s, microservice meshes, and increasingly AI tool-use authorization.

Shodan Query	Notes
`port:8181 http.status:200`	OPA default port; broad. Use with body filter.
`http.html:"/v1/policies"`	OPA REST API endpoint.
`http.html:"/v1/data"`	OPA data API.
`port:8181 "opa"`	Port + bare string (noisy).
`http.title:"OPA"`	Title-based; rare since OPA has no UI by default.
`product:"Open Policy Agent"`	Shodan product tag if indexed.
`hostname:"opa"`	rDNS pattern.
`ssl.cert.subject.cn:"opa"`	TLS cert CN.

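Stage 2 verify probe: GET /v1/policies returns a JSON object whose `result` array lists the loaded policy modules, each with its ID and raw source. GET /v1/data returns the policy data tree under the same `result` key. Either confirms OPA and reveals operator-authored policy structure.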

Risk class: policy data may include role assignments, allowed-action lists, tenant routing rules, AI-API quota policies. Reading /v1/data is read-only, but the policy structure itself is sensitive.
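For triage, the policy IDs alone are usually enough to confirm OPA without retaining the sensitive policy bodies. The sketch below assumes the /v1/policies body wraps its modules under a `result` key, as OPA's REST API documents, and also tolerates a bare array; the function name is hypothetical.

```python
import json
from typing import List


def opa_policy_ids(body_text: str) -> List[str]:
    """Extract policy IDs from a /v1/policies body, discarding raw policy text."""
    try:
        payload = json.loads(body_text)
    except (json.JSONDecodeError, TypeError):
        return []
    # Expected shape: {"result": [{"id": ..., "raw": ...}, ...]}
    modules = payload.get("result") if isinstance(payload, dict) else payload
    if not isinstance(modules, list):
        return []
    ids: List[str] = []
    for module in modules:
        if isinstance(module, dict) and isinstance(module.get("id"), str):
            ids.append(module["id"])
        elif isinstance(module, str):
            ids.append(module)
    return ids
```

A non-empty ID list confirms the host as OPA; an empty list on an HTML or error body leaves it in the unverified candidate set.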

Styra DAS Edge agent

Commercial OPA distribution. Self-hosted edge agent reports to a SaaS control plane.

Shodan Query	Notes
`http.html:"styra"`	Vendor-name body match.
`http.html:"styra-das"`	Product identifier.
`port:8181 http.html:"styra"`	OPA-port + Styra wrapper.

3. LLMOps / Observability with Safety Dimension

These platforms blend evaluation, tracing, and policy-style guardrails. They are mostly SaaS, so visibility comes through caller-side dorks.

W&B Weave (Weights & Biases)

LLM-call tracing with quality / safety gates. Hosted at wandb.ai/weave; some self-hosted exposure exists.

Shodan Query	Notes
`http.html:"wandb.ai/weave"`	Caller-side: apps that embed the W&B Weave dashboard URL.
`http.html:"weave-python"`	Package identifier.
`http.html:"weave-trace"`	Trace identifier in HTML.
`http.html:"weave_server"`	Server-mode identifier.
`http.html:"/weave/"`	Path-based; noisy (any app with `/weave/` route). Verified 2026-05-19: 1,032 hits, high FP suspicion.
`http.html:"wandb-weave"`	Package alternate.

Caller-side discovery: customer apps that mention W&B Weave in their HTML reveal which orgs are using it for LLM observability, useful for population mapping of the observability tier.

Humanloop

LLM app development with feedback loops + guardrail-like evaluation criteria. SaaS-primary.

Shodan Query	Notes
`http.html:"humanloop"`	Vendor-name body match.
`http.html:"app.humanloop.com"`	Caller-side: apps embedding Humanloop dashboard URL.
`http.html:"humanloop-python"`	Package identifier.
`ssl.cert.subject.cn:"humanloop"`	TLS cert CN.

Gantry

Observability + quality + safety policies for ML/LLM. Note: company shut down 2024; queries here for historical/forensic discovery only.

Shodan Query	Notes
`http.html:"gantry.io"`	Vendor-domain body match.
`http.html:"app.gantry.io"`	Caller-side dashboard URL.
`http.title:"Gantry"`	Noisy: gantry is a common word (shipping/manufacturing). Verified 2026-05-19: 44 hits, most unrelated.
`http.html:"/gantry-"`	Path prefix. Noisy: 2,229 hits on 2026-05-19, mostly unrelated.

LangSmith (LangChain observability + eval)

See §5 / §23. Already documented; carried here for cross-reference.

4. Content Moderation (pre-LLM-era, now used as filters)

SaaS-only platforms. Visibility comes through caller-side dorks that find apps integrating them.

Spectrum Labs

Shodan Query	Notes
`http.html:"spectrumlabsai.com"`	Caller-side.
`http.html:"spectrum-labs"`	Package / API identifier.

ActiveFence

Shodan Query	Notes
`http.html:"activefence.com"`	Caller-side.
`http.html:"activefence-api"`	API identifier.

Two Hat (Microsoft Azure Content Safety)

Microsoft acquired Two Hat; it is now integrated into Azure Content Safety.

Shodan Query	Notes
`http.html:"twohat.com"`	Caller-side (legacy).
`http.html:"contentsafety.cognitive.microsoft.com"`	Azure Content Safety endpoint (caller-side).

Hive Moderation

Shodan Query	Notes
`http.html:"hivemoderation.com"`	Caller-side.
`http.html:"thehive.ai"`	Vendor domain.
`http.html:"hive-api"`	API identifier.

5. AI Governance / Red-Team Commercial

SaaS-mostly; caller-side dorks find customer-deployed apps that integrate them.

CalypsoAI

Shodan Query	Notes
`http.html:"calypsoai.com"`	Caller-side.
`http.html:"calypso-ai"`	Vendor identifier.

Protect AI

Multi-product: Recon, Sightline, Guardian, ModelScan.

Shodan Query	Notes
`http.html:"protectai.com"`	Caller-side.
`http.html:"protect-ai"`	Vendor identifier.
`http.html:"modelscan"`	ModelScan CLI / report identifier.
`http.html:"sightline"`	Sightline product identifier.
`http.html:"guardian"`	Generic; needs Protect-AI co-occurrence to disambiguate.

HiddenLayer

Shodan Query	Notes
`http.html:"hiddenlayer.com"`	Caller-side.
`http.html:"hiddenlayer-ai"`	Vendor identifier.

Robust Intelligence

Shodan Query	Notes
`http.html:"robustintelligence.com"`	Caller-side.
`http.html:"robust-intelligence"`	Vendor identifier.
`http.html:"robust-ai"`	Alternate identifier.

6. Cross-Category Caller-Side Discovery

Apps that reference SaaS safety platforms in their HTML / JS bundles reveal the deployment population of the SaaS safety layer without requiring access to the SaaS itself. Useful for mapping which orgs use which guardrails.

Pattern: combine vendor-domain in HTML with a customer-identifying signal.

http.html:"lakera.ai" http.html:"customer"
http.html:"openai.com/v1/moderations" http.html:"production"
http.html:"calypsoai.com" http.html:"login"

Caveat: caller-side discovery surfaces customers, not exposures. The customer’s own AI infrastructure may still need separate discovery via §1 / §22 / etc.
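The pattern above (vendor marker plus a customer-identifying signal) can be generated rather than hand-typed. This is a small sketch with a hypothetical helper name; it only builds the query string and rejects embedded double quotes instead of guessing at an escaping rule.

```python
def caller_side_query(vendor_marker: str, signal: str) -> str:
    """Build a conjunctive caller-side dork from two http.html terms."""
    terms = []
    for value in (vendor_marker, signal):
        if not value or '"' in value:
            # Quoting rules for embedded quotes are not assumed here.
            raise ValueError(f"unsupported term: {value!r}")
        terms.append(f'http.html:"{value}"')
    return " ".join(terms)


if __name__ == "__main__":
    print(caller_side_query("lakera.ai", "customer"))
    print(caller_side_query("calypsoai.com", "login"))
```

Requiring both arguments keeps every generated dork conjunctive, which matches the methodology lesson at the top of this section.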

Tier System (this section)

Subclass	Default tier	Population deployment shape
Guardrails AI server	T2 (auth optional)	Rare; CLI-dominant ecosystem
NeMo Guardrails server	T2	Rare; CLI-dominant
Lakera Guard self-hosted	T1 (no auth default on self-host variant)	Rare; commercial product mostly SaaS
LlamaGuard via TGI/vLLM/Ollama	A (no auth concept on the underlying server)	Counted within §3 model-serving
OPA on 8181	T1 (default config has no auth on REST API)	Common in K8s; rare on public internet
W&B Weave / Humanloop / Gantry	n/a (SaaS)	Caller-side dorks only
Content moderation SaaS	n/a (SaaS)	Caller-side dorks only
AI governance commercial	n/a (SaaS)	Caller-side dorks only

Verified survey results + FP traps (2026-05-29)

Survey safety-guardrail-2026-05-29. The category ships auth-off by default. Most guardrail API servers are Shodan-dark behind JSON roots (Insight #67); only LLM Guard’s OpenAPI title indexes.

Dork	Total	Yield
`http.html:"LLM Guard API"`	9	CLEAN. Real LLM Guard (Protect AI) servers. Only guardrail marker that reliably indexes (OpenAPI title in HTML). 1 unauth / 2 auth / 4 aged-out on verify.

FP traps (do NOT re-run / require conjunct)

Dork	Total	Trap
`port:5000 http.html:"vigil"`	20	FP SWAMP. “vigil” = Pro-Vigil video-surveillance brand + Synology NAS (nas-vigil) on residential ISPs. NOT deadbits/vigil-llm. Needs `/analyze` conjunct.
`http.html:"/v1/rails/configs"`	0	NeMo serves JSON; path not in crawled HTML.
`http.html:"guardrails-ai" port:8000`	0	String in `/docs`, not root HTML.
`http.html:"rebuff" port:3000`	0	Archived May 2025, Next.js; string not in crawled HTML.

Verification probes

LLM Guard: GET / -> {"name":"LLM Guard API"}; POST /analyze/prompt {"prompt":"test"} -> verdict JSON (unauth) or {"message":"Not authenticated"} (AUTH_TOKEN set).
NeMo / Guardrails AI / Vigil / Rebuff are JSON/Swagger-dark; need direct probe, not Shodan.
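The LLM Guard probe results can be classified with a pure function over the two response bodies. This is a sketch, not a validated fingerprint: the function name and labels are hypothetical, and the check for a `scanners` field is an assumption about the verdict JSON that should be confirmed against a real response.

```python
import json
from typing import Any, Optional


def _load(text: Optional[str]) -> Any:
    """Parse JSON, returning None for missing, HTML, or invalid bodies."""
    try:
        return json.loads(text)
    except (json.JSONDecodeError, TypeError):
        return None


def classify_llm_guard(root_body: Optional[str], analyze_body: Optional[str]) -> str:
    """Label a host from its GET / and POST /analyze/prompt bodies."""
    root = _load(root_body)
    if not (isinstance(root, dict) and root.get("name") == "LLM Guard API"):
        return "not-llm-guard"
    analyze = _load(analyze_body)
    if isinstance(analyze, dict):
        if analyze.get("message") == "Not authenticated":
            return "auth"  # AUTH_TOKEN is set
        if "scanners" in analyze:  # assumed verdict field, see lead-in
            return "unauth"
    return "unverified"
```

Hosts that return the root marker but an unrecognized analyze body stay "unverified" rather than being counted as open.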

Finding: the safety tool is the unguarded thing

5.78.101.230 (Hetzner): unauth LLM Guard :8000 + STACKED unauth data tier (MongoDB :27017, Redis 7.2.10 :6379 PING/PONG-confirmed, MySQL :3306, Postgres :5432, Docker registry :5000). The guardrail bypass was the smallest part. Insight #12: run the IP-direct shadow on every confirmed guardrail host.

Thesis: shipping default predicts the open rate

LLM Guard AUTH_TOKEN opt-in -> 1/3 reachable open. Voice-AI no-auth-concept -> all open. ML-gov OpenMetadata auth-on -> patched/closed. Three points, one curve.

aimap fingerprint gap

aimap + VisorBishop have NO guardrail fingerprint (v1.9.39). Candidate LLM Guard fp: GET / -> {"name":"LLM Guard API"} + POST /analyze/prompt scanner-object shape.