
24. LLM Safety / Guardrails / Policy Engines / Moderation

Source: https://github.com/nuclide-research/AI-LLM-Infrastructure-OSINT/blob/main/shodan/queries/24-llm-safety-guardrail-policy

Section created: 2026-05-19. Companion to §23 (eval / red-team self-hosted). This section covers the guardrail and policy layer that sits in front of or behind LLM calls, plus content-moderation platforms used as conversational safety filters.

The category is split by deployment mode:

| Subclass | Examples | Deployment mode | Shodan visibility |
| --- | --- | --- | --- |
| LLM-native guardrails (self-hostable) | Guardrails AI, NeMo Guardrails, Lakera Guard self-hosted, LlamaGuard (deployed via TGI/vLLM/Ollama), Garak REST | Self-hosted HTTP server | Direct, T1/T2 |
| General-purpose policy engines | Open Policy Agent (OPA) on port 8181, Styra DAS Edge agent | Self-hosted HTTP server | Direct, T2 |
| LLMOps / observability with safety dimension | W&B Weave, Humanloop, Gantry (now shut down), LangSmith (caller-side) | SaaS-mostly | Indirect (caller-side dorks against apps using them) |
| Content moderation (pre-LLM-era, now used as filters) | Spectrum Labs, ActiveFence, Two Hat (Microsoft), Hive Moderation | SaaS-only | Indirect (caller-side dorks against apps using them) |
| AI governance / red-team commercial | CalypsoAI, Protect AI, HiddenLayer, Robust Intelligence | SaaS-mostly | Indirect (caller-side dorks against customer-deployed apps) |

Methodology lesson from §23 (carried forward): Single-word substring matching on response bodies ("garak", "guardrails", "weave") fires on Japanese anime filenames, common English words, and unrelated platforms at population scale, so conjunctive matching is required. Queries below are field-scoped (http.html, http.title, hostname, ssl.cert.subject.cn, and similar) wherever possible; bare-string dorks are documented but marked (noisy).
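The conjunctive rule can be sketched as a small post-filter over harvested response bodies. This is a minimal illustration, not the survey's actual tooling: the marker strings come from the dork tables in this section, but the grouping into one primary token plus corroborating tokens, and the names FINGERPRINTS, conjunctive_match, and classify, are assumptions made for the example.

```python
# Hypothetical fingerprint table: a primary token plus corroborating tokens.
# Strings are taken from this section's dork tables; the grouping is assumed.
FINGERPRINTS = {
    "guardrails-ai": (
        "guardrails",
        ("hub.guardrailsai.com", "validate_using_guards", "/guards"),
    ),
    "nemo-guardrails": (
        "guardrails",
        ("colang", "/v1/rails", "nemoguardrails"),
    ),
}


def conjunctive_match(body: str, primary: str,
                      corroborating: tuple[str, ...],
                      min_extra: int = 1) -> bool:
    """True only if the primary token AND >= min_extra other tokens appear."""
    text = body.lower()
    if primary.lower() not in text:
        return False
    extra = sum(1 for token in corroborating if token.lower() in text)
    return extra >= min_extra


def classify(body: str) -> list[str]:
    """Return the fingerprint names whose conjunctive rule the body satisfies."""
    return [
        name
        for name, (primary, extra) in FINGERPRINTS.items()
        if conjunctive_match(body, primary, extra)
    ]


# A bare "guardrails" hit (for example a media filename) matches nothing,
# because no corroborating token is present.
print(classify("Download Garak_no_Guardrails_ep01.mkv"))
print(classify('<a href="https://hub.guardrailsai.com">Guardrails</a>'))
```

Raising min_extra trades recall for precision; a value of 1 is only a starting point for candidate sets that will still go through a Stage 2 probe.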


1. LLM-Native Guardrails (self-hostable)

Guardrails AI (guardrails serve)

The open-source guardrails-ai package ships a server mode. Default port 8000. Validates LLM output against operator-defined “guards” (Pydantic-style contracts).

| Shodan Query | Notes |
| --- | --- |
| http.html:"guardrails ai" | Product-name body match. Verified 2026-05-19: 6 hits. |
| http.html:"/guards" | Endpoint path. 1,048 hits, noisy (matches any /guards/ UI route in any app). Use as candidate set; verify via /api/guards response shape (JSON array). |
| http.html:"validate_using_guards" | Package-specific helper string. |
| http.html:"guardrails-api" | Alternative package identifier. |
| http.html:"guardrails-ai" | Hyphenated variant. |
| http.html:"guardrails_server" | Server-mode identifier. |
| http.html:"guardrails-server" | Alternate. |
| http.html:"from guardrails import" | Python import surfaced in code-display routes. |
| http.html:"@guardrails.com" | Guardrails Hub email convention. |
| http.html:"hub.guardrailsai.com" | Guardrails Hub URL in customer apps. |
| http.html:"validate-many" | Server route. |
| http.html:"GuardrailsValidator" | Validator class name. |
| port:8000 "guardrails" | Port + bare string (noisy). |
| port:8000 http.html:"validators" | OPA-similar route on default port. |
| port:8080 http.html:"guardrails" | Alt port. |
| port:5000 http.html:"guardrails" | Alt port. |
| port:443 http.html:"guardrails" | TLS-fronted. |
| hostname:"guardrails" | rDNS pattern. |
| hostname:"guardrails-ai" | Hyphenated rDNS. |
| ssl.cert.subject.cn:"guardrails" | TLS cert CN. |
| ssl.cert.subject.cn:"guardrails-ai" | TLS cert CN variant. |
| ssl.cert.subject.cn:"guardrailsai" | Vendor CN. |
| org:"Guardrails AI" | Shodan ORG-tag if assigned. |
| http.headers.x-powered-by:"guardrails" | X-Powered-By header. |

Stage 2 verify probe: GET /api/guards returns JSON array of guard definitions when present. GET /openapi.json returns FastAPI schema with /guards route family. Both confirm Guardrails AI server vs the /guards/ noise class.
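The two checks above can be expressed as pure functions over response bodies that were already fetched by an authorized probe. This is a sketch under assumptions: the function names are hypothetical, the only shapes relied on are the ones stated above (a JSON array from /api/guards, an OpenAPI document whose paths include a guards segment), and how the two signals are combined is left to the caller.

```python
import json


def looks_like_guards_listing(body: str) -> bool:
    """/api/guards check: the body parses as a JSON array of objects."""
    try:
        data = json.loads(body)
    except ValueError:
        return False  # HTML or other non-JSON: the /guards/ noise class
    return isinstance(data, list) and all(isinstance(x, dict) for x in data)


def openapi_has_guards_routes(body: str) -> bool:
    """/openapi.json check: some path contains a 'guards' segment."""
    try:
        schema = json.loads(body)
    except ValueError:
        return False
    if not isinstance(schema, dict):
        return False
    paths = schema.get("paths")
    if not isinstance(paths, dict):
        return False
    return any("guards" in p.strip("/").split("/") for p in paths)


def guardrails_signals(guards_body: str, openapi_body: str) -> dict[str, bool]:
    """Report both signals separately so the caller decides how strict to be."""
    return {
        "guards_listing": looks_like_guards_listing(guards_body),
        "openapi_guards_routes": openapi_has_guards_routes(openapi_body),
    }
```

An empty array still counts as a listing here, since a freshly started server may have no guards defined; an HTML page served at the same path does not.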

NeMo Guardrails (NVIDIA)

nemoguardrails server default port 8000. CLI-dominant ecosystem; rare in HTTP-server mode.

| Shodan Query | Notes |
| --- | --- |
| http.html:"nemo-guardrails" | Package identifier in source. Verified 2026-05-19: 3 hits. |
| http.html:"nemoguardrails" | No-hyphen variant. |
| http.html:"NeMo Guardrails" | Product-name body match. |
| http.html:"NVIDIA NeMo" | NVIDIA family banner. |
| http.html:"/v1/rails/configs" | NeMo rails-config endpoint path. Was 0 hits 2026-05-19 (rare deployment). |
| http.html:"/v1/rails/generate" | NeMo rails-generate endpoint. |
| http.html:"/v1/rails" | Parent path. |
| http.html:"colang" | NeMo’s policy DSL name. |
| http.html:".co" | colang file extension reference. |
| http.html:"jailbreak_detection" | NeMo rail class. |
| http.html:"facts.co" | NeMo example rail filename. |
| http.html:"hallucination_check" | NeMo rail. |
| port:8000 http.html:"rails" | Default port + rails marker. |
| port:8000 http.html:"colang" | Port + DSL. |
| port:8080 http.html:"nemo" | Alt port + vendor. |
| port:8443 http.html:"nemo" | TLS-fronted. |
| "NeMo Guardrails" | Product-name any-field (noisy). |
| hostname:"guardrails" | rDNS pattern (shared with Guardrails AI; verify with platform-specific probe). |
| hostname:"nemo" | NVIDIA-NeMo rDNS. |
| ssl.cert.subject.cn:"nemo" | TLS cert CN. |
| ssl.cert.subject.cn:"nemoguardrails" | TLS cert CN exact. |
| org:"NVIDIA" | NVIDIA-deployed (broad). |

Stage 2 verify probe: GET /v1/rails/configs returns JSON array of rail names. aimap fingerprint already present.

Lakera Guard (self-hosted variant)

Lakera’s commercial product is API-only; the self-hosted variant ships a Server: lakera header. Caller-side dorks find customer apps integrating the SaaS.

| Shodan Query | Notes |
| --- | --- |
| Server: lakera | Header-based; high precision when matched. Verified 2026-05-19: 1 hit. |
| http.html:"lakera-guard" | Body marker. Verified 2026-05-19: 8 hits. |
| http.html:"lakera" | Vendor-name bare (broad, includes caller-side). |
| http.html:"lakera.ai" | Vendor domain in customer HTML. |
| http.html:"api.lakera.ai" | API URL in customer apps. |
| http.html:"lakera-chrome" | Lakera browser-extension reference. |
| http.html:"/v1/guard" | Lakera guard endpoint path. |
| http.html:"/v2/guard" | Lakera v2 API path. |
| http.html:"/v1/prompt_injection" | Lakera-specific endpoint. |
| http.html:"prompt-injection-attack" | Detection category name in Lakera responses. |
| http.html:"jailbreak_attempt" | Lakera category. |
| http.html:"unknown_links" | Lakera category. |
| http.html:"relevant_language" | Lakera category. |
| http.html:"pii" | Lakera category (broad; combine). |
| http.html:"lakera-guard" http.html:"flagged" | Body marker + Lakera response shape. |
| http.html:"lakera_guard" | Snake-case variant. |
| port:8000 "lakera" | Default port + vendor. |
| port:8443 "lakera" | TLS-fronted. |
| port:443 http.html:"lakera" | HTTPS. |
| ssl.cert.subject.cn:"lakera" | TLS cert CN. |
| ssl.cert.subject.cn:"lakera-guard" | TLS cert CN exact. |
| ssl.cert.subject.cn:"lakera.ai" | Vendor cert CN. |
| hostname:"lakera" | rDNS pattern. |
| org:"Lakera" | Shodan ORG. |
| http.headers.x-powered-by:"lakera" | Powered-by header. |

Stage 2 verify probe: POST /v1/guard with empty body should return Lakera-specific error response. aimap fingerprint already present.

LlamaGuard (Meta: deployed via TGI / vLLM / Ollama)

LlamaGuard is a model, not a server. Discovery goes through the underlying inference server’s /v1/models response.

| Shodan Query | Notes |
| --- | --- |
| http.html:"Llama-Guard" | Model name in HTML response. Was 0 hits 2026-05-19 (Shodan indexes JSON /v1/models responses sparsely). |
| http.html:"meta-llama/Llama-Guard-3" | Hugging Face model ID variant. |
| http.html:"LlamaGuard" | Camel-case variant. |
| http.html:"unsafe_categories" | LlamaGuard taxonomy term. |

Side-channel discovery (recommended): re-query past LLM-Gateway surveys’ /v1/models outputs for Llama-Guard model name. The model is server-agnostic; deployment population surfaces in already-harvested LLM-Gateway corpora more efficiently than via Shodan.
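A minimal sketch of that re-query, assuming the harvested corpus is a mapping from host to an already-parsed /v1/models response in the OpenAI-compatible shape (a data list whose entries carry an id). The corpus layout, the function names, and the name pattern are assumptions for illustration; the hosts in the usage example are documentation addresses, not findings.

```python
import re

# Assumed name pattern: covers "Llama-Guard", "LlamaGuard", "llama_guard",
# and tag-style names such as "llama-guard3:8b".
GUARD_NAME = re.compile(r"llama[-_ ]?guard", re.IGNORECASE)


def guard_models(models_response: dict) -> list[str]:
    """Return model IDs in one /v1/models response that look like Llama Guard."""
    entries = models_response.get("data")
    if not isinstance(entries, list):
        return []  # error bodies and non-OpenAI shapes yield nothing
    ids = [str(e.get("id", "")) for e in entries if isinstance(e, dict)]
    return [model_id for model_id in ids if GUARD_NAME.search(model_id)]


def hosts_serving_guard(corpus: dict[str, dict]) -> dict[str, list[str]]:
    """Map each host to its Llama Guard model IDs, skipping hosts with none."""
    hits: dict[str, list[str]] = {}
    for host, response in corpus.items():
        found = guard_models(response)
        if found:
            hits[host] = found
    return hits


corpus = {
    "192.0.2.10": {"object": "list", "data": [
        {"id": "meta-llama/Llama-Guard-3-8B"}, {"id": "mistral-7b"}]},
    "192.0.2.11": {"data": [{"id": "qwen2"}]},
    "192.0.2.12": {"error": "unauthorized"},
}
print(hosts_serving_guard(corpus))
```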

Garak REST (NVIDIA adversarial harness)

See §23. CLI-dominant; 0 confirmed at population scale 2026-05-04.


2. General-Purpose Policy Engines

Open Policy Agent (OPA)

The dominant general-purpose policy engine. opa run -s ships an HTTP server on port 8181 by default. Used as the central policy layer in K8s, microservice meshes, and increasingly AI tool-use authorization.

| Shodan Query | Notes |
| --- | --- |
| port:8181 http.status:200 | OPA default port; broad. Use with body filter. |
| http.html:"/v1/policies" | OPA REST API endpoint. |
| http.html:"/v1/data" | OPA data API. |
| port:8181 "opa" | Port + bare string (noisy). |
| http.title:"OPA" | Title-based; rare since OPA has no UI by default. |
| product:"Open Policy Agent" | Shodan product tag if indexed. |
| hostname:"opa" | rDNS pattern. |
| ssl.cert.subject.cn:"opa" | TLS cert CN. |

Stage 2 verify probe: GET /v1/policies returns a JSON object whose result array lists the loaded policy modules (each with an id). GET /v1/data returns the policy data tree as JSON. Either confirms OPA and reveals operator-authored policy structure.
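The response check for this probe can be sketched as two parsers over bodies that were already fetched. The sketch assumes the OPA REST API convention of wrapping payloads in a top-level result field (an array of policy modules for /v1/policies, a document tree for /v1/data); the function names are hypothetical, and anything that does not fit the shape returns None so it stays a candidate rather than a confirmation.

```python
import json
from typing import Optional


def opa_policy_ids(body: str) -> Optional[list[str]]:
    """Policy IDs from a /v1/policies body, or None if not OPA-shaped."""
    try:
        doc = json.loads(body)
    except ValueError:
        return None
    if not isinstance(doc, dict):
        return None
    result = doc.get("result")
    if not isinstance(result, list):
        return None
    return [str(m["id"]) for m in result if isinstance(m, dict) and "id" in m]


def opa_data_roots(body: str) -> Optional[list[str]]:
    """Sorted top-level keys of the /v1/data tree, or None if not OPA-shaped."""
    try:
        doc = json.loads(body)
    except ValueError:
        return None
    if not isinstance(doc, dict) or not isinstance(doc.get("result"), dict):
        return None
    return sorted(doc["result"])
```

Listing only the top-level keys of the data tree is a deliberate choice: it is enough to confirm the platform and characterise the policy structure without copying the policy data itself.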

Risk class: policy data may include role assignments, allowed-action lists, tenant routing rules, AI-API quota policies. Reading /v1/data is read-only, but the policy structure itself is sensitive.

Styra DAS Edge agent

Commercial OPA distribution. Self-hosted edge agent reports to a SaaS control plane.

| Shodan Query | Notes |
| --- | --- |
| http.html:"styra" | Vendor-name body match. |
| http.html:"styra-das" | Product identifier. |
| port:8181 http.html:"styra" | OPA-port + Styra wrapper. |

3. LLMOps / Observability with Safety Dimension

These platforms blend evaluation, tracing, and policy-style guardrails. They are SaaS-mostly, so visibility comes through caller-side dorks.

W&B Weave (Weights & Biases)

LLM-call tracing with quality / safety gates. Hosted at wandb.ai/weave; some self-hosted exposure exists.

| Shodan Query | Notes |
| --- | --- |
| http.html:"wandb.ai/weave" | Caller-side: apps that embed the W&B Weave dashboard URL. |
| http.html:"weave-python" | Package identifier. |
| http.html:"weave-trace" | Trace identifier in HTML. |
| http.html:"weave_server" | Server-mode identifier. |
| http.html:"/weave/" | Path-based; noisy (any app with /weave/ route). Verified 2026-05-19: 1,032 hits, high FP suspicion. |
| http.html:"wandb-weave" | Package alternate. |

Caller-side discovery: customer apps that mention W&B Weave in their HTML reveal which orgs are using it for LLM observability, useful for population mapping of the observability tier.

Humanloop

LLM app development with feedback loops + guardrail-like evaluation criteria. SaaS-primary.

| Shodan Query | Notes |
| --- | --- |
| http.html:"humanloop" | Vendor-name body match. |
| http.html:"app.humanloop.com" | Caller-side: apps embedding Humanloop dashboard URL. |
| http.html:"humanloop-python" | Package identifier. |
| ssl.cert.subject.cn:"humanloop" | TLS cert CN. |

Gantry

Observability + quality + safety policies for ML/LLM. Note: company shut down 2024; queries here for historical/forensic discovery only.

| Shodan Query | Notes |
| --- | --- |
| http.html:"gantry.io" | Vendor-domain body match. |
| http.html:"app.gantry.io" | Caller-side dashboard URL. |
| http.title:"Gantry" | Noisy: "gantry" is a real word (shipping/manufacturing). Verified 2026-05-19: 44 hits, most unrelated. |
| http.html:"/gantry-" | Path prefix. Noisy: 2,229 hits on 2026-05-19, mostly unrelated. |

LangSmith (LangChain observability + eval)

See §5 / §23. Already documented; carried here for cross-reference.


4. Content Moderation (pre-LLM-era, now used as filters)

SaaS-only platforms. Visibility comes through caller-side dorks that find apps integrating them.

Spectrum Labs

| Shodan Query | Notes |
| --- | --- |
| http.html:"spectrumlabsai.com" | Caller-side. |
| http.html:"spectrum-labs" | Package / API identifier. |

ActiveFence

| Shodan Query | Notes |
| --- | --- |
| http.html:"activefence.com" | Caller-side. |
| http.html:"activefence-api" | API identifier. |

Two Hat (Microsoft Azure Content Safety)

Two Hat was acquired by Microsoft and is now integrated into Azure Content Safety.

| Shodan Query | Notes |
| --- | --- |
| http.html:"twohat.com" | Caller-side (legacy). |
| http.html:"contentsafety.cognitive.microsoft.com" | Azure Content Safety endpoint (caller-side). |

Hive Moderation

| Shodan Query | Notes |
| --- | --- |
| http.html:"hivemoderation.com" | Caller-side. |
| http.html:"thehive.ai" | Vendor domain. |
| http.html:"hive-api" | API identifier. |

5. AI Governance / Red-Team Commercial

SaaS-mostly; caller-side dorks find customer-deployed apps that integrate them.

CalypsoAI

| Shodan Query | Notes |
| --- | --- |
| http.html:"calypsoai.com" | Caller-side. |
| http.html:"calypso-ai" | Vendor identifier. |

Protect AI

Multi-product: Recon, Sightline, Guardian, ModelScan.

| Shodan Query | Notes |
| --- | --- |
| http.html:"protectai.com" | Caller-side. |
| http.html:"protect-ai" | Vendor identifier. |
| http.html:"modelscan" | ModelScan CLI / report identifier. |
| http.html:"sightline" | Sightline product identifier. |
| http.html:"guardian" | Generic; needs Protect-AI co-occurrence to disambiguate. |

HiddenLayer

| Shodan Query | Notes |
| --- | --- |
| http.html:"hiddenlayer.com" | Caller-side. |
| http.html:"hiddenlayer-ai" | Vendor identifier. |

Robust Intelligence

| Shodan Query | Notes |
| --- | --- |
| http.html:"robustintelligence.com" | Caller-side. |
| http.html:"robust-intelligence" | Vendor identifier. |
| http.html:"robust-ai" | Alternate identifier. |

6. Cross-Category Caller-Side Discovery

Apps that reference SaaS safety platforms in their HTML / JS bundles reveal the deployment population of the SaaS safety layer without requiring access to the SaaS itself. Useful for mapping which orgs use which guardrails.

Pattern: combine vendor-domain in HTML with a customer-identifying signal.

http.html:"lakera.ai" http.html:"customer"
http.html:"openai.com/v1/moderations" http.html:"production"
http.html:"calypsoai.com" http.html:"login"
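The pattern above is mechanical enough to generate. The sketch below builds the vendor-domain × signal combinations as query strings, relying on Shodan treating space-separated filters as a conjunction. The helper names are hypothetical, and values containing a double quote are rejected rather than escaped, since no escaping rule is assumed here.

```python
from itertools import product


def shodan_and(*terms: tuple[str, str]) -> str:
    """Join (filter, value) pairs into one space-separated (ANDed) query."""
    parts = []
    for filter_name, value in terms:
        if '"' in value:
            raise ValueError(f"value must not contain a double quote: {value!r}")
        parts.append(f'{filter_name}:"{value}"')
    return " ".join(parts)


def caller_side_queries(vendor_domains: list[str],
                        signals: list[str]) -> list[str]:
    """One http.html + http.html conjunct per (vendor domain, signal) pair."""
    return [
        shodan_and(("http.html", domain), ("http.html", signal))
        for domain, signal in product(vendor_domains, signals)
    ]


for query in caller_side_queries(["lakera.ai", "calypsoai.com"],
                                 ["customer", "login"]):
    print(query)
```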

Caveat: caller-side discovery surfaces customers, not exposures. The customer’s own AI infrastructure may still need separate discovery via §1 / §22 / etc.


Tier System (this section)

| Subclass | Default tier | Population deployment shape |
| --- | --- | --- |
| Guardrails AI server | T2 (auth optional) | Rare; CLI-dominant ecosystem |
| NeMo Guardrails server | T2 | Rare; CLI-dominant |
| Lakera Guard self-hosted | T1 (no auth default on self-host variant) | Rare; commercial product mostly SaaS |
| LlamaGuard via TGI/vLLM/Ollama | A (no auth concept on the underlying server) | Counted within §3 model-serving |
| OPA on 8181 | T1 (default config has no auth on REST API) | Common in K8s; rare on public internet |
| W&B Weave / Humanloop / Gantry | n/a (SaaS) | Caller-side dorks only |
| Content moderation SaaS | n/a (SaaS) | Caller-side dorks only |
| AI governance commercial | n/a (SaaS) | Caller-side dorks only |


Verified survey results + FP traps (2026-05-29)

Survey safety-guardrail-2026-05-29. The category ships auth-off by default. Most guardrail API servers are Shodan-dark behind JSON roots (Insight #67); only LLM Guard’s OpenAPI title indexes.

| Dork | Total | Yield |
| --- | --- | --- |
| http.html:"LLM Guard API" | 9 | CLEAN. Real LLM Guard (Protect AI) servers. Only guardrail marker that reliably indexes (OpenAPI title in HTML). 1 unauth / 2 auth / 4 aged-out on verify. |

FP traps (do NOT re-run / require conjunct)

| Dork | Total | Trap |
| --- | --- | --- |
| port:5000 http.html:"vigil" | 20 | FP SWAMP. “vigil” = Pro-Vigil video-surveillance brand + Synology NAS (nas-vigil) on residential ISPs. NOT deadbits/vigil-llm. Needs /analyze conjunct. |
| http.html:"/v1/rails/configs" | 0 | NeMo serves JSON; path not in crawled HTML. |
| http.html:"guardrails-ai" port:8000 | 0 | String in /docs, not root HTML. |
| http.html:"rebuff" port:3000 | 0 | Archived May 2025, Next.js; string not in crawled HTML. |

Verification probes

  • LLM Guard: GET / -> {"name":"LLM Guard API"}; POST /analyze/prompt {"prompt":"test"} -> verdict JSON (unauth) or {"message":"Not authenticated"} (AUTH_TOKEN set).
  • NeMo / Guardrails AI / Vigil / Rebuff are JSON/Swagger-dark; need direct probe, not Shodan.
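The LLM Guard probe pair can be reduced to a classifier over the two response bodies. This is a sketch, not the survey's tooling: the root marker and the "Not authenticated" message are the ones listed above, while the scanners key used to recognise a verdict is an assumption about the response shape, and the function name and return labels are hypothetical.

```python
import json


def classify_llm_guard(root_body: str, analyze_body: str) -> str:
    """Return 'not-llm-guard', 'auth', 'unauth', or 'unknown'.

    root_body:    body of GET /
    analyze_body: body of POST /analyze/prompt with {"prompt": "test"}
    """
    try:
        root = json.loads(root_body)
    except ValueError:
        return "not-llm-guard"  # HTML roots such as the "vigil" FP class
    if not isinstance(root, dict) or root.get("name") != "LLM Guard API":
        return "not-llm-guard"

    try:
        analyze = json.loads(analyze_body)
    except ValueError:
        return "unknown"
    if not isinstance(analyze, dict):
        return "unknown"
    if analyze.get("message") == "Not authenticated":
        return "auth"  # AUTH_TOKEN is set
    if "scanners" in analyze:  # assumed verdict marker
        return "unauth"
    return "unknown"  # e.g. a validation error; needs manual review


print(classify_llm_guard('{"name":"LLM Guard API"}',
                         '{"message":"Not authenticated"}'))
```

Anything other than a clean match falls through to "unknown" on purpose, so an unexpected body is never counted as an open server.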

Finding: the safety tool is the unguarded thing

5.78.101.230 (Hetzner): unauth LLM Guard :8000 + STACKED unauth data tier (MongoDB :27017, Redis 7.2.10 :6379 PING/PONG-confirmed, MySQL :3306, Postgres :5432, Docker registry :5000). The guardrail bypass was the smallest part. Insight #12: run the IP-direct shadow on every confirmed guardrail host.

Thesis: shipping default predicts the open rate

LLM Guard AUTH_TOKEN opt-in -> 1/3 reachable open. Voice-AI no-auth-concept -> all open. ML-gov OpenMetadata auth-on -> patched/closed. Three points, one curve.

aimap fingerprint gap

aimap + VisorBishop have NO guardrail fingerprint (v1.9.39). Candidate LLM Guard fp: GET / -> {"name":"LLM Guard API"} + POST /analyze/prompt scanner-object shape.