
LLM Safety / Guardrail Engines — Shodan Query Catalog

Source: https://github.com/nuclide-research/AI-LLM-Infrastructure-OSINT/blob/main/shodan/queries/safety-guardrail-queries

Generated: 2026-05-27 from pre-survey OSINT pass (12 platforms)

Updated: 2026-05-31 (added LlamaRisk “LlamaGuard AI Firewall” vendor + Meta-model disambiguation; Censys CT discovery; findings #36162-36163)

See: data/platform-intel/safety-guardrail-osint-2026-05-27.md for full intel

Category theme: Guardrail engines deployed as API servers almost universally ship with auth off — they assume trusted-network placement. An exposed guardrail server = safety bypass (route around it), prompt log access, and policy config disclosure. The irony: security tools with no security on themselves.


LlamaGuard / Llama Guard 3 (Meta)

Auth default: off (no auth concept on hosting server unless explicitly configured)

Exposure class: model roster reveals safety classifier; unauth inference endpoint usable to probe bypass

⚠ Disambiguation: “LlamaGuard” the model (this section, Meta meta-llama/Llama-Guard-3-*) is NOT “LlamaGuard AI Firewall,” a commercial guardrail vendor by LlamaRisk (see its own section below). A bare-string search for LlamaGuard returns the vendor brand (cert subject_dn), not the model. The model name lives in the inference server’s /v1/models body and is "body-dark" on Shodan and Censys (absent from their indexed HTTP bodies); it surfaces via Censys CT cert names ("Llama-Guard" full-text = 130 hits, 2026-05-31, finding #36162). Match the model by the data layer, never by the name.

| Label | Query | Rationale | FP Risk |
|---|---|---|---|
| primary | http.html:"Llama-Guard-3" | Model name as indexed by Shodan from /v1/models JSON | Low |
| secondary | http.html:"meta-llama/Llama-Guard" port:8000 | Full HuggingFace model path in vLLM responses | Low |
| tertiary | http.html:"llama-guard-2-8b" OR http.html:"llama-guard-3-8b" | Specific model variant names | Low |
| ollama | http.html:"llama-guard" port:11434 | Ollama deployments serving the model | Med (name collision) |
| identity-probe | GET /v1/models → id field containing "Llama-Guard" | Confirms model loaded on inference server | |
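
The identity-probe row can be sketched as a pure function over a response body. This is a minimal sketch, assuming the server returns an OpenAI-style roster (`{"data": [{"id": ...}]}`), as vLLM's OpenAI-compatible API does; the function name and the sample roster are illustrative, not part of the catalog.

```python
import json

def llama_guard_ids(models_body: str) -> list[str]:
    """Return ids from an OpenAI-style /v1/models body that name a Llama Guard model."""
    try:
        payload = json.loads(models_body)
    except json.JSONDecodeError:
        return []  # HTML error page, empty body, etc.
    entries = payload.get("data") if isinstance(payload, dict) else None
    if not isinstance(entries, list):
        return []
    ids = [str(e.get("id", "")) for e in entries if isinstance(e, dict)]
    # Case-insensitive, so lowercase ids match as well as the HuggingFace path.
    return [i for i in ids if "llama-guard" in i.lower()]

# Hypothetical roster, for illustration only.
sample = json.dumps({"object": "list", "data": [
    {"id": "meta-llama/Llama-Guard-3-8B", "object": "model"},
    {"id": "meta-llama/Llama-3.1-8B-Instruct", "object": "model"},
]})
print(llama_guard_ids(sample))
```

Matching on the roster rather than on page text follows the disambiguation rule: the id field is the data layer, so the LlamaRisk brand string cannot trigger it.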

LlamaGuard AI Firewall (LlamaRisk) — commercial vendor, NOT the Meta model

Auth default: SaaS/self-hosted firewall product; auth posture per-deployment (unknown, not tested)

Exposure class: commercial AI-firewall/guardrail product. Cataloged for disambiguation: this is the brand the bare string “LlamaGuard” resolves to, the name-collision flagged in the Meta section above. Distinct vendor (LlamaRisk), distinct from meta-llama/Llama-Guard-*.

Discovery: Censys, not Shodan html. Bare full-text "LlamaGuard" returns this vendor via cert subject_dn/names (0 hosts, 5 certs, 14 web properties, 2026-05-31). Finding #36163.

| Label | Query (Censys CenQL) | Rationale | FP Risk |
|---|---|---|---|
| brand-domains | cert.names: "llamarisk.com" or cert.names: "llamaguard.com" | Vendor + product domains | Low |
| product-title | web.endpoints.http.html_title= "LlamaGuard AI Firewall" | Product UI title | Low |
| deployed-instance | host.ip: "129.151.137.216" | llamaguard.129-151-137-216.nip.io proof/staging deploy (Oracle Cloud) | |
| known assets | llamaguard.com, www.llamaguard.com, llamaguard-firewall.arhaamali.com, llamaguard-proof.llamarisk.com, dashboard.llamarisk.com | Observed CT assets | |

Note: auth state and product surface NOT tested (vendor out of any engagement scope; observation only). If a guardrail survey later includes commercial AI-firewall vendors, this is the LlamaRisk entry.


NeMo Guardrails (NVIDIA)

Auth default: off (no built-in auth; Authorization header forwarded to upstream LLM only)

Exposure class: rail config names (/v1/rails/configs), Colang policy structure, upstream LLM config, conversation state

| Label | Query | Rationale | FP Risk |
|---|---|---|---|
| primary | http.html:"/v1/rails/configs" | Unique NeMo endpoint path; appears in Swagger UI and error pages | Low |
| secondary | http.html:"nemoguardrails" port:8000 | Python package name in server responses | Low |
| tertiary | http.html:"colang" port:8000 | NeMo’s policy DSL name; distinctive to NeMo ecosystem | Low |
| quaternary | http.html:"/v1/rails/generate" | NeMo rails-generate endpoint path | Low |
| product-name | http.html:"NeMo Guardrails" | Full product name | Med (docs/blog pages) |
| cert | ssl.cert.subject.cn:"nemoguardrails" | TLS cert CN for dedicated deployments | Low |
| identity-probe | GET /v1/rails/configs → 200 + JSON array | Returns [] or list of config names; unique to NeMo | |

Lakera Guard (self-hosted enterprise)

Auth default: on for SaaS; self-hosted details gated (assume API key required)

Exposure class: caller-side: which orgs use Lakera (API URL in JS bundles); self-hosted: guard policy config

| Label | Query | Rationale | FP Risk |
|---|---|---|---|
| primary | http.html:"lakera-guard" | Hyphenated product name; appears in caller-side HTML | Low |
| header | "Server: lakera" | Response header on self-hosted instances | Low |
| endpoint | http.html:"/v1/guard" http.html:"lakera" | Guard endpoint path combined with vendor name | Low |
| caller-side | http.html:"api.lakera.ai" | SaaS API URL hardcoded in customer apps | Med (legitimate refs) |
| response-field | http.html:"prompt_injection" http.html:"lakera" | Lakera response category name in customer HTML | Med |
| cert | ssl.cert.subject.cn:"lakera.ai" | Vendor TLS cert CN | Low |
| identity-probe | POST /v1/guard {"input": "test"} → JSON with "flagged", "categories" keys | Lakera-specific response shape | |

Guardrails AI

Auth default: off (GUARDRAILS_API_KEY is optional env var; unset = open)

Exposure class: all guard definitions (validation logic + schemas), guard names, LLM proxy config, OpenAPI spec

| Label | Query | Rationale | FP Risk |
|---|---|---|---|
| primary | http.html:"guardrails-ai" port:8000 | Package identifier on default port | Med (string common) |
| secondary | http.html:"guardrailsai.com" port:8000 | Vendor domain in Swagger UI | Low |
| endpoint | http.html:"/guards" http.html:"guardrails" | Guards endpoint + vendor name | Med |
| openapi | http.html:"guardrails" http.html:"/openapi.json" | FastAPI schema endpoint + vendor | Med |
| health | http.html:"health-check" http.html:"guardrails" | Health endpoint + vendor | Med |
| hub | http.html:"hub.guardrailsai.com" | Guardrails Hub URL in server HTML | Low |
| identity-probe | GET /guards → 200 + JSON array of guard objects | Confirms open Guardrails AI server; guard definitions exposed | |
| confirm-probe | GET /health-check → {"status": "ok"} | Faster liveness check | |

LLM Guard (Protect AI)

Auth default: off (AUTH_TOKEN is optional; when unset, all /analyze/* endpoints open)

Exposure class: scanner config (which scanners active, thresholds, model names), scan result cache (100 recent prompts), OpenAPI spec

| Label | Query | Rationale | FP Risk |
|---|---|---|---|
| primary | http.html:"LLM Guard API" | Exact string from OpenAPI info.title field | Low |
| secondary | http.html:"laiyer/llm-guard" | Docker image name in deployment artifacts | Low |
| tertiary | http.html:"protectai/llm-guard" | GitHub repo reference in server responses | Low |
| swagger | http.html:"llm-guard" http.html:"swagger" | Swagger UI exposed for LLM Guard | Med |
| port | port:8000 http.html:"llm-guard" | Default port + product name | Med |
| identity-probe | GET /swagger.json → JSON with info.title = "LLM Guard API" | Definitive identification | |
| scan-probe | POST /analyze/prompt {"prompt": "test"} → JSON with is_valid, scanners_results | Confirms scanner endpoint open | |

Rebuff (Protect AI — archived)

Auth default: off (dev default); MASTER_API_KEY=12345 in example config = default creds risk

Exposure class: injection detection history, canary token corpus, VectorDB contents, embedded API keys (OpenAI, Pinecone, Supabase) in leaked env config

| Label | Query | Rationale | FP Risk |
|---|---|---|---|
| primary | port:3000 http.html:"rebuff" | Product name on default Node.js port | Med |
| secondary | port:3000 http.html:"rebuff.ai" | Vendor domain reference | Low |
| api | port:3000 http.html:"/api/detect" | Primary detection endpoint path | Low |
| canary | port:3000 http.html:"canary" | Canary token endpoint reference | High (generic word) |
| env-leak | http.html:"MASTER_API_KEY" port:3000 | Exposed env config; default creds indicator | Low |
| identity-probe | POST /api/detect {"userInput": "test"} → JSON with injectionScore, heuristicScore, vectorScore | Confirms Rebuff API open | |
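
For triaging an env-leak hit, it helps to separate "a key is set" from "the example default is still in place". The sketch below is an assumption-laden helper (the function name and the dotenv-style parsing are illustrative); it reports only a boolean so that no leaked value needs to be stored.

```python
def has_default_master_key(env_text: str, default: str = "12345") -> bool:
    """True if dotenv-style text sets MASTER_API_KEY to the example default."""
    for raw in env_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and non-assignments
        key, _, value = line.partition("=")
        key = key.strip().removeprefix("export ").strip()
        if key == "MASTER_API_KEY":
            # Tolerate optional single or double quotes around the value.
            return value.strip().strip("'\"") == default
    return False

# Placeholder text, not a real leak.
leaked = "OPENAI_API_KEY=sk-placeholder\nMASTER_API_KEY=12345\n"
print(has_default_master_key(leaked))
```

Only the first MASTER_API_KEY assignment is considered; a file that redefines the key later would need a different policy.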

ShieldLM (Tsinghua University / thu-coai)

Auth default: off (no built-in server; inherits from vLLM/hosting framework with no default auth)

Exposure class: safety classifications with explanations, bilingual (CN/EN) content flags, model reasoning output

| Label | Query | Rationale | FP Risk |
|---|---|---|---|
| primary | http.html:"ShieldLM" port:8000 | Model name in vLLM /v1/models response | Low |
| secondary | http.html:"thu-coai/ShieldLM" | HuggingFace model path reference | Low |
| model-id | http.html:"shieldlm" port:8000 | Lowercase variant | Low |
| identity-probe | GET /v1/models → id containing "ShieldLM" | Confirms ShieldLM loaded via vLLM | |
| confirm | POST /v1/chat/completions → response text containing "safe", "unsafe", or "controversial" | ShieldLM three-class output | |
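
One pitfall in the confirm row: a plain substring test for "safe" also fires on "unsafe". A minimal sketch using word boundaries is shown here; the function name and the sample strings are illustrative, and the exact layout of ShieldLM's output text is not assumed beyond the three labels listed in the table.

```python
import re
from typing import Optional

# \b prevents "safe" from matching inside "unsafe".
_LABEL = re.compile(r"\b(unsafe|safe|controversial)\b", re.IGNORECASE)

def shieldlm_label(text: str) -> Optional[str]:
    """Return the first three-class label found in the response text, if any."""
    match = _LABEL.search(text)
    return match.group(1).lower() if match else None

print(shieldlm_label("[Answer] unsafe\n[Analysis] ..."))
```

The function returns the first label-like word, so free-text explanations that mention another label later in the response do not change the result. A response that uses one of these words before giving its verdict would be misread, which is why this is a confirmation aid rather than a classifier.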

Llama-Recipes Safety Demos (Meta)

Auth default: off (demo code, no auth)

Exposure class: demo API keys, prompt/response logs, accidental production deployment of test code

| Label | Query | Rationale | FP Risk |
|---|---|---|---|
| primary | http.html:"llama-recipes" | Meta’s recipe collection identifier | Low |
| secondary | http.html:"PurpleLlama" port:8000 OR port:5000 | PurpleLlama safety demo suite | Low |
| tertiary | http.html:"llama-recipes" http.html:"safety" | Safety-specific demo page | Low |
| identity-probe | GET / → HTML with "llama-recipes" or "PurpleLlama" reference | Demo server running | |

OpenShield (AI Firewall — archived 2026-02-03)

Auth default: on for main API (port 8080); Adminer DB UI on port 8085 has no separate auth layer

Exposure class: port 8085 Adminer = full DB access: content filter rules, request logs, API key entries, rate limit configs

| Label | Query | Rationale | FP Risk |
|---|---|---|---|
| primary | port:8080 http.html:"openshield" | Product name on API port | Low |
| adminer | port:8085 http.html:"adminer" http.html:"openshield" | DB management UI exposed | Low |
| adminer-generic | port:8085 http.html:"adminer" | Broader Adminer sweep (verify OpenShield via DB schema) | High |
| secondary | http.html:"openshieldai" port:8080 | GitHub org name in server responses | Low |
| identity-probe | GET /openai/v1/models → 401 (auth required on main API) | Confirms OpenShield proxy presence | |
| db-probe | GET :8085 → Adminer login page → check for openshield schema | DB UI accessible, confirms OpenShield | |
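
The two probes are only meaningful together: a 401 on the main API and an Adminer page on 8085 are each weak alone. The sketch below combines already-collected observations into a triage label; the `Observation` type, the label strings, and the "adminer" body check are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Observation:
    status: int  # HTTP status code
    body: str    # response body (may be empty)

def classify_openshield(api: Optional[Observation],
                        db_ui: Optional[Observation]) -> str:
    """Combine the :8080 identity-probe and the :8085 db-probe into one label."""
    proxy = api is not None and api.status == 401
    adminer = (db_ui is not None and db_ui.status == 200
               and "adminer" in db_ui.body.lower())
    if proxy and adminer:
        return "proxy+adminer"        # strongest signal; schema check still pending
    if proxy:
        return "proxy-only"           # a 401 alone fits many authenticated APIs
    if adminer:
        return "adminer-unattributed" # generic Adminer; not yet tied to OpenShield
    return "no-match"

print(classify_openshield(Observation(401, ""), Observation(200, "<title>Login - Adminer</title>")))
```

Even the strongest label leaves the schema check from the db-probe row as a manual step.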

PromptGuard / Llama Prompt Guard 2 (Meta)

Auth default: off (reference deployments carry no auth; HF_TOKEN is startup-only, not API auth)

Exposure class: prompt injection/jailbreak classification results, confidence scores per class (benign/injection/jailbreak), model version

| Label | Query | Rationale | FP Risk |
|---|---|---|---|
| primary | http.html:"Llama-Prompt-Guard" port:8000 | Model name in server responses | Low |
| secondary | http.html:"Prompt-Guard-86M" | Original model variant name | Low |
| tertiary | http.html:"prompt-guard" http.html:"injection" port:8000 | Model name + injection detection context | Med |
| llamafirewall | http.html:"LlamaFirewall" port:8000 | Parent suite name in server HTML | Low |
| identity-probe | GET /v1/models → id containing "Prompt-Guard" | Confirms PromptGuard model loaded | |
| confirm | POST /v1/chat/completions {"messages":[{"role":"user","content":"test"}]} → response with "INJECTION", "JAILBREAK", or "BENIGN" | PromptGuard classification output | |

AIShield Guardian (Bosch)

Auth default: on (GUARDIAN_API_KEY required; enterprise contact provisioning)

Exposure class: caller-side: customer apps embedding guardian endpoint URLs; Watchtower scan results if exposed via Jupyter/Streamlit

| Label | Query | Rationale | FP Risk |
|---|---|---|---|
| primary | http.html:"AIShield Guardian" | Product name in customer integrations | Low |
| secondary | http.html:"aishield" http.html:"guardian" | Vendor + product name conjunct | Med |
| watchtower | http.html:"AIShield Watchtower" | Open-source companion product | Low |
| bosch | http.html:"bosch-aisecurity" | GitHub org reference | Low |
| caller-side | http.html:"GUARDIAN_API_ENDPOINT" | Env variable reference in exposed config | Low |
| identity-probe | N/A (no known public default endpoint; enterprise product) | | |

Vigil (deadbits/vigil-llm)

Auth default: off (zero auth implemented in any version; Flask bound to 0.0.0.0:5000)

Exposure class: full scanner config via /settings (scanner list, model names, thresholds, embedding API key), prompt scan cache (100 recent entries), YARA rule paths, VectorDB collection names

| Label | Query | Rationale | FP Risk |
|---|---|---|---|
| primary | port:5000 http.html:"vigil" | Product name on Flask default port | Med (common word) |
| secondary | port:5000 http.html:"prompt injection" http.html:"analyze" | Injection detection context on default port | Med |
| endpoint | port:5000 http.html:"/analyze/prompt" | Primary scan endpoint path | Low |
| settings | port:5000 http.html:"/settings" | Settings endpoint path | Med |
| scanner | port:5000 http.html:"vigil" http.html:"scanner" | Scanner config context | Med |
| identity-probe | GET /settings → JSON with scanner, embedding, cache keys | Definitive identification + config leak | |
| scan-probe | POST /analyze/prompt {"prompt": "test"} → JSON with uuid, prompt_entropy, results | Confirms open scan endpoint | |
| write-probe | POST /add/texts {"texts": ["test"], "metadatas": [{}]} → 200 | Confirms unauthenticated write to VectorDB | |
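
Because /settings may carry an embedding API key, a captured body is best redacted before it is written to findings. The following is a sketch under assumptions: the three top-level keys come from the identity-probe row, while the nested field names in the sample and the secret-name hints are hypothetical.

```python
import json

SETTINGS_KEYS = {"scanner", "embedding", "cache"}
SECRET_HINTS = ("key", "token", "secret")  # assumed naming hints, not Vigil's schema

def redact(value):
    """Recursively replace values whose key name looks secret-bearing."""
    if isinstance(value, dict):
        return {k: "<redacted>" if any(h in str(k).lower() for h in SECRET_HINTS)
                else redact(v) for k, v in value.items()}
    if isinstance(value, list):
        return [redact(v) for v in value]
    return value

def vigil_settings(body: str):
    """Return redacted settings if the body has the expected keys, else None."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict) or not SETTINGS_KEYS.issubset(payload):
        return None
    return redact(payload)

# Hypothetical body for illustration.
sample = json.dumps({"scanner": {"input_scanners": ["yara"]},
                     "embedding": {"model": "openai", "openai_key": "sk-placeholder"},
                     "cache": {"max": 100}})
print(vigil_settings(sample))
```

The name-based hints over-redact on purpose (a field called "keywords" would also be masked), which is the safer failure mode for stored findings.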

Cross-Platform Notes

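Conjunctive matching required. Most platform names ("guardrails", "vigil", "rebuff", "guard") are common English words. The queries above pair signals wherever the name is generic (port + name, or name + endpoint path) and rely on a single term only when the string is distinctive, such as a full model path or an endpoint unique to one product. Single-term body matches on generic names will produce population-scale noise.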
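
A small builder can enforce the pairing rule for generic names by refusing to emit a one-signal query. This is an illustrative helper, not part of the catalog's tooling; it only formats strings in the `port:` / `http.html:` syntax already used in the tables.

```python
from typing import Optional

def shodan_query(*html_terms: str, port: Optional[int] = None) -> str:
    """Build a conjunctive Shodan query from a port and/or http.html terms."""
    signals = len(html_terms) + (1 if port is not None else 0)
    if signals < 2:
        raise ValueError("need at least two signals (port + term, or two terms)")
    if any('"' in term for term in html_terms):
        raise ValueError("terms must not contain double quotes")
    parts = [f"port:{port}"] if port is not None else []
    parts += [f'http.html:"{term}"' for term in html_terms]
    return " ".join(parts)

print(shodan_query("vigil", port=5000))
print(shodan_query("/v1/guard", "lakera"))
```

Distinctive single-term queries would bypass this helper and be written by hand, which keeps the exception visible in review.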

Port 8000 congestion. Six of these 12 platforms default to port 8000. Any port:8000 sweep must confirm platform identity via that platform's identity-probe before claiming a finding.
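
A sketch of that confirmation step for the three API-server platforms whose probes are path-specific (NeMo Guardrails, Guardrails AI, LLM Guard). It takes responses that were already collected, keyed by request path; the function name and input shape are assumptions, and the match rules are taken from the identity-probe rows above.

```python
import json

def _parse(body: str):
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        return None

def identify_port_8000(probes: dict[str, tuple[int, str]]) -> list[str]:
    """probes maps request path -> (status, body); returns matching platforms."""
    hits = []
    status, body = probes.get("/v1/rails/configs", (0, ""))
    if status == 200 and isinstance(_parse(body), list):
        hits.append("NeMo Guardrails")
    status, body = probes.get("/guards", (0, ""))
    if status == 200 and isinstance(_parse(body), list):
        hits.append("Guardrails AI")
    _, body = probes.get("/swagger.json", (0, ""))
    spec = _parse(body)
    info = spec.get("info") if isinstance(spec, dict) else None
    if isinstance(info, dict) and info.get("title") == "LLM Guard API":
        hits.append("LLM Guard")
    return hits

print(identify_port_8000({"/v1/rails/configs": (200, "[]"), "/guards": (404, "")}))
```

An empty result means "unidentified", not "clean": the host may be a model server or an unrelated service on the same port.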

Zero-auth is the class-level finding. NeMo Guardrails, Guardrails AI, LLM Guard, Vigil, PromptGuard, LlamaGuard hosting servers, and Rebuff all ship with auth off. This is not individual misconfiguration — it is the default posture for the category. Any internet-exposed instance is likely unintentionally open.

Cross-reference with existing surveys. LlamaGuard and PromptGuard surface in model-serving surveys (§3). Port 8000 sweeps from prior LLM-orchestration and model-serving surveys (01-llm-orchestration.md, 03-model-serving.md) may already contain guardrail-server hits. Re-query those corpora for Llama-Guard, rails/configs, and LLM Guard API before running fresh Shodan harvests.
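
Re-querying an existing corpus can be as simple as a case-insensitive scan for the three markers named above. The sketch below works on any iterable of text lines, since the storage format of the prior survey corpora is not specified here; the function name and sample lines are illustrative.

```python
from typing import Iterable

MARKERS = ("Llama-Guard", "rails/configs", "LLM Guard API")

def requery(lines: Iterable[str]) -> dict[str, list[int]]:
    """Map each marker to the 1-based line numbers where it appears."""
    hits: dict[str, list[int]] = {marker: [] for marker in MARKERS}
    for number, line in enumerate(lines, start=1):
        lowered = line.lower()
        for marker in MARKERS:
            if marker.lower() in lowered:
                hits[marker].append(number)
    return hits

# Hypothetical corpus lines for illustration.
corpus = [
    '203.0.113.7:8000 {"data":[{"id":"meta-llama/Llama-Guard-3-8B"}]}',
    "203.0.113.9:8000 <title>LLM Guard API - Swagger UI</title>",
    "203.0.113.11:8000 nothing relevant",
]
print(requery(corpus))
```

An open file handle can be passed directly as `lines`, so a harvested results file would not need to be loaded into memory first.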

See also: shodan/queries/24-llm-safety-guardrail-policy.md — broader categorical coverage including OPA, content moderation SaaS, and AI governance platforms.