AI Evaluation / Red-Team — Shodan Query Catalog

Source: https://github.com/nuclide-research/AI-LLM-Infrastructure-OSINT/blob/main/shodan/queries/ai-eval-redteam-queries

Generated: 2026-05-27 from pre-survey OSINT pass (13 platforms)

See: data/platform-intel/ai-eval-redteam-osint-2026-05-27.md for full intel

Companion file: shodan/queries/23-ai-safety-eval.md (prior pass with confirmed hit counts)

Confirmed prior pass (2026-05-04): Promptfoo 22, LangSmith 96, Garak ~0, DeepEval ~0. The ecosystem is CLI-dominant: Garak, PyRIT, PromptBench, OpenAI Evals, and RAGAS have no HTTP server mode and produce 0 Shodan hits as standalone services.


Promptfoo

Auth default: off (CSRF present, no auth gate on API routes)

Exposure class: Red-team configs, adversarial prompt libraries, eval results with LLM outputs, provider endpoint references

| Label | Query | Rationale | FP Risk |
| --- | --- | --- | --- |
| primary | `http.html:"promptfoo"` | HTML-scoped; confirmed 22 hits in prior pass | Low |
| secondary | `http.title:"promptfoo"` | Title-scoped; confirmed 17 hits in prior pass | Low |
| port-only | `port:15500` | ~730 hits, but the port is shared; FP without HTML conjunction | High |
| api-route | `http.html:"/api/evals"` | Unique Promptfoo eval route in HTML (SPA bundle) | Low |
| api-route-2 | `http.html:"/api/redteam"` | Red-team route in SPA bundle | Low |
| mcp-mode | `port:3100 http.html:"promptfoo"` | MCP server mode + HTML confirmation | Low |
| cert | `ssl.cert.subject.cn:"promptfoo"` | TLS cert CN | Low |
| rdns | `hostname:"promptfoo"` | rDNS hostname pattern | Med |
| identity-probe | `GET /api/user/email` → `{"email": null}` | Confirms unauthenticated Promptfoo instance | |
| identity-probe-2 | `GET /api/evals` → JSON array with `createdAt` + `results` fields | Confirms eval data exposure | |
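
The two Promptfoo identity probes can be checked offline against response bodies that have already been saved. The following is a minimal sketch, not Promptfoo's own logic: the function name `classify_promptfoo` is hypothetical, and the response shapes are taken only from the probe descriptions in this catalog (a literal `{"email": null}` object and a top-level array whose entries carry `createdAt` and `results`).

```python
import json


def classify_promptfoo(email_body: str, evals_body: str) -> dict:
    """Interpret saved bodies from the two identity probes.

    email_body: body returned by GET /api/user/email
    evals_body: body returned by GET /api/evals
    """
    result = {"unauthenticated": False, "eval_data_exposed": False}

    try:
        email = json.loads(email_body)
    except json.JSONDecodeError:
        email = None
    # Catalog marker: the object {"email": null}.
    if isinstance(email, dict) and "email" in email and email["email"] is None:
        result["unauthenticated"] = True

    try:
        evals = json.loads(evals_body)
    except json.JSONDecodeError:
        evals = None
    # Assumption: a non-empty top-level array whose entries all carry
    # both createdAt and results.
    if isinstance(evals, list) and evals:
        result["eval_data_exposed"] = all(
            isinstance(entry, dict)
            and {"createdAt", "results"}.issubset(entry)
            for entry in evals
        )
    return result
```

Keeping the two flags separate mirrors the table: the first probe speaks to the auth state, the second to whether eval data is actually readable.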

LangSmith (self-hosted)

Auth default: off on pre-v0.10 deployments (AUTH_TYPE=none); v0.10+ defaults to basic auth

Exposure class: Full LLM call traces (inputs/outputs/tokens), prompt templates, eval datasets, API keys in trace metadata

| Label | Query | Rationale | FP Risk |
| --- | --- | --- | --- |
| primary | `http.html:"langsmith"` | HTML-scoped; confirmed 96 hits in prior pass | Low |
| secondary | `http.title:"LangSmith"` | Title-scoped; confirmed 67 hits in prior pass | Low |
| port-frontend | `port:1980 http.html:"langsmith"` | LangSmith Nginx frontend port + body confirmation | Low |
| port-api | `port:1984 http.html:"langsmith"` | Backend API port + body confirmation | Low |
| port-only | `port:1984 http.status:200` | 3,061 hits, but the port is shared; FP without HTML conjunction | High |
| info-endpoint | `http.html:"/info" port:1980` | LangSmith /info endpoint returning instance_flags JSON | Med |
| traces-endpoint | `http.html:"/api/v1/runs"` | Run trace API path in HTML | Med |
| header | `http.html:"X-Tenant-Id"` | LangSmith-specific multi-tenancy header reflected in docs/error pages | Med |
| cert | `ssl.cert.subject.cn:"langsmith"` | TLS cert CN | Low |
| rdns | `hostname:"langsmith"` | rDNS hostname pattern | Low |
| identity-probe | `GET /api/v1/runs?limit=10` → JSON array with `run_type`, `inputs`, `outputs` fields | Confirms unauthenticated trace exposure | |
| identity-probe-2 | `GET /info` → `{"instance_flags": {...}, "version": "..."}` | Confirms LangSmith instance + version disclosure | |
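
Because the `/info` probe discloses a version and the auth default changes at v0.10, a saved `/info` body can be mapped to the expected default. The sketch below is illustrative only: `langsmith_version` and `default_auth_off` are hypothetical helper names, and it assumes the `version` field is a dotted `major.minor[.patch]` string, which this catalog does not confirm.

```python
import json
from typing import Optional


def langsmith_version(info_body: str) -> Optional[str]:
    """Return the disclosed version if the body matches the /info shape."""
    try:
        doc = json.loads(info_body)
    except json.JSONDecodeError:
        return None
    if not isinstance(doc, dict):
        return None
    # Catalog marker: an instance_flags object next to a version string.
    if not isinstance(doc.get("instance_flags"), dict):
        return None
    version = doc.get("version")
    return version if isinstance(version, str) else None


def default_auth_off(version: str) -> Optional[bool]:
    """True if the version predates v0.10, None if it cannot be parsed.

    Assumption: versions look like "0.9.1" or "v0.10.0".
    """
    parts = version.lstrip("v").split(".")
    try:
        major, minor = int(parts[0]), int(parts[1])
    except (IndexError, ValueError):
        return None
    return (major, minor) < (0, 10)
```

A `True` result only says what the shipped default was; an operator may have enabled auth on an older deployment, so the traces probe remains the actual confirmation.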

Inspect AI (UK AISI)

Auth default: off (no auth mechanism; binds to localhost by default, so exposure requires external binding)

Exposure class: Eval log files with task inputs, model outputs, scores; full benchmark results; dataset sample contents

| Label | Query | Rationale | FP Risk |
| --- | --- | --- | --- |
| primary | `port:7575 http.html:"inspect"` | Inspect AI default port + HTML confirmation | Low |
| port-only | `port:7575` | Port is not a common shared port; lower FP than 8000/8080 | Med |
| api-logs | `port:7575 http.html:"/api/logs"` | Log viewer API endpoint path | Low |
| title | `http.title:"inspect" port:7575` | Title-based (page title may be “Inspect” or “Inspect AI”) | Low |
| alt-port | `port:6565 http.html:"inspect"` | Common alternate port from `--port` flag | Med |
| package-id | `"inspect-ai"` | Package identifier in any indexed field | Med |
| cert | `ssl.cert.subject.cn:"inspect"` | TLS cert CN; high FP risk given common word | High |
| identity-probe | `GET /api/logs` → JSON array of eval log entries with `eval_id`, `task`, `model` fields | Confirms Inspect AI log viewer | |
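
Several identity probes in this catalog reduce to the same test: the body is a JSON array whose entries all carry a known set of field names. A generic checker for that pattern might look like the following; `is_record_array` and `INSPECT_LOG_FIELDS` are hypothetical names, and treating an empty array as a non-match is a design assumption rather than something the catalog specifies.

```python
import json
from typing import Iterable


def is_record_array(body: str, required: Iterable[str]) -> bool:
    """True if body is a non-empty JSON array of objects that all
    contain every field name in `required`."""
    required_fields = set(required)
    try:
        doc = json.loads(body)
    except json.JSONDecodeError:
        return False
    if not isinstance(doc, list) or not doc:
        return False
    return all(
        isinstance(item, dict) and required_fields.issubset(item)
        for item in doc
    )


# Field names from the Inspect AI identity probe.
INSPECT_LOG_FIELDS = ("eval_id", "task", "model")
```

The same helper applies to the other array-shaped probes by swapping the field tuple, for example `("run_spec", "stats", "adapter_spec")` for HELM.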

HELM (Stanford CRFM)

Auth default: off (read-only static result viewer, no auth)

Exposure class: Benchmark evaluation results, full prompt/response logs, model comparison tables

| Label | Query | Rationale | FP Risk |
| --- | --- | --- | --- |
| primary | `port:8000 http.html:"HELM" http.html:"scenarios"` | HELM-specific “scenarios” terminology + port | Low |
| secondary | `http.title:"HELM" port:8000` | Title-scoped with port | Med |
| api-runs | `port:8000 http.html:"/api/runs"` | HELM result API path | Med |
| field | `port:8000 http.html:"run_spec"` | Unique HELM JSON field name in SPA bundle | Low |
| field-2 | `port:8000 http.html:"adapter_spec"` | HELM adapter specification field | Low |
| alt-port | `port:8080 http.html:"HELM" http.html:"scenarios"` | Some configurations use 8080 | Med |
| crfm | `http.html:"crfm" http.html:"HELM"` | Stanford CRFM identifier in page | Low |
| identity-probe | `GET /api/runs` → JSON array with `run_spec`, `stats`, `adapter_spec` fields | Confirms HELM instance | |

TruLens

Auth default: off (Streamlit, no auth; inherits Streamlit exposure class)

Exposure class: LLM eval traces (full input/output per call), feedback scores, leaderboard across app versions, SQLite eval database

| Label | Query | Rationale | FP Risk |
| --- | --- | --- | --- |
| primary | `port:8501 http.html:"trulens"` | TruLens on default Streamlit port + body confirmation | Low |
| secondary | `http.title:"TruLens" port:8501` | Title-based | Low |
| generic-streamlit | `port:8501 http.html:"trulens"` | Streamlit with TruLens body marker | Low |
| feedback-field | `http.html:"trulens_feedback"` | TruLens-specific Streamlit component name | Low |
| trace-field | `http.html:"trulens_trace"` | TruLens trace component | Low |
| alt-port | `port:8502 http.html:"trulens"` | Alt Streamlit port (configured via `run_dashboard(port=8502)`) | Low |
| rdns | `hostname:"trulens"` | rDNS pattern | Low |
| identity-probe | `GET /` → 200 + `<title>TruLens</title>` or `trulens` in Streamlit page body | Confirms TruLens dashboard | |
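
Unlike the JSON probes, the TruLens identity probe inspects HTML. The sketch below applies the two markers from the probe row to a saved status code and page body, and reports which one matched; `trulens_marker` is a hypothetical name, and the case-insensitive body match is an assumption that is looser than the literal `trulens` string in the table.

```python
import re
from typing import Optional

# Capture the text of the first <title> element, tolerating attributes
# and surrounding whitespace.
_TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def trulens_marker(status: int, html: str) -> Optional[str]:
    """Return "title" or "body" for the marker that matched, else None."""
    if status != 200:
        return None
    match = _TITLE_RE.search(html)
    if match and match.group(1).strip() == "TruLens":
        return "title"
    if "trulens" in html.lower():
        return "body"
    return None
```

Returning the matched marker rather than a bare boolean keeps the stronger title match distinguishable from the weaker body match when results are reviewed later.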

Arthur Shield

Auth default: on (API key required; no documented defaults; requires explicit K8s secret provisioning)

Exposure class: If misconfigured: task configs, rule results, inference IDs, safety rule pass/fail decisions, retrieved context from hallucination detection

| Label | Query | Rationale | FP Risk |
| --- | --- | --- | --- |
| primary | `http.html:"arthur" http.html:"validate_prompt"` | Arthur Shield-specific API path terminology | Low |
| secondary | `http.html:"/api/v2/task" http.html:"arthur"` | Arthur Shield API route prefix | Low |
| docs-path | `http.html:"/docs" http.html:"arthur" http.html:"shield"` | Swagger UI path + product name | Med |
| field | `http.html:"rule_results" http.html:"arthur"` | Unique response field name | Low |
| field-2 | `http.html:"inference_id" http.html:"arthur"` | Inference ID field unique to Arthur Shield responses | Low |
| cert | `ssl.cert.subject.cn:"arthur"` | TLS cert CN (FP risk: “arthur” is a common name) | High |
| identity-probe | `GET /docs` → Swagger UI listing `/api/v2/task/{task_id}/validate_prompt` route | Confirms Arthur Shield instance | |

Patronus AI

Auth default: on (OIDC/OAuth2 in production; basic auth in POC)

Exposure class: If misconfigured: eval logs with LLM traces, evaluator profiles, dataset contents, account metadata

| Label | Query | Rationale | FP Risk |
| --- | --- | --- | --- |
| primary | `http.html:"patronus" http.html:"/evaluate"` | Patronus + evaluation endpoint path | Low |
| secondary | `http.html:"/v1/" http.html:"patronus"` | V1 API prefix + brand | Low |
| docs | `http.html:"/docs" http.html:"patronus"` | Swagger UI with Patronus brand | Med |
| mcp | `http.html:"patronus-mcp-server"` | Patronus MCP server package identifier | Low |
| identity-probe | `GET /v1/evaluate` → 200 or 401 with `{"detail": "..."}` JSON body | Confirms Patronus instance | |
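
The Patronus probe is the only one in this catalog that accepts an auth-rejected response as a positive signal. A minimal sketch of that check follows; `matches_patronus_probe` is a hypothetical name, and it assumes the `{"detail": "..."}` body is expected for both the 200 and the 401 case, which the probe row leaves ambiguous.

```python
import json


def matches_patronus_probe(status: int, body: str) -> bool:
    """True if a saved GET /v1/evaluate response fits the probe row:
    status 200 or 401 and a JSON object with a string `detail` field."""
    if status not in (200, 401):
        return False
    try:
        doc = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(doc, dict) and isinstance(doc.get("detail"), str)
```

A `{"detail": ...}` object is also the default error shape produced by FastAPI applications in general, so this check is weak on its own and is probably best treated as a confirmation step after one of the brand-scoped HTML queries has matched.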

DeepEval / Confident AI

Auth default: off (OSS local eval server); on for enterprise/Confident AI cloud

Exposure class: Test case corpora (prompts + expected/actual outputs), evaluation metric scores, LLM API key environment variable references

| Label | Query | Rationale | FP Risk |
| --- | --- | --- | --- |
| primary | `port:8000 http.html:"deepeval" http.html:"confident"` | Conjunction of both identifiers on default port | Low |
| secondary | `http.html:"/api/test-cases" http.html:"deepeval"` | Test-case endpoint path + product name | Low |
| health | `port:8000 http.html:"/api/health" http.html:"deepeval"` | Health endpoint + product name | Low |
| field | `http.html:"test_case_id" http.html:"deepeval"` | Unique test-case ID field | Low |
| cert | `ssl.cert.subject.cn:"deepeval"` | TLS cert CN | Low |
| rdns | `hostname:"deepeval"` | rDNS pattern | Low |
| identity-probe | `GET /api/test-cases` → JSON array with `test_case_id`, `input`, `expected_output`, `actual_output` fields | Confirms DeepEval eval server | |

No Shodan Surface (CLI-only / no HTTP server)

These platforms produce zero Shodan hits as standalone services. Indirect pivots are listed in the table below.

| Platform | Indirect Shodan Pivot |
| --- | --- |
| Garak | `http.html:"garak" http.html:"report.jsonl"` (accidentally exposed output directories) |
| PyRIT | `port:5000 http.html:"pyrit"` (Playground Labs only) |
| PromptBench | Pivot to Jupyter fingerprint on port 8888 |
| OpenAI Evals | `http.html:"oaieval" http.html:"eval_results"` (exposed result files) |
| RAGAS | Pivot to LangSmith/Langfuse fingerprints where RAGAS results are ingested |
| LlamaRisk | N/A: not an AI eval/red-team tool (DeFi org) |

Combined Sweeps

| Label | Query | Rationale |
| --- | --- | --- |
| port-sweep | `(port:15500 OR port:1980 OR port:1984 OR port:7575 OR port:8501)` | AI eval dedicated-port sweep |
| html-sweep | `(http.html:"promptfoo" OR http.html:"langsmith" OR http.html:"trulens")` | Auth-off platforms with confirmed population |
| title-sweep | `(http.title:"promptfoo" OR http.title:"LangSmith" OR http.title:"TruLens")` | Title-scoped sweep |
| rdns-sweep | `(hostname:"promptfoo" OR hostname:"langsmith" OR hostname:"deepeval" OR hostname:"trulens")` | rDNS sweep |
| cert-sweep | `(ssl.cert.subject.cn:"promptfoo" OR ssl.cert.subject.cn:"langsmith" OR ssl.cert.subject.cn:"deepeval")` | TLS cert sweep |
| api-route-sweep | `(http.html:"/api/evals" OR http.html:"/api/redteam" OR http.html:"/api/v1/runs" OR http.html:"/api/test-cases")` | Unique API route paths |
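
Each combined sweep is a single filter applied to a list of values and joined with `OR`, so the strings can be generated from the per-platform terms instead of being maintained by hand. The helper below is an illustrative sketch; `or_sweep` is a hypothetical name, and it only reproduces the string format used in this catalog without validating it against Shodan.

```python
from typing import Iterable


def or_sweep(filter_name: str, values: Iterable[object], quote: bool = True) -> str:
    """Build a parenthesised OR sweep such as (port:1980 OR port:1984)."""
    terms = []
    for value in values:
        # String-valued filters are quoted in the catalog; ports are not.
        rendered = f'"{value}"' if quote else str(value)
        terms.append(f"{filter_name}:{rendered}")
    return "(" + " OR ".join(terms) + ")"


port_sweep = or_sweep("port", [15500, 1980, 1984, 7575, 8501], quote=False)
html_sweep = or_sweep("http.html", ["promptfoo", "langsmith", "trulens"])
```

Generating the sweeps this way keeps them in step with the per-platform tables when a platform is added or a default port changes.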