Cat-07 RAG Frameworks — Shodan Query Log (2026-05-31)

Source: https://github.com/nuclide-research/AI-LLM-Infrastructure-OSINT/blob/main/shodan/queries/rag-frameworks-query-log-2026-05-31

Queries were run through the Shodan web UI driven by Playwright; VPN: Mullvad US. A zero-hit query is still logged as a result. “Harvested” = unique IPs pulled into the corpus.

| Platform | Dork | Shodan hits | Harvested | Notes |
|---|---|---|---|---|
| RAGFlow | http.html:"ragflow" (favicon dork stale=0) | 1,674 | 17 | default-creds admin@ragflow.io/admin + CVE-2024-12433 RCE; ~50% FP expected (Insight #15); sampled for chain |
| AnythingLLM | http.title:"AnythingLLM" port:3001 | 154 | 30 | auth-off-default (single-user); sampled |
| Onyx | http.title:"Onyx" port:3000 | 71 | 30 | configurable auth (AUTH_TYPE=disabled possible); sampled |
| Perplexica | http.title:"Perplexica" | 64 | 20 | no-auth-by-default; LLM API keys in config.toml; sampled |
| Kotaemon | http.html:"kotaemon" | 17 | 16 | default-creds admin/admin; full |
| DocsGPT | http.html:"DocsGPT" | 14 | 13 | auth-off-default; CVE-2025-0868 pre-auth RCE; full |
| PrivateGPT | http.html:"privateGPT" (:8001 lock=0) | 8 | 8 | auth-off-default; variant rescued the dork; full |
| Quivr | http.html:"quivr" | 8 | 8 | auth-on-default (Supabase JWT); identity-only; full |
| Ragapp | http.html:"ragapp" | 4 | 4 | no-auth-by-design; /admin + /api/management/config; full |
| txtai | http.html:"txtai" | 3 | 2 | auth-off-default; full |
| Danswer | http.title:"Danswer" port:3000 | 0 | 0 | fully rebranded to Onyx |
| LightRAG | port:9621 http.html:"LightRAG" | 0 | 0 | Shodan-dark: SPA does not render name; port 9621 bare=519 but Chinese-cloud/WAF noise. Needs masscan + /health probe (Insight #21). |
| Cognita | http.html:"cognita"+"truefoundry" (+truefoundry alone) | 0 | 0 | Shodan-dark SPA |
| R2R | port:7272 http.html:"r2r" | 0 | 0 | Shodan-dark: JSON API, no HTML to index. Needs masscan 7272 + /v3/health probe. |
| Verba | http.html:"goldenverba" | 0 | 0 | Shodan-dark Next.js SPA |

Totals: 15 dorks run · 10 platforms returned hits · 5 returned 0 (1 rebrand-dead, 4 Shodan-dark SPA/JSON-API) · 148 unique IPs harvested for the chain.
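
The totals line can be re-derived from the per-platform rows. The sketch below is only a consistency check: the hits/harvested split for each row is read from the log's table cells, and the names `ROWS` and `summarize` are illustrative, not part of any NuClide tooling. It also assumes harvested IPs do not overlap across platforms, which the stated total of 148 suggests.

```python
# (platform, shodan_hits, harvested) as read from the query log table.
ROWS = [
    ("RAGFlow", 1674, 17),
    ("AnythingLLM", 154, 30),
    ("Onyx", 71, 30),
    ("Perplexica", 64, 20),
    ("Kotaemon", 17, 16),
    ("DocsGPT", 14, 13),
    ("PrivateGPT", 8, 8),
    ("Quivr", 8, 8),
    ("Ragapp", 4, 4),
    ("txtai", 3, 2),
    ("Danswer", 0, 0),
    ("LightRAG", 0, 0),
    ("Cognita", 0, 0),
    ("R2R", 0, 0),
    ("Verba", 0, 0),
]


def summarize(rows):
    """Recompute the totals line from the per-platform rows."""
    return {
        "dorks_run": len(rows),
        "platforms_with_hits": sum(1 for _, hits, _ in rows if hits > 0),
        "zero_hit": sum(1 for _, hits, _ in rows if hits == 0),
        # Assumes no IP was harvested under two different platforms.
        "harvested": sum(harvested for _, _, harvested in rows),
    }


SUMMARY = summarize(ROWS)
print(SUMMARY)
```

If a row is later corrected or a dork is re-run, re-running this check shows immediately whether the totals line still agrees with the table.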

Population note: RAGFlow’s 1,674 hits dominate the raw counts, but that figure is HTML-renderer-biased and ~50% FP per Insight #15. The auth-off-default SPA/JSON-API tier (LightRAG, Cognita, R2R, Verba) is Shodan-dark, the same HTML-renderer-vs-SPA split seen in Cat-29 and Insight #21. Measuring the true population for those platforms requires masscan plus a port probe.
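
As a rough illustration of the population note, the ~50% false-positive figure can be applied to the RAGFlow count and compared with the combined hits of the other nine platforms that returned results. This is back-of-the-envelope arithmetic, not a measurement: the 0.50 rate is the expected value quoted from Insight #15, and `fp_adjusted` is a hypothetical helper name.

```python
def fp_adjusted(hits: int, fp_rate: float) -> int:
    """Estimate true positives given a raw hit count and an assumed FP rate."""
    return round(hits * (1.0 - fp_rate))


RAGFLOW_HITS = 1674
ASSUMED_FP_RATE = 0.50  # "~50% FP expected" per Insight #15; an estimate only

# Raw Shodan hits for the other nine platforms that returned results.
OTHER_HITS = [154, 71, 64, 17, 14, 8, 8, 4, 3]

ragflow_estimate = fp_adjusted(RAGFLOW_HITS, ASSUMED_FP_RATE)
other_total = sum(OTHER_HITS)

print(ragflow_estimate)  # 837
print(other_total)       # 343
```

Under that assumption RAGFlow would still be the largest indexed population, but none of these figures say anything about the Shodan-dark tier, which contributes zero to both sides.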

Censys Platform queries (manual web UI, Free tier, 2026-05-31)

Logged by the same standing rule. “Confirmed/Unauth” = after our verification probe.

| Query (CenQL) | Censys hits | Harvested | Result |
|---|---|---|---|
| host.services.banner: "LightRAG" | gated | 0 | banner field Starter-gated on Free tier |
| host.services.port=9621 | ~1.2K | | faceted: uvicorn 331; vs Shodan’s undifferentiated 519 noise |
| host.services.port=9621 and host.services.software.product="uvicorn" | 185 | 100 | LightRAG candidates → 81 confirmed, 36 UNAUTH |

Censys recovered the Shodan-dark LightRAG tier: the Shodan HTML dork returned 0, while Censys returned 185 candidates, with 36 confirmed unauthenticated. This is the first NuClide finding sourced entirely from Censys. R2R (port 7272) and Cognita/Verba (port 8000) should be reachable the same way and are next.
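
The LightRAG numbers form a funnel, and the stage-to-stage ratios are easier to compare across platforms than the raw counts. The sketch below assumes the 81 confirmed hosts are a subset of the 100 harvested and the 36 unauthenticated hosts a subset of the 81 confirmed; the log does not state this explicitly. `FUNNEL` and `stage_rates` are illustrative names.

```python
# Funnel stages for the uvicorn-on-9621 Censys query, in order.
FUNNEL = [
    ("candidates", 185),
    ("harvested", 100),
    ("confirmed", 81),
    ("unauth", 36),
]


def stage_rates(funnel):
    """Return the ratio of each stage to the stage immediately before it."""
    rates = {}
    for (prev_name, prev_count), (name, count) in zip(funnel, funnel[1:]):
        rates[f"{name}/{prev_name}"] = count / prev_count
    return rates


RATES = stage_rates(FUNNEL)
for label, rate in RATES.items():
    print(f"{label}: {rate:.1%}")
```

Read this way, roughly four in five harvested candidates were confirmed as LightRAG, and a little under half of those confirmed hosts were unauthenticated, provided the subset assumption holds.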