Code Assistants — Shodan Query Catalog

Source: https://github.com/nuclide-research/AI-LLM-Infrastructure-OSINT/blob/main/shodan/queries/code-assistants-queries

Generated: 2026-05-27 from pre-survey OSINT pass (14 platforms)

See: data/platform-intel/code-assistants-osint-2026-05-27.md for full intel


Tabby (TabbyML)

Auth default: off (auth is optional; no token required in community edition)

Exposure class: Code completion endpoint, model identity, server config, running model list, all accessible without credentials

| Label | Query | Rationale | FP Risk |
| --- | --- | --- | --- |
| primary | `http.html:"tabbyml" port:8080` | “tabbyml” appears in bundled JS asset paths and meta tags | Low |
| secondary | `port:8080 http.html:"tabby" http.html:"/v1/completions"` | Combined signal: Tabby JS tag + completion path string in HTML | Low |
| swagger | `port:8080 http.html:"swagger-ui" http.html:"tabby"` | Swagger UI always present; combined with tabby body string | Med |
| identity-probe | `GET /v1/health` → `{"device":..., "model":..., "chat_model":...}` | chat_model + device fields in health response are Tabby-specific | |

FP note: Port 8080 is heavily shared. Never run port:8080 alone. The tabbyml string is the safest primary signal — appears in static asset paths.
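The identity-probe row can be turned into a small offline check. The sketch below works on a response body that has already been fetched from `/v1/health`; the function name is hypothetical, and treating both `device` and `chat_model` as required is an assumption drawn from the rationale column, not from Tabby's documentation.

```python
import json

# Field names come from the identity-probe row; requiring both of them
# is an assumption of this sketch.
TABBY_HEALTH_FIELDS = {"device", "chat_model"}


def looks_like_tabby_health(body: str) -> bool:
    """Return True if a /v1/health body carries the Tabby-specific fields."""
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        return False  # not JSON at all, so not a Tabby health response
    # The response must be a JSON object containing every expected field.
    return isinstance(payload, dict) and TABBY_HEALTH_FIELDS.issubset(payload)
```

A body such as an HTML error page or a generic `{"status": "ok"}` health response is rejected, which helps separate Tabby from other services sharing port 8080.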


Sourcegraph / Cody

Auth default: on (built-in auth; free-license instances promote all users to site-admin)

Exposure class: On free/misconfigured instances: full code search index, connected repo credentials, Cody conversation history

| Label | Query | Rationale | FP Risk |
| --- | --- | --- | --- |
| primary | `http.title:"Sourcegraph" port:7080` | Title is application-set; port 7080 is Sourcegraph default | Low |
| secondary | `port:7080 http.html:"sourcegraph-frontend"` | sourcegraph-frontend bundle name in HTML source | Low |
| alt-port | `http.title:"Sourcegraph" port:3080` | Older deployments used 3080 | Low |
| identity-probe | `GET /` → HTML with `sourcegraph-frontend` script bundle or `data-sourcegraph-app-version` meta attribute | Confirms Sourcegraph; version extraction possible | |

FP note: The title “Sourcegraph” is application-specific; collision risk is negligible. Port 7080 is not commonly used by other services.
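As a rough illustration of the identity probe and the version extraction it mentions, the sketch below inspects an already-fetched front page. The function name is hypothetical, and the attribute layout `data-sourcegraph-app-version="<value>"` is an assumption; the catalog only names the attribute, not its exact form.

```python
import re
from typing import Optional, Tuple

# Assumed layout: data-sourcegraph-app-version="<value>" somewhere in the HTML.
VERSION_RE = re.compile(r'data-sourcegraph-app-version="([^"]*)"')


def inspect_sourcegraph_html(html: str) -> Tuple[bool, Optional[str]]:
    """Return (is_sourcegraph, version_or_None) for a fetched front page."""
    match = VERSION_RE.search(html)
    # Either marker from the identity-probe row is enough to confirm.
    is_sourcegraph = "sourcegraph-frontend" in html or match is not None
    version = match.group(1) if match else None
    return is_sourcegraph, version
```

A page carrying only the bundle name is still reported as Sourcegraph, with the version left as `None`.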


Continue.dev

Auth default: N/A (IDE extension only, no server component)

Exposure class: N/A (exposure belongs to the configured backend: Ollama, vLLM, etc.)

| Label | Query | Rationale | FP Risk |
| --- | --- | --- | --- |
| | Not applicable | Continue.dev has no standalone server | |

FP note: No Shodan queries. Survey the backend model servers via Ollama/llama.cpp/vLLM surveys.


Refact.ai (self-hosted)

Auth default: off initially; community edition accepts any API key value

Exposure class: Model capability list, completion and chat endpoints, running model configuration

| Label | Query | Rationale | FP Risk |
| --- | --- | --- | --- |
| primary | `port:8008 http.html:"refact"` | “refact” appears in the web UI source | Med |
| secondary | `port:8008 http.html:"coding_assistant_caps"` | coding_assistant_caps is a Refact-specific endpoint name appearing in JS source | Low |
| caps-path | `port:8008 http.html:"refact-caps"` | /refact-caps path string in UI or JS bundle | Low |
| identity-probe | `GET /refact-caps` → `{"code_completion_models":..., "caps_version":..., "cloud_name":...}` | code_completion_models + caps_version fields are Refact-specific | |

FP note: Port 8008 is shared with some JupyterHub and other services. The coding_assistant_caps body string is the strongest low-FP signal.


Aider (browser mode)

Auth default: off (Streamlit has no authentication by default)

Exposure class: Full terminal/IDE session, git repo content, LLM API keys in environment, conversation history

| Label | Query | Rationale | FP Risk |
| --- | --- | --- | --- |
| primary | `port:8501 http.html:"aider" http.html:"streamlit"` | Aider Streamlit UI contains “aider” in page content + Streamlit framework identifier | Low |
| secondary | `port:8501 http.html:"AI pair programming"` | Aider’s own tagline present in the browser UI | Low |
| identity-probe | `GET /` → 200 + Streamlit HTML with “aider” in title or body text | Confirms Aider Streamlit instance | |

FP note: Port 8501 is Streamlit’s default — many non-Aider apps use it. Body string conjunct is mandatory. Population likely small; Aider is primarily a CLI tool.
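The rule that a body-string conjunct is mandatory can be enforced mechanically before a query is run. The following is a minimal sketch of such a lint; the function name and the set of "location-only" filters are assumptions for illustration, not part of Shodan's tooling.

```python
import shlex

# Filters that narrow by network location rather than application identity.
# This set is an assumption of the sketch, not an exhaustive Shodan list.
LOCATION_ONLY_FILTERS = {"port", "net", "country", "org", "asn"}


def has_identity_conjunct(query: str) -> bool:
    """True if the query has a non-negated filter beyond port/location."""
    for term in shlex.split(query):  # shlex keeps quoted phrases together
        if term.startswith("-"):
            continue  # negations exclude hosts; they do not identify the app
        name, sep, _value = term.partition(":")
        if sep and name not in LOCATION_ONLY_FILTERS:
            return True
    return False
```

With this check, `port:8501` alone is rejected, while the primary and secondary Aider queries pass because each carries an `http.html` term.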


code-server (Coder)

Auth default: on (auto-generated password); the auth: none config option disables all auth and is a common misconfiguration

Exposure class: With auth: none: full VS Code IDE, integrated terminal (root shell on container), filesystem, extension install. When password-protected: hash-as-cookie bypass (issue #7696) possible if the config file is readable

| Label | Query | Rationale | FP Risk |
| --- | --- | --- | --- |
| primary | `http.html:"code-server" port:8080` | “code-server” appears in page title and HTML | Med |
| secondary | `port:8080 http.html:"coder-options"` | `<meta id="coder-options">` is unique to code-server login page | Low |
| no-auth | `port:8080 http.html:"coder-options" -http.html:"password"` | coder-options page without a password form implies auth: none, i.e. fully open | Low |
| identity-probe | `GET /` → HTML with `<meta id="coder-options"` and `name="password"` input | Confirms code-server; absence of password field = auth disabled | |

FP note: Port 8080 is heavily shared. The coder-options meta element is the strongest discriminating signal. Linuxserver.io Docker images use port 8443 as the default — add port:8443 variant.
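The two-step logic of the identity probe (confirm code-server, then decide whether auth is disabled) can be sketched as a classifier over an already-fetched page. The function name and the three return labels are hypothetical; the markers are the ones listed in the table.

```python
import re

# Markers taken from the identity-probe row.
CODER_OPTIONS_RE = re.compile(r'<meta\s+id="coder-options"', re.IGNORECASE)
PASSWORD_INPUT_RE = re.compile(r'<input[^>]*name="password"', re.IGNORECASE)


def classify_code_server(html: str) -> str:
    """Classify a fetched page as not-code-server, password-auth, or auth-none."""
    if not CODER_OPTIONS_RE.search(html):
        return "not-code-server"  # discriminating meta element is missing
    if PASSWORD_INPUT_RE.search(html):
        return "password-auth"  # login form present, auth is enabled
    return "auth-none"  # code-server page without a password field
```

The same classifier applies unchanged to pages collected from the port 8443 variant, since it depends only on the HTML markers and not on the port.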


OpenDevin / All-Hands OpenHands

Auth default: off

Exposure class: Full agent execution environment (see category-09 survey)

| Label | Query | Rationale | FP Risk |
| --- | --- | --- | --- |
| | See category-09 survey | Covered in prior survey | |

FP note: Covered in cat-09. CVE-2026-34444 (lupa sandbox escape) and WebSocket auth bypass are load-bearing for this platform.


SWE-agent

Auth default: off (no authentication on web UI or Flask backend)

Exposure class: GitHub PAT submitted to agent, repository content, LLM API keys, full agent execution log

| Label | Query | Rationale | FP Risk |
| --- | --- | --- | --- |
| primary | `port:3000 http.html:"SWE-agent"` | “SWE-agent” in page HTML | Low |
| secondary | `port:8000 http.html:"swe-agent"` | Flask backend port with body string | Low |
| identity-probe | `GET /` on port 3000 → HTML with “SWE-agent” title; `GET /socket.io/?EIO=4&transport=polling` → socket.io handshake | socket.io endpoint confirms SWE-agent backend | |

FP note: Port 3000 is heavily polluted. Body string conjunct required. Population expected to be very small — SWE-agent is primarily a research tool not typically left running.
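The socket.io half of the identity probe returns an Engine.IO open packet: the character `0` followed by a JSON object with fields such as `sid` and `pingInterval`. A minimal parser for an already-fetched polling response might look like the sketch below; the function name is hypothetical. Many unrelated applications also use socket.io, so this check is best read together with the “SWE-agent” title match rather than on its own.

```python
import json
from typing import Optional


def parse_engineio_open(body: str) -> Optional[dict]:
    """Return the Engine.IO open-packet payload, or None if body is not one."""
    # Polling payloads may hold several packets separated by \x1e.
    first = body.split("\x1e", 1)[0]
    if not first.startswith("0{"):
        return None  # packet type 0 is "open"; anything else is not a handshake
    try:
        payload = json.loads(first[1:])
    except ValueError:
        return None
    if isinstance(payload, dict) and "sid" in payload and "pingInterval" in payload:
        return payload
    return None
```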


Cursor

Auth default: N/A (desktop application, no server)

Exposure class: N/A

| Label | Query | Rationale | FP Risk |
| --- | --- | --- | --- |
| | Not applicable | Desktop app only; no self-hosted server component | |

GitHub Copilot Enterprise / GHES

Auth default: on (GitHub OAuth enforced; Copilot itself has no unauthenticated path)

Exposure class: Misconfigured GHES: code repos, issues, PRs, user data; CVE-2024-9487 (SAML bypass) allows authentication bypass on affected versions

| Label | Query | Rationale | FP Risk |
| --- | --- | --- | --- |
| primary | `http.title:"GitHub Enterprise" port:443` | GHES serves this title on the front page | Low |
| secondary | `ssl.cert.subject.cn:"github" port:443 -http.title:"GitHub"` | Self-signed certs on internal GHES with “github” in CN | Med |
| saml-bypass | `http.title:"GitHub Enterprise" http.html:"saml"` | SAML-enabled GHES instances (CVE-2024-9487 surface) | Low |
| identity-probe | `GET /api/v3/` → `{"current_user_url":..., "hub":...}` | GitHub Enterprise REST API root; confirms GHES | |

FP note: GitHub.com itself will match http.title:"GitHub" — use port:443 -http.title:"GitHub" to exclude. http.title:"GitHub Enterprise" is specific to GHES deployments.


Codeium Enterprise (Windsurf Enterprise)

Auth default: on (SSO / SAML 2.0 enforced; no unauthenticated paths)

Exposure class: Auth-gated; no known unauthenticated exposure class; the /_route/ path pattern identifies the deployment

| Label | Query | Rationale | FP Risk |
| --- | --- | --- | --- |
| primary | `ssl.cert.subject.cn:"codeium" port:443` | Enterprise deployments use certs with “codeium” in CN | Low |
| secondary | `http.html:"/_route/api_server" port:443` | Client-configured enterprise endpoint path appears in HTML/JS | Low |
| identity-probe | `GET /_route/api_server/` → TLS-gated proprietary JSON | /_route/ path prefix is Codeium-specific | |

FP note: Population will be small (enterprise-only licensed product). Auth is enforced. These dorks identify the deployment for version/attribution purposes, not unauthenticated access.


FauxPilot

Auth default: off (dummy API key "dummy" accepted; no real auth mechanism)

Exposure class: Full code completion access, submitted code context, Triton model identity, GPU metrics on port 8002

| Label | Query | Rationale | FP Risk |
| --- | --- | --- | --- |
| primary | `port:5000 http.html:"codegen"` | “codegen” in HTML/response body of FauxPilot proxy | Med |
| secondary | `port:5000 http.html:"fauxpilot"` | “fauxpilot” in page source or error responses | Low |
| triton | `port:8000 http.html:"fauxpilot"` | Triton HTTP port co-located with FauxPilot | Low |
| metrics | `port:8002 http.html:"nv_inference"` | Triton metrics endpoint exposes nv_inference_* Prometheus metrics | Low |
| identity-probe | `POST /v1/engines/codegen/completions` with `Authorization: Bearer dummy` → `{"object":"text_completion","model":"codegen-..."}` | /v1/engines/codegen/ path + dummy key acceptance confirms FauxPilot | |

FP note: Port 5000 is heavily used (Flask default, AirPlay on macOS, etc.). The engines/codegen path is the strongest discriminating signal. Project is in maintenance mode; wild instances will be old deployments.
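For the metrics row, a fetched port 8002 response can be checked for the `nv_inference_*` family by parsing the Prometheus text format. The sketch below is one way to do that; the function name is hypothetical, and it only collects metric names without interpreting labels or values.

```python
def triton_inference_metrics(metrics_text: str) -> set:
    """Collect names of nv_inference_* metrics from Prometheus text output."""
    names = set()
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comment lines
        # A sample line is `name{labels} value` or `name value`.
        parts = line.split("{", 1)[0].split()
        if parts and parts[0].startswith("nv_inference_"):
            names.add(parts[0])
    return names
```

A non-empty result indicates a Triton metrics endpoint; it does not by itself prove FauxPilot, so it is most useful alongside a hit on the `engines/codegen` path.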


WizardCoder / CodeLlama via llama.cpp

Auth default: off

Exposure class: See llama.cpp / model-serving survey

| Label | Query | Rationale | FP Risk |
| --- | --- | --- | --- |
| model-filter | `port:8080 http.html:"llama.cpp" http.html:"codellama"` | llama.cpp server with CodeLlama model loaded | Low |
| model-filter-2 | `port:8080 http.html:"llama.cpp" http.html:"wizardcoder"` | llama.cpp server with WizardCoder model | Low |

FP note: These are sub-queries against the llama.cpp population, not standalone platform dorks. Run against existing llama.cpp survey results.
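Running the sub-queries against existing survey results amounts to a local filter over saved banners. The sketch below assumes each result is a Shodan-style banner dict with the page body under `http` → `html`; the function name and marker tuple are hypothetical.

```python
from typing import Iterable, List

# Model-name markers from the two sub-queries above.
CODE_MODEL_MARKERS = ("codellama", "wizardcoder")


def filter_code_models(banners: Iterable[dict]) -> List[dict]:
    """Keep banners whose HTML body mentions a code-model marker."""
    hits = []
    for banner in banners:
        # Tolerate banners where "http" or "html" is missing or null.
        html = ((banner.get("http") or {}).get("html") or "").lower()
        if any(marker in html for marker in CODE_MODEL_MARKERS):
            hits.append(banner)
    return hits
```

Matching is case-insensitive, so `CodeLlama-7b` in a page body is caught by the lowercase marker.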


JetBrains AI Service

Auth default: on (JWT/Bearer token required; no unauthenticated paths in official product)

Exposure class: Auth-gated; community proxy deployments may expose OpenAI-compatible endpoint

| Label | Query | Rationale | FP Risk |
| --- | --- | --- | --- |
| proxy | `port:8080 http.html:"jetbrains-ai"` | Community jetbrains-ai-proxy projects may expose this string | Med |
| identity-probe | `GET /v1/models` → OpenAI-format model list (if community proxy) | Confirms OpenAI-compatible proxy frontend | |

FP note: No standard self-hosted JetBrains AI server fingerprint exists. Official Mellum on-prem is enterprise-licensed and not publicly accessible. These dorks target community proxy instances only.
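The `/v1/models` probe can be validated against the OpenAI list shape, an object with `"object": "list"` and a `data` array of entries carrying an `id`. The sketch below extracts model IDs from an already-fetched body; the function name is hypothetical, and a match only confirms an OpenAI-compatible frontend, not that JetBrains AI sits behind it.

```python
import json
from typing import List


def openai_model_ids(body: str) -> List[str]:
    """Return model IDs if body is an OpenAI-format model list, else []."""
    try:
        payload = json.loads(body)
    except ValueError:
        return []
    if not isinstance(payload, dict) or payload.get("object") != "list":
        return []
    data = payload.get("data")
    if not isinstance(data, list):
        return []
    # Keep only well-formed entries with a string "id".
    return [m["id"] for m in data if isinstance(m, dict) and isinstance(m.get("id"), str)]
```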