Most recent
navigate open esc close Corpus index built 2026-06-07 23:58 UTC

← All research

Survey May 28, 2026

Model Serving and Registry Infrastructure Survey

NuClide Research · 2026-05-28


Summary

Shodan sweep across 11 model-serving and registry platforms. MLflow is the only platform with a live, indexable population — 10 confirmed unauthenticated instances spanning 6 cloud providers and 6 countries. Every other platform surveyed (vLLM, TorchServe, TensorFlow Serving, Ray Serve, BentoML, Seldon Core, KServe, ONNX Runtime Server, TGI, Triton) returned zero live hosts.

DCWF KSAT coverage

Auto-derived from DCWF AI work-role rule files (ksat-tag).

  • 672 (AI Test & Evaluation Specialist): K7003, K7004, K7044, S7068, S7070, S7075, T5858, T5904, T5919
  • 733 (AI Risk & Ethics Specialist): K7051, S7056, S7067, S7069, T5868, T5882, T5893, T5904
  • overlap (Common AI KSATs (all 5 roles)): K108, K1158, K1159, K22, K6311, K6935, K7003, K7024, K7045, K942, T5896

The zero-hit platforms are not zero-exposure — they are Shodan-dark. Ports 8081 (TorchServe management), 8082 (TorchServe metrics), 8501 (TF Serving), 8080 (TGI), 8002 (Triton metrics) are not crawled at indexable density. The correct path for those populations is masscan sweep against cloud ranges, not Shodan.

Among the 10 confirmed MLflow hosts: one shows an active exploitation chain in progress, readable without authentication from the public internet.


Dork Execution Log

QueryHitsPlatformResult
port:8000 "max_model_len" "vllm"10vLLMAll offline at probe — stale population
port:8081 "nextPageToken" "models"0TorchServePort not crawled
port:8081 "modelName" "modelUrl" "minWorkers"0TorchServePort not crawled
port:8082 "ts_"0TorchServePort not crawled
port:8501 "model_version_status" "AVAILABLE"0TF ServingResponse not indexed
port:8265 "ray_version"0Ray/api/version not indexed
port:8265 http.title:"Ray Dashboard"1RayOffline at probe
port:5000 "registered_models" "mlflow"0MLflowBody field too specific
port:5000 http.title:"MLflow"10MLflow10 live, all unauth
port:8080 "model_id" "model_dtype"0TGI/info fields not indexed
port:8080 "tokenization_workers" "max_total_tokens"0TGINot indexed
port:8000 "/v2/health/ready"0TritonNot indexed
port:8002 "nv_inference_request_success"0TritonPort not crawled
port:3000 "Bento-Name"0BentoMLHeader not indexed
port:9000 "/api/v1.0/predictions"20SeldonAll offline; 1 MinIO FP

vLLM — 10 Harvested, 0 Live

10 IPs from port:8000 "max_model_len" "vllm". All returned HTTP 000 at probe. Population exists in Shodan’s index but is transient — vLLM deployments appear short-lived or firewalled after initial exposure.

The security-relevant finding does not require a live population: vLLM’s --api-key flag does not protect management endpoints. /metrics, /tokenize, /health, /pause, /resume, and /update_weights bypass --api-key enforcement. Any instance reachable without authentication leaks the serving model’s identity and allows inference against an “secured” deployment. This is a class finding, not population-dependent.

Stale IPs (for record): 108.58.51.82, 67.78.191.77, 162.251.247.13, 108.252.249.145, 82.130.248.249, 113.30.160.211, 74.108.66.82, 24.139.33.243, 98.189.181.108, 173.92.133.57


TorchServe — Shodan-Dark

Port 8081 (management API, the ShellTorch surface) is not in Shodan’s crawl. Neither nextPageToken nor the modelName/modelUrl/minWorkers JSON field set returned any results. Port 8082 (Prometheus ts_ metrics) is also not crawled.

TorchServe CVE-2023-43654 (SSRF on model registration URL handler) is the relevant chain: any internet-exposed management port accepts an arbitrary model URL, the server fetches and loads it, and the loading process executes Python code in the model archive. Pre-patch this is direct RCE; post-patch it is SSRF. The management port is documented as localhost-only but ships bound to 0.0.0.0 in the default configuration.

Correct approach: masscan port 8081 against Hetzner/Scaleway/OVH /16 ranges, httpx filter for nextPageToken, aimap for identity confirmation.


Seldon Core — 20 Harvested, 0 Live

port:9000 "/api/v1.0/predictions" returned 20 hits. All 19 genuine candidates returned HTTP 000. Seldon Core pods run inside Kubernetes clusters; they expose port 9000 to the cluster network but not to the internet in any production deployment in this population.

FP identified: 94.72.112.137 returns HTTP 400 from Server: MinIO. MinIO catches all POST requests and returns a method-not-allowed body that contains the requested path — it looks like a 400 from Seldon until the Server header is read. Same FP class documented in Insight #22 (aimap dcm4chee-arc FP broadened to any ASP.NET/MinIO catchall).


MLflow — 10 Confirmed Unauthenticated

Dork: port:5000 http.title:"MLflow"

Verification method: POST /api/2.0/mlflow/experiments/search {"max_results":10} returns full JSON experiment arrays without credentials on all 10.

Population Table

IPCloudCountryServerArtifact BackendNotable
104.154.156.34GCPUSgunicornlocal /homeUsername wonjungy leaked
162.55.232.59HetznerDEgunicornmlflow-artifacts:/200M+ experiment IDs
20.13.144.13AzureUSgunicorn/mlruns/ local40 experiments, sequential
210.131.221.109Linode/AkamaiJPgunicornmlflow-artifacts:/900M+ experiment IDs
51.159.148.91ScalewayFRgunicornfile:///root/.venv/...Active exploitation confirmed
51.158.107.81ScalewayFRuvicornmlflow-artifacts:/196 experiments
168.119.201.8HetznerDEgunicorns3://mlflow/S3 backend, public bucket possible
172.203.208.10AzureUSuvicornlocal /home/elkmachine/...Username elkmachine leaked, 841 experiments
79.110.227.36HetznerFIuvicornmlflow-artifacts:/X-Frame-Options header, no API auth
101.202.128.3CN cloudCNgunicorns3://mlflow/S3 backend, 61+ experiments

Exposure Class

Unauthenticated POST /api/2.0/mlflow/experiments/search exposes: all experiment IDs, names, artifact storage locations, creation timestamps, and lifecycle state. With an experiment ID, POST /api/2.0/mlflow/runs/search {"experiment_ids":["N"],"max_results":50} returns all run metadata: metrics, hyperparameters, tags, artifact URIs, system metrics logged during training.

The registered-models endpoint (GET /api/2.0/mlflow/registered-models/list) returned 404 on all 10 — these instances run MLflow Tracking Server without the Model Registry component, or the registry is on a separate port. The artifact download surface remains open via /api/2.0/mlflow/artifacts/list and /get-artifact.

Username Leakage

Two instances expose operator home directory paths directly in the artifact_location field:

  • 104.154.156.34: /home/wonjungy/mlflow-data/artifacts/ — GCP, operator handle wonjungy
  • 172.203.208.10: /home/elkmachine/mlflow-env/mlruns/ — Azure, operator handle elkmachine

Cross-reference pivot: GitHub, HuggingFace, Docker Hub for these handles may surface model artifacts, training code, and additional infrastructure.

S3 Backend Exposure

Two instances use s3://mlflow/{experiment_id} artifact locations:

  • 168.119.201.8 (Hetzner DE)
  • 101.202.128.3 (CN cloud)

If the S3 bucket “mlflow” has a public ACL or permissive bucket policy, every artifact in every experiment — model weights, training data, evaluation outputs, pickle files — is downloadable without credentials. Verification: aws s3 ls s3://mlflow/ --no-sign-request. Not executed; surface open at probe time.

Active Exploitation: 51.159.148.91

This instance carries the attack record of an ongoing exploitation campaign, fully readable without authentication.

Operator profile: Legitimate HuggingFace AutoTrain text classification workload. Models: autotrain-xlm-roberta-tonality, autotrain-modernbert-test. Runs as root. Disk was 100% full at last legitimate run (120GB used).

Attack timeline (reconstructed from experiment timestamps):

PeriodExperiment IDsActivity
~2026-04-271cve_test_1778457616 — initial CVE probe
~2026-04-274-6scan_1778457759 through scan_1778457767 — scan phase
~2026-05-2847-58poc_autodiscover_probe_177976... — multi-Python PoC

Phase 1 — Recon (experiments 47-58): 12 experiments targeting 3 Python versions (3.7, 3.13, 3.15) across 4 LLM evaluation scorer packages (phoenix, deepeval, ragas, trulens). Each experiment sets its artifact_location to a file:// URI inside /root/.venv/lib/pythonX.Y/site-packages/mlflow/genai/scorers/{scorer}/. The attacker is probing which Python environments and scorer packages exist by seeing which artifact writes succeed.

Phase 2 — Exploitation (experiments 57-58): Two runs named rce-import-artifact-writer (run IDs 48b637... and c42467...), status RUNNING. The attacker uploaded a .py file — a Python import canary — into the scorer package directory tree via CVE-2026-2651 (unauthorized artifact upload in --serve-artifacts mode). When MLflow’s scorer discovery walks the package tree and imports the module, the Trigger class writes a marker file containing os.getuid(), os.getgid(), os.getpid(), and an attacker-controlled token to verify execution.

The operator runs MLflow as root. If import-time execution succeeds, the attacker has root code execution on the Scaleway instance.

CVE context: CVE-2026-2651 (MLflow artifact write without auth, --serve-artifacts mode). Also consistent with the LFI class CVE-2024-2928. The file:// artifact location scheme is the write primitive.


Shodan Coverage Gap — Model Serving Platforms

The zero results across 9 of 11 platforms reflect Shodan’s port coverage, not platform absence. Concrete alternatives:

PlatformPrimary PortShodan StatusAlternative
TorchServe management8081Not crawledmasscan Hetzner/Scaleway/OVH
TorchServe metrics8082Not crawledmasscan + httpx ts_ prefix filter
TF Serving REST8501Not indexed at field depthmasscan + httpx path probe
TGI8080Fields not indexedmasscan + httpx /info
Triton HTTP8000Path not indexedmasscan + httpx /v2 path probe
Triton metrics8002Not crawledmasscan
BentoML3000Header not indexedmasscan + httpx Bento-Name header
Ray Dashboard8265API path not indexedmasscan

Port 5000 is indexed (Flask default, heavy Shodan crawl) — this is why MLflow is uniquely visible. Every other platform defaults to non-crawled ports.


Attack Classes Documented (not executed)

MLflow supply chain via model injection: Unauth artifact upload (/api/2.0/mlflow/artifacts/upload or PUT /upload) on --serve-artifacts deployments writes to the artifact store. If the artifact store path is within the Python package tree (as on 51.159.148.91), the next MLflow scorer import executes attacker code. If the S3 bucket is writable (two instances use s3://mlflow/), model pickle files can be replaced with malicious weights. Malicious pickle executes on mlflow.pyfunc.load_model().

vLLM management endpoint bypass: --api-key on vLLM only gates /v1/* inference paths. Control endpoints respond to unauthenticated callers regardless: GET /metrics (Prometheus inference telemetry), GET /tokenize (tokenize any string), POST /update_weights (hot-swap model weights from URL), POST /pause / POST /resume (stop serving). The update_weights endpoint is the highest-severity primitive: an attacker with network access can point the serving process at a malicious model URL without credentials.

TorchServe ShellTorch (CVE-2023-43654): Management API at port 8081 accepts POST /models?url={arbitrary_url}. Pre-patch: the URL is fetched and loaded, executing code in the .mar archive. Post-patch: the fetch still occurs (SSRF) even with the RCE chain partially blocked. The management port ships bound to 0.0.0.0 despite documentation saying 127.0.0.1.


aimap Deepdive Results

aimap v1.9.36, -scan-all-fingerprints, 420 ports probed across 10 hosts, 46m56s. Summary: 24 open ports, 13 services, 17 findings (9 critical / 5 high / 1 medium / 2 low), 10 unauthenticated.

All 8 directly-confirmed MLflow instances found on port 5000. aimap classifies all as critical risk_level, citing CVE-2024-37052…37060 (RCE via malicious pickle model upload to unauth registry, code execution on pyfunc.load_model()).

New services beyond the MLflow surface:

172.203.208.10 — Elasticsearch 8.19.13, port 9200, unauthenticated, MEOW-COMPROMISED:

  • Cluster: my-cluster, node name: ELK-machine (same operator as MLflow finding #159, username elkmachine)
  • 7 indices: centific, centific-runtimelogs-dev-2026, centific-runtimelogs-development-2026, centific-runtimelogs-qa-2026, datafactory_logs, keycloak, read_me
  • 82,701 documents alive. read_me is the Meow-Actor-A extortion marker.
  • Attribution: BTC bc1q38rjul6gdamfflf6p4ukz0ymtvfgfv2j9saf6r, contact wendy.etabw@gmx.com, paste tli.sh/73x1k
  • State: compromised-marked, data not yet wiped. centific in 4 index names — probable operator/project name.
  • MLflow (#159) and Elasticsearch both unauthenticated on the same Azure host. The attacker already owns the Elasticsearch tier.

210.131.221.109 — Open Directory, port 80, CRITICAL:

  • Server: SimpleHTTP/0.6 Python/3.13.3 — Python’s built-in HTTP server serving from the working directory
  • Exposed: .claude/, CLAUDE.md, .claudeignore, .git/, .github/, .gitignore, .mypy_cache/, .pytest_cache/, approximately 54 entries total
  • .claude/ may contain Claude Code session state, hooks, or credentials
  • .git/ exposes full source commit history
  • Same host runs MLflow unauth on port 5000
  • The operator is running python3 -m http.server 80 (or equivalent) from their project root alongside their MLflow instance

101.202.128.3 — Harbor + MinIO, AUTH-REQUIRED (LOW/MEDIUM):

  • Ports 443 and 8888: Harbor container registry (Bearer realm=harbor-registry) — catalog access denied. Not a finding; auth enforced.
  • Port 9000: MinIO S3 API — returns AccessDenied. Auth is enabled on this MinIO instance.
  • Correction to earlier assessment: the s3://mlflow/ artifact backend on this host uses a local MinIO instance with auth enforced, not AWS S3. Public bucket access is not possible here.
  • Port 5000: MLflow confirmed unauth (separate auth state from MinIO/Harbor).

VisorLog updated: findings #166 (Elasticsearch Meow) and #167 (Open Directory) ingested.


Arsenal Coverage

ToolRunResult
JAXEN (Shodan harvest)Yes16 dorks executed; MLflow 10 hits, others 0 or stale
aimapDonev1.9.36, -scan-all-fingerprints, 46m56s. 13 services, 17 findings. 2 critical co-located discoveries. recon/model-serving-2026-05-28/aimap-mlflow.json
VisorGraph / recongraphAttemptedcrt.sh 502; no cert pivots returned
aimap-profileNot runNetwork access denied
JS-bundle analysisNot runMLflow 5000 is Python server; no SPA bundle secrets surface
VisorLogDone12 findings ingested (#152-#161, #166-#167); 9 HIGH, 3 CRITICAL
VisorScubaDoneAll 10 primary hosts score 0/10, AI.C1 violation
BARENot runPermission denied
VisorBishopNot runNetwork access denied
menlohuntNot applicable
nu-reconAttemptedNetwork access denied
VisorPlusPartially runvisorplus assess denied
VisorHollowSKIPBinary cannot execute
VisorAgentETHICAL STOPControlled targets only

Pivot Avenues

  1. elkmachine Elasticsearch — 172.203.208.10:9200 is open and Meow-marked. Pull /_cat/indices?v for full index list; read centific index for data class identification. The MLflow instance on the same host may have experiment data referencing the Elasticsearch pipeline.
  2. 210.131.221.109 open directoryGET http://210.131.221.109/.claude/ to enumerate Claude Code session state. GET http://210.131.221.109/CLAUDE.md for project instructions. git clone http://210.131.221.109/.git for full source history. Not executed.
  3. Exploitation run artifact readGET /api/2.0/mlflow/artifacts/list?run_uuid=48b6377316c441e3b71505a45dd94b18 on 51.159.148.91. Confirms whether the .py canary landed on an importable path.
  4. Username cross-referencewonjungy (GCP), elkmachine (Azure), centific (ES index name) across GitHub, HuggingFace, Docker Hub, LinkedIn.
  5. S3 bucket check — 168.119.201.8 still uses s3://mlflow/ with gunicorn (not MinIO as on 101.202.128.3). aws s3 ls s3://mlflow/ --no-sign-request against this host’s backend. 101.202.128.3 S3 bucket is MinIO with auth enabled — strike that pivot.
  6. TorchServe masscan lane — masscan 8081 against Hetzner 95.216.0.0/14, Scaleway 51.158.0.0/16, OVH 135.125.0.0/16. httpx filter for nextPageToken.
  7. vLLM re-harvestport:8000 "vllm" http.status:200 when Shodan index refreshes; also masscan Vast.ai and RunPod ranges.

Candidate Insight

Candidate Insight #50: MLflow’s default-no-auth posture is uniquely Shodan-visible because port 5000 is heavily crawled (Flask default). No other model-serving platform has equivalent Shodan coverage. This makes MLflow an outlier: its population is surveyable entirely via passive means while every other ML serving platform (TorchServe, TF Serving, vLLM, Triton, TGI) requires active masscan sweeps to find. Port 5000 as a survey anchor is specific to MLflow/Flask-family deployments.


Query Catalog

# MLflow (productive)
port:5000 http.title:"MLflow"                               → 10 hits, confirmed live, all unauth

# MLflow (dead — too specific for Shodan body index)
port:5000 "registered_models" "mlflow"                      → 0 hits

# vLLM (stale population -- indexing lag)
port:8000 "max_model_len" "vllm"                            → 10 hits, all offline at probe
port:8000 "owned_by":"vllm"                                 → 0 hits (JSON field depth not indexed)

# Not Shodan-indexable -- use masscan instead
port:8081 "nextPageToken" "models"                          → 0 hits (TorchServe)
port:8082 "ts_"                                             → 0 hits (TorchServe metrics)
port:8501 "model_version_status" "AVAILABLE"                → 0 hits (TF Serving)
port:8265 "ray_version"                                     → 0 hits (Ray)
port:8080 "model_id" "model_dtype"                          → 0 hits (TGI)
port:8000 "/v2/health/ready"                                → 0 hits (Triton)
port:8002 "nv_inference_request_success"                    → 0 hits (Triton metrics)
port:3000 "Bento-Name"                                      → 0 hits (BentoML)