Multi-host case May 6, 2026

AIPOD orthodontic AI MLflow + Label Studio + S3 stack, CVE-2023-1177 actively-exploited (138.197.152.103)

Sector
Commercial

NuClide Research · 2026-05-06

Summary

DigitalOcean droplet 138.197.152.103 runs an end-to-end orthodontic-AI R&D stack that has been operational and unauthenticated since March 2023. Three production AI services on the same host:

DCWF KSAT coverage

Auto-derived from DCWF AI work-role rule files (ksat-tag).

  • 672 (AI Test & Evaluation Specialist): K7003, K7004, K7044, S7068, S7070, S7075, T5904, T5919
  • 733 (AI Risk & Ethics Specialist): K7040, S7067, T5854, T5868, T5893
  • overlap (Common AI KSATs (all 5 roles)): K1158, K1159, K22, K6311, K6900, K6935, K7003, K7024
| Port | Service | Auth | Vulnerability |
|---|---|---|---|
| 5000 | MLflow 2.2.1 | NONE | CVE-2023-1177 path-traversal RCE, actively exploited since 2026-03-26 with 18 attacker-injected experiments |
| 8080 | Label Studio 1.5.0.post0 (Jul 2022 release, 3 years stale) | Token required | CVE-2024-23633 LFR + CVE-2024-24566 SSRF apply to this version |
| s3://aipod-crop/ | S3 artifact bucket (us-east-2) | Private (403 on public probes) | Bucket exists, no public objects |

Full chain ran via bash data/visor-chain-runner.sh mlflow-cve plus follow-up enumeration via direct MLflow REST API (Methodology Insight #6 conjunctive matchers).

What AIPOD does (operator-IP exfil from MLflow metadata)

The operator is a dental-AI / orthodontic-AI startup developing an end-to-end pipeline of medical-imaging models:

| Year | Experiment | Runs | Model task |
|---|---|---|---|
| 2023 | /demo-experiment (id 155786267…) | 15 | Initial validation experiment |
| 2023 | initial-model (id 701546096…) | 3 | First production attempt; train_data_version train_version_1.csv |
| 2023–2024 | real-exp (id 956907690…) | 53 | Multi-class classifier with MSE loss + cross-entropy; cosine warmup; ReduceLROnPlateau scheduler; 256x256 image input |
| 2024 | pan-segmentation (id 148418839…) | 13 | Panoramic dental X-ray segmentation; training datasets pan_set_1, pan_set_2, pan_set_3; cosine warmup; IoU 0.66 / val loss 0.61 best |
| 2025 | ceph-keypoint (id 804275185…) | 14 | Cephalometric (lateral skull X-ray) keypoint detection; 256x256 input; 4,428 train / 3,542 val examples; final RMSE 0.0109 |
| 2026 | orthodontic-upper-multitask (id 583324192…) | 19 | Multi-task upper-jaw classifier (arch + alignment); developer gaurav; fold-cross-validated; best avg_combined_f1: 0.4899 (R&D-stage, not production) |

The pipeline progression is consistent with a methodical orthodontic-AI roadmap: foundational classifier → panoramic segmentation → cephalometric keypoint detection → multi-task arch+alignment fusion. Four years of R&D investment are exposed through the MLflow metadata.

Developer roster (extracted from mlflow.user and mlflow.source.name)

  • gaurav, Mac developer (path /Users/gaurav/Documents/usa_work/ULClassification/); only active on the 2026 multi-task work; offshore developer signature (usa_work directory naming pattern)
  • ubuntu, production droplet’s default user; ran the 2023–2024 training waves; src paths src/models/train.py and src/model/train.py
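
A roster like the one above can be rebuilt from run tags. The following is a minimal sketch, assuming run records shaped like MLflow REST output (a `data.tags` list of key/value pairs); `build_roster` and `sample_runs` are hypothetical names, and the sample records are illustrative rather than a dump from the host.

```python
from collections import defaultdict

def build_roster(runs):
    """Group source paths by the mlflow.user tag found on each run."""
    roster = defaultdict(set)
    for run in runs:
        # Flatten the key/value tag list into a dict for easy lookup.
        tags = {t["key"]: t["value"] for t in run.get("data", {}).get("tags", [])}
        user = tags.get("mlflow.user", "<empty>")
        source = tags.get("mlflow.source.name")
        if source:
            roster[user].add(source)
    return {user: sorted(paths) for user, paths in roster.items()}

# Illustrative records only; the tag values mirror those quoted in this record.
sample_runs = [
    {"data": {"tags": [
        {"key": "mlflow.user", "value": "ubuntu"},
        {"key": "mlflow.source.name", "value": "src/models/train.py"},
    ]}},
    {"data": {"tags": [
        {"key": "mlflow.user", "value": "ubuntu"},
        {"key": "mlflow.source.name", "value": "src/model/train.py"},
    ]}},
]

roster = build_roster(sample_runs)
print(roster)
```

For an operator, the same grouping is a quick way to see which usernames and local paths an exposed tracking server discloses.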

Git commits leaked through MLflow tags

| Commit | Period | Source path |
|---|---|---|
| 34fb854192012a8da1c409abbeb13939112df9fc | 2023-03 to 2023-04 | src/models/train.py |
| f32e5d52f16c83f01bac8b654da1e8bd8f4754b4 | 2023-06 to 2024-05 (~12 months main branch) | src/models/train.py |
| daa9915c… | 2024-04 (refactor) | src/model/train.py (singular model) |
| dfe5665b8a0af217dc632d313245d0640e08b18d | ceph-keypoint context | src/model/train.py |
| 0024a538f1c70c660ac9391048fc5d1e603fe89a | pan-segmentation context | src/model/train.py |

GitHub commit-search for these SHAs returns 0 hits, which suggests the operator’s repos are private or hosted elsewhere. The commit hashes are nonetheless useful as forensic fingerprints if the operator’s GitHub Enterprise / GitLab self-host ever surfaces.
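
If a candidate repository does surface, the fingerprint check could look like the sketch below. It assumes the repository's commit list is already available as plain strings (for example, the output of `git rev-list --all`); `match_fingerprints` and `candidate_shas` are hypothetical names, and the truncated `daa9915c` entry is matched as a prefix only, never completed.

```python
LEAKED = [
    "34fb854192012a8da1c409abbeb13939112df9fc",
    "f32e5d52f16c83f01bac8b654da1e8bd8f4754b4",
    "daa9915c",  # truncated in this record; prefix match only
    "dfe5665b8a0af217dc632d313245d0640e08b18d",
    "0024a538f1c70c660ac9391048fc5d1e603fe89a",
]

def match_fingerprints(repo_shas, fingerprints=LEAKED):
    """Return {fingerprint: [matching repo SHAs]} using case-insensitive prefix matching."""
    hits = {}
    for fp in fingerprints:
        found = [sha for sha in repo_shas if sha.lower().startswith(fp.lower())]
        if found:
            hits[fp] = found
    return hits

# Hypothetical commit list from a candidate repository.
candidate_shas = [
    "0024a538f1c70c660ac9391048fc5d1e603fe89a",
    "ffffffffffffffffffffffffffffffffffffffff",
]
hits = match_fingerprints(candidate_shas)
print(hits)
```

A full 40-character match is strong evidence of shared history; a hit on the 8-character prefix alone is weaker and should be treated as a lead, not a confirmation.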

Activity timeline

2023-03-10 ────●  /demo-experiment
2023-03-17 ────●  initial-model           ← year-1: foundational
2023-03-20 ────●  real-exp (53 runs over 2023-2024)
2024-03-26 ────●  pan-segmentation        ← year-2: dental X-ray segmentation
2024-05-04 ────●  last `ubuntu` legit run

       (~11 months of silence on MLflow surface - possibly migrated production
        elsewhere; this droplet kept as stale dev / artifact repository)

2025-04-13 ────●  ceph-keypoint (14 runs)  ← year-3: keypoint detection
2026-03-23 ────●  orthodontic-upper-multitask (19 runs in one day, gaurav)
                                          ← year-4: multi-task fusion
2026-03-26 ────●  CVE-2023-1177 spray actor finds the host (3 days after gaurav's burst)
2026-04-10 to 2026-04-23 ────● wave of /etc/ traversals
2026-04-20 ────● 4x /root/.ssh/ + 1x /root/ traversals (SSH key hunt)
2026-05-01 ────● new attacker campaign IDs
2026-05-05 ────● `exp_103` injection
2026-05-06 06:54 UTC ── ●  `poc_exp` injection (16h before NuClide re-probe)

The CVE-2023-1177 spray actor landed 3 days after the operator’s most recent visible legitimate activity. Possibilities: (a) coincidental population-scale spray, (b) a Shodan harvest noticed the activity, (c) someone signaled the host. The 3BT8ncOzBWAH4GyIGz0EXsSwj7f UUID appears on multiple tier-2 MLflow hosts, which points to a population-scale actor; the synthesis paper documents this UUID on both 138.197.152.103 and 159.203.110.202.
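
The intervals in the timeline can be checked with plain date arithmetic. This is a small sketch using only dates that appear in the timeline; the dictionary keys and the `gap_days` helper are hypothetical names chosen for illustration.

```python
from datetime import date

# Dates copied from the activity timeline.
events = {
    "last_ubuntu_run": date(2024, 5, 4),
    "ceph_keypoint": date(2025, 4, 13),
    "gaurav_burst": date(2026, 3, 23),
    "first_spray": date(2026, 3, 26),
    "poc_exp": date(2026, 5, 6),
}

def gap_days(start, end):
    """Number of whole days between two named timeline events."""
    return (events[end] - events[start]).days

print("burst to first spray:", gap_days("gaurav_burst", "first_spray"), "days")
print("first spray to poc_exp:", gap_days("first_spray", "poc_exp"), "days")
print("quiet period:", gap_days("last_ubuntu_run", "ceph_keypoint"), "days")
```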

Active attacker presence (CVE-2023-1177)

24 total experiments on host: 6 legit + 18 attacker-injected. Attacker-injected experiments share a recognizable pattern:

{
  "name": "<random-16-char>",
  "artifact_location": "http:///?/../../../../../../../../../../../../../../etc/"
}
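
For triage, the pattern above can be detected mechanically. The sketch below is one possible approach, assuming experiment records have already been exported as dicts with an `artifact_location` field; `is_injected` and `traversal_target` are hypothetical helpers, and the regex is a heuristic rather than a complete signature for this CVE.

```python
import re

# Two or more consecutive "../" segments is treated as a traversal marker.
TRAVERSAL = re.compile(r"(\.\./){2,}")
# Captures whatever follows the first run of "../" segments.
TARGET = re.compile(r"(?:\.\./)+(.*)$")

def is_injected(experiment):
    """True when the artifact_location contains a path-traversal sequence."""
    return bool(TRAVERSAL.search(experiment.get("artifact_location", "")))

def traversal_target(experiment):
    """Return the absolute path the traversal points at, or None."""
    m = TARGET.search(experiment.get("artifact_location", ""))
    return "/" + m.group(1) if m else None

# Illustrative records: one matching the injected pattern, one ordinary location.
injected = {"name": "example", "artifact_location": "http:///?/../../../../etc/"}
ordinary = {"name": "real-exp", "artifact_location": "s3://aipod-crop/"}

print(is_injected(injected), traversal_target(injected))
print(is_injected(ordinary), traversal_target(ordinary))
```

Splitting experiments this way separates legitimate entries from injected ones without issuing any further requests to the host.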

Each path traversal targets either /etc/ or /root/.ssh/. The attacker has multiple campaign UUIDs:

| Campaign UUID prefix | Pattern | First seen | Most recent |
|---|---|---|---|
| 3BT8ncOzBWAH4GyIGz0EXsSwj7f | population-scale spray | 2026-03-26 00:11:10 | 2026-03-26 00:11:12 |
| 3CCGENufMtsxUjr3ij4gjsPM44m | /etc/ only | 2026-04-10 23:33:48 | 2026-04-10 |
| 3D9V4JvPnDuvfxpSHZBQo1TTM3x | /etc/ only | 2026-05-01 22:54:45 | 2026-05-01 |
| PJYMtlmXsSfyO0hk (16-char) | /etc/ | 2026-04-23 12:49:00 | 2026-04-23 |
| MXhmOLyZ7i2zgR5d (16-char) | /etc/ | 2026-04-20 11:11:25 | 2026-04-20 |
| 6tUWyqxY1Z3cuSvj (16-char) | /etc/ | 2026-04-20 11:11:13 | 2026-04-20 |
| aZGVwezuF60CHthW (16-char) | /root/.ssh/ | 2026-04-20 11:11:36 | 2026-04-20 |
| 9D6H17u0tiNmXdOp (16-char) | /root/.ssh/ | 2026-04-20 11:11:39 | 2026-04-20 |
| apwsM4eyDoVjWJxq (16-char) | /root/.ssh/ | 2026-04-20 11:11:28 | 2026-04-20 |
| RaYNG7f9MAsKW8ci (16-char) | /root/.ssh/ | 2026-04-20 11:11:33 | 2026-04-20 |
| 4lHeW9CUYxhVujFz (16-char) | /etc/ | 2026-04-20 11:11:19 | 2026-04-20 |
| A0lNs4QbTgIChecm (16-char) | /root/ | 2026-04-20 11:11:41 | 2026-04-20 |
| exp_103 (named) | /etc/ | 2026-05-05 08:37:42 | 2026-05-05 |
| poc_exp (named) | /etc/ | 2026-05-06 06:54:25 | 2026-05-06 |

The 2026-04-20 batch is striking: 8 experiments injected within 30 seconds, targeting /etc/, /root/.ssh/, and /root/. This is automated mass-spray behavior, not interactive testing.
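
The burst can be quantified directly from the first-seen timestamps in the campaign table. A minimal sketch follows; the `batch` mapping copies the eight 2026-04-20 rows, and the variable names are illustrative.

```python
from collections import Counter
from datetime import datetime

# The eight 2026-04-20 rows: experiment name -> (first seen, traversal target).
batch = {
    "MXhmOLyZ7i2zgR5d": ("2026-04-20 11:11:25", "/etc/"),
    "6tUWyqxY1Z3cuSvj": ("2026-04-20 11:11:13", "/etc/"),
    "aZGVwezuF60CHthW": ("2026-04-20 11:11:36", "/root/.ssh/"),
    "9D6H17u0tiNmXdOp": ("2026-04-20 11:11:39", "/root/.ssh/"),
    "apwsM4eyDoVjWJxq": ("2026-04-20 11:11:28", "/root/.ssh/"),
    "RaYNG7f9MAsKW8ci": ("2026-04-20 11:11:33", "/root/.ssh/"),
    "4lHeW9CUYxhVujFz": ("2026-04-20 11:11:19", "/etc/"),
    "A0lNs4QbTgIChecm": ("2026-04-20 11:11:41", "/root/"),
}

stamps = [datetime.strptime(ts, "%Y-%m-%d %H:%M:%S") for ts, _ in batch.values()]
span_seconds = int((max(stamps) - min(stamps)).total_seconds())
targets = Counter(target for _, target in batch.values())

print(f"{len(batch)} injections in {span_seconds}s")
print(dict(targets))
```

Roughly one injection every four seconds is far faster than a person working through a UI, which supports the automated-spray reading.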

Did they exfil anything?

The attacker-injected runs are stuck in RUNNING status with empty user_id. The CVE-2023-1177 exfil flow is:

1. POST /api/2.0/mlflow/experiments/create
   {"artifact_location": "http:///#/../../../../../etc/passwd"}
2. POST /api/2.0/mlflow/runs/create - get run_id
3. GET /get-artifact?path=passwd&run_uuid=<id>  ← read the file content

The injection (steps 1-2) is what’s visible to us; step 3 is what would actually exfil files. NuClide cannot determine from passive observation whether the attacker has executed step 3 successfully, because MLflow does not log the artifact response payload for a run. However, the persistence and scale of the spray (40+ days, 18 experiments, 6 distinct campaign IDs on this single host) suggest the actor is at least attempting exfil, not just surveying.

Recommended verification: the operator should grep their MLflow access logs (mlflow_default.log or systemd journal of the gunicorn service) for GET /get-artifact?path= requests carrying attacker run_ids. Matching requests would confirm exfil execution.
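
The log check could be scripted along these lines. This is a sketch under stated assumptions: it expects access-log lines in the common `"GET <path> HTTP/x.y" <status>` layout (gunicorn's default access format follows this shape, but the operator's configuration may differ), `find_artifact_reads` is a hypothetical helper, and the sample line and run ID are invented for illustration.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Request line and status code of a get-artifact call in a common-format access log.
REQUEST = re.compile(r'"GET (\S*get-artifact\?\S+) HTTP/[\d.]+" (\d{3})')

def find_artifact_reads(lines, suspect_run_ids):
    """Return get-artifact requests whose run ID is in the suspect set."""
    hits = []
    for line in lines:
        m = REQUEST.search(line)
        if not m:
            continue
        query = parse_qs(urlsplit(m.group(1)).query)
        # Accept either parameter name for the run identifier.
        run_id = (query.get("run_uuid") or query.get("run_id") or [""])[0]
        if run_id in suspect_run_ids:
            hits.append({
                "run_id": run_id,
                "path": query.get("path", [""])[0],
                "status": int(m.group(2)),
            })
    return hits

# Invented sample line; 203.0.113.7 is a documentation-range address.
sample_log = [
    '203.0.113.7 - - [20/Apr/2026:11:12:00 +0000] '
    '"GET /get-artifact?path=passwd&run_uuid=abc123 HTTP/1.1" 200 1843',
    '203.0.113.7 - - [20/Apr/2026:11:12:05 +0000] "GET /health HTTP/1.1" 200 2',
]
hits = find_artifact_reads(sample_log, {"abc123"})
print(hits)
```

Hits with a 200 status are the ones that matter most, since they indicate the server returned file content for an attacker-created run.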

Disclosure routing

Provider: abuse@digitalocean.com (rank-1 from nuclide-contact WHOIS resolution).

Operator-direct: AIPOD has no public-facing domain reachable from the data we collected. No CT-log subdomains, no rDNS, no website at aipod.com / .io / .ai / .app (those are unrelated). The S3 bucket aipod-crop is the only operator-attributable artifact, and AWS doesn’t surface bucket-owner contact publicly. Provider-channel-only disclosure recommended; DigitalOcean’s customer-notification process will reach the operator through their billing identity.

Disclosure draft: disclosures/DIGITALOCEAN-138-197-152-103-aipod-mlflow.md

9-step chain provenance

Step 0  jaxen import --no-lookup --source ledger-revisit-2026-05-06   → empire.db
Step 1a visorplus assess (138.197.152.103)  → DigitalOcean WHOIS, nmap top-1000 (3 ports), SSH host keys (RSA+ECDSA+Ed25519), GreyNoise: benign/RIOT
Step 1b aimap -list                          → MLflow 2.2.1 confirmed; **Label Studio mis-fingerprinted as Langfuse** (FP bug - see followup work)
Step 1c jaxen pivot http://138.197.152.103:8080/  → favicon hash `-1649949475` for cross-fleet pivot
Step 2  visorgraph -ip                        → no TLS, no cert pivots (bare-IP hosting)
Step 3  aimap-profile --target --mode full   → no CT subdomains, no security.txt, no public DNS
Step 4  JS-bundle extraction (Label Studio)  → /api/version disclosed v1.5.0.post0 build hash
Step 5  nuclide-contact                      → abuse@digitalocean.com (operator opaque)
Step 6  visorlog ingest                      → ledger entry (existing event ID #220 from milvus survey was on different IP; this is new)
Step 7  visorscuba assess                    → 743 nodes; AI.C1 critical violation
Step 8  bare                                  → CVE-2023-1177 commodity-CVE chain confirmed (top score ~0.5+)
Step 9  visorcorpus build (-profile strict -type baseline -include kb_exfiltration,system_prompt,config_secrets) → 46-case corpus

Severity rationale

HIGH, not CRITICAL. Reasoning:

  • AIPOD is at R&D stage (best avg_combined_f1: 0.4899, model is not production-deployed; metrics suggest active iteration)
  • No customer-facing surface identified (no CT logs, no DNS, opaque operator)
  • Patient-PHI scale unconfirmed (pan_set_1/2/3 contain X-ray training images but counts/PII shape not enumerated; MLflow doesn’t log full filenames in run params)
  • Active CVE exploitation IS confirmed but exfil success is unproven from external observation
  • 3+ year persistent exposure increases blast radius

If the operator’s S3 access keys leak via /etc/aws/credentials traversal, severity escalates to CRITICAL: the bucket has 4 years of model artifacts including the patient X-ray training data.

References

  • Original Triton + MLflow survey context, mlflow-cloud-survey-2026-05.md
  • Population-scale CVE-2023-1177 attacker UUID 3BT8ncOzBWAH4GyIGz0EXsSwj7f, first documented in SYNTHESIS-2026-05.md “Class E, Active CVE exploitation”
  • Sister actively-exploited host (159.203.110.202), same attacker UUID, financial workload (helios_stock_direction); deferred to a separate disclosure
  • aimap Langfuse fingerprint FP, ~/ai-recon/aimap/fingerprints.go:294 matches Label Studio’s {"status":"UP"} response (Methodology Insight #10 territory)
  • JAXEN favicon-hash pivot for the Label Studio v1.5 fleet, http.favicon.hash:-1649949475