AIPOD orthodontic AI MLflow + Label Studio + S3 stack, CVE-2023-1177 actively-exploited (138.197.152.103), NuClide engagement record

NuClide Research · 2026-05-06

Summary

DigitalOcean droplet 138.197.152.103 runs an end-to-end orthodontic-AI R&D stack whose MLflow tracking server has been operational and unauthenticated since March 2023. The stack has three components, two of them services on the host itself:

Port	Service	Auth	Vulnerability
5000	MLflow 2.2.1	NONE	CVE-2023-1177 path-traversal RCE, actively exploited since 2026-03-26 with 18 attacker-injected experiments
8080	Label Studio 1.5.0.post0 (Jul 2022 release, nearly 4 years stale)	Token required	CVE-2024-23633 LFR + CVE-2024-24566 SSRF apply to this version
`s3://aipod-crop/`	S3 artifact bucket (us-east-2)	Private (403 on public probes)	Bucket exists, no public objects

The full chain ran via bash data/visor-chain-runner.sh mlflow-cve, followed by enumeration through the MLflow REST API directly (Methodology Insight #6, conjunctive matchers).

DCWF KSAT coverage

Auto-derived from DCWF AI work-role rule files (ksat-tag).

672 (AI Test & Evaluation Specialist): K7003, K7004, K7044, S7068, S7070, S7075, T5904, T5919
733 (AI Risk & Ethics Specialist): K7040, S7067, T5854, T5868, T5893
overlap (Common AI KSATs (all 5 roles)): K1158, K1159, K22, K6311, K6900, K6935, K7003, K7024

What AIPOD does (operator intellectual property exposed via MLflow metadata)

The operator is a dental-AI / orthodontic-AI startup developing an end-to-end pipeline of medical-imaging models:

Year	Experiment	Runs	Model task
2023	`/demo-experiment` (id 155786267…)	15	Initial validation experiment
2023	`initial-model` (id 701546096…)	3	First production attempt; train_data_version `train_version_1.csv`
2023–2024	`real-exp` (id 956907690…)	53	Multi-class classifier with MSE loss + cross-entropy; cosine warmup; ReduceLROnPlateau scheduler; 256x256 image input
2024	`pan-segmentation` (id 148418839…)	13	Panoramic dental X-ray segmentation; training datasets `pan_set_1, pan_set_2, pan_set_3`; cosine warmup; IoU 0.66 / val loss 0.61 best
2025	`ceph-keypoint` (id 804275185…)	14	Cephalometric (lateral skull X-ray) keypoint detection; 256x256 input; 4,428 train / 3,542 val examples; final RMSE 0.0109
2026	`orthodontic-upper-multitask` (id 583324192…)	19	Multi-task upper-jaw classifier (arch + alignment); developer `gaurav`; fold-cross-validated; best avg_combined_f1: 0.4899 (R&D-stage, not production)

The pipeline progression is consistent with a methodical orthodontic-AI roadmap: foundational classifier → panoramic segmentation → cephalometric keypoint detection → multi-task arch+alignment fusion. The MLflow metadata exposes this R&D history across four calendar years (2023 to 2026).

Developer roster (extracted from `mlflow.user` and `mlflow.source.name`)

gaurav: Mac developer (path /Users/gaurav/Documents/usa_work/ULClassification/); active only on the 2026 multi-task work. The usa_work directory name suggests an offshore developer working for a US client, though that is an inference from the path alone.
ubuntu: the production droplet’s default user; ran the 2023–2024 training waves from src/models/train.py and src/model/train.py.

Git commits leaked through MLflow tags

Commit	Period	Source path
`34fb854192012a8da1c409abbeb13939112df9fc`	2023-03 to 2023-04	src/models/train.py
`f32e5d52f16c83f01bac8b654da1e8bd8f4754b4`	2023-06 to 2024-05 (~12 months main branch)	src/models/train.py
`daa9915c…`	2024-04 (refactor)	src/model/train.py (singular `model`)
`dfe5665b8a0af217dc632d313245d0640e08b18d`	ceph-keypoint context	src/model/train.py
`0024a538f1c70c660ac9391048fc5d1e603fe89a`	pan-segmentation context	src/model/train.py

GitHub commit-search for these SHAs returns 0 hits → the operator’s repos are private. The commit hashes are nonetheless useful as forensic fingerprints if the operator’s GitHub Enterprise / GitLab self-host ever surfaces.

Activity timeline

2023-03-10 ────●  /demo-experiment
2023-03-17 ────●  initial-model           ← year-1: foundational
2023-03-20 ────●  real-exp (53 runs over 2023-2024)
2024-03-26 ────●  pan-segmentation        ← year-2: dental X-ray segmentation
2024-05-04 ────●  last `ubuntu` legit run

       (~11 months of silence on the MLflow surface; production possibly migrated
        elsewhere, with this droplet kept as a stale dev / artifact repository)

2025-04-13 ────●  ceph-keypoint (14 runs)  ← year-3: keypoint detection
2026-03-23 ────●  orthodontic-upper-multitask (19 runs in one day, gaurav)
                                          ← year-4: multi-task fusion
2026-03-26 ────●  CVE-2023-1177 spray actor finds the host (3 days after gaurav's burst)
2026-04-10 to 2026-04-23 ────● wave of /etc/ traversals
2026-04-20 ────● 4x /root/.ssh/ + 1x /root/ traversals (SSH key hunt)
2026-05-01 ────● new attacker campaign IDs
2026-05-05 ────● `exp_103` injection
2026-05-06 06:54 UTC ── ●  `poc_exp` injection (16h before NuClide re-probe)

The CVE-2023-1177 spray actor landed 3 days after the operator’s most recent visible legitimate activity. Possibilities: (a) coincidental population-scale spray, (b) a Shodan harvest picked up the renewed activity, (c) someone signaled the host. The 3BT8ncOzBWAH4GyIGz0EXsSwj7f UUID also appears on other tier-2 MLflow hosts, which marks a population-scale actor: the synthesis paper documents this UUID on both 138.197.152.103 and 159.203.110.202.
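The intervals quoted around the timeline can be rechecked with plain date arithmetic. The sketch below is only a sanity check on the dates listed above; the dictionary keys are illustrative labels, not identifiers from the host.

```python
from datetime import date

# Dates copied from the activity timeline; the key names are illustrative.
TIMELINE = {
    "last_ubuntu_run": "2024-05-04",
    "ceph_keypoint_start": "2025-04-13",
    "gaurav_burst": "2026-03-23",
    "first_spray": "2026-03-26",
    "latest_injection": "2026-05-06",
}


def days_between(start_key, end_key, timeline=TIMELINE):
    """Return the number of whole days between two timeline entries."""
    start = date.fromisoformat(timeline[start_key])
    end = date.fromisoformat(timeline[end_key])
    return (end - start).days


quiet_gap = days_between("last_ubuntu_run", "ceph_keypoint_start")
burst_to_spray = days_between("gaurav_burst", "first_spray")
spray_duration = days_between("first_spray", "latest_injection")
print(quiet_gap, burst_to_spray, spray_duration)  # 344 3 41
```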

Active attacker presence (CVE-2023-1177)

24 total experiments on host: 6 legit + 18 attacker-injected. Attacker-injected experiments share a recognizable pattern:

{
  "name": "<random-16-char>",
  "artifact_location": "http:///?/../../../../../../../../../../../../../../etc/"
}
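An operator auditing their own tracking server could screen for this pattern offline. The following is a minimal sketch, assuming the experiment list has already been exported as dictionaries with `name` and `artifact_location` keys (the same fields shown above); the sample records and the helper names are hypothetical, and the check only looks for `..` path segments rather than any specific campaign.

```python
import re

# Split on the separators a traversal payload can hide behind:
# forward slash, backslash, query marker, fragment marker.
_SEGMENT_SPLIT = re.compile(r"[/\\?#]")


def is_traversal_location(artifact_location):
    """Return True if any segment of the location is a '..' parent reference."""
    return ".." in _SEGMENT_SPLIT.split(artifact_location)


def flag_injected(experiments):
    """Return the names of experiments whose artifact_location looks like a traversal."""
    return [
        exp["name"]
        for exp in experiments
        if is_traversal_location(exp.get("artifact_location", ""))
    ]


# Hypothetical sample records: one following the injected pattern, one benign.
sample = [
    {"name": "example-injected", "artifact_location": "http:///?/../../../../etc/"},
    {"name": "example-legit", "artifact_location": "s3://example-bucket/1"},
]
flagged = flag_injected(sample)
print(flagged)  # ['example-injected']
```

A segment-level check is used instead of a plain substring search so that a legitimate name containing two dots (for example `v1..final`) is not flagged.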

Each path traversal targets /etc/, /root/.ssh/, or /root/. The injected experiments carry several distinct campaign identifiers:

Campaign UUID prefix	Pattern	First seen	Most recent
`3BT8ncOzBWAH4GyIGz0EXsSwj7f`	population-scale spray	2026-03-26 00:11:10	2026-03-26 00:11:12
`3CCGENufMtsxUjr3ij4gjsPM44m`	`/etc/` only	2026-04-10 23:33:48	2026-04-10
`3D9V4JvPnDuvfxpSHZBQo1TTM3x`	`/etc/` only	2026-05-01 22:54:45	2026-05-01
`PJYMtlmXsSfyO0hk` (16-char)	`/etc/`	2026-04-23 12:49:00	2026-04-23
`MXhmOLyZ7i2zgR5d` (16-char)	`/etc/`	2026-04-20 11:11:25	2026-04-20
`6tUWyqxY1Z3cuSvj` (16-char)	`/etc/`	2026-04-20 11:11:13	2026-04-20
`aZGVwezuF60CHthW` (16-char)	`/root/.ssh/`	2026-04-20 11:11:36	2026-04-20
`9D6H17u0tiNmXdOp` (16-char)	`/root/.ssh/`	2026-04-20 11:11:39	2026-04-20
`apwsM4eyDoVjWJxq` (16-char)	`/root/.ssh/`	2026-04-20 11:11:28	2026-04-20
`RaYNG7f9MAsKW8ci` (16-char)	`/root/.ssh/`	2026-04-20 11:11:33	2026-04-20
`4lHeW9CUYxhVujFz` (16-char)	`/etc/`	2026-04-20 11:11:19	2026-04-20
`A0lNs4QbTgIChecm` (16-char)	`/root/`	2026-04-20 11:11:41	2026-04-20
`exp_103` (named)	`/etc/`	2026-05-05 08:37:42	2026-05-05
`poc_exp` (named)	`/etc/`	2026-05-06 06:54:25	2026-05-06

The 2026-04-20 batch is striking: 8 experiments injected within 30 seconds, targeting /etc/, /root/.ssh/, and /root/. This is automated mass-spray behavior, not interactive testing.
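The size and duration of that batch can be derived directly from the first-seen timestamps in the table. This is a small sketch over the eight 2026-04-20 rows only; the function name is illustrative.

```python
from datetime import datetime

# First-seen timestamps of the 2026-04-20 rows, copied from the campaign table.
BATCH_FIRST_SEEN = {
    "MXhmOLyZ7i2zgR5d": "2026-04-20 11:11:25",
    "6tUWyqxY1Z3cuSvj": "2026-04-20 11:11:13",
    "aZGVwezuF60CHthW": "2026-04-20 11:11:36",
    "9D6H17u0tiNmXdOp": "2026-04-20 11:11:39",
    "apwsM4eyDoVjWJxq": "2026-04-20 11:11:28",
    "RaYNG7f9MAsKW8ci": "2026-04-20 11:11:33",
    "4lHeW9CUYxhVujFz": "2026-04-20 11:11:19",
    "A0lNs4QbTgIChecm": "2026-04-20 11:11:41",
}


def burst_window(first_seen):
    """Return (experiment count, seconds between first and last injection)."""
    stamps = sorted(
        datetime.strptime(ts, "%Y-%m-%d %H:%M:%S") for ts in first_seen.values()
    )
    return len(stamps), (stamps[-1] - stamps[0]).total_seconds()


count, seconds = burst_window(BATCH_FIRST_SEEN)
print(count, seconds)  # 8 28.0
```

Eight creations spread over 28 seconds, with no gap longer than 6 seconds between consecutive entries, is hard to reconcile with a human working the API by hand.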

Did they exfil anything?

The attacker-injected runs are stuck in RUNNING status with empty user_id. The CVE-2023-1177 exfil flow is:

1. POST /api/2.0/mlflow/experiments/create
   {"artifact_location": "http:///#/../../../../../etc/passwd"}
2. POST /api/2.0/mlflow/runs/create - get run_id
3. GET /get-artifact?path=passwd&run_uuid=<id>  ← read the file content

Only the injection (steps 1-2) is visible to us; step 3 is what would actually exfiltrate files. NuClide cannot determine from passive observation whether the attacker executed step 3 successfully, because MLflow does not log the artifact response payload for a run. However, the persistence and scale of the spray (40+ days, 18 experiments, 6 distinct campaign IDs on this single host) suggest the actor is at least attempting exfil, not just surveying.

Recommended verification: the operator should grep their MLflow access logs (mlflow_default.log or the systemd journal of the gunicorn service) for GET /get-artifact?path= requests carrying attacker run_ids. Such requests would confirm that step 3 was executed.
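The log check could be scripted along these lines. This is a sketch under stated assumptions: the access log is taken to be in the common/combined format (request line in double quotes, followed by the status code), the `run_id` parameter is checked alongside the `run_uuid` shown in the flow above, and the sample log lines, client address, and run ID are all invented for illustration.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Assumes common/combined access-log format: "GET <target> HTTP/x.y" <status>
REQUEST_RE = re.compile(r'"GET (?P<target>\S+) HTTP/[\d.]+" (?P<status>\d{3})')


def artifact_reads(log_lines, suspect_run_ids):
    """Return (status, requested path, run id) for get-artifact hits on suspect runs."""
    hits = []
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match is None:
            continue
        url = urlsplit(match["target"])
        if not url.path.endswith("/get-artifact"):
            continue
        query = parse_qs(url.query)
        run_ids = query.get("run_uuid", []) + query.get("run_id", [])
        matched = [rid for rid in run_ids if rid in suspect_run_ids]
        if matched:
            hits.append((int(match["status"]), query.get("path", [""])[0], matched[0]))
    return hits


# Hypothetical log lines; 203.0.113.7 is a documentation-range address.
sample_log = [
    '203.0.113.7 - - [20/Apr/2026:11:12:02 +0000] "GET /get-artifact?path=passwd&run_uuid=0123abcd HTTP/1.1" 200 1834',
    '203.0.113.7 - - [20/Apr/2026:11:12:05 +0000] "GET /api/2.0/mlflow/experiments/search HTTP/1.1" 200 512',
]
hits = artifact_reads(sample_log, {"0123abcd"})
print(hits)  # [(200, 'passwd', '0123abcd')]
```

The set of suspect run IDs would come from the runs attached to the injected experiments. A hit with a 2xx status is the stronger signal; a 4xx or 5xx hit shows an attempt that the server rejected.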

Disclosure routing

Provider: abuse@digitalocean.com (rank-1 from nuclide-contact WHOIS resolution).

Operator-direct: AIPOD has no public-facing domain reachable from the data we collected. No CT-log subdomains, no rDNS, no website at aipod.com / .io / .ai / .app (those are unrelated). The S3 bucket aipod-crop is the only operator-attributable artifact, and AWS doesn’t surface bucket-owner contact publicly. Provider-channel-only disclosure recommended; DigitalOcean’s customer-notification process will reach the operator through their billing identity.

Disclosure draft: disclosures/DIGITALOCEAN-138-197-152-103-aipod-mlflow.md

9-step chain provenance

Step 0  jaxen import --no-lookup --source ledger-revisit-2026-05-06   → empire.db
Step 1a visorplus assess (138.197.152.103)  → DigitalOcean WHOIS, nmap top-1000 (3 ports), SSH host keys (RSA+ECDSA+Ed25519), GreyNoise: benign/RIOT
Step 1b aimap -list                          → MLflow 2.2.1 confirmed; **Label Studio mis-fingerprinted as Langfuse** (FP bug - see followup work)
Step 1c jaxen pivot http://138.197.152.103:8080/  → favicon hash `-1649949475` for cross-fleet pivot
Step 2  visorgraph -ip                        → no TLS, no cert pivots (bare-IP hosting)
Step 3  aimap-profile --target --mode full   → no CT subdomains, no security.txt, no public DNS
Step 4  JS-bundle extraction (Label Studio)  → /api/version disclosed v1.5.0.post0 build hash
Step 5  nuclide-contact                      → abuse@digitalocean.com (operator opaque)
Step 6  visorlog ingest                      → ledger entry (existing event ID #220 from milvus survey was on different IP; this is new)
Step 7  visorscuba assess                    → 743 nodes; AI.C1 critical violation
Step 8  bare                                  → CVE-2023-1177 commodity-CVE chain confirmed (top score ~0.5+)
Step 9  visorcorpus build (-profile strict -type baseline -include kb_exfiltration,system_prompt,config_secrets) → 46-case corpus
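The Step 1b mis-fingerprint is the kind of false positive that conjunctive matching is meant to prevent: a generic health response such as {"status":"UP"} is not distinctive on its own. The sketch below illustrates the idea only; it is not aimap's implementation, and the rule structure, probe paths, product names, and marker strings are all hypothetical.

```python
from dataclasses import dataclass
from typing import Mapping, Tuple

# Hypothetical observation shape: probe path -> response body.
Observation = Mapping[str, str]


@dataclass(frozen=True)
class Rule:
    product: str
    # (probe path, required substring) pairs; every pair must match.
    required: Tuple[Tuple[str, str], ...]

    def matches(self, obs: Observation) -> bool:
        return all(needle in obs.get(path, "") for path, needle in self.required)


def identify(obs, rules):
    """Return the products whose rules are fully satisfied by the observation."""
    return [rule.product for rule in rules if rule.matches(obs)]


# Hypothetical probe results from a single host.
obs = {
    "/health": '{"status":"UP"}',
    "/api/version": '{"release": "1.5.0.post0", "marker": "product-b"}',
}

HEALTH = ("/health", '"status":"UP"')
weak_rules = [Rule("product-a", (HEALTH,)), Rule("product-b", (HEALTH,))]
strict_rules = [
    Rule("product-a", (HEALTH, ("/api/version", "product-a"))),
    Rule("product-b", (HEALTH, ("/api/version", "product-b"))),
]

print(identify(obs, weak_rules))    # ['product-a', 'product-b']
print(identify(obs, strict_rules))  # ['product-b']
```

With single-signal rules both products match the same host; requiring a second, product-specific signal removes the ambiguity.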

Severity rationale

HIGH, not CRITICAL. Reasoning:

AIPOD is at R&D stage (best avg_combined_f1: 0.4899, model is not production-deployed; metrics suggest active iteration)
No customer-facing surface identified (no CT logs, no DNS, opaque operator)
Patient-PHI scale unconfirmed (pan_set_1/2/3 contain X-ray training images but counts/PII shape not enumerated; MLflow doesn’t log full filenames in run params)
Active CVE exploitation IS confirmed but exfil success is unproven from external observation
3+ year persistent exposure increases blast radius

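If the operator’s S3 access keys leak through a traversal to a credentials file such as /etc/aws/credentials or the AWS default ~/.aws/credentials, severity escalates to CRITICAL: the bucket would hold up to 4 years of model artifacts, potentially including the patient X-ray training data.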

References

Original Triton + MLflow survey context: mlflow-cloud-survey-2026-05.md
Population-scale CVE-2023-1177 attacker UUID 3BT8ncOzBWAH4GyIGz0EXsSwj7f: first documented in SYNTHESIS-2026-05.md, “Class E, Active CVE exploitation”
Sister actively-exploited host (159.203.110.202): same attacker UUID, financial workload (helios_stock_direction); deferred to a separate disclosure
aimap Langfuse fingerprint FP: ~/ai-recon/aimap/fingerprints.go:294 matches Label Studio’s {"status":"UP"} response (Methodology Insight #10 territory)
JAXEN favicon-hash pivot for the Label Studio v1.5 fleet: http.favicon.hash:-1649949475