MLflow Tracking Server on Public Cloud: Auth Posture Survey
NuClide Research · 2026-05-03
Summary
Mass-scan of port 5000 across 28 cloud-provider /16 ranges (DO/Hetzner/Vultr) returned 12,106 hits → fingerprinted via /version + /api/2.0/mlflow/experiments/search body match → 11 confirmed MLflow Tracking Server instances, all unauthenticated.
DCWF KSAT coverage
Auto-derived from DCWF AI work-role rule files (ksat-tag).
- 672 (AI Test & Evaluation Specialist): K7003, K7004, K7044, S7068, S7070, S7075, T5904
- 733 (AI Risk & Ethics Specialist): K7040, S7067, T5854, T5868, T5893, T5904
- overlap (Common AI KSATs (all 5 roles)): K108, K1157, K1158, K1159, K22, K6311, K6935, K7003, K7024, K942, S7065
Small absolute count vs. the noise on port 5000 (Flask’s default), but every single confirmed instance is unauth and exposes the operator’s ML experiment metadata, model registry, and artifact-storage URIs. Two of the eleven are already being actively exploited via CVE-2023-1177 (path traversal) by external attackers, observable as attacker-injected experiments with artifact_location values like http:///?/../../../../../../../etc/ and /root/.ssh/. Same attacker IDs span multiple hosts, indicating a coordinated CVE-2023-1177 sweep at population scale.
Behind the attacker noise, the legitimate operators run substantial production ML workloads in finance (algorithmic trading SPX hedging), medicine (pediatric vital-sign classification), dental imaging, livestock/horse-racing breeding models, manufacturing process homogeneity, AI safety research, and chatbot services.
Methodology
masscan -iL <28 cloud /16 CIDRs> -p 5000 --rate 10000
→ 12,106 port-5000 hits
mlflow-probe.py (200-thread fingerprint)
GET /version → MLflow version (raw text)
GET /api/2.0/mlflow/experiments/search → confirms MLflow REST API
GET /api/2.0/mlflow/registered-models/list → model registry inventory
→ 11 confirmed MLflow Tracking Servers
CVE correlation:
- < 2.2.1 vulnerable to CVE-2023-1177 (path traversal in artifact endpoints)
- < 2.8.1 vulnerable to CVE-2023-6014 (auth bypass on multi-user)
- < 2.10.0 vulnerable to CVE-2024-37052 through 60 (recipe deserialization → RCE)
- < 2.12.0 vulnerable to CVE-2024-37060 (pickle deserialization)
Findings Summary
| Metric | Value |
|---|---|
| Cloud /16 ranges scanned | 28 |
| Masscan hits on :5000 | 12,106 |
| MLflow Tracking confirmed | 11 |
| Unauthenticated | 11 (100%) |
| Already-compromised (CVE-2023-1177 attacker artifacts visible) | 2 |
| CVE-vulnerable by version | 2 (v2.2.1 + v2.9.2) |
| Versions sampled | 2.2.1, 2.9.2, 2.17.1, 2.18.0, 2.20.2, 2.21.3 (×2), 2.22.1, 2.22.4, 3.4.0 (×2) |
Class A: Already-Compromised (CVE-2023-1177 exploitation in progress)
138.197.152.103 (DigitalOcean): MLflow 2.2.1 (vulnerable to CVE-2023-1177, CVE-2023-6014, CVE-2024-37052/…/60)
Attacker-injected experiments (path-traversal CVE-2023-1177 PoCs):
3D9V4JvPnDuvfxpSHZBQo1TTM3x → http:///?/../../../../../../../../../../../../../../etc/
PJYMtlmXsSfyO0hk → http:///#/../../../../../../../../../../../../../../etc/
MXhmOLyZ7i2zgR5d → http:///#/../../../../../../../../../../../../../../etc/
6tUWyqxY1Z3cuSvj → http:///#/../../../../../../../../../../../../../../etc/
3CCGENufMtsxUjr3ij4gjsPM44m → http:///?/../../../../../../../../../../../../../../etc/
3BT8ncOzBWAH4GyIGz0EXsSwj7f → http:///#/../../../../../../../../../../../../../../etc/
3BT8OPIHCXoEhTZvbIPhCefQe7O → http:///#/../../../../../../../../../../../../../../etc/
HfbDEvCSaL9t2Bkn → http:///#/../../../../../../../../../../../../../../etc/
xk8wSBjZY7MJGU6r → http:///#/../../../../../../../../../../../../../../etc/
Operator’s actual experiment (the legitimate workload buried in attacker noise):
orthodontic-upper-multitask → s3://aipod-crop/artifacts/583324192295777494
The operator runs dental AI models for orthodontic upper-jaw classification (multitask learning) with model artifacts stored in an S3 bucket called aipod-crop. The bucket name is now disclosed; whether the MLflow stores its S3 credentials in a way that’s also readable via the path-traversal vulnerability would require further (destructive) probing.
159.203.110.202 (DigitalOcean): MLflow 2.9.2 (vulnerable to CVE-2024-37052/…/60)
Attacker-injected experiments, same actor as 138.197.152.103 (matching IDs 3BT8ncOzBWAH4GyIGz0EXsSwj7f and 3BT8OPIHCXoEhTZvbIPhCefQe7O), worse target list, /root/.ssh/:
A0lNs4QbTgIChecm → http:///#/../../../../../../../../../../../../../../root/
9D6H17u0tiNmXdOp → http:///#/../../../../../../../../../../../../../../root/.ssh/
aZGVwezuF60CHthW → http:///#/../../../../../../../../../../../../../../root/.ssh/
RaYNG7f9MAsKW8ci → http:///#/../../../../../../../../../../../../../../root/.ssh/
apwsM4eyDoVjWJxq → http:///#/../../../../../../../../../../../../../../root/.ssh/
4lHeW9CUYxhVujFz → http:///#/../../../../../../../../../../../../../../etc/
exploit_33295 → /mlflow/artifacts/13 (named "exploit" - older opportunistic actor)
Operator’s actual experiment:
helios_stock_direction → /mlflow/artifacts/1
The operator runs a finance / algorithmic-trading model (“Helios” stock-direction predictor) on this MLflow. Same actor that’s been targeting 138.197.152.103 has also been spraying CVE-2023-1177 PoCs against this instance, specifically aimed at SSH key extraction.
Cross-host attacker correlation
The duplicated experiment IDs (3BT8ncOzBWAH4GyIGz0EXsSwj7f, 3BT8OPIHCXoEhTZvbIPhCefQe7O) appearing on BOTH compromised instances mean the same attacker IS spraying the same payload UUIDs across all reachable vulnerable MLflow servers. The attacker has been doing this for some time and at scale; both hosts are accumulating attacker artifacts.
The path-traversal artifact_location idiom (http:///?/../../../../) is the canonical PoC for CVE-2023-1177, when MLflow constructs the file path for artifact retrieval, the .. segments escape the artifact root and read arbitrary filesystem paths. With etc/ and root/.ssh/ as targets, the attacker is harvesting SSH keys, /etc/passwd, and similar generic-secrets fare.
Class B: Production ML Workloads (Operator-Attributable by Experiment Names)
157.90.104.16 (Hetzner): MLflow 2.17.1: algorithmic trading
hedge /root/ml/experimentation/mlflow_artifacts/9
hedge_extra_features /root/ml/experimentation/mlflow_artifacts/15
hedge_update_TFS /root/ml/experimentation/mlflow_artifacts/17
hedge_markov_crash /root/ml/experimentation/mlflow_artifacts/11
hedge_with_markov_crash_feats /root/ml/experimentation/mlflow_artifacts/10
spx_test mlflow-artifacts:/8
pomorski_labels_spx_bayesian_optimisation mlflow-artifacts:/7
spx_rf_manual_labels mlflow-artifacts:/6
delete_me /root/ml/experimentation/mlflow_artifacts/14
rce_test_1772834193 /root/ml/experimentation/mlflow_artifacts/22 ← attacker probe
Operator profile: quantitative trading firm running SPX (S&P 500) hedging strategies. The “Pomorski labels” reference points to a known finance-research approach (Pomorski meta-labelling for trading). Markov-crash + bayesian-optimization signals serious production quant work. Local-disk artifact storage.
The rce_test_1772834193 experiment is unix-timestamp-suffixed and named “rce_test”, it’s an attacker probing for the CVE-2024-37052 recipe deserialization RCE path. This instance is also being actively scanned, but on this version (2.17.1, post-fix) the exploit doesn’t land.
65.109.36.121 (Hetzner): MLflow 2.22.1: pediatric medical ML
beh_ped_lyi_sta_ml_etgn_cal_cri_100_min_v4_check_final2
beh_ped_lyi_sta_ml_etgn_cal_cri_100_min_v4_check_final1
beh_ped_lyi_sta_ml_etgn_cal_cri_100_min_v4_check_final
beh_ped_lyi_sta_ml_etgn_cal_cri_100_min_v4_check6
beh_ped_lyi_sta_ml_etgn_cal_cri_100_min_v4_check5
beh_ped_lyi_sta_ml_etgn_cal_cri_100_min_v4_check4
beh_ped_lyi_sta_ml_etgn_cal_cri_100_min_v4_check3
beh_ped_lyi_sta_ml_etgn_cal_cri_100_min_v4_check2
beh_ped_lyi_sta_ml_etgn_cal_cri_100_min_v4_check1
fts_large_v1_cal_all_cri_100_day_sic_vs_hlt_f1_xgboost_updated1
Decoded naming: beh (behavioral?) ped (pediatric) lyi (lying) sta (standing) ml (ml) etgn (?) cal (calibration) cri (criterion) 100_min (100-minute window), the model classifies pediatric patient state from sensor data with calibration thresholds across time windows. The fts_large_v1_cal_all_cri_100_day_sic_vs_hlt_f1_xgboost experiment name explicitly carries sic_vs_hlt (sick vs healthy) + f1_xgboost (F1-scored XGBoost classifier).
Operator profile: medical-ML team training pediatric vital-sign / posture classifiers (sick-vs-healthy outcome prediction). The _check_final2 + 9 prior _checkN experiments imply iterative model tuning, production ML work with regulatory implications (HIPAA-class data class, FDA AI/ML pre-market submissions).
188.166.132.129 + 188.166.38.104 (DigitalOcean): MLflow 3.4.0: livestock/horse-racing breeding ML (multi-host operator)
Both hosts have identical experiment lists pointing to the same s3://flow-bucket/*:
DKA_LOSS_MODEL s3://flow-bucket/30
GC_LOSS_EXPERIMENT_ONE_MODEL s3://flow-bucket/29
TAH_LOSS_EXPERIMENT s3://flow-bucket/28
TAH_WEIGHT_EXPERIMENTS s3://flow-bucket/21
GC_BREEDERS_EXPERIMENTS s3://flow-bucket/20
GC_BREEDER_EXPERIMENT s3://flow-bucket/19
GC_BREEDER_PRODUCTION_TRAINING s3://flow-bucket/18
GC_BREEDER_ADVANCED_TRAINING s3://flow-bucket/17
GC_BREEDER_ROBUST_TRAINING s3://flow-bucket/16
GC_BREEDER_V2_MEMORY_OPTIMIZED s3://flow-bucket/15
Operator profile: the BREEDER + WEIGHT + LOSS naming pattern with track-code-style prefixes (DKA, TAH, GC) is consistent with horse-racing AI, predicting race outcomes from horse weight, breeder lineage, track conditions. “PRODUCTION_TRAINING” + “ADVANCED_TRAINING” + “V2_MEMORY_OPTIMIZED” + “ROBUST_TRAINING” indicates serious MLOps for a commercial gambling/sports-analytics product. Multi-host deployment (two MLflow servers serving the same S3 bucket) suggests staging + production, or a load-balanced setup.
Alternative interpretation: agricultural livestock breeding (cattle/poultry). Either way, a commercial breeding-prediction product with substantial ML investment exposed.
159.69.35.23 (Hetzner): MLflow 2.18.0: manufacturing process
HOMOGENEITY_OPW_03 through _11 (9 variants, all s3://mlflow/24-32)
shadow_experiment s3://mlflow/35
Operator profile: “HOMOGENEITY OPW” + sequential numbering pattern looks like manufacturing process control (Operations Per Window? Optimization Per Wafer?), possibly semiconductor wafer-uniformity classification or similar quality-control ML. “shadow_experiment” is MLOps speak for a champion/challenger test.
168.119.201.8 + 168.119.201.89 (Hetzner): MLflow 2.21.3: AI safety research (multi-host)
zone_strategy s3://mlflow/3
sine_wave s3://mlflow/2
l2_probe s3://mlflow/1
Default s3://mlflow/0
Operator profile: “l2_probe” + “zone_strategy” + “sine_wave” reads as LLM interpretability / safety research terminology. L2-norm probes are a standard interpretability technique (Anthropic-style mechanistic interpretability). “Zone strategy” could be an attack-zone characterization. Two hosts in the same /24 with identical experiment lists = multi-replica research environment.
Other instances (HIGH/MEDIUM, briefer)
| Host | Version | Workload |
|---|---|---|
135.181.108.159 (Hetzner) | 2.20.2 | Chatbot_Service (single experiment) |
65.109.28.42 (Hetzner) | 2.22.4 | git-query-recommender, dev tool / code search |
Per-Class Severity
| Class | Count | Severity | Notes |
|---|---|---|---|
| A, Already-compromised (CVE-2023-1177 active exploitation) | 2 | CRITICAL | Attacker has been active against these for some time; SSH key + /etc/* extraction possible |
| B, Production ML workloads, post-CVE-fix versions, exposed metadata | 9 | HIGH | Operator-attributable workloads with sensitive content classes (medical, finance, breeding/gambling, manufacturing) |
The CRITICAL classification on Class A is sharp: an attacker who can read arbitrary filesystem paths via CVE-2023-1177 can extract SSH keys and pivot to full host compromise. Both 138.197.152.103 and 159.203.110.202 should assume their VPS root is compromised.
Cross-Survey Pattern (updated)
| Tier | Platform | Sample | Unauth |
|---|---|---|---|
| Vector DB | Qdrant / ChromaDB / Milvus | 142 | 100% |
| Inference | Triton / vLLM | 46 | 100% |
| Image-gen | A1111 | 1 | 100% |
| MLOps | MLflow Tracking | 11 | 100%, 18% actively compromised |
| Orchestration UI | Flowise / n8n / Open WebUI / Langflow | 1170 | 0% (small misconfig %) |
The MLOps tier (MLflow) joins the vector-DB and inference tiers in the auth-off-by-default cluster. The new wrinkle: passive observation of attacker activity within the MLflow data shows the auth-off-by-default state is being actively exploited at population scale, not just theoretically risky.
Remediation
# MLflow basic auth (since 2.5)
mlflow server --app-name basic-auth --host 0.0.0.0 --port 5000
# Configure default-admin via env
export MLFLOW_AUTH_CONFIG_PATH=/path/to/auth-config.ini
# Or front with reverse-proxy auth (nginx/Caddy HTTP Basic)
For the 2 already-compromised instances: assume root compromise, rotate all SSH keys + AWS credentials in the artifact-store config, redeploy from clean image.
Disclosure Posture
138.197.152.103(dental AI / orthodontic-upper-multitask) and159.203.110.202(helios_stock_direction) are in active-attacker territory. Time-sensitive disclosure to DigitalOcean abuse + the operator (if identifiable from the S3 bucketaipod-cropfor the first one) is warranted within hours, not days.157.90.104.16(algorithmic trading), finance-class data exposed; Hetzner abuse channel.65.109.36.121(pediatric medical ML), HIPAA-relevant data class implied; Hetzner abuse + potentially direct operator if identifiable.188.166.132.129/.104(horse-racing/livestock breeding ML), commercial product IP exposed.- Other 4 instances, informational.
NuClide Pipeline Artifacts
| Stage | Notes |
|---|---|
| Discovery | masscan port 5000 → 12,106 IPs |
| Fingerprint | mlflow-probe.py, /version regex + /api/2.0/mlflow/experiments/search body match |
| What was NOT done | No exploitation of CVE-2023-1177 (file read), no attempt to read attacker artifacts (which would have triggered the same vulnerable code path), no inference / model artifact downloads |
References
- MLflow auth: https://mlflow.org/docs/latest/auth/index.html
- CVE-2023-1177: https://nvd.nist.gov/vuln/detail/CVE-2023-1177
- CVE-2024-37052/53/54/55/56/57/58/59/60: https://github.com/mlflow/mlflow/security/advisories
- Cross-survey index: index.md