Most recent
navigate open esc close Corpus index built 2026-06-07 23:58 UTC

← All research

Survey May 6, 2026

Compute Orchestration / Training tier, cloud survey 2026-05

NuClide Research

Summary

A Shodan-seeded survey of the Compute Orchestration / Training tier of the category taxonomy confirmed 118 unauthenticated exposures across three platforms, Apache Spark (85), Apache Airflow (29), Ray Dashboard (4), out of 203 candidate hosts surfaced by three Shodan dorks.

DCWF KSAT coverage

Auto-derived from DCWF AI work-role rule files (ksat-tag).

  • 672 (AI Test & Evaluation Specialist): K7003, K7004, K7044, S7068, S7070, S7075, T5858, T5904
  • 733 (AI Risk & Ethics Specialist): K7040, S7067, T5854, T5868, T5882, T5893, T5904
  • overlap (Common AI KSATs (all 5 roles)): K1158, K1159, K22, K6311, K6935, K7003, K942, S7065

Population-tier severity breakdown:

SeverityCountComposition
Critical128 Airflow unauth-dashboard (DAG enumeration WITHOUT auth via /home) + 4 Ray Dashboard unauth (CVE-2023-48022 ShadowRay surface)
High79Apache Spark, Master + Worker + Application UI exposed; cluster topology + driver Environment-tab credential leak surface
Medium25Apache Airflow login pages (version disclosure but auth-gated dashboard); 6 Spark Worker-only
Low2Apache Airflow /api/v1/version + /health only (component visibility, not admin)

Single-day end-to-end execution: Shodan dork → fast-probe fingerprint → VisorGraph cert-pivot → aimap-profile classification → nuclide-contact disclosure resolution → VisorLog ledger ingest → VisorScuba compliance scoring → BARE Metasploit-module ranking → VisorCorpus adversarial corpus.

Headline findings

1. The Airflow /home bypass: 8 unauth dashboards

Apache Airflow’s web UI redirects / to /home when the user is logged in. Operators who enable the AnonymousUser public role (commonly added during testing and forgotten in production) reach the same /home unauthenticated the redirect short-circuits to a fully populated DAGs listing, scheduler state, and last-run history. The login page at /login/ is still served, but the dashboard is reachable around it.

This is not an authentication bypass exploit, it’s the documented behavior of AUTH_ROLE_PUBLIC = "Admin" (or "Op") plus WEB_SERVER_AUTH_TYPE = "AUTH_DB" with a public role that has full read access. Operator misconfiguration is the attack path. Eight of 36 confirmed-Airflow hosts in this sample have the configuration shipped open.

IPOperatorProviderSeverity
81.200.154.252(Timeweb customer cx90974)Timeweb (Russia/Kazakhstan)Critical
167.71.184.30(DigitalOcean customer)DigitalOceanCritical
159.223.47.220(DigitalOcean customer)DigitalOceanCritical
34.107.199.191(GCP customer)Google CloudCritical
34.120.202.253(GCP customer)Google CloudCritical
34.209.146.250(AWS customer, us-west-2)AWSCritical
35.184.10.196(GCP customer)Google CloudCritical
52.2.224.249(AWS customer, us-east-1)AWSCritical

Methodology lesson: a probe that only checks / will miss this category entirely. The /-route returns an HTTP 302 to /home, and following the redirect lands on the dashboard if the public role is configured. A naked /login/ check will catch the version-disclosure surface but report “login-gated” when the dashboard is in fact open.

2. Ray Dashboard: 4 confirmed CVE-2023-48022 ShadowRay surface

Out of 26 Ray-dorked candidates, 4 hosts return the Ray Dashboard at root without authentication. Ray’s CVE-2023-48022 (ShadowRay) is an unauthenticated job-submission RCE that has been actively exploited since disclosure. The fix requires operator action (set RAY_HEAD_NODE_ENABLE_AUTH=1); the framework default remains auth-off.

IPOperatorProvider
100.48.41.65(AWS customer)AWS EC2
34.193.202.61(AWS customer, us-east-1)AWS EC2
44.216.229.38(AWS customer)AWS EC2
94.124.160.20(Shock Hosting customer)Shock Hosting

The remaining 16 of the 26 Shodan hits had ports open but my fingerprint did not match the Ray Dashboard root content shape, these are likely Ray Serve deployments with custom basePath, or Ray instances behind a reverse proxy. Worth follow-up probing on /-/routes and /-/healthz for Ray Serve fingerprints.

3. Apache Spark: 85 hosts, three deployment shapes

Spark Master returns the cluster dashboard at / on port 8080 (or worker UI on 8081, application UI on 4040) with no authentication framework, Spark UI is Tier-A “no-auth concept” by default. The dashboard discloses:

  • Cluster topology (Master + Worker IPs + memory + cores)
  • Currently-running applications + their driver IPs
  • Recently-completed applications + their final state
  • Per-application Environment tab (port 4040) which routinely leaks credentials embedded in spark.hadoop.fs.s3a.access.key, spark.cassandra.connection.host, spark.streaming.kafka.bootstrap.servers, and similar runtime config

Geographic distribution (post-dedup of repeated Korea host 3.38.161.105):

CountryConfirmed
United States21
China23
Germany17
France17
Korea4
(other / mixed)3

Spark’s exposure rate is consistent with the auth-on-default-vs-off thesis: the framework ships open by design (Spark UI assumes trusted-network deployment) and operators who expose it on the public internet have not bolted authentication on top via reverse proxy or k8s ingress.

Methodology

Scope: the Compute Orchestration / Training tier as defined in the category taxonomy, Apache Spark, Apache Airflow, Ray (Dashboard + Serve), Dask, Prefect, Temporal, Kubeflow / KServe, BentoML. This survey covered Spark + Airflow + Ray; the remaining six platforms are deferred to a follow-up sweep.

Discovery: three Shodan dorks (http.html:"ray dashboard" country:"US", http.html:"apache airflow", http.title:"Spark Master") executed manually in the Shodan web UI (no API credits available). Top-5 country pages per dork; dedupe + honeypot-list cross-reference produced 203 unique candidate hosts; zero AS63949 honeypot overlap.

Fingerprint and confirmation: aimap (canonical fingerprinter) encountered slow throughput in the multi-port + deep-enumerator path under the active Mullvad VPN egress, so a 50-thread Python fast-probe (fast-probe.py) ran in parallel and produced the canonical fingerprint output. Each hit carries the platform’s distinctive content token at HTTP 200 on the platform’s documented port:

  • Ray Dashboard: <title>Ray Dashboard</title> or favicon.ico reference + Ray-specific React bundle paths
  • Apache Airflow: <title>Airflow - Login</title> (login-gated) or <title>DAGs - Airflow</title> (unauth admin via /home) plus is_scheduler_running meta tag
  • Apache Spark: Spark Master at / Spark Worker at / Spark Jobs title pattern plus <meta name="application-name" content="Spark"> and standard Spark UI HTML scaffold

Auth-posture validation (Airflow): a secondary /home re-probe distinguished login-gated Airflow (the bulk) from AnonymousUser-public Airflow (the 8 critical). The methodology lesson is captured under case-studies/commercial/SYNTHESIS-2026-05.md (Methodology Insight #8, see below).

Attribution: visorgraph cert-pivot per host produced operator-side attribution where TLS was on; aimap-profile --mode fast provided classification. nuclide-contact chained WHOIS abuse + DNS SOA + security.txt

  • pattern-guess+MX for disclosure recipient resolution per critical host.

Severity scoring: classify-and-ingest.py produces NDJSON in VisorLog ECS shape; severity rules:

  • Ray Dashboard reachable → critical (CVE-2023-48022 ShadowRay surface)
  • Airflow with DAGs/admin reachable via /homecritical (anonymous public role enabled, full read+sometimes write)
  • Airflow /api/v1/version + /health only → low (component visibility, not admin)
  • Airflow login-gated → medium (version-disclosure surface only)
  • Spark Master + Application UI → high (cluster topology + driver env credential leak)
  • Spark Master OR Application UI alone → high
  • Spark Worker only → medium

Ledger ingest: 118 findings written to data/nuclide.db via visorlog ingest --format ndjson.

Compliance scoring: visorscuba assess --db data/nuclide.db evaluated all 742 ledger nodes; our 118 produced 236 violations (118 × AI.C1 “AI services must not be publicly accessible without authentication” + 118 × AI.H1). Note: VisorScuba’s policy templates are Ollama-tuned, the violation message names “Unauthenticated Ollama” even for our Spark/Ray/Airflow findings. Policy needs platform-aware text; tracked as policy-coverage gap.

Exploit ranking: BARE returned consistent rank-1 matches across all 91 critical/high findings:

PlatformBARE rank-1 moduleCoverage
Apache Sparkexploits_linux_http_spark_unauth_rce79/79
Apache Spark (alt)exploits_linux_http_apache_spark_rce_cve_2022_3389179/79
Ray Dashboardexploits_linux_http_ray_agent_job_rce4/4
Apache Airflowexploits_linux_http_apache_airflow_dag_rce8/8

Every critical/high finding maps to a documented Metasploit commodity-CVE module, these are not first-party authz bugs. The unauth dashboards are known-CVE attack surface.

Adversarial corpus: visorcorpus build -profile strict -type baseline -max 200 produced a 137-case corpus (visorcorpus-compute-orch.json) for downstream LLM/RAG validation by operators consuming this disclosure.

Disclosure routing: 12 critical hosts

Critical hostWHOIS orgPrimary recipient
100.48.41.65 (Ray)Amazon.com, Inc.aws-security@amazon.com
34.193.202.61 (Ray)Amazon Technologies Inc.aws-security@amazon.com
44.216.229.38 (Ray)Amazon.com, Inc.aws-security@amazon.com
94.124.160.20 (Ray)Shock Hosting LLCabuse@shockhosting.com
159.223.47.220 (Airflow)DigitalOcean, LLCabuse@digitalocean.com
167.71.184.30 (Airflow)DigitalOcean, LLCabuse@digitalocean.com
34.107.199.191 (Airflow)Google LLCgoogle-cloud-compliance@google.com
34.120.202.253 (Airflow)Google LLCgoogle-cloud-compliance@google.com
34.209.146.250 (Airflow)Amazon Technologies Inc.aws-security@amazon.com
35.184.10.196 (Airflow)Google LLCgoogle-cloud-compliance@google.com
52.2.224.249 (Airflow)Amazon Technologies Inc.aws-security@amazon.com
81.200.154.252 (Airflow)Timeweb, LLP (RU/KZ)abuse@timewebcloud.kz

Cloud-provider abuse channels forward to the customer; for Timeweb the customer routing path is less established and a duplicate notification to the operator-direct channel (where derivable from cert/reverse-DNS pivot) is recommended.

Methodology Insight #8: the Airflow /home bypass

A naked /-fetch reports Airflow as login-gated when its public role is enabled. The dashboard reachability check must follow //home (302 target) and inspect for the is_scheduler_running meta tag plus DAG listing. A login-gated instance returns the login template; a public-role instance returns the same template the authenticated dashboard renders.

This pattern parallels Methodology Insight #6 (substring-FP at scale) and #7 (Shodan-facet substring-FP) in SYNTHESIS-2026-05.md: a fingerprint that only inspects the entry-point response shape misses auth-bypass-via-misconfiguration findings whose entry-point looks identical to the login-required case. Future surveys against application-tier platforms (RAG framework, LLM orchestration, BI/Dashboard) should bake in post-redirect auth-posture validation, not just landing-page fingerprinting.

Cross-tier auth-posture observation

The compute-orchestration tier extends the auth-posture pattern documented in SYNTHESIS-2026-05.md:

PlatformTierAuth posture in default configConfirmed exposure rate
Apache SparkInfrastructure-for-engineers”no auth concept”85 / 120 candidates → ~71% exposure
Ray DashboardInfrastructure-for-engineersauth-off-default4 / 26 candidates confirmed unauth
Apache AirflowApplication-tier (auth available)login required by default36 / 57 candidates → ~63% had Airflow + 8 (~14%) had public-role-enabled

Apache Spark behaves like the Vector Databases tier, framework default is no-auth, exposure rate is high. Apache Airflow behaves like the LLM Orchestration tier, framework default is auth-on, exposure is the ~10-15% misconfig slice. Ray Dashboard sits between the two but skews infrastructure-side.

This empirically validates the cross-tier framing: the framework default IS the deployment.

Toolchain provenance

Shodan (manual web UI)        →  3 dorks
fast-probe.py                 →  126 confirmed unauth
visorgraph (per host)         →  attribution graphs
aimap-profile --mode fast     →  ethics + classification
classify-and-ingest.py        →  ECS NDJSON, 118 ledger events
visorlog ingest               →  data/nuclide.db updated
visorscuba assess             →  236 violations
nuclide-contact (per critical)→  disclosure recipients (12)
bare --top 3                  →  3 rank-1 Metasploit modules
visorcorpus build             →  137-case adversarial corpus

All artifacts at ~/recon/compute-orch-2026-05-06/ (NDJSON, JSONs, contact files). Per-host attribution at ~/recon/compute-orch-2026-05-06/attribution/.

Future work

  1. Re-probe the 16 Ray ports-open-no-match hosts on Ray Serve endpoints (/-/routes, /-/healthz), likely Ray Serve, not Ray Dashboard
  2. Sweep the remaining six Compute Orchestration platforms, Dask, Prefect, Temporal, Kubeflow, KServe, BentoML, using the same Shodan-then-probe pattern documented above
  3. Coordinated disclosure batch send to the 12 critical via the disclosures/send_drafts_api.py Gmail-API pipeline
  4. Add platform-aware text to VisorScuba policies, current AI.C1 violation hardcodes “Unauthenticated Ollama”
  5. Fold confirmed findings into SYNTHESIS-2026-05.md cross-tier table

References