What it is
You can’t fine-tune a 70B model on a laptop. ML compute orchestrators are how teams rent and schedule expensive GPUs. RunPod (managed) lets a researcher spin up an 8xA100 pod in a few clicks; Ray (commercially backed by Anyscale) is a Python-native distributed-compute framework; Volcano is a Kubernetes batch scheduler commonly used for GPU workloads; Kubeflow builds an MLOps workflow on top of Kubernetes; SkyPilot abstracts GPU provisioning across cloud providers. Each is the layer between “I need 80GB of VRAM” and “the GPU is now running my code.”
What goes wrong
These systems hold credentials that translate directly into spend and
access. RunPod API keys map to billable GPU pods; Ray clusters often carry
the operator’s SSH keys and kubeconfig; Kubeflow Pipelines runs as a service
account with cluster-wide read on most installs. An exposed Ray dashboard is
effectively an open job-submission endpoint (ray job submit) that runs
arbitrary Python on the operator’s GPU fleet. An exposed RunPod control
plane lets an attacker spin up new pods for arbitrary workloads on the
operator’s bill. The cost exposure is real: we have seen disclosures
involving five-figure unauthorised GPU rentals.
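To make the scale concrete, here is a back-of-the-envelope sketch of how a
single hijacked pod reaches five figures. The hourly rate and duration are
illustrative assumptions, not quoted prices from any provider, and
rental_cost_usd is a hypothetical helper.

```python
def rental_cost_usd(gpus: int, usd_per_gpu_hour: float, hours: float) -> float:
    """Total bill for a pod: GPU count x hourly rate per GPU x hours running."""
    return gpus * usd_per_gpu_hour * hours


# Assumed figures for illustration only: one 8-GPU pod at $2.00 per GPU-hour,
# left running unnoticed for 30 days.
cost = rental_cost_usd(gpus=8, usd_per_gpu_hour=2.00, hours=30 * 24)
print(f"${cost:,.2f}")
```

Under those assumptions one pod alone passes $10,000 within a month; several
pods, or a higher rate, get there proportionally faster.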
How we test
We probe Ray’s dashboard /api/version, Kubeflow’s /pipeline endpoint, and
SkyPilot’s API server for fingerprints. Where reachable, we list jobs (no
submit, no cancel) to characterise what the operator runs and how much GPU
capacity they have available. Job names typically include the model
architecture and training step, which is enough to attribute the deployment
to its operator and estimate the financial exposure for the disclosure.
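A minimal sketch of the read-only fingerprint step for Ray follows. It is
not our actual tooling: the function names are hypothetical, and it assumes
the dashboard’s /api/version reply is a JSON object containing a
"ray_version" field, which may differ between Ray releases. It issues one
GET and never POSTs, so nothing is submitted or cancelled.

```python
import http.client
import json
import urllib.request
from typing import Optional


def parse_ray_version(body: bytes) -> Optional[str]:
    """Return the Ray version if body looks like a dashboard /api/version reply.

    Assumption: the reply is a JSON object with a string "ray_version" field.
    """
    try:
        data = json.loads(body)
    except ValueError:  # covers invalid JSON and undecodable bytes
        return None
    if not isinstance(data, dict):
        return None
    version = data.get("ray_version")
    return version if isinstance(version, str) else None


def fingerprint_ray(base_url: str, timeout: float = 5.0) -> Optional[str]:
    """Send a single GET to <base_url>/api/version and parse the reply."""
    url = base_url.rstrip("/") + "/api/version"
    request = urllib.request.Request(url, method="GET")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return parse_ray_version(response.read())
    except (OSError, http.client.HTTPException):
        # Unreachable, timed out, HTTP error, or malformed response:
        # treat all of these as "no fingerprint".
        return None
```

Calling fingerprint_ray with the base URL of a dashboard you are authorised
to test returns a version string on a match and None otherwise. Keeping the
parser separate from the network call lets it be exercised offline against
captured responses.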