Backup & Snapshots, Data Layer, NuClide Stack

What it is

Backups are easy to forget about, and that is why they are dangerous. The popular ML and Kubernetes backup stack looks like this: Velero snapshots Kubernetes cluster state plus the persistent volumes underneath; Restic is the encrypted-by-default file backup tool whose REST server mode listens on a network port (8000 by default) for incoming snapshots; Barman handles Postgres-specific backup and restore; Longhorn (Rancher) is the Kubernetes block-storage layer that snapshots volumes on a schedule; BorgBackup sits in the same niche as Restic. In an ML deployment these tools are how the operator’s model weights, training datasets, and vector-DB volumes persist between restarts.

What goes wrong

A backup is a verbatim copy of the system at rest, and at rest every secret is unencrypted and every model file is intact. Restic’s REST server, when exposed without HTTP auth, lets an attacker download every snapshot the operator has ever taken, which is usually the entire model registry plus the training data. Velero exposes its API through the Kubernetes API server, so misconfigured cluster RBAC becomes a one-step model-exfiltration primitive. Longhorn’s UI ships without auth on port 80 and lists every volume by name (model-weights-pvc, training-data-pvc), telling attackers exactly which volume to target next.
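To illustrate why volume names alone are so revealing, here is a minimal sketch that flags names hinting at ML assets. The keyword list and the function name `flag_ml_volumes` are hypothetical, chosen for illustration; they are not part of any of the tools named above.

```python
import re

# Hypothetical keyword list (an assumption for this sketch); tune it to the
# naming conventions actually observed in the field.
ML_HINTS = ("model", "weights", "training", "dataset", "mlflow", "qdrant", "vector")


def flag_ml_volumes(names):
    """Return the volume names that contain an ML-related keyword as a whole token."""
    flagged = []
    for name in names:
        # Split on the separators commonly used in PVC names.
        tokens = re.split(r"[-_./]", name.lower())
        if any(tok in ML_HINTS for tok in tokens):
            flagged.append(name)
    return flagged


print(flag_ml_volumes(["model-weights-pvc", "training-data-pvc", "nginx-cache"]))
# → ['model-weights-pvc', 'training-data-pvc']
```

Matching on whole tokens rather than substrings keeps a name like remodel-cache from being flagged by accident.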

How we test

We probe Restic REST /snapshots for the snapshot inventory (this works without credentials when the server runs with authentication disabled), Longhorn /v1/volumes for the volume list, and Velero’s BackupStorageLocation objects via the Kubernetes API. We do not download snapshots. The metadata (snapshot IDs, volume names, timestamps, sizes) is sufficient evidence, and it means we never touch the model files themselves. A snapshot called mlflow-pvc measuring 240 GB on a research host tells the disclosure story without any further reach.
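The metadata-only approach can be sketched roughly as follows. This is an illustration under stated assumptions, not the tooling actually used: the `Evidence` record, `fetch_listing`, `longhorn_evidence`, and the size cap are hypothetical, and the response shape (a `data` array whose items carry `name` and a byte count in `size`) is assumed rather than confirmed by the text above.

```python
import json
import urllib.request
from dataclasses import dataclass

# Cap on response size (an assumption): listings are small, snapshots are not.
MAX_BYTES = 1_000_000


@dataclass(frozen=True)
class Evidence:
    source: str
    name: str
    size_gb: float


def fetch_listing(url, timeout=10):
    """GET a listing endpoint and parse it as JSON, refusing oversized bodies."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = resp.read(MAX_BYTES + 1)
    if len(body) > MAX_BYTES:
        raise ValueError("response too large to be a metadata listing")
    return json.loads(body)


def longhorn_evidence(listing):
    """Extract name and size from an assumed Longhorn /v1/volumes response."""
    records = []
    for vol in listing.get("data", []):
        # Assumed: size is reported in bytes, possibly as a string.
        size_bytes = int(vol.get("size", 0))
        records.append(Evidence("longhorn", vol["name"], round(size_bytes / 1e9, 1)))
    return records
```

Given a parsed listing such as `{"data": [{"name": "mlflow-pvc", "size": "240000000000"}]}`, `longhorn_evidence` returns a single record naming mlflow-pvc at 240.0 GB. Only the listing is read; no snapshot or volume content is ever requested.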

Receipts

Research

Every survey, case study, and disclosure we've published that touches this layer of the stack. Counts on the cells above tally these directly.

Cross-cloud surveys (1)

Survey May 4, 2026

Backup & Snapshot Services on Public AI Infrastructure: Survey

Re-probe of the 663 unauthenticated tier-2 Qdrant instances catalogued in the parallel cross-survey, this time targeting Qdrant's snapshot endpoints (GET /snapshots and GET /collections/<name>/snapsho…


Other categories in this layer

Vector Databases

Search Engines

OLAP / Analytics Backends

MLOps Tracking

Agent Memory

Data Labeling

Object Storage

Compute Orchestration

GPU Compute & Telemetry

Container Orchestration

Medical / Edge AI

Fine-tuning Runtimes

Document Parsers

Model Hubs & Registries