What it is
Models live somewhere. The Hugging Face Hub is the public default; ModelScope (Alibaba) is the Chinese equivalent; Replicate hosts serverless fine-tunes; BentoML’s BentoCloud is the Python-native deployment registry. Most large operators run a private mirror of one of these so internal teams can pull models without going to the public internet, and so the operator’s own fine-tunes can be versioned and served through the same interface every framework already speaks.
What goes wrong
A private hub mirror is a webhook target, a model-weights store, and an
API token issuer all in one. When the mirror is exposed without
authentication, an attacker can pull every model the operator has
uploaded, including private fine-tunes that may have memorised, and can
therefore leak, the operator’s training data. Worse, most
mirrors implement the same /api/models/upload endpoint as the upstream,
so an attacker can push a malicious model into the operator’s namespace
and wait for an internal team to pull it. The supply-chain risk is real:
pickle-based PyTorch checkpoints can execute arbitrary Python when they
are loaded.
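
To make the load-time risk concrete, here is a minimal sketch using only the Python standard library. It assumes the checkpoint is a plain pickle stream; real PyTorch files wrap the pickle in a zip archive and legitimately contain some import and call opcodes, so a practical scanner would need an allowlist. The names BenignPayload, SUSPICIOUS_OPCODES, and suspicious_opcodes are hypothetical, and the payload deliberately calls a harmless builtin.

```python
import pickle
import pickletools

# Opcodes that import a callable or invoke one while a pickle is being loaded.
SUSPICIOUS_OPCODES = {
    "GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX",
}


class BenignPayload:
    """Stand-in for a tampered checkpoint; the callable used here is harmless."""

    def __reduce__(self):
        # On load, pickle calls len("loaded") instead of rebuilding this object.
        # An attacker would substitute any importable callable and arguments.
        return (len, ("loaded",))


def suspicious_opcodes(data: bytes) -> list[str]:
    """List import/call opcodes in a pickle stream without unpickling it."""
    return [
        op.name
        for op, _arg, _pos in pickletools.genops(data)
        if op.name in SUSPICIOUS_OPCODES
    ]


if __name__ == "__main__":
    tampered = pickle.dumps(BenignPayload())
    plain = pickle.dumps({"w": [0.1, 0.2]})
    print("tampered:", suspicious_opcodes(tampered))
    print("plain:   ", suspicious_opcodes(plain))
    # Loading runs the embedded call; the result is whatever the callable returns.
    print("loaded value:", pickle.loads(tampered))
```

The scan reads opcodes only, so it never triggers the embedded call. Loading the tampered stream returns the result of the call rather than a BenignPayload instance, which is the behaviour an attacker relies on.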
How we test
We probe /api/models (HF-compatible) or /api/v1/models (ModelScope) for
the inventory and read the model card metadata. Model names plus sizes
characterise the operator and identify private fine-tunes. We do not
download weights. Where the upload endpoint is reachable we confirm
reachability with an OPTIONS request and stop.
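
The probe described above can be sketched roughly as follows, using only the Python standard library. This is an illustration under stated assumptions, not the actual tooling: the function names are hypothetical, the inventory is assumed to be a JSON list whose entries carry an "id" or "modelId" field and an optional "private" flag, and a ModelScope-style mirror may wrap the list in an outer object that needs unpacking first. The upload path is the one named in the text.

```python
import json
import urllib.error
import urllib.request


def list_models(base_url: str, path: str = "/api/models", timeout: float = 10.0):
    """GET the inventory endpoint and return the decoded JSON body."""
    url = base_url.rstrip("/") + path
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def summarise(models: list[dict]) -> list[tuple[str, bool]]:
    """Reduce each entry to (name, private flag); the field names are assumed."""
    return [
        (m.get("id") or m.get("modelId", "?"), bool(m.get("private", False)))
        for m in models
    ]


def upload_reachable(
    base_url: str, path: str = "/api/models/upload", timeout: float = 10.0
) -> int:
    """Send one OPTIONS request and return the status code; nothing is uploaded."""
    req = urllib.request.Request(base_url.rstrip("/") + path, method="OPTIONS")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # An error status still shows how the route responds.
        return err.code
```

Against an in-scope host, summarise(list_models(base_url)) yields the name and visibility pairs used to characterise the operator. Nothing in the sketch requests weight files, and upload_reachable stops after a single OPTIONS request.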