SPA + headless API is a high-severity exposure tell, NuClide Methodology

The insight

When a single-page application is hosted on a CDN platform (Vercel, Cloudflare Pages, Netlify, GitHub Pages, etc.) and its bundled JavaScript calls a same-brand API host of the form https://api.<brand>.<tld>/..., the API host is almost always on infrastructure the operator manages directly, and very often that infrastructure is unhardened.

The CDN provides TLS, edge caching, and DDoS protection for the static frontend. The API surface that does the actual work lands wherever the developer chose, and that choice is often a single cloud VM with no authentication wall, no WAF, and the operator’s data tier exposed on adjacent ports of the same host.

The CDN-hosted SPA gives the operator (and any casual observer) a false sense of security: the visible state is professional, the actual posture is unhardened.

Field-validated cases

Operator	SPA host	API host	Backing infra	Severity
PromptLayer (April 2026)	Vercel	call to Make.com webhook from the bundle	hardcoded webhook secrets	LLMjacking / quota drain
Multiple Phase 5 MLflow operators	Vercel / Netlify	tracker on raw cloud VM	unauth MLflow + open data ports	CRITICAL
SmartShop AI / amazonrec.space (May 2026)	Vercel	`api.amazonrec.space` → single Turkish VM	unauth FastAPI + MLflow + Redis + Postgres + Airflow on one host	CRITICAL

Three independent instances of the same pattern across the same six-week window. The pattern is consistent enough to codify as a detection heuristic.

Detection heuristic

For any operator under investigation whose primary web presence is on a CDN edge:

Identify the CDN by HTTP response headers: server: Vercel, cf-ray:, x-vercel-id:, x-served-by: (Fastly), via: 1.1 vegur (Heroku stragglers), server: Netlify.
Fetch the index HTML and extract any <script src="/assets/..."> or similar bundled-JS URLs.
Fetch the bundle. For typical SPA builds (Vite/Webpack/Turbopack) this is one 200KB–500KB file at a hashed path.
Grep for absolute URLs matching https?://api\.[^"]+ or https?://[^"]*\.<operator-domain>/api.
Resolve the extracted API hostname. If it lands on a different ASN than the CDN edge, that resolved host is the soft target. Probe it directly.

The heuristic surfaces the headless API in under 30 seconds per operator once automated.

Why it works

CDN platforms market themselves as “deploy your frontend, we handle the hard parts.” They deliberately do not host long-lived stateful backends. That separation of concerns is operationally correct, but it pushes the backend onto whatever infrastructure the developer is comfortable with, which for a small team is often the cheapest available cloud VM.

The same developer who pushes their frontend through a polished Vercel/Cloudflare workflow may have stood up the API on a Turkish or Indonesian or DigitalOcean shared-hosting VM in 20 minutes using a tutorial. The resulting asymmetry (professional frontend, unhardened backend) is the exposure-tell.

A second contributor: the CDN-hosted SPA usually pre-resolves the API hostname in the bundle as a constant. The hostname must be a stable public DNS name (so the SPA can call it from end-user browsers), which means it’s findable via passive DNS the moment the SPA ships. There is no network hardening shortcut available; the API has to be reachable from the open Internet by design.

How to apply at population scale

For a platform-class survey, after extracting per-host operator domains:

Filter for hosts where the operator domain is on a CDN edge (cheap test: does the operator domain resolve to a known CDN ASN, while a api. subdomain resolves elsewhere?)
Pull the SPA bundle and apply the heuristic.
Probe the resolved API host directly.
Bucket findings: SPA+API on same ASN (less interesting, may be self-hosted edge) vs. SPA on CDN + API elsewhere (high-priority probe).

What this is not

The heuristic does not catch:

Server-rendered apps without a separate API hostname (the backend is on the same edge or behind it).
API gateways with proper authentication. The heuristic surfaces the host, not the exposure level. The host still needs probing to confirm.
APIs accessed via the CDN’s own proxy (e.g. Vercel Edge Functions, Cloudflare Workers). These don’t reveal a separate backend hostname in the bundle; the API call goes to the same edge.

It does catch the most common small-team deployment topology where a solo dev or contractor splits frontend hosting from backend hosting along “easy” lines and leaves the backend half exposed.

Tooling

This heuristic becomes a VisorBishop stage in iter-9. discoverHeadlessAPI(target): given a CDN-fronted operator domain, extract its largest JS bundle, parse for api.* URLs, resolve them, and return the soft-target host record. Costs one HTTP fetch per operator; returns either a high-priority probe target or nothing.

When this could break

CDN platforms expanding backend hosting. Vercel’s Functions, Cloudflare Workers, and Netlify Functions all eat into the edge/backend split. Adoption is uneven across operator skill levels; expect this heuristic to remain effective for 2–3 more years before the topology shifts.
Backend-on-edge by default. If framework defaults (Next.js, Remix, SvelteKit) push more developers to deploy the API alongside the frontend on the CDN, the soft-target host stops existing.
API gateway adoption. Managed gateways (AWS API Gateway, Cloudflare API Gateway, Azure APIM) decouple the API hostname from the backend host, so DNS resolution stops pointing at the vulnerable VM. Operators who use a gateway are a different population.

For now, the pattern holds. Watch the cumulative case-count over the next two quarters; if the rate of confirmed instances drops below 1 per month while the platform-class survey volume holds steady, the topology is shifting and the heuristic needs revisiting.

SOURCE · case-studies/commercial/smartshop-ai-pentech-disclosure-2026-05-13.md