
HIGH · Disclosure May 3, 2026


To: security@ntu.edu.tw
Subject: Unauthenticated AI inference endpoint, NTU CSIE (mvnl-nas.csie.ntu.edu.tw, 140.112.91.209)


Nicholas Michael Kloster / NuClide Research nicholas@nuclide-research.com

2026-05-03

Re: Unauthenticated vLLM inference endpoint (Llama-3.3-70B-FP8), NTU CSIE MVNL Lab
IP / Host: 140.112.91.209 / mvnl-nas.csie.ntu.edu.tw
Port: 8080/tcp (public)
Severity: HIGH


I’m an independent security researcher. I hold CISA disclosures CVE-2025-4364 and ICSA-25-140-11 and conduct good-faith AI infrastructure research under the NuClide Research umbrella. This is an unsolicited disclosure: no engagement exists with your organization, and I have not accessed, modified, or exfiltrated any data beyond what was necessary to confirm the exposure.


Summary

A machine in CSIE’s MVNL Lab (mvnl-nas.csie.ntu.edu.tw, 140.112.91.209) is running a vLLM inference server on port 8080 without authentication. The server hosts nvidia/Llama-3.3-70B-Instruct-FP8, a 70-billion-parameter instruction-tuned model, across two tensor-parallel GPU engines, and is reachable by anyone on the internet.


Infrastructure

Field          Value
IP             140.112.91.209
Hostname       mvnl-nas.csie.ntu.edu.tw
vLLM version   0.18.2rc1.dev73+gdb7a17ecc
Model          nvidia/Llama-3.3-70B-Instruct-FP8
Engines        2 (tensor-parallel multi-GPU)
max_model_len  6,000 tokens
Port           8080/tcp, no authentication

Exposure

curl http://mvnl-nas.csie.ntu.edu.tw:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"nvidia/Llama-3.3-70B-Instruct-FP8",
       "messages":[{"role":"user","content":"Hello"}],
       "max_tokens":100}'

Confirmed: inference executed without credentials. GET /metrics also returns unauthenticated Prometheus telemetry including request counts, token volumes, and per-engine latency distributions.

Usage at probe time: 237 completed requests, 450,604 prompt tokens processed.
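
The usage figures above are the kind of totals that can be read from the /metrics output. Below is a minimal sketch of how one counter could be summed across engines, assuming the standard Prometheus text exposition format. The helper name sum_counter is hypothetical, and the counter name vllm:prompt_tokens_total used in the usage line is an assumption that is not stated in this report and may differ between vLLM versions.

```shell
# sum_counter NAME
# Reads Prometheus text-format metrics on stdin and prints the sum of all
# samples of the counter NAME, across every label set (e.g. one per engine).
# Hypothetical helper; assumes no timestamp column after the sample value.
sum_counter() {
  awk -v name="$1" '
    /^#/ { next }                        # skip HELP/TYPE comment lines
    {
      metric = $1
      sub(/[{].*/, "", metric)           # drop the {label="..."} suffix
      if (metric == name) total += $NF   # last field is the sample value
    }
    END { printf "%d\n", total }
  '
}
```

Usage, with the counter name as an assumption: curl -s http://mvnl-nas.csie.ntu.edu.tw:8080/metrics | sum_counter 'vllm:prompt_tokens_total'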


Remediation

# Bind to localhost:
vllm serve nvidia/Llama-3.3-70B-Instruct-FP8 \
  --host 127.0.0.1 --port 8080 \
  --tensor-parallel-size 2

# Or add API key authentication (keeping the current port):
vllm serve nvidia/Llama-3.3-70B-Instruct-FP8 \
  --api-key <secret> --port 8080 \
  --tensor-parallel-size 2
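
After applying either remediation above, the result can be checked from a host outside the lab network. Below is a minimal sketch, assuming curl is available. The function names classify_status and check_endpoint are hypothetical, and the verdict labels are illustrative only.

```shell
# Map an HTTP status code, as printed by curl -w '%{http_code}', to a verdict.
classify_status() {
  case "$1" in
    000)     echo "unreachable" ;;    # curl prints 000 when no HTTP response arrived
    401|403) echo "auth-required" ;;  # server rejected the unauthenticated request
    2??)     echo "EXPOSED" ;;        # request succeeded without credentials
    *)       echo "other:$1" ;;
  esac
}

# Probe one path without credentials and print "<path> <verdict>".
# $1: base URL (scheme, host, port)   $2: path to request
check_endpoint() {
  code=$(curl -s -o /dev/null -m 10 -w '%{http_code}' "$1$2" || true)
  printf '%s %s\n' "$2" "$(classify_status "$code")"
}
```

Usage: check_endpoint http://mvnl-nas.csie.ntu.edu.tw:8080 /v1/models, then the same with /metrics. With the localhost binding, both paths should come back as unreachable from outside. With --api-key, it is worth confirming rather than assuming that /metrics is covered as well as the /v1 routes.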

Note: I previously disclosed a separate exposure to this same address: g1pc2n108.g1.ntu.edu.tw (140.112.233.108), an Ollama instance with 11 vision models. If that disclosure was received and acted on, this new finding on the CSIE side of campus calls for the same kind of remediation.

Please acknowledge receipt.

Nicholas Kloster
nicholas@nuclide-research.com
nuclide-research.com