To: security@ntu.edu.tw
Subject: Unauthenticated AI inference endpoint, NTU CSIE (mvnl-nas.csie.ntu.edu.tw, 140.112.91.209)
Nicholas Michael Kloster / NuClide Research nicholas@nuclide-research.com
2026-05-03
Re: Unauthenticated vLLM inference endpoint (Llama-3.3-70B-FP8), NTU CSIE MVNL Lab
IP / Host: 140.112.91.209 / mvnl-nas.csie.ntu.edu.tw
Port: 8080/tcp (public)
Severity: HIGH
I’m an independent security researcher. I hold CISA disclosures CVE-2025-4364 and ICSA-25-140-11 and conduct good-faith AI infrastructure research under the NuClide Research umbrella. This is an unsolicited disclosure: no engagement exists with your organization. I have not accessed, modified, or exfiltrated any data beyond what was necessary to confirm the exposure.
Summary
A machine in CSIE’s MVNL Lab (mvnl-nas.csie.ntu.edu.tw, 140.112.91.209) is running a vLLM inference server on port 8080 without authentication. The server hosts nvidia/Llama-3.3-70B-Instruct-FP8, a 70-billion-parameter instruction-tuned model, across two tensor-parallel GPU engines, and anyone on the internet can reach it.
Infrastructure
| Field | Value |
|---|---|
| IP | 140.112.91.209 |
| Hostname | mvnl-nas.csie.ntu.edu.tw |
| vLLM version | 0.18.2rc1.dev73+gdb7a17ecc |
| Model | nvidia/Llama-3.3-70B-Instruct-FP8 |
| Engines | 2 (tensor-parallel multi-GPU) |
| max_model_len | 6,000 tokens |
| Port | 8080/tcp, no authentication |
Exposure
curl http://mvnl-nas.csie.ntu.edu.tw:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"nvidia/Llama-3.3-70B-Instruct-FP8",
       "messages":[{"role":"user","content":"Hello"}],
       "max_tokens":100}'
Confirmed: inference executed without credentials. GET /metrics is also served without authentication and returns Prometheus telemetry, including request counts, token volumes, and per-engine latency distributions.
Usage at probe time: 237 completed requests, 450,604 prompt tokens processed.
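For triage on the operator's side, totals like these can be recomputed from a saved copy of the /metrics output. The following is a minimal shell sketch, not the method used for this report. It assumes the Prometheus text exposition format with one series per engine and no trailing timestamps. The metric name vllm:prompt_tokens_total is an assumption that may differ between vLLM versions, and sum_counter is a hypothetical helper name.

```shell
# Sum one Prometheus counter across all of its label sets
# (for example, one series per engine).
# Reads exposition-format text on stdin; $1 is the metric name.
# Assumption: each sample line ends with the value (no timestamp).
sum_counter() {
  awk -v name="$1" '
    /^#/ { next }                        # skip HELP/TYPE comment lines
    {
      metric = $1
      sub(/\{.*/, "", metric)            # drop the {label="..."} part
      if (metric == name) total += $NF   # sample value is the last field
    }
    END { printf "%d\n", total }
  '
}
```

Run on the host itself, this could look like `curl -s http://127.0.0.1:8080/metrics | sum_counter vllm:prompt_tokens_total`. Comparing the total against expected lab usage is one way to judge whether outside parties have been using the endpoint.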
Remediation
# Bind to localhost so the server is reachable only from the machine itself:
vllm serve nvidia/Llama-3.3-70B-Instruct-FP8 \
  --host 127.0.0.1 --port 8080 \
  --tensor-parallel-size 2
# Or require an API key (clients then send "Authorization: Bearer <secret>"):
vllm serve nvidia/Llama-3.3-70B-Instruct-FP8 \
  --port 8080 \
  --api-key "<secret>" \
  --tensor-parallel-size 2
Note: I previously disclosed a separate exposure to this same address: g1pc2n108.g1.ntu.edu.tw (140.112.233.108), running Ollama with 11 vision models. If that disclosure was received and acted on, the same remediation approach applies to this new finding on the CSIE side of campus.
Please acknowledge receipt.
Nicholas Kloster
nicholas@nuclide-research.com
nuclide-research.com