INHA University: Ollama Stack + vLLM Node, NuClide engagement record

NuClide Research · 2026-05-01 (updated 2026-05-03)

Summary

INHA University (인하대학교) in Incheon has two independent unprotected AI inference nodes: an Ollama instance (165.246.39.51) with 7 models totalling ~133GB including gpt-oss:20b and dual Nemotron-Cascade 30B, and a separate vLLM 0.8.4 node (165.246.170.53) serving a containerized Qwen model with 90% prefix cache efficiency.

Node Summary

Node	IP	Service	Model	Port	Notes
Ollama	165.246.39.51	Ollama	gpt-oss:20b + 6 models	11434	CVE-2025-63389 injectable
vLLM	165.246.170.53	vLLM 0.8.4	local-qwen (Qwen, containerized)	8000	311 requests, 90% cache hit

Infrastructure

Field	Value
IP	165.246.39.51 (Ollama) / 165.246.170.53 (vLLM)
Organization	INHA UNIVERSITY
Country	South Korea
Open ports	11434 (Ollama, public) / 8000 (vLLM, public)

Model Inventory

Model	Size	Notes
`gpt-oss:20b`	12.1GB	Local inference, 20.9B params, `gpt-oss` family
`hf.co/unsloth/gpt-oss-20b-GGUF:Q8_0`	12.1GB	Same weights, direct HF GGUF pull
`nemotron-cascade-2:30b`	24.3GB	NVIDIA Nemotron Cascade 2 30B
`gemma4:26b-a4b-it-q8_0`	28.1GB	Gemma 4 Q8
`nemotron-3-nano:30b`	24.3GB	NVIDIA Nemotron-3 Nano 30B
`qwen3.5:27b`	22.5GB	,
`deepseek-r1:14b`	9.0GB	,

Total local storage: ~132GB

Findings

F1: Local gpt-oss:20b and Dual Nemotron Stack (HIGH)

gpt-oss:20b is running locally (12.1GB, 20.9B params). The model family gpt-oss is the OpenAI open-source weights release. Both the standard Ollama-tagged version and the direct HuggingFace GGUF pull are present, suggesting the operator downloaded via hf.co/unsloth/gpt-oss-20b-GGUF:Q8_0 first, then aliased it.

The dual nemotron-cascade-2:30b and nemotron-3-nano:30b stack (both 24.3GB) suggests NVIDIA model evaluation or research use.

F2: CVE-2025-63389 Injectable (HIGH)

All models injectable via unauthenticated /api/create. The Nemotron and gpt-oss models have no system prompts, post-injection inference is unobstructed.

Remediation

OLLAMA_HOST=127.0.0.1:11434
systemctl restart ollama

Node: 165.246.170.53: vLLM Containerized Qwen Node

Field	Value
IP	165.246.170.53
rDNS	No rDNS (SERVFAIL)
vLLM version	0.8.4
Model ID	`local-qwen`
Model root	`/model` (container mount)
max_model_len	4,096 tokens
Port	8000/tcp public

local-qwen is an alias, the model is mounted at /model inside a container, hiding the actual model family and version. Based on the naming and university context, this is likely a Qwen 2.5 or Qwen 3 variant. The containerized deployment pattern (Docker volume mount at /model) and the vLLM 0.8.4 version suggest an automated or scripted deployment.

Metrics

Metric	Value
`request_success_total[stop]`	277
`request_success_total[length]`	34
Total requests	311
`prompt_tokens_total`	10,833
`generation_tokens_total`	12,900
`gpu_prefix_cache_queries_total`	532
`gpu_prefix_cache_hits_total`	481
Prefix cache hit rate	90.4%

The high cache hit rate (90.4%) indicates a consistent input pattern, likely a chatbot or assistant with a fixed system prompt contributing repeated prefix tokens.

Disclosure

Discovered: 2026-05-01 (Ollama) / 2026-05-03 (vLLM node)
Status: Pending outreach to INHA IT (inha.ac.kr)