Streamlit Data Apps on Public Cloud: Auth Posture Survey, NuClide Research

NuClide Research · 2026-05-03

Summary

A mass scan of port 8501 (Streamlit’s default) across 28 cloud-provider /16 ranges (DO/Hetzner/Vultr) returned 1,389 hits. Fingerprinting via /_stcore/host-config confirmed 551 Streamlit apps, all unauthenticated (useExternalAuthToken: false). A Playwright-rendered sample of 100 apps (98 rendered successfully) yielded 84 unique app titles, most of them attributable to a specific operator or product, spanning trading bots, OSINT tools, business admin portals, dashboards, and a long tail of internal AI demos.

DCWF KSAT coverage

Auto-derived from DCWF AI work-role rule files (ksat-tag).

672 (AI Test & Evaluation Specialist): K7003, K7004, K7044, S7068, S7070, S7075, T5904
733 (AI Risk & Ethics Specialist): K7040, S7067, T5854, T5868, T5893, T5904
overlap (Common AI KSATs (all 5 roles)): K108, K1158, K1159, K22, K6311, K6900, K6935, K7003, K7024, K7048, K942, S7065

Streamlit does not enforce authentication by default; the framework expects operators to put a reverse proxy or other access control in front of it. The 100% unauthenticated result here is therefore expected: a Streamlit app reachable on the public internet on its default port almost always has no auth in front of it. The novel finding is what people are running on unauthenticated Streamlit: production trading dashboards, dark-web OSINT tools, admin portals, and similar workloads, likely with embedded API keys, LLM access, file-upload PII pipelines, and internal data exposed to every visitor.

This is the largest “long-tail” sample in the NuClide commercial-AI series and the broadest cross-section of how AI/data tooling actually gets deployed in 2026.

Methodology

masscan -iL <28 cloud /16 CIDRs> -p 8501 --rate 10000
  → 1,389 port-8501 hits

streamlit-probe.py (200-thread fingerprint)
  GET /_stcore/host-config → JSON {"useExternalAuthToken":false, "allowedOrigins":[...]}
  GET /_stcore/health      → "ok"
  GET /                    → HTML (title only, set client-side via JS)
  → 551 confirmed Streamlit apps

streamlit-render-probe.py (Playwright sample, 100 random instances)
  Render each app with a real browser; wait for JS hydration; extract
  document.title and first 5 lines of body text.
  → 98 successfully rendered, 84 unique custom titles

NuClide deliberately did not interact with the Streamlit apps (no form input, no file upload, no button clicks). Title + first-render snapshot only.
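
The fingerprint step can be sketched roughly as follows. This is a minimal illustration, not the actual streamlit-probe.py: the names is_streamlit_host_config and probe are hypothetical, and the shape check assumes only the two JSON keys shown in the methodology listing. It issues the same two read-only GETs and nothing else.

```python
import http.client
import json
import urllib.request


def is_streamlit_host_config(body: str) -> bool:
    """True if body is a JSON object with the two keys listed in the methodology."""
    try:
        data = json.loads(body)
    except ValueError:
        return False
    return (
        isinstance(data, dict)
        and isinstance(data.get("useExternalAuthToken"), bool)
        and isinstance(data.get("allowedOrigins"), list)
    )


def probe(base_url: str, timeout: float = 5.0) -> bool:
    """Hypothetical helper: fetch host-config and health from one host, read-only."""

    def get(path: str) -> str:
        url = base_url.rstrip("/") + path
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            # Cap the read so an unrelated service cannot stream unbounded data.
            return resp.read(65536).decode("utf-8", errors="replace")

    try:
        return (
            is_streamlit_host_config(get("/_stcore/host-config"))
            and get("/_stcore/health").strip() == "ok"
        )
    except (OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error, or malformed URL: not confirmed.
        return False
```

Keeping the shape check separate from the network call makes it possible to replay saved responses offline when tuning the match.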

Findings Summary

Metric	Value
Cloud /16 ranges scanned	28
Masscan hits on :8501	1,389
Streamlit confirmed	551
Unauthenticated	551 (100%)
Sampled with Playwright (rendered)	100
Successful renders	98
Custom-titled apps	~85% (extrapolating from sample)
Default-titled (“Streamlit”/“main”/“app”)	~15%
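
The ~85% figure follows from 84 custom titles over 98 successful renders. The sketch below reproduces that arithmetic and, as an addition that is not part of the original analysis, attaches a 95% Wilson score interval to show how far a 98-app sample can drift when extrapolated to all 551 apps. It assumes the sample was drawn uniformly at random and treats the 84 unique titles as the custom-titled count.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - margin, center + margin


RENDERED = 98        # successful Playwright renders
CUSTOM_TITLED = 84   # unique custom titles, used here as the custom-titled count
CONFIRMED = 551      # confirmed Streamlit apps in the full population

share = CUSTOM_TITLED / RENDERED
low, high = wilson_interval(CUSTOM_TITLED, RENDERED)

print(f"custom-titled share: {share:.1%}")
print(f"95% Wilson interval: {low:.1%} to {high:.1%}")
print(f"extrapolated apps:   {round(low * CONFIRMED)} to {round(high * CONFIRMED)}")
```

Under those assumptions the interval spans roughly 77% to 91%, so the "several hundred operator-attributable apps" reading holds at either end.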

Threat Classes Observed in the 100-App Sample

Class A: Trading bots / crypto / finance dashboards (highest concentration)

The single largest cluster. ~20% of titled apps are trading-related:

App title	IP	Notes
Trading Desk	165.227.127.162	Generic trading dashboard
Trading Dashboard	45.32.35.83	Generic
Trading Bot Dashboard	178.62.87.181	Generic
交易历史 - Binance Bot	149.28.141.122	Chinese-language Binance trading-history dashboard
Crypto Bot Dashboard	138.197.87.106	Generic
Hyperliquid Dashboard	45.76.92.38	Hyperliquid (perpetuals DEX) trading view
Polymarket Smart Money	159.69.23.69	Polymarket whale-tracking
Daytrade bot, dashboard	116.203.227.71	Generic
Bot Dashboard	116.203.192.203	Generic
PBGUI - Welcome	65.109.134.92	PassivBot UI (popular open-source crypto bot frontend)
Pre-Volatility Dashboard	(sampled)	Generic
Systematic Portfolio Dashboard	138.197.223.108	Generic
Kalshi Weather Desk	104.131.190.28	Kalshi prediction-market weather contracts
Finance Tracker	159.203.67.30	Generic
帝國矩陣指揮部 | Institutional Grade	(sampled)	Chinese; title translates as “Empire Matrix Command Center”
台股基本面系統化分析	(sampled)	Chinese, Taiwan stock fundamental analysis
Heights Insights	159.203.64.214	Likely finance

Risk class: strategy disclosure (the dashboard reveals which signals/positions the operator runs), API-key exposure (Binance/Hyperliquid/Polymarket account credentials that such apps commonly hard-code or load via st.secrets), live position visibility, and the standard “free LLM/inference” exposure if the bot uses an embedded API key.

This matches the prior 94.183.187.228 (retail trading bot) finding from the earlier session, which was a single example of this pattern. The cloud sweep shows the pattern at population scale: trading bots on Streamlit are the dominant exposed AI workload type.

Class B: Operator-attributable admin portals (CRITICAL by class)

Apps named with administrative or operational language:

App title	IP	Notes
Fair Skies Admin Portal 👤	138.197.225.245	Customer admin UI
Quetzality Admin	138.197.33.66	Operator-branded admin
Heritage Lens Agent	45.63.100.169	Internal agent tool
Lynchburg Carbon Intelligence	104.236.8.2	Carbon-emission intelligence (operator: Lynchburg)
Peaqock Tenders	167.71.47.246	Peaqock tender-management
Yguazu - Pedidos de Combustible	149.28.107.69	Spanish, fuel-order management
AMZ Bid Manager Level 3	46.101.215.72	Amazon advertising bid manager
Управление данными о селлерах OZON	65.109.88.77	Russian, OZON e-commerce sellers data management
AFI Tools, Data Cleanse	206.189.190.107	Data-cleansing internal tool
Observability Utility Tool	206.189.132.132	Internal ops tool
WG Device Manager	45.63.53.47	WireGuard or similar device manager
MITEC Live	108.61.185.244	“MITEC”: Mexican government IT-secretariat naming pattern
Alarm Rationalization Platform	138.197.80.150	Industrial / SCADA-adjacent
FORGE, Milos	138.197.19.120	Custom platform
Sentinel Core	116.202.97.178	Custom platform
Shinbu Command Center	45.76.122.49	Custom platform
Ayuda-Foreclosure Manager	138.197.93.163	Foreclosure case management

Risk class: these are internal operations tooling exposed to the public internet. Each one likely contains the operator’s customer data, ticket queue, or business-process state.

Class C: AI / OSINT / agent tools (HIGH: operator IP + capability)

App title	IP	Notes
Robin: AI-Powered Dark Web OSINT Tool	(sampled)	Literal name
Polygraph Terminal	167.71.166.118	Investigative/intel tool
Evidence Engine	116.203.134.204	Investigative
Fourby Newsroom	167.172.119.126	News/research
AI Visibility OS (Personal Edition)	45.55.91.146	LLM-app monitoring
PolyGraph Terminal	167.71.166.118	Same as above
MySQL Chatbot	167.71.226.163	Unauthenticated chatbot fronting a MySQL database
NHL Agent Chat	167.172.26.116	Sports agent / scout chatbot
BANG Companion App	65.109.225.156	Unknown
Smart Dustbin Dashboard	206.189.155.51	IoT
Project Planner	167.172.182.229	Internal
MERMAID Data Analysis Toolkit	45.76.173.245	Marine science (MERMAID = Marine Ecological Research and Monitoring)
VisionCup Questions	149.28.138.130	Sports
Telco Churn Predictor	165.227.67.180	Telecom analytics
Water Flow Rate Prediction	167.71.210.165	Utility

Class D: Cross-correlation with the MLflow survey

GC Breeders Evaluation appears twice in the 100-app sample (port 8501), and GC_BREEDER_* was the dominant experiment-name pattern on the MLflow servers at 188.166.132.129 and .104 (port 5000). The same operator appears to run:

MLflow Tracking Server with 10 GC_BREEDER_* experiments → see mlflow-cloud-survey-2026-05.md
Streamlit dashboards titled “GC Breeders Evaluation” for browsing the model results
Multi-host deployment (DigitalOcean)

This is a complete MLOps stack exposure for one operator: training (MLflow) + dashboarding (Streamlit), both unauth.
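
One way such a cross-survey match could be automated is a token overlap between Streamlit titles and MLflow experiment names. The sketch below is a crude heuristic under stated assumptions, not the method NuClide used: name_tokens and correlate are hypothetical helpers, the plural stripping is naive, and "GC_BREEDER_example" is a made-up placeholder because the report gives only the GC_BREEDER_* pattern.

```python
import re


def name_tokens(name: str) -> set[str]:
    """Lower-case, split on non-alphanumerics, and strip a trailing plural 's'."""
    tokens = re.split(r"[^a-z0-9]+", name.lower())
    return {t[:-1] if len(t) > 3 and t.endswith("s") else t for t in tokens if t}


def correlate(app_titles, experiment_names, min_shared=2):
    """Return (title, experiment, shared_tokens) for pairs sharing enough tokens."""
    matches = []
    for title in app_titles:
        for exp in experiment_names:
            shared = name_tokens(title) & name_tokens(exp)
            if len(shared) >= min_shared:
                matches.append((title, exp, sorted(shared)))
    return matches


# Titles from the Streamlit sample; the experiment name is a hypothetical placeholder.
titles = ["GC Breeders Evaluation", "Trading Desk"]
experiments = ["GC_BREEDER_example"]

for title, exp, shared in correlate(titles, experiments):
    print(f"{title!r} <-> {exp!r} via {shared}")
```

Requiring at least two shared tokens keeps generic words such as "dashboard" from linking unrelated operators; any hit would still need manual confirmation (hosting provider, IP neighborhood) before being treated as the same operator.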

What Was NOT Done

No form fills, no button clicks, no file uploads to any Streamlit app
No interaction with /_stcore/stream (the WebSocket data plane); connecting to it would let an attacker receive the app’s live dashboard updates
No probing of st.secrets-style endpoints
No identification of embedded API keys (would require deeper interaction)

The Playwright render captured the public-facing first-screen state only. Many apps likely have richer surfaces behind the home page (login screens, admin tabs, file-upload forms) that NuClide did not exercise.
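
A passive first-screen capture of this kind can be sketched as below. This is an illustration rather than the actual streamlit-render-probe.py: snapshot, first_lines, and is_default_title are hypothetical names, the fixed 5-second hydration wait is an assumption, and the default-title set is taken from the findings table. The code loads the page and reads the title and body text; it never clicks, types, or uploads.

```python
DEFAULT_TITLES = {"streamlit", "main", "app"}


def is_default_title(title: str) -> bool:
    """True if the rendered title is one of the default names listed in the findings."""
    return title.strip().lower() in DEFAULT_TITLES


def first_lines(body_text: str, limit: int = 5) -> list[str]:
    """Return the first `limit` non-empty lines of the rendered body text."""
    lines = [ln.strip() for ln in body_text.splitlines()]
    return [ln for ln in lines if ln][:limit]


def snapshot(url: str, hydration_ms: int = 5000) -> dict:
    """Render one app in headless Chromium and record title + first lines only."""
    # Imported lazily so the pure helpers above work without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            page.goto(url, wait_until="load", timeout=30000)
            # Streamlit sets the title client-side, so give the JS time to hydrate.
            page.wait_for_timeout(hydration_ms)
            title = page.title()
            body = page.inner_text("body")
        finally:
            browser.close()
    return {
        "url": url,
        "title": title,
        "default_title": is_default_title(title),
        "first_lines": first_lines(body),
    }
```

A fixed wait is the simplest choice; waiting on a specific DOM element would be faster but ties the probe to a particular Streamlit front-end version.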

Cross-Survey Pattern (updated)

Tier	Platform	Sample	Unauth
Vector DB	Qdrant / ChromaDB / Milvus	142	100%
Inference	Triton / vLLM	46	100%
Image-gen	A1111	1	100%
MLOps	MLflow Tracking	11	100%
Data App	Streamlit	551	100%
Orchestration UI	Flowise / n8n / Open WebUI / Langflow	1,170	0% (small misconfig %)

Streamlit confirms the broader pattern: anything in the AI/ML stack that doesn’t ship with auth-on-default is overwhelmingly deployed without auth in front of it. Operators who would never deploy a public-internet Postgres or Redis will happily expose their Streamlit dashboard with the same access semantics as that database.

Remediation

# Streamlit does not enforce authentication out of the box. Recommended:

# 1. Reverse-proxy with HTTP Basic / OAuth2 forward auth
#    (Caddy / Nginx / Traefik with oauth2-proxy)

# 2. streamlit-authenticator package (community, recommended)
#    pip install streamlit-authenticator

# 3. Streamlit Community Cloud SSO (managed; only available to apps hosted on Community Cloud)

# 4. Bind to localhost only and front with a tunnel
streamlit run app.py --server.address=127.0.0.1

Disclosure Posture

NuClide is not opening 551 individual disclosure threads. The ~85% custom-titled fraction means several hundred operator-attributable apps. Disclosure priorities by class:

Trading bots / finance dashboards: operator-specific where the brand is identifiable. PBGUI and similar open-source bots get community awareness only; branded trading bot dashboards get direct operator contact.
Admin portals (Fair Skies, Quetzality, MITEC, OZON): operator-attributable; direct disclosure to the brand contact.
GC Breeders multi-stack operator: coordinated disclosure with the MLflow finding (same operator, same VPS family).
Robin Dark Web OSINT Tool: given the data class implied (dark-web scrape data), worth contacting the operator directly via the brand.

NuClide Pipeline Artifacts

Stage	Notes
Discovery	masscan port 8501 → 1,389 IPs
Fingerprint	`streamlit-probe.py`, `/_stcore/host-config` shape match
Render sample	`streamlit-render-probe.py`, Playwright on 100 random instances; 98 successful
Findings ledger	Top-titled instances ingested into `data/nuclide.db`
What was NOT done	No app interaction, no file uploads, no form fills, no probing of internal pages

References

Streamlit secrets management: https://docs.streamlit.io/develop/concepts/connections/secrets-management
streamlit-authenticator: https://github.com/mkhorasani/Streamlit-Authenticator
Cross-survey index: index.md