Survey Jun 6, 2026

Arize Phoenix Population Survey — 41/55 Unauthenticated Project Disclosure

NuClide Research · 2026-06-06


Executive Summary

Arize Phoenix (github.com/Arize-ai/phoenix) is an open-source LLM observability and tracing platform that provides span ingestion, project organization, dataset versioning, and prompt management for production AI applications. Shodan indexes 94 instances for the query "Phoenix" port:6006; 89 unique endpoints were downloaded, and 55 responded.

DCWF KSAT coverage

Auto-derived from DCWF AI work-role rule files (ksat-tag).

  • 672 (AI Test & Evaluation Specialist): K7003, K7004, S7068, S7070, S7075, T5858, T5904, T5919
  • 733 (AI Risk & Ethics Specialist): K7040, S7067, T5854, T5868, T5893
  • overlap (Common AI KSATs (all 5 roles)): K1157, K1158, K22, K6311, K6900, K6935, K7003, K942

Of 55 reachable instances, 41 (74.5%) expose /v1/projects without authentication, and 34 (61.8%) expose /v1/users without authentication. The user list endpoint returns account records including creation timestamps and account IDs — a PII disclosure at population scale.

This is a smaller population than Langfuse or RAGFlow, but the auth surface is more severe: where Langfuse and RAGFlow expose only a signup flag, Phoenix exposes the data layer (projects, users) directly. The finding class is LLM02:2025 Sensitive Information Disclosure (current OWASP Top 10 for LLM Applications #2).

Notable institutional findings: Northeastern University (Boston, USA), SENAI (Brazil — Serviço Nacional de Aprendizagem Industrial, the Brazilian national vocational education service).


Methodology

| Stage | Action | Tool |
| --- | --- | --- |
| Stage 0 | Shodan harvest "Phoenix" port:6006 | shodan CLI (89 records) |
| Stage 0c | TCP/HTTP liveness via /healthz | herald |
| Stage 1b | Auth-posture probe /v1/projects and /v1/users (array_nonempty match) | herald phoenix platform config |
| Stage 3v | Endpoint semantics validated against Phoenix v6.x source (src/phoenix/server/api/routers/v1/) | manual review |
| Stage 12b | Dataset enrichment with country/org from Shodan record | Python + Shodan join |
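
Stage 12b joins each finding with the country and org from its Shodan record. A minimal sketch of such a join, assuming the Shodan download is JSON lines carrying the standard banner fields ip_str, port, org, and location.country_name. The (endpoint, finding) pair format is hypothetical, since herald's actual output format is not shown in this report.

```python
import json

def index_shodan(lines):
    """Map 'ip:port' to (country, org) from Shodan JSON-lines banners."""
    index = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        banner = json.loads(line)
        key = f"{banner['ip_str']}:{banner['port']}"
        location = banner.get("location") or {}
        index[key] = (location.get("country_name"), banner.get("org"))
    return index

def enrich(findings, index):
    """Attach country/org to each (endpoint, finding) pair.

    The findings format is an assumption for illustration.
    Endpoints missing from the Shodan index get None for both fields.
    """
    enriched = []
    for endpoint, finding in findings:
        country, org = index.get(endpoint, (None, None))
        enriched.append({"endpoint": endpoint, "finding": finding,
                         "country": country, "org": org})
    return enriched
```

Keying on the "ip:port" string keeps the join independent of record order in either file.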

The probes use array_nonempty matching: if the response contains a non-empty data array, the finding fires. Phoenix returns {"data": [], "next_cursor": null} when authenticated routes are queried without credentials in instances where auth is configured — so a non-empty data array is the unauth signal.
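
The matching rule can be sketched as a small predicate. This is an illustration of the logic described above, not herald's implementation; the function name and the default key are assumptions.

```python
import json

def array_nonempty(body: str, key: str = "data") -> bool:
    """Return True only if body is a JSON object whose `key` holds a non-empty list."""
    try:
        payload = json.loads(body)
    except (ValueError, TypeError):
        return False  # not JSON at all, e.g. an HTML login page
    if not isinstance(payload, dict):
        return False
    value = payload.get(key)
    return isinstance(value, list) and len(value) > 0

# The empty-array response quoted above does not fire the finding.
print(array_nonempty('{"data": [], "next_cursor": null}'))  # False
```

Treating non-JSON bodies and non-object payloads as non-matches keeps the probe conservative: only a populated data array counts as disclosure.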

NuClide restraint: the account count is the only /v1/users field herald consumes. Schema and PII details were not extracted, per the restraint ethic; the exposed project names and user counts are themselves the finding.


Population Results

| Metric | Count | Rate |
| --- | --- | --- |
| Shodan-indexed | 94 | |
| Unique endpoints downloaded | 89 | |
| Reachable (HEALTH_OPEN) | 55 | 61.8% of downloaded (58.5% of indexed) |
| /v1/projects unauth (PROJECTS_UNAUTH) | 41 | 74.5% of reachable |
| /v1/users unauth (USERS_UNAUTH) | 34 | 61.8% of reachable |

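The PROJECTS_UNAUTH > USERS_UNAUTH gap (41 vs 34) is consistent with the probe design. Assuming the USERS_UNAUTH hosts are a subset of the PROJECTS_UNAUTH hosts, 34 of the 41 (82.9%) have configured user accounts, so /v1/users returns data. The remaining 7 (17.1%) return no user records, which is what a fresh install with no accounts and no auth requirement would produce: an even more permissive state.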
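
The rates can be rechecked directly from the counts reported in this section. Note that 55 reachable hosts is 61.8% of the 89 downloaded endpoints and 58.5% of the 94 indexed instances. The helper name below is illustrative.

```python
def rate(count: int, total: int) -> float:
    """Percentage of `count` in `total`, rounded to one decimal place."""
    return round(100 * count / total, 1)

indexed, downloaded, reachable = 94, 89, 55
projects_unauth, users_unauth = 41, 34

print(rate(reachable, downloaded))         # 61.8
print(rate(reachable, indexed))            # 58.5
print(rate(projects_unauth, reachable))    # 74.5
print(rate(users_unauth, reachable))       # 61.8
print(rate(users_unauth, projects_unauth)) # 82.9
```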


Notable Findings

Northeastern University — 129.10.224.226:6006 (HIGH)

Northeastern University (Boston, USA, AS161). Phoenix instance with two projects: Essaybot and default. The Essaybot project name suggests a student essay-grading or writing-assistant LLM application — potentially handling student work products, which is FERPA-relevant data under US education privacy law.

Project names accessible unauthenticated. User records (count = 2) accessible. Span/trace data not exercised (restraint).

Disclosure recipient: oirc@northeastern.edu (Northeastern Office of Information Security)

SENAI Brazil — 200.9.65.187:6006 (HIGH)

Serviço Nacional de Aprendizagem Industrial (SENAI) is Brazil’s national industrial apprenticeship service — vocational education for ~3 million students annually, operated by the Brazilian National Confederation of Industry (CNI). Phoenix instance on SENAI infrastructure with 2 projects exposed and 2 users disclosed.

Vocational education context: Brazilian LGPD applies. Disclosure recipient: CERT.br for coordination, with SENAI national IT direct contact.

37.27.248.144:6006 Hetzner Helsinki — 21 Projects Disclosed (HIGH)

Single Phoenix instance on Hetzner Helsinki (AS24940) exposing 21 distinct project names unauthenticated. The largest project-count disclosure in the population. Operator not identified from the Shodan record (no TLS cert, no PTR).

Operator profiling: 21 projects suggests a sophisticated production deployment — either a multi-tenant SaaS offering Phoenix-as-a-service to downstream customers, or a single large engineering org with many parallel LLM applications. Either case elevates concern.

Scaleway Paris Cluster — 7 Instances (MEDIUM-PATTERN)

7 Phoenix instances on Scaleway France (163.172.x, 51.15.x, 51.158.x, 51.159.x), all PROJECTS_UNAUTH, most also USERS_UNAUTH. The IP clustering suggests either a single Scaleway customer running a fleet or several unrelated tenants following a common deployment pattern. Follow-up profiling is needed to determine whether this is a single operator.

Google Cloud US Cluster — 7 Instances (MEDIUM-PATTERN)

7 Phoenix instances on Google LLC (34.x, 35.x), all PROJECTS_UNAUTH. The same single-operator question applies as for the Scaleway cluster. One instance (34.133.205.22) discloses 8 projects.


Geographic Distribution (Findings)

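| Country | PROJECTS_UNAUTH hosts |
| --- | --- |
| United States | 12 |
| France (Scaleway) | 8 |
| Germany (Hetzner / Contabo / IONOS) | 7 |
| Finland (Hetzner) | 1 |
| Brazil (SENAI) | 1 |
| Sweden | 1 |
| Poland | 1 |
| China (Aliyun, UCloud) | 2 |
| Vietnam | 1 |
| India | 1 |
| Indonesia | 1 |
| Canada | 1 |
| Other | 5 |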

Unlike Langfuse and RAGFlow (CN-dominant and CN-second respectively), Phoenix’s population is dominated by the US and Western Europe. This likely reflects Arize’s commercial customer base: primarily US enterprise AI teams using the open-source Phoenix for self-hosted observability alongside Arize’s paid product.


Comparison: Auth Surface Severity

| Platform | Auth signal | Data exposed |
| --- | --- | --- |
| Langfuse | signUpDisabled: false flag | Registration possible; trace data behind workspace auth |
| RAGFlow | registerEnabled: 1 flag | Registration possible; knowledge base behind tenant auth |
| Open WebUI | features.auth: false | Full chat interface, model inference |
| Dify | is_allow_register: true | Registration possible; apps behind workspace auth |
| Phoenix | /v1/projects, /v1/users return data | Direct project + user enumeration; spans potentially accessible |

Phoenix’s exposure model is the most severe: while the other platforms gate the data layer even when registration is open, Phoenix exposes project names and user account metadata unauthenticated, requiring no account creation step at all.
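
To make the contrast concrete, the flag-based signals in the comparison table can be written as predicates over a parsed config payload. This is a sketch only: the field names come from the table, but the payload shapes are assumptions for illustration and are not the platforms' documented schemas.

```python
# Flag-based signals from the comparison table.
# Payload shapes are assumed; check each platform's real config endpoint before relying on them.
SIGNALS = {
    "Langfuse": lambda cfg: cfg.get("signUpDisabled") is False,
    "RAGFlow": lambda cfg: cfg.get("registerEnabled") == 1,
    "Open WebUI": lambda cfg: (cfg.get("features") or {}).get("auth") is False,
    "Dify": lambda cfg: cfg.get("is_allow_register") is True,
}

def is_permissive(platform: str, cfg: dict) -> bool:
    """True if the platform's flag indicates open registration or disabled auth."""
    check = SIGNALS.get(platform)
    return bool(check and check(cfg))

print(is_permissive("Langfuse", {"signUpDisabled": False}))  # True
print(is_permissive("Dify", {"is_allow_register": False}))   # False
```

Phoenix has no entry here by design: its signal is not a flag but the data itself, a populated response from /v1/projects or /v1/users.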


Disclosure Pipeline

| Finding | Tier | Recommended action |
| --- | --- | --- |
| Northeastern University | HIGH (FERPA-class) | oirc@northeastern.edu |
| SENAI Brazil | HIGH (LGPD-class) | CERT.br + SENAI IT |
| 37.27.248.144 (21 projects) | HIGH (operator unknown) | Vendor mediated via Arize |
| Scaleway Paris cluster (7) | MEDIUM | Profiling first, then per-tenant |
| Google Cloud cluster (7) | MEDIUM | Profiling first |
| 41 total PROJECTS_UNAUTH | UPSTREAM | Arize: change default to auth-required for /v1/projects and /v1/users |

The upstream remediation has the highest leverage. Arize Phoenix currently ships with PHOENIX_ENABLE_AUTH defaulting to false, so changing that one default upstream would protect every new or upgraded deployment without per-operator outreach.


Remediation (per-operator)

# Phoenix environment:
PHOENIX_ENABLE_AUTH=true
PHOENIX_SECRET=<strong-random-secret>
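
One way to produce the secret value is Python's secrets module. A minimal sketch follows. The digit and lowercase-letter check reflects an assumption about how Phoenix validates PHOENIX_SECRET and should be confirmed against the documentation for your version; the function name is illustrative.

```python
import secrets
import string

def generate_phoenix_secret(nbytes: int = 32) -> str:
    """Return a random hex string with at least one digit and one lowercase letter."""
    while True:
        candidate = secrets.token_hex(nbytes)  # 2 * nbytes hex characters
        has_digit = any(c in string.digits for c in candidate)
        has_lower = any(c in string.ascii_lowercase for c in candidate)
        if has_digit and has_lower:
            return candidate

if __name__ == "__main__":
    # Paste the printed line into the Phoenix environment configuration.
    print(f"PHOENIX_SECRET={generate_phoenix_secret()}")
```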

Verify:

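# Print only the HTTP status code for an unauthenticated request
curl -s -o /dev/null -w "%{http_code}\n" http://IP:6006/v1/projects
# Expected: 401 or 403. A 200 with a populated data array means the endpoint is still exposed.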
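
The same check can be scripted for an instance you operate. This sketch uses only the Python standard library; the classify and probe names are illustrative and are not part of herald or Phoenix. A 200 response with an empty data array is treated as inconclusive, since on its own it does not show whether auth is enforced.

```python
import json
import urllib.error
import urllib.request

def classify(status: int, body: str) -> str:
    """Classify one unauthenticated response as 'protected', 'exposed', or 'inconclusive'."""
    if status in (401, 403):
        return "protected"
    if status == 200:
        try:
            data = json.loads(body).get("data")
        except (ValueError, AttributeError):
            return "inconclusive"  # not JSON, or JSON that is not an object
        if isinstance(data, list) and data:
            return "exposed"
    return "inconclusive"

def probe(base_url: str, path: str = "/v1/projects", timeout: float = 5.0) -> str:
    """Fetch one endpoint without credentials and classify the response."""
    try:
        with urllib.request.urlopen(base_url + path, timeout=timeout) as resp:
            return classify(resp.status, resp.read().decode("utf-8", "replace"))
    except urllib.error.HTTPError as err:
        return classify(err.code, "")
    except OSError:
        return "inconclusive"  # connection refused, timeout, DNS failure
```

Calling probe("http://IP:6006") and then probe("http://IP:6006", "/v1/users") covers both endpoints named in this report.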

Toolchain Provenance

Step 0:    shodan download '"Phoenix" port:6006' (89 records)
Step 0c:   IP extraction → ip-port.txt (89 unique)
Step 1b:   herald -platform phoenix < ip-port.txt
           - probe id projects_unauth: /v1/projects array_nonempty
           - probe id users_unauth: /v1/users array_nonempty
           - probe id health_open: /healthz body_contains "OK"
Step 3v:   Endpoint semantics verified against Arize/phoenix v6.x source
Step 12b:  This document
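
Step 0c can be sketched in a few lines. This assumes the shodan download output is a gzipped JSON-lines file whose banners carry the standard ip_str and port fields; the function names and file paths are illustrative, not the script actually used.

```python
import gzip
import json

def extract_endpoints(lines):
    """Return sorted unique 'ip:port' strings from Shodan JSON-lines banners."""
    endpoints = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        banner = json.loads(line)
        endpoints.add(f"{banner['ip_str']}:{banner['port']}")
    return sorted(endpoints)

def write_ip_port(download_path: str, out_path: str = "ip-port.txt") -> int:
    """Read a gzipped Shodan download and write one endpoint per line; return the count."""
    with gzip.open(download_path, "rt", encoding="utf-8") as fh:
        endpoints = extract_endpoints(fh)
    with open(out_path, "w", encoding="utf-8") as out:
        out.write("\n".join(endpoints) + "\n")
    return len(endpoints)
```

Collecting endpoints in a set deduplicates hosts that appear in more than one banner before the list is handed to herald.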

Tool: herald v0.1.1 (github.com/nuclide-research/herald). Phoenix config added with three probes covering the LLM02-class disclosure surface.


Research Contribution

Phoenix is the third same-day platform survey (after Langfuse and RAGFlow). Unlike the registration-flag findings, Phoenix surfaces a direct data-layer disclosure: the platform’s default deployment exposes the actual data model (projects, users) without any auth step. This maps cleanly to OWASP LLM Top 10 (2025) entry LLM02 Sensitive Information Disclosure, which moved from #6 in the previous list to #2 in the 2025 revision. Enterprise AI deployments leaking observability data publicly, as seen here, are an example of the incident class that entry covers.

The Phoenix population is also the smallest of the three (89 vs 1,140 vs 1,905). A small population is compatible with either a more sophisticated user base that hardens by default or a less mature deployment cycle in which many operators still run development instances unauthenticated. The 74.5% PROJECTS_UNAUTH rate points to the latter.

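The three-platform same-day corpus (Langfuse 88.9%, RAGFlow 87.2%, Phoenix 74.5%) provides an empirical baseline for the auth-permissive-default cohort hypothesis (Candidate Insight #76), keeping in mind that the Langfuse and RAGFlow surveys measured a registration flag while the Phoenix rate measures direct data exposure.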