
Multi-host case May 3, 2026

Brazilian Banking-Compliance AI Consultant: Unauthenticated Qdrant with BCB / LGPD Methodology Corpus

Sector
Commercial

NuClide Research · 2026-05-03


Summary

A Qdrant instance on a DigitalOcean VPS exposes an unauthenticated endpoint with a collection schema consistent with a RAG-backed legal casework or compliance investigation platform. Collections include investigation_data, case_drafts, messages, attachments, sessions, and compliance_knowledge.

DCWF KSAT coverage

Auto-derived from DCWF AI work-role rule files (ksat-tag).

  • 672 (AI Test & Evaluation Specialist): K7003, S7068, S7075, T5904
  • 733 (AI Risk & Ethics Specialist): K7040, T5854, T5868
  • overlap (Common AI KSATs (all 5 roles)): K108, K1157, K1158, K22, K6311, K6900, K7003, K942

Re-probe update (2026-05-03): the instance has populated since the initial 2026-04-26 discovery. Three of the six collections now hold records:

Collection              2026-04-26   2026-05-03
messages                         0           57
compliance_knowledge             0           37
sessions                         0           10
investigation_data               0            0
case_drafts                      0            0
attachments                      0            0

The platform is now in active use. Volume is light (consistent with pilot or single-customer usage), but real records are present. The messages and compliance_knowledge collections are the most likely to contain operator/customer PII or proprietary regulatory content. NuClide has not extracted payload contents; only the existence of the collections and their record counts have been confirmed.
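
The change between the two probes can be tallied with a short sketch. The counts are copied from the table above; the dictionary layout and the function name summarize are illustrative choices, not part of NuClide's tooling.

```python
# Record counts per collection as (2026-04-26, 2026-05-03), taken from the table.
counts = {
    "messages": (0, 57),
    "compliance_knowledge": (0, 37),
    "sessions": (0, 10),
    "investigation_data": (0, 0),
    "case_drafts": (0, 0),
    "attachments": (0, 0),
}


def summarize(counts):
    """Return (names of collections that gained records, total new records)."""
    populated = [name for name, (before, after) in counts.items() if after > before]
    total_new = sum(after - before for before, after in counts.values())
    return populated, total_new


populated, total_new = summarize(counts)
print(populated, total_new)
# → ['messages', 'compliance_knowledge', 'sessions'] 104
```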


Infrastructure

Field              Value
IP                 167.172.120.218
Hosting            DigitalOcean
Open port          6333 (Qdrant, public, unauthenticated)
Likely function    RAG-based legal/compliance investigation tool
Discovery date     2026-05-03
Disclosure status  Pending

Collections

Collection            Inferred Purpose
investigation_data    Primary case investigation records
case_drafts           In-progress legal or compliance case documents
messages              Communication records (internal or case-related)
attachments           Document attachments associated with cases
sessions              User/agent session state
compliance_knowledge  RAG knowledge base, compliance regulations, policies

Findings

The Qdrant REST API at http://167.172.120.218:6333 requires no credentials. Collections are enumerable and scrollable without authentication. The collection schema maps directly to a legal case management workflow:

  • investigation_data + case_drafts → active casework content
  • attachments → supporting documents (contracts, evidence, filings)
  • messages → communications tied to cases
  • compliance_knowledge → embedded regulatory corpus used for RAG retrieval
  • sessions → session-scoped state, potentially including user identity

All collections returned no records during the initial probe. An empty state does not reduce severity: the access control gap persists regardless of data load, and any future write to these collections is immediately accessible to unauthenticated clients.

F2: Schema Confirms High-Sensitivity Data Classification (HIGH)

The collection naming is not generic. case_drafts and investigation_data indicate a workflow where legal strategy, evidence summaries, or compliance violation findings are being embedded and stored. If this platform is used by law firms, compliance officers, or enterprise legal teams, the exposed data would typically carry:

  • Attorney-client privilege
  • Work-product doctrine protection
  • Regulatory confidentiality obligations (GDPR Art. 9, sector-specific rules)
  • Trade secret status for compliance findings prior to remediation

The risk is asymmetric: the schema reveals exactly what will be present if the instance is populated, allowing targeted re-probe.

F3: No Tenant Isolation Visible at Schema Level (MEDIUM)

No per-tenant namespacing is evident in the collection names. A single-tenant deployment (one organization, one instance) would still be fully exposed. A multi-tenant deployment (multiple organizations’ cases in one Qdrant instance) would be catastrophic: every tenant’s investigation data would be accessible to any single probe.
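
One common way to add tenant isolation at the application layer is to tag every point with a tenant identifier and force every query through a filter on that field. The sketch below builds a request body for Qdrant's scroll endpoint (POST /collections/{name}/points/scroll) under that assumption. The payload field name tenant_id and the function name are hypothetical; nothing in the observed schema indicates how, or whether, the operator tags points. A filter of this kind complements database-level authentication and does not replace it.

```python
import json


def tenant_scoped_scroll_body(tenant_id: str, limit: int = 10) -> dict:
    """Build a scroll request body restricted to a single tenant.

    Assumes each point carries a `tenant_id` payload field (hypothetical name).
    """
    if not tenant_id:
        # Fail closed: never fall back to an unfiltered query.
        raise ValueError("tenant_id is required; refusing to build an unscoped query")
    return {
        "filter": {"must": [{"key": "tenant_id", "match": {"value": tenant_id}}]},
        "limit": limit,
        "with_payload": True,
    }


print(json.dumps(tenant_scoped_scroll_body("org-a"), indent=2))
```

Routing all reads through one helper like this keeps the tenant check in a single place, so a forgotten filter becomes an exception instead of a cross-tenant read.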


Remediation

Enable Qdrant authentication before loading any case data:

# config.yaml
service:
  api_key: <strong-random-key>
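
A value for the <strong-random-key> placeholder can be produced with Python's standard secrets module. This is a minimal sketch; the 32-byte length and the function name generate_api_key are assumptions, not requirements stated by Qdrant or by this report.

```python
import secrets


def generate_api_key(num_bytes: int = 32) -> str:
    """Return a URL-safe random token drawn from the OS CSPRNG."""
    return secrets.token_urlsafe(num_bytes)


key = generate_api_key()
# Print a fragment that can be pasted into config.yaml.
print(f"service:\n  api_key: {key}")
```

The generated key should be stored in a secrets manager or a restricted config file, not committed to source control.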

Restrict port 6333 to the application subnet via firewall. Do not expose vector database endpoints to the public internet. If this is a multi-tenant deployment, enforce collection-level or namespace-level isolation at the application layer in addition to database-level auth.
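
After applying the changes, an operator can confirm that the instance they run now rejects unauthenticated requests. The sketch below sends a single GET /collections without credentials and interprets the HTTP status. The base URL is a local placeholder, the function names are illustrative, and treating 401 or 403 as "auth enforced" is an assumption, since the exact rejection code may vary by Qdrant version. It is intended only for instances the reader operates.

```python
import urllib.error
import urllib.request


def classify_status(status: int) -> str:
    """Interpret the HTTP status of an unauthenticated request."""
    if status in (401, 403):
        return "auth enforced"
    if status == 200:
        return "EXPOSED: unauthenticated request succeeded"
    return f"inconclusive (HTTP {status})"


def check_own_instance(base_url: str, timeout: float = 5.0) -> str:
    """Send one credential-free GET /collections to an instance you operate."""
    req = urllib.request.Request(f"{base_url}/collections", method="GET")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return classify_status(resp.status)
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses arrive as exceptions; the code is still meaningful.
        return classify_status(err.code)
    except OSError:
        # Covers URLError, refused connections, and timeouts.
        return "unreachable (consistent with the port being firewalled)"


if __name__ == "__main__":
    # Placeholder address: replace with the operator's own deployment.
    print(check_own_instance("http://127.0.0.1:6333"))
```

Run from outside the application subnet, an "unreachable" result is the expected outcome once the firewall rule is in place; run from inside it, "auth enforced" is the expected outcome.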


Disclosure

  • Discovered: 2026-05-03
  • Status: Pending; operator not identified; collections empty at time of initial probe (2026-04-26)
  • Action: Re-probe warranted. If populated on follow-up, escalate immediately; legal casework data is among the highest-impact breach classes in enterprise AI deployments.