Search Engines, Data Layer, NuClide Stack

What it is

Search engines power both the classic full-text retrieval tier (Elasticsearch, Apache Solr, Vespa) and the modern vector-similarity tier that LLM apps lean on for retrieval-augmented generation. The line between the two has blurred since 2022: every mainstream engine (Elastic, OpenSearch, Solr 9, Vespa, Meilisearch, Typesense) now ships dense-vector indices alongside its inverted-index core. Many production RAG pipelines store their LangChain or LlamaIndex document chunks here rather than in a dedicated vector DB.

What goes wrong

Most official Docker images ship with auth off by default; Elasticsearch switched to security-on by default only in 8.0. Everywhere else the operator must opt into security: set xpack.security.enabled=true for Elasticsearch 7.x and earlier, configure Solr’s security.json to enable the basic-auth plugin, or set the Meilisearch master key via the MEILI_MASTER_KEY environment variable. Across population-scale surveys, ~54% of reachable Elasticsearch instances skip the step entirely. Solr’s older Docker tags (solr:7.x and early 8.x) compound the problem with multiple unauthenticated remote-code-execution CVEs: CVE-2019-17558 (Velocity Template SSTI), CVE-2019-0193 (DataImportHandler), and CVE-2019-12409 (JMX-RMI, in the 8.1.1 and 8.2.0 releases). Even before any document is read, index and core names alone disclose the operator’s application schema.
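
As a rough illustration of the opt-in step, the sketch below generates the environment settings an operator might pass to the Elasticsearch and Meilisearch containers. The setting names xpack.security.enabled, ELASTIC_PASSWORD, MEILI_ENV, and MEILI_MASTER_KEY are the ones those images read; the helper name hardening_env and the 32-byte secret length are assumptions made for this example. Solr is left out because its security.json stores a salted credential hash rather than an environment value.

```python
import secrets


def hardening_env(engine: str) -> dict[str, str]:
    """Return container environment settings that turn authentication on.

    Hypothetical helper: only the setting names come from the engines;
    the secret length is an illustrative choice, not a recommendation.
    """
    # 32 random bytes, encoded as a URL-safe string of about 43 characters.
    secret = secrets.token_urlsafe(32)

    if engine == "elasticsearch":
        return {
            "xpack.security.enabled": "true",
            "ELASTIC_PASSWORD": secret,  # password for the built-in elastic user
        }
    if engine == "meilisearch":
        return {
            "MEILI_ENV": "production",  # production mode requires a master key
            "MEILI_MASTER_KEY": secret,
        }
    raise ValueError(f"no environment-based switch sketched for {engine!r}")


if __name__ == "__main__":
    for name in ("elasticsearch", "meilisearch"):
        for key, value in hardening_env(name).items():
            print(f"{name}: {key}={value}")
```

The printed pairs can be passed to the container as environment variables; in practice the secret would be stored in a secrets manager rather than echoed to a terminal.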

How we test

We probe each engine’s identity endpoint (/ for Elasticsearch’s version JSON, /solr/admin/info/system for Solr, /health for Meilisearch and Typesense, /state/v1 for Vespa), confirm the version, and then call the documented listing endpoint (/_cat/indices for Elasticsearch, /solr/admin/cores for Solr, /indexes for Meilisearch, /collections for Typesense). Index and core names are the finding: operators choose names like rag-document-chunks, spring-ai-document-index, entity_vectors, kb_documents_v1. Those names disclose the operator’s application architecture before any document is fetched. We never run free-text queries against an index; the names alone justify the severity claim.
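
The two-step probe can be sketched as a lookup table plus a name filter. This is a minimal illustration, not the survey tooling itself: the paths come from the paragraph above, pairing each listing path with an engine is our reading of that list, and the names ENDPOINTS, AI_HINTS, probe_urls, and flag_names are hypothetical. The keyword list is an assumption chosen to match the example index names. The code only builds URLs and filters names; it sends no requests.

```python
from typing import Iterable, Optional

# (identity path, listing path) per engine, as named in the text.
# No listing path is given for Vespa, so it is None here.
ENDPOINTS: dict[str, tuple[str, Optional[str]]] = {
    "elasticsearch": ("/", "/_cat/indices"),
    "solr": ("/solr/admin/info/system", "/solr/admin/cores"),
    "meilisearch": ("/health", "/indexes"),
    "typesense": ("/health", "/collections"),
    "vespa": ("/state/v1", None),
}

# Hypothetical keyword list for spotting AI-related index or core names.
AI_HINTS = ("rag", "vector", "embedding", "chunk", "kb_", "document")


def probe_urls(base_url: str, engine: str) -> tuple[str, Optional[str]]:
    """Build the identity URL and, where one is known, the listing URL."""
    identity, listing = ENDPOINTS[engine]
    base = base_url.rstrip("/")  # avoid a double slash when joining paths
    return base + identity, (base + listing if listing else None)


def flag_names(names: Iterable[str]) -> list[str]:
    """Return the names that contain any of the illustrative keywords."""
    return [n for n in names if any(hint in n.lower() for hint in AI_HINTS)]


if __name__ == "__main__":
    # 203.0.113.10 is a documentation-only address (TEST-NET-3).
    print(probe_urls("http://203.0.113.10:9200/", "elasticsearch"))
    print(flag_names(["rag-document-chunks", "entity_vectors", "logs-2024"]))
```

A substring match this loose will produce false positives, so flagged names would still need human review before they are reported as a finding.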

Search Engines

Meow / Indexrm Elasticsearch extortion. Three actors. (2026-05-17)

Meow / Indexrm campaign: per-actor census across 4,776 ES hosts

Elasticsearch AI-Stack Population Survey (2026-05-16)

Vector-DB Stragglers Population Survey (2026-05-16)

BI/Dashboard Platforms: Auth Posture Survey

Neo4j, Elasticsearch, Supabase, Redis Stack: AI Infrastructure Exposure Survey

New Vector Storage Survey: QuestDB / Meilisearch / PocketBase / NATS JetStream

SurrealDB, Typesense, and LanceDB: Exposure Survey

Elasticsearch / OpenSearch on Public Cloud: Auth Posture Survey

Unauthenticated ML Training Server — velutina-service.ch

sanctionscanner.com: Turkish AML/KYC Compliance SaaS: 79M KYB Records + Live Client Monitoring Exposed

Cn Gaohe Itgaohe 2026 05 17

Cn Gxota Guangxi Travel Dev 2026 05 17

Cn Hooper Erp 2026 05 17

Cn Timedb 2026 05 17

Cn Torchv Mengjia Zlmediakit 2026 05 17

Cn Woyaodiancan Restaurant Ai 2026 05 17

Cn Xiaoice Demo Virtualhuman 2026 05 17

De Aitalkx Dms Rag 2026 05 17

De Travelm Articles 2026 05 17

Eg Equant Tech Waffarha Lms 2026 05 17

It Isideweb Deskpro 2026 05 17

Np Mohp Hmis Ocl 2026 05 17

Ru Westcall Aicloud Backend 2026 05 17

Sa Tahakum Llm 2026 05 17

Solr 7.6.0 unauth fleet: Aggregate cloud-provider disclosure
