§ TOPICS
Topics
A production LLM deployment is six layers of infrastructure, not one model. Pick a layer below, then a category inside it, to see every survey, case, and disclosure NuClide has on that platform class. 446 artifacts across 37 categories.
§ Reference topology
An AI/LLM application is nine layers deep.
Drawn as a layer cake from the user down to the public internet. Each layer has its canonical implementations and a per-layer population count from the corpus. Magenta-bordered nodes ship insecure-by-default on at least one popular distribution.
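As a rough illustration of how a per-layer population count could be derived, the sketch below sums the category counts that this page lists for two of the layers. The dictionary layout and the function names are hypothetical, and treating a layer's count as a plain sum of its categories is an assumption: an artifact filed under more than one category would be counted more than once.

```python
from typing import Dict, Optional

# Category counts copied from the listing on this page. None marks a category
# that is shown without a count. The structure itself is an assumption made
# for illustration, not the site's actual data model.
TOPOLOGY: Dict[str, Dict[str, Optional[int]]] = {
    "Agent Layer": {
        "MCP Servers": 17,
        "Browser Agents": 3,
        "Workflow Automation": 11,
        "Agent Frameworks": 34,
        "Voice Agents": 1,
        "Code Agents": 3,
    },
    "Model Layer": {
        "Ollama": 112,
        "vLLM": 8,
        "Triton Inference Server": 3,
        "Speech & Audio": 6,
        "Embedding Servers": None,
        "llama.cpp": 1,
    },
}


def layer_population(layer: str) -> int:
    """Sum the known category counts for one layer, skipping missing ones."""
    return sum(n for n in TOPOLOGY[layer].values() if n is not None)


def layer_populations() -> Dict[str, int]:
    """Return the summed count for every layer in TOPOLOGY."""
    return {layer: layer_population(layer) for layer in TOPOLOGY}


if __name__ == "__main__":
    for name, total in layer_populations().items():
        print(f"{name}: {total}")
```

Categories without a published count are skipped rather than treated as zero, so a total here is a lower bound on what the listing shows.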
Action
Agent Layer
How LLMs reach out and take action: call APIs, browse the web, drive workflows.
MCP Servers (17): Model Context Protocol, tool-calling agents
Browser Agents (3): Headless browsers driven by LLMs
Workflow Automation (11): n8n, Flowise, LLM-native flows
Agent Frameworks (34): LangGraph, AutoGen, CrewAI, multi-agent orchestration
Voice Agents (1): Vapi, Retell, LiveKit Agents, real-time voice + LLM
Code Agents (3): Aider, OpenHands, Continue, SWE-agent
Interface
Application Layer
The surfaces humans actually interact with: chat UIs, notebooks, generation studios.
Routing
Gateway Layer
Routes the request, attaches retrieved context, mediates between user and model.
Inference
Model Layer
The runtime that actually executes the model: where the weights run.
Ollama (112): Local-LLM runtime, no auth by design
vLLM (8): High-throughput batched inference
Triton Inference Server (3): NVIDIA model serving
Speech & Audio (6): Whisper, Piper, RVC, Coqui
Embedding Servers: TEI, Infinity, sentence-transformers
llama.cpp (1): C++ inference runtime, frequently co-deployed on Ollama port :11434
Memory & Knowledge
Data Layer
Vector stores, registries, memory, datasets: what the model knows and remembers.
Vector Databases (26): ChromaDB, Milvus, Qdrant, Weaviate, pgvector
Search Engines (26): Elasticsearch, Solr, Meilisearch, Typesense, Vespa (full-text + vector search)
OLAP / Analytics Backends (8): ClickHouse, Cassandra, ScyllaDB, Pinot (the trace + log + analytics tier under observability)
MLOps Tracking (30): MLflow, W&B, ClearML, Aim, Comet ML (experiment tracking + model registry)
Agent Memory (2): Mem0, Letta, Zep, Motorhead (long-term memory backends)
Data Labeling (4): Label Studio, Argilla, CVAT, Doccano, Prodigy (training-data annotation)
Object Storage (3): MinIO, S3, model & dataset stores
Compute Orchestration (1): RunPod, Ray, Volcano, Kubeflow, SkyPilot
GPU Compute & Telemetry (1): Run:AI, DCGM-exporter, NVIDIA Fleet Command (GPU fleet metrics + scheduling)
Container Orchestration (5): Docker daemon, etcd, Vault, Consul, Portainer, Argo CD (the substrate AI runs on)
Medical / Edge AI (2): DICOM, MONAI, Orthanc, dcm4che, NVIDIA NIM (clinical and edge model serving)
Backup & Snapshots (1): Velero, Restic, Barman, Longhorn
Fine-tuning Runtimes: Axolotl, LLaMA-Factory, Unsloth, torchtune
Document Parsers: Unstructured, LlamaParse, marker, MinerU, Docling
Model Hubs & Registries: HF Hub mirrors, ModelScope, BentoML
Telemetry & Guardrails
Observability & Safety
Tracing, evaluation, and policy enforcement around the entire stack.