Meridian
context-aware incident intelligence
How Meridian turns raw detection telemetry into source-bound incident narratives using a decoupled state/compute architecture, the ContextSync Protocol, and a Gemini 3 reasoning agent.
01 The problem
Modern security operations centers receive tens of thousands of detections per day across SIEM, EDR, identity, network, and cloud sources. Each detection arrives as a discrete row in a query tool. The analyst's job is to convert those rows into an answer the business cares about: what happened, what does it mean, who is affected, and what should we do.
That conversion is mostly manual. The analyst pivots between queries, takes notes in a ticket, traces dependencies in a runbook, and writes a narrative for the executive team. The work product — the incident write-up — is the only deliverable that matters, and it's produced last, slowly, by hand.
We argue that the gap between "the alert fired" and "the executive understands the incident" is a problem of state, not of compute. The events exist. The relationships between them exist. The dependency graph exists. What's missing is a substrate that unifies them so a reasoning agent can compose the narrative directly.
02 Thesis: state and compute, decoupled
Most agent products today put state and compute in the same place: the model's conversation buffer, plus a vector store for retrieval. This collapses several distinct concerns — provenance, versioning, multi-actor permissions, audit, retention — onto a substrate that wasn't designed for them.
Meridian inverts this. We treat the state of what the organization knows as a first-class protocol-governed artifact store. The agent reasons over it. The Control Center surfaces the result. The compute layer can be swapped (today Gemini 3, tomorrow whatever wins) and the state layer doesn't move. The surface layer can be rebuilt and the agent doesn't notice. This is the architectural pattern we believe wins the next decade of operational AI products.
03 Architecture
Meridian organizes its work into five stacked layers. Each layer has a stable contract with the layers above and below it. Internal evolution of any one layer does not require coordination with the others.
Integration
Source-system connectors. The current implementation uses the Model Context Protocol (MCP) to talk to a Splunk Enterprise tenant for telemetry and to a MongoDB Atlas cluster for the artifact store. MCP gives us a stable tool interface that the reasoning agent can consume natively, and lets us add or swap sources without changing the agent.
Protocol
ContextSync Protocol v0.2 — the rules that govern every artifact Meridian writes: URIs, versioning, content addressing, default-deny permissions, immutable provenance. Detailed in §04.
Persistence
MongoDB Atlas. Three collections matter: artifacts (every event ever ingested, ~7,800 today), agent_memory (every investigation the agent has completed), provenance (the immutable read/write log). A fourth collection, entity_graph, holds dependency relationships used for blast-radius traversal. All artifacts are stamped with a Unified Spatiotemporal Coordinate (§05) and indexed via Atlas Vector Search on 768-dimensional embeddings produced by nomic-embed-text-v1.5.
Compute
Gemini 3.1 Pro (preview) via Vertex AI. The model has direct access to the persistence layer via two MCP servers (one for MongoDB, one for Splunk) routed through mcpToTool(). Agent code uses an explicit investigate(triggerArtifactUri) entry point that walks a seven-step procedure described in §06.
Surface
The Meridian Control Center — a Next.js 16 / React 19 application. Server Components query MongoDB directly on every request. Real-time updates flow via change streams. UI design is deliberately CISO-grade: calm, source-cited, no dashboards-for-the-sake-of-dashboards.
04 ContextSync Protocol
ContextSync is the substrate every artifact in Meridian lives on. The contract is small, intentional, and stable.
URIs
Every artifact has a globally unique identifier of the form ctx://{org}/{domain}/{id}. The org field scopes the artifact to a tenant. The domain partitions by kind (splunk-events, investigations, entities, compliance). The id is the content-addressable hash of the artifact body. URIs are stable; payloads are immutable for a given URI. New versions get new URIs.
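As an illustration, a minimal sketch of how a content-addressed ctx:// URI could be minted. The hash algorithm (SHA-256) and the helper name are our assumptions, not part of the v0.2 spec text quoted here:

```ts
import { createHash } from "node:crypto";

// Mint a ctx:// URI for an artifact body. A real implementation would
// canonicalize the body (stable key ordering) before hashing; plain
// JSON.stringify is enough to show the shape of the scheme.
function mintContextUri(org: string, domain: string, body: unknown): string {
  const id = createHash("sha256").update(JSON.stringify(body)).digest("hex");
  return `ctx://${org}/${domain}/${id}`;
}

// A changed body hashes to a new id, hence a new URI; the old URI keeps
// pointing at the old, immutable payload.
const uri = mintContextUri("acme", "splunk-events", {
  sourcetype: "pan:traffic",
  action: "blocked",
});
```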
Versioning + provenance
Updates produce a new content-addressed artifact rather than mutating in place. Each write is appended to the provenance log with actor, operation, artifact URI, and timestamp. This is what powers the "source-bound" guarantee — every claim the agent surfaces in an investigation can be traced back to a specific read or write, and the audit trail is non-repudiable.
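A sketch of what one provenance append could look like against the Atlas provenance collection. Only the four fields named above come from the protocol description; the document shape, database name, and helper name are assumptions:

```ts
import { MongoClient } from "mongodb";

// One immutable entry in the append-only provenance log.
interface ProvenanceEntry {
  actor: string;        // ctx:// URI of the acting agent or user
  operation: "read" | "write" | "publish";
  artifactUri: string;  // ctx:// URI that was touched
  at: Date;             // UTC timestamp of the operation
}

// Provenance is append-only: every access is a fresh insert, never an update,
// so the audit trail cannot be rewritten after the fact.
async function recordProvenance(client: MongoClient, entry: ProvenanceEntry): Promise<void> {
  await client.db("meridian").collection<ProvenanceEntry>("provenance").insertOne(entry);
}
```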
Permissions
Default-deny. Actors must hold an explicit grant for read, write, or publish on a given URI prefix. Grants are themselves artifacts and follow the same versioning + provenance rules.
Specification
Full protocol spec lives at github.com/metisos/contextsync-protocol. Meridian implements the v0.2 surface.
05 Unified Spatiotemporal Coordinate
The USC is a 7-field tuple that locates every artifact in space and time with measurable uncertainty. It is how Meridian decides which events are related across noisy, distributed sources.
USC = ⟨ s, t, σs, σt, π, τ, e ⟩

s   spatial coordinate (host, region, service, asset URI)
t   temporal coordinate (ISO-8601 UTC, with sub-millisecond resolution)
σs  spatial uncertainty (Gaussian std; topology-aware)
σt  temporal uncertainty (Gaussian std; clock-skew aware)
π   provenance reference (ctx:// URI of the producing actor)
τ   tier label (cognitive · temporal · spatial)
e   embedding (768-d nomic-embed-text-v1.5)
Cross-tier match formula
Two artifacts are considered candidates for causal linkage when the Gaussian-product score across their spatial and temporal coordinates exceeds a threshold:

C(p, Q) = exp(−d_s² / (2 · r_s²)) · exp(−d_t² / (2 · r_t²))

where d_s and d_t are the spatial and temporal distances between the candidate p and the query Q, and r_s / r_t are the match-bandwidth parameters of the query; in the worked example below, r_t² is the sum of the two artifacts' temporal variances. A score of 1.0 means perfect co-location; the agent treats any link below 0.7 as too weak to chain.
Worked example
For two events on the same host (d_s = 0) separated by 16 seconds (d_t = 16 s), with temporal uncertainties of 5 seconds each, the score collapses to a one-dimensional Gaussian:

d_s = 0     → exp(0) = 1.000
d_t = 16 s  → exp(−256 / (2 · (25 + 25))) = exp(−2.56) ≈ 0.077
C = 1.000 · 0.077 = 0.077   (below threshold, no link)
Tighten either uncertainty parameter and the link strengthens. The agent surfaces these scores per step in the Correlation sub-tab of an incident, so the user can audit which events were chained on what evidence.
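For concreteness, a small sketch of the score computation as defined above; the type and helper names are illustrative, not Meridian's actual implementation:

```ts
// Minimal slice of two USC stamps needed for cross-tier matching: a distance
// along each axis plus the combined (query + candidate) bandwidth on that axis.
interface MatchInput {
  spatialDistance: number;   // d_s, in topology units
  temporalDistance: number;  // d_t, in seconds
  spatialBandwidth: number;  // r_s
  temporalBandwidth: number; // r_t
}

const LINK_THRESHOLD = 0.7; // links scoring below this are not chained

// Gaussian-product score C(p, Q): 1.0 at perfect co-location, decaying as the
// candidate drifts away from the query in space or time.
function crossTierScore(m: MatchInput): number {
  const spatial = Math.exp(-(m.spatialDistance ** 2) / (2 * m.spatialBandwidth ** 2));
  const temporal = Math.exp(-(m.temporalDistance ** 2) / (2 * m.temporalBandwidth ** 2));
  return spatial * temporal;
}

// Worked example from above: same host, 16 s apart, σt = 5 s on both sides,
// so r_t² = 25 + 25 = 50 → score ≈ 0.077, below the 0.7 threshold.
const score = crossTierScore({
  spatialDistance: 0,
  temporalDistance: 16,
  spatialBandwidth: 1,
  temporalBandwidth: Math.sqrt(50),
});
const linked = score >= LINK_THRESHOLD; // false
```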
06 Agent loop
The reasoning agent has one canonical entry point — investigate(triggerArtifactUri) — and walks a seven-step procedure to produce a complete investigation record:
- Fetch trigger. Read the artifact at triggerArtifactUri from the persistence layer.
- Recall. Vector-search agent_memory for similar past investigations.
- Causal chain. Walk backward through USC-matched artifacts to assemble the sequence of events that led here.
- Blast radius. Traverse the entity_graph outward from the trigger's root entity, categorizing hits as infrastructure, business, or compliance.
- Hypothesis. Compose a root-cause hypothesis grounded in the chain.
- Actions. Surface a prioritized action list with severity and reversibility annotations.
- Persist. Write the full investigation record to agent_memory, append to provenance, and notify the surface layer via change streams.
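A sketch of the shape of this loop. Only investigate() and the seven step names come from Meridian; the helper signatures are hypothetical:

```ts
type Artifact = Record<string, unknown>;
type PastInvestigation = Record<string, unknown>;
type EntityHit = Record<string, unknown>;
type Action = Record<string, unknown>;

// Hypothetical step helpers -- each returns what it produced so the final
// investigation record can cite the evidence by ctx:// URI.
declare function readArtifact(uri: string): Promise<Artifact>;
declare function recallSimilar(a: Artifact): Promise<PastInvestigation[]>;
declare function walkCausalChain(a: Artifact): Promise<Artifact[]>;
declare function traverseBlastRadius(a: Artifact): Promise<EntityHit[]>;
declare function composeHypothesis(chain: Artifact[], recall: PastInvestigation[]): Promise<string>;
declare function proposeActions(hypothesis: string, blast: EntityHit[]): Promise<Action[]>;
declare function persistInvestigation(record: unknown): Promise<string>; // returns the new ctx:// URI

async function investigate(triggerArtifactUri: string): Promise<string> {
  const trigger = await readArtifact(triggerArtifactUri);     // 1. fetch trigger
  const recall = await recallSimilar(trigger);                // 2. recall similar past investigations
  const chain = await walkCausalChain(trigger);               // 3. causal chain via USC matching
  const blast = await traverseBlastRadius(trigger);           // 4. blast radius over entity_graph
  const hypothesis = await composeHypothesis(chain, recall);  // 5. root-cause hypothesis
  const actions = await proposeActions(hypothesis, blast);    // 6. prioritized, annotated actions
  return persistInvestigation({                               // 7. persist + notify via change streams
    triggerArtifactUri, chain, blast, hypothesis, actions,
  });
}
```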
Agentic loop with meta-tools
For free-form questions (the Meridian Agent surface), we use the Claude-style agentic loop pattern with four meta-tools: search_tools, list_tools, list_tool_details, call_tool. This keeps the agent's context window small — we never load the full MCP tool catalog into the prompt. The agent discovers tools on demand.
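One way to wire the four meta-tools, assuming a connected MCP client from @modelcontextprotocol/sdk. The dispatcher shape is ours; only the four meta-tool names come from Meridian:

```ts
import { Client } from "@modelcontextprotocol/sdk/client/index.js";

// Dispatch a meta-tool call from the model. The full MCP catalog never enters
// the prompt; the model asks for what it needs, when it needs it.
async function handleMetaTool(mcp: Client, name: string, args: Record<string, unknown>) {
  switch (name) {
    case "list_tools": {
      // Names only -- cheap to return, cheap to read.
      const { tools } = await mcp.listTools();
      return tools.map((t) => t.name);
    }
    case "search_tools": {
      // Substring match over names and descriptions.
      const { tools } = await mcp.listTools();
      const q = String(args.query ?? "").toLowerCase();
      return tools
        .filter((t) => `${t.name} ${t.description ?? ""}`.toLowerCase().includes(q))
        .map((t) => t.name);
    }
    case "list_tool_details": {
      // Full schema for one tool, loaded only once the model has picked it.
      const { tools } = await mcp.listTools();
      return tools.find((t) => t.name === args.tool);
    }
    case "call_tool":
      return mcp.callTool({ name: String(args.tool), arguments: args.arguments as Record<string, unknown> });
    default:
      throw new Error(`unknown meta-tool: ${name}`);
  }
}
```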
Splunk-native tool calls
The Meridian Agent surface attaches the Splunkbase MCP server (app 7931) to Gemini via mcpToTool(). When the CISO asks a search-shaped question, Gemini autonomously translates it into SPL, executes the search against live Splunk via the MCP transport, and renders both the SPL query and the result table in the chat response. Examples:
- "Show the top 5 sourcetypes by event count over the last 30 days" → agent emits
| tstats count by sourcetype | sort -count | head 5, executes, returns the actual table. - "Find proxy events with response_code >= 500" → agent constructs a bounded SPL search with proper time range and surfaces the results.
If the MCP server is unreachable, the agent falls back to emitting the SPL block for the user to run manually via a Run in Splunk button. The chat never hard-breaks on infrastructure issues.
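A minimal sketch of the attachment, assuming the mcpToTool() helper exported by the @google/genai SDK and a connected MCP client for the Splunk server; the endpoint URL, project details, and model id are illustrative:

```ts
import { GoogleGenAI, mcpToTool } from "@google/genai";
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StreamableHTTPClientTransport } from "@modelcontextprotocol/sdk/client/streamableHttp.js";

// Connect to the Splunk MCP server, then hand it to Gemini as a native tool.
const splunk = new Client({ name: "meridian", version: "0.1.0" });
await splunk.connect(new StreamableHTTPClientTransport(new URL("https://splunk.example.com/services/mcp")));

const ai = new GoogleGenAI({ vertexai: true, project: "my-project", location: "us-central1" });

// Gemini decides when a question is search-shaped, writes the SPL itself,
// calls the MCP tool, and returns both the query and the result table.
const response = await ai.models.generateContent({
  model: "gemini-2.5-pro", // swap in the deployed Gemini model id
  contents: "Show the top 5 sourcetypes by event count over the last 30 days",
  config: { tools: [mcpToTool(splunk)] },
});
console.log(response.text);
```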
Multimodal input
The composer accepts up to four files per message (images, PDFs, DOCX, plain text; 10 MB each, 20 MB total). Images and PDFs pass through to Gemini natively as inline binary parts. DOCX is parsed server-side via mammoth.js and attached as a text part. The agent reads attachments and reasons about them alongside the investigation casebook — a screenshot of a Slack alert, a prior post-mortem, a network diagram, all become first-class context.
07 Data pipeline
End-to-end flow for a single detection:
Detection source → Splunk Enterprise (8089 REST, 8088 HEC)
→ Splunk MCP server (/services/mcp, Splunkbase app 7931)
→ Ingest worker (ContextSync wrap + USC stamp)
→ MongoDB Atlas (artifacts collection)
↓
embedding (nomic v1.5)
↓
Atlas Vector Search index
↓
available to agent recall

Today the demo console runs against a self-hosted Splunk Enterprise 10.2.3 instance with five seeded incident archetypes (cascading failure, auth brute-force, privilege escalation, data exfiltration, DDoS surge). The artifact count is ~7,800. Investigations in agent_memory count three end-to-end runs at the time of writing. Storage footprint is ~60 MB against the Atlas M0 free-tier 512 MB cap, leaving generous room for production load.
Why MCP
We chose MCP over ad-hoc REST integration because it gives Gemini native tool routing via mcpToTool() in the @google/genai SDK. The agent doesn't need a custom adapter per source — every MCP server is a uniform interface. Adding a new detection source (Sentinel, CrowdStrike Falcon, GuardDuty) becomes installing its MCP server and granting the agent read permission.
08 Cognitive memory
The agent has three tiers of memory, all backed by ContextSync artifacts:
- Cognitive — what the agent has figured out. Stored in agent_memory. Surfaced as the Casebook tab and as the "Similar past investigations" panel.
- Temporal — when things happened. Stored as the t and σt fields of every USC stamp.
- Spatial — where things live. Stored as the entity_graph with edges encoding dependency.
Recall
Every new investigation begins with a vector-search over agent_memory using a 768-d cosine index. The agent treats matching past investigations as evidence for or against its current hypothesis. This is how Meridian learns from its own work without retraining the model.
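A sketch of what the recall query could look like as an Atlas $vectorSearch aggregation; the database, collection, index, and field names beyond agent_memory are assumptions:

```ts
import { MongoClient } from "mongodb";

// Recall: nearest past investigations by cosine similarity over 768-d embeddings.
async function recallSimilarInvestigations(client: MongoClient, queryEmbedding: number[]) {
  return client
    .db("meridian")
    .collection("agent_memory")
    .aggregate([
      {
        $vectorSearch: {
          index: "agent_memory_vector", // assumed index name
          path: "embedding",            // assumed field holding the 768-d vector
          queryVector: queryEmbedding,
          numCandidates: 200,           // ANN candidate pool before exact scoring
          limit: 5,                     // top matches handed to the agent as evidence
        },
      },
      { $project: { hypothesis: 1, score: { $meta: "vectorSearchScore" } } },
    ])
    .toArray();
}
```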
Hybrid retrieval
For the human-facing "similar past investigations" surface in Incidents, Meridian runs hybrid retrieval against MongoDB Atlas — a $vectorSearch over the 768-d cosine index in parallel with a $text BM25 query over the hypothesis field — and fuses the two ranked lists with Reciprocal Rank Fusion (k = 60). Hits that appear high in either lane win. Both raw scores are shown next to the fused score in the UI so the analyst can audit the ranking instead of trusting a black-box similarity number.
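The fusion step itself is small. A sketch of Reciprocal Rank Fusion over the two ranked lists, with k = 60 as stated above; the hit shape is an assumption:

```ts
interface RankedHit { id: string; score: number; }

// Reciprocal Rank Fusion: each lane contributes 1 / (k + rank) per document,
// so a hit ranked highly in either the vector lane or the BM25 lane wins even
// if the other lane scores it poorly.
function reciprocalRankFusion(lanes: RankedHit[][], k = 60): Array<{ id: string; fused: number }> {
  const fused = new Map<string, number>();
  for (const lane of lanes) {
    lane.forEach((hit, index) => {
      const rank = index + 1; // ranks are 1-based
      fused.set(hit.id, (fused.get(hit.id) ?? 0) + 1 / (k + rank));
    });
  }
  return [...fused.entries()]
    .map(([id, score]) => ({ id, fused: score }))
    .sort((a, b) => b.fused - a.fused);
}

// Usage: fuse the $vectorSearch lane and the $text BM25 lane (each already
// sorted best-first), then show both raw scores beside the fused score in the UI.
// const ranked = reciprocalRankFusion([vectorHits, bm25Hits]);
```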
09 Surface
The Control Center is built to be read by a Chief Information Security Officer at 7am with a coffee, not by an L1 SOC analyst hunting for a needle. That constraint drives every design decision:
- One posture above the fold. Stable, Elevated, or Critical — with the narrative answer underneath.
- Source-bound everything. Every claim in every surface carries a ctx:// URI you can click to read the underlying evidence — citation pills inline in chat, canvas, and incident detail.
- Meridian Agent is primary. The CISO talks to the agent in plain English. It executes Splunk searches itself via MCP, summarizes the rows, and cites every claim. Multimodal — accepts attached screenshots, PDFs, and Word docs inline.
- Canvas for deliverables. When the response is a written incident report — RCA, exec brief, weekly summary — it streams into a Claude-style canvas with copy / download as Markdown / export to PDF (via browser print pipeline) / export to Word (.docx).
- Incident drill-down with four lenses. Each incident in the feed opens to four sub-tabs: Detail (root-cause hypothesis, causal chain, blast radius, recommended actions), Correlation (real C(p,Q) match scores computed live from stored USC tuples — see §05), Risk graph (radial blast-radius visualization, click any entity to drill in), and Provenance (bipartite trace of the agent's actual reads and writes from the immutable provenance log).
- Risk Map (org-wide). Force-directed graph of every entity in entity_graph with incident heat overlay. Toggle between Category view and Compliance lens (PCI-DSS / SOC 2 / GDPR / HIPAA / ISO 27001 umbrellas derived by BFS from compliance entities).
- Replay the agent loop. A Replay button on every incident streams the seven-step investigate() procedure as named events with per-step durations and the evidence each step touched. Transforms the product from a summary tool into a visible, auditable reasoning system.
- Confidence with receipts. Every confidence pill opens a popover decomposing the number into causal chain coherence, recall match strength, and action grounding components with explicit weights — no black-box percentages.
- Cross-incident patterns. Pattern chips on each incident surface relationships across the casebook: same root entity, archetype cascade in 48h, entity overlap, severity burst.
- Modular workspaces. Each console is an isolated workspace bound to one detection source. The lobby is the entry point; production sources are first-class peers of the demo, not afterthoughts.
10 Results
Numbers from the live demo console at the time of writing: ~7,800 artifacts ingested, three end-to-end investigations in agent_memory, ~60 MB of Atlas storage used, and on the order of 20 seconds from trigger artifact to a finished, source-bound investigation.

The 20-second figure deserves context. The manual equivalent — an analyst writing an executive incident summary from raw Splunk results, with sources cited — is measured in hours. Meridian collapses the work, not the rigor: every claim in the resulting narrative is bound to a specific artifact URI that the analyst (or auditor) can open.
11 Acknowledgements
Meridian is built on the shoulders of giants. The reasoning agent runs on Google Gemini 3.1 Pro via Vertex AI. The persistence layer is MongoDB Atlas with Vector Search. Detection telemetry is sourced from Splunk Enterprise via the Splunk MCP server (Splunkbase app 7931). Embeddings come from nomic-embed-text-v1.5 (Apache 2.0). The surface is Next.js 16 + React 19 + TypeScript 5.7.
Meridian itself is open source under Apache 2.0 at github.com/metisos. The ContextSync Protocol specification lives at github.com/metisos/contextsync-protocol.
Built by Christian Johnson at Metis Analytics, Saint Louis, Missouri.