Cloudscockpit.io — AIOps Journal

14

Thu

Field note

The execution gap is the only gap that compounds.

Logged from a call with R., L4 Lion Pilot

Three startups this week described the same failure mode: their model gets smarter every quarter, their workflow doesn't. The bottleneck is never the LLM — it's the operator who refuses to write down what just happened. Crystallise or perish.

action-graph crystallization observation

12

Tue

Research

Steering Small Language Models with Concept-Aligned Sparse Tokens.

arXiv:2505.14217 · Anthropic + DeepMind · 22pp

Patel, J., Nakamura, S., Volkov, A., et al.

"CAST: Concept-Aligned Sparse Token steering for retrieval-grounded SLMs"

doi.org/10.48550/arXiv.2505.14217

A 3B-parameter model steered with CAST tokens matches a 70B baseline on closed-domain retrieval at 1/40th the inference cost. The interesting bit isn't the benchmark; it's that the steering vectors are auditable. You can read what the model is being told to attend to.

This is the architecture the Castle's been waiting for. Steering as a first-class artefact means we can put CAST tokens under version control next to the Action Graph. Audit becomes possible. The Lance Rule becomes enforceable.

CAST slm steering audit

08

Fri

Blog clip

Why most AIOps pilots die in production.

cloudscockpit.io · 9 min read · Self · drafted 05 May

A pilot that ships without a Proof-of-Thought trail is not a pilot — it's a demo. Demos don't compound. The post catalogs six failure modes I see in enterprise rollouts, and the three the Castle protocol is designed to prevent.

aiops production proof-of-thought

02

Sat

Podcast

Latent Space — "The operator decade is already here."

latent.space · Ep. 142 · 1h 04m

Swyx interviews three operators running enterprise-scale Action Graphs. Best bit at 38:20 — the triage nurse who modelled her hospital's after-hours pattern and reduced misroutes by 41%. Lo-fi. Auditable. Unglamorous. Exactly the loop the Castle is built for.

Bookmark the 38:20 timestamp for the Cohort III briefing. The audio clip is the cleanest "operator > engineer" anecdote I've heard this year.

podcast healthcare operators

28

Mon

Pull quote

A model is a snapshot. A graph is a memory. Build the memory.

— A. Karpathy · spoken at AI Engineer Summit, Apr 2026

memory graphs aphorism

22

Tue

Blog clip

The five disciplines of a production AIOps team.

cloudscockpit.io · 14 min read · Self

Briefing, mission, debrief, crystallization, retirement. Five phases that should appear in every AIOps team's weekly cadence. The post argues that omitting retirement is the single largest source of graph rot in enterprise deployments.

cadence retirement team

17

Thu

Research

Consent ledgers for human-in-the-loop AI systems.

arXiv:2504.09102 · MIT CSAIL · 18pp

Rao, V., Bensaïd, M., Liu, K., Ostrowski, P.

"Cryptographic Consent Ledgers for Auditable Agent Operations"

doi.org/10.48550/arXiv.2504.09102

Proposes a Merkle-anchored consent log that survives audits across regulatory regimes. The clever piece is the "default deny on missing consent" rule: agents pause rather than fall back to a permissive policy. This is exactly the Bridge Protocol Allura describes in the charter.

Cite this in the Method doc next revision. The "default deny" framing gives the Lance Rule a name regulators already accept.
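
A minimal sketch of the "default deny on missing consent" behaviour, to make the pause-rather-than-permit idea concrete. It is not the paper's construction: it uses a plain SHA-256 hash chain as a stand-in for Merkle anchoring, and the names ConsentLedger, grant, verify, decide and the sample subject "patient-17" are all hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass, field

GENESIS = "0" * 64  # placeholder hash for the first entry (assumption)


@dataclass
class ConsentLedger:
    """Append-only consent log; each entry commits to the previous one."""
    entries: list = field(default_factory=list)

    def _digest(self, prev_hash: str, record: dict) -> str:
        # Hash the previous entry's hash together with the canonical record.
        payload = json.dumps(record, sort_keys=True).encode("utf-8")
        return hashlib.sha256(prev_hash.encode("utf-8") + payload).hexdigest()

    def grant(self, subject: str, action: str) -> None:
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS
        record = {"subject": subject, "action": action}
        self.entries.append(
            {"record": record, "prev": prev_hash,
             "hash": self._digest(prev_hash, record)}
        )

    def verify(self) -> bool:
        # Recompute the chain; any edited record breaks a later comparison.
        prev_hash = GENESIS
        for entry in self.entries:
            expected = self._digest(prev_hash, entry["record"])
            if entry["prev"] != prev_hash or entry["hash"] != expected:
                return False
            prev_hash = entry["hash"]
        return True

    def decide(self, subject: str, action: str) -> str:
        # Default deny: a broken log or a missing grant both mean "pause".
        if not self.verify():
            return "pause"
        wanted = {"subject": subject, "action": action}
        if any(entry["record"] == wanted for entry in self.entries):
            return "proceed"
        return "pause"


ledger = ConsentLedger()
ledger.grant("patient-17", "route_call")
print(ledger.decide("patient-17", "route_call"))    # consent on record
print(ledger.decide("patient-17", "share_record"))  # no consent on record
```

The design choice worth noting is that decide has no permissive branch: the only way to get "proceed" is an intact log that contains a matching grant.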

consent audit governance

11

Fri

Field note

Three signs an Action Graph is rotting.

Captured during audit of Pod 14, Cohort II

One: the retrieval index hasn't been pruned in 90 days. Two: the highest-frequency node hasn't been touched in 60 days. Three: nobody on the team can name the most recent retirement. When all three are true, the graph is no longer compounding; it's accumulating. Treat it as a brownfield migration, not an incremental update.
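
The three signs can be sketched as a small check, mostly to show that the heuristic is cheap to automate. The 90-day and 60-day thresholds come from the note; the GraphStatus fields and function names are hypothetical, and "nobody can name the most recent retirement" is modelled, as an assumption, by a missing date.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class GraphStatus:
    last_index_prune: date
    top_node_last_touched: date
    # None stands for "nobody on the team can name it" (assumption).
    last_named_retirement: Optional[date]


def rot_signs(status: GraphStatus, today: date) -> List[str]:
    """Return the rot signs that currently apply to the graph."""
    signs = []
    if (today - status.last_index_prune).days >= 90:
        signs.append("retrieval index not pruned in 90 days")
    if (today - status.top_node_last_touched).days >= 60:
        signs.append("highest-frequency node untouched in 60 days")
    if status.last_named_retirement is None:
        signs.append("no nameable recent retirement")
    return signs


def is_rotting(status: GraphStatus, today: date) -> bool:
    # Per the note, the graph counts as rotting only when all three hold.
    return len(rot_signs(status, today)) == 3


today = date(2025, 4, 11)
stale = GraphStatus(date(2024, 12, 1), date(2025, 1, 15), None)
healthy = GraphStatus(date(2025, 3, 20), date(2025, 4, 1), date(2025, 3, 5))
print(is_rotting(stale, today), is_rotting(healthy, today))
```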

graph-rot audit heuristic

06

Sun

Link

DeepMind's "Operator Decade" essay.

deepmind.google · Long-form essay · posted 04 Apr

Recommended without comment. The opening paragraph reframes the entire AGI debate as a question of execution latency, not capability. Required reading before Cohort III orientation.

essay required

02

Wed

Blog clip

Proof-of-Thought: making agent reasoning audit-grade.

cloudscockpit.io · 11 min read · Self

Walks through the Proof-of-Thought protocol the Castle uses to preserve agent reasoning across every step of a workflow. Includes a worked example from a real triage pod and the audit trail it produced after a regulatory review.

proof-of-thought audit walkthrough

27

Fri

Pull quote

The world bends to the one that moves — not the one with the best model.

— from a private letter, attributed to D. Vora

execution aphorism

19

Thu

Research

Retrieval-augmented retirement: pruning policies for long-lived knowledge graphs.

arXiv:2503.11842 · Stanford NLP · 16pp

Chen, Y., O'Sullivan, F., Kapoor, R.

"RAR: When and how to forget in retrieval-augmented systems"

doi.org/10.48550/arXiv.2503.11842

A formal treatment of when to retire nodes from a retrieval graph. The 'staleness × confidence × frequency' policy maps cleanly onto Coran's Maintenance Code. Cohort II should pilot this on Pod 09 next quarter.
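
One way to read the "staleness × confidence × frequency" policy is as a multiplicative retirement score, sketched below for a pilot discussion. The note does not say how the paper combines the three factors, so everything here is an assumption: the normalisation constants, the choice to invert confidence and frequency so that stale, doubtful, rarely used nodes score highest, the 0.25 threshold, and all names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    name: str
    days_since_update: int
    confidence: float     # 0.0 to 1.0, higher means more trusted
    hits_per_week: float  # retrieval frequency


def retirement_score(node: Node, max_age_days: int = 180,
                     max_hits: float = 50.0) -> float:
    """Score in [0, 1]; higher means a stronger retirement candidate."""
    staleness = min(node.days_since_update / max_age_days, 1.0)
    doubt = 1.0 - max(0.0, min(node.confidence, 1.0))
    disuse = 1.0 - min(node.hits_per_week / max_hits, 1.0)
    # Multiplying means any one healthy factor pulls the score toward zero.
    return staleness * doubt * disuse


def retirement_candidates(nodes: List[Node],
                          threshold: float = 0.25) -> List[str]:
    # Highest score first; only nodes at or above the threshold are listed.
    scored = sorted(((retirement_score(n), n.name) for n in nodes),
                    reverse=True)
    return [name for score, name in scored if score >= threshold]


nodes = [
    Node("old-runbook", days_since_update=180, confidence=0.2, hits_per_week=0.0),
    Node("hot-path", days_since_update=1, confidence=0.9, hits_per_week=40.0),
]
print(retirement_candidates(nodes))
```

Because the factors multiply, a node that is old but still heavily retrieved scores zero, which matches the intuition that frequency should protect a node from pruning.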

retirement retrieval pruning

08

Sun

Field note

"Prompt engineering" is to AIOps what "writing SQL" is to a data platform.

Reflection after Cohort III mock-interview week

The skill is necessary but vastly insufficient. Candidates who pitched prompt-engineering as their core competency uniformly under-delivered when asked to design a retrieval policy, a retirement schedule, or an audit trail. Recalibrate the screen.

Add to the operators page: prompt-engineering alone is an anti-pattern. Frame as "the SQL fallacy" in next revision.

hiring screen aiops-maturity