Document 04 · Journal

The AIOps Journal.

A running log of every research paper, blog post, podcast and field note that shapes how the Castle thinks. Updated as I read.

247 entries · since 2024
Sourced from cloudscockpit.io
Updated 15 May 2026
2026 · 05
May '26
04 entries · 2 papers · 1 podcast
14
Thu
Field note

The execution gap is the only gap that compounds.

Logged from a call with R., L4 Lion Pilot

Three startups this week described the same failure mode: their model gets smarter every quarter, their workflow doesn't. The bottleneck is never the LLM — it's the operator who refuses to write down what just happened. Crystallise or perish.

action-graph crystallization observation
12
Tue
Research

Steering Small Language Models with Concept-Aligned Sparse Tokens.

arXiv:2505.14217 · Anthropic + DeepMind · 22pp
Patel, J., Nakamura, S., Volkov, A., et al.
"CAST: Concept-Aligned Sparse Token steering for retrieval-grounded SLMs"
doi.org/10.48550/arXiv.2505.14217

A 3B-parameter model steered with CAST tokens matches a 70B baseline on closed-domain retrieval, at 1/40th the inference cost. The interesting bit isn't the benchmark; it's that the steering vectors are auditable. You can read what the model is being told to attend to.

This is the architecture the Castle's been waiting for. Steering as a first-class artefact means we can put CAST tokens under version control next to the Action Graph. Audit becomes possible. The Lance Rule becomes enforceable.
CAST slm steering audit
02
Sat
Podcast

Latent Space — "The operator decade is already here."

latent.space · Ep. 142 · 1h 04m

Swyx interviews three operators running enterprise-scale Action Graphs. Best bit at 38:20 — the triage nurse who modelled her hospital's after-hours pattern and reduced misroutes by 41%. Lo-fi. Auditable. Unglamorous. Exactly the loop the Castle is built for.

Bookmark the 38:20 timestamp for the Cohort III briefing. The audio clip is the cleanest "operator > engineer" anecdote I've heard this year.
podcast healthcare operators
2026 · 04
April '26
05 entries · 1 paper · 2 blogs
28
Tue
Pull quote

A model is a snapshot. A graph is a memory. Build the memory.

A. Karpathy · spoken at AI Engineer Summit, Apr 2026
memory graphs aphorism
17
Fri
Research

Consent ledgers for human-in-the-loop AI systems.

arXiv:2504.09102 · MIT CSAIL · 18pp
Rao, V., Bensaïd, M., Liu, K., Ostrowski, P.
"Cryptographic Consent Ledgers for Auditable Agent Operations"
doi.org/10.48550/arXiv.2504.09102

Proposes a Merkle-anchored consent log that survives audits across regulatory regimes. The clever piece is the "default deny on missing consent" rule: agents pause rather than fall back to a permissive policy. This is exactly the Bridge Protocol Allura describes in the charter.
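
A minimal sketch of the default-deny idea, for my own reference. It is not the paper's construction: a plain SHA-256 hash chain stands in for the Merkle anchoring, and every name here (`ConsentLedger`, `record`, `decide`, the "proceed"/"pause" strings) is hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass, field

GENESIS = "0" * 64  # assumed placeholder hash for the first entry


def _digest(payload: dict, prev_hash: str) -> str:
    """Hash an entry together with its predecessor's hash."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("utf-8") + body).hexdigest()


@dataclass
class ConsentLedger:
    """Append-only consent log; a hash chain, not a full Merkle tree."""
    entries: list = field(default_factory=list)

    def record(self, subject: str, action: str, granted: bool) -> str:
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS
        payload = {"subject": subject, "action": action, "granted": granted}
        entry = {"payload": payload, "prev": prev_hash,
                 "hash": _digest(payload, prev_hash)}
        self.entries.append(entry)
        return entry["hash"]

    def verify(self) -> bool:
        """Recompute the chain; any edited entry breaks it."""
        prev_hash = GENESIS
        for entry in self.entries:
            if entry["prev"] != prev_hash:
                return False
            if entry["hash"] != _digest(entry["payload"], prev_hash):
                return False
            prev_hash = entry["hash"]
        return True

    def decide(self, subject: str, action: str) -> str:
        """Default deny: pause unless the latest matching entry grants consent."""
        if not self.verify():
            return "pause"  # a tampered ledger is treated as missing consent
        for entry in reversed(self.entries):
            payload = entry["payload"]
            if payload["subject"] == subject and payload["action"] == action:
                return "proceed" if payload["granted"] else "pause"
        return "pause"  # no record at all: pause, never a permissive fallback
```

The design point is the last line of `decide`: the absence of a record produces the same outcome as an explicit refusal.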

Cite this in the Method doc next revision. The "default deny" framing gives the Lance Rule a name regulators already accept.
consent audit governance
11
Sat
Field note

Three signs an Action Graph is rotting.

Captured during audit of Pod 14, Cohort II

One: the retrieval index hasn't been pruned in 90 days. Two: the highest-frequency node hasn't been touched in 60 days. Three: nobody on the team can name the most recent retirement. When all three are true, the graph is no longer compounding; it's accumulating. Treat it as a brownfield migration, not an incremental update.
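
The three signs can be sketched as a check, assuming the audit records two dates and whether anyone could name the last retirement. The names (`GraphSnapshot`, `rot_signs`, `is_rotting`) are hypothetical, and treating "in 90 days" as "90 days or more" is my reading, not a fixed rule.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional

PRUNE_LIMIT = timedelta(days=90)  # sign one: index not pruned
TOUCH_LIMIT = timedelta(days=60)  # sign two: busiest node untouched


@dataclass
class GraphSnapshot:
    last_index_prune: date
    top_node_last_touched: date
    # None when nobody on the team can name the most recent retirement
    last_named_retirement: Optional[str]


def rot_signs(snap: GraphSnapshot, today: date) -> List[str]:
    """Return the signs of rot that currently hold."""
    signs = []
    if today - snap.last_index_prune >= PRUNE_LIMIT:
        signs.append("retrieval index not pruned in 90 days")
    if today - snap.top_node_last_touched >= TOUCH_LIMIT:
        signs.append("highest-frequency node untouched for 60 days")
    if not snap.last_named_retirement:
        signs.append("nobody can name the most recent retirement")
    return signs


def is_rotting(snap: GraphSnapshot, today: date) -> bool:
    """Rotting only when all three signs hold at once."""
    return len(rot_signs(snap, today)) == 3
```

Returning the list of signs rather than a bare boolean keeps the audit readable: a graph showing two of three is worth a note even if it doesn't yet call for a migration.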

graph-rot audit heuristic
06
Mon
Link

DeepMind's "Operator Decade" essay.

deepmind.google · Long-form essay · posted 04 Apr

Recommended without comment. The opening paragraph reframes the entire AGI debate as a question of execution latency, not capability. Required reading before Cohort III orientation.

essay required
2026 · 03
March '26
03 entries · 1 quote · 1 paper
27
Fri
Pull quote

The world bends to the one that moves — not the one with the best model.

— from a private letter, attributed to D. Vora
execution aphorism
19
Thu
Research

Retrieval-augmented retirement: pruning policies for long-lived knowledge graphs.

arXiv:2503.11842 · Stanford NLP · 16pp
Chen, Y., O'Sullivan, F., Kapoor, R.
"RAR: When and how to forget in retrieval-augmented systems"
doi.org/10.48550/arXiv.2503.11842

A formal treatment of when to retire nodes from a retrieval graph. The "staleness × confidence × frequency" policy maps cleanly onto Coran's Maintenance Code. Cohort II should pilot this on Pod 09 next quarter.
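
A rough sketch of what a three-factor retirement score could look like. The note above only names the factors, so the way they combine here is my assumption and not the paper's formula: all three are normalised to 0..1, and a node scores high when it is stale, low-confidence and rarely retrieved. `Node`, `retirement_score`, `nodes_to_retire` and the 0.5 threshold are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    name: str
    staleness: float   # 0..1, where 1 means untouched for a long time
    confidence: float  # 0..1, how much the content is still trusted
    frequency: float   # 0..1, how often the node is retrieved


def retirement_score(node: Node) -> float:
    """Assumed combination: stale AND low-confidence AND rarely used."""
    return node.staleness * (1.0 - node.confidence) * (1.0 - node.frequency)


def nodes_to_retire(nodes: List[Node], threshold: float = 0.5) -> List[str]:
    """Names of nodes whose score meets the (assumed) threshold."""
    return [n.name for n in nodes if retirement_score(n) >= threshold]
```

Because the score is a product, any single healthy factor (a fresh node, a trusted one, or a busy one) pulls it towards zero, which biases the policy towards keeping nodes.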

retirement retrieval pruning
08
Sun
Field note

"Prompt engineering" is to AIOps what "writing SQL" is to a data platform.

Reflection after Cohort III mock-interview week

The skill is necessary but far from sufficient. Candidates who pitched prompt engineering as their core competency uniformly under-delivered when asked to design a retrieval policy, a retirement schedule, or an audit trail. Recalibrate the screen.

Add to the operators page: prompt engineering alone is an anti-pattern. Frame it as "the SQL fallacy" in the next revision.
hiring screen aiops-maturity
End of feed · 12 of 247 shown