2026-04-29 · OPERATING DOCTRINE · BY RICO ALLEN

The Hardseal Learning Loop: if the system can fail the same way twice, the system has not learned.

Most engineering teams treat bugs like rain — something that happens, you mop it up, you move on. That works for a while. It does not work for systems that ship cryptographic evidence to defense customers, where a single repeated failure signals that the discipline behind the product is theater. This is how we structure learning at Hardseal so that the codebase compounds judgment instead of just shipping fixes.

The core law

We have one operating law that sits above every other rule in the company:

If the system can fail the same way twice, the system has not learned.

That sentence is doing a lot of work. It is a refusal — we will not file the same incident twice. It is also a target — every incident must produce something that makes the failure structurally impossible to repeat, not just unlikely. That something is rarely the patch. The patch is the start. The product primitive that comes out of the patch is the finish.

Five stages

The loop has five stages. Every incident worth learning from passes through all five.

1 — Incident

Something failed in a way that matters. Severity tiers: S0 (false claim risk to a customer), S1 (delivery break — a customer cannot use the artifact we shipped), S2 (evidence integrity gap — the proof can be tampered without detection), S3 (automation friction), S4 (cosmetic). The triage decides the SLA: S0/S1/S2 get a 48-hour clock. S3/S4 get scheduled.

Every entry is filed in MODEL_FAILURE_LOG.md with a 15-field schema that is non-negotiable. The schema is the discipline. Skipping a field is a smell.

2 — Hot-fix

We patch the immediate failure. This is the cheapest, fastest stage and the one most teams stop at. Stopping here is how the same bug shows up next quarter, in a slightly different shape, and the team thinks it is a new bug because the symptoms look different. They are not new. They are the same bug, wearing a hat.

3 — Deterministic check (with negative test)

This is the stage that separates compounding teams from amnesiac teams. After the patch, we add a deterministic check that fails before the patch and passes after. Then — and this is the part most teams skip — we add a negative test: a test that constructs the exact failing input and asserts the check catches it.

A patch without a negative test is faith. A patch with a negative test is engineering. The negative test is what proves the check actually checks the thing the bug exploited, instead of checking some adjacent thing that happened to fail in the same direction.

// EXAMPLE A banned-phrase scanner skips an "F-A-A-approved" overclaim because someone typed a Unicode non-breaking space between letters. The patch is to NFKC-normalize the string before scanning. The negative test pastes the exact Unicode-spaced string into a packet and asserts the scanner catches it. Without the negative test, the next person to refactor the scanner can silently break the fix and never know.

4 — Product primitive (with class-of-failure scan)

The deterministic check is local. The product primitive is structural. After the check is in place, we ask: what is the class of failure this incident represents, and how many other places in the system are vulnerable to the same class?

That question almost always returns more than one site. The Unicode-spacing bug above is not just about banned phrases — it is about every place in the codebase that compares strings without normalization. The class-of-failure scan catches all those sites. The product primitive is a normalized-string-compare helper used everywhere, with the negative test attached.

Without this stage, you fix the symptom and miss the disease. With this stage, the fix prevents an entire family of bugs you have not seen yet.

5 — Archive

The incident is logged, the check is in CI, the primitive is in the codebase, and the entry is closed in the failure log with the resolution path. Old entries are not deleted. They become institutional memory. New engineers learn the company's failure history by reading the log, not by repeating it.

A subtle but load-bearing piece: stale-rule deletion. When a rule was added because of a failure, and the failure mode no longer applies (because the underlying primitive made it impossible), we delete the rule. Codebases that only accumulate rules become unmaintainable. Codebases that delete obsolete rules stay sharp. The deletion is part of the loop, not a separate cleanup project.

What this gives us

After enough rotations of the loop, the codebase looks different from a codebase of the same size that only ships features. There are fewer config flags, because failures that used to require a flag became impossible. There are fewer special cases, because the class-of-failure scans collapsed them into shared primitives. There is more dead-simple code, because every bug that survived a rotation produced a structural simplification.

Most importantly: when a customer asks "how do you handle X attack," the answer is not a slide deck. The answer is the negative test that catches X, and the line of code that exists because of it.

Where this came from

This is not original. It is borrowed from aerospace flight test, from nuclear submarine reactor operations, from the Toyota production system. The phrase "if the system can fail the same way twice, it has not learned" is mine, but the structure is older than I am. What is unusual is applying it inside a software company that ships to defense customers, where the failure cost is high and the temptation to ship a hot-fix and move on is extreme.

We refuse the temptation. The loop is the product behind the product.

How to start using this in your own team

You do not need our entire failure-log schema. Start with three things:

Every incident gets a negative test. Not a regression test that asserts the fix. A test that constructs the failing input and asserts the patch catches it. Different framing, much sharper signal.
Every incident asks the class-of-failure question. "Where else in the codebase could this same bug shape exist?" Answer the question even if the scan returns zero. Document the zero.
Every quarter, delete a rule that is no longer load-bearing. If you cannot find one, your loop is not deleting things, which means it is also not learning.

One more thing

The loop applies to language too. Every customer-facing claim Hardseal makes is filtered by the same discipline. We ran a claim-language audit across 21 customer-facing artifacts when we wrote the first version of this doctrine. Zero forbidden positive claims. 20 of 21 carried explicit "does NOT certify" disclaimers. The audit was the deterministic check. The pre-commit hook that scans every commit for the same banned phrases is the product primitive. The whole rotation, applied to language instead of code.

Same loop. Same law. Same compounding.

// THE PRACTICE The next time something breaks in your system: patch it, then write the negative test, then ask the class-of-failure question, then look for one stale rule to delete. That is one rotation. Do twelve of those, in a row, with no exceptions. The codebase that comes out the other side will be unrecognizable.

← Test what we fly. Fly what we test. Hardseal as the evidence interface →