Gate Discipline¶

Status: Proven by gates.

Every gate has a pass/fail criterion defined before implementation. Gate tolerances are calibrated once, frozen, and never relaxed after observing results. A gate that "almost passes" has failed. Fix the instrument, not the threshold.

This prevents the most common form of measurement self-deception: adjusting the instrument until it confirms the hypothesis.

Gate	Claim	Test	Evidence	Status
0	Observer does not perturb the system	Bit-identical tokens and logits	Greedy decode, 32 layers	Pass
1	Recomputed attention matches native output	Cosine >= 0.999996, rel L2 <= 0.002759	2240 records, frozen thresholds	Pass
2	No memory leaks under sustained use	50 generations	0.0 MiB VRAM spread, 0.7 MiB CPU RSS growth	Pass
3	Permutation entropy (PE) tracks belief state complexity	Spearman rank correlation between per-head PE and known fractal dimension, targeting L15--21 heads, surviving permutation null	Planned	In progress

Calibration discipline¶

Gate tolerances follow a three-step protocol:

Define pass/fail criteria before implementation. Every gate has a documented criterion in the design spec before any code is written. The criterion is the hypothesis; the test either confirms or falsifies it.
Calibrate once, then freeze. For Gate 1, a single calibration pass runs the adapter across all layers and sequence positions on Mistral-7B-Instruct-v0.2. The worst-case metrics from that pass set the thresholds. Calibration produced 2240 parity records (32 layers x 70 per-layer records across 3 prompt shapes and multiple sequence lengths). The observed worst-case cosine delta from 1.0 was 1.31e-6; the frozen threshold applies 3x headroom (cosine >= 0.999996). The observed worst-case relative L2 was 0.00184; the frozen threshold is 0.002759 (approximately 1.5x headroom). These numbers live in tests/gates/test_gate1_parity.py as module-level constants.
Never relax after seeing task results. If a gate fails against frozen tolerances, the adapter is broken. Fix the adapter or downgrade the capture tier. Adjusting the threshold to make results pass is the single most dangerous thing an instrument builder can do.

Gate metrics vs. diagnostics¶

Not every number produced during validation is a gate metric. The distinction matters:

Gate metrics have frozen thresholds. Every record must pass. A single violation fails the gate.

Metric	Gate	Threshold
Cosine similarity	1	>= 0.999996
Relative L2 error	1	<= 0.002759
Token identity	0	Exact (`torch.equal`)
Logit identity	0	Exact (`atol=0, rtol=0`)
VRAM spread	2	<= 16 MiB
CPU RSS growth	2	<= 128 MiB

Relative L2 formula: \( \frac{\lVert \text{recomputed} - \text{native} \rVert_2}{\lVert \text{native} \rVert_2 + 10^{-12}} \), all comparisons in fp32.

Diagnostics are reported but do not gate. They exist for failure localization:

Max absolute error (Gate 1): per-record, not thresholded.
Pre-o_proj cosine (Gate 1): isolates whether a parity failure originates in the attention computation or in the head-merge/projection path.
Layer drift (Gate 1): per-layer mean cosine across steps. Error should not systematically grow with depth.

Fail-closed semantics¶

"Fail-closed" means: when the instrument detects an anomaly, it stops. It does not interpolate, impute, or degrade gracefully.

aggregate_deltas() raises on non-contiguous step_idx. No silent zero-fill.
compute_mean_delta_matrix() raises on missing layers. No silent zero-fill.
Gate 0 failure halts all downstream work. If the observer perturbs the system, every measurement is suspect.
Gate tests are tagged @pytest.mark.gpu and run via make test-gpu. They require the actual model on actual hardware. Mock tests prove plumbing; gate tests prove the instrument.