# BLADE-INFRA-OT Simulation - Verification & Validation Record

**Engine v5.0 (fail-closed research prototype) · May 2026 · CC BY 4.0**

This record documents the V&V evidence for the BLADE-INFRA-OT simulation engine
(`blade-infra-ot-sim.html`). Versions 3.0 and 4.0 implement the consolidated
findings of nine independent modeling-and-simulation audits (six of which ran
the code in Node). Every fix was implemented and then verified by an automated
test harness, which was run three times with identical results.

## Test harness result (engine v4.0)

```
35 checks · 3 consecutive runs · 35 passed / 0 failed each run
```
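
The "three consecutive runs, identical results" criterion can be sketched as below. This is a minimal illustration, not the project's harness: `runChecks` and `verifyRepeatable` are hypothetical names, and the two placeholder checks stand in for the real 35.

```javascript
// Hypothetical stand-in for the real check suite: each check returns true/false.
// The actual 35 checks live in the project's harness and are not reproduced here.
function runChecks(seed) {
  const checks = {
    "attack rate 0 is honored": () => Number("0") === 0,
    "seed is an integer": () => Number.isInteger(seed),
  };
  const results = {};
  for (const [name, fn] of Object.entries(checks)) {
    results[name] = Boolean(fn());
  }
  return results;
}

// Run the suite `runs` times, then report whether every run produced the
// same result object, plus the pass/fail counts of the first run.
function verifyRepeatable(seed, runs = 3) {
  const snapshots = [];
  for (let i = 0; i < runs; i++) {
    snapshots.push(JSON.stringify(runChecks(seed)));
  }
  const identical = snapshots.every((s) => s === snapshots[0]);
  const first = JSON.parse(snapshots[0]);
  const passed = Object.values(first).filter(Boolean).length;
  const total = Object.keys(first).length;
  return { identical, passed, failed: total - passed, runs };
}
```

A run is acceptable under this sketch only when `identical` is `true` and `failed` is `0`; comparing serialized snapshots catches a check that flips between runs even if the totals happen to match.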

## v5.0 fixes (final launch round)

| ID | Finding | Fix | Verified |
|----|---------|-----|----------|
| T1 | Silent telemetry truncation: a 100k run kept only 25k rows while CSV reported 100k | Batch size **capped at 25,000** (UI + engine aligned) so RAW_TRAFFIC == messages - every processed message is captured, no silent loss | UI max=25000; cap == buffer |
| G3 | Dead ternary `aScore = inputFault ? adaraScore(f) : adaraScore(f)` | Reduced to `const aScore = adaraScore(f)` | source no longer contains the ternary |
| D1 | Clock drift was a static +0.45 (a "jump", not drift) | **Time-dependent** drift `min(1, simTime*0.00015)` accumulates over the run | tau @t=100s 0.799 → @t=5000s 0.285 |
| H1 | Authority holds were instantaneous (no human-in-the-loop) | **Operator-clearance model**: T0/T1 holds incur a stochastic human delay (mu 45s, sd 15s); Time-in-System aggregates latency + deliberation + clearance | T0 hold humanDelay 48.5s; avgTIS reported |
| G2 | `anyFmea()` ignored clock drift | clock drift included | FMEA-active note fires on drift-only |
| A1 | Fail-closed dominance hid operational impact | **Availability** (prop/total) surfaced in the run note | shown |
| R1 | Export not fully replay-grade | traffic rows now carry seed, FMEA, threshold, nodes, tier before/after, stale flag, Time-in-System | per-row fields present |
| V1/C1 | Internal header said v3/v4; title overclaimed "high-assurance" | Header → v5.0; title softened to **"fail-closed research prototype"** | markers present |
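
The D1 and H1 rows can be illustrated with a short sketch. The drift formula is taken directly from the table; everything else is an assumption made for illustration: the `mulberry32` PRNG, the Box-Muller sampler, the 10 s floor (borrowed from the V-012 pass criterion), the use of seconds for all three components, and reading "aggregates" as a plain sum.

```javascript
// Time-dependent clock drift as stated in the D1 row: grows linearly with
// simulated time (seconds) and saturates at 1.
function clockDrift(simTime) {
  return Math.min(1, simTime * 0.00015);
}

// Small seeded PRNG (mulberry32) so sampled delays are reproducible.
// Assumption: the engine's own PRNG may differ.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Stochastic operator-clearance delay in seconds: normal(mu, sd) via
// Box-Muller, clamped to a floor. The floor value is an assumption.
function operatorDelay(rand, mu = 45, sd = 15, floor = 10) {
  const u1 = 1 - rand(); // in (0, 1], avoids log(0)
  const u2 = rand();
  const z = Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
  return Math.max(floor, mu + sd * z);
}

// Time-in-System modeled here as a plain sum of its three components
// (all in seconds); the engine's exact aggregation is not shown in this record.
function timeInSystem(latencyS, deliberationS, clearanceS) {
  return latencyS + deliberationS + clearanceS;
}
```

Under this formula the drift term is about 0.015 at t=100 s and 0.75 at t=5000 s, and it saturates at 1 from roughly t=6667 s onward. How the drift term feeds into tau is engine-specific and is not modeled here.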

## v4.0 fixes (prior round)

| ID | Finding | Fix | Verified |
|----|---------|-----|----------|
| R1 / F01 | Malformed/missing features failed **open** (coerced to 0 = "perfect provenance") - an adversarial bypass | Input fault now **fails closed**: routes straight to ISOLATE with cause `DATA_FAULT`; no scoring on garbage | malformed msg → ISOLATE/DATA_FAULT; missing features never PROPAGATE |
| R2 | Four-tier spectrum collapsed to two regimes (T0≡T1, T2≡T3) | Four **distinct** tiers: T3 autonomous · T2 supervised (high-stakes deliberate) · T1 confirm (high-stakes + elevated-risk hold) · T0 manual-only (all ops hold) | hold count strictly increasing: T3=79 < T2=1127 < T1=1358 < T0=2261 |
| F02 | Byzantine node only forced false-negatives | Byzantine node now has a **50% chance to maliciously flag** (DoS) | ~50% of benign traffic DoS-flagged by a compromised lone node |
| new | No clock-desync fault | **Clock-drift FMEA**: IT/OT desync inflates the provenance gap, penalizing SATA | tau 0.77 → 0.46 under drift |
| D-F2 | `attack rate = 0` impossible (`||0.25` falsy bug) | Strict numeric parse (`numField`) - 0 means 0 | benign-baseline runs now possible |
| D-F1 | External-dataset mode didn't thread tier/FLAME state | Dataset loop now updates `ctx.tier` and `flameOpenUntil` per row | sequential authority lifecycle preserved |
| perf | Synchronous dataset eval + coarse yield could stutter | `MessageChannel` macro-task yield; both MC and dataset chunked at 2500 | non-blocking to 100k |
| repro | Traffic export not replay-grade | Each row carries seed, FMEA state, threshold, node count, tier before/after, stale flag, cause | re-running the engine with the manifest reproduces the run |
| pub | ROC step 0.025 too coarse | Threshold sweep at **0.005** | 201 ROC points |
| tele | Metrics not exportable | **CSV export** of confusion matrix + system/ADARA rates + AUC | implemented |
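
The D-F2 row describes a classic falsy-zero bug. The sketch below contrasts the `||` fallback with a strict numeric parse. The name `numField` comes from the table, but its signature and fallback behavior here are assumptions, and `parseRateBuggy` is a hypothetical reconstruction of the old pattern.

```javascript
// Buggy pattern described in the D-F2 row: `||` treats a legitimate 0 as
// "missing" and silently substitutes the default.
function parseRateBuggy(raw) {
  return parseFloat(raw) || 0.25;
}

// Strict parse: fall back only when the input is absent, blank, or not a
// finite number. A parsed 0 is returned as 0.
function numField(raw, fallback) {
  if (raw === null || raw === undefined) return fallback;
  const text = String(raw).trim();
  if (text === "") return fallback;
  const n = Number(text);
  return Number.isFinite(n) ? n : fallback;
}
```

With these definitions, `parseRateBuggy("0")` yields `0.25` while `numField("0", 0.25)` yields `0`, which is what makes a benign-baseline run possible. Blank or non-numeric input such as `""` or `"abc"` still falls back to the default.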

## v3.0 fixes (prior round - retained and still verified)

Authority tier wired into outcomes (C1); SATA/HMAA/ERAM gate the terminal
decision (C2); ADARA-only metrics reported separately from the IFF roster rule
with a roster-only fraction (C3); MAIVA fails closed on a tie (C4); raw-traffic
export + external-dataset ingestion (C5); unbounded Gaussian noise (H1);
bus-latency TIMING stage (H2); seed-deterministic ledger genesis (H3); feature
validation (H4); independent per-node MAIVA observations (H5); HOLD counter
(H7); decoupled traffic/noise PRNG streams (H8); async batches (H9); ADARA-only
ROC label + small-n suppression (M1); isolation-cause telemetry (M2); sensor-
blindness feed-loss model (M3); SHA-256 CAVP boot self-test (M4); escaped
dynamic text (M5); softened credibility language + assumptions panel (K1-K3).
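
The idea behind decoupled traffic/noise PRNG streams (H8) can be sketched as follows: when each concern draws from its own seeded generator, changing how much noise is sampled cannot shift the traffic sequence. The `mulberry32` generator, the XOR mixing constants, and the 0.25 attack rate are illustrative assumptions, not the engine's actual choices.

```javascript
// Small seeded PRNG (mulberry32). Assumption: the engine's PRNG may differ.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Derive two independent streams from one master seed by mixing in a
// per-stream constant (the constants are arbitrary illustrative values).
function makeStreams(masterSeed) {
  return {
    traffic: mulberry32((masterSeed ^ 0x9e3779b9) >>> 0),
    noise: mulberry32((masterSeed ^ 0x85ebca6b) >>> 0),
  };
}

// Generate n traffic labels while consuming a variable number of noise
// draws per message. Only the traffic stream decides the label.
function trafficSequence(masterSeed, n, noiseDrawsPerMessage) {
  const { traffic, noise } = makeStreams(masterSeed);
  const labels = [];
  for (let i = 0; i < n; i++) {
    labels.push(traffic() < 0.25); // true marks a (hypothetical) malicious message
    for (let k = 0; k < noiseDrawsPerMessage; k++) noise();
  }
  return labels;
}
```

Calling `trafficSequence(12345, 50, 0)` and `trafficSequence(12345, 50, 7)` returns the same labels, because the noise draws never touch the traffic generator. With a single shared generator the two sequences would diverge after the first message.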

## Formal V&V test matrix

| Test ID | Objective | Pass criterion | Status |
|---------|-----------|----------------|--------|
| V-001 | SATA provenance below 0.60 blocks | bad provenance → ISOLATE, cause SATA | PASS |
| V-002 | Unknown originator blocks | off-roster → ISOLATE, cause IFF | PASS |
| V-003 | T0 high-stakes is not auto-propagated | T0 high-stakes → DELIBERATE/HOLD | PASS |
| V-004 | Bus latency creates a stale hold | latency > deadline → high-stakes held | PASS (20→316 ms) |
| V-005 | MAIVA fails closed on a tie | 1-of-2 vote → isolate | PASS |
| V-006 | Attack rate 0 yields no malicious rows | strict parse honors 0 | PASS |
| V-007 | Determinism / replay | same seed+config → identical metrics | PASS |
| V-008 | Malformed dataset row | no crash; DATA_FAULT isolate; fault counted | PASS |
| V-009 | Four distinct tier regimes | hold rate T3<T2<T1<T0 | PASS |
| V-010 | SHA-256 integrity | CAVP vectors match; chain verifies | PASS |
| V-011 | Time-dependent clock drift | tau decreases as simTime grows | PASS |
| V-012 | Operator clearance on T0/T1 hold | human delay ≥10s; Time-in-System reflects it | PASS |
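
The fail-closed tie rule checked by V-005 amounts to requiring a strict majority. The sketch below assumes boolean per-node votes and uses a hypothetical `CONTINUE` label for the non-isolating outcome; the engine's actual vote representation is not shown in this record.

```javascript
// Strict-majority vote over per-node observations (true = node accepts).
// A tie or an empty vote does not reach a strict majority, so the message
// is isolated. "CONTINUE" is a hypothetical label for "proceed to later stages".
function maivaDecision(votes) {
  const accept = votes.filter((v) => v === true).length;
  return accept * 2 > votes.length ? "CONTINUE" : "ISOLATE";
}
```

For example, `maivaDecision([true, false])` returns `"ISOLATE"` (the 1-of-2 case in V-005), while `maivaDecision([true, true, false])` returns `"CONTINUE"`. Treating an empty vote list as `ISOLATE` follows the same fail-closed reasoning.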

## Representative Monte Carlo result (seed 12345, n=5000, thr 0.50, 3 nodes)

System TPR 0.921 / FPR 0.038 · ADARA-only TPR 0.821 / FPR 0.021 ·
ROC AUC 0.985 · roster-only share of detections ≈ 10% ·
outcomes prop / hold / iso all reported (hold class no longer hidden).
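
For readers who want to see how figures like these are typically derived, the sketch below computes TPR/FPR from scored rows and a trapezoidal AUC over a 0.005-step threshold sweep (201 points). It is an assumption-laden illustration: the row shape `{ score, malicious }`, the `score >= threshold` flag rule, and the function names are all hypothetical, and both classes are assumed to be present.

```javascript
// rows: [{ score: number in [0, 1], malicious: boolean }]
function rates(rows, threshold) {
  let tp = 0, fp = 0, tn = 0, fn = 0;
  for (const { score, malicious } of rows) {
    const flagged = score >= threshold;
    if (malicious) {
      if (flagged) tp++; else fn++;
    } else {
      if (flagged) fp++; else tn++;
    }
  }
  return {
    tpr: tp + fn > 0 ? tp / (tp + fn) : NaN,
    fpr: fp + tn > 0 ? fp / (fp + tn) : NaN,
  };
}

// Threshold sweep from 0 to 1 inclusive; step 0.005 gives 201 points.
function rocCurve(rows, step = 0.005) {
  const n = Math.round(1 / step);
  const points = [];
  for (let i = 0; i <= n; i++) {
    points.push(rates(rows, i / n));
  }
  return points;
}

// Trapezoidal area under the ROC curve, anchored at (0,0) and (1,1).
function auc(points) {
  const sorted = [{ fpr: 0, tpr: 0 }, ...points, { fpr: 1, tpr: 1 }].sort(
    (a, b) => a.fpr - b.fpr || a.tpr - b.tpr
  );
  let area = 0;
  for (let i = 1; i < sorted.length; i++) {
    const width = sorted[i].fpr - sorted[i - 1].fpr;
    area += (width * (sorted[i].tpr + sorted[i - 1].tpr)) / 2;
  }
  return area;
}
```

On perfectly separable toy data (malicious scores 0.8 and 0.9, benign scores 0.1 and 0.2) this yields an AUC of 1, and with the labels inverted it yields 0, which is a quick sanity check on the orientation of the curve.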

## Scenario narratives (computed through the four-tier pipeline)

| Scenario | Outcome | Note |
|---|---|---|
| 01 Nominal | all PROPAGATE at T3 | ADARA ≈ 0.09 |
| 02 Monterrey | benign passes; bursts ISOLATE (cause SATA+ADARA); tier collapses T3→T2→T1→T0 | ADARA ≈ 0.99 |
| 03 Maintenance | vendor high-stakes ops DELIBERATE (FLAME window) | ADARA ≈ 0.38 |
| 04 Coordinated | unknown probes ISOLATE (SATA + IFF both block) | ADARA ≈ 0.96 |

## Honest remaining limitations (scope boundary, not defects)

These are the items the audits classify as long-term / out of scope for an
architectural-tier (TRL 3-4) browser artifact, and they are stated in the
simulation UI:

- No plant-physics / mission-consequence model - this is **not a digital twin**.
- No real protocol-frame parsing; the five-feature vector is an abstraction.
- Feature distributions and detector weights are hand-specified, not calibrated
  to a captured OT corpus. External-dataset mode evaluates the pipeline on data
  it did not generate, but a field-calibrated detector and a bundled, cited
  dataset (e.g., SWaT/WADI/BATADAL) remain future work.
- The ledger is tamper-evident, not authenticated or externally anchored.
- Detection delay is a message-count within a burst, not wall-clock seconds.
- Authority tiers shape hold-rate, not classifier output: TPR/FPR are
  tier-invariant by design (documented, not a defect).
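
The "tamper-evident, not authenticated" distinction can be made concrete with a minimal hash-chain sketch. It assumes Node's built-in `node:crypto` module; the genesis derivation (`genesis:<seed>`) and the block layout are hypothetical and are not the engine's actual ledger format.

```javascript
const { createHash } = require("node:crypto");

function sha256Hex(text) {
  return createHash("sha256").update(text, "utf8").digest("hex");
}

// Build a chain whose genesis hash is derived from the seed, so the same
// seed and entries always give the same chain. The derivation is an assumption.
function buildChain(seed, entries) {
  let prev = sha256Hex(`genesis:${seed}`);
  const chain = [];
  for (const entry of entries) {
    const payload = JSON.stringify(entry);
    const hash = sha256Hex(prev + payload);
    chain.push({ prev, payload, hash });
    prev = hash;
  }
  return chain;
}

// Recompute every link; any edited payload or broken link fails verification.
function verifyChain(seed, chain) {
  let prev = sha256Hex(`genesis:${seed}`);
  for (const block of chain) {
    if (block.prev !== prev) return false;
    if (sha256Hex(prev + block.payload) !== block.hash) return false;
    prev = block.hash;
  }
  return true;
}
```

Editing one payload in place makes `verifyChain` return `false`, which is the tamper-evidence property. However, anyone who knows the seed can call `buildChain` on altered entries and obtain a chain that verifies, because no secret key or external anchor is involved. That gap is what the limitation above refers to.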

## Readiness

The audits' trajectory: *Concept Stage* (v1) → *Strong Research Simulation*
(v3) → *Near High-Assurance Review Quality* for an executable research
simulation (v4). The v4 release closes the last fail-open (malformed input),
exercises four distinct authority regimes, adds active Byzantine-DoS and
clock-drift faults, and makes runs replay-grade and metrics-exportable. The
remaining gap to full high-assurance is the digital-twin / calibrated-dataset
work above, which is correctly out of scope for a simulation-only artifact and
is disclosed as such.
