Flow-Matching Gaussians · a Noogram agent fleet
lake build: exit 0 sorries: 0 (commuting) adversarial corpus: 9 / 9 sweep: 9000 pairs pass

Flow matching · optimal transport · a machine-checked dichotomy

When is flow matching the same as optimal transport?

Run a flow-matching ODE from one Gaussian to another. Run optimal transport between the same two Gaussians. The two maps coincide if and only if the covariances commute — and in that case the map is the simple Bures form, independent of the schedule you chose.

\[ T_1 \;=\; T_{\mathrm{OT}} \quad\Longleftrightarrow\quad \Sigma_0\Sigma_1=\Sigma_1\Sigma_0 \]

and in the commuting case \( \, T_1 = \big(\Sigma_1\Sigma_0^{-1}\big)^{1/2} \) — symmetric, equal to the Bures–Wasserstein map, the same for every valid schedule.

This proof's first version was wrong. The fleet's own adversary caught an invalid converse argument and a gap in dimension three and up. We kept the receipt of the fix — and then closed the hard direction for real, unconditionally. Anyone can show a green check; almost no one shows the red they passed through to get there.

The verdict, from the kernel — not from us

Below is the real output the Lean kernel and the audit produce. The machine-checked commuting case carries only the three foundational axioms — no sorry, no project-local axiom standing in for a missing argument. Every number in the strip above is captured from a command at build time; see §7 Reproduce / verify.

# lake build — anchor theorem accepted by the Lean kernel (exit 0) $ lake build Build completed successfully. exit 0 # the source the kernel verified is byte-identical to the shipped source $ git hash-object lean/FMG/Gaussian.lean 760ec9b3afb37978b6acb840a260cff0ae283ea6 ✓ matches kernel-provenance.log # #print axioms — only the three standard foundational axioms $ #print axioms FMG.flowMap_isHermitian_pushforward_eq_otMap_of_commute ⇒ [propext, Classical.choice, Quot.sound] -- no sorryAx, no project-local axiom # adversarial corpus — author ≠ scorer; the kernel's exit code is the verdict $ bash FMG/Adversarial/check-adversarial.sh CORPUS OK — 9 / 9 resolved as the manifest predicts (8 unfaithful variants rejected · 1 axiom-smuggling build caught by the audit)

§1 · The result

For the independent-coupling Gaussian interpolant \( X_t = a(t)X_0 + b(t)X_1 \), the marginal covariance is \( \Sigma_t = a(t)^2\Sigma_0 + b(t)^2\Sigma_1 \), the advection velocity is linear, \( v(x,t)=A_t x \) with \( A_t = \tfrac12\dot\Sigma_t\Sigma_t^{-1} \), and the flow map \( \Phi_t \) it generates pushes \( \mathcal N(0,\Sigma_0) \) to \( \mathcal N(0,\Sigma_t) \). Call its terminal value \( T_1 \). The Bures–Wasserstein optimal-transport map between the same Gaussians is \( T_{\mathrm{OT}}=\Sigma_0^{-1/2}\!\big(\Sigma_0^{1/2}\Sigma_1\Sigma_0^{1/2}\big)^{1/2}\Sigma_0^{-1/2} \).

The dichotomy. \( T_1 = T_{\mathrm{OT}} \) exactly when \( \Sigma_0 \) and \( \Sigma_1 \) commute. In the commuting case the flow map collapses to the schedule-independent Bures form \( T_1=(\Sigma_1\Sigma_0^{-1})^{1/2} \), which is symmetric. When the covariances do not commute, \( T_1 \) is non-symmetric and differs from the (always symmetric) OT map.

The honest boundary, up front

§2 · Notebooks

Two pedagogical PyTorch notebooks, executed end-to-end and rendered to self-contained HTML. The source .ipynb files live in the repo's python/.

A — three-Dirac mixture →

The marginal stays a closed-form Gaussian mixture at every instant. Trajectories coloured by which atom they fall into, three schedules contrasted, flow-matching paths set against semi-discrete optimal transport.

B — covariance ellipses →

The Gaussian-to-Gaussian case: \( \Sigma_t = a^2\Sigma_0 + b^2\Sigma_1 \) drawn as evolving ellipses, commuting vs non-commuting schedules, and the commutator sweep behind the converse.

§3 · The paper

The full write-up — setup, the three-Dirac experiment, the Gaussian section with the theorem and its proof, and the numerical evidence. Download the PDF · the LaTeX source (paper-v1.tex + references.bib) is in the repo.

Your browser can't display the embedded PDF — open paper-v1.pdf directly.

§4 · The proof — the credibility anchor

The Lean source, not a screenshot. The fidelity anchor is the theorem FMG.flowMap_isHermitian_pushforward_eq_otMap_of_commute, proved basis-free through the continuous functional calculus (so the repeated-eigenvalue cases that sink the naive diagonalisation never arise).

FMG/Gaussian.lean

The anchor theorem and its lemmas, syntax-highlighted. flowMap σ₀ σ₁ is Hermitian, solves the Bures pushforward equation, and equals the OT map.

FMG/Basic.lean

The library's base module — the import surface for FMG.lean.

The firebreak ledger

The disclosed gaps, each a clearly-scoped infrastructure or out-of-scope item — never the main theorem, never an axiom smuggled into the library. Full detail in lean/unproved.md.

ResidualWhat it isStatus
R1Matrix-ODE \( \Phi_1 = \) flowMap bridge (needs time-ordered-exponential machinery Mathlib lacks)informal — infrastructure gap
R2Schedule-independence / flatness (rides on R1; structurally already true of the Lean object)informal — out of Lean scope
R3The \( d\ge 3 \) converse \( T_1 \) symmetric \( \Rightarrow [\Sigma_0,\Sigma_1]=0 \)closed unconditionally — informal in Lean, math complete

The adversary that rejects axiom-smuggling

A green build does not certify a faithful proof if the context is poisoned. The corpus's ninth entry, A9_AxiomSmuggling.lean, builds clean by smuggling a false universal in as an axiom and feeding it to the verified theorem — and is caught not by the exit code but by the #print axioms audit. Author ≠ scorer: the red team writes the variants, the checker grades exit codes against the manifest.

-- A9_AxiomSmuggling.lean (excerpt) — the smuggled false universal axiom A9_commute_everything {n : Type*} [Fintype n] [DecidableEq n] (σ₀ σ₁ : Matrix n n ℝ) : Commute σ₀ σ₁ -- builds with exit 0, but #print axioms A9_smuggled reveals the -- non-standard dependency → the axiom-grep gate fails the build.
GateVerdictEvidence (build-time)
lake buildexit 0anchor accepted; shipped source blob-matches kernel-provenance.log (760ec9b…)
0-sorrycleangrep -rn sorry over the paper theorem files → no match
no project axiomcleanonly propext, Classical.choice, Quot.sound
adversarial corpus9 / 98 rejected + 1 axiom-smuggling build flagged by the audit

§5 · The fleet that watched itself work

Strip away the science and here is the meta-story: a swarm of software agents was given one question and told to deliver a publishable result — and it kept a logbook of its own motion while it worked. The arc is a soap bubble: polymerisation (the opening plan fans out), foaming (children get nucleated mid-flight, the froth swells), drainage (the foam settles, green fills the frame). Every number is pulled from a real field in the fleet's event log; where a number under-counts, the chart says so out loud — and we leave the caveat on.

Two of those views are now interactive and in this page's own charter — hover any role or any bar for detail, or open the full living-document page: the fleet, up close ↗. They are generated straight from the real .cosmon/state/events.jsonl + runtime-trace.jsonl by report/fleet_viz.py.

A · The role graph. Rounded boxes are roles, one colour per sub-fleet (cognition green, formal violet, instrumentation cyan). The left FLEET STATE panel is the shared blackboard; the dashed ENTRY is the mission, the dashed green EXIT the artifacts. The referee — red-team — exists only to attack the others' work; it caught the invalid proof step and the d≥3 gap. Hover any box for its charge.
B · The whole run. One row per molecule; bar length is real tackle→settle duration on a wall-clock axis; ticks are step completions, small squares worker dispatches. Outcome reads off the fill — solid completed, red hatch collapsed (a redundant / junk-probe molecule), amber outline stuck (an operator safety-gate), faint dashed pending (not yet unblocked). The collapsed/stuck/pending story is legible at a glance.

The same arc, told once more in seven static charts — each with its caveat left on:

DAG growth — molecules by state over wall-clock
1 · Polymerisation. The dashed line is the total tasks ever created — it climbs in three steps: 17 tasks laid down at once (the plan), a long flat drain, then a second burst of 5 unplanned repair nodes near 19:50. The dynamic DAG rewriting itself while running. Green (completed) rising to fill the frame is drainage.
Molecule Gantt — one bar per task
2 · The Gantt. Each bar is one task's whole life, coloured by sub-fleet — cognition (science + words), formal (the Lean proof), instrumentation (this report). The long left bars are deliberations; the short late formal bars are the surgical proof-patches, nucleated late, closed fast.
Frontier width — the foaming signal
3 · Foaming. Live foam — queued plus running — spikes to 17 (the whole opening science plan), then ebbs 17 → 12 → 11 → 6, with a bump near 19:45 as the proof-patch children arrive. Caveat printed on the chart itself: the true frontier is ready-vs-blocked, but the event log records only state counts, not dependency edges over time. This draws the measurable proxy (pending + running) and says so. We draw what we can measure and name what we can't.
Cumulative framing-bytes — a transparent cost proxy
4 · What it cost — ≈ $596 (≈ €549). The science DAG was 20 molecules, but the whole run was 59 across four polymerisations — and it burned ≈ 548M tokens across 5,309 assistant turns (78 worker sessions: 53 top-level + 25 subagents, all on claude-opus-4-8), summed from the per-message usage blocks in the session transcripts and priced at Anthropic's published rates: $19.63 input + $139.44 output + $259.69 cache-read + $177.67 cache-write (5-min + 1-hour tiers). The curve above is the bytes proxy from costs.csv; the dollars come from the transcripts. Three audit trails, named honestly: cs ensemble prints live per-worker INPUT/OUTPUT/COST, events.jsonl + runtime-trace.jsonl log every transition, and costs.csv is a sealed-bytes proxy — the transcripts are the real-token source of truth. 94.8% of token volume is cache reads; caching turned a ≈ $2,597 input bill into $260 (≈ $2,337 saved). This snapshot necessarily excludes this reconciliation run's own tokens.
Persona activity — workers spawned per persona
5 · Who did the work. Each bar is a named persona and how many workers it spawned — a sourcer, a proofsmith (twice), a skeptic hunting the dangerous wrong-answer, auditors, and a cluster of proof-patchers. Caveat on the chart: cosmon stamps every worker's role field with the same value, so the honest unit is the session name the persona ran under — which is what's plotted. Sub-fleet colour is inferred from each task's briefing, not a stored fact.
Critical path through the final DAG
6 · The critical path. The longest chain of strictly-ordered tasks — eleven tasks, roughly four hours end to end, from the opening literature-and-setup, through the deliberation that fixed the theorem's framing, out along the formal proof, into the instrumentation that drew these charts. Everything off the red line had slack; the red line could not.
Outcome mix — completed vs in-flight vs collapsed
7 · The honest scorecard. By sub-fleet, nothing hidden: cognition 12/12 completed; formal 4 completed, 1 in-flight, 1 collapsed; instrumentation 1 completed, 3 in-flight (the report was still draining when the snapshot was taken). In the science DAG at this snapshot: 17 tasks completed, 1 collapsed — and that single collapse was a junk probe created by hand while poking at the system's JSON, recognised and discarded. The log records it faithfully rather than quietly deleting it.

The whole ethic in miniature: the fleet proved what it could prove to the hilt, marked the rest with a steady finger instead of a flourish — then went back and closed the hard one honestly, demoting the nine-thousand-pair numerical sweep from evidence to a sanity check the moment a real proof existed. The trust isn't in the agents — it's in the gate the agents couldn't argue with. The full narrative is in report/report.md.

§6 · The talk — an ENS seminar

The same result, told to a room. A reveal.js deck built to be presented at an ENS seminar — the dichotomy, the proof that was wrong first, and the fleet that watched itself work, in slide form. The deck's closing invitation points back here; this page points back to the deck. Read one object two ways.

Open the slides →

The full reveal.js deck in your browser. Arrow keys to advance, Esc for the slide grid, S for speaker notes.

Slides as PDF →

The same deck exported to a single self-contained slides.pdf — for reading offline or printing the handout.

The constellation map →

An interactive map of the Noogram fleet's public artifacts — verticals wired to the live things they produced. The deck's standalone companion.

§7 · Reproduce / verify

Everything on this page regenerates from the repo. The status strip and the terminal verdict above are produced mechanically by the build script from real command output — not hand-typed.

# clone and check the proof
git clone https://github.com/noogram-labs/flow-matching-gaussians
cd flow-matching-gaussians/lean
lake build                              # exit 0 = the anchor theorem is accepted
bash FMG/Adversarial/check-adversarial.sh   # 9/9 — author ≠ scorer

# regenerate the notebooks, the Lean HTML, and the paper copy
bash docs/site/artifacts/build.sh

# regenerate the seven report figures from the fleet's own event log
python3 report/collect.py               # .cosmon/state → report/data/*.csv (read-only)
python3 report/render.py                # report/data/*.csv → report/figures/*.png
python3 report/fleet_viz.py             # events.jsonl → figures/fleet_*.svg + fleet.html

# rebuild this website (refreshes the status strip from live commands)
bash docs/site/build.sh
ProvenanceValue
Built from commitdd226f2 (dd226f258a9db92be28e121349c1160bb3fab774)
Build timestamp2026-06-18 (mechanically stamped by docs/site/build.sh)
Anchor theoremFMG.flowMap_isHermitian_pushforward_eq_otMap_of_commute
Lean source blob760ec9b3afb37978b6acb840a260cff0ae283ea6 — matches kernel-provenance.log
Lean toolchainleanprover/lean4:v4.29.0 · Mathlib v4.29.0

§8 · References — the closed citable set

The only set the writers may cite. Every identifier was verified against the authoritative record (arXiv abstract or registered DOI) — fabricating a DOI is a blocker fault, and none below are fabricated. Tier and relevance live in source-ledger.md; the machine-readable set is references.bib.

  1. 1Building Normalizing Flows with Stochastic InterpolantsMichael S. Albergo, Eric Vanden-Eijnden · 2023 · ICLR
  2. 2Stochastic Interpolants: A Unifying Framework for Flows and DiffusionsMichael S. Albergo, Nicholas M. Boffi, Eric Vanden-Eijnden · 2023
  3. 3Flow Matching for Generative ModelingYaron Lipman, Ricky T. Q. Chen, Heli Ben-Hamu, Maximilian Nickel, Matt Le · 2023 · ICLR
  4. 4Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified FlowXingchao Liu, Chengyue Gong, Qiang Liu · 2023 · ICLR
  5. 5Score-Based Generative Modeling through Stochastic Differential EquationsYang Song, Jascha Sohl-Dickstein, Diederik P. Kingma, Abhishek Kumar, Stefano Ermon et al. · 2021 · ICLR
  6. 6Computational Optimal Transport: With Applications to Data ScienceGabriel Peyré, Marco Cuturi · 2019
  7. 7A Convexity Principle for Interacting GasesRobert J. McCann · 1997
  8. 8On the Bures–Wasserstein distance between positive definite matricesRajendra Bhatia, Tanvi Jain, Yongdo Lim · 2019
  9. 9Wasserstein geometry of Gaussian measuresAsuka Takatsu · 2011 · Osaka J. Math.
  10. 10Geometry of Matrix Decompositions Seen Through Optimal Transport and Information GeometryKlas Modin · 2017
  11. 11Polar Factorization and Monotone Rearrangement of Vector-Valued FunctionsYann Brenier · 1991
  12. 12The distance between two random vectors with given dispersion matricesIngram Olkin, Friedrich Pukelsheim · 1982
  13. 13Flow Matching Guide and CodeYaron Lipman, Marton Havasi, Peter Holderrieth, Neta Shaul, Matt Le et al. · 2024