Chapter 12

Performance — How Bloom Got Fast

The optimization milestones, charted, with the real before/after numbers

Bloom started as a tree-walking interpreter drawing one shape at a time. Through a sequence of focused optimizations — a bytecode VM, an auto-batched WebGL renderer, fused draw opcodes, a WASM SIMD particle kernel, and GPU compute spikes — it now scales the same circle(x, y, r) you learn on day one to hundreds of thousands of shapes per frame. This page charts that progression from a single dataset so it stays honest and so future runs can extend it.

How to read these numbers

Every chart and milestone below is rendered from one file: docs/benchmarks/history.json. The numbers are real measurements, not targets, but they are machine-dependent: each entry names the environment it was taken in (node/V8 on M1-class hardware, headless Chromium, or Chrome 148 with WebGPU). Absolute figures will differ on your hardware; the relative ranking and speedups are the durable result. Frame rates at and below ~10k shapes are pinned to the 60fps vsync cap and do not measure headroom.

Where Bloom lands vs p5.js

For the architecture behind these curves — instanced batching, packed color, fused opcodes, the SIMD kernel — see The Rendering Pipeline, and for a fair side-by-side accounting see Why Bloom over p5.js.

Per-milestone speedups

Each bar is one optimization's headline speedup (after vs. before, on its own benchmark). Hollow bars are spikes that are not shipped.
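As a rough sketch of how one bar's value could be derived, the snippet below works from the metric fields listed under "Appending a new milestone" (label, unit, before, after, speedup). The function name headlineSpeedup and the rule that time-like units count as "lower is better" are assumptions made for illustration, not the page's actual rendering code, and the sample values are placeholders.

```typescript
// Metric shape as described for docs/benchmarks/history.json.
interface Metric {
  label: string;
  unit: string;
  before?: number;
  after: number;
  speedup?: number;
}

// Assumption: time-like units improve when they shrink; everything else
// (fps, shapes per frame, ops/sec) improves when it grows.
const LOWER_IS_BETTER = new Set(["ms", "us", "ns", "s"]);

// Returns the speedup factor for one metric, or undefined when there is
// no usable "before" value to compare against.
function headlineSpeedup(m: Metric): number | undefined {
  if (m.speedup !== undefined) return m.speedup; // a recorded value wins
  if (m.before === undefined || m.before <= 0 || m.after <= 0) return undefined;
  return LOWER_IS_BETTER.has(m.unit) ? m.before / m.after : m.after / m.before;
}

// Placeholder values for illustration only, not real measurements.
const example: Metric = { label: "example loop", unit: "ms", before: 40, after: 10 };
console.log(headlineSpeedup(example)); // 4
```

Keeping the direction rule in one place means a timing metric and an fps metric both read as "bigger bar is better".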

The milestones

The milestones are listed in order, each with its technique and its measured before/after numbers. Shipped means it is in the default build; Spike means it was measured as a feasibility experiment and is not shipped (off by default or not merged).

Reproducing & extending the numbers

This chart is not a dead artifact — every number can be regenerated, and the dataset is append-only so new runs slot straight in.

VM / compute micro-benchmarks (node)

The fused-opcode and math-fusion numbers come from the vitest bench suite:

terminal

npx vitest bench src/lang/__tests__/perf.bench.ts

Hard timing thresholds are also enforced as a regression gate in src/lang/__tests__/perf-regression.test.ts (run with npm test), so a regression fails CI rather than silently eroding these gains.
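The idea behind such a gate can be sketched without any test framework. The helper below is a hypothetical illustration, not the contents of perf-regression.test.ts: it times a workload several times, takes the median to damp noise, and throws when the median exceeds a budget. The names medianMs and assertWithinBudget, the run count, the stand-in workload, and the 1000 ms budget are all assumptions.

```typescript
// Hypothetical helper: run `work` several times and return the median
// wall-clock duration in milliseconds.
function medianMs(work: () => void, runs = 7): number {
  const samples: number[] = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    work();
    samples.push(performance.now() - start);
  }
  samples.sort((a, b) => a - b);
  return samples[Math.floor(samples.length / 2)];
}

// Hypothetical gate: throw if the median time exceeds the budget,
// otherwise return the measured median.
function assertWithinBudget(name: string, work: () => void, budgetMs: number): number {
  const ms = medianMs(work);
  if (ms > budgetMs) {
    throw new Error(`${name}: median ${ms.toFixed(2)} ms exceeds budget of ${budgetMs} ms`);
  }
  return ms;
}

// Stand-in workload; a real gate would run a Bloom program here.
function sumTo(n: number): number {
  let total = 0;
  for (let i = 0; i < n; i++) total += i;
  return total;
}

assertWithinBudget("sum loop", () => { sumTo(100_000); }, 1000);
```

Inside a test runner, the thrown error is what turns a slowdown into a failing test; budgets are usually set with generous headroom so that noisy CI machines do not trip them.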

Browser rendering benchmarks

The fps-by-shape-count curves come from the in-browser bench pages, which run Bloom and p5.js side by side on your hardware:

benchmark.html — Bloom VM vs p5.js (micro + visual)
webgl-bench.html — the WebGL2 instanced renderer at scale
webgpu-particle-demo.html (repo root) — the WebGPU compute spike (needs Chrome 148+ with WebGPU)

Appending a new milestone

Add an entry to the milestones array in docs/benchmarks/history.json with: name, date, optional commit, status (SHIPPED or SPIKE), technique, an environment key, and a metrics array of { label, unit, before?, after, speedup? }. Because the page is rendered from that file at page load, the new entry appears on the next load with no further changes. A small helper script appends a timestamped node-bench entry for you:

terminal

node scripts/append-bench.mjs

It runs the vitest bench, parses the loop/draw/fib timings, and appends a dated milestone to history.json (review the diff before committing).
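The append step can be pictured as a small pure function. This is a sketch under stated assumptions, not the code in scripts/append-bench.mjs: the field names follow the entry description above, but the exact types, the date format, the environment value, and the function name appendMilestone are guesses, and the sample numbers are placeholders.

```typescript
type Status = "SHIPPED" | "SPIKE";

interface Metric {
  label: string;
  unit: string;
  before?: number;
  after: number;
  speedup?: number;
}

interface Milestone {
  name: string;
  date: string;        // assumed format: YYYY-MM-DD
  commit?: string;
  status: Status;
  technique: string;
  environment: string; // key naming the environment the run was taken in
  metrics: Metric[];
}

interface History {
  milestones: Milestone[];
}

// Append-only: returns a new history with the entry added at the end and
// leaves the existing entries untouched.
function appendMilestone(history: History, entry: Milestone): History {
  if (entry.metrics.length === 0) {
    throw new Error(`milestone "${entry.name}" has no metrics`);
  }
  return { ...history, milestones: [...history.milestones, entry] };
}

// Placeholder entry for illustration; the numbers are not real measurements.
const updated = appendMilestone({ milestones: [] }, {
  name: "node bench (example)",
  date: new Date().toISOString().slice(0, 10),
  status: "SPIKE",
  technique: "example run",
  environment: "node",
  metrics: [{ label: "loop", unit: "ms", after: 1.0 }],
});
console.log(JSON.stringify(updated, null, 2));
```

Returning a new object instead of mutating the input keeps the append-only property easy to check: every earlier entry in the result is the same object as before.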

One dataset, many views

The charts, the speedup bars, and the milestone list above are all rendered from docs/benchmarks/history.json at page load. Keep the JSON honest and the page stays honest.
