Play Lc0 — Deep Technical Profile
Build timeline: ~9 active days across 3 phases (Feb 5 – Feb 23, 2026; 19 calendar days counting both endpoints)
- Initial app + model management (2 days) — browser chess vs Leela, 40+ networks, LFS, R2 hosting, game history, UI polish
- Tournament mode + UX hardening (2 days) — tournament mode, live boards, concurrency scheduling, engine eviction, custom ONNX upload, Rolldown bundler, useEffect fixes
- Openings + ratings + search + sharing (5 active days, after 11-day gap) — 15K+ opening book, ECO picker, FIDE performance ratings, game detail modal, shareable URLs, MCTS search with time limits, temperature sampling experiment (reverted)
Table of Contents
- Project Overview
- Pre-Implementation Research & Spec
- Architecture
- The Engine: MCTS + Neural Network Inference
- Neural Network Catalog
- Tournament System
- UI & Game Flow
- Technical Tradeoffs & Decisions
- Major Bugs & Debugging Stories
- AI Agent Involvement
- Development Timeline
- Key Files Reference
1. Project Overview
A fully client-side web application that lets you play chess against Leela Chess Zero (Lc0) neural networks running entirely in the browser. All inference happens locally via ONNX Runtime Web (WebGPU with WASM fallback) — no server-side computation.
By the numbers:
- ~16,000 lines of TypeScript/TSX across 51 files
- 36 commits across 11 PRs (9 merged, 1 closed, 1 open), built in 19 days (Feb 5–23, 2026)
- 53 neural network models from ~800 to ~2900 Elo
- MCTS search with configurable node budget and temperature
- Swiss and round-robin tournament mode with FIDE performance ratings
- 15,000+ opening positions (full ECO database)
- Custom ONNX model upload with verification
- Shareable game URLs via query parameters
- Models hosted on Cloudflare R2; app deployed to Cloudflare Pages
Libraries & Frameworks
All code runs in the browser — there is no backend service. The engine lives in a Web Worker.
UI (React web app)
- React 19 + react-dom — UI and game-flow components.
- Vite (rolldown-vite) + @vitejs/plugin-react — dev server and production bundler (rolldown variant for faster builds).
- vite-plugin-static-copy — copies ONNX Runtime's WASM artifacts into the build output.
- Tailwind CSS 4 + @tailwindcss/vite — utility-first styling.
- lucide-react — icons.
Chess UI & logic
- react-chessboard 5.8 — drag-and-drop chessboard component.
- chess.js 1.4 — move generation, legality, FEN/PGN handling.
Neural inference (engine worker)
- onnxruntime-web 1.24 — runs the LC0 ONNX networks in the browser with WebGPU (preferred) or WASM fallback.
- onnxruntime-node 1.24 (dev) — used only for benchmarking memory/perf of models on Node before shipping.
- The MCTS implementation, policy-index mapping (1858 UCI moves), and board-to-[1,112,8,8] encoding are hand-rolled in `engine/*.ts`, with no third-party chess-AI library.
Client-side persistence
- idb 8: thin async wrapper around IndexedDB, used by `modelCache.ts` to store decompressed ONNX networks so they don't re-download each session.
Tooling
- TypeScript 5.9 — types across the UI + engine worker.
- ESLint 9 + typescript-eslint + react-hooks/react-refresh plugins + globals — lint config.
- Type defs: `@types/react`, `@types/react-dom`, `@types/node`.
2. Pre-Implementation Research & Spec
Before any code was written, I conducted extensive research into which Lc0 networks would work in a browser, how policy-only (0-node) play performs, and what the full architecture should look like. This research produced a 676-line implementation spec that was passed directly to Claude Code as the project blueprint.
2.1 Network Feasibility Analysis
I researched the full Lc0 network ecosystem and categorized models by browser feasibility:
Tier 1 — Tiny CNNs (~50 KB to a few MB): dkappe distilled series (Tiny Gyal, etc.). Run on any device, even WASM-only without WebGPU. Useful for low-end users and mobile.
Tier 2 — Small/medium CNNs (64x6 to 128x10): SE residual networks. Strong but compact. Fast inference, low download size. Good for mobile and modest hardware.
Tier 3 — Standard CNNs (192x15 to 320x24): Mainline Lc0 sizes. Need WebGPU for reasonable performance. Downloads get large (50-200 MB).
Tier 4 — Transformers (the standout): T1-256x10-distilled at ~65 MB FP16 has a dramatically stronger policy head than any residual CNN. At 1 node, its policy head alone should be in the ~2600-2800 Elo range. This became the "best practical browser net."
Tier 5 — Big nets (BT4, 768+ filters): Desktop-native territory. 707 MB for BT4. Needs ~4 GB VRAM. Works in browser with WebGPU on high-end hardware.
2.2 Policy-Only (0-Node) Strength Research
I researched how strong Lc0 is without any search — just the raw policy head picking the highest-probability move. This was critical because depth-0 play was designed as a first-class feature, not an afterthought.
Key data points gathered:
- Older convolutional nets (2021, net 67743): above 2200 Elo at 1 node — enough to trouble a human master
- Latest BT4 transformer: nearly 300 Elo stronger in raw policy than the strongest CNN (T78), with fewer parameters
- Wikipedia (Nov 2024): Lc0 models achieving "grandmaster-level strength at one position evaluation per move"
- Lc0 team claims "grandmaster" policy strength for BT3/BT4
Best estimate for BT4 at 1 node: roughly 2500-2700 Elo (strong IM to GM level). This validated the design decision to support depth-0 as a meaningful play mode — a user playing against T1-256x10-distilled at 0-node gets a genuine chess opponent, not a toy.
2.3 Prior Art: MaiaChess Browser Implementation
I studied the MaiaChess web platform as a working precedent. Maia uses a dual-engine architecture running entirely client-side: Maia neural network models converted to ONNX and run via onnxruntime-web, with Stockfish running alongside via WebAssembly for comparison analysis. Platform built with Next.js, TypeScript, React Context.
This confirmed the technical path: ONNX conversion via lc0 leela2onnx → onnxruntime-web → chess.js for board logic. I had already fine-tuned my own Maia model (hunter-chessbot project), so I knew the ONNX conversion pipeline worked.
2.4 The 676-Line Implementation Spec
My research culminated in a comprehensive 4-phase implementation spec covering every architectural decision:
Phase 1: Foundation & Depth-0 Only (the MVP)
- Project setup: React + TypeScript + Vite, react-chessboard + chess.js, onnxruntime-web with WebGPU (WASM fallback)
- Weight conversion pipeline: offline `lc0 leela2onnx` conversion, optional FP16 quantization, CDN hosting
- Board encoding: the 112-plane representation (104 history + 8 auxiliary), replicating `lc0/src/neural/encoder.cc` exactly
- Policy decoding: the 1858-element vector → UCI moves, replicating `lc0/src/chess/board.cc` indexing
- Milestone: "Leela outputs a legal move at depth 0"
Phase 2: MCTS Search (Depth > 0)
- MCTS with PUCT (AlphaZero variant), node budget controls (10/100/1000)
- Tree structure with visit counts, prior probabilities, value estimates
- Selection → expansion → backup loop
- Web Worker for non-blocking inference
Phase 3: Multiple Networks & Smart Loading
- IndexedDB caching (no re-download), network switching UI
- Engine scheduler managing concurrent sessions with memory pressure handling
- Download progress indicators
Phase 4: Polish & Features
- Eval bar, policy visualization, PGN export, mobile responsiveness
Risk mitigations were explicitly planned:
- Board encoding wrong → compare against lc0's actual encoder output for known positions
- Policy decoding wrong → same approach
- WebGPU not available → WASM fallback, smaller nets
- MCTS too slow in JS → start with low node counts, batch inference, SharedArrayBuffer
Curated default network selection: 11 networks spanning ~800-2900 Elo with distinct playing personalities — "Brawler" (Bad Gyal 8), "Wild Style" (Mean Girl 8), "Endgame Drill" (Ender), giving users meaningfully different opponents.
The spec was designed to be self-contained: it included network download links, architecture glossary (NxM filters × residual blocks, SE = Squeeze-Excite, SWA = Stochastic Weight Averaging, distilled = smaller net mimicking larger), and references to specific lc0 source files for encoder/decoder validation.
3. Architecture
┌─────────────────────────────────────────────────────┐
│ React 19 + Tailwind CSS │
│ ├── HomeScreen (network picker, game history) │
│ ├── GameScreen (board, controls, move history) │
│ └── TournamentPage (setup, live view, standings) │
└──────────────────┬──────────────────────────────────┘
│ postMessage (Web Worker)
┌──────────────────▼──────────────────────────────────┐
│ Web Worker │
│ ├── ONNX Runtime Web (WebGPU / WASM fallback) │
│ ├── MCTS Search (PUCT selection, backpropagation) │
│ ├── Board Encoding (FEN → [1,112,8,8] tensor) │
│ └── Policy Decoding (1858 logits → legal moves) │
└──────────────────┬──────────────────────────────────┘
│ fetch + IndexedDB cache
┌──────────────────▼──────────────────────────────────┐
│ Cloudflare R2 (model hosting, 25MB–707MB per model) │
└─────────────────────────────────────────────────────┘
Key architectural boundaries:
- Main thread: React UI, game state (chess.js), opening book lookup, persistence (localStorage + IndexedDB)
- Web Worker: All neural network inference and MCTS search. Communicates via typed message protocol (`WorkerRequest`/`WorkerResponse`)
- Cloudflare R2: Model storage. Models are gzip-compressed `.onnx.bin` files. Downloaded on demand, decompressed via `DecompressionStream`, cached in IndexedDB
4. The Engine: MCTS + Neural Network Inference
4.1 Board Encoding (encoding.ts)
Converts chess positions to the Lc0 input format: a [1, 112, 8, 8] Float32 tensor (7,168 elements).
112 input planes:
- Planes 0–103: 13 planes × 8 history positions. Per position: 6 own piece types + 6 opponent piece types + 1 repetition flag
- Planes 104–107: Castling rights (our queenside, our kingside, opponent queenside, opponent kingside)
- Plane 108: Is black to move (1.0 if black, 0.0 if white)
- Plane 109: Rule50 count / 99.0
- Plane 110: Zeros (move count, disabled)
- Plane 111: All ones
Perspective flipping: The network always sees the position from the side-to-move's perspective. When it's black's turn, piece ownership is swapped, the board is vertically flipped (rank = 7 - rank), and castling rights are swapped. This is the standard Lc0 convention.
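The plane layout and the vertical flip can be sketched in a few lines. This is a minimal illustration, not the contents of `encoding.ts`: the helper names (`piecePlane`, `tensorOffset`, `createInputTensor`) and the pawn-to-king piece order inside each 13-plane block are assumptions made for the example.

```typescript
const PLANES_PER_POSITION = 13; // 6 own pieces + 6 opponent pieces + 1 repetition flag
const HISTORY_LENGTH = 8;
const AUX_PLANE_BASE = PLANES_PER_POSITION * HISTORY_LENGTH; // 104
const TOTAL_PLANES = AUX_PLANE_BASE + 8; // 112

// Assumed piece order inside one 13-plane history block.
const PIECE_ORDER = ["p", "n", "b", "r", "q", "k"] as const;
type PieceType = (typeof PIECE_ORDER)[number];

/** Plane index for a piece in history slot `historySlot` (0 = current position). */
function piecePlane(historySlot: number, piece: PieceType, isOwn: boolean): number {
  const pieceOffset = PIECE_ORDER.indexOf(piece) + (isOwn ? 0 : 6);
  return historySlot * PLANES_PER_POSITION + pieceOffset;
}

/** Flat offset into the [1, 112, 8, 8] tensor; rank and file are 0-based. */
function tensorOffset(plane: number, rank: number, file: number, blackToMove: boolean): number {
  const r = blackToMove ? 7 - rank : rank; // vertical flip when black is to move
  return plane * 64 + r * 8 + file;
}

/** Allocate the input buffer with the constant all-ones plane already set. */
function createInputTensor(): Float32Array {
  const data = new Float32Array(TOTAL_PLANES * 64); // 7,168 zeros
  data.fill(1, 111 * 64, 112 * 64); // plane 111 is all ones
  return data;
}
```

Swapping piece ownership for black is handled by the caller choosing `isOwn` relative to the side to move, so the same two helpers serve both colors.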
4.2 Policy Decoding (decoding.ts, policyIndex.ts)
The neural network outputs 1858 policy logits — one per possible move in Lc0's compressed move encoding.
POLICY_INDEX: A pre-generated array of 1858 UCI move strings. The array index IS the policy output neuron index. This was initially generated programmatically by the AI, but the output was wrong — I directed using the reference table from my hunter-chessbot repo instead.
Decoding flow:
- For each legal move, flip to white perspective if black (via `flipUci`)
- Look up in `POLICY_INDEX_MAP` (reverse map: UCI string → index)
- Apply softmax with temperature scaling: `exp((logit - max) / temp)`
- If temperature > 0: sample from distribution. If temperature = 0: pick argmax
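The decoding steps above can be sketched as follows. `flipUci` is named in the source, but its body here is an assumption (a vertical mirror of the rank digits); `softmaxWithTemperature` and `pickMoveIndex` are hypothetical names, and the random source is injectable only so the example is deterministic.

```typescript
/** Mirror a UCI move vertically: rank r becomes 9 - r, files and promotion suffix are untouched. */
function flipUci(uci: string): string {
  return uci.replace(/[1-8]/g, (d) => String(9 - Number(d)));
}

/** Numerically stable softmax over the logits of the legal moves; temperature must be > 0. */
function softmaxWithTemperature(logits: number[], temperature: number): number[] {
  const max = Math.max(...logits);
  const exps = logits.map((l) => Math.exp((l - max) / temperature));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

/** Argmax at temperature 0, otherwise sample from the softmax distribution. */
function pickMoveIndex(
  logits: number[],
  temperature: number,
  rand: () => number = Math.random,
): number {
  if (temperature <= 0) {
    let best = 0;
    let bestLogit = -Infinity;
    for (let i = 0; i < logits.length; i++) {
      const l = logits[i] ?? -Infinity;
      if (l > bestLogit) {
        best = i;
        bestLogit = l;
      }
    }
    return best;
  }
  const probs = softmaxWithTemperature(logits, temperature);
  let r = rand();
  for (let i = 0; i < probs.length; i++) {
    r -= probs[i] ?? 0;
    if (r <= 0) return i;
  }
  return probs.length - 1; // guard against floating-point drift
}
```

Subtracting the maximum logit before exponentiating keeps `Math.exp` from overflowing at low temperatures, which matters once the temperature drops toward the 0.15 default.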
4.3 MCTS Algorithm (mcts.ts)
Tree node structure:
- `move`, `parent`, `children: Map<string, MCTSNode>`
- `prior` (policy network probability), `visits` (N), `totalValue` (W)
- `wdlSum: [win, draw, loss]` accumulated over visits
- `expanded`, `terminal`, `terminalValue`
PUCT selection (cPUCT = 2.5):
`score = -Q(child) + cPUCT × prior × sqrt(parentVisits) / (1 + childVisits)`
Q is negated because a child's value is from the opponent's perspective.
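A small sketch of the PUCT score as stated above. The `ChildStats` shape and the `selectChild` helper are illustrative, and treating an unvisited child's Q as 0 is an assumption; the source does not say what `mcts.ts` uses for first-play urgency.

```typescript
interface ChildStats {
  move: string;
  prior: number; // policy probability P
  visits: number; // N
  totalValue: number; // W, from the child's own side-to-move perspective
}

const C_PUCT = 2.5;

/** Mean value of a child; assumed to be 0 before the first visit. */
function qValue(child: ChildStats): number {
  return child.visits > 0 ? child.totalValue / child.visits : 0;
}

/** PUCT score of a child as seen from its parent. */
function puctScore(child: ChildStats, parentVisits: number): number {
  const exploration = (C_PUCT * child.prior * Math.sqrt(parentVisits)) / (1 + child.visits);
  return -qValue(child) + exploration;
}

/** Pick the child with the highest PUCT score, or undefined if there are none. */
function selectChild(children: ChildStats[], parentVisits: number): ChildStats | undefined {
  let best: ChildStats | undefined;
  let bestScore = -Infinity;
  for (const child of children) {
    const score = puctScore(child, parentVisits);
    if (score > bestScore) {
      best = child;
      bestScore = score;
    }
  }
  return best;
}
```

With a high prior and no visits the exploration term dominates, so the search tries the policy's favorite first and only drifts to other moves once their relative visit counts fall behind.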
Search loop (mctsSearch):
- Create root, expand it (run inference)
- For each iteration (up to nodeLimit or timeLimitMs):
  - Select: Walk down the tree picking highest PUCT child until reaching an unexpanded non-terminal leaf. A fresh `Chess` instance is replayed along the path.
  - Expand: Run neural network inference on the leaf. Create child nodes with priors. Store evaluation (value = wdl[0] - wdl[2]).
  - Backpropagate: Walk back to root, negating value at each level. WDL is flipped (win↔loss) at each level.
- Progress callback every 10 iterations.
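The backpropagation step in the loop above can be sketched like this. `BackupNode` is a reduced, hypothetical version of the tree node (only the fields the backup touches), and the function name is illustrative.

```typescript
type Wdl = [number, number, number]; // [win, draw, loss]

interface BackupNode {
  parent: BackupNode | null;
  visits: number;
  totalValue: number;
  wdlSum: Wdl;
}

/**
 * Walk from the evaluated leaf to the root. The value is negated and
 * win/loss are swapped at every level because the side to move alternates.
 */
function backpropagate(leaf: BackupNode, value: number, wdl: Wdl): void {
  let node: BackupNode | null = leaf;
  let v = value;
  let [w, d, l] = wdl;
  while (node !== null) {
    node.visits += 1;
    node.totalValue += v;
    node.wdlSum[0] += w;
    node.wdlSum[1] += d;
    node.wdlSum[2] += l;
    v = -v; // the parent sees the position from the other side
    [w, l] = [l, w]; // a win for one side is a loss for the other
    node = node.parent;
  }
}
```

Because every node stores values from its own side-to-move perspective, the selection step has to negate a child's Q, which is exactly what the PUCT formula does.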
Move selection: After search, if temperature = 0: pick most-visited move (argmax). If temperature > 0: sample proportional to visits^(1/temperature).
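A sketch of that final move choice, assuming the root's children are available as a flat list of visit counts. `MoveVisits` and `selectMoveByVisits` are hypothetical names, and the injectable `rand` exists only to make the example reproducible.

```typescript
interface MoveVisits {
  move: string;
  visits: number;
}

/** Most-visited move at temperature 0, otherwise sample proportional to visits^(1/temperature). */
function selectMoveByVisits(
  stats: MoveVisits[],
  temperature: number,
  rand: () => number = Math.random,
): string {
  if (stats.length === 0) throw new Error("no moves to select from");
  if (temperature <= 0) {
    let best = "";
    let bestVisits = -1;
    for (const s of stats) {
      if (s.visits > bestVisits) {
        best = s.move;
        bestVisits = s.visits;
      }
    }
    return best;
  }
  const weighted = stats.map((s) => ({
    move: s.move,
    weight: Math.pow(s.visits, 1 / temperature),
  }));
  const total = weighted.reduce((sum, w) => sum + w.weight, 0);
  let r = rand() * total;
  let last = "";
  for (const w of weighted) {
    last = w.move;
    r -= w.weight;
    if (r <= 0) return w.move;
  }
  return last; // guard against floating-point drift
}
```

At temperature 0.5 the weights are the squared visit counts, so a move with 60 of 100 visits is chosen roughly 78% of the time instead of 60%.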
Performance: ~80-100 nodes/sec on small nets, ~8-10 on large nets (single-node, unbatched).
Roadmap (Phase 2, not yet implemented): Batched MCTS with virtual loss for branch diversity, batch collection into [B, 112, 8, 8] tensor, expected 5-8× throughput.
4.4 Inference (inference.ts)
Execution provider selection: Checks navigator.gpu — if present, tries ["webgpu", "wasm"]; otherwise ["wasm"] only.
Output head discovery: Dynamically matches output tensor names containing "policy", "wdl", or "value" (a name that contains "wdl" is never treated as the value head). If no WDL head exists, synthesizes WDL from the value head as [(v+1)/2, 0, (1-v)/2].
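The two rules above are easy to express as pure functions. This sketch deliberately avoids calling ONNX Runtime; the function names and the `Heads` shape are hypothetical, and the caller is assumed to pass in whether `navigator.gpu` exists and the session's output names.

```typescript
/** Provider preference list: WebGPU first when available, WASM always as fallback. */
function chooseExecutionProviders(hasWebGpu: boolean): string[] {
  return hasWebGpu ? ["webgpu", "wasm"] : ["wasm"];
}

interface Heads {
  policy?: string;
  wdl?: string;
  value?: string;
}

/** Match output tensor names by substring; "wdl" is checked before "value". */
function discoverHeads(outputNames: readonly string[]): Heads {
  const heads: Heads = {};
  for (const name of outputNames) {
    const n = name.toLowerCase();
    if (n.includes("policy")) heads.policy = name;
    else if (n.includes("wdl")) heads.wdl = name;
    else if (n.includes("value")) heads.value = name;
  }
  return heads;
}

/** Build a [win, draw, loss] triple from a scalar value in [-1, 1] when no WDL head exists. */
function synthesizeWdl(v: number): [number, number, number] {
  return [(v + 1) / 2, 0, (1 - v) / 2];
}
```

The synthesized triple always has a zero draw probability, so a WDL bar driven by it shows only two segments for value-only networks.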
4.5 Worker Protocol
Request types: init (model URL), getBestMove (single inference, no search), evaluatePosition (WDL only), mctsSearch (full MCTS)
Response types: ready, initProgress, initError, bestMove, evaluation, mctsResult, mctsProgress, error
Lc0Engine class (main-thread API): Wraps the Web Worker with a pub-sub state pattern and promise-based request/response. Only one request of each type can be in-flight at a time.
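One way to type such a protocol is a pair of discriminated unions keyed on `type`. The message names below come from the lists above; every payload field (`modelUrl`, `fen`, `nodeLimit`, and so on) is a guess made for illustration and may not match `worker.ts`.

```typescript
type Wdl = [number, number, number];

type WorkerRequest =
  | { type: "init"; modelUrl: string }
  | { type: "getBestMove"; fen: string; temperature: number }
  | { type: "evaluatePosition"; fen: string }
  | { type: "mctsSearch"; fen: string; nodeLimit: number; timeLimitMs: number };

type WorkerResponse =
  | { type: "ready" }
  | { type: "initProgress"; loadedBytes: number; totalBytes: number }
  | { type: "initError"; message: string }
  | { type: "bestMove"; move: string }
  | { type: "evaluation"; wdl: Wdl }
  | { type: "mctsResult"; move: string; nodes: number }
  | { type: "mctsProgress"; nodes: number }
  | { type: "error"; message: string };

/** Streaming updates go to subscribers; everything else settles a pending promise. */
function isProgress(res: WorkerResponse): boolean {
  return res.type === "initProgress" || res.type === "mctsProgress";
}

/** Exhaustive switch: adding a response type without handling it becomes a compile error. */
function describeResponse(res: WorkerResponse): string {
  switch (res.type) {
    case "ready":
      return "engine ready";
    case "initProgress":
      return `loading ${res.loadedBytes}/${res.totalBytes}`;
    case "initError":
      return `init failed: ${res.message}`;
    case "bestMove":
      return `best move ${res.move}`;
    case "evaluation":
      return `wdl ${res.wdl.join("/")}`;
    case "mctsResult":
      return `search picked ${res.move} after ${res.nodes} nodes`;
    case "mctsProgress":
      return `searched ${res.nodes} nodes`;
    case "error":
      return `error: ${res.message}`;
  }
}
```

Because only one request of each type is in flight at a time, the response `type` alone is enough to route a message to its pending promise, with no request IDs needed.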
4.6 Model Cache (modelCache.ts)
IndexedDB database lc0-model-cache with a single models object store. Models stored as decompressed ArrayBuffer keyed by URL. All operations silently catch errors to avoid crashing on IndexedDB issues.
5. Neural Network Catalog
53 models organized by playing strength, spanning 6 model families:
Model Families
| Family | Count | Architecture | Elo Range | Description |
|---|---|---|---|---|
| 11258 distilled | 15 | 16x2-SE to 128x10-SE | ~800–2450 | Distilled from Lc0 T10 training net. The backbone of the rating ladder. |
| Maia | 11 | 64x6-SE | ~1100–2200 | Trained to predict human moves at specific Lichess rating levels. |
| Gyal family | 8 | Various (16x2 to 192x16) | ~800–2500 | Lichess-trained. Sub-families: Tiny/Bad/Good/Evil/Mean with distinct play styles. |
| Official Lc0 | 5 | Various | ~2100–2900 | Official training runs: T70, T42850, T71 FRC/Armageddon variants. |
| Transformers | 5 | 256x10 to 1024x15 | ~2525–2900 | Newest architecture. T1, t3, T82, BT3, BT4. Require WebGPU + significant VRAM. |
| Specialty | 4 | Various | ~2100–2600 | Leelenstein (engine-game trained), Ender (endgame specialist), Little Demon, Maia 2200 Hunter (fine-tuned on my own games). |
Notable Models
| Model | Arch | Size | Runtime MB | Elo | Notes |
|---|---|---|---|---|---|
| Tiny Gyal | 16x2 | 1.1 MB | ~25 | ~800 | Smallest, blunders freely |
| Maia 1100 | 64x6-SE | 3.3 MB | ~39 | ~1100 | Human-like at Lichess 1100 |
| T1-256x10 Distilled | Transformer | 77 MB | ~459 | ~2525 | "Best practical browser net" |
| BT4-1024x15 | Transformer | 707 MB | ~3229 | ~2900 | Strongest available. GM-level at 1-node. Needs ~4 GB VRAM. |
| Maia 2200 Hunter | 64x6-SE | 3.3 MB | ~39 | ~2050 | Fine-tuned on my own blitz/rapid games |
Memory Estimation
Each model has an estimatedRuntimeMb field computed by the benchmark-network-memory.mjs script: loads the ONNX session via onnxruntime-node, measures RSS delta, stores round(peakDeltaMb * 1.2) as a conservative estimate. Used by the tournament engine to estimate how many concurrent games can run.
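The estimate itself is a one-line formula; how the tournament engine turns it into a concurrency number is not spelled out, so the second function below is only one plausible reading. `maxConcurrentGames` and its `budgetMb` parameter are hypothetical; the cap of 8 comes from the tournament's 1–8 concurrency setting.

```typescript
/** Conservative runtime estimate: measured peak RSS delta plus a 20% margin. */
function estimatedRuntimeMb(peakDeltaMb: number): number {
  return Math.round(peakDeltaMb * 1.2);
}

/**
 * Hypothetical budget check: how many games with these two engines fit
 * into `budgetMb`, clamped to the range 1..cap.
 */
function maxConcurrentGames(
  budgetMb: number,
  whiteRuntimeMb: number,
  blackRuntimeMb: number,
  cap: number = 8,
): number {
  const perGameMb = whiteRuntimeMb + blackRuntimeMb;
  if (perGameMb <= 0) return cap;
  return Math.max(1, Math.min(cap, Math.floor(budgetMb / perGameMb)));
}
```

For example, a 2,000 MB budget fits four simultaneous games between the T1 distilled net (~459 MB) and a Maia net (~39 MB), while two BT4 instances (~3,229 MB each) fall back to a single game.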
6. Tournament System
6.1 Configuration
- Formats: Round Robin (circle method / Berger tables) or Swiss (greedy top-down with color balancing)
- Entrants: Each has a network, temperature (0–2), searchNodes (0–800, 0 = raw policy), searchTimeMs (0–30s), custom label
- Best-of: 1–30 regulation games per series (default 3)
- Tiebreak: "capped" (up to N extra games) or "win_by" (leader must be ahead by M)
- Concurrency: 1–8 simultaneous games
- Custom positions: Opening FENs rotate across series
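For the round-robin format, the circle method named above can be sketched as follows: fix the first entrant, rotate the rest by one seat each round, and pair seats from the outside in. This is a generic illustration rather than the code in `pairings.ts`; in particular, alternating colors by round parity is a simplification and not the Berger color assignment.

```typescript
type Pairing = [string, string]; // [white, black]

/** Circle-method round robin; an odd field gets a bye slot that is skipped. */
function roundRobinRounds(entrants: string[]): Pairing[][] {
  const slots: (string | null)[] = entrants.slice();
  if (slots.length % 2 === 1) slots.push(null); // null marks the bye
  const n = slots.length;
  const rounds: Pairing[][] = [];
  for (let round = 0; round < n - 1; round++) {
    const pairs: Pairing[] = [];
    for (let i = 0; i < n / 2; i++) {
      const a = slots[i];
      const b = slots[n - 1 - i];
      if (a == null || b == null) continue; // this entrant sits out
      pairs.push(round % 2 === 0 ? [a, b] : [b, a]);
    }
    rounds.push(pairs);
    // Rotate every seat except the first: move the last entrant to seat 1.
    const last = slots.pop();
    if (last !== undefined) slots.splice(1, 0, last);
  }
  return rounds;
}
```

With four entrants this yields three rounds of two pairings each, and every pair of entrants meets exactly once.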
6.2 Execution (useTournamentRunner.ts, 2474 lines)
The tournament runner manages the complete lifecycle:
- Engine pooling: LRU-evicted `Lc0Engine` instances, max = `maxSimultaneousGames × 2 + 2`. Evicts by next-use distance.
- Game execution: Each game creates a `chess.js` instance, alternates moves between engines (MCTS or raw policy), records FEN history and WDL eval snapshots. Games end on checkmate, stalemate, draw rules, 300-ply limit, or 3-minute timeout.
- Concurrency: `Promise.race` pattern (fill concurrent slots, proceed when any finishes, refill). No entrant appears in two simultaneous matches.
- Error handling: Exponential backoff retries (1s–30s, max 6 retries). After 6 retries, adjudicate as draw.
- Series reconciliation: After each game, recalculates series scores. Early termination when one side has insurmountable lead. Tiebreak games added dynamically.
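The retry and early-termination rules in the list above reduce to small pure functions. The 1 s floor, 30 s ceiling, and 6-retry limit come from the text; the doubling factor and all names here are assumptions, since the source only says "exponential".

```typescript
const BASE_DELAY_MS = 1000;
const MAX_DELAY_MS = 30000;
const MAX_RETRIES = 6;

/** Delay before retry number `attempt` (0-based), doubling up to the 30 s ceiling. */
function retryDelayMs(attempt: number): number {
  return Math.min(MAX_DELAY_MS, BASE_DELAY_MS * Math.pow(2, attempt));
}

type RetryDecision = { action: "retry"; delayMs: number } | { action: "adjudicateDraw" };

/** Decide what to do after a game has failed; `retriesUsed` counts retries already spent. */
function nextStep(retriesUsed: number): RetryDecision {
  if (retriesUsed >= MAX_RETRIES) return { action: "adjudicateDraw" };
  return { action: "retry", delayMs: retryDelayMs(retriesUsed) };
}

/** A series is decided once the trailing side cannot even draw level. */
function isSeriesDecided(scoreA: number, scoreB: number, gamesRemaining: number): boolean {
  return Math.abs(scoreA - scoreB) > gamesRemaining;
}
```

Under these assumptions the six delays are 1, 2, 4, 8, 16, and 30 seconds, so a persistently failing game is adjudicated after roughly a minute of waiting.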
6.3 Standings & Ratings
- Match points: 1 for series win, 0.5 for draw, 0 for loss
- Game points: From individual game results (regulation only, tiebreakers excluded)
- Buchholz: Sum of opponents' match points (strength of schedule)
- Performance rating: FIDE method — average opponent Elo + dp(score percentage) using the standard 51-entry lookup table
- Cross table: N×N head-to-head matrix with series points, game points, and per-pair performance ratings
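The performance rating line can be made concrete with the dp lookup. The table below is my transcription of the FIDE conversion table for score fractions 0.50 through 1.00 and should be checked against the official handbook before being relied on; mirroring it for scores below 50% and the function names are assumptions about how `performanceRating.ts` works.

```typescript
// dp for p = 0.50, 0.51, ..., 1.00 (51 entries); values below 0.50 mirror with a minus sign.
const DP_TABLE: readonly number[] = [
  0, 7, 14, 21, 29, 36, 43, 50, 57, 65,
  72, 80, 87, 95, 102, 110, 117, 125, 133, 141,
  149, 158, 166, 175, 184, 193, 202, 211, 220, 230,
  240, 251, 262, 273, 284, 296, 309, 322, 336, 351,
  366, 383, 401, 422, 444, 470, 501, 538, 589, 677,
  800,
];

/** Rating difference for a score fraction p in [0, 1], rounded to whole percent. */
function dp(p: number): number {
  const pct = Math.min(100, Math.max(0, Math.round(p * 100)));
  if (pct >= 50) return DP_TABLE[pct - 50] ?? 0;
  return -(DP_TABLE[50 - pct] ?? 0);
}

/** Average opponent Elo plus dp(score fraction). */
function performanceRating(opponentElos: number[], score: number): number {
  if (opponentElos.length === 0) throw new Error("no games played");
  const avg = opponentElos.reduce((a, b) => a + b, 0) / opponentElos.length;
  return Math.round(avg + dp(score / opponentElos.length));
}
```

Scoring 1.5 out of 2 against 2000-rated opposition gives a 75% score, a dp of 193, and a performance rating of 2193.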
6.4 Persistence
Active tournaments saved to localStorage every 200ms (debounced). Also archived to IndexedDB every 5 seconds. On reload, running matches reset to "waiting" and can be auto-resumed. Completed tournaments stored permanently with full state for reopening.
7. UI & Game Flow
7.1 Routing
No router library — App.tsx manages a state machine with screen types: home, game, tournament, share-loading, share-confirm. Persisted to localStorage.
7.2 Home Screen
- NetworkPicker: Searchable, sortable list of 53+ networks. Each shows name, architecture, Elo, download size, cache status. Inline download with progress bar. Temperature slider (0–2), MCTS search controls (nodes 0–800, time limit 0–30s), opening book selection, custom FEN input, share URL generation.
- GameHistory: Saved games list with expand/collapse, PGN display, and "Continue" for incomplete games.
7.3 Game Screen
- Board: `react-chessboard` with click-to-move and drag-and-drop. Legal move indicators (dots for quiet moves, rings for captures). Promotion picker overlay.
- Status Bar: Engine status, "Thinking..." indicator, last move in SAN, WDL bar (three-segment Win/Draw/Loss visualization).
- Move History: Tabbed view — Moves (clickable navigation with arrow keys) and PGN (click to copy).
- Controls: New Game (alternates color), Resign (two-step confirmation), Flip Board, temperature adjustment.
- Opening Book: Checks user-selected openings before consulting the engine. Plays book moves randomly, shows "Book" badge.
- Auto-save: Every move persists to localStorage. Game completion triggers final save with result.
7.4 Share URLs
Query parameters: network (required, must match built-in ID), color, fen, temperature. Large models (>25MB, not cached) show a confirmation dialog before downloading.
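A sketch of parsing and validating those four parameters with the standard `URLSearchParams` API. The `ShareParams` shape, the default color of white, and clamping temperature to the 0–2 slider range are assumptions made for the example, not confirmed behavior of `shareParams.ts`.

```typescript
interface ShareParams {
  network: string;
  color: "white" | "black";
  fen?: string;
  temperature?: number;
}

/** Returns null when the required network id is missing or not a built-in id. */
function parseShareParams(query: string, knownNetworkIds: ReadonlySet<string>): ShareParams | null {
  const params = new URLSearchParams(query);
  const network = params.get("network");
  if (network === null || !knownNetworkIds.has(network)) return null;

  const color: "white" | "black" = params.get("color") === "black" ? "black" : "white";
  const result: ShareParams = { network, color };

  const fen = params.get("fen");
  if (fen) result.fen = fen;

  const rawTemperature = params.get("temperature");
  if (rawTemperature !== null && rawTemperature !== "") {
    const t = Number(rawTemperature);
    if (Number.isFinite(t)) result.temperature = Math.min(2, Math.max(0, t));
  }
  return result;
}

/** Large, uncached models need explicit consent before the download starts. */
function needsDownloadConfirmation(sizeMb: number, isCached: boolean): boolean {
  return sizeMb > 25 && !isCached;
}
```

Rejecting unknown network ids up front means a malformed link fails before any model download is attempted.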
8. Technical Tradeoffs & Decisions
8.1 Model Hosting: Local → Git LFS → Cloudflare R2
Evolution on Feb 6:
- Models bundled in `public/models/`: hit Cloudflare Pages 25MB deployment limit
- Tried Git LFS: abandoned same day (complexity, bandwidth costs)
- Final: Cloudflare R2 public bucket. Two upload scripts: `wrangler r2 object put` for normal models, `aws s3 cp` for the 707MB BT4 (wrangler can't handle files that large)
8.2 Gzip Compression + .onnx.bin Extension
Models gzip-compressed for 30-45% size reduction. Browser decompresses via DecompressionStream API. Initially used .gz extension, but Vite's dev server (sirv) intercepted .gz files as pre-compressed assets. I directed renaming to .onnx.bin.
8.3 WebGPU vs WASM
Extensively debated. Five approaches analyzed (batched MCTS + WebGPU, GPU-resident search, lc0-to-WASM, Rust WASM, single-node MCTS). I pushed for practical implementation first, extraction into library later. Current: WASM with WebGPU as automatic upgrade when navigator.gpu is available.
8.4 MCTS vs Raw Policy (0-Node)
The app originally used only the policy head (no search). MCTS with PUCT selection was added in PR #9. The 0-node mode is still available (set searchNodes = 0) for speed or weaker play. Phase 2 batched MCTS is planned but not yet implemented.
8.5 Bundler: Rollup → Rolldown
Switched from Vite's default Rollup to rolldown-vite (Rust-based bundler) in PR #4 for faster builds.
8.6 Pre-Generated vs Programmatic Policy Index
The AI initially generated the 1858-entry policy index programmatically. The output was wrong — incorrect move ordering caused the engine to "play random bullshit." I identified the problem and directed using the pre-generated table from my hunter-chessbot reference repo. This fixed the engine immediately.
8.7 Strategic / "why do this at all" tradeoffs
- Why build this when Lichess / chess.com exist — Lichess and chess.com already let you play bots. But neither lets you pick from 50+ specific neural networks with distinct playing personalities (Leelenstein, the Gyal series, endgame-specialist distilled nets), upload your own ONNX weights, or run everything offline with no account. The product thesis is specifically "network-as-opponent" — the identity of which net you're playing is the experience, not a detail. Neither big site exposes that because their product is "you vs. strong AI," not "you vs. this specific model's style." Inferred.
- Fully client-side, no server compute — a trivial server-side Lc0 deployment would handle inference better (GPUs, predictable perf). Client-side means no account, no backend cost, no rate limits, no data leaving the browser, and the app works on a plane. It also means the app is trivially shareable — anyone with the URL gets a full chess lab with 50+ neural engines. The product becomes a static site + R2 bucket; ops complexity is nil. Inferred.
- "Play against a personality" as the thesis, not "play the strongest AI" — if the goal were strength, Stockfish-WASM wins and the project is 10× simpler. Choosing personality means shipping a network catalog with human-readable names ("Brawler," "Wild Style," "Endgame Drill"), not just Elo ratings. It also makes the Maia 2200 Hunter net (from the hunter-chessbot project) a first-class feature, not a curiosity — you can play against a fine-tuned model of a specific human. Evidenced in curated default network list, §2.4.
- Shipping 40+ networks with curated personalities, not a single strong one — every extra network adds hosting cost (BT4 alone is 707 MB), benchmark-run cost (memory profiling per network for tournament eviction), and QA surface (each ONNX has subtly different input/output shapes). Paid because the product is the catalog — a single-net release would be a weaker version of something chess.com already ships. Inferred.
- 0-node / depth-0 as a first-class play mode — classical engine products treat depth 0 as "broken." For Lc0 networks it's meaningful: BT4 at 1 node is ~2500–2700 Elo (strong IM to GM). Shipping it as a real option (not a debug setting) reveals the thesis — the value is in the network's judgment, not search depth. It also makes the product fast enough to be pleasant on weak hardware. Evidenced in §2.2.
- Tournament mode at all — optional feature that complicates scope (LRU engine pool, concurrent inference scheduling, shareable results, 2,474-line runner). Pays off because it turns "play one game vs. one net" into "study which nets beat which" — the catalog becomes an object of investigation, not just a picker. It's also the UI frame where the personality thesis pays off most clearly. Inferred.
8.8 Additional architectural tradeoffs worth naming
- Lc0 (policy-first NN) over Stockfish-WASM (handcrafted + NNUE) — Stockfish is stronger, more battle-tested, and smaller on the wire. But Stockfish plays one way; Lc0 networks give 53 distinct personalities (Maia human-style, the Gyal series, Leelenstein, endgame specialists). The product thesis — "play against a personality, not just a strength" — is only possible with swappable NN weights. Inferred.
- Reimplementing MCTS + encoder/decoder in TypeScript instead of compiling lc0 to WASM: lc0-to-WASM was one of five approaches analyzed (§8.3). Chose TS reimplementation: MCTS runs on the JS heap with direct access to `chess.js`, avoids the lc0 build toolchain, and keeps the codebase debuggable in a browser devtools session. Cost: every bug in §9.1 (policy index ordering, promotion encoding, board flip, history ordering, halfmove divisor) is a bug lc0.wasm wouldn't have had. Evidenced in §8.3.
- COOP/COEP deployment cost for SharedArrayBuffer: SAB requires cross-origin isolation, which breaks third-party embeds, analytics, and some fonts/iframes. On Cloudflare Pages this means a `_headers` file that's silently broken on any misconfiguration. Product decision: pay the embed/analytics cost to unlock SIMD-threaded WASM inference. Evidenced in MEMORY.md.
- `chess.js` replayed along each MCTS path instead of make/unmake: every selection step rebuilds the board from root. At 100–800 nodes with depth ~20, that's O(N·d) move replays per search, likely a major share of the 80–100 nps ceiling on small nets. `chess.js` has no unmake; writing a correct one (castling rights, ep, 50-move, 3-fold repetition) is a week of effort and bugs. Chose perf regression over correctness risk. Evidenced in `mcts.ts`.
- Temperature 0.15 default, not 0 (argmax): AI proposed 0 for strongest play; I overrode: deterministic play is predictable/boring, you lose the "every game feels different" product experience. 0.15 is tuned for "feels alive without throwing games." PR #11 tried broader temperature sampling and was reverted. Evidenced in §10.2.
- No router, just a screen state machine in `App.tsx` for 5 screens: adding React Router / TanStack Router for five states (`home`, `game`, `tournament`, `share-loading`, `share-confirm`) would force URL semantics that don't fit, since the share URL is a set of query parameters carrying game configuration (§7.4), not a route. Clean "no-lock-in" stance; swap later if flows grow. Inferred.
8.9 Code-level tradeoffs visible in the source
- One dedicated Web Worker per engine instance, not a shared worker with a job queue: `Lc0Engine` in `workerInterface.ts` spawns `new Worker(...)` per instance with a pub-sub API for streaming state plus per-kind pending-promise slots. A shared worker would serialize inference or require a request-ID bookkeeping layer the tournament pool already duplicates. Tournament mode needs N parallel engines with independent ONNX sessions; the one-worker-per-engine invariant is what makes that cheap. Evidenced.
- Tournament runner as a single 2,474-line hook with `useRef` for live state, not a reducer or external store: `useTournamentRunner.ts` keeps `engineMapRef`, `engineLastUsedAtRef`, `matchAbortMapRef`, `stateRef` mirroring React state on every render via a custom imperative `setRuntime` updater. The runner is an async state machine with `Promise.race`-scheduled concurrent games and eviction, patterns that fight React's render-driven reconciliation. Refs give synchronous read-after-write; a reducer couldn't without threading every continuation through dispatch. Cost: monolith no one else has to read. Evidenced.
- LRU eviction by next-use distance (Bélády's optimal), not recency: `selectEvictionCandidate` picks the engine whose next scheduled use is farthest in the sorted upcoming-match list, tie-broken by oldest `lastUsedAt`. Plain recency LRU is a one-liner. Chess tournaments have a known future schedule (round-robin pairings), so Bélády's optimal is actually computable, and recency would thrash hard because every engine gets touched once per round. Evidenced in `useTournamentRunner.ts`.
- Share URLs as plain query params (`?network=foo&fen=...`), not a compressed encoded payload: `shareParams.ts` uses four named params and rejects unknown network ids. A base64/LZ blob holding full `GameConfig` + PGN would allow richer sharing, but human-readable URLs are debuggable and validate cheaply. The `share-loading` → `share-confirm` → `game` screen chain exists specifically because large models require user consent before a 707 MB download; a compressed blob couldn't surface that decision. Evidenced.
- Two separate IndexedDB databases, `lc0-model-cache` and `play-lc0-tournaments`: model blobs (5–700 MB) and small game records live in separate databases, one via the `idb` wrapper and one via raw IDB. Browsers can GC the heavy DB independently without touching the light one. Tournament IDs are a content fingerprint `t-${fnv1a(entrants + matches)}` so resuming a running tournament updates the same record without dedup logic. Evidenced.
- Policy index as a 1,858-entry hand-baked TypeScript literal, not programmatic generation: `policyIndex.ts` is a copy of the canonical lc0 training-time move ordering from the `hunter-chessbot` reference repo. The AI's original programmatic generation shipped a subtly-wrong ordering that caused "playing random bullshit." The table is the contract with lc0, not a computation; shipping the canonical artifact trades ~60 KB of source for zero chance of regression on move ordering. Evidenced in §9.1 bug story.
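The next-use-distance eviction from the list above can be sketched as a pure function. The name `selectEvictionCandidate` appears in the source, but this signature, the `PooledEngine` shape, and the `protectedIds` parameter (engines currently in a game) are assumptions for illustration.

```typescript
interface PooledEngine {
  id: string;
  lastUsedAt: number; // timestamp of last use
}

/**
 * Pick the engine whose next scheduled use is farthest away.
 * Engines that never appear again have infinite distance; ties go to the oldest lastUsedAt.
 */
function selectEvictionCandidate(
  pool: PooledEngine[],
  upcomingMatches: ReadonlyArray<readonly [string, string]>,
  protectedIds: ReadonlySet<string>,
): PooledEngine | undefined {
  let best: PooledEngine | undefined;
  let bestDistance = -1;
  for (const engine of pool) {
    if (protectedIds.has(engine.id)) continue; // never evict an engine that is mid-game
    const index = upcomingMatches.findIndex(([a, b]) => a === engine.id || b === engine.id);
    const distance = index === -1 ? Infinity : index;
    const fartherAway = distance > bestDistance;
    const olderOnTie =
      best !== undefined && distance === bestDistance && engine.lastUsedAt < best.lastUsedAt;
    if (best === undefined || fartherAway || olderOnTie) {
      best = engine;
      bestDistance = distance;
    }
  }
  return best;
}
```

The approach depends on the pairing schedule being known in advance, which is the case for round robin, whereas later Swiss rounds are only known once the current round finishes.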
9. Major Bugs & Debugging Stories
9.1 "Playing Random Bullshit" — Six Encoding Bugs (Feb 5)
The biggest debugging effort in the project. After the initial build, I reported the engine was playing nonsensical moves despite correct WDL evaluation. Six separate bugs were found:
- Policy index table: Programmatic generation produced wrong move ordering. Fixed by using pre-generated 1858-entry array from my reference repo.
- Promotion encoding: Inverted (queen treated as normal, n/b/r as underpromotions). Correct: q/r/b have explicit suffixes, knight uses bare 4-char move.
- Move flipping for black: Was flipping square indices instead of UCI string ranks.
- History ordering: Taking first 7 positions instead of most recent 7.
- FenHistory initialization: Started as `[]` instead of `[startFEN]`.
- Halfmove clock: Divided by 100.0, should be 99.0.
I identified that the value head was correct but the policy head was wrong, narrowing the investigation. I pointed to my working hunter-chessbot as the authoritative reference.
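Three of the fixes listed above are small enough to show in isolation. These helpers are hypothetical (they are not the functions in the repository) and only illustrate the corrected behavior: knight promotions use the bare four-character move, history keeps the most recent positions, and the halfmove clock is divided by 99.

```typescript
/** Lc0's policy strings use a bare 4-char move for knight promotions; q/r/b keep their suffix. */
function toPolicyUci(uci: string): string {
  return uci.length === 5 && uci.charAt(4) === "n" ? uci.slice(0, 4) : uci;
}

/** Keep the most recent `count` positions (the bug kept the first ones instead). */
function recentHistory(fenHistory: string[], count: number): string[] {
  return count <= 0 ? [] : fenHistory.slice(-count);
}

/** Value written to the rule-50 plane. */
function rule50PlaneValue(halfmoveClock: number): number {
  return halfmoveClock / 99.0;
}
```

Each of these is invisible to the value head in typical positions, which fits the observation that evaluations looked correct while the chosen moves did not.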
9.2 Bus Error in ONNX Conversion (Feb 6)
My custom fine-tuned Maia model crashed with Bus error: 10 during lc0 leela2onnx conversion. Deep investigation:
- Compared hex dumps of working base model vs my model
- Found my model had extra `training_params` fields (policy_loss, accuracy)
- Traced crash to `FloatOnnxWeightsAdapter::GetRawData()`: KERN_PROTECTION_FAILURE at memory boundary
- Root cause: Bug in lc0 v0.32.1 handling models with training_params populated (needed v0.21.0+)
9.3 useEffect Anti-Patterns (Feb 9)
I identified pervasive useEffect problems: "what the fuck are these useEffects for?" Led to a comprehensive rewrite of OpeningPicker (removed all 3 useEffects, removed open prop entirely, parents conditionally render instead). Also fixed a flickering bug where NetworkPicker's selection oscillated due to re-resolve effects depending on [networks, selected.id] and running before localStorage writes completed.
9.4 Vite .gz Interception
Gzipped model files couldn't be served in development because Vite's sirv middleware treated .gz files as pre-compressed assets. Solved by renaming to .onnx.bin.
10. AI Agent Involvement
10.1 Session Data
| Metric | Value |
|---|---|
| Sessions | 9 session directories |
| Subagent files | 63 JSONL files |
| Total dialog lines | ~3,371 |
| Date range | Feb 5–13, 2026 |
10.2 My Direction
I provided a comprehensive spec file (lc0-browser-chess-spec.md) upfront describing a 4-phase plan. I directed the architecture, identified bugs by testing, and pointed the AI to my working reference implementation (hunter-chessbot) when the AI's code was wrong.
Key corrections I made:
- Policy encoding: AI's programmatic generation was fundamentally wrong — I directed using my reference repo's pre-generated table
- useEffect quality: I identified anti-patterns the AI had written and directed a comprehensive rewrite ("fix all your code that is dog shit. check over it and do refactors where necessary, but with a brain this time")
- Temperature default: AI suggested 0; I pointed out that makes moves deterministic and predictable, changed to 0.15
- Modal vs page navigation: AI suggested page-based game detail view; I directed modal overlay instead
- Board size: I specified exact sizing (`min(90vh, 90vw)`)
- MCTS architecture: I pushed back on AI's recommendation to build in-repo, noting batched MCTS is logically a library. AI conceded.
- Auto health checks: AI added automatic engine health checks on modal open; I directed removing them
10.3 AI's Execution
The AI handled implementation, model conversion research, and the bulk of the coding. It deployed 3 parallel research subagents at project start to investigate Lc0 encoding, policy output format, and ONNX Runtime configuration requirements. The tournament system (2,474 lines in useTournamentRunner.ts alone) was largely AI-generated, with me directing the feature requirements and correcting UI decisions.
11. Development Timeline
| Date | Commits | Key Achievement |
|---|---|---|
| Feb 5 | 3 | First working app: Lc0 in browser, encoding bugs found and fixed |
| Feb 6 | 14 | 30 networks added, Maia series, Git LFS → R2 pivot, model compression, game saving |
| Feb 7 | 3 | Custom ONNX upload (PR #3), useEffect audit (PR #4), tournament mode (PR #1) |
| Feb 9-10 | 5 | Opening book system (PR #5, 15K+ openings), FIDE performance ratings (PR #6), modal polish (PR #7) |
| Feb 12 | 1 | Shareable game URLs (PR #8) |
| Feb 13 | 4 | MCTS search (PR #9), per-entrant tournament settings |
| Feb 23 | 3 | Temperature sampling attempt + revert (PR #11, still open) |
Key Velocity Facts
- First working app with neural network inference: 2 hours from initial commit
- 53 networks cataloged and converted: 1 day
- Full tournament mode with Swiss/round-robin: 1 day (PR #1, 6 commits)
- Opening book with 15K+ positions: 2 days (PR #5, 4 commits)
- MCTS search engine: 1 day (PR #9)
12. Key Files Reference
Engine
| File | Lines | Purpose |
|---|---|---|
| `engine/mcts.ts` | ~250 | MCTS search (PUCT selection, expansion, backpropagation) |
| `engine/inference.ts` | ~120 | ONNX Runtime session management, WebGPU/WASM |
| `engine/encoding.ts` | ~180 | FEN → [1,112,8,8] tensor encoding |
| `engine/decoding.ts` | ~80 | 1858 policy logits → legal moves with temperature |
| `engine/policyIndex.ts` | ~1900 | Pre-generated 1858 UCI move lookup table |
| `engine/worker.ts` | ~200 | Web Worker: model loading, inference, MCTS |
| `engine/workerInterface.ts` | ~150 | Main-thread Lc0Engine class (pub-sub + promises) |
| `engine/modelCache.ts` | ~60 | IndexedDB model caching |
Tournament
| File | Lines | Purpose |
|---|---|---|
| `hooks/useTournamentRunner.ts` | 2474 | Complete tournament lifecycle management |
| `lib/tournament/pairings.ts` | ~150 | Round-robin (Berger tables) + Swiss pairings |
| `lib/tournament/standings.ts` | ~100 | Match/game points, Buchholz, sorting |
| `lib/tournament/performanceRating.ts` | ~80 | FIDE dp lookup table + computation |
Data
| File | Lines | Purpose |
|---|---|---|
| `constants/networks.ts` | ~800 | 53 network definitions with metadata |
| `lib/openingBook.ts` | ~50 | Trie-based opening book lookup |
| `data/openings.ts` | ~20 | Lazy-loaded ECO opening database |
UI
| File | Lines | Purpose |
|---|---|---|
| `components/GameScreen.tsx` | ~500 | Active game view with board, controls, history |
| `components/NetworkPicker.tsx` | ~600 | Network selection, download, configuration |
| `components/TournamentLiveScreen.tsx` | ~500 | Live tournament view with standings |
| `components/OpeningPicker.tsx` | ~400 | Opening book selection modal |