Thorin

Play Lc0 — Deep Technical Profile

Build timeline: ~9 active days across 3 phases (Feb 5 – Feb 23, 2026, 19 calendar days)

  1. Initial app + model management (2 days) — browser chess vs Leela, 40+ networks, LFS, R2 hosting, game history, UI polish
  2. Tournament mode + UX hardening (2 days) — tournament mode, live boards, concurrency scheduling, engine eviction, custom ONNX upload, Rolldown bundler, useEffect fixes
  3. Openings + ratings + search + sharing (5 active days, after 11-day gap) — 15K+ opening book, ECO picker, FIDE performance ratings, game detail modal, shareable URLs, MCTS search with time limits, temperature sampling experiment (reverted)

Table of Contents

  1. Project Overview
  2. Pre-Implementation Research & Spec
  3. Architecture
  4. The Engine: MCTS + Neural Network Inference
  5. Neural Network Catalog
  6. Tournament System
  7. UI & Game Flow
  8. Technical Tradeoffs & Decisions
  9. Major Bugs & Debugging Stories
  10. AI Agent Involvement
  11. Development Timeline
  12. Key Files Reference

1. Project Overview

A fully client-side web application that lets you play chess against Leela Chess Zero (Lc0) neural networks running entirely in the browser. All inference happens locally via ONNX Runtime Web (WebGPU with WASM fallback) — no server-side computation.

By the numbers:

  • ~16,000 lines of TypeScript/TSX across 51 files
  • 36 commits across 11 PRs (9 merged, 1 closed, 1 open), built in 19 days (Feb 5–23, 2026)
  • 53 neural network models from ~800 to ~2900 Elo
  • MCTS search with configurable node budget and temperature
  • Swiss and round-robin tournament mode with FIDE performance ratings
  • 15,000+ opening positions (full ECO database)
  • Custom ONNX model upload with verification
  • Shareable game URLs via query parameters
  • Models hosted on Cloudflare R2; app deployed to Cloudflare Pages

Libraries & Frameworks

All code runs in the browser — there is no backend service. The engine lives in a Web Worker.

UI (React web app)

  • React 19 + react-dom — UI and game-flow components.
  • Vite (rolldown-vite) + @vitejs/plugin-react — dev server and production bundler (rolldown variant for faster builds).
  • vite-plugin-static-copy — copies ONNX Runtime's WASM artifacts into the build output.
  • Tailwind CSS 4 + @tailwindcss/vite — utility-first styling.
  • lucide-react — icons.

Chess UI & logic

  • react-chessboard 5.8 — drag-and-drop chessboard component.
  • chess.js 1.4 — move generation, legality, FEN/PGN handling.

Neural inference (engine worker)

  • onnxruntime-web 1.24 — runs the LC0 ONNX networks in the browser with WebGPU (preferred) or WASM fallback.
  • onnxruntime-node 1.24 (dev) — used only for benchmarking memory/perf of models on Node before shipping.
  • The MCTS implementation, policy-index mapping (1858 UCI moves), and board-to-[1,112,8,8] encoding are hand-rolled in engine/*.ts — no third-party chess-AI library.

Client-side persistence

  • idb 8 — thin async wrapper around IndexedDB; used by modelCache.ts to store decompressed ONNX networks so they don't re-download each session.

Tooling

  • TypeScript 5.9 — types across the UI + engine worker.
  • ESLint 9 + typescript-eslint + react-hooks/react-refresh plugins + globals — lint config.
  • Type defs: @types/react, @types/react-dom, @types/node.

2. Pre-Implementation Research & Spec

Before any code was written, I conducted extensive research into which Lc0 networks would work in a browser, how policy-only (0-node) play performs, and what the full architecture should look like. This research produced a 676-line implementation spec that was passed directly to Claude Code as the project blueprint.

2.1 Network Feasibility Analysis

I researched the full Lc0 network ecosystem and categorized models by browser feasibility:

Tier 1 — Tiny CNNs (~50 KB to a few MB): dkappe distilled series (Tiny Gyal, etc.). Run on any device, even WASM-only without WebGPU. Useful for low-end users and mobile.

Tier 2 — Small/medium CNNs (64x6 to 128x10): SE residual networks. Strong but compact. Fast inference, low download size. Good for mobile and modest hardware.

Tier 3 — Standard CNNs (192x15 to 320x24): Mainline Lc0 sizes. Need WebGPU for reasonable performance. Downloads get large (50-200 MB).

Tier 4 — Transformers (the standout): T1-256x10-distilled at ~65 MB FP16 has a dramatically stronger policy head than any residual CNN. At 1 node, its policy head alone should be in the ~2600-2800 Elo range. This became the "best practical browser net."

Tier 5 — Big nets (BT4, 768+ filters): Desktop-native territory. 707 MB for BT4. Needs ~4 GB VRAM. Works in browser with WebGPU on high-end hardware.

2.2 Policy-Only (0-Node) Strength Research

I researched how strong Lc0 is without any search — just the raw policy head picking the highest-probability move. This was critical because depth-0 play was designed as a first-class feature, not an afterthought.

Key data points gathered:

  • Older convolutional nets (2021, net 67743): above 2200 Elo at 1 node — enough to trouble a human master
  • Latest BT4 transformer: nearly 300 Elo stronger in raw policy than the strongest CNN (T78), with fewer parameters
  • Wikipedia (Nov 2024): Lc0 models achieving "grandmaster-level strength at one position evaluation per move"
  • Lc0 team claims "grandmaster" policy strength for BT3/BT4

Best estimate for BT4 at 1 node: roughly 2500-2700 Elo (strong IM to GM level). This validated the design decision to support depth-0 as a meaningful play mode — a user playing against T1-256x10-distilled at 0-node gets a genuine chess opponent, not a toy.

2.3 Prior Art: MaiaChess Browser Implementation

I studied the MaiaChess web platform as a working precedent. Maia uses a dual-engine architecture running entirely client-side: Maia neural network models converted to ONNX and run via onnxruntime-web, with Stockfish running alongside via WebAssembly for comparison analysis. Platform built with Next.js, TypeScript, React Context.

This confirmed the technical path: ONNX conversion via lc0 leela2onnx → onnxruntime-web → chess.js for board logic. I had already fine-tuned my own Maia model (hunter-chessbot project), so I knew the ONNX conversion pipeline worked.

2.4 The 676-Line Implementation Spec

My research culminated in a comprehensive 4-phase implementation spec covering every architectural decision:

Phase 1: Foundation & Depth-0 Only (the MVP)

  • Project setup: React + TypeScript + Vite, react-chessboard + chess.js, onnxruntime-web with WebGPU (WASM fallback)
  • Weight conversion pipeline: offline lc0 leela2onnx conversion, optional FP16 quantization, CDN hosting
  • Board encoding: the 112-plane representation (104 history + 8 auxiliary), replicating lc0/src/neural/encoder.cc exactly
  • Policy decoding: the 1858-element vector → UCI moves, replicating lc0/src/chess/board.cc indexing
  • Milestone: "Leela outputs a legal move at depth 0"

Phase 2: MCTS Search (Depth > 0)

  • MCTS with PUCT (AlphaZero variant), node budget controls (10/100/1000)
  • Tree structure with visit counts, prior probabilities, value estimates
  • Selection → expansion → backup loop
  • Web Worker for non-blocking inference

Phase 3: Multiple Networks & Smart Loading

  • IndexedDB caching (no re-download), network switching UI
  • Engine scheduler managing concurrent sessions with memory pressure handling
  • Download progress indicators

Phase 4: Polish & Features

  • Eval bar, policy visualization, PGN export, mobile responsiveness

Risk mitigations were explicitly planned:

  • Board encoding wrong → compare against lc0's actual encoder output for known positions
  • Policy decoding wrong → same approach
  • WebGPU not available → WASM fallback, smaller nets
  • MCTS too slow in JS → start with low node counts, batch inference, SharedArrayBuffer

Curated default network selection: 11 networks spanning ~800-2900 Elo with distinct playing personalities — "Brawler" (Bad Gyal 8), "Wild Style" (Mean Girl 8), "Endgame Drill" (Ender), giving users meaningfully different opponents.

The spec was designed to be self-contained: it included network download links, architecture glossary (NxM filters × residual blocks, SE = Squeeze-Excite, SWA = Stochastic Weight Averaging, distilled = smaller net mimicking larger), and references to specific lc0 source files for encoder/decoder validation.


3. Architecture

┌─────────────────────────────────────────────────────┐
│  React 19 + Tailwind CSS                             │
│  ├── HomeScreen (network picker, game history)       │
│  ├── GameScreen (board, controls, move history)      │
│  └── TournamentPage (setup, live view, standings)    │
└──────────────────┬──────────────────────────────────┘
                   │ postMessage (Web Worker)
┌──────────────────▼──────────────────────────────────┐
│  Web Worker                                          │
│  ├── ONNX Runtime Web (WebGPU / WASM fallback)      │
│  ├── MCTS Search (PUCT selection, backpropagation)   │
│  ├── Board Encoding (FEN → [1,112,8,8] tensor)      │
│  └── Policy Decoding (1858 logits → legal moves)     │
└──────────────────┬──────────────────────────────────┘
                   │ fetch + IndexedDB cache
┌──────────────────▼──────────────────────────────────┐
│  Cloudflare R2 (model hosting, 25MB–707MB per model) │
└─────────────────────────────────────────────────────┘

Key architectural boundaries:

  • Main thread: React UI, game state (chess.js), opening book lookup, persistence (localStorage + IndexedDB)
  • Web Worker: All neural network inference and MCTS search. Communicates via typed message protocol (WorkerRequest/WorkerResponse)
  • Cloudflare R2: Model storage. Models are gzip-compressed .onnx.bin files. Downloaded on demand, decompressed via DecompressionStream, cached in IndexedDB

4. The Engine: MCTS + Neural Network Inference

4.1 Board Encoding (encoding.ts)

Converts chess positions to the Lc0 input format: a [1, 112, 8, 8] Float32 tensor (7,168 elements).

112 input planes:

  • Planes 0–103: 13 planes × 8 history positions. Per position: 6 own piece types + 6 opponent piece types + 1 repetition flag
  • Planes 104–107: Castling rights (our queenside, our kingside, opponent queenside, opponent kingside)
  • Plane 108: Is black to move (1.0 if black, 0.0 if white)
  • Plane 109: Rule50 count / 99.0
  • Plane 110: Zeros (move count, disabled)
  • Plane 111: All ones

Perspective flipping: The network always sees the position from the side-to-move's perspective. When it's black's turn, piece ownership is swapped, the board is vertically flipped (rank = 7 - rank), and castling rights are swapped. This is the standard Lc0 convention.
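
The plane layout and perspective flip can be illustrated with a minimal sketch. It is not the project's encoding.ts: `encodePiecePlanes` is a hypothetical helper that fills only the 12 piece planes of the current position plus planes 108 and 111, and the within-slot plane order (own P, N, B, R, Q, K, then opponent) and the `rank * 8 + file` square layout are assumptions.

```typescript
// Hypothetical sketch: piece planes for history slot 0 only.
// Assumed plane order per slot: our P,N,B,R,Q,K then their P,N,B,R,Q,K.
const PIECE_ORDER = "pnbrqk";
const PLANES = 112;
const SQUARES = 64;

function encodePiecePlanes(fen: string): Float32Array {
  const fields = fen.split(" ");
  const placement = fields[0];
  const blackToMove = fields[1] === "b";
  const tensor = new Float32Array(PLANES * SQUARES); // [1, 112, 8, 8] flattened

  const rows = placement.split("/"); // rows[0] is rank 8
  for (let r = 0; r < 8; r++) {
    const row = rows[r];
    let file = 0;
    for (let i = 0; i < row.length; i++) {
      const ch = row[i];
      if (ch >= "1" && ch <= "8") {
        file += Number(ch); // run of empty squares
        continue;
      }
      const isWhite = ch === ch.toUpperCase();
      const ours = isWhite !== blackToMove; // piece belongs to the side to move
      const type = PIECE_ORDER.indexOf(ch.toLowerCase());
      let rank = 7 - r; // 0 = rank 1
      if (blackToMove) rank = 7 - rank; // vertical flip when black is to move
      const plane = (ours ? 0 : 6) + type;
      tensor[plane * SQUARES + rank * 8 + file] = 1;
      file++;
    }
  }
  // Plane 108: black-to-move flag. Plane 111: all ones.
  if (blackToMove) tensor.fill(1, 108 * SQUARES, 109 * SQUARES);
  tensor.fill(1, 111 * SQUARES, 112 * SQUARES);
  return tensor;
}
```

With this layout, the side to move always sees its own pawns starting on the second row of plane 0, whichever color it is.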

4.2 Policy Decoding (decoding.ts, policyIndex.ts)

The neural network outputs 1858 policy logits — one per possible move in Lc0's compressed move encoding.

POLICY_INDEX: A pre-generated array of 1858 UCI move strings. The array index IS the policy output neuron index. This was initially generated programmatically by the AI, but the output was wrong — I directed using the reference table from my hunter-chessbot repo instead.

Decoding flow:

  1. For each legal move, flip to white perspective if black (via flipUci)
  2. Look up in POLICY_INDEX_MAP (reverse map: UCI string → index)
  3. Apply softmax with temperature scaling: exp((logit - max) / temp)
  4. If temperature > 0: sample from distribution. If temperature = 0: pick argmax
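
The decoding steps above can be sketched as follows. `flipUci` is named in the text, but its body here is an assumption, and `legalMoveProbabilities` and `sampleIndex` are hypothetical names. The sketch treats temperature 0 as argmax before dividing, so the softmax never divides by zero.

```typescript
// Mirror a UCI move vertically (rank r -> 9 - r) so black's moves index the
// white-perspective policy table. Files and any promotion suffix are unchanged.
function flipUci(uci: string): string {
  const flipRank = (c: string): string => String(9 - Number(c));
  return uci[0] + flipRank(uci[1]) + uci[2] + flipRank(uci[3]) + uci.slice(4);
}

// Softmax over the logits of the legal moves only, with temperature scaling.
// temperature === 0 yields a one-hot argmax distribution.
function legalMoveProbabilities(logits: number[], temperature: number): number[] {
  const max = Math.max(...logits);
  if (temperature === 0) {
    const best = logits.indexOf(max);
    return logits.map((_, i) => (i === best ? 1 : 0));
  }
  const weights = logits.map((l) => Math.exp((l - max) / temperature));
  const total = weights.reduce((a, b) => a + b, 0);
  return weights.map((w) => w / total);
}

// Sample an index from a probability vector; `rand` is injectable for testing.
function sampleIndex(probs: number[], rand: () => number = Math.random): number {
  let r = rand();
  for (let i = 0; i < probs.length; i++) {
    r -= probs[i];
    if (r < 0) return i;
  }
  return probs.length - 1; // guard against floating-point drift
}
```

Subtracting the maximum logit before exponentiating keeps `Math.exp` from overflowing at low temperatures and does not change the resulting distribution.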

4.3 MCTS Algorithm (mcts.ts)

Tree node structure:

  • move, parent, children: Map<string, MCTSNode>
  • prior (policy network probability), visits (N), totalValue (W)
  • wdlSum: [win, draw, loss] accumulated over visits
  • expanded, terminal, terminalValue

PUCT selection (cPUCT = 2.5):

score = -Q(child) + cPUCT × prior × sqrt(parentVisits) / (1 + childVisits)

Q is negated because a child's value is from the opponent's perspective.
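
As a minimal sketch, the PUCT formula can be written directly over the node fields listed above. The choice of Q = 0 for unvisited children is an assumption (engines differ on first-play urgency), and `puctScore` and `selectChild` are hypothetical names rather than the functions in mcts.ts.

```typescript
interface NodeStats {
  prior: number;      // P: policy probability from the network
  visits: number;     // N
  totalValue: number; // W, from the child's own side-to-move perspective
}

const C_PUCT = 2.5;

// Mean value Q = W / N, taken as 0 for an unvisited child (assumption).
function meanValue(child: NodeStats): number {
  return child.visits === 0 ? 0 : child.totalValue / child.visits;
}

function puctScore(child: NodeStats, parentVisits: number): number {
  const exploration =
    (C_PUCT * child.prior * Math.sqrt(parentVisits)) / (1 + child.visits);
  return -meanValue(child) + exploration; // negate: child Q is the opponent's view
}

// Returns the child with the highest PUCT score, or undefined if there are none.
function selectChild<T extends NodeStats>(children: T[], parentVisits: number): T | undefined {
  let best: T | undefined;
  let bestScore = -Infinity;
  for (const child of children) {
    const score = puctScore(child, parentVisits);
    if (score > bestScore) {
      bestScore = score;
      best = child;
    }
  }
  return best;
}
```

For example, with parentVisits = 4, an unvisited child with prior 0.5 scores 2.5, while a child with the same prior, 3 visits, and W = -1.5 scores 0.5 + 0.625 = 1.125, so the unvisited child is explored first.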

Search loop (mctsSearch):

  1. Create root, expand it (run inference)
  2. For each iteration (up to nodeLimit or timeLimitMs):
    • Select: Walk down the tree picking highest PUCT child until reaching an unexpanded non-terminal leaf. A fresh Chess instance is replayed along the path.
    • Expand: Run neural network inference on the leaf. Create child nodes with priors. Store evaluation (value = wdl[0] - wdl[2]).
    • Backpropagate: Walk back to root, negating value at each level. WDL is flipped (win↔loss) at each level.
  3. Progress callback every 10 iterations.
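
The backpropagation step can be sketched as below, assuming the node fields from the tree structure above. `TreeNode` and `backpropagate` are illustrative names, not the actual mcts.ts code.

```typescript
interface TreeNode {
  parent: TreeNode | null;
  visits: number;
  totalValue: number;
  wdlSum: [number, number, number]; // [win, draw, loss]
}

// Propagate a leaf evaluation up to the root. `value` and `wdl` are from the
// perspective of the side to move at the leaf, so both flip at every level.
function backpropagate(leaf: TreeNode, value: number, wdl: [number, number, number]): void {
  let node: TreeNode | null = leaf;
  let v = value;
  let w = wdl[0];
  const d = wdl[1];
  let l = wdl[2];
  while (node !== null) {
    node.visits += 1;
    node.totalValue += v;
    node.wdlSum[0] += w;
    node.wdlSum[1] += d;
    node.wdlSum[2] += l;
    v = -v;           // what is good for one side is bad for the other
    const tmp = w;    // swap win and loss; draws are symmetric
    w = l;
    l = tmp;
    node = node.parent;
  }
}
```

Because each node accumulates value from its own side-to-move perspective, the parent reads a child's strength as -Q, which matches the negation in the PUCT score.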

Move selection: After search, if temperature = 0: pick most-visited move (argmax). If temperature > 0: sample proportional to visits^(1/temperature).
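
A minimal sketch of this selection rule, with an injectable random source so it can be tested deterministically. `pickMove` and `RootChild` are hypothetical names.

```typescript
interface RootChild {
  move: string;
  visits: number;
}

function pickMove(
  children: RootChild[],
  temperature: number,
  rand: () => number = Math.random,
): string {
  if (children.length === 0) throw new Error("no legal moves");
  if (temperature === 0) {
    // Argmax over visit counts; the first child wins ties.
    return children.reduce((a, b) => (b.visits > a.visits ? b : a)).move;
  }
  // Sample proportionally to visits^(1/temperature).
  const weights = children.map((c) => Math.pow(c.visits, 1 / temperature));
  const total = weights.reduce((a, b) => a + b, 0);
  let r = rand() * total;
  for (let i = 0; i < children.length; i++) {
    r -= weights[i];
    if (r < 0) return children[i].move;
  }
  return children[children.length - 1].move; // fallback for rounding or zero visits
}
```

Lower temperatures sharpen the distribution: with visit counts 30 and 10, temperature 1 picks the second move 25% of the time, while temperature 0.5 picks it 10% of the time.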

Performance: ~80-100 nodes/sec on small nets, ~8-10 on large nets (single-node, unbatched).

Roadmap (Phase 2, not yet implemented): Batched MCTS with virtual loss for branch diversity, batch collection into [B, 112, 8, 8] tensor, expected 5-8× throughput.

4.4 Inference (inference.ts)

Execution provider selection: Checks navigator.gpu — if present, tries ["webgpu", "wasm"]; otherwise ["wasm"] only.

Output head discovery: Dynamically matches output tensor names: the policy head contains "policy", the WDL head contains "wdl", and the value head contains "value" but not "wdl". If no WDL head exists, synthesizes WDL from the value head as [(v+1)/2, 0, (1-v)/2].
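
The provider choice and head discovery can be sketched as pure functions. The function names are hypothetical and the case-insensitive matching is an assumption. In the real inference.ts the provider list would be passed to ONNX Runtime Web when creating the session, which is not shown here.

```typescript
// Prefer WebGPU when the browser exposes it; always keep WASM as a fallback.
function chooseProviders(hasWebGpu: boolean): string[] {
  return hasWebGpu ? ["webgpu", "wasm"] : ["wasm"];
}

interface HeadNames {
  policy: string | undefined;
  wdl: string | undefined;
  value: string | undefined;
}

// Match output names by substring; a "value" head must not also contain "wdl".
function discoverHeads(outputNames: string[]): HeadNames {
  const find = (pred: (key: string) => boolean): string | undefined =>
    outputNames.find((name) => pred(name.toLowerCase()));
  return {
    policy: find((k) => k.includes("policy")),
    wdl: find((k) => k.includes("wdl")),
    value: find((k) => k.includes("value") && !k.includes("wdl")),
  };
}

// Synthesize [win, draw, loss] from a scalar value v in [-1, 1].
function wdlFromValue(v: number): [number, number, number] {
  return [(v + 1) / 2, 0, (1 - v) / 2];
}
```

The synthesized triple always sums to 1 and keeps win - loss equal to v, so code that computes value = wdl[0] - wdl[2] behaves the same for both kinds of network.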

4.5 Worker Protocol

Request types: init (model URL), getBestMove (single inference, no search), evaluatePosition (WDL only), mctsSearch (full MCTS)

Response types: ready, initProgress, initError, bestMove, evaluation, mctsResult, mctsProgress, error

Lc0Engine class (main-thread API): Wraps the Web Worker with a pub-sub state pattern and promise-based request/response. Only one request of each type can be in-flight at a time.
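
One way to express the protocol and the one-in-flight-per-type rule is a discriminated union plus a per-kind pending slot. This is a sketch only: the request type names come from the list above, but the payload field names, the `PendingSlots` class, and the choice to reject (rather than replace) a duplicate request are all assumptions.

```typescript
// Message shapes are illustrative; the payload field names are assumptions.
type WorkerRequest =
  | { type: "init"; modelUrl: string }
  | { type: "getBestMove"; fen: string; temperature: number }
  | { type: "evaluatePosition"; fen: string }
  | { type: "mctsSearch"; fen: string; nodeLimit: number; timeLimitMs: number };

type RequestKind = WorkerRequest["type"];

interface Slot {
  resolve: (value: unknown) => void;
  reject: (error: Error) => void;
}

// One pending promise per request kind: a second request of the same kind is
// rejected while the first is still in flight.
class PendingSlots {
  private slots = new Map<RequestKind, Slot>();

  begin(kind: RequestKind): Promise<unknown> {
    if (this.slots.has(kind)) {
      return Promise.reject(new Error(`${kind} already in flight`));
    }
    return new Promise<unknown>((resolve, reject) => {
      this.slots.set(kind, { resolve, reject });
    });
  }

  // Called when the worker answers; returns false if nothing was waiting.
  settle(kind: RequestKind, result: unknown): boolean {
    const slot = this.slots.get(kind);
    if (!slot) return false;
    this.slots.delete(kind);
    slot.resolve(result);
    return true;
  }

  isPending(kind: RequestKind): boolean {
    return this.slots.has(kind);
  }
}
```

Keying slots by request kind avoids request IDs entirely, at the cost of never allowing two searches on the same engine at once, which fits a design with one worker per engine.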

4.6 Model Cache (modelCache.ts)

IndexedDB database lc0-model-cache with a single models object store. Models stored as decompressed ArrayBuffer keyed by URL. All operations silently catch errors to avoid crashing on IndexedDB issues.


5. Neural Network Catalog

53 models organized by playing strength, spanning 6 model families:

Model Families

| Family | Count | Architecture | Elo Range | Description |
| --- | --- | --- | --- | --- |
| 11258 distilled | 15 | 16x2-SE to 128x10-SE | ~800–2450 | Distilled from Lc0 T10 training net. The backbone of the rating ladder. |
| Maia | 11 | 64x6-SE | ~1100–2200 | Trained to predict human moves at specific Lichess rating levels. |
| Gyal family | 8 | Various (16x2 to 192x16) | ~800–2500 | Lichess-trained. Sub-families: Tiny/Bad/Good/Evil/Mean with distinct play styles. |
| Official Lc0 | 5 | Various | ~2100–2900 | Official training runs: T70, T42850, T71 FRC/Armageddon variants. |
| Transformers | 5 | 256x10 to 1024x15 | ~2525–2900 | Newest architecture. T1, t3, T82, BT3, BT4. Require WebGPU + significant VRAM. |
| Specialty | 4 | Various | ~2100–2600 | Leelenstein (engine-game trained), Ender (endgame specialist), Little Demon, Maia 2200 Hunter (fine-tuned on my own games). |

Notable Models

| Model | Arch | Size | Runtime MB | Elo | Notes |
| --- | --- | --- | --- | --- | --- |
| Tiny Gyal | 16x2 | 1.1 MB | ~25 | ~800 | Smallest, blunders freely |
| Maia 1100 | 64x6-SE | 3.3 MB | ~39 | ~1100 | Human-like at Lichess 1100 |
| T1-256x10 Distilled | Transformer | 77 MB | ~459 | ~2525 | "Best practical browser net" |
| BT4-1024x15 | Transformer | 707 MB | ~3229 | ~2900 | Strongest available. GM-level at 1-node. Needs ~4 GB VRAM. |
| Maia 2200 Hunter | 64x6-SE | 3.3 MB | ~39 | ~2050 | Fine-tuned on my own blitz/rapid games |

Memory Estimation

Each model has an estimatedRuntimeMb field computed by the benchmark-network-memory.mjs script: loads the ONNX session via onnxruntime-node, measures RSS delta, stores round(peakDeltaMb * 1.2) as a conservative estimate. Used by the tournament engine to estimate how many concurrent games can run.
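
The arithmetic is simple enough to sketch. Only the `round(peakDeltaMb * 1.2)` rule comes from the text. The `maxConcurrentGames` helper, its memory budget parameter, and the assumption that each game holds exactly two engines are illustrative guesses at how the estimate could feed the concurrency decision.

```typescript
// Conservative runtime estimate: peak RSS delta plus a 20% safety margin.
function estimateRuntimeMb(peakDeltaMb: number): number {
  return Math.round(peakDeltaMb * 1.2);
}

// How many games fit in a memory budget if each game holds two engines.
// `budgetMb` and the two-engines-per-game model are assumptions of this sketch.
function maxConcurrentGames(budgetMb: number, whiteMb: number, blackMb: number, cap = 8): number {
  const perGame = whiteMb + blackMb;
  if (perGame <= 0) return cap;
  return Math.max(1, Math.min(cap, Math.floor(budgetMb / perGame)));
}
```

Under these assumptions, a 1000 MB budget fits two simultaneous games between a ~459 MB and a ~39 MB network, while two small ~25 MB networks hit the cap of 8.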


6. Tournament System

6.1 Configuration

  • Formats: Round Robin (circle method / Berger tables) or Swiss (greedy top-down with color balancing)
  • Entrants: Each has a network, temperature (0–2), searchNodes (0–800, 0 = raw policy), searchTimeMs (0–30s), custom label
  • Best-of: 1–30 regulation games per series (default 3)
  • Tiebreak: "capped" (up to N extra games) or "win_by" (leader must be ahead by M)
  • Concurrency: 1–8 simultaneous games
  • Custom positions: Opening FENs rotate across series

6.2 Execution (useTournamentRunner.ts, 2474 lines)

The tournament runner manages the complete lifecycle:

  • Engine pooling: Lc0Engine instances are pooled, max = maxSimultaneousGames × 2 + 2. When the pool is full, the engine whose next scheduled use is farthest away is evicted, with ties broken by oldest last use.
  • Game execution: Each game creates a chess.js instance, alternates moves between engines (MCTS or raw policy), records FEN history and WDL eval snapshots. Games end on checkmate, stalemate, draw rules, 300-ply limit, or 3-minute timeout.
  • Concurrency: Promise.race pattern — fill concurrent slots, proceed when any finishes, refill. No entrant appears in two simultaneous matches.
  • Error handling: Exponential backoff retries (1s–30s, max 6 retries). After 6 retries, adjudicate as draw.
  • Series reconciliation: After each game, recalculates series scores. Early termination when one side has insurmountable lead. Tiebreak games added dynamically.
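
The retry policy can be sketched as below. The 1s–30s range, the limit of 6 retries, and the draw adjudication come from the text. The doubling schedule, the function names, and the injectable `sleep` parameter are assumptions.

```typescript
const BASE_DELAY_MS = 1000;
const MAX_DELAY_MS = 30000;
const MAX_RETRIES = 6;

// Delay before retry number `attempt` (1-based): 1s, 2s, 4s, 8s, 16s, then capped at 30s.
function backoffDelayMs(attempt: number): number {
  return Math.min(MAX_DELAY_MS, BASE_DELAY_MS * Math.pow(2, attempt - 1));
}

type GameOutcome = "1-0" | "0-1" | "1/2-1/2";

// Run `play` once plus up to MAX_RETRIES retries; adjudicate a draw if every
// try throws. `sleep` is injectable so tests need not wait in real time.
async function playWithRetries(
  play: () => Promise<GameOutcome>,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<GameOutcome> {
  for (let attempt = 0; attempt <= MAX_RETRIES; attempt++) {
    try {
      return await play();
    } catch {
      if (attempt === MAX_RETRIES) break; // retries exhausted
      await sleep(backoffDelayMs(attempt + 1));
    }
  }
  return "1/2-1/2";
}
```

With a doubling schedule the worst case waits 1 + 2 + 4 + 8 + 16 + 30 = 61 seconds in total before the game is adjudicated.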

6.3 Standings & Ratings

  • Match points: 1 for series win, 0.5 for draw, 0 for loss
  • Game points: From individual game results (regulation only, tiebreakers excluded)
  • Buchholz: Sum of opponents' match points (strength of schedule)
  • Performance rating: FIDE method — average opponent Elo + dp(score percentage) using the standard 51-entry lookup table
  • Cross table: N×N head-to-head matrix with series points, game points, and per-pair performance ratings
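
Match points and Buchholz follow directly from the definitions above. This sketch uses a hypothetical `SeriesResult` shape and does not reproduce standings.ts. The performance rating is omitted because it depends on the FIDE dp lookup table.

```typescript
interface SeriesResult {
  a: string;             // entrant id
  b: string;             // entrant id
  winner: string | null; // null = drawn series
}

// Match points: 1 per series win, 0.5 per drawn series, 0 per loss.
function matchPoints(entrants: string[], results: SeriesResult[]): Map<string, number> {
  const points = new Map<string, number>();
  for (const id of entrants) points.set(id, 0);
  for (const r of results) {
    if (r.winner === null) {
      points.set(r.a, (points.get(r.a) ?? 0) + 0.5);
      points.set(r.b, (points.get(r.b) ?? 0) + 0.5);
    } else {
      points.set(r.winner, (points.get(r.winner) ?? 0) + 1);
    }
  }
  return points;
}

// Buchholz: sum of the match points of every opponent an entrant has faced.
function buchholz(id: string, results: SeriesResult[], points: Map<string, number>): number {
  let sum = 0;
  for (const r of results) {
    if (r.a === id) sum += points.get(r.b) ?? 0;
    else if (r.b === id) sum += points.get(r.a) ?? 0;
  }
  return sum;
}
```

In a three-entrant example where A beats B, B draws C, and A beats C, the match points are A = 2, B = 0.5, C = 0.5, giving A a Buchholz of 1 and B a Buchholz of 2.5.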

6.4 Persistence

Active tournaments saved to localStorage every 200ms (debounced). Also archived to IndexedDB every 5 seconds. On reload, running matches reset to "waiting" and can be auto-resumed. Completed tournaments stored permanently with full state for reopening.


7. UI & Game Flow

7.1 Routing

No router library — App.tsx manages a state machine with screen types: home, game, tournament, share-loading, share-confirm. Persisted to localStorage.

7.2 Home Screen

  • NetworkPicker: Searchable, sortable list of 53+ networks. Each shows name, architecture, Elo, download size, cache status. Inline download with progress bar. Temperature slider (0–2), MCTS search controls (nodes 0–800, time limit 0–30s), opening book selection, custom FEN input, share URL generation.
  • GameHistory: Saved games list with expand/collapse, PGN display, and "Continue" for incomplete games.

7.3 Game Screen

  • Board: react-chessboard with click-to-move and drag-and-drop. Legal move indicators (dots for quiet moves, rings for captures). Promotion picker overlay.
  • Status Bar: Engine status, "Thinking..." indicator, last move in SAN, WDL bar (three-segment Win/Draw/Loss visualization).
  • Move History: Tabbed view — Moves (clickable navigation with arrow keys) and PGN (click to copy).
  • Controls: New Game (alternates color), Resign (two-step confirmation), Flip Board, temperature adjustment.
  • Opening Book: Checks user-selected openings before consulting the engine. Plays book moves randomly, shows "Book" badge.
  • Auto-save: Every move persists to localStorage. Game completion triggers final save with result.

7.4 Share URLs

Query parameters: network (required, must match built-in ID), color, fen, temperature. Large models (>25MB, not cached) show a confirmation dialog before downloading.
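
A sketch of how such a link could be validated, using the standard `URLSearchParams` API. The four parameter names and the 25 MB threshold come from the text. The accepted color values, the clamping of temperature to 0–2, and the function names are assumptions rather than the behavior of shareParams.ts.

```typescript
interface ShareParams {
  network: string;
  color: "white" | "black" | undefined;
  fen: string | undefined;
  temperature: number | undefined;
}

// Returns null when the link does not name a known built-in network.
function parseShareParams(query: string, knownNetworkIds: Set<string>): ShareParams | null {
  const params = new URLSearchParams(query);
  const network = params.get("network");
  if (network === null || !knownNetworkIds.has(network)) return null;

  const colorRaw = params.get("color");
  const color = colorRaw === "white" || colorRaw === "black" ? colorRaw : undefined;

  const tempRaw = params.get("temperature");
  const temp = tempRaw === null || tempRaw.trim() === "" ? NaN : Number(tempRaw);
  const temperature = Number.isFinite(temp) ? Math.min(2, Math.max(0, temp)) : undefined;

  return { network, color, fen: params.get("fen") ?? undefined, temperature };
}

const CONFIRM_THRESHOLD_MB = 25;

// Large, uncached models go through a confirmation screen before download.
function nextScreen(sizeMb: number, cached: boolean): "share-confirm" | "game" {
  return sizeMb > CONFIRM_THRESHOLD_MB && !cached ? "share-confirm" : "game";
}
```

Rejecting unknown network ids up front means a malformed or stale link fails before any download starts.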


8. Technical Tradeoffs & Decisions

8.1 Model Hosting: Local → Git LFS → Cloudflare R2

Evolution on Feb 6:

  1. Models bundled in public/models/ — hit Cloudflare Pages 25MB deployment limit
  2. Tried Git LFS — abandoned same day (complexity, bandwidth costs)
  3. Final: Cloudflare R2 public bucket. Two upload scripts: wrangler r2 object put for normal models, aws s3 cp for the 707MB BT4 (wrangler can't handle files that large)

8.2 Gzip Compression + .onnx.bin Extension

Models gzip-compressed for 30-45% size reduction. Browser decompresses via DecompressionStream API. Initially used .gz extension, but Vite's dev server (sirv) intercepted .gz files as pre-compressed assets. I directed renaming to .onnx.bin.

8.3 WebGPU vs WASM

Extensively debated. Five approaches analyzed (batched MCTS + WebGPU, GPU-resident search, lc0-to-WASM, Rust WASM, single-node MCTS). I pushed for practical implementation first, extraction into library later. Current: WASM with WebGPU as automatic upgrade when navigator.gpu is available.

8.4 MCTS vs Raw Policy (0-Node)

The app originally used only the policy head (no search). MCTS with PUCT selection was added in PR #9. The 0-node mode is still available (set searchNodes = 0) for speed or weaker play. Phase 2 batched MCTS is planned but not yet implemented.

8.5 Bundler: Rollup → Rolldown

Switched from Vite's default Rollup to rolldown-vite (Rust-based bundler) in PR #4 for faster builds.

8.6 Pre-Generated vs Programmatic Policy Index

The AI initially generated the 1858-entry policy index programmatically. The output was wrong — incorrect move ordering caused the engine to "play random bullshit." I identified the problem and directed using the pre-generated table from my hunter-chessbot reference repo. This fixed the engine immediately.

8.7 Strategic / "why do this at all" tradeoffs

  • Why build this when Lichess / chess.com exist — Lichess and chess.com already let you play bots. But neither lets you pick from 50+ specific neural networks with distinct playing personalities (Leelenstein, the Gyal series, endgame-specialist distilled nets), upload your own ONNX weights, or run everything offline with no account. The product thesis is specifically "network-as-opponent" — the identity of which net you're playing is the experience, not a detail. Neither big site exposes that because their product is "you vs. strong AI," not "you vs. this specific model's style." Inferred.
  • Fully client-side, no server compute — a trivial server-side Lc0 deployment would handle inference better (GPUs, predictable perf). Client-side means no account, no backend cost, no rate limits, no data leaving the browser, and the app works on a plane. It also means the app is trivially shareable — anyone with the URL gets a full chess lab with 50+ neural engines. The product becomes a static site + R2 bucket; ops complexity is nil. Inferred.
  • "Play against a personality" as the thesis, not "play the strongest AI" — if the goal were strength, Stockfish-WASM wins and the project is 10× simpler. Choosing personality means shipping a network catalog with human-readable names ("Brawler," "Wild Style," "Endgame Drill"), not just Elo ratings. It also makes the Maia 2200 Hunter net (from the hunter-chessbot project) a first-class feature, not a curiosity — you can play against a fine-tuned model of a specific human. Evidenced in curated default network list, §2.4.
  • Shipping 40+ networks with curated personalities, not a single strong one — every extra network adds hosting cost (BT4 alone is 707 MB), benchmark-run cost (memory profiling per network for tournament eviction), and QA surface (each ONNX has subtly different input/output shapes). Paid because the product is the catalog — a single-net release would be a weaker version of something chess.com already ships. Inferred.
  • 0-node / depth-0 as a first-class play mode — classical engine products treat depth 0 as "broken." For Lc0 networks it's meaningful: BT4 at 1 node is ~2500–2700 Elo (strong IM to GM). Shipping it as a real option (not a debug setting) reveals the thesis — the value is in the network's judgment, not search depth. It also makes the product fast enough to be pleasant on weak hardware. Evidenced in §2.2.
  • Tournament mode at all — optional feature that complicates scope (LRU engine pool, concurrent inference scheduling, shareable results, 2,474-line runner). Pays off because it turns "play one game vs. one net" into "study which nets beat which" — the catalog becomes an object of investigation, not just a picker. It's also the UI frame where the personality thesis pays off most clearly. Inferred.

8.8 Additional architectural tradeoffs worth naming

  • Lc0 (policy-first NN) over Stockfish-WASM (handcrafted + NNUE) — Stockfish is stronger, more battle-tested, and smaller on the wire. But Stockfish plays one way; Lc0 networks give 53 distinct personalities (Maia human-style, the Gyal series, Leelenstein, endgame specialists). The product thesis — "play against a personality, not just a strength" — is only possible with swappable NN weights. Inferred.
  • Reimplementing MCTS + encoder/decoder in TypeScript instead of compiling lc0 to WASM — lc0-to-WASM was one of five approaches analyzed (§8.3). Chose TS reimplementation: MCTS runs on the main JS heap with direct access to chess.js, avoids the lc0 build toolchain, and keeps the codebase debuggable in a browser devtools session. Cost: every bug in §9.1 (policy index ordering, promotion encoding, board flip, history ordering, halfmove divisor) is a bug lc0.wasm wouldn't have had. Evidenced in §8.3.
  • COOP/COEP deployment cost for SharedArrayBuffer — SAB requires cross-origin isolation, which breaks third-party embeds, analytics, and some fonts/iframes. On Cloudflare Pages this means a _headers file that's silently broken on any misconfiguration. Product decision: pay the embed/analytics cost to unlock SIMD-threaded WASM inference. Evidenced in MEMORY.md.
  • chess.js replayed along each MCTS path instead of make/unmake — every selection step rebuilds the board from root. At 100–800 nodes with depth ~20, that's O(N·d) move replays per search — likely a major share of the 80–100 nps ceiling on small nets. chess.js has no unmake; writing a correct one (castling rights, ep, 50-move, 3-fold repetition) is a week of effort and bugs. Chose perf regression over correctness risk. Evidenced in mcts.ts.
  • Temperature 0.15 default, not 0 (argmax) — AI proposed 0 for strongest play; I overrode: deterministic play is predictable/boring, you lose the "every game feels different" product experience. 0.15 is tuned for "feels alive without throwing games." PR #11 tried broader temperature sampling and was reverted. Evidenced in §10.2.
  • No router, screen state machine in App.tsx for 5 screens: adding React Router / TanStack Router for five states (home, game, tournament, share-loading, share-confirm) would force URL semantics that don't fit, since the share URL is a payload (query parameters describing the game setup, see §7.4), not a route. Clean "no-lock-in" stance: swap later if flows grow. Inferred.

8.9 Code-level tradeoffs visible in the source

  • One dedicated Web Worker per engine instance, not a shared worker with a job queue: Lc0Engine in workerInterface.ts spawns new Worker(...) per instance with a pub-sub API for streaming state plus per-kind pending-promise slots. A shared worker would serialize inference or require a request-ID bookkeeping layer the tournament pool already duplicates. Tournament mode needs N parallel engines with independent ONNX sessions; the one-worker-per-engine invariant is what makes that cheap. Evidenced.
  • Tournament runner as a single 2,474-line hook with useRef for live state, not a reducer or external store: useTournamentRunner.ts keeps engineMapRef, engineLastUsedAtRef, matchAbortMapRef, stateRef mirroring React state on every render via a custom imperative setRuntime updater. The runner is an async state machine with Promise.race-scheduled concurrent games and eviction, patterns that fight React's render-driven reconciliation. Refs give synchronous read-after-write; a reducer couldn't without threading every continuation through dispatch. Cost: monolith no one else has to read. Evidenced.
  • LRU eviction by next-use distance (Bélády's optimal), not recency: selectEvictionCandidate picks the engine whose next scheduled use is farthest in the sorted upcoming-match list, tie-broken by oldest lastUsedAt. Plain recency LRU is a one-liner. Chess tournaments have a known future schedule (round-robin pairings), so Bélády's optimal is actually computable, and recency would thrash hard because every engine gets touched once per round. Evidenced in useTournamentRunner.ts.
  • Share URLs as plain query params (?network=foo&fen=...), not a compressed encoded payload: shareParams.ts uses four named params and rejects unknown network ids. A base64/LZ blob holding full GameConfig + PGN would allow richer sharing, but human-readable URLs are debuggable and validate cheaply. The share-loading → share-confirm → game screen chain exists specifically because large models require user consent before a 707 MB download; a compressed blob couldn't surface that decision. Evidenced.
  • Two separate IndexedDB databases: lc0-model-cache and play-lc0-tournaments — model blobs (5–700 MB) and small game records live in separate databases, one via the idb wrapper and one via raw IDB. Browsers can GC the heavy DB independently without touching the light one. Tournament IDs are a content fingerprint t-${fnv1a(entrants + matches)} so resuming a running tournament updates the same record without dedup logic. Evidenced.
  • Policy index as a 1,858-entry hand-baked TypeScript literal, not programmatic generation: policyIndex.ts is a copy of the canonical lc0 training-time move ordering from the hunter-chessbot reference repo. The AI's original programmatic generation shipped a subtly-wrong ordering that caused "playing random bullshit." The table is the contract with lc0, not a computation; shipping the canonical artifact trades ~60 KB of source for zero chance of regression on move ordering. Evidenced in §9.1 bug story.
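
The next-use eviction rule from the list above can be sketched as follows. The function name `selectEvictionCandidate` and the `lastUsedAt` tie-break come from the text. The signature, the `PooledEngine` shape, and the `inUse` guard are assumptions.

```typescript
interface PooledEngine {
  id: string;
  lastUsedAt: number; // timestamp in ms
}

// `upcoming` lists scheduled matches in play order, each naming the two
// engine ids it needs. Engines never used again have distance Infinity.
function selectEvictionCandidate(
  pool: PooledEngine[],
  upcoming: Array<[string, string]>,
  inUse: Set<string> = new Set<string>(),
): PooledEngine | undefined {
  const nextUse = (id: string): number => {
    const idx = upcoming.findIndex((m) => m[0] === id || m[1] === id);
    return idx === -1 ? Infinity : idx;
  };
  let victim: PooledEngine | undefined;
  let victimDistance = -1;
  for (const engine of pool) {
    if (inUse.has(engine.id)) continue; // never evict an engine mid-game
    const distance = nextUse(engine.id);
    const farther = distance > victimDistance;
    const tieOlder =
      victim !== undefined &&
      distance === victimDistance &&
      engine.lastUsedAt < victim.lastUsedAt;
    if (farther || tieOlder) {
      victim = engine;
      victimDistance = distance;
    }
  }
  return victim;
}
```

An engine that appears nowhere in the remaining schedule is always evicted first, which a recency-based policy cannot guarantee.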

9. Major Bugs & Debugging Stories

9.1 "Playing Random Bullshit" — Six Encoding Bugs (Feb 5)

The biggest debugging effort in the project. After the initial build, I reported the engine was playing nonsensical moves despite correct WDL evaluation. Six separate bugs were found:

  1. Policy index table: Programmatic generation produced wrong move ordering. Fixed by using pre-generated 1858-entry array from my reference repo.
  2. Promotion encoding: Inverted (queen treated as normal, n/b/r as underpromotions). Correct: q/r/b have explicit suffixes, knight uses bare 4-char move.
  3. Move flipping for black: Was flipping square indices instead of UCI string ranks.
  4. History ordering: Taking first 7 positions instead of most recent 7.
  5. FenHistory initialization: Started as [] instead of [startFEN].
  6. Halfmove clock: Divided by 100.0, should be 99.0.
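
Three of these fixes are small enough to sketch as pure functions. The names are hypothetical, the input is assumed to be chess.js-style UCI with an `n` suffix for knight promotions, and the newest-first history order is an assumption about what the encoder expects.

```typescript
// Bug 2: Lc0's policy table encodes a knight promotion as the bare 4-char
// move, while queen, rook and bishop promotions keep their suffix.
function toLc0Uci(uci: string): string {
  return uci.length === 5 && uci[4] === "n" ? uci.slice(0, 4) : uci;
}

// Bug 4: take the most recent positions, not the first ones.
// Returned newest first (assumed encoder order); the input is not mutated.
function recentHistory(fenHistory: string[], slots = 8): string[] {
  return fenHistory.slice(-slots).reverse();
}

// Bug 6: the rule-50 plane is the halfmove clock divided by 99, not 100.
function rule50Plane(halfmoveClock: number): number {
  return halfmoveClock / 99.0;
}
```

Each of these is a silent failure: the wrong variant still yields a legal move and a plausible evaluation, which is why comparing against a known-good reference was needed.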

I identified that the value head was correct but the policy head was wrong, narrowing the investigation. I pointed to my working hunter-chessbot as the authoritative reference.

9.2 Bus Error in ONNX Conversion (Feb 6)

My custom fine-tuned Maia model crashed with Bus error: 10 during lc0 leela2onnx conversion. Deep investigation:

  • Compared hex dumps of working base model vs my model
  • Found my model had extra training_params fields (policy_loss, accuracy)
  • Traced crash to FloatOnnxWeightsAdapter::GetRawData() — KERN_PROTECTION_FAILURE at memory boundary
  • Root cause: Bug in lc0 v0.32.1 handling models with training_params populated (needed v0.21.0+)

9.3 useEffect Anti-Patterns (Feb 9)

I identified pervasive useEffect problems: "what the fuck are these useEffects for?" Led to a comprehensive rewrite of OpeningPicker (removed all 3 useEffects, removed open prop entirely, parents conditionally render instead). Also fixed a flickering bug where NetworkPicker's selection oscillated due to re-resolve effects depending on [networks, selected.id] and running before localStorage writes completed.

9.4 Vite .gz Interception

Gzipped model files couldn't be served in development because Vite's sirv middleware treated .gz files as pre-compressed assets. Solved by renaming to .onnx.bin.


10. AI Agent Involvement

10.1 Session Data

| Metric | Value |
| --- | --- |
| Sessions | 9 session directories |
| Subagent files | 63 JSONL files |
| Total dialog lines | ~3,371 |
| Date range | Feb 5–13, 2026 |

10.2 My Direction

I provided a comprehensive spec file (lc0-browser-chess-spec.md) upfront describing a 4-phase plan. I directed the architecture, identified bugs by testing, and pointed the AI to my working reference implementation (hunter-chessbot) when the AI's code was wrong.

Key corrections I made:

  • Policy encoding: AI's programmatic generation was fundamentally wrong — I directed using my reference repo's pre-generated table
  • useEffect quality: I identified anti-patterns the AI had written and directed a comprehensive rewrite ("fix all your code that is dog shit. check over it and do refactors where necessary, but with a brain this time")
  • Temperature default: AI suggested 0; I pointed out that makes moves deterministic and predictable, changed to 0.15
  • Modal vs page navigation: AI suggested page-based game detail view; I directed modal overlay instead
  • Board size: I specified exact sizing (min(90vh, 90vw))
  • MCTS architecture: I pushed back on AI's recommendation to build in-repo, noting batched MCTS is logically a library. AI conceded.
  • Auto health checks: AI added automatic engine health checks on modal open; I directed removing them

10.3 AI's Execution

The AI handled implementation, model conversion research, and the bulk of the coding. It deployed 3 parallel research subagents at project start to investigate Lc0 encoding, policy output format, and ONNX Runtime configuration requirements. The tournament system (2,474 lines in useTournamentRunner.ts alone) was largely AI-generated, with me directing the feature requirements and correcting UI decisions.


11. Development Timeline

| Date | Commits | Key Achievement |
| --- | --- | --- |
| Feb 5 | 3 | First working app: Lc0 in browser, encoding bugs found and fixed |
| Feb 6 | 14 | 30 networks added, Maia series, Git LFS → R2 pivot, model compression, game saving |
| Feb 7 | 3 | Custom ONNX upload (PR #3), useEffect audit (PR #4), tournament mode (PR #1) |
| Feb 9-10 | 5 | Opening book system (PR #5, 15K+ openings), FIDE performance ratings (PR #6), modal polish (PR #7) |
| Feb 12 | 1 | Shareable game URLs (PR #8) |
| Feb 13 | 4 | MCTS search (PR #9), per-entrant tournament settings |
| Feb 23 | 3 | Temperature sampling attempt + revert (PR #11, still open) |

Key Velocity Facts

  • First working app with neural network inference: 2 hours from initial commit
  • 53 networks cataloged and converted: 1 day
  • Full tournament mode with Swiss/round-robin: 1 day (PR #1, 6 commits)
  • Opening book with 15K+ positions: 2 days (PR #5, 4 commits)
  • MCTS search engine: 1 day (PR #9)

12. Key Files Reference

Engine

| File | Lines | Purpose |
| --- | --- | --- |
| engine/mcts.ts | ~250 | MCTS search (PUCT selection, expansion, backpropagation) |
| engine/inference.ts | ~120 | ONNX Runtime session management, WebGPU/WASM |
| engine/encoding.ts | ~180 | FEN → [1,112,8,8] tensor encoding |
| engine/decoding.ts | ~80 | 1858 policy logits → legal moves with temperature |
| engine/policyIndex.ts | ~1900 | Pre-generated 1858 UCI move lookup table |
| engine/worker.ts | ~200 | Web Worker: model loading, inference, MCTS |
| engine/workerInterface.ts | ~150 | Main-thread Lc0Engine class (pub-sub + promises) |
| engine/modelCache.ts | ~60 | IndexedDB model caching |

Tournament

| File | Lines | Purpose |
| --- | --- | --- |
| hooks/useTournamentRunner.ts | 2474 | Complete tournament lifecycle management |
| lib/tournament/pairings.ts | ~150 | Round-robin (Berger tables) + Swiss pairings |
| lib/tournament/standings.ts | ~100 | Match/game points, Buchholz, sorting |
| lib/tournament/performanceRating.ts | ~80 | FIDE dp lookup table + computation |

Data

| File | Lines | Purpose |
| --- | --- | --- |
| constants/networks.ts | ~800 | 53 network definitions with metadata |
| lib/openingBook.ts | ~50 | Trie-based opening book lookup |
| data/openings.ts | ~20 | Lazy-loaded ECO opening database |

UI

| File | Lines | Purpose |
| --- | --- | --- |
| components/GameScreen.tsx | ~500 | Active game view with board, controls, history |
| components/NetworkPicker.tsx | ~600 | Network selection, download, configuration |
| components/TournamentLiveScreen.tsx | ~500 | Live tournament view with standings |
| components/OpeningPicker.tsx | ~400 | Opening book selection modal |