Readr — Deep Technical Profile
Build timeline — ~7 active days across 3 phases (Feb 23 – Apr 12, 2026, 49 calendar days because of a 43-day gap)
- Scaffold sprint (Feb 23–24, 2 days) — monorepo (pnpm/turbo), Hono server, Expo mobile, web dashboard, EPUB/PDF readers, e-ink mode, sync engine with LWW, TTS pipeline, collections/tags/stats (phases 1–6)
- [43-day gap] (Feb 25 – Apr 7)
- Hardware reality + reader iteration (Apr 8–9, 2 days) — bearer-token auth replacing better-auth, full dockerization, content-addressable storage, offline dictionary, PDF text layer, reader UX overhaul, page-count/CFI fixes, Olares deploy, Cloudflare Pages web
- Polish + platform parity (Apr 10–12, 3 days) — architecture review fixes, esbuild bundling of foliate/pdfjs (zero network), precomputed page counts, md5 alongside sha256, compose split into infra+app stacks, PR-driven web parity (#6–#17), service worker + PWA, Playwright CI, self-hosted Wiktionary/WordNet, kernel-native handwriting for Supernote
Table of Contents
- Architecture
- The Reader Engine
- Server & API
- Sync Engine
- E-ink & Supernote Integration
- Technical Tradeoffs & Decisions
- Major Bugs & Debugging Stories
- AI Agent Involvement
- Development Timeline
- Key Files Reference
1. Project Overview
A self-hosted, cross-platform e-book reader targeting the Supernote A5X e-ink tablet, with cloud sync, offline support, annotations (typed + handwritten), TTS, and a web client. Built as a monorepo with React Native (Expo SDK 54), Hono API server, and a Python TTS worker.
By the numbers:
- 222 commits, 8 PRs (7 merged, 1 open), built over Feb 23 – Apr 12, 2026
- ~92% AI co-authored (204/222 commits have Claude co-author tags)
- ~289 MB of Claude Code session data (the largest AI involvement of any project)
- 119+ source files across TypeScript (mobile + server) and Python (TTS)
- 14-table PostgreSQL schema with content-addressable book storage
- CRDT-lite sync with LWW for progress + tombstone sets for annotations
- 53 API endpoints across 10 route files
- Dual rendering: foliate-js for EPUB, pdf.js for PDF, both in WebView
- 17 bundled Google Fonts, 108K-word offline dictionary
- Kernel-level handwriting integration for Supernote (bypasses SurfaceFlinger, ~20ms latency)
Libraries & Frameworks
Mobile app (mobile/, Expo SDK 54 + React Native 0.81)
- Expo + expo-router — cross-platform runtime and file-based routing for the reader app.
- react-native-webview — the container every EPUB/PDF reader runs inside.
- foliate-js 1.0.1 — EPUB rendering engine loaded into the WebView; patched for custom navigation lock.
- pdfjs-dist 4.10 — PDF rendering inside a sibling WebView.
- @shopify/react-native-skia — GPU canvas for annotation overlays on LCD devices (e-ink path bypasses Skia via the kernel binder).
- react-native-reanimated / gesture-handler / screens — page animations, swipe gestures, native navigation stack.
- @tanstack/react-query — server state (library, progress, annotations) with background refetch + cache.
- zustand — client-only UI state (theme, current book, reader settings).
- expo-sqlite — local DB for offline library, reading positions, annotation queue.
- expo-file-system + expo-document-picker + expo-sharing — book import, local file storage, share sheet for highlights.
- expo-secure-store — server URL + bearer token.
- expo-av / expo-speech — streaming TTS playback and native fallback speech.
- expo-brightness / expo-navigation-bar / expo-status-bar — e-ink-friendly chrome tweaks.
- @react-native-community/netinfo — network triggers for sync queue flush.
- lucide-react-native, react-native-svg, react-native-safe-area-context — icons, SVG, notch handling.
- zod — shared validators mirror the server's API schemas.
- esbuild: bundles the WebView reader HTML (foliate-js + reader.ts) into a single inline blob.
Server (server/, Node + Hono)
- Hono 4 + @hono/node-server — REST + streaming endpoints (auth, sync, books, TTS stream).
- Drizzle ORM + drizzle-kit + postgres — typed schema, migrations, pooled Postgres client.
- ioredis — cache, session store, sync coordination locks.
- BullMQ — background job queue for TTS generation and book metadata extraction.
- @aws-sdk/client-s3 + s3-request-presigner — S3/MinIO uploads, presigned-URL downloads for books and audio.
- pdf-parse, jszip, fast-xml-parser — extract text/metadata from uploaded PDFs and EPUB OPF.
- sharp — resize and re-encode book covers.
- resend — transactional email (password reset, invites).
- zod: request/response validation shared with mobile via the shared/ package.
Python TTS worker (services/tts/)
- FastAPI + uvicorn + pydantic: async HTTP service for batch jobs and /tts/stream.
- chatterbox-tts, chatterbox-turbo, kokoro: three TTS engines (batch uses Turbo, real-time streaming uses Kokoro).
- torch, numpy, soundfile — model runtime, audio arrays, WAV/FLAC encoding.
- boto3 — uploads finished audio to MinIO.
- redis, psycopg2-binary — reads BullMQ job payloads, logs job metadata.
Shared workspace package (shared/)
- zod — single source of truth for API contract types; imported by both mobile and server.
Sync engine package
- vitest — unit tests for LWW merge + tombstone-set conflict resolution.
Infrastructure
- PostgreSQL 16 — primary store (14 tables: users, books, progress, annotations, sync state).
- Redis 7 — queue + cache.
- MinIO — S3-compatible local object storage for book files, covers, generated audio.
- Caddy 2 — reverse proxy + automatic HTTPS (public-profile deployments).
- Docker + docker-compose (two stacks: infra + app): deployment; GPU worker uses nvidia/cuda:12.1.1-runtime-ubuntu22.04.
- pnpm 10 + turbo 2 + tsx + TypeScript 5.6: monorepo package manager, build orchestration, TS execution.
Supernote-specific
- No vendor SDK: Supernote integrations use the stock Android WebView plus direct JNI calls into the Supernote binder driver for kernel-level handwriting (see §6.3). No bundled native Supernote modules in npm.
2. Architecture
┌─────────────────────────────────────────────────────────────┐
│ React Native (Expo SDK 54) │
│ ├── Library (grid/list, search, sort, filter, upload) │
│ ├── Reader (WebView: foliate-js for EPUB, pdf.js for PDF) │
│ ├── Notes (Skia canvas / Supernote kernel drawing) │
│ ├── Stats (reading sessions, streaks, daily chart) │
│ └── Settings (e-ink mode, offline cache, auth) │
└──────────────┬──────────────────────────────────────────────┘
│ SQLite (local-first) REST API
│ + sync queue │
┌──────────────▼──────────────────────────────▼───────────────┐
│ Hono API Server (Node.js 22) │
│ ├── Auth (bearer token, email OTP via Resend) │
│ ├── Books (upload, dedup by SHA-256, metadata extraction) │
│ ├── Sync (pull/push with LWW + tombstone set merge) │
│ ├── Annotations (bookmarks, highlights, notes) │
│ ├── TTS (BullMQ queue → Python worker) │
│ └── Stats, Collections, Export │
└──────────────┬──────────────────────────────────────────────┘
│
┌──────────────▼──────────────────────────────────────────────┐
│ PostgreSQL 16 │ Redis 7 │ MinIO/R2 (S3) │ TTS Worker │
└─────────────────────────────────────────────────────────────┘
Monorepo Structure
| Package | Purpose |
|---|---|
| apps/mobile | React Native + Expo Web (58 TS files) |
| apps/server | Hono API (47 TS files, 35 endpoints) |
| packages/shared | Types, constants, Zod validators |
| packages/sync-engine | LWW merge, tombstone sets, queue dedup |
| services/tts-worker | Python FastAPI + BullMQ TTS (Chatterbox/Kokoro) |
| deploy/ | Docker Compose (infra + app stacks), Caddy, tunnel configs |
Deployment Topology
Two-stack Docker Compose (split to prevent MinIO bouncing during app redeploys):
- Infra stack: PostgreSQL 16, Redis 7, MinIO (optional, for self-hosted S3)
- App stack: Hono API, Caddy (optional, for TLS)
- GPU overlay: TTS worker with NVIDIA passthrough
Deployed on an Olares home server via rsync + docker compose up -d --build. Cloudflare Tunnel for HTTPS without opening ports.
3. The Reader Engine
3.1 EPUB Rendering (foliate-js in WebView)
The reader uses foliate-js (bundled as an IIFE via esbuild — zero CDN dependencies) running inside a react-native-webview. The pipeline:
- ReaderScreen resolves a book source (local file:// path preferred, server presigned URL as fallback)
- getReaderHtml() generates an HTML document with bundled fonts, polyfills (Chromium 96 compat for Supernote), and the reader bundle
- reader.ts (1183 lines, runs inside WebView) fetches the EPUB via XHR, opens it with foliate, configures pagination, and handles all in-book interactions
- A typed postMessage bridge (26 message types) connects the WebView to React Native
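A typed bridge like the one in the last step can be sketched as a discriminated union plus a defensive parser. This is a minimal illustration, not the project's code: the four message names and their fields are hypothetical stand-ins for the real 26 message types.

```typescript
// Hypothetical subset of bridge messages; the real bridge defines 26 types.
type WebViewMessage =
  | { type: "ready" }
  | { type: "relocate"; cfi: string; fraction: number }
  | { type: "selection"; cfi: string; text: string }
  | { type: "error"; message: string };

// Parse a raw postMessage payload; anything malformed yields null.
function parseWebViewMessage(raw: string): WebViewMessage | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const msg = data as Record<string, unknown>;
  switch (msg.type) {
    case "ready":
      return { type: "ready" };
    case "relocate":
      return typeof msg.cfi === "string" && typeof msg.fraction === "number"
        ? { type: "relocate", cfi: msg.cfi, fraction: msg.fraction }
        : null;
    case "selection":
      return typeof msg.cfi === "string" && typeof msg.text === "string"
        ? { type: "selection", cfi: msg.cfi, text: msg.text }
        : null;
    case "error":
      return typeof msg.message === "string"
        ? { type: "error", message: msg.message }
        : null;
    default:
      return null;
  }
}

// Exhaustive handler: adding a union member without a case fails to compile.
function describeMessage(msg: WebViewMessage): string {
  switch (msg.type) {
    case "ready":
      return "reader ready";
    case "relocate":
      return `at ${(msg.fraction * 100).toFixed(1)}%`;
    case "selection":
      return `selected "${msg.text}"`;
    case "error":
      return `reader error: ${msg.message}`;
    default: {
      const unreachable: never = msg;
      return unreachable;
    }
  }
}
```

On the React Native side, a parser of this shape would sit in the WebView's onMessage handler, so malformed payloads from the page are dropped before they reach app state.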
Critical constraint: Android WebView with inline HTML has about:blank origin, blocking all fetch() and XHR to file:// URLs. Solution: Write reader HTML to a local cache file, load via source={{ uri: filePath }}, then use XHR (not fetch) to read book files. This was the hardest-won technical discovery in the project.
3.2 PDF Rendering (pdf.js in WebView)
Self-contained HTML with inline JS using pdf.js. Renders all pages as <canvas> elements with transparent text overlay for selection. Annotation overlays (highlights, notes) are positioned via percentage-based <div> elements. Scroll tracking computes page number from scrollTop / scrollHeight.
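The scroll-to-page mapping can be written out as a small function. This sketch assumes all pages have equal rendered height and that the fraction is simply scaled by the page count; the function name and the clamping are illustrative choices, not taken from the source.

```typescript
// Map a scroll offset to a 1-based page number, assuming equal-height pages.
function pageFromScroll(scrollTop: number, scrollHeight: number, pageCount: number): number {
  if (scrollHeight <= 0 || pageCount <= 0) return 1;
  // Clamp the fraction so overscroll never produces an out-of-range page.
  const fraction = Math.min(Math.max(scrollTop / scrollHeight, 0), 1);
  return Math.min(pageCount, Math.floor(fraction * pageCount) + 1);
}
```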
3.3 Annotation System
- Highlights: 5 colors (yellow, green, blue, pink, purple). EPUB uses foliate's SVG overlayer, PDF uses DOM overlays. On e-ink, all colors become solid black outlines.
- Bookmarks: Toggle via header icon, stored with CFI + percentage.
- Typed notes: Modal text editor anchored to CFI positions.
- Handwritten notes: Dual-path drawing system — Skia (LCD devices) or Supernote kernel (e-ink, see Section 6).
- All annotations stored in SQLite locally, synced to server via the sync engine.
3.4 Page Numbering (The Hardest Sub-Problem)
Two-phase system because foliate-js's page counts depend on font, margin, and viewport:
- Stub phase: Uses byte-based location.total (font-invariant) until measurement completes
- Background measurement: Hidden <foliate-view> iterates every EPUB section, awaits font loading, polls renderer.pages until stable, records per-section page counts
- Live refinement: Current section's live page count overrides the measured value when they diverge
This went through 8+ consecutive commits of iteration before stabilizing.
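How the three sources of truth combine into one displayed page label can be sketched as a pure function. The field names and the resolvePageLabel helper are hypothetical; the sketch only assumes what the list above states: a byte-based stub until measurement finishes, then per-section counts with the live count overriding the current section.

```typescript
interface PageInputs {
  measured: number[] | null; // per-section page counts; null until measurement finishes
  sectionIndex: number;      // index of the section currently displayed
  pageInSection: number;     // 0-based page within that section
  livePages: number;         // live page count for the current section
  stubCurrent: number;       // byte-based location (font-invariant)
  stubTotal: number;         // byte-based location total
}

interface PageLabel { current: number; total: number; estimated: boolean }

function resolvePageLabel(p: PageInputs): PageLabel {
  // Stub phase: no measurement yet, so fall back to byte-based locations.
  if (p.measured === null) {
    return { current: p.stubCurrent, total: p.stubTotal, estimated: true };
  }
  // Live refinement: the current section's live count wins when it diverges.
  const counts = p.measured.slice();
  if (p.livePages > 0 && counts[p.sectionIndex] !== p.livePages) {
    counts[p.sectionIndex] = p.livePages;
  }
  let pagesBefore = 0;
  for (let i = 0; i < p.sectionIndex; i++) pagesBefore += counts[i];
  const total = counts.reduce((sum, n) => sum + n, 0);
  return { current: pagesBefore + p.pageInSection + 1, total, estimated: false };
}
```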
3.5 TTS
Page-by-page read-aloud using expo-speech. Text split on sentence boundaries (Android has a 4000-char buffer limit). Auto-advances pages on completion. Rate control 0.5×–2.0×.
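The sentence-boundary split can be sketched as follows. This is an illustrative version under stated assumptions: a sentence ends at ".", "!" or "?" followed by whitespace or end of text, sentences are re-joined with a single space, and a single sentence longer than the limit is hard-cut. The function names are hypothetical.

```typescript
const ANDROID_TTS_MAX_CHARS = 4000; // the buffer limit cited above

// Split on terminators that are followed by whitespace or end of text,
// so decimals like "3.14" stay intact.
function splitSentences(text: string): string[] {
  const sentences: string[] = [];
  let start = 0;
  for (let i = 0; i < text.length; i++) {
    const isTerminator = text[i] === "." || text[i] === "!" || text[i] === "?";
    const atBoundary = i === text.length - 1 || /\s/.test(text[i + 1]);
    if (isTerminator && atBoundary) {
      const sentence = text.slice(start, i + 1).trim();
      if (sentence) sentences.push(sentence);
      start = i + 1;
    }
  }
  const tail = text.slice(start).trim();
  if (tail) sentences.push(tail);
  return sentences;
}

// Greedily pack whole sentences into chunks of at most maxLen characters.
function splitForSpeech(text: string, maxLen: number = ANDROID_TTS_MAX_CHARS): string[] {
  const chunks: string[] = [];
  let current = "";
  for (const sentence of splitSentences(text)) {
    let rest = sentence;
    // A sentence longer than the limit is hard-cut so no chunk exceeds it.
    while (rest.length > maxLen) {
      if (current) { chunks.push(current); current = ""; }
      chunks.push(rest.slice(0, maxLen));
      rest = rest.slice(maxLen);
    }
    const joined = current ? current + " " + rest : rest;
    if (joined.length > maxLen) {
      chunks.push(current);
      current = rest;
    } else {
      current = joined;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```

Each returned chunk could then be handed to the speech API in order, advancing the page once the last chunk finishes.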
3.6 Dictionary
Offline: 27 JSON files (~108K words, ~9MB) bundled via Metro. Fuzzy matching with bounded Levenshtein distance, inflection stripping (-ies, -es, -s, -ing, -ed, -ly), case variant generation.
Online: Server-side dictionary endpoint backed by Wiktionary + WordNet (~1.4M words). 3s timeout, falls back to offline.
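The offline lookup strategy (inflection stripping first, then bounded fuzzy matching) can be sketched like this. The suffix rules, length guards, and function names are assumptions made for illustration; the source only names the suffixes and the bounded Levenshtein approach. A Map stands in for the 27 bundled JSON files.

```typescript
// Generate candidate base forms by stripping common English inflections.
function candidateForms(word: string): string[] {
  const w = word.toLowerCase();
  const forms = [w];
  if (w.endsWith("ies") && w.length > 4) forms.push(w.slice(0, -3) + "y");
  if (w.endsWith("es") && w.length > 3) forms.push(w.slice(0, -2));
  if (w.endsWith("s") && w.length > 2) forms.push(w.slice(0, -1));
  if (w.endsWith("ing") && w.length > 5) forms.push(w.slice(0, -3), w.slice(0, -3) + "e");
  if (w.endsWith("ed") && w.length > 4) forms.push(w.slice(0, -2), w.slice(0, -1));
  if (w.endsWith("ly") && w.length > 4) forms.push(w.slice(0, -2));
  return forms;
}

// Levenshtein distance that gives up early; returns max + 1 when over the bound.
function boundedLevenshtein(a: string, b: string, max: number): number {
  if (Math.abs(a.length - b.length) > max) return max + 1;
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    let rowMin = i;
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      const v = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
      curr.push(v);
      if (v < rowMin) rowMin = v;
    }
    if (rowMin > max) return max + 1; // every remaining path exceeds the bound
    prev = curr;
  }
  return Math.min(prev[b.length], max + 1);
}

// Exact match on stripped forms first, then the closest entry within the bound.
function lookupWord(word: string, dict: Map<string, string>, maxDistance = 1): string | null {
  const forms = candidateForms(word);
  for (const form of forms) {
    const hit = dict.get(form);
    if (hit !== undefined) return hit;
  }
  let best: string | null = null;
  let bestDistance = maxDistance + 1;
  for (const [entry, definition] of dict) {
    const d = boundedLevenshtein(forms[0], entry, maxDistance);
    if (d < bestDistance) { bestDistance = d; best = definition; }
  }
  return best;
}
```

The early exit matters at this scale: with roughly 108K entries, most comparisons are rejected by the length check or after a few rows.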
4. Server & API
4.1 Auth Model
No passwords, no OAuth. The bearer token IS the user identity.
- Registration: Client generates a random token (16-256 chars), POSTs to /api/register. Token stored in users.token.
- Every request: Authorization: Bearer <token>. Server seeks on unique index.
- Email (optional): Resend-powered OTP for account recovery and login. Entirely disabled when env vars unset.
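The per-request check can be sketched as header parsing plus a keyed lookup. This is a simplified illustration: the function names are hypothetical, and the in-memory Map stands in for the unique index on users.token. Only the 16-256 length bound comes from the text above.

```typescript
const MIN_TOKEN_LENGTH = 16;
const MAX_TOKEN_LENGTH = 256;

interface User { id: string; token: string }

// Extract the token from an Authorization header; null if absent or malformed.
function extractBearerToken(header: string | undefined): string | null {
  if (!header) return null;
  const match = /^Bearer\s+(\S+)$/.exec(header.trim());
  if (!match) return null;
  const token = match[1];
  if (token.length < MIN_TOKEN_LENGTH || token.length > MAX_TOKEN_LENGTH) return null;
  return token;
}

// The Map plays the role of the unique index on users.token.
function authenticate(header: string | undefined, usersByToken: Map<string, User>): User | null {
  const token = extractBearerToken(header);
  if (token === null) return null;
  return usersByToken.get(token) ?? null;
}
```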
4.2 Database Schema (14 tables)
| Table | Key Design |
|---|---|
| users | UUID PK (decoupled from token), optional email |
| files | Content-addressable by SHA-256. Shared across users. refCount tracks references. |
| books | Per-user reference to a file, with optional title/author overrides |
| reading_progress | Per (bookId, userId, deviceId), for multi-device support |
| bookmarks/highlights/notes | Soft-delete via deletedAt tombstone (required for sync) |
| sync_log | Append-only mutation log, indexed by (userId, timestamp), 90-day retention |
| tts_jobs + tts_audio_chunks | Batch TTS job tracking with per-chapter audio |
| collections + book_collections | User collections with M:N join |
| reading_sessions | Duration, pages, progress delta for stats |
4.3 Book Upload Pipeline
Upload multipart → validate format/size/quota → SHA-256 hash → if existing file: reuse + increment refCount → if new: upload to S3, extract metadata (EPUB: JSZip + XML parser; PDF: pdf-parse + pdftoppm), upload cover (normalized to 600×900 JPEG via sharp) → insert files + books rows → update storage_used_mb.
4.4 TTS Pipeline
Hono server enqueues per-chapter BullMQ jobs → Python worker (BRPOP on Redis, speaking BullMQ's protocol directly) → loads Chatterbox or Kokoro model → generates audio → encodes to Opus → uploads to S3 → updates progress in PostgreSQL. Also supports real-time streaming via Kokoro proxy.
5. Sync Engine
5.1 Two-Strategy CRDT-Lite
| Data Type | Strategy | Tie-Breaking |
|---|---|---|
| Reading progress | LWW (Last-Writer-Wins) | Server wins on timestamp tie |
| Annotations (bookmarks, highlights, notes) | Tombstone set merge | Deletions are permanent (no resurrection) |
5.2 Tombstone Set Merge Rules
- Create: Insert if entity doesn't exist. Skip if tombstoned (deleted entities cannot be resurrected — permanent).
- Update: LWW within living entities (newer timestamp wins). Skip if tombstoned.
- Delete: Soft-delete (deletedAt set). Permanent: once deleted, no create or update can revive it.
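The two strategies can be sketched as pure functions that return a verdict for the caller to apply. This is a minimal illustration rather than the project's merge code: the type names are hypothetical, and three cases the rules above leave open are filled with labeled assumptions (a create for an existing live entity is skipped, an update for an unknown entity is skipped, and a delete for an unknown entity still records a tombstone).

```typescript
interface Annotation { id: string; data: string; updatedAt: number; deletedAt: number | null }

type Mutation =
  | { op: "create" | "update"; id: string; timestamp: number; data: string }
  | { op: "delete"; id: string; timestamp: number };

type Verdict =
  | { action: "skip" }
  | { action: "insert" | "update" | "tombstone"; entity: Annotation };

function mergeAnnotation(existing: Annotation | undefined, m: Mutation): Verdict {
  // Tombstoned entities are never revived, whatever the incoming operation.
  if (existing && existing.deletedAt !== null) return { action: "skip" };
  if (m.op === "delete") {
    // Assumption: a delete for an unknown entity still records a tombstone.
    const data = existing ? existing.data : "";
    return { action: "tombstone", entity: { id: m.id, data, updatedAt: m.timestamp, deletedAt: m.timestamp } };
  }
  if (m.op === "create") {
    // Assumption: a create for an already-living entity is ignored.
    if (existing) return { action: "skip" };
    return { action: "insert", entity: { id: m.id, data: m.data, updatedAt: m.timestamp, deletedAt: null } };
  }
  // Update: LWW among living entities; assumption: unknown entities are skipped.
  if (!existing || m.timestamp <= existing.updatedAt) return { action: "skip" };
  return { action: "update", entity: { ...existing, data: m.data, updatedAt: m.timestamp } };
}

interface Progress { position: number; timestamp: number }

// Reading progress: last writer wins, and the server wins a timestamp tie.
function mergeProgress(server: Progress, client: Progress): Progress {
  return client.timestamp > server.timestamp ? client : server;
}
```

Keeping the merge free of I/O means the same functions can run on the server and in unit tests without a database.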
5.3 Client-Side Queue
Changes queued in SQLite sync_queue. Before push, deduplicateQueue() keeps only the latest timestamp per entityType:entityId — if the user updated progress 5 times offline, only the final state is pushed. Auto-sync on app open and network reconnect. Debounced opportunistic push (800ms) on every write.
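The dedup step can be sketched in a few lines. The deduplicateQueue name and the entityType:entityId key come from the text above; the QueuedChange shape and the tie-breaking rule (later queue entry wins on equal timestamps) are assumptions.

```typescript
interface QueuedChange {
  entityType: string;
  entityId: string;
  timestamp: number;
  payload: unknown;
}

// Keep only the newest change per entityType:entityId key.
function deduplicateQueue(queue: QueuedChange[]): QueuedChange[] {
  const latest = new Map<string, QueuedChange>();
  for (const change of queue) {
    const key = `${change.entityType}:${change.entityId}`;
    const seen = latest.get(key);
    // Assumption: on equal timestamps the later queue entry wins.
    if (!seen || change.timestamp >= seen.timestamp) latest.set(key, change);
  }
  return Array.from(latest.values());
}
```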
6. E-ink & Supernote Integration
6.1 Device Detection
DisplayContext.tsx checks NativeModules.PlatformConstants for Ratta/Supernote, ONYX/BOOX, Kobo, Kindle. Sets global isEink: true.
6.2 E-ink Adaptations
- All animations disabled, transitions set to none
- High contrast: black text, solid colors (no transparency)
- Larger tap targets: 64px minimum (vs 48px on LCD)
- Paginated scroll instead of smooth
- A2 refresh mode for faster screen updates
- Highlights become solid black outlines instead of translucent colors
- Loading indicators become static text instead of spinners
6.3 Kernel-Level Handwriting (Supernote-Specific)
The most technically ambitious feature, and the one that required the deepest reverse engineering. The Supernote has a dedicated kernel-level handwriting pipeline that its first-party apps (Atelier, Ratta Notes) use for low-latency pen input — but there's zero public documentation on how it works.
The Problem
React Native → Skia rendering has 200-400ms stroke-to-pixel latency on e-ink. The pipeline is: touch event → JS bridge → Skia canvas → SurfaceFlinger → EPD controller → panel refresh (GC16 waveform). Each step adds latency. Worse, GC16 waveforms cause severe ghosting artifacts that survive 10+ screen refreshes — every stroke leaves a faint residue that accumulates until the screen is unreadable.
The first-party Atelier drawing app has none of these problems. Strokes appear in ~20ms with no ghosting. That performance gap meant something was bypassing the entire Android rendering stack. I flagged this through hands-on testing on the physical device and directed the investigation: "do some research on how supernote makes their drawing fluid in their native apps... might require something a little deeper."
Phase 1: Research Agents Hit a Wall
Initial research agents came back with a wrong conclusion — that Atelier was a privileged first-party app with framework-level access that a sideloaded APK could never match. They also identified the wrong device (assumed A5X when the actual device was a Supernote Nomad/A6X2 running Android 11 on a Rockchip RK3566), and pointed to a nonexistent API (View.requestEpdMode) as the waveform control method. The recommendation was essentially "accept the ceiling."
My response: "nope we're gonna have to try harder."
Phase 2: Live Device Reverse Engineering
I directed a shift from web research to actual reverse engineering on the live device. This is where the breakthroughs happened:
APK pulling and JADX decompilation. Three APKs were pulled from the device via adb: Atelier (com.ratta.supernote.paint), Ratta Notes (com.ratta.supernote.note), and a shared drawing library (com.ratta.drawpath). JADX decompilation of drawpath revealed the critical architecture — the drawing engine lives in a native library called librecgnition.so (Ratta's typo). Running strings on it exposed the real architecture: ThreadUpdateEpdc, ioctl, hteink_area_display, myBpService.cpp/myBnService.cpp — a binder service communicating with a kernel driver.
Binder service discovery. System service enumeration (service list) revealed four vendor binder services: eink (android.os.IEinkManager), hteink (hteink.IDeviceManagerService), opt_service, and service_myservice (android.demo.IMyService). The decompiled HandWriteClient classes in Atelier and Notes communicated with service_myservice via raw Parcel transactions — not through the standard Android Canvas/View rendering pipeline, but through direct kernel-level rasterization into the EPD framebuffer.
Mapping the binder protocol. The decompiled HandWriteClient revealed:
- Service name: service_myservice (registered in Android's ServiceManager)
- Interface token: android.demo.IMyService (written into every Parcel; a byte-for-byte match is required or the service silently drops the transaction)
- Transaction codes (from HandWriteClient.WriteInfo): 0 = WRITE_APP_INFO, 1 = DISABLE_AREA_INFO, 2 = PEN_INFO, 3 = SHIFT_INFO, 4 = TRAIL_INFO, 6 = SYNC_BACKGROUND, 13 = SET_RUBBER_INFO
- Magic sentinel rectangles: 18888×18888 to enable drawing, 19999×19999 to disable
EPD framework JAR decompilation. The waveform control came from /system/framework/libeinkpwcoreapi.jar, a Rockchip vendor library on the device. Decompiling it revealed:
- android.os.EinkManager: a system service at Context.getSystemService("eink") with methods like screenRefresh(), enableFullUiAuto(), and sendHwcCmd()
- android.view.View.setEinkUpdateMode(int dataMode, int dispMode): the actual Rockchip-private waveform API (not requestEpdMode as the research agents had claimed)
- Atelier's com.ratta.paint.ReflectUtilities showed how they called these APIs via reflection
Kernel source reading. Supernote publishes their kernel source (GPL obligation) at Supernote-Ratta/kernel_Nomad_Manta. The EPD waveform constants came from drivers/gpu/drm/rockchip/ebc-dev/ebc_dev.h: EPD_AUTO = 0, EPD_FULL_GC16 = 2, EPD_PART_GC16 = 7, EPD_A2 = 12, EPD_DU = 14, EPD_FORCE_FULL = 21. This revealed that earlier code was accidentally requesting EPD_FULL_GC16 — the slow, ghosty mode. Switching to EPD_A2 (1-bit animation waveform, ~120ms) eliminated ghosting entirely.
Confirming third-party access. A critical unknown was whether the binder service would accept transactions from a non-Ratta app. A probe function (probeHandwriteService()) ran through the binder transaction sequence with diagnostic logging at every step. Result: full success from a normal UID 10094 third-party app on stock firmware — the service doesn't check the caller's package name, only the interface token and parcel format. The research agents' claim that this was "impossible" from a sideloaded APK was wrong. Also notable: /dev/ebc (the EPD device node) is crw-rw-rw- — world-readable and world-writable — so direct ioctl access requires no special permissions either.
Phase 3: I Design the Architecture
With binder access confirmed, I designed the drawing architecture. The key insight was eliminating Skia from the live drawing path entirely:
"on the supernote, there should just be no skia rendering at all. the flow would be something like, we do all the drawing/erasing/clearing/undo/redo through the kernel."
The initial implementation tried to keep Skia in sync with the kernel (rendering the same strokes in both places), but this caused visual mismatches — kernel strokes and Skia strokes had different thickness, timing, and anti-aliasing. I identified the fix:
"could we even do something different and just skip skia entirely, and make it more 'screenshot' based? and then use that as the new background?"
This led to the final architecture: the kernel draws strokes directly, and when the user saves, screencap captures the framebuffer region (which includes kernel-drawn strokes that never existed in the app's view tree) and persists it. Skia is only used for re-rendering previously saved notes when re-opening them.
I also defined exactly when Skia should be involved: "the only time skia/react should be used to render strokes is in: 1. preview mode, 2. when opening a previously created one, and then just a singular render for the saved strokes. and then kernel gets put on top of that."
Phase 4: Iterative Debugging
I drove debugging through real-time physical testing on the device, identifying issues that were invisible without hardware:
- "drawing feels really good now. it's perfect. erase and clear cause some weird behaviour though": erase/clear needed syncBackground() calls to flush the kernel's stroke cache
- "strokes kinda just disappear after drawing them?": sync timing bug between kernel and React Native state
- "strokes don't disappear now but still a minor bug where a stroke gets drawn, and then it gets slightly readjusted?" — the Skia/kernel dual-rendering mismatch that led to eliminating Skia
- "i said that the skia strokes are naturally thicker than kernel strokes, i think you flipped them?" — correctness feedback on pen width mapping
- "now it's buggy as shit, like it causes weird screen flashing and weird stuff to happen. just check how atelier does it" — directing back to the decompiled source as ground truth
The Implementation
Two native Kotlin modules expose the reverse-engineered interfaces to React Native:
HandwriteServiceModule.kt (506 lines) — the binder client:
- Acquires the binder via reflection on android.os.ServiceManager.getService("service_myservice")
- Constructs Parcel objects with exact byte-for-byte match to Atelier's format
- disableRects: pixel-coordinate rectangles defining where the kernel is forbidden from drawing (everything outside the canvas area: the four strips of "window minus card" plus the card's header and toolbar)
- syncBackground(): tells the kernel to drop its cached stroke-trail buffer on undo/erase/clear
- captureCanvas(): runs screencap -p which reads the full display framebuffer including kernel-drawn strokes (the only way to persist them, since they never existed in the app's View tree)
- Stale session cleanup: the constructor clears any leftover handwriting session on module init (prevents strokes appearing on every screen touch if the app crashed mid-session)
EpdModeModule.kt (624 lines) — waveform control:
- All APIs accessed via reflection (they're not in the Android SDK)
- setA2(): pins the EPD panel to the fast A2 waveform for active drawing (~120ms, 1-bit)
- setPart(): partial GC16 refresh for idle canvas
- setFull(): full GC16 refresh to clear accumulated ghost residue
- enableFullUiAuto(false): disables the eink service's automatic GC16 promotion during a drawing session (without this, the system fights your A2 mode)
- Walks the entire React Native view tree via reflection to apply waveform mode (necessary because React Native's nested view hierarchy means the target view isn't always the one you'd expect)
- Caches all reflection lookups via by lazy
- Distinguishes failure modes separately: class not found vs. service null vs. SecurityException vs. RemoteException, which is critical for diagnosing SELinux issues on different firmware versions
Canvas integration (HandwritingCanvas.tsx):
- Skia <Canvas> replaced with a static <Image> snapshot when kernel mode is active
- Eraser = same BALL_PEN type but with color byte 255 (white), exactly how Atelier does it (the kernel draws white pixels where the pen touches)
- Pen sizes: {200, 400, 600} for BALL_PEN (max width at full pressure, scales dynamically by EMR pressure)
- Eraser sizes: {400, 1000, 1600, 2200}
- Palm rejection via native StylusOnlyView that filters non-stylus touch events
- All coordinates converted from dp to raw panel pixels (1404×1872 on Nomad); the kernel driver works in raw pixels, matching Atelier's dp→px conversion
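The coordinate handling in the last item, together with the "window minus card" strips used for the disable rectangles, comes down to simple geometry. The sketch below is illustrative only: the function names are hypothetical, the density factor is a caller-supplied assumption (no value for the Nomad is given here), and the card's header and toolbar are treated as part of the surrounding strips rather than as separate rectangles.

```typescript
interface Rect { left: number; top: number; right: number; bottom: number }

// Convert density-independent pixels to raw panel pixels.
function dpToPx(dp: number, density: number): number {
  return Math.round(dp * density);
}

function rectToPx(r: Rect, density: number): Rect {
  return {
    left: dpToPx(r.left, density),
    top: dpToPx(r.top, density),
    right: dpToPx(r.right, density),
    bottom: dpToPx(r.bottom, density),
  };
}

// Compute the strips of "window minus canvas": above, below, left, right.
// Zero-area strips (canvas flush with a window edge) are dropped.
function disableRectsAround(win: Rect, canvas: Rect): Rect[] {
  const strips: Rect[] = [
    { left: win.left, top: win.top, right: win.right, bottom: canvas.top },
    { left: win.left, top: canvas.bottom, right: win.right, bottom: win.bottom },
    { left: win.left, top: canvas.top, right: canvas.left, bottom: canvas.bottom },
    { left: canvas.right, top: canvas.top, right: win.right, bottom: canvas.bottom },
  ];
  return strips.filter((r) => r.right > r.left && r.bottom > r.top);
}
```

The strips never overlap each other or the canvas, so their combined area always equals the window area minus the canvas area, which makes the layout easy to sanity-check.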
7. Technical Tradeoffs & Decisions
7.1 WebView File Access (The Canonical Solution)
Android WebView with source={{ html }} gets about:blank origin → fetch() and XHR to file:// both blocked. Tried: CDN imports (CORS), base64 injection (crashed renderer on large books), various fetch approaches.
Final solution: Write HTML to local file, load via source={{ uri }}, use XHR (not fetch) for file:// reads. This works because XHR has legacy file:// support that fetch() doesn't.
7.2 Bearer Token Auth (No Passwords)
I directed replacing the initial better-auth library with a simpler custom scheme. The bearer token is self-generated by the client, stored in the users.token column (separate from the UUID primary key). No sessions, no OAuth, no password hashing. Email recovery is optional, disabled when Resend env vars aren't set.
7.3 Content-Addressable Book Storage
Books are deduped by SHA-256 hash. The files table holds one row per unique file, shared across all users. The books table is per-user, referencing files with optional title/author overrides. refCount tracks references; when it hits 0, S3 objects are cleaned up.
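The ref-counting lifecycle can be sketched with in-memory stand-ins. Everything here is illustrative: the Library class and its fields are hypothetical, a Set replaces S3, and the SHA-256 digest is assumed to be computed by the caller before addBook is invoked.

```typescript
interface FileRow { sha256: string; refCount: number }
interface BookRow { id: number; userId: string; sha256: string; title?: string }

class Library {
  files = new Map<string, FileRow>(); // one row per unique file
  books: BookRow[] = [];              // per-user references
  blobs = new Set<string>();          // stands in for S3 objects
  private nextId = 1;

  // sha256 is assumed to be the hex digest of the uploaded bytes.
  addBook(userId: string, sha256: string, title?: string): BookRow {
    const existing = this.files.get(sha256);
    if (existing) {
      existing.refCount += 1; // reuse the stored blob
    } else {
      this.blobs.add(sha256); // first upload: store the blob once
      this.files.set(sha256, { sha256, refCount: 1 });
    }
    const book: BookRow = { id: this.nextId++, userId, sha256, title };
    this.books.push(book);
    return book;
  }

  removeBook(bookId: number): void {
    const index = this.books.findIndex((b) => b.id === bookId);
    if (index === -1) return;
    const [book] = this.books.splice(index, 1);
    const file = this.files.get(book.sha256);
    if (!file) return;
    file.refCount -= 1;
    // Last reference gone: clean up the file row and the stored blob.
    if (file.refCount === 0) {
      this.files.delete(book.sha256);
      this.blobs.delete(book.sha256);
    }
  }
}
```

In a real database the increment, decrement, and cleanup would need to run inside a transaction so concurrent uploads and deletes cannot race on refCount.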
7.4 Two-Stack Docker Compose
I directed splitting the single compose file after discovering that MinIO bouncing during app redeploys caused Cloudflare to cache 502s for book downloads. Infra stack (PostgreSQL, Redis, MinIO) is long-lived; app stack (API, Caddy) is rebuilt on every push.
7.5 Rootless Deployment User (ebook-deploy)
Same pattern as the Anna's Archive MCP project, applied independently here. I created a dedicated non-root user for all AI-driven deployment operations on the Olares box:
- ebook-deploy user with no sudo access, in the docker group for container operations
- SSH alias olares-ebook in ~/.ssh/config pointing to ebook-deploy@10.0.0.170
- SSH key-based auth via ssh-copy-id
- All deployment happens via ssh olares-ebook "...", so the AI agent never touches the privileged Olares user
- .env excluded from rsync so production secrets are never overwritten from the laptop
The trust model: every SSH command the AI runs goes through Claude Code's permission prompt. I review commands before they execute. The restricted user means even an approved command can't escalate beyond Docker container lifecycle and the project directory.
This is the second time I applied this pattern (after annas-deploy for the MCP project), establishing it as a repeatable security practice for AI-driven deployments on shared infrastructure.
7.6 Expo Web Over Standalone Web App
The separate apps/web Vite dashboard was deleted (PR #4, -2639 lines) in favor of Expo Web from apps/mobile. This makes the mobile app the single source of truth for all UI, at the cost of some web-specific features being stubbed.
7.7 Offline-First with Optimistic Auth
Auth check on app boot is optimistic: reads token from SecureStore, sets isAuthenticated: true immediately, then probes the server in the background. Only signs out on explicit 401/403 — network errors keep the user signed in. This enables full offline usage.
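The sign-out rule can be sketched as a small decision function plus a boot routine with injected dependencies. All names here are hypothetical; readToken stands in for the SecureStore read and probeServer for the background server probe.

```typescript
type ProbeResult =
  | { kind: "ok" }
  | { kind: "http"; status: number }
  | { kind: "network-error" };

// Only an explicit 401/403 ends the session; network errors never do.
function shouldSignOut(probe: ProbeResult): boolean {
  return probe.kind === "http" && (probe.status === 401 || probe.status === 403);
}

async function bootAuth(
  readToken: () => Promise<string | null>,
  probeServer: (token: string) => Promise<ProbeResult>,
  setAuthenticated: (value: boolean) => void,
): Promise<void> {
  const token = await readToken();
  if (!token) {
    setAuthenticated(false);
    return;
  }
  // Optimistic: the UI is unlocked before the server has answered.
  setAuthenticated(true);
  const result = await probeServer(token);
  if (shouldSignOut(result)) setAuthenticated(false);
}
```

Treating a 500 or a timeout the same as "still signed in" is what keeps the library usable offline; the cost is that a revoked token is only noticed on the next successful probe.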
7.8 Additional architectural tradeoffs worth naming
- Three TTS engines: Chatterbox / Chatterbox-Turbo / Kokoro. Batch jobs use Chatterbox-Turbo, real-time streaming uses Kokoro via /tts/stream. Picking one would mean either accepting bad latency for live read-aloud or bad quality for bulk chapter pre-rendering. Cost: GPU passthrough setup, two model weights, split code paths. Cloud TTS (Azure, ElevenLabs) would avoid all of this but eats per-minute API bills on a reader expected to consume whole books. Evidenced in arch spec §9.
- Reflection-based access to Rockchip private EPD APIs. EpdModeModule.kt reaches View.setEinkUpdateMode, EinkManager.sendHwcCmd, etc. entirely via Java reflection (cached with by lazy). Compile-time linking against a stub vendor jar would pin a specific Rockchip firmware version; NDK dlopen of libeinkpwcoreapi.so would force an ABI commitment. Reflection means one APK works on any firmware that exposes the APIs; cost is per-call dispatch (mitigated by caching) and runtime SELinux/firmware-version failure handling. Inferred from implementation.
- Custom LWW + tombstone-set CRDT (≈150 LOC) instead of Automerge / Yjs. Reading progress is a scalar and annotations are set-shaped, so full CRDT machinery is overkill and would bloat the RN bundle. Tradeoff: tombstones are permanent (can't resurrect deleted highlights), no conflict resolution for overlapping highlight edits, and a device offline beyond the 90-day sync_log retention loses history. Evidenced in sync-engine/src/lww.ts.
- Drizzle helpers for tenancy, not Postgres RLS. Every query is required to use a userId-scoped helper. Postgres Row-Level Security with a session GUC would enforce isolation at the DB instead of the ORM, but complicates connection pooling (SET LOCAL interacts poorly with pgBouncer-style transaction pools) and adds a GUC management surface. Application-level enforcement is simpler for single-process Node but fails open on any new raw query. Inferred.
- Two-stack compose driven by a Cloudflare cache-poisoning bug. MinIO bouncing during app redeploys caused CF to cache 502s on book URLs. Splitting infra + app compose is cheap; the real fix would be origin cache-control headers or moving hot paths off MinIO to R2. An interviewer would probe: is splitting compose the right fix, or a workaround for a caching policy that should have been fixed at the CDN? Evidenced in §7.4.
- Metro-bundling a 9 MB dictionary into the mobile app. A 27-file split of 108K words ships inside the APK/IPA. On-demand download or OTA via Expo Updates would save binary size, but offline is a hard requirement for e-readers (Supernote often used on airplanes / flaky wifi). The 27-file split is a workaround for Metro's per-asset size limits. Server fallback (Wiktionary+WordNet) handles breadth when online. Inferred from build config.
7.9 Strategic / "why do this at all" tradeoffs
- Self-host over Readwise Reader / Matter / KOReader — Readwise and Matter are closed SaaS where highlights, progress, and library live on someone else's servers, optimized for article/RSS workflows not long-form EPUBs on e-ink. KOReader runs locally but has no cloud sync, no web upload dashboard, and a UI built for Kobo/Kindle that fights the Supernote's Android stack. The product thesis is the gap: own the data, sync across devices, render natively on a pen-first e-ink tablet — a combination no existing product ships. Inferred.
- Supernote A5X as the primary target — the Supernote is an Android-based e-ink tablet with an active stylus and a rooted-accessible kernel driver for pen input, unlike locked-down Kindle or Kobo's limited Linux. Targeting it lets me reuse a standard React Native/Expo stack (sideloaded APK) while still reaching handwriting-annotation territory via kernel reverse-engineering. The tradeoff: a vanishingly small user base — but it's my daily driver. The project is n-of-1, architected to also run everywhere else for free. Inferred.
- Full stack (server + sync + web + mobile) vs. reader-only app — a reader-only would have been ~10% of the code. Building server + sync + web dashboard is the only way to get desktop-browser library ingest (you don't sideload EPUBs onto an e-ink tablet over USB comfortably) and cross-device progress sync — together, that's what turns a reader into a Readwise-class product instead of a standalone app. Accepted ops cost (Postgres, Redis, MinIO, tunnel) to avoid the much worse UX. Inferred.
- 1,166-line upfront architecture spec as an AI-collaboration pattern — writing a book-length architecture doc before implementation (then maintaining it as source-of-truth) is deliberate: the spec becomes the context every agent session loads, so Claude doesn't redesign subsystems each invocation. Tradeoff: upfront cost and drift risk (the doc must match reality) vs. dramatically more coherent multi-session agent work. The 43-day gap between Act 1 and Act 2 only works because the spec carries state humans would forget. Evidenced — the spec exists at `ebook-reader-architecture.md`.
- Content-addressable storage as multi-tenancy readiness, not a dedup optimization — SHA-256 keying with a shared `files` table + ref-counting means two users uploading Moby-Dick store one blob, and metadata/cover extraction runs once ever. For a self-hosted single-tenant box this is overkill — the real payoff is cheap multi-tenancy readiness and zero re-upload cost when I reinstall a device or add a family member. Scales 1 → 10,000 users without a rewrite. Inferred.
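The content-addressable bullet above can be illustrated with a minimal sketch. This is not the project's code: `ContentStore`, `FileRow`, and the in-memory maps are hypothetical stand-ins for the shared `files` table and the blob store, and only the SHA-256 keying and ref-counting ideas come from the text.

```typescript
import { createHash } from "node:crypto";

// Hypothetical row shape standing in for the shared files table.
interface FileRow {
  sha256: string;
  sizeBytes: number;
  refCount: number;
}

// In-memory stand-in for Postgres rows plus object storage (assumption).
class ContentStore {
  private files = new Map<string, FileRow>();
  private blobs = new Map<string, Uint8Array>();

  // Stores the bytes under their SHA-256 digest. A second upload of the
  // same bytes only bumps the reference count.
  put(data: Uint8Array): { sha256: string; deduped: boolean } {
    const sha256 = createHash("sha256").update(data).digest("hex");
    const existing = this.files.get(sha256);
    if (existing) {
      existing.refCount += 1;
      return { sha256, deduped: true };
    }
    this.blobs.set(sha256, data);
    this.files.set(sha256, { sha256, sizeBytes: data.byteLength, refCount: 1 });
    return { sha256, deduped: false };
  }

  // Drops one reference. Returns true only when the blob itself was deleted.
  release(sha256: string): boolean {
    const row = this.files.get(sha256);
    if (!row) return false;
    row.refCount -= 1;
    if (row.refCount <= 0) {
      this.files.delete(sha256);
      this.blobs.delete(sha256);
      return true;
    }
    return false;
  }

  refCount(sha256: string): number {
    return this.files.get(sha256)?.refCount ?? 0;
  }

  blobCount(): number {
    return this.blobs.size;
  }
}
```

In this sketch, two users uploading identical bytes produce one blob with a reference count of 2, and the blob is removed only after both references are released. Expensive per-file work such as metadata extraction would hang off the `deduped: false` branch so it runs once per unique file.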
7.10 Code-level tradeoffs visible in the source
- Sync merge logic lives in `@readr/sync-engine`; entity I/O is hand-inlined in `sync.ts` — the CRDT core (`lww.ts`, `set.ts`) is pure functions returning `{action}` verdicts; `routes/sync.ts` then dispatches on `entityType` via three parallel switch statements (`lookupEntity`, `insertEntity`, `updateEntity`). A generic `SyncableEntity` interface would unify them, but each entity has genuinely different update-field semantics (bookmark updates `label`; highlight updates `note`, `color`; note updates `textContent`, `strokes`), so a generic abstraction needs a per-entity allowlist anyway. Cost: adding a fourth entity type means editing four switch blocks. Evidenced.
- Fire-and-forget `pruneSyncLog` piggybacked on every `GET /sync/changes`, not a cron or BullMQ job — `pruneSyncLog(userId).catch(() => {})` runs per-pull, 90-day window, per-user. BullMQ infrastructure already exists for TTS; reusing it would add zero ops surface, but a nightly all-users run has per-user pause/resume semantics to handle. Piggybacking means N concurrent pullers = N concurrent DELETEs on the same user's rows, and a user who never syncs never prunes. Both are noise at single-tenant scale. Evidenced.
- TTS jobs split at chapter granularity, not "generate-book" with an internal loop — `queueTTSJob` writes one `tts_jobs` row with `chaptersTotal`, then enqueues N `generate-chapter` jobs. Benefits: chapter-granular retries (one failed chapter doesn't redo the book), a concurrency knob that comes free from BullMQ's worker concurrency, and a heterogeneous GPU worker pool (Chatterbox-Turbo vs Kokoro) that can race. Cost: chapter progress requires atomic DB increments to sync `chaptersDone`, and payloads duplicate `voiceConfig` per job. Evidenced.
- Untyped `{type, payload}` WebView bridge with dual `window` + `document` listeners — `reader.ts:1446-1452` registers the same handler on both, wrapped in a silent `try/catch`. 20+ message types flow through one switch, with no Zod schemas shared with the server despite `@readr/shared` already existing. Android WebView delivers `postMessage` to `document` on some versions and `window` on others, so dual registration is a defensive hedge, and the silent catch prevents one path's stray events from throwing twice. Staying untyped avoids dragging `@readr/shared` into the esbuild bundle shipped to `file:///android_asset/js/reader-bundle.js` (already size-constrained with foliate+pdfjs inside). Cost: renamed message types surface across the bridge as silent no-ops. Evidenced.
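As a rough illustration of the first bullet above, the sketch below pairs a pure LWW verdict function with a per-entity allowlist of updatable fields. The field lists come from the text; everything else is an assumption for illustration, including the names `lwwMerge`, `pickUpdatable`, and `updatedAt`, the exact verdict values, and the tie-breaking rule (incoming wins only when strictly newer). It is not the code in `lww.ts` or `routes/sync.ts`.

```typescript
type EntityType = "bookmark" | "highlight" | "note";

// Hypothetical timestamp field; the real schema may name it differently.
interface Versioned {
  updatedAt: number; // epoch milliseconds
}

type Verdict = { action: "insert" } | { action: "update" } | { action: "skip" };

// Pure last-writer-wins: no I/O, just a verdict the route layer acts on.
// Assumption: ties keep the stored copy.
function lwwMerge(local: Versioned | undefined, incoming: Versioned): Verdict {
  if (!local) return { action: "insert" };
  return incoming.updatedAt > local.updatedAt
    ? { action: "update" }
    : { action: "skip" };
}

// Per-entity allowlist of mutable fields, as described in the bullet.
const UPDATABLE_FIELDS: Record<EntityType, readonly string[]> = {
  bookmark: ["label"],
  highlight: ["note", "color"],
  note: ["textContent", "strokes"],
};

// Keeps only the fields this entity type is allowed to change on update.
function pickUpdatable(
  entityType: EntityType,
  payload: Record<string, unknown>,
): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const field of UPDATABLE_FIELDS[entityType]) {
    if (field in payload) out[field] = payload[field];
  }
  return out;
}
```

With a table like `UPDATABLE_FIELDS`, adding an entity type would mean adding one map entry for the update path. The lookup and insert paths would still need their own per-entity handling, which is the residual cost the bullet describes.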
8. Major Bugs & Debugging Stories
8.1 The WebView Rendering Saga
The biggest technical challenge. Progression through 4 failed approaches before finding the solution:
- CDN `<script type="module">` — blocked by CORS on null origin
- Base64 injection via `injectedJavaScriptBeforeContentLoaded` — crashed renderer on large books
- `fetch()` to `file://` — blocked
- XHR to `file://` from `about:blank` — also blocked
Solution: Write HTML to local file → load via uri → XHR works from file:// origin.
8.2 Foliate Margin Control
I directed margin customization but foliate-js's internal CSS variables (--_max-inline-size, --_max-block-size) resisted override attempts. After multiple failures, I said: "I'm gonna get codex to fix it, you've lost my trust." Codex also struggled. Root cause: foliate's gap attribute expects a CSS percentage, while max-inline-size expects px with units.
8.3 Drawing Canvas Ghosting
Skia-rendered strokes on e-ink caused severe ghosting that survived 10+ screen refreshes and caused "warping" where strokes changed shape before settling. Led to the kernel-level drawing integration (Section 6.3).
8.4 Page Number Estimation
8+ consecutive commits wrestling with accurate page counts from foliate-js. Progression: fraction math → CSS column counts → section byte-size extrapolation → precompute via hidden <foliate-view> → lock counts after precompute.
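One of the intermediate strategies in that progression, section byte-size extrapolation, can be sketched as follows. This is a hypothetical reconstruction of the idea, not the project's code and not a foliate-js API: it assumes one section has already been rendered and its page count measured, then scales the other sections by their byte size. The function names and the minimum of one page per section are assumptions.

```typescript
// Estimate per-section page counts from byte sizes, given one measured section.
// sectionBytes[i] is the size of section i; measuredPages is the real page
// count observed for section measuredIndex at the current font and viewport.
function estimatePageCounts(
  sectionBytes: number[],
  measuredIndex: number,
  measuredPages: number,
): number[] {
  const refBytes = sectionBytes[measuredIndex];
  if (refBytes === undefined || refBytes <= 0 || measuredPages <= 0) {
    throw new Error("measured section must exist with positive size and pages");
  }
  const bytesPerPage = refBytes / measuredPages;
  return sectionBytes.map((bytes, i) =>
    i === measuredIndex
      ? measuredPages
      : Math.max(1, Math.round(bytes / bytesPerPage)), // assume at least 1 page
  );
}

// 1-based page number across the whole book for a page inside a section.
function globalPage(
  counts: number[],
  sectionIndex: number,
  pageInSection: number,
): number {
  const pagesBefore = counts
    .slice(0, sectionIndex)
    .reduce((sum, n) => sum + n, 0);
  return pagesBefore + pageInSection;
}
```

An estimate like this drifts whenever markup density differs between sections (images, tables, long headings), which is consistent with the progression moving on to a precompute pass and then locking the counts.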
8.5 Cloudflare 403 on Book Covers
Covers from books.hunterchen.ca returned 403 in browser but 200 from curl. AI methodically tested headers, CORS, Vary. Root cause: Cloudflare Hotlink Protection blocking cross-origin image requests. MinIO was never reached.
8.6 Nested Duplicate Files
Found apps/mobile/apps/mobile/apps/mobile/... three levels deep — 70 duplicate files (34 fonts, 26 JS modules). Cleaned up and added to .gitignore.
9. AI Agent Involvement
9.1 Session Data
| Metric | Value |
|---|---|
| JSONL session files | 7 (main) + 2 (worktrees) |
| Total session data | ~289 MB |
| Subagent files | 162 across all sessions |
| Memory files | 5 (WebView patterns, workflow, Olares deploy/env) |
| Date range | Feb 24 – Apr 12, 2026 |
| Largest session | 92.6 MB, 84 subagents (reader rendering, WebView, e-ink) |
9.2 My Direction
I wrote the 1,166-line architecture spec upfront, made all infrastructure/security decisions, tested on real Supernote hardware, and corrected the AI repeatedly:
- WebView file access: I identified the working approach after the AI exhausted 4 failed strategies
- Auth architecture: I directed replacing `better-auth` with bearer tokens
- E-ink drawing: I specified kernel-level integration after Skia ghosting, directed "the only time skia/react should be used to render strokes is in preview mode and when opening a previously created one"
- UI quality: "that's fugly as hell, do more thinking on it", "might be one of the most retarded selection menus i've seen in my life", "still looks a little sloppy tbh"
- Process enforcement: "always typecheck before pushing", "research before guessing", "don't go in circles" — all documented in memory files
- Trust boundary: After repeated margin debugging failures, I brought in Codex for a second opinion, then returned when it also struggled
9.3 Workflow Patterns
- Overnight autonomous cycles: I directed "do all of it. you can work through the night. repeatedly audit yourself. just on a cycle."
- Multiple review passes: "Nah we're gonna keep doing passes. Be really careful this time." — pushed for 5+ review passes on PRs
- Worktree-based PR development: 7 active git worktrees, including 2 created by Claude agents for parallel PR work
- Memory-driven continuity: 5 persistent memory files capturing hard-won WebView lessons, workflow rules, and deployment knowledge
10. Development Timeline
The project tells a story in three acts:
Act 1: Scaffold Sprint (Feb 23-24)
A single late-night session scaffolded the entire application in ~2 hours. Phases 1-6 completed: monorepo, server, mobile app, web dashboard, reader, sync engine, TTS, collections, stats. Spec-driven generation — I wrote the architecture doc, Claude executed phase by phase.
Act 2: Hardware Reality (Apr 8-10, after 43-day gap)
Development resumed with fundamentally different character. Instead of generating scaffolding, the work became about making existing code work on real hardware:
- Auth system replaced (better-auth → bearer tokens)
- Reader went through dozens of iterations for page counting, themes, progress
- All external network dependencies eliminated (fonts, foliate-js, pdf.js bundled locally)
- Deployed to Olares, tested on Supernote
Act 3: Polish & Platform Parity (Apr 11-12)
- Standalone web app deleted, consolidated into Expo Web (PR #4)
- Docker Compose split into infra/app stacks (PR #14)
- Offline mode for mobile (PR #16)
- Web parity sprint (PR #15, 9 tracked issues, still open)
- Kernel-level Supernote handwriting integration
- Self-hosted dictionary endpoint
| Date | Key Achievement |
|---|---|
| Feb 23 | Architecture spec + scaffold start |
| Feb 24 | Entire monorepo scaffolded in ~2 hours (Phases 1-6) |
| Apr 8 | Development resumes. Auth rewrite, reader settings, Docker deployment |
| Apr 9 | Reader UX obsession: themes, page counting, fonts, progress bar. PR #1 (email recovery) |
| Apr 10 | foliate bundling, security audit (UUID migration), architecture doc rewrite. PRs #2, #3 |
| Apr 11 | Web consolidation (PR #4), compose stack split (PR #14), offline mode (PR #16), web parity sprint |
| Apr 12 | Kernel handwriting for Supernote, self-hosted dictionary, canvas fixes |
11. Key Files Reference
Reader Engine
| File | Lines | Purpose |
|---|---|---|
| `mobile/webview-src/reader.ts` | 1183 | EPUB reader JS (runs in WebView) |
| `mobile/components/reader/pdf-html.ts` | 511 | PDF reader (inline JS in WebView) |
| `mobile/app/reader/[bookId].tsx` | ~1380 | Main reader screen (RN side) |
| `mobile/components/notes/HandwritingCanvas.tsx` | ~400 | Skia + kernel drawing |
| `mobile/lib/epd-mode.ts` | ~60 | Supernote EPD waveform control |
| `mobile/lib/handwrite-service.ts` | ~100 | Kernel handwriting binder |
Server
| File | Lines | Purpose |
|---|---|---|
| `server/src/db/schema.ts` | ~400 | 14-table Drizzle schema |
| `server/src/routes/sync.ts` | ~200 | Pull/push sync with LWW + set merge |
| `server/src/routes/books.ts` | ~250 | Upload, dedup, metadata extraction |
| `server/src/services/book-processor.ts` | ~200 | EPUB/PDF metadata + cover extraction |
Sync Engine
| File | Lines | Purpose |
|---|---|---|
| `sync-engine/src/lww.ts` | ~30 | Last-Writer-Wins merge |
| `sync-engine/src/set.ts` | ~80 | Tombstone set merge (permanent deletes) |
| `sync-engine/src/queue.ts` | ~40 | Offline dedup queue |
Shared
| File | Lines | Purpose |
|---|---|---|
| `shared/src/types.ts` | ~200 | All domain types |
| `shared/src/validators.ts` | ~300 | 22 Zod schemas for API validation |
| `shared/src/constants.ts` | ~50 | Highlight colors, TTS engines, limits |