Real-time · Published

A real-time multiplayer canvas: a grow-only CRDT and Lamport ordering

Liveboard · 2026 · ~8 min read

The thing I keep coming back to about this one is that it shipped and people actually used it. The board at /liveboard has relayed 701 real messages between strangers' browsers — not a synthetic load test, just traffic. Open it, draw, and watch your strokes appear on every other connected screen in the same order they appear on yours. That last clause is the entire engineering problem, and it's harder than it sounds.

The problem (and why it's actually hard)

A shared canvas with many simultaneous drawers has no single source of truth. Edits arrive concurrently and out of order. A client that joins late has to see the board as it already stands, not an empty page. And the server can restart or scale to multiple instances in the middle of a session.

The requirement underneath all of that is convergence: every client must end up rendering the same canvas, regardless of the order messages arrived in or which server instance they came through. That's the property that breaks first under the naive approach.

The naive approach is "broadcast each stroke and append it on receipt." It fails on ordering. If client A draws a red line and client B draws a blue line that overlaps it, the final picture depends on which stroke is painted on top — its z-order. With plain broadcast, A might receive red-then-blue while B receives blue-then-red, and the two screens now disagree about which color is on top. There's no central clock to break the tie, and adding one (a server-side lock that serializes every stroke) is exactly the coordination bottleneck that kills the thing under load.

So the real question is: how do you get a globally consistent order out of a system where nobody agrees on what "first" means?

How it works

A stroke is identified by (siteId, seq), the originating client plus a per-client sequence number, and carries a Lamport timestamp. The strokes form a grow-only set: clients only ever add to it. To render, every client sorts the set by (lamport, siteId) and paints in that order. Because the sort key is total and deterministic, two clients holding the same set of strokes always produce the same z-order, even if those strokes arrived in completely different sequences. The siteId is the tiebreaker for equal Lamport values, and since a Lamport clock advances on every local event, one client never issues the same value twice, so there's never an ambiguous comparison.
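As a minimal sketch of that rule (not the project's actual code), the grow-only set and its render order can be written in a few lines of Java. The record and class names here are hypothetical, and the only fields shown are the ones the ordering needs.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stroke shape: identity is (siteId, seq); lamport drives z-order.
record Stroke(String siteId, long seq, long lamport, String color) {}

record StrokeId(String siteId, long seq) {}

// A grow-only set of strokes: add is the only mutation.
final class StrokeSet {
    private final Map<StrokeId, Stroke> strokes = new HashMap<>();

    // Adding a stroke that is already present changes nothing.
    void add(Stroke s) {
        strokes.putIfAbsent(new StrokeId(s.siteId(), s.seq()), s);
    }

    // Paint order: Lamport timestamp first, siteId as the tiebreaker.
    List<Stroke> renderOrder() {
        List<Stroke> out = new ArrayList<>(strokes.values());
        out.sort(Comparator.comparingLong(Stroke::lamport)
                .thenComparing(Stroke::siteId));
        return out;
    }
}
```

Two `StrokeSet` instances that receive the same strokes in different arrival orders, duplicates included, return equal lists from `renderOrder()`, which is the convergence property in miniature.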

Transport is a raw WebSocket at /ws/rooms/{roomId}. The server relays each draw to the rest of the room and, for horizontal scale, publishes it to a RoomBus so it also reaches clients connected to other instances. A late joiner gets a snapshot of the whole room on connect, then the live relay takes over.

client A ──draw──▶ ┌────────────┐
                   │ instance A │──┐
                   └────────────┘  │ publish frame
                         ▲         ▼
                         │    ┌──────────┐
                   relay │    │  Redis   │
                         │    │ pub/sub  │
                         │    └────┬─────┘
                   ┌────────────┐  │ deliver
client B ◀──draw── │ instance B │◀─┘
                   └─────┬──────┘
                         │
late joiner ──join───────┤
                         ▼
                ┌──────────────────┐
                │ snapshot(roomId) │  sorted (lamport, siteId)
                │ Postgres + mem   │  ─▶ then live relay
                └──────────────────┘

A draw on instance A reaches clients on B via Redis pub/sub; a late joiner is bootstrapped with a snapshot, then switched to the live relay.
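The late-join path can be sketched as follows. This is an illustrative single-instance model, not the real server: `Room`, `join`, and `draw` are hypothetical names, a `Consumer` stands in for a client's socket, and the single-lock design is an assumption made to keep the example small.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical stroke shape, reduced to what ordering and identity need.
record Stroke(String siteId, long seq, long lamport, String color) {}

record StrokeId(String siteId, long seq) {}

final class Room {
    private final Map<StrokeId, Stroke> strokes = new LinkedHashMap<>();
    private final Map<String, Consumer<Stroke>> members = new LinkedHashMap<>();

    // Late joiner: build the sorted snapshot and register for live draws
    // under one lock, so no draw can fall between the two steps.
    synchronized List<Stroke> join(String clientId, Consumer<Stroke> onLiveDraw) {
        List<Stroke> snapshot = new ArrayList<>(strokes.values());
        snapshot.sort(Comparator.comparingLong(Stroke::lamport)
                .thenComparing(Stroke::siteId));
        members.put(clientId, onLiveDraw);
        return snapshot;
    }

    // Store the stroke, then relay it to everyone except the sender.
    // A stroke that is already stored is dropped without a second relay.
    synchronized void draw(String fromClientId, Stroke s) {
        if (strokes.putIfAbsent(new StrokeId(s.siteId(), s.seq()), s) != null) {
            return;
        }
        members.forEach((id, sink) -> {
            if (!id.equals(fromClientId)) {
                sink.accept(s);
            }
        });
    }
}
```

Even if a real implementation let a stroke appear in both the snapshot and the live stream, the client-side set would absorb the duplicate, so the hand-off does not need to be exactly-once.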

The design decisions that mattered

A grow-only-set CRDT with Lamport ordering, not operational transforms

I chose a grow-only set (G-set) keyed by (siteId, seq), ordered by a Lamport timestamp, over operational transforms or a central server lock. A G-set is the simplest CRDT there is: the only operation is "add," and union is commutative, associative, and idempotent — so it converges with no coordination at all. Two clients that have seen the same strokes are in the same state by definition, and re-delivering a stroke is a no-op.
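The three algebraic properties are easy to see in a generic sketch. The `GSet` class below is a textbook grow-only set written for illustration, not code from the project; its merge is plain set union.

```java
import java.util.HashSet;
import java.util.Set;

// Minimal grow-only set: the state is a set and merge is set union.
final class GSet<T> {
    private final Set<T> items = new HashSet<>();

    void add(T item) {
        items.add(item);
    }

    // Union is commutative, associative, and idempotent, so replicas can
    // merge in any order and any number of times and reach the same state.
    void merge(GSet<T> other) {
        items.addAll(other.items);
    }

    // Immutable copy of the current contents.
    Set<T> value() {
        return Set.copyOf(items);
    }
}
```

Merging replica A into B and B into A leaves both with the same `value()`, and repeating either merge changes nothing.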

Strokes are append-only, which is what makes this fit. There's no "move this line 10px left" or "delete that stroke" operation to reconcile, so there's no edit- or delete-conflict resolution to get wrong: the entire class of problems that operational transforms exist to solve doesn't arise. OT would have been strictly more machinery for a problem I didn't have.

What I traded away is true erase. A grow-only set has no tombstones, so there is no way to remove a stroke once it's in the set. For a collaborative sketchpad that's an acceptable cut, and I'd rather name it than pretend it isn't there. If erase becomes a requirement, the honest answer is a different CRDT (more on that below), not a patch to this one.

The Lamport ordering is the load-bearing detail. Concurrent strokes get a total order from (lamport, siteId), and the persistence layer reads it back in exactly that order:

```sql
SELECT stroke_id, site_id, lamport, color, points
FROM stroke
WHERE room_id = :roomId
ORDER BY lamport, site_id;   -- the canvas's convergent z-order
```
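The post does not show how the Lamport values themselves are produced. The sketch below uses the textbook Lamport rule (increment on a local event, jump past the remote value on receipt); the class and method names are hypothetical, and the real clients may implement it differently.

```java
import java.util.concurrent.atomic.AtomicLong;

// Textbook Lamport clock: tick on local events, fast-forward on receipt.
final class LamportClock {
    private final AtomicLong counter = new AtomicLong();

    // Called when this site creates a stroke; returns the stroke's timestamp.
    long tick() {
        return counter.incrementAndGet();
    }

    // Called when a remote stroke arrives: advance past the remote timestamp
    // so later local strokes sort after everything this site has seen.
    long observe(long remote) {
        return counter.updateAndGet(local -> Math.max(local, remote) + 1);
    }
}
```

Under this rule a stroke drawn after seeing another stroke always carries a larger timestamp, so it paints on top, while truly concurrent strokes fall back to the siteId tiebreaker.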

Raw WebSocket, not Socket.IO or STOMP

I hand-rolled the protocol on a raw WebSocket instead of pulling in Socket.IO or STOMP. The wire schema here is tiny — draw, cursor, presence, snapshot — and owning it outright meant no framework overhead and no fighting an abstraction over a message shape I'd already designed. On the server, each session runs on a Java 21 virtual thread with a per-session send lock, because WebSocketSession.sendMessage isn't thread-safe and a fan-out can race two writes onto one socket.
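A per-session send lock can be sketched without any framework types. In the example below, `FrameSink` is a hypothetical stand-in for the socket and `LockedSession` is an illustrative wrapper; the choice of `ReentrantLock` is an assumption, not something the post states.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a socket whose send must not run concurrently.
interface FrameSink {
    void send(String frame);
}

// Wraps a sink so that only one thread writes to it at a time.
final class LockedSession {
    private final FrameSink sink;
    private final ReentrantLock sendLock = new ReentrantLock();

    LockedSession(FrameSink sink) {
        this.sink = sink;
    }

    // Fan-out threads may call this concurrently; writes are serialized.
    void send(String frame) {
        sendLock.lock();
        try {
            sink.send(frame);
        } finally {
            sendLock.unlock();
        }
    }
}
```

One reason to prefer `ReentrantLock` over a `synchronized` block on Java 21 is that a virtual thread blocking inside `synchronized` pins its carrier thread, whereas waiting on a `ReentrantLock` does not.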

The trade-off is that I had to build reconnect and heartbeat myself — the client reconnects with backoff rather than getting it for free from a library. Given the CRDT, that's cheap insurance: a reconnecting client just asks for a fresh snapshot and converges again, so a dropped connection is a non-event rather than a corruption risk.
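The backoff schedule itself is simple arithmetic. The real client runs in the browser, so the following is only a language-neutral illustration written in Java to match the other examples; the `Backoff` class, the doubling factor, and the base and cap values are all assumptions.

```java
import java.time.Duration;

// Capped exponential backoff: base, 2*base, 4*base, ... up to cap.
final class Backoff {
    private static final int MAX_EXPONENT = 16;

    private final Duration base;
    private final Duration cap;
    private int attempt = 0;

    Backoff(Duration base, Duration cap) {
        this.base = base;
        this.cap = cap;
    }

    // Delay to wait before the next reconnect attempt.
    Duration next() {
        long uncapped = base.toMillis() << attempt; // base * 2^attempt
        // Bound the exponent so the shift stays far from overflow
        // for sane base delays.
        if (attempt < MAX_EXPONENT) {
            attempt++;
        }
        return Duration.ofMillis(Math.min(uncapped, cap.toMillis()));
    }

    // Call after a successful reconnect to start the schedule over.
    void reset() {
        attempt = 0;
    }
}
```

With a 500 ms base and a 10 s cap, the delays run 500, 1000, 2000, 4000, 8000 ms and then hold at 10000 ms. After a successful reconnect the client would call `reset()` and request a fresh snapshot.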

Redis pub/sub for fan-out, Postgres for durability — both optional

Cross-instance fan-out goes through a single seam, the RoomBus. The default is a no-op LocalRoomBus (single instance); under the cluster profile, a RedisRoomBus publishes each room's frames to Redis pub/sub so a draw on instance A reaches clients on instance B. Crucially, correctness doesn't lean on Redis ordering: the CRDT already handles out-of-order delivery, so the bus only has to deliver eventually, not in order. Persistence is the same story: under the persistence profile, each batch of strokes is written through to Postgres and a room replays from the database on first access, so the canvas survives a restart. Both layers sit behind a Spring profile with a no-op fallback, so the rest of the site takes on no dependency it doesn't need.
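The seam can be pictured roughly like this. Only the names `RoomBus` and `LocalRoomBus` come from the post; the method signatures, the `RoomRelay` class, and the use of plain strings for frames are hypothetical, and Spring profile wiring is left out so the sketch stays self-contained.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.BiConsumer;

// Assumed shape of the seam: publish a room's frame to other instances,
// and register a handler for frames that other instances publish.
interface RoomBus {
    void publish(String roomId, String frame);

    void subscribe(BiConsumer<String, String> onRemoteFrame);
}

// Single-instance default: there are no other instances, so both are no-ops.
final class LocalRoomBus implements RoomBus {
    @Override
    public void publish(String roomId, String frame) {}

    @Override
    public void subscribe(BiConsumer<String, String> onRemoteFrame) {}
}

// Illustrative relay: a list stands in for the sockets on this instance.
final class RoomRelay {
    private final RoomBus bus;
    private final List<String> delivered = new CopyOnWriteArrayList<>();

    RoomRelay(RoomBus bus) {
        this.bus = bus;
        // Frames from other instances go to local clients only.
        bus.subscribe((roomId, frame) -> delivered.add(frame));
    }

    void onLocalDraw(String roomId, String frame) {
        delivered.add(frame);       // relay to clients on this instance
        bus.publish(roomId, frame); // no-op locally, a broker when clustered
    }

    List<String> delivered() {
        return List.copyOf(delivered);
    }
}
```

The relay code is identical in both modes; swapping the bus implementation is the only difference between one instance and several. A Redis-backed bus would additionally need to ignore frames it published itself, since a pub/sub subscriber also receives its own messages.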

The honest note: production currently runs the in-memory default profile. Durable Postgres persistence is built, tested, and profile-gated — but the deployed instance hasn't had a database wired to it yet, so a restart today clears the board. That's a deferred deploy-time wire-up, not missing code.

Does it actually work?

The numbers come straight off the running process. The /liveboard page polls /api/v1/liveboard/metrics, which returns live gauges and running totals — total_strokes, messages_relayed, peak_concurrent, rooms_active — read directly from the live server, not a separate analytics store. The headline figure is 701 messages relayed: real traffic from real visitors, which is why the dashboard shows its own traction rather than asserting it.
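In-process counters of this kind can be kept with a handful of atomics. The sketch below is an assumption about the general shape, not the project's code: the class and method names are hypothetical, and it counts one relayed message per relay call, which may differ from how the real `messages_relayed` total is computed.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Per-process counters; the map keys mirror the fields named in the post.
final class LiveboardMetrics {
    private final AtomicLong totalStrokes = new AtomicLong();
    private final AtomicLong messagesRelayed = new AtomicLong();
    private final AtomicLong concurrent = new AtomicLong();
    private final AtomicLong peakConcurrent = new AtomicLong();

    void onStroke() {
        totalStrokes.incrementAndGet();
    }

    void onRelay() {
        messagesRelayed.incrementAndGet();
    }

    // Track the high-water mark of simultaneously connected clients.
    void onConnect() {
        long now = concurrent.incrementAndGet();
        peakConcurrent.accumulateAndGet(now, Math::max);
    }

    void onDisconnect() {
        concurrent.decrementAndGet();
    }

    // What a metrics endpoint could serialize; roomsActive is supplied
    // by the caller because rooms are tracked elsewhere.
    Map<String, Long> snapshot(long roomsActive) {
        Map<String, Long> m = new LinkedHashMap<>();
        m.put("total_strokes", totalStrokes.get());
        m.put("messages_relayed", messagesRelayed.get());
        m.put("peak_concurrent", peakConcurrent.get());
        m.put("rooms_active", roomsActive);
        return m;
    }
}
```

Because the counters live in one process's memory, every value is per-instance and resets on restart.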

The convergence and fan-out properties are pinned by the backend test suite, green on JDK 21. Cross-instance fan-out specifically is covered by a Testcontainers Redis test in CI, so the "a draw on A reaches B" path is exercised against a real Redis, not a mock.

What these prove: the relay path works under real use, the CRDT ordering logic is covered, and the cross-instance fan-out is real against a real broker. What they don't prove: the metrics are per-instance, so peak_concurrent is a per-process number, and the durable-persistence path — while tested — isn't what the live demo is running right now.

What I'd do differently / what's next

Three honest limitations, each with its 10x version:

Wire Neon Postgres for durable strokes. The code exists and is tested; the live instance just needs a database attached and the persistence profile flipped on. This is the smallest, highest-value next step — it's the difference between "the board resets on deploy" and "the board is permanent."
A delete-tolerant CRDT, if erase is ever needed. The grow-only set is deliberately one-way. The standard upgrade is a 2P-set (two-phase set with tombstones) or an observed-remove set — but tombstones bring their own garbage-collection and metadata-growth problems, so I'd only take that on if erase became a real requirement rather than building it speculatively.
Accurate cross-instance presence. The "N online" count is per-instance today. A correct cluster-wide roster needs Redis-backed membership plus heartbeats to expire stale clients — deliberately out of scope for the current single-instance demo, but the obvious next seam once the cluster profile is the default posture.
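
For the erase item, a 2P-set shows where the extra cost comes from. This is a generic textbook sketch with hypothetical names, not a planned implementation: removal is recorded as a tombstone that is never discarded, which is the metadata growth mentioned above.

```java
import java.util.HashSet;
import java.util.Set;

// Two-phase set: one grow-only set of adds, one grow-only set of tombstones.
final class TwoPhaseSet<T> {
    private final Set<T> added = new HashSet<>();
    private final Set<T> removed = new HashSet<>(); // tombstones, never shrink

    void add(T item) {
        added.add(item);
    }

    // Only an element that has been added can be removed.
    void remove(T item) {
        if (added.contains(item)) {
            removed.add(item);
        }
    }

    // An element is visible if it was added and never removed.
    // Once removed, it cannot come back, even if it is added again.
    boolean contains(T item) {
        return added.contains(item) && !removed.contains(item);
    }

    // Merge is a union of both halves, so it converges like a G-set.
    void merge(TwoPhaseSet<T> other) {
        added.addAll(other.added);
        removed.addAll(other.removed);
    }

    int tombstoneCount() {
        return removed.size();
    }
}
```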

The shape I'm happy with: the hard part — convergence under out-of-order, multi-instance delivery — is solved by a data structure (the CRDT), not by coordination, which is why the same design works whether it's one instance or several.

Try it. Draw on the live canvas at /liveboard; the code is on GitHub.