Mobile system design lab

Fake notes app

Trip planning

broken replay
Run progress: 20%

Phone workbench

Trip planning note

Offline queue

Local note saved

The phone is about to treat local state as server truth.

Local draft: user text preserved
Server version: v17
Sync queue: full object PUT

Decision layer

Versioned operation log: preserve both versions.

User impact

Data loss is worse than a visible conflict. Users forgive a merge screen faster than they forgive missing work.

Simulator

Offline edit conflict

The user edits a note underground. A second device edits the same note online before the first phone reconnects.

Subway mode

Design workbench

Choose the client strategy

Try the tempting shortcut, then compare it with the safer production answer. Scores are heuristics derived from the chosen strategy and the active chaos switches.

Safer path

Outcome

The server can reject conflicts and the client can keep both versions visible.

Interview cue: Lead with operation log, base version, server conflict response, then user-visible recovery.

Risk: 22
Trust: 88
Readiness: 65

Chaos controls

Event tape

Step 1

live

Edit note

Step 2

Queue waits offline

Step 3

Another device updates

Step 4

Reconnect

Step 5

Blind overwrite

Sync Chaos Lab

Break a mobile app on purpose.

Pick a real sync failure, flip the chaos switches, advance the event tape, then compare the broken client with the safer design.

What just happened?

A mobile client cannot treat local state as server truth after reconnecting. The app needs a version check and a local operation log so the user can recover without surprise data loss.

Client pattern

local operation log + versioned conflict strategy

Chaos enabled: 0 of 3

Scenario runbook

What to build or say next

Production runbook
  1. Store every offline edit as an operation with a client mutation ID.
  2. Attach the base server version to each replayed mutation.
  3. Keep local text visible until the server accepts, rejects, or returns both versions.
  4. Measure conflict rate and queued operation age before designing a heavy merge UI.
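The first two runbook steps can be sketched as a small in-memory operation log. This is a minimal illustration, not the playground's actual code: `NoteOperation`, `OperationLog`, the `deviceId`-based mutation ID scheme, and the in-memory storage are all assumptions. A real client would persist the log to disk so it survives an app restart.

```typescript
// Hypothetical shape of one queued offline edit.
interface NoteOperation {
  mutationId: string;  // client-generated, stays the same across retries
  noteId: string;
  baseVersion: number; // server version the edit was made against
  text: string;
  queuedAtMs: number;  // used later to measure queued operation age
}

// Hypothetical in-memory log; persistence is left out of this sketch.
class OperationLog {
  private ops: NoteOperation[] = [];
  private nextSeq = 0;
  private deviceId: string;

  constructor(deviceId: string) {
    this.deviceId = deviceId;
  }

  // Record an edit as an operation instead of overwriting a cached object.
  enqueue(noteId: string, baseVersion: number, text: string, nowMs: number): NoteOperation {
    const op: NoteOperation = {
      mutationId: `${this.deviceId}-${this.nextSeq++}`,
      noteId,
      baseVersion,
      text,
      queuedAtMs: nowMs,
    };
    this.ops.push(op);
    return op;
  }

  // Operations still waiting for a server decision, oldest first.
  pending(): readonly NoteOperation[] {
    return this.ops;
  }

  // Drop an operation once the server has accepted or resolved it.
  acknowledge(mutationId: string): void {
    this.ops = this.ops.filter((op) => op.mutationId !== mutationId);
  }
}
```

Each entry carries its own base version, so replay after reconnect can be checked per operation rather than per object.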

Architecture diff

Naive client vs Resilient client

Production

Naive client

  1. Local cache writes directly into UI state
  2. Reconnect sends a blind PUT
  3. Server accepts the last write
  4. Client replaces state with server response

Resilient client

  1. Write a local operation with a client mutation ID
  2. Attach base server version to the mutation
  3. Server rejects mismatched versions
  4. Client opens a conflict resolver and then reconciles
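The server-side half of the resilient path, rejecting a mutation whose base version no longer matches, might look like the sketch below. The types `StoredNote`, `Mutation`, and `SyncResult` and the function `applyMutation` are hypothetical names chosen for illustration, and the "increment by one" versioning is an assumption.

```typescript
interface StoredNote {
  version: number;
  text: string;
}

interface Mutation {
  mutationId: string;
  baseVersion: number; // version the client last saw
  text: string;
}

type SyncResult =
  | { kind: "accepted"; note: StoredNote }
  | { kind: "conflict"; server: StoredNote; rejected: Mutation };

// Accept only when the client edited the version the server still holds.
function applyMutation(current: StoredNote, m: Mutation): SyncResult {
  if (m.baseVersion !== current.version) {
    // Return both sides so the client can show them, rather than picking a winner.
    return { kind: "conflict", server: current, rejected: m };
  }
  return { kind: "accepted", note: { version: current.version + 1, text: m.text } };
}
```

In the subway scenario, the second device has already moved the note past v17, so the phone's mutation with `baseVersion: 17` comes back as a conflict instead of silently overwriting the newer text.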

Debug signals

conflict_rate_by_entity
queued_operation_age_p95
reconnect_merge_success
manual_conflict_resolution_rate
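Two of these signals can be computed from data the client already has. The sketch below assumes a nearest-rank percentile for `queued_operation_age_p95` and a simple conflicts-over-attempts ratio for the conflict rate. The page does not define either formula, so both definitions and the function names are assumptions.

```typescript
// Nearest-rank percentile: the smallest value with at least p% of samples at or below it.
function percentile(values: number[], p: number): number {
  if (values.length === 0) return 0;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(0, rank - 1)] ?? 0;
}

// Age of each queued operation at the moment of measurement, in milliseconds.
function queuedAges(queuedAtMs: number[], nowMs: number): number[] {
  return queuedAtMs.map((t) => nowMs - t);
}

// Share of sync attempts that ended in a conflict; 0 when nothing was attempted.
function conflictRate(conflicts: number, attempts: number): number {
  return attempts === 0 ? 0 : conflicts / attempts;
}
```

For example, `percentile(queuedAges(timestamps, Date.now()), 95)` would give a p95 queue age to report per sync session.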

Trade-off map

Speed score: 70

Local writes keep the UI fast.

Trust score: 92

Conflict visibility protects user work.

Complexity score: 78

Operation logs and merge UI add surface area.

Interview answer

Say the invariant first, then the mechanism.

I would model offline edits as an operation log, not just cached objects. Each mutation carries the base version it was created from. On reconnect, the server can accept, reject, or return both versions. The client keeps the user's local work visible while guiding the user through merge or overwrite choices.
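On the client, the three server outcomes named in this answer can be handled by a small state transition that never discards the local text. This is a sketch under assumptions: `ServerReply`, `NoteViewState`, `applyReply`, and the status names are hypothetical and not taken from the playground.

```typescript
type ServerReply =
  | { kind: "accepted"; version: number }
  | { kind: "rejected"; reason: string }
  | { kind: "conflict"; serverVersion: number; serverText: string };

interface NoteViewState {
  localText: string;   // the user's own work, kept through every outcome
  serverText?: string; // set only when both versions must be shown
  version: number;     // last server version the client knows about
  status: "pending" | "synced" | "rejected" | "conflict";
}

function applyReply(state: NoteViewState, reply: ServerReply): NoteViewState {
  switch (reply.kind) {
    case "accepted":
      return { localText: state.localText, version: reply.version, status: "synced" };
    case "rejected":
      // Keep the draft on screen so the user can retry or copy it out.
      return { ...state, status: "rejected" };
    case "conflict":
      // Show both versions; the merge or overwrite choice belongs to the user.
      return {
        localText: state.localText,
        serverText: reply.serverText,
        version: reply.serverVersion,
        status: "conflict",
      };
  }
}
```

No branch replaces `localText` with server data, which is the invariant the answer leads with.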

Pick the safest design move

Pick the safest design move when the app reconnects and the server has a newer version.

Sync Chaos Lab

A small mobile system design playground for client-side reliability. It shows why mobile apps need local queues, idempotency, freshness checks, and reconciliation instead of assuming the network behaves.
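Idempotency is the piece that makes retries safe: if a request succeeds but the response is lost, the client will replay the same mutation, and the server should not apply it twice. A minimal sketch, assuming the server remembers results by client mutation ID (`IdempotentReceiver` and its in-memory `Map` are hypothetical; a real service would use durable storage with an expiry policy):

```typescript
class IdempotentReceiver {
  // mutationId -> version produced when that mutation was first applied
  private seen = new Map<string, number>();
  private version = 0;

  apply(mutationId: string): { version: number; replayed: boolean } {
    const prior = this.seen.get(mutationId);
    if (prior !== undefined) {
      // Same mutation arriving again: return the original result, change nothing.
      return { version: prior, replayed: true };
    }
    this.version += 1;
    this.seen.set(mutationId, this.version);
    return { version: this.version, replayed: false };
  }
}
```

A retried mutation therefore gets the same answer as its first delivery, and the stored version advances only once.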

Start with the bigger framing post: System design interviews for mobile engineers. Then follow the mobile system design series for deeper examples.

What it teaches
