Thinking through system design for mobile engineers
I think half of being a good mobile engineer is understanding the product and the server side, not just the app. I want to share what I have picked up over the years and start writing out loud about it.
The individual topics are not the hard part. Caches, queues, databases, WebSockets, rate limiters, CDNs: you can learn those piece by piece, and there are good resources for them.
The part I want to get better at is the shape of the thinking. As a mobile engineer, I am used to thinking about state, rendering, app lifecycle, offline behavior, retries, background limits, push notifications, crash rates, release safety, and app version compatibility. But a lot of high-level system diagrams start with a box labeled “mobile app” and move on.
That box hides a lot.
This series is my attempt to think through system design in public from the perspective I actually have: a mobile engineer trying to design the whole product, not just the app and not just the backend.
This post is part of my System design for mobile engineers series.
One note on process: I use AI as a thinking and writing assistant. It helps me zoom out, research, outline, pressure-test explanations, and turn rough notes into drafts. I fully review the content before publishing, and the final opinions, tradeoffs, and edits are mine. I am writing these posts to learn and think more clearly at a higher level, not to pretend I already have every answer.
Why I am writing this
I am not writing this because the internet needs another generic system design guide. It probably does not.
There are already great resources that explain the usual system design patterns: clarify requirements, estimate scale, design APIs, choose databases, add caching, shard hot paths, talk through queues, and compare tradeoffs. Those are useful. I am using them too.
But I keep noticing a gap.
A lot of system design content treats mobile and web clients as thin shells around backend systems. The diagram starts with a client, then quickly moves to load balancers, services, queues, and databases. That can be useful, but it skips product behavior that matters a lot in real mobile apps:
- What happens when the app opens after being offline all night?
- What does the client cache?
- What is the source of truth?
- Which actions should be optimistic?
- What happens if the request succeeds but the response never reaches the phone?
- How do push notifications interact with WebSockets?
- How do old app versions keep working after the backend changes?
- How do we keep badge counts, unread state, and local data from drifting forever?
- What should the user see when the network is bad?
- What metrics tell us the mobile experience is broken before users complain?
Those are system design questions too.
They change the design. A Slack-like system is not just message storage plus WebSocket fanout. On mobile, it also needs a local database, sync cursors, push notifications, idempotent sends, background recovery, multi-device reconciliation, and careful rules about what the user sees before the server responds.
That is the kind of detail I want to think through more deeply.
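To make one of those details concrete, here is a minimal sketch of multi-device read-state reconciliation. It rests on assumptions that are mine for illustration: the server assigns each message in a channel an increasing sequence number, every message counts toward unread, and the names `ReadState`, `mark_read`, `merge`, and `unread_count` are hypothetical. A real product would also have to handle threads, deletions, and muted channels.

```python
from dataclasses import dataclass, field


@dataclass
class ReadState:
    """Per-channel read position: the highest message sequence number read."""
    last_read_seq: dict[str, int] = field(default_factory=dict)

    def mark_read(self, channel: str, seq: int) -> None:
        # Read positions only move forward, so a late or duplicate update is a no-op.
        self.last_read_seq[channel] = max(self.last_read_seq.get(channel, 0), seq)

    def merge(self, other: "ReadState") -> None:
        # Converge toward another copy (for example the server's) by taking
        # the furthest read position per channel.
        for channel, seq in other.last_read_seq.items():
            self.mark_read(channel, seq)


def unread_count(latest_seq: int, state: ReadState, channel: str) -> int:
    # Assumes sequence numbers are dense, so the gap equals the unread count.
    return max(0, latest_seq - state.last_read_seq.get(channel, 0))


server = ReadState()
phone = ReadState()
phone.mark_read("general", 40)   # read on the phone before going offline
server.mark_read("general", 57)  # read later on desktop, recorded by the server
phone.merge(server)              # the phone opens and pulls the server's copy
print(unread_count(60, phone, "general"))  # 3
```

Because the merge only ever moves a read position forward, the order in which devices report does not matter, and a stale badge gets corrected the next time the phone pulls the server's copy.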
The angle for this series
The working title is:
Thinking through system design for mobile engineers
The promise is simple:
Take common product systems and reason through them as full product experiences, with the mobile client treated as a serious part of the architecture.
That means every post will still cover the normal system design material:
- requirements
- rough scale assumptions
- APIs
- data model
- high-level architecture
- bottlenecks
- tradeoffs
- failure modes
But each post will also include the mobile and client angle:
- client state
- local caching
- offline behavior
- sync strategy
- push vs realtime delivery
- API compatibility
- app versioning
- retries and idempotency
- observability from the client
- battery, bandwidth, and app lifecycle constraints
I want these posts to be useful for mobile engineers who want to think more at the system level, and for backend engineers who want to think more carefully about the clients their systems support.
The mental model I want to build
The habit I want to build is this:
Start with product behavior, then design the system that makes that behavior possible.
It is tempting to start by drawing services. I do it too. But the better answers usually start with user-visible behavior.
For example, if we are designing Slack:
- When I send a message from my phone, should it appear immediately?
- If I go offline after tapping send, does the message stay pending?
- If the app retries, how do we avoid duplicates?
- If I read the channel on desktop, when should my phone badge clear?
- If push arrives late, should the app trust it?
- If the app opens after a day offline, what should it sync first?
Once those decisions are clear, the architecture has a job to do.
The client may need an outbox. The send API may need an idempotency key. The backend may need an event log and a sync endpoint. Push becomes a hint, not the source of truth. Read state becomes server-owned state that clients cache locally and converge toward.
That is the kind of reasoning I want to get better at.
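Here is a rough sketch of how the outbox and the idempotency key fit together. Everything in it is an assumption for illustration: `FakeServer` stands in for a real send API, the outbox lives in memory instead of a local database, and the `drop_response` flag only simulates a request that succeeds while its response never reaches the phone.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class PendingMessage:
    idempotency_key: str
    channel: str
    text: str


class FakeServer:
    """Stand-in for the send API: stores each idempotency key at most once."""

    def __init__(self) -> None:
        self.messages: list[str] = []
        self._seen: dict[str, int] = {}  # idempotency key -> message id

    def send(self, key: str, channel: str, text: str) -> int:
        if key in self._seen:
            # A retry of a send that already succeeded: return the same id.
            return self._seen[key]
        self.messages.append(text)
        message_id = len(self.messages)
        self._seen[key] = message_id
        return message_id


@dataclass
class Outbox:
    """Client-side queue of sends the server has not acknowledged yet."""
    pending: list[PendingMessage] = field(default_factory=list)

    def enqueue(self, channel: str, text: str) -> PendingMessage:
        # The key is generated once, at tap time, and reused on every retry.
        msg = PendingMessage(str(uuid.uuid4()), channel, text)
        self.pending.append(msg)
        return msg

    def flush(self, server: FakeServer, drop_response: bool = False) -> None:
        for msg in list(self.pending):
            server.send(msg.idempotency_key, msg.channel, msg.text)
            if drop_response:
                # Simulated lost response: the message stays pending.
                continue
            self.pending.remove(msg)


server = FakeServer()
outbox = Outbox()
outbox.enqueue("general", "hello")
outbox.flush(server, drop_response=True)  # stored, but the client never hears back
outbox.flush(server)                      # retry with the same key
print(len(server.messages), len(outbox.pending))  # 1 0
```

The point of the sketch is that the client never has to know whether the first attempt landed. It retries until it gets an acknowledgement, and the key makes the retry safe.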
Proposed format
I am going to use a repeatable structure so each post feels familiar:
- Problem statement
- Functional and non-functional requirements
- Scale assumptions
- Product behavior before architecture
- API contract
- Data model
- High-level architecture
- Deep dives into the interesting parts
- Mobile and client angle
- Tradeoffs
- Failure modes
- How I would explain it simply
- What I would study next
The “how I would explain it simply” section matters to me. Knowing the design is one thing. Explaining it clearly is a different skill.
I also want to include diagrams where they help. Not giant architecture posters for every post, but enough visual structure to make the system easier to reason about.
Topics I want to cover
The first few posts will probably be:
- How mobile engineers can think about system design
- Design Slack: realtime messaging, offline sync, and mobile push
- Design push notifications for mobile apps
- Design offline-first sync for a notes or tasks app
- Design a URL shortener, then extend it for mobile deep links
- Design Instagram feed for mobile
- Design photo and video upload from mobile to CDN
- Design a rate limiter for mobile APIs
- Design nearby friends or location sharing
- Design mobile feature flags and experimentation
Some of these are classic system design prompts. Some are more mobile-specific. That mix is intentional.
The classic prompts are useful because they exercise familiar architecture patterns. The mobile-specific prompts are useful because they expose the parts of system design that generic guides often compress into one box.
What I am optimizing for
I am not trying to write the most complete answer to every system design question. That is a trap. You can always add another cache, another queue, another index, another failover path.
I want the posts to be practical.
For each system, I want to answer:
- What is the core product behavior?
- What does the first reasonable architecture look like?
- Where does it break as scale grows?
- Which parts need strong guarantees?
- Which parts can be eventually consistent?
- What should the client own?
- What should the server own?
- What would I say if someone asked me to go deeper?
If I can answer those clearly, the post is doing its job.
A small example: push is not state
One phrase I expect to repeat a lot is:
Push is a hint, not the source of truth.
This comes up in chat, feeds, notifications, reminders, social apps, and collaboration tools.
A push notification can be delayed. It can be collapsed. It can arrive after the user already handled the event on another device. The user can disable notifications. The OS can throttle delivery. The app might be killed.
So if we are designing a messaging app, the push payload should not be treated as the durable message history. It should wake the user or tell the app that something changed. The app still needs to fetch or sync authoritative state from the backend.
That one product detail affects the whole design:
- the backend needs a message store
- the client needs a sync cursor
- events need ids for deduping
- read state should reconcile across devices
- badge counts should be corrected when the app opens
- push payloads should carry identifiers, not act as the source of truth
This is why I think the mobile perspective is useful. It forces the design to handle messy real-world edges.
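A small sketch of that list, under assumptions of my own: `EventLog` stands in for the backend message store and its sync endpoint, sequence numbers are assigned by the server, and the push payload carries only an event id. All names are hypothetical, and nothing here is persisted.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    seq: int        # server-assigned, increasing position in the event log
    event_id: str   # stable id used for deduping
    body: str


class EventLog:
    """Stand-in for the backend message store plus a sync endpoint."""

    def __init__(self) -> None:
        self.events: list[Event] = []

    def append(self, event_id: str, body: str) -> Event:
        event = Event(len(self.events) + 1, event_id, body)
        self.events.append(event)
        return event

    def sync(self, after_seq: int) -> list[Event]:
        # Return everything past the client's cursor.
        return [e for e in self.events if e.seq > after_seq]


@dataclass
class Client:
    cursor: int = 0
    seen_ids: set[str] = field(default_factory=set)
    messages: list[str] = field(default_factory=list)

    def apply(self, event: Event) -> None:
        # Dedupe by id so the same event arriving twice is stored once.
        if event.event_id not in self.seen_ids:
            self.seen_ids.add(event.event_id)
            self.messages.append(event.body)
        self.cursor = max(self.cursor, event.seq)

    def sync(self, log: EventLog) -> None:
        for event in log.sync(self.cursor):
            self.apply(event)

    def on_push(self, hinted_event_id: str, log: EventLog) -> None:
        # The payload only says "something changed"; the id is unused here
        # and authoritative state always comes from sync.
        self.sync(log)


log = EventLog()
client = Client()
log.append("m1", "hi")
log.append("m2", "are you there?")
client.on_push("m2", log)     # the push for m1 was dropped; sync still recovers it
client.on_push("m1", log)     # a late push for m1 changes nothing
client.apply(log.events[0])   # a duplicate realtime delivery is ignored
print(client.messages, client.cursor)  # ['hi', 'are you there?'] 2
```

Losing, delaying, or duplicating a push changes when the client syncs, not what it ends up with.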
Who this is for
I am writing this for a few groups:
- Mobile engineers who want to think more at the system level
- Engineers who know app development and want to understand backend architecture better
- Backend engineers who want to think more carefully about client behavior
- My future self, who learns better by writing than by passively reading
If you are already very strong at distributed systems, some posts may feel basic. That is fine. My goal is not to impress experts with exotic architecture. My goal is to build a reliable way to reason from product requirements to system design.
How I want feedback
I would love feedback as I write these.
The most useful feedback would be specific:
- Was the explanation clear?
- Did the architecture feel realistic?
- Did I skip an important tradeoff?
- Did the mobile angle make the post stronger, or did it feel bolted on?
- Is there a clearer way to explain this?
- What topic should I cover next?
I am especially interested in feedback from mobile engineers who think deeply about product behavior, and from backend engineers who can point out where my mental model is too client-heavy.
Closing
This series is partly a learning journal and partly a forcing function.
I want to get better at thinking beyond the app without forgetting the app. Good systems are not just scalable backend components. They are the contracts, caches, queues, retries, state machines, and product decisions that make the user experience feel boring in the best way.
That is what I want to build here.
If this was useful, you can buy me a coffee ☕. If you have a question, correction, or a product you want me to think through next, leave a comment.
If you have seen system design questions like these in an interview, I would love to hear what part felt hardest: requirements, APIs, mobile state, scale, offline behavior, or tradeoffs.