Thinking through system design for mobile engineers

I think half of being a good mobile engineer is understanding the product and the server side, not just the app. I want to share what I have picked up over the years and start writing out loud about it.

System design still feels hard to me, but not because the topics are impossible. Caches, queues, databases, WebSockets, rate limiters, CDNs: you can learn those piece by piece, and there are good resources for them.

The part I want to get better at is the shape of the thinking. As a mobile engineer, I am used to thinking about state, rendering, app lifecycle, offline behavior, retries, background limits, push notifications, crash rates, release safety, and app version compatibility. But a lot of high-level system diagrams start with a box labeled “mobile app” and move on.

That box hides a lot.

This series is my attempt to think through system design in public from the perspective I actually have: a mobile engineer trying to design the whole product, not just the app and not just the backend.

This post is part of my System design for mobile engineers series.

One note on process: I use AI as a thinking and writing assistant. It helps me zoom out, research, outline, pressure-test explanations, and turn rough notes into drafts. I fully review the content before publishing, and the final opinions, tradeoffs, and edits are mine. I am writing these posts to learn and think more clearly at a higher level, not to pretend I already have every answer.

Why I am writing this

I am not writing this because the internet needs another generic system design guide. It probably does not.

There are already great resources that explain the usual system design patterns: clarify requirements, estimate scale, design APIs, choose databases, add caching, shard hot paths, talk through queues, and compare tradeoffs. Those are useful. I am using them too.

But I keep noticing a gap.

A lot of system design content treats mobile and web clients as thin shells around backend systems. The diagram starts with a client, then quickly moves to load balancers, services, queues, and databases. That can be useful, but it skips product behavior that matters a lot in real mobile apps:

What happens when the app opens after being offline all night?
What does the client cache?
What is the source of truth?
Which actions should be optimistic?
What happens if the request succeeds but the response never reaches the phone?
How do push notifications interact with WebSockets?
How do old app versions keep working after the backend changes?
How do we keep badge counts, unread state, and local data from drifting forever?
What should the user see when the network is bad?
What metrics tell us the mobile experience is broken before users complain?

Those are system design questions too.

They change the design. A Slack-like system is not just message storage plus WebSocket fanout. On mobile, it also needs a local database, sync cursors, push notifications, idempotent sends, background recovery, multi-device reconciliation, and careful rules about what the user sees before the server responds.

That is the kind of detail I want to think through more deeply.

The angle for this series

The working title is:

Thinking through system design for mobile engineers

The promise is simple:

Take common product systems and reason through them as full product experiences, with the mobile client treated as a serious part of the architecture.

That means every post will still cover the normal system design material:

requirements
rough scale assumptions
APIs
data model
high-level architecture
bottlenecks
tradeoffs
failure modes

But each post will also include the mobile and client angle:

client state
local caching
offline behavior
sync strategy
push vs realtime delivery
API compatibility
app versioning
retries and idempotency
observability from the client
battery, bandwidth, and app lifecycle constraints

I want these posts to be useful for mobile engineers who want to think more at the system level, and for backend engineers who want to think more carefully about the clients their systems support.

The mental model I want to build

The habit I want to build is this:

Start with product behavior, then design the system that makes that behavior possible.

It is tempting to start by drawing services. I do it too. But the better answers usually start with user-visible behavior.

For example, if we are designing Slack:

When I send a message from my phone, should it appear immediately?
If I go offline after tapping send, does the message stay pending?
If the app retries, how do we avoid duplicates?
If I read the channel on desktop, when should my phone badge clear?
If push arrives late, should the app trust it?
If the app opens after a day offline, what should it sync first?

Once those decisions are clear, the architecture has a job to do.

The client may need an outbox. The send API may need an idempotency key. The backend may need an event log and a sync endpoint. Push becomes a hint, not the source of truth. Read state becomes server-owned state that clients cache locally and converge toward.
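
To make the outbox and idempotency key a little more concrete, here is a minimal Python sketch. Every name in it (`PendingMessage`, `FakeServer`, `Outbox`, `send_message`, `flush`) is hypothetical and made up for illustration. A real client would persist the outbox in a local database and call a real API, and this sketch simply assumes the server remembers which idempotency keys it has already accepted.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class PendingMessage:
    idempotency_key: str  # generated once on the client, reused on every retry
    channel_id: str
    text: str


class FakeServer:
    """Stand-in for the send API (hypothetical). Dedupes by idempotency key."""

    def __init__(self):
        self.messages = []  # stored messages, in accept order
        self._seen = {}     # idempotency_key -> server message id

    def send_message(self, msg: PendingMessage) -> int:
        # A retry with a known key returns the original id instead of storing again.
        if msg.idempotency_key in self._seen:
            return self._seen[msg.idempotency_key]
        message_id = len(self.messages) + 1
        self.messages.append((message_id, msg.channel_id, msg.text))
        self._seen[msg.idempotency_key] = message_id
        return message_id


@dataclass
class Outbox:
    """Client-side queue of sends the server has not yet acknowledged."""

    pending: list = field(default_factory=list)

    def enqueue(self, channel_id: str, text: str) -> PendingMessage:
        msg = PendingMessage(str(uuid.uuid4()), channel_id, text)
        self.pending.append(msg)  # the UI can show this as "pending" right away
        return msg

    def flush(self, server: FakeServer, drop_response: bool = False) -> None:
        """Try to send everything. If drop_response is True, simulate the
        request succeeding while the response never reaches the phone."""
        still_pending = []
        for msg in self.pending:
            server.send_message(msg)
            if drop_response:
                still_pending.append(msg)  # no ack seen, so keep it for retry
        self.pending = still_pending


server = FakeServer()
outbox = Outbox()
outbox.enqueue("general", "hello")

outbox.flush(server, drop_response=True)  # server stored it, client never heard back
outbox.flush(server)                      # retry with the same key

print(len(server.messages))  # 1: the retry did not create a duplicate
print(len(outbox.pending))   # 0: the outbox is drained after the ack
```

The design choice worth noticing is that the key is created when the user taps send, not when the request is made. That is what lets a retry after a lost response be recognized as the same send.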

That is the kind of reasoning I want to get better at.

Proposed format

I am going to use a repeatable structure so each post feels familiar:

Problem statement
Functional and non-functional requirements
Scale assumptions
Product behavior before architecture
API contract
Data model
High-level architecture
Deep dives into the interesting parts
Mobile and client angle
Tradeoffs
Failure modes
How I would explain it simply
What I would study next

The “how I would explain it simply” section matters to me. Knowing the design is one thing. Explaining it clearly is a different skill.

I also want to include diagrams where they help. Not giant architecture posters for every post, but enough visual structure to make the system easier to reason about.

Topics I want to cover

The first few posts will probably be:

How mobile engineers can think about system design
Design Slack: realtime messaging, offline sync, and mobile push
Design push notifications for mobile apps
Design offline-first sync for a notes or tasks app
Design a URL shortener, then extend it for mobile deep links
Design an Instagram-style feed for mobile
Design photo and video upload from mobile to CDN
Design a rate limiter for mobile APIs
Design nearby friends or location sharing
Design mobile feature flags and experimentation

Some of these are classic system design prompts. Some are more mobile-specific. That mix is intentional.

The classic prompts are useful because they exercise familiar architecture patterns. The mobile-specific prompts are useful because they expose the parts of system design that generic guides often compress into one box.

What I am optimizing for

I am not trying to write the most complete answer to every system design question. That is a trap. You can always add another cache, another queue, another index, another failover path.

I want the posts to be practical.

For each system, I want to answer:

What is the core product behavior?
What does the first reasonable architecture look like?
Where does it break as scale grows?
Which parts need strong guarantees?
Which parts can be eventually consistent?
What should the client own?
What should the server own?
What would I say if someone asked me to go deeper?

If I can answer those clearly, the post is doing its job.

A small example: push is not state

One phrase I expect to repeat a lot is:

Push is a hint, not the source of truth.

This comes up in chat, feeds, notifications, reminders, social apps, and collaboration tools.

A push notification can be delayed. It can be collapsed. It can arrive after the user already handled the event on another device. The user can disable notifications. The OS can throttle delivery. The app might be killed.

So if we are designing a messaging app, the push payload should not be treated as the durable message history. It should wake the user or tell the app that something changed. The app still needs to fetch or sync authoritative state from the backend.

That one product detail affects the whole design:

the backend needs a message store
the client needs a sync cursor
events need ids for deduping
read state should reconcile across devices
badge counts should be corrected when the app opens
push payloads should carry identifiers, not the entire source of truth
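
The list above can be sketched as a small sync loop. This is a rough Python illustration under stated assumptions, not a real push or sync API: `Backend`, `Client`, `events_after`, and `on_push` are hypothetical names, the cursor is assumed to be a monotonically increasing event id, and the push payload is assumed to carry only an identifier.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    event_id: int    # monotonically increasing, assigned by the backend (assumption)
    kind: str        # "message" or "read"
    channel_id: str
    text: str = ""


class Backend:
    """Hypothetical authoritative store: an append-only event log."""

    def __init__(self):
        self.log = []

    def append(self, kind, channel_id, text=""):
        event = Event(len(self.log) + 1, kind, channel_id, text)
        self.log.append(event)
        return event

    def events_after(self, cursor):
        # Sync endpoint: everything the client has not seen yet.
        return [e for e in self.log if e.event_id > cursor]


@dataclass
class Client:
    cursor: int = 0                             # last event id applied locally
    seen_ids: set = field(default_factory=set)  # event ids already applied
    unread: dict = field(default_factory=dict)  # channel_id -> unread count

    def on_push(self, payload, backend):
        # The payload is only a hint that something changed. Its contents
        # are ignored and the client syncs authoritative state instead.
        self.sync(backend)

    def sync(self, backend):
        for event in backend.events_after(self.cursor):
            self.apply(event)

    def apply(self, event):
        # Guards against the same event arriving twice, for example once
        # over a realtime connection and once through sync.
        if event.event_id in self.seen_ids:
            return
        self.seen_ids.add(event.event_id)
        if event.kind == "message":
            self.unread[event.channel_id] = self.unread.get(event.channel_id, 0) + 1
        elif event.kind == "read":
            self.unread[event.channel_id] = 0  # read on another device
        self.cursor = max(self.cursor, event.event_id)

    def badge_count(self):
        return sum(self.unread.values())


backend = Backend()
phone = Client()

backend.append("message", "general", "hi")
backend.append("message", "general", "anyone there?")
backend.append("read", "general")  # user read the channel on desktop

# A late push for the first message arrives after all three events exist.
phone.on_push({"event_id": 1}, backend)
phone.on_push({"event_id": 1}, backend)  # duplicate delivery is harmless

print(phone.badge_count())  # 0: the badge reflects synced state, not the push
print(phone.cursor)         # 3
```

Because the phone syncs from its cursor instead of trusting the payload, the late push does not resurrect an unread badge for a channel that was already read elsewhere.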

This is why I think the mobile perspective is useful. It forces the design to handle messy real-world edges.

Who this is for

I am writing this for a few groups:

Mobile engineers who want to think more at the system level
Engineers who know app development and want to understand backend architecture better
Backend engineers who want to think more carefully about client behavior
My future self, who learns better by writing than by passively reading

If you are already very strong at distributed systems, some posts may feel basic. That is fine. My goal is not to impress experts with exotic architecture. My goal is to build a reliable way to reason from product requirements to system design.

How I want feedback

I would love feedback as I write these.

The most useful feedback would be specific:

Was the explanation clear?
Did the architecture feel realistic?
Did I skip an important tradeoff?
Did the mobile angle make the post stronger, or did it feel bolted on?
Is there a clearer way to explain this?
What topic should I cover next?

I am especially interested in feedback from mobile engineers who think deeply about product behavior, and from backend engineers who can point out where my mental model is too client-heavy.

Closing

This series is partly a learning journal and partly a forcing function.

I want to get better at thinking beyond the app without forgetting the app. Good systems are not just scalable backend components. They are the contracts, caches, queues, retries, state machines, and product decisions that make the user experience feel boring in the best way.

That is what I want to build here.

If this was useful, you can buy me a coffee ☕. If you have a question, correction, or a product you want me to think through next, leave a comment.

If you have seen system design questions like these in an interview, I would love to hear what part felt hardest: requirements, APIs, mobile state, scale, offline behavior, or tradeoffs.
