System Design Cheatsheet - How to Crack HLD Round?

We have compiled a cheatsheet of the main HLD (High-Level Design) concepts to help you prepare for the System Design round. Use it as a reference to quickly revise concepts.

System Design rounds are the most unpredictable: neither you nor the interviewer knows in advance which direction the discussion will take or how the round will end. But with a clear understanding of the common concepts, you can make your way through high-level system design interview rounds with much more confidence.

I have taken a fair number of HLD rounds and I believe that I have a sound understanding of what an interviewer is looking for in a High-Level System Design round. In such interviews, the interviewer is looking for five main things:

  1. Your ability to understand the problem correctly and align it with the bigger roadmap of the product.
  2. Your ability to break the system down into small components. Ideally, each of these components should have a single responsibility.
  3. Your ability to choose the best architecture for each of these components. Picking the right architecture = Picking the right battles + Managing trade-offs.
  4. How these components will communicate with each other: async vs. sync communication strategies.
  5. Your approach towards scalability, fault tolerance, etc.

Note: HLD interviews test your ability to design systems, and also your ability to think out loud and act on the feedback you are given. Try to make the session as interactive as possible and drive the interview yourself.

Let’s quickly move to the cheatsheet section.

1. Understand the problem statement

  • Clarify the product requirements and ask relevant questions:
    • What are we building, why, and for whom?
    • List all the actors involved. Example: in a food delivery app, we have a restaurant owner, a user placing a food order, and a delivery agent.
  • Discuss app constraints, if any:
    • Is this app available only in one locality or a single country, or is it a global app?
    • Discuss the scale of the system: MAU, DAU, requests per second, etc. This will help you make informed decisions about which architecture to select.
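
To make the scale discussion concrete, here is a minimal back-of-the-envelope sketch that turns DAU into requests per second. The traffic numbers, the `peak_factor` parameter, and the function name are all assumptions made up for illustration; in an interview you would ask the interviewer for the real figures.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def estimate_rps(dau, requests_per_user_per_day, peak_factor=2.0):
    """Return (average_rps, peak_rps) for the given daily traffic.

    peak_factor is an assumed multiplier for how much busier the
    peak hour is compared to the daily average.
    """
    average_rps = dau * requests_per_user_per_day / SECONDS_PER_DAY
    return average_rps, average_rps * peak_factor


# Hypothetical numbers: 1M DAU, 20 requests per user per day, peak = 3x average.
avg, peak = estimate_rps(1_000_000, 20, peak_factor=3.0)
print(f"average ~ {avg:.0f} RPS, peak ~ {peak:.0f} RPS")
# prints: average ~ 231 RPS, peak ~ 694 RPS
```

A few hundred requests per second and a few hundred thousand lead to very different designs, which is why it is worth settling this number before drawing any boxes.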

2. Very High-Level System Design

  • Start by listing all the possible services that you can think of. Don’t think about their interactions yet. Just try to follow the Single-Responsibility Principle.
  • Now, for each of these services, think about what sort of database you need. This decision will depend on a lot of factors: What is the scale of this service? Do I need consistency here? Also, consider whether you need a reader-writer configuration for the database.
  • Think about whether this service can leverage caching. If yes, include a caching technology like Redis in it.
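
The reader-writer configuration mentioned above can be sketched as a small router that sends writes to one writer node and spreads reads across reader nodes. This is only an illustration: the `ReadWriteRouter` class, the node names, and the "starts with SELECT" rule are assumptions, not how any particular database driver works.

```python
import itertools


class ReadWriteRouter:
    """Route writes to the writer node and reads to reader nodes in turn."""

    def __init__(self, writer, readers):
        self.writer = writer
        # With no readers configured, every query falls back to the writer.
        self._readers = itertools.cycle(list(readers)) if readers else None

    def route(self, sql):
        # Simplifying assumption: anything that starts with SELECT is a read.
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self._readers is not None:
            return next(self._readers)
        return self.writer


router = ReadWriteRouter("writer-db", ["reader-1", "reader-2"])
print(router.route("INSERT INTO orders VALUES (1)"))  # writer-db
print(router.route("SELECT * FROM orders"))           # reader-1
```

Keep in mind that readers are usually updated asynchronously, so a read that immediately follows a write may see stale data. That is exactly the kind of consistency question worth raising with the interviewer.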

High-Level System Design

  • Once you have all the services listed, start drawing the flow of data and the interactions between them.
  • For every interaction, decide whether it needs to be a synchronous call or whether it can be async. Prefer async over sync where possible.
  • For async communication, bring in message queues like Kafka, SQS, etc.
  • Also, draw the complete end-to-end request flow, starting from the user’s device. This will help you improve the flow further.
  • Ask for periodic feedback from the interviewer and make this discussion interactive :)
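
The sync vs. async decision above can be illustrated with a small sketch. Here Python's standard `queue.Queue` and a thread stand in for a real message queue such as Kafka or SQS; the order and notification names are hypothetical and only echo the food delivery example.

```python
import queue
import threading

order_events = queue.Queue()  # in-process stand-in for a Kafka topic or SQS queue
sent_notifications = []


def place_order(order_id):
    """Synchronous part: accept the order and return right away."""
    # Persisting the order is omitted; we only publish an event.
    order_events.put({"type": "ORDER_PLACED", "order_id": order_id})
    return "accepted"


def notification_worker():
    """Asynchronous part: consume events independently of the request path."""
    while True:
        event = order_events.get()
        if event is None:  # sentinel used to stop the worker
            order_events.task_done()
            break
        sent_notifications.append(f"notify restaurant about {event['order_id']}")
        order_events.task_done()


worker = threading.Thread(target=notification_worker, daemon=True)
worker.start()

status = place_order("order-1")  # the caller gets a response immediately
order_events.join()              # demo only: wait until the worker has caught up
order_events.put(None)
worker.join()
print(status, sent_notifications)
```

The user-facing call does not wait for the notification to be sent, so a slow or failing notification service does not slow down order placement.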

Fault Tolerance and Reliability

  • Identify all the single points of failure in the system and try to fix them.
  • Make sure that you discuss things like retries, dead-letter queues (DLQs), timeouts, circuit breakers, etc. with the interviewer and explain how they make your system more fault tolerant.
  • Introduce basic building blocks like Load Balancers, API Gateways, Auto Scaling Groups (ASGs), etc. in your system.
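
As a rough sketch of three of the ideas above (retries, a DLQ, and a circuit breaker), the snippet below shows one possible shape for each. The function and class names, the thresholds, and the in-memory list used as a DLQ are all assumptions for illustration; real systems usually get these from a library or from the queueing service itself.

```python
import time

dead_letter_queue = []  # in-memory stand-in for a real DLQ


def process_with_retries(message, handler, max_attempts=3, base_delay=0.1,
                         sleep=time.sleep):
    """Retry with exponential backoff; park the message in the DLQ on final failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return handler(message)
        except Exception as exc:
            if attempt == max_attempts:
                dead_letter_queue.append({"message": message, "error": str(exc)})
                return None
            sleep(base_delay * 2 ** (attempt - 1))  # base, 2x base, 4x base, ...


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is rejected without trying."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open, failing fast")
            # Timeout elapsed: half-open, so let one trial call through.
        try:
            result = func(*args)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Retries help with short, transient failures, the DLQ keeps failed messages around for inspection instead of losing them, and the circuit breaker stops a struggling downstream service from being hammered with calls that are likely to fail anyway.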

Scaling the System

  • Vertical scaling
    • You scale by adding more power (CPU, RAM) to your existing machine.
  • Horizontal scaling
    • You scale by adding more machines to your pool of resources.
  • Caching
    • Application Cache: A small amount of data can be cached directly on the application’s server. This gives the largest latency reduction, because a cache hit involves no network call.
      • Make sure this data is small, or else the server can run out of memory and crash.
      • Think about how you are going to update this cache, for example with a periodic poll.
    • In-Memory Caching Services: You can use services like Redis to provide a cache layer between your DBs and your application servers. You can size it based on the amount of data you want to store, and it is shared between all application servers.
      • The main con is that it’s a little expensive.
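
A common way to use either kind of cache is the cache-aside pattern with a time-to-live (TTL), sketched below. The `TTLCache` class is a plain dictionary standing in for an application cache or a service like Redis, and `get_restaurant` and `load_from_db` are hypothetical names used only for this example.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._store[key]  # expired: drop it and report a miss
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, self.clock() + self.ttl_seconds)


def get_restaurant(restaurant_id, cache, load_from_db):
    """Cache-aside read: try the cache first, fall back to the DB on a miss."""
    cached = cache.get(restaurant_id)
    if cached is not None:
        return cached
    value = load_from_db(restaurant_id)
    cache.set(restaurant_id, value)
    return value
```

The TTL bounds how stale a cached value can get. A shorter TTL means fresher data but more database reads, which is the trade-off to discuss when you bring caching into a design.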

Load Balancing

  • Public servers of a scalable web service are hidden behind a load balancer. This load balancer evenly distributes load (requests from your users) onto your group/cluster of application servers.
  • Types: smart clients, hardware load balancers, and software load balancers.
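
To show what "evenly distributes load" can mean in practice, here is a minimal round-robin sketch. Round-robin is only one possible strategy, and the `RoundRobinBalancer` class and server names are assumptions for illustration; real load balancers also handle health checks, weights, and connection counts.

```python
import itertools


class RoundRobinBalancer:
    """Hand out servers one after another, wrapping around at the end."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._cycle = itertools.cycle(list(servers))

    def next_server(self):
        return next(self._cycle)


balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
assignments = [balancer.next_server() for _ in range(6)]
print(assignments)
# prints: ['app-1', 'app-2', 'app-3', 'app-1', 'app-2', 'app-3']
```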


Database Partitioning

  • Partitioning of relational data usually refers to decomposing your tables either row-wise (horizontally) or column-wise (vertically).
  • Physical partitioning is called sharding. You split your data across multiple servers, and based on your shard key, your client connects to the DB server that holds the matching data.
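
The shard-key routing described above can be sketched as a hash of the key modulo the number of shards. This is one simple scheme among several; the shard names and the choice of SHA-256 are assumptions for illustration.

```python
import hashlib

SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2", "db-shard-3"]


def shard_for(shard_key, shards=SHARDS):
    """Map a shard key to one DB server using a stable hash."""
    # hashlib is used instead of the built-in hash(), which is randomized
    # per process for strings and would route the same key differently
    # after a restart.
    digest = hashlib.sha256(shard_key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


# The same key always lands on the same shard.
print(shard_for("user-42") == shard_for("user-42"))  # True
```

With plain modulo hashing, changing the number of shards remaps most keys, so data has to be moved. Consistent hashing is a common alternative to bring up if the interviewer asks how you would add shards later.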

This is part 1 of the series. Stay tuned for part 2.

Additional Read: Understanding SOLID Programming Principles - A Guide to Writing Maintainable Code.