Knowledge Guide

Designing Reminder Alert System (Hard)


A Notification System for Reminders is a platform that allows users to schedule alerts and delivers them through various channels (email, SMS, push notifications, or in-app) when due. The system acts as a central hub to manage scheduled events and dispatch notifications to users via their preferred methods. A real-world analogy is the reminder feature in Slack or calendar applications; for instance, Slack relies on scheduled jobs to ensure messages reach users at the precise moment. Here, we generalize that concept to a web-scale service capable of handling millions of concurrent users.

Key Terminology

Before designing the solution, let’s clarify the key requirements.

Functional Requirements

Non-Functional Requirements

Before designing the system, it’s important to gauge the scale of data and traffic we need to handle. Here are some rough estimations for a “web-scale” scenario:

In summary, this system will handle on the order of 10^7–10^8 reminders and notifications per day, require high throughput processing (thousands of events per second), and manage tens of gigabytes of reminder data. These numbers drive us toward a distributed, partitioned architecture with a focus on time-based data access and parallel processing of jobs.
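As a rough sanity check on these figures, the daily volume can be converted into an average per-second rate. The sketch below uses only the 10^7 to 10^8 range from the summary; the peak factor is an assumed value for illustration, not a number from the estimates. Reminders tend to cluster on round times such as the top of the hour, so peaks can sit well above the average.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rate_per_second(events_per_day: int) -> float:
    """Average events per second for a given daily volume."""
    return events_per_day / SECONDS_PER_DAY


# Daily volumes taken from the summary above (10^7 to 10^8 per day).
low = average_rate_per_second(10**7)
high = average_rate_per_second(10**8)

# ASSUMPTION: peak traffic is 10x the average. This factor is illustrative only.
PEAK_FACTOR = 10

print(f"average: {low:,.0f} to {high:,.0f} events/s")
print(f"assumed peak: {low * PEAK_FACTOR:,.0f} to {high * PEAK_FACTOR:,.0f} events/s")
```

Under that assumed factor, the upper end of the range lands in the thousands to low tens of thousands of events per second, which is consistent with the "thousands of events per second" target.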

At a high level, we will adopt a distributed microservices architecture with distinct components for managing reminders, scheduling jobs, and delivering notifications. The design will separate the concerns of reminder storage & scheduling from notification delivery, using asynchronous queues to decouple the timing of reminders from the act of sending messages. Major components and their interactions are outlined below:

High Level Design of Reminder Alert System

This high-level design emphasizes separation of concerns: the when (scheduling) is handled independently from the how (delivery), which improves scalability and maintainability. Next, we’ll drill down into each part, including data models and algorithms to meet the requirements.
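To make the when/how split concrete, here is a minimal in-process sketch. It assumes one queue per channel and uses Python's standard `queue` module as a stand-in for a real message broker; the names `NotificationEvent`, `enqueue_due_reminder`, and `drain_channel` are hypothetical and not part of the design above.

```python
import queue
from dataclasses import dataclass


@dataclass(frozen=True)
class NotificationEvent:
    reminder_id: str
    user_id: str
    channel: str
    message: str


# ASSUMPTION: one queue per delivery channel; a real deployment would use a broker.
channel_queues: dict[str, queue.Queue] = {
    "email": queue.Queue(),
    "push": queue.Queue(),
}


def enqueue_due_reminder(reminder_id: str, user_id: str, message: str, channels: list[str]) -> None:
    """Scheduler side (the 'when'): fan out one event per channel, then return."""
    for channel in channels:
        channel_queues[channel].put(NotificationEvent(reminder_id, user_id, channel, message))


def drain_channel(channel: str, send) -> int:
    """Channel-processor side (the 'how'): deliver whatever is currently queued."""
    delivered = 0
    q = channel_queues[channel]
    while True:
        try:
            event = q.get_nowait()
        except queue.Empty:
            return delivered
        send(event)
        delivered += 1


# Usage: the scheduler enqueues without knowing how or when delivery happens.
sent = []
enqueue_due_reminder("rem12345", "user6789", "Meeting in 5 minutes", ["email", "push"])
emails_delivered = drain_channel("email", sent.append)
```

Because the scheduler only writes to queues, a slow or failing channel does not block the scheduling loop; the push event in this example simply waits until a push processor drains it.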

Designing the right data schema is critical for performance. We need to support two primary access patterns efficiently:

  1. Lookup reminders by next execution time (for the Scheduler to find what’s due now).
  2. Lookup reminders by user (for user queries and for computing next occurrences, etc.).

To achieve this, we will use a denormalized approach with two main tables/collections:

Choice of Database: Given the volume and access patterns, a NoSQL datastore like Cassandra is well-suited. It handles high write throughput (inserts and updates on the schedule table), and its wide-column model supports the composite key approach we outlined. Cassandra distributes data by the partition key (which we set to time bucket plus segment) and replicates it for fault tolerance.

Amazon DynamoDB is another option with a similar partition key concept (we’d use a composite key that includes the time bucket). DynamoDB also supports TTL on items, which could auto-expire past entries if we used it for one-time schedules; for recurring reminders, we instead keep rewriting the entry with its next execution time.

If we prefer SQL for simplicity, we’d likely partition the schedule table by date (e.g., separate physical partitions or tables for each day or hour) and index the time column. However, the relational approach might become a bottleneck beyond a certain scale or require careful query planning: a single indexed table with millions of rows might sustain thousands of lookups per second, but tens of thousands would be difficult. Therefore, we lean towards distributed NoSQL for the scheduler data.

For the Reminders definition table, a relational store could work, since user-based lookups are not extremely high volume, but using the same store for both simplifies our stack. Cassandra can store reminders partitioned by user, as described, and this stays efficient as long as no single user has an enormous number of entries (at most 500 in our case, which is trivial within one partition).
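The two access patterns and the time-plus-segment partition key can be sketched with plain dictionaries standing in for the two tables. Several details here are assumptions rather than part of the design: the time bucket is encoded as minutes since the Unix epoch, there are 16 segments, and the segment is derived from a CRC32 hash of the reminder ID. The source does not specify any of these.

```python
import zlib
from collections import defaultdict
from datetime import datetime, timezone

NUM_SEGMENTS = 16  # ASSUMPTION: segment count is illustrative


def schedule_partition_key(next_execution_time: datetime, reminder_id: str) -> tuple[int, int]:
    """Return (minute bucket, segment) for the schedule table."""
    if next_execution_time.tzinfo is None:
        raise ValueError("next_execution_time must be timezone-aware")
    # ASSUMPTION: bucket = whole minutes since the Unix epoch.
    minute_bucket = int(next_execution_time.timestamp()) // 60
    # A stable hash spreads one busy minute across several partitions.
    segment = zlib.crc32(reminder_id.encode("utf-8")) % NUM_SEGMENTS
    return minute_bucket, segment


# Stand-ins for the two denormalized tables.
reminders_by_user = defaultdict(dict)   # user_id -> {reminder_id: record}
schedule_by_bucket = defaultdict(dict)  # (minute, segment) -> {reminder_id: record}


def create_reminder(user_id: str, reminder_id: str, message: str, next_execution_time: datetime):
    """Write the same record to both tables so each access pattern is one lookup."""
    record = {
        "reminder_id": reminder_id,
        "user_id": user_id,
        "message": message,
        "next_execution_time": next_execution_time,
    }
    key = schedule_partition_key(next_execution_time, reminder_id)
    reminders_by_user[user_id][reminder_id] = record
    schedule_by_bucket[key][reminder_id] = record
    return key


key = create_reminder(
    "user6789", "rem12345", "Meeting in 5 minutes",
    datetime(2025, 6, 1, 9, 0, tzinfo=timezone.utc),
)
```

The trade-off of this denormalization is that every create, update, or delete has to touch both tables, which is why the write path needs to tolerate partial failures.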

Example: A sample entry in the Schedule table might look like:

Partition (ExecutionMinute=17084040, Segment=3): { ReminderID: "rem12345", UserID: "user6789", Message: "Meeting in 5 minutes", Channels: ["email","push"], NextExecutionTime: 2025-06-01T09:00:00Z, TimeZone: "America/Los_Angeles" }

This indicates that reminder rem12345 for user user6789 is due at 2025-06-01 09:00:00 UTC. When the scheduler queries the partition for that minute, segment 3, it finds this entry and sends “Meeting in 5 minutes” via email and push. After sending, it computes the next occurrence (for a daily reminder, NextExecutionTime becomes 2025-06-02T09:00:00Z) and inserts a new entry into the partition for that minute, possibly in a different segment.
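The flow just described (read a due partition, send, compute the next occurrence, re-insert) can be sketched as follows. This is an illustration under stated assumptions, not the scheduler's actual implementation: the schedule is an in-memory dict, the bucket is minutes since the Unix epoch, the segment comes from a CRC32 hash of the reminder ID (so it stays the same across occurrences in this sketch), and the daily recurrence is computed as plus one day in UTC, ignoring the local-time handling covered in section 6.3.

```python
import zlib
from datetime import datetime, timedelta, timezone

NUM_SEGMENTS = 16  # ASSUMPTION: illustrative segment count


def partition_for(when: datetime, reminder_id: str) -> tuple[int, int]:
    """(epoch minute, segment) key; both encodings are assumptions."""
    return int(when.timestamp()) // 60, zlib.crc32(reminder_id.encode("utf-8")) % NUM_SEGMENTS


def process_partition(schedule: dict, key: tuple[int, int], enqueue) -> int:
    """Send everything due in one partition, then reschedule each entry."""
    due = schedule.pop(key, {})
    for reminder_id, entry in due.items():
        for channel in entry["channels"]:
            enqueue(channel, entry)
        # ASSUMPTION: daily recurrence as +1 day in UTC, for brevity only.
        next_time = entry["next_execution_time"] + timedelta(days=1)
        rescheduled = {**entry, "next_execution_time": next_time}
        schedule.setdefault(partition_for(next_time, reminder_id), {})[reminder_id] = rescheduled
    return len(due)


# Usage with the sample entry from the example above.
first_run = datetime(2025, 6, 1, 9, 0, tzinfo=timezone.utc)
entry = {
    "user_id": "user6789",
    "message": "Meeting in 5 minutes",
    "channels": ["email", "push"],
    "next_execution_time": first_run,
}
key = partition_for(first_run, "rem12345")
schedule = {key: {"rem12345": entry}}
outbox = []
processed = process_partition(schedule, key, lambda channel, e: outbox.append((channel, e["message"])))
next_key = partition_for(first_run + timedelta(days=1), "rem12345")
```

After the call, the original partition is empty and the reminder sits in the partition 1,440 minutes later, ready for the next day's run.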

6.1 Scheduling Algorithm and Reminder Execution Pipeline

The Scheduler Service is responsible for executing reminders on schedule. It’s essentially a distributed cron engine tailored to our data model. Here’s how it functions in detail:

6.2 Notification Delivery per Channel

Once the Scheduler has enqueued the notification events, the Channel Processors take over to actually deliver messages. We design each channel service to be scalable and reliable:

6.3 Time Zone and Daylight Savings Handling

Handling time zones correctly is crucial for a reminder system – it’s what makes “recurring at 9:00 AM local time” possible. Here’s how we address it:

In summary, by storing timezone info and using proper conversions, we ensure scheduled times align with user expectations. The system always works internally in absolute time (UTC) for coordination, but all calculations of next times go through the lens of the user’s local calendar.
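A minimal sketch of that conversion for a daily reminder, using Python's standard `zoneinfo` module. The function name `next_daily_occurrence` is hypothetical, and the sketch does not treat wall-clock times that are skipped or repeated during a DST transition in any special way.

```python
from datetime import datetime, time, timedelta, timezone
from zoneinfo import ZoneInfo


def next_daily_occurrence(after_utc: datetime, local_time: time, tz_name: str) -> datetime:
    """Return the next UTC instant at which the wall clock in tz_name reads local_time."""
    tz = ZoneInfo(tz_name)
    local_date = after_utc.astimezone(tz).date()
    # Build the candidate on the user's local calendar, then convert to UTC.
    candidate = datetime.combine(local_date, local_time, tzinfo=tz).astimezone(timezone.utc)
    if candidate <= after_utc:
        next_day = local_date + timedelta(days=1)
        candidate = datetime.combine(next_day, local_time, tzinfo=tz).astimezone(timezone.utc)
    return candidate


# 9:00 AM in Los Angeles across the 2025 spring-forward transition (March 9).
nine_am = time(9, 0)
before_dst = next_daily_occurrence(
    datetime(2025, 3, 7, 17, 0, tzinfo=timezone.utc), nine_am, "America/Los_Angeles"
)
across_dst = next_daily_occurrence(before_dst, nine_am, "America/Los_Angeles")
```

Here `before_dst` falls at 17:00 UTC (PST, UTC-8) and `across_dst` at 16:00 UTC the next day (PDT, UTC-7): the gap between the two runs is 23 hours, while the user still sees 9:00 AM on both days. Adding a fixed 24 hours in UTC would have drifted the reminder to 10:00 AM local time.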

6.4 Failure Handling and Reliability

A robust system must anticipate various failure scenarios and mitigate them:

By planning for these failure scenarios, the system will gracefully handle errors: no single failure will completely stop reminders; at worst some might be delayed, but they should eventually go out. Our design aims for high reliability, borrowing techniques used in large-scale systems (redundancy, idempotent processing, partitioning to limit blast radius, and thorough monitoring).
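Idempotent processing, one of the techniques named above, can be sketched as a deduplication check keyed on the reminder, its execution time, and the channel. The class name and the in-memory set are assumptions for illustration; a real deployment would need a shared store with expiry so that all workers see the same delivery records.

```python
class IdempotentSender:
    """Skips events that were already delivered, so redelivered messages are harmless."""

    def __init__(self, send):
        self._send = send
        # ASSUMPTION: in-memory set; a real system would use a shared store with expiry.
        self._delivered = set()

    def handle(self, reminder_id: str, execution_time: str, channel: str, payload: str) -> bool:
        """Deliver once per (reminder, occurrence, channel); return True if sent."""
        key = (reminder_id, execution_time, channel)
        if key in self._delivered:
            return False  # duplicate from a retry or queue redelivery
        self._send(channel, payload)
        # Recorded only after a successful send: a crash between these two lines
        # can still cause one duplicate, which is the at-least-once trade-off.
        self._delivered.add(key)
        return True


# Usage: the same event arrives twice, but only one message goes out.
outbox = []
sender = IdempotentSender(lambda channel, payload: outbox.append((channel, payload)))
first = sender.handle("rem12345", "2025-06-01T09:00:00Z", "email", "Meeting in 5 minutes")
second = sender.handle("rem12345", "2025-06-01T09:00:00Z", "email", "Meeting in 5 minutes")
```

Including the execution time in the key matters for recurring reminders: tomorrow's occurrence of the same reminder must not be mistaken for a duplicate of today's.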

To achieve web-scale performance, we incorporate several strategies and optimizations:

In conclusion, through sharding, horizontal scaling, asynchronous pipelines, caching, and careful data modeling, this design can handle the demanding throughput of a web-scale recurring reminder system. It draws on best practices from large notification systems – decoupling via queues, ensuring partitions for parallelism, and reliable delivery with at-least-once guarantees. This should meet the requirements of delivering millions of timely reminders daily, across the world, within seconds of their scheduled times.
