Knowledge Guide
HomeSystem DesignSystem Design Problems

medium Designing Unique ID Generator

Image
Image

A Globally Unique Identifier (GUID) Generation System is a service that produces unique identifiers at scale for use across distributed systems. Its purpose is to generate IDs that never collide (no duplicates) even when issued from multiple data centers or services concurrently. These IDs can serve as database primary keys, event/message IDs, object handles, or any context where a distinct identifier is needed for each entity. The system supports two ID formats: opaque IDs and ordered IDs. Opaque IDs are essentially random or pseudo-random identifiers (like UUIDs) with no intrinsic ordering or meaning, useful for security or purely unique labeling. Ordered IDs are time-based or monotonic IDs that encode time or sequence information so that newer IDs are numerically or lexicographically larger than older ones, which is useful for sorting and roughly reflects creation order. “Global uniqueness” means any ID generated on any node or at any time will be different from all others, and web-scale implies the system can handle very high request rates (hundreds of thousands per second) and operate across many regions and services. Here are a few Key terms:

Use Cases: This system would be used whenever multiple distributed components need to create unique references without a central bottleneck. For example, web servers generating database keys for new records, microservices tagging events with IDs for tracing, or IoT devices labeling sensor readings – all without risk of collision. By providing both opaque IDs (for unpredictable, secure identifiers) and time-ordered IDs (for sorted, timeline-friendly keys), the system covers a wide range of needs.

Functional Requirements:

Non-Functional Requirements:

Before diving into design, let’s estimate the load and storage implications:

Architecture Overview: We will build a dedicated ID Generation Service that runs in multiple regions and multiple instances for scalability. Here are the main components and their interactions in our design:

Request Flow:

  1. Client Request: A client needing a new ID makes a request to the ID generation service’s API (for example, an HTTP GET /generate_id?type=time or type=opaque). This request goes to the load balancer or API gateway.

  2. Routing: The load balancer forwards the request to one of the available ID generator nodes in the nearest region (using health checks and possibly considering client location). For global clients, DNS might resolve to a local region’s endpoint. For a truly global service, we might also allow cross-region calls if one region is overloaded (though normally each region can handle its local traffic).

  3. ID Generation: On the chosen generator node, the service code checks the request type:

    • If time-sortable ID is requested, the node will use the Snowflake-like algorithm: it reads the current timestamp in milliseconds, and combines it with its configured region/machine ID and an internal sequence counter to generate the next ID. (Detailed algorithm in next section.)
    • If opaque ID is requested, the node will use a random ID generator: e.g. call a secure random number generator to produce 128 bits, or use a library call to generate a UUIDv4. This does not require any shared state (each call is independent). In either case, the computation is local and very fast – just a couple of arithmetic and bit operations or random bytes generation. There is no additional network call on this critical path (the service node doesn’t need to ask any other node to get an ID in our design).
  4. Response: The service node returns the generated ID to the client, typically as a numeric string (for time-sortable) or a UUID-formatted string for opaque. For example, it might return {"id": 1541757210912000000} or {"id": "c2f3a8d0-7e5b-4efb-b1a2-5d91c1e315b8"} depending on type. The client receives this response, and then uses the ID (e.g. as a key in their database insert). The whole round-trip is quick; even with network overhead, it should be well under the 200 ms SLA – often a few ms within the same region.

  5. Failure Handling: If the request fails (e.g. the chosen node was unresponsive), the client or gateway can retry, potentially hitting a different node. The ID service is idempotent in the sense that a retried request will just produce a new ID (there’s no harm in unused IDs – an ID that was generated but not used is just lost, with no conflict). Therefore, retries are safe. Clients could even preemptively request a batch of IDs to have spares on hand in case of transient issues (this is an optional client-side optimization).

Multi-Region Uniqueness: We ensure uniqueness across regions by assigning a distinct region identifier to each region’s cluster of ID nodes. This region ID forms part of the generated ID (for time-sortable IDs) or influences the ID (for opaque, we might not include it explicitly, but we can simply rely on randomness being globally unique). For the Snowflake IDs, for example, we use some of the machine-id bits to encode the region. For instance, a 10-bit machine ID could be split into a 5-bit datacenter ID and 5-bit host ID. Thus, Region A might be datacenter 5 and Region B datacenter 6, etc. This means even if two machines in different regions generate an ID at the exact same millisecond with the same sequence number, the datacenter portion will differ, making the 64-bit IDs distinct. The coordinator service can manage this by pre-allocating a range of machine IDs to each region (e.g. region 1 gets IDs 0–31 for its nodes, region 2 gets 32–63, etc., if using 5-bit region). Alternatively, the region ID can be explicitly one field in the ID. In any case, no coordination between regions is needed per request – the design inherently prevents overlap.

Example Flow: Suppose Service X in the US-East region needs a new ID for a user profile creation. It calls the ID API, the US-East load balancer routes to an ID node (with, say, machine ID = region 1, node 3). The node’s Snowflake algorithm takes the current timestamp (e.g. 2025-05-21 18:30:00.123 UTC -> a 41-bit value relative to epoch), combines it with its datacenter=1 and worker=3 bits, and sequence (say 5 if it already made 5 IDs this millisecond), producing a 64-bit ID like 0x1E2C... (some big integer). This is returned in, say, decimal form. Meanwhile, a client in EU region might be calling nearly simultaneously; their ID node has datacenter=2, maybe sequence 1, etc., yielding a different 64-bit number. Even if the timestamp part was identical, the datacenter bits differ so the IDs are unique. Later, when these IDs are stored in a database, sorting them will put the US-East one before or after the EU one depending on timestamp (if timestamps were truly equal to the ms, their relative order in numeric comparison will be determined by datacenter bits – so not strictly by true time, but such ties are extremely rare and only within the same millisecond window).

In summary, the high-level design uses distributed ID generator nodes that operate mostly independently, coordinated only by a lightweight service to assign unique node IDs. Clients interact with it through a simple network API. This design is stateless in the request/response path, enabling easy load balancing and horizontal scaling.

Let’s dive deeper into how the ID generation works, and the rationale behind chosen algorithms. We will compare a few strategies and then describe our chosen solution in detail.

5.1 Comparing ID Generation Approaches

  1. UUID (Random 128-bit IDs): Generating a UUID (v4) is straightforward – each ID is 16 random bytes. The advantage is no coordination or global state needed: any node can generate IDs independently and the probability of collision is astronomically low. UUIDs are widely used in distributed systems for this reason. They are also opaque (no sequence or time info can be inferred). However, they are 128 bits long, which is overkill for many uses and can impact storage or indexing (twice the size of a 64-bit number). They also don’t provide ordering: one cannot sort by UUID and get chronological order (unless using newer versions like UUIDv7 which include time, but v4 is fully random). In contexts where order or smaller size matters, UUID v4 is less ideal.

  2. Database Auto-Increment / Central Counter: One simple approach is to have a single, centralized counter (for example, a row in a SQL database that we UPDATE to increment and return the new value for each ID). This guarantees uniqueness and natural ordering. Some systems have used this (e.g. using a dedicated “ID server” or a DB table). The huge downside is scalability: every ID request hits the same database or primary node, creating a bottleneck. At 100k IDs/sec, a single DB sequence likely cannot keep up (due to transaction log and lock contention). Even if it could, it becomes a single point of failure – if that DB is down, no IDs can be generated. This approach also incurs high latency as each request is a DB write. It fails the high-throughput and HA requirements unless heavily optimized (caching sequences, etc., essentially evolving into the segment approach below).

  3. Segment (Batch) Allocation via Database: This is a compromise approach: use a database or persistent store to allocate chunks or segments of IDs to each generator node. For example, a node might ask the DB: “give me the next block of 10,000 IDs”. The DB atomically increases its counter by 10,000 and returns a range (say 1,000,000–1,009,999). The node can then serve IDs 1,000,000 upward from memory without further DB calls until it exhausts that block, at which point it requests a new block. This drastically reduces the DB contention (only one query per 10k IDs instead of per ID) and thus can scale much better – effectively the throughput is limited by how fast nodes can consume blocks and occasionally hit the DB. It also improves latency for most requests (since serving from memory is fast). However, the system still relies on the availability of that central DB for replenishing ID ranges. If the DB goes down or the network partition occurs when blocks need renewal, the service will eventually stall once all nodes use up their current ranges. Also, assigning fixed segments per node introduces waste: if a node crashes with half its segment unused, those IDs might never be issued (to avoid duplicates, we wouldn’t reassign that half segment). This waste is usually acceptable, though, in exchange for simplicity and uniqueness. The segment approach offers ordered IDs (if a single global counter is used, all IDs are increasing globally), but if multiple nodes in different regions get segments, those segments could be interleaved in time (e.g., node A’s segment covers IDs 1–10000 and node B has 10001–20000, but node B might generate some of its later IDs before node A finishes its range, slightly mixing the order of actual creation). Without careful orchestration, time ordering isn’t strict globally, though each node generates in order within its segment.

  4. Distributed GUID Services (Snowflake and its variants): Twitter’s Snowflake algorithm is a well-known solution designed for high scale. It avoids any per-ID central bottleneck by embedding uniqueness factors into the ID itself. In Snowflake, a 64-bit ID is composed of: a timestamp, a machine identifier, and a sequence number. Each generator node can independently create IDs as long as it has a unique machine ID and the clocks are in sync. Twitter’s original implementation used 41 bits for time, 10 bits for machine (often split into 5-bit datacenter + 5-bit node), and 12 bits for sequence. This yields:

    • 2^41 ≈ 2.2 trillion timestamp values (enough for 69 years at millisecond precision).
    • 2^10 = 1024 unique machine IDs (up to 1024 nodes generating in parallel).
    • 2^12 = 4096 IDs per node per millisecond.

    This design can generate IDs extremely fast. One Snowflake node can output 4096 IDs in a single millisecond before it has to wait for the next millisecond. That’s up to ~4 million IDs/second on one machine theoretically, far above our requirements. In practice, Twitter and others rarely hit that limit, but it gives headroom. Snowflake IDs are time-sortable (mostly – if two different nodes generate IDs, a slightly slower clock could cause some slight disorder globally, but within one node it’s strictly ordered by time and sequence). The downside is that Snowflake requires coordination to assign machine IDs uniquely and needs all nodes to have reasonably synchronized clocks. There’s also a predictability issue: since the ID increases with time and sequence, someone observing IDs might infer approximate timestamps or system load (this is a security consideration; Snowflake IDs are not cryptographically random). Twitter’s implementation used ZooKeeper to manage machine ID assignment and to handle clock issues (e.g., if a clock went backward, Snowflake could halt ID generation until recovery to avoid duplicates). Despite complexity, Snowflake is highly scalable and has no single point of failure in the ID generation path – nodes can work independently. Many companies (Instagram, Discord, etc.) have adopted similar schemes, sometimes tweaking the bit allocations for their needs.

  5. Other Approaches: There are other ID schemes like MongoDB’s ObjectID (a 96-bit ID with time, machine ID, and counter), or CUID/NanoID (client-generated, highly random IDs often as strings), and upcoming UUIDv7 (128-bit time-ordered UUID). MongoDB’s ObjectIDs are interesting – they include a 4-byte timestamp (to seconds precision), a 5-byte random host identifier, and 3-byte counter. They are larger than 64 bits and only roughly ordered (seconds, not milliseconds). CUID and NanoID are more for low-collision at modest scale (often used for front-end or offline generation), but they prioritize uniqueness and some randomness, potentially with bigger size strings. For our scale, these aren’t as commonly used in backend distributed systems as Snowflake or UUID due to either length or throughput considerations.

Why Snowflake + UUID (hybrid) for our design? Based on the above, we choose a hybrid approach to meet all requirements:

Importantly, both methods can be provided by the same service – for example, our ID generator node can implement two code paths: one for Snowflake IDs and one for UUIDs. This way, we maintain one system but support both formats. Many systems actually do use a combination depending on context (e.g., use time-based IDs internally but expose opaque IDs to external clients for security).

5.2 Snowflake ID Algorithm Design

For the time-sortable ID generation, we’ll follow the Snowflake format with some customization:

Snowflake 64 bits ID
Snowflake 64 bits ID

5.3 Opaque ID (UUID) Generation Design

For the opaque IDs, our design is simpler:

5.4 Data Storage and State Management

Our chosen design deliberately minimizes persistent storage:

5.5 Failure Recovery Details

In conclusion, our detailed design chooses the Snowflake algorithm to fulfill the time-sortable, high-throughput ID needs, and a UUID-like random approach for opaque IDs. This covers both functional requirements. By carefully managing the bit schema, machine ID assignment, and clock behavior, we ensure global uniqueness and reliability. The design is battle-tested by industry (Twitter’s Snowflake and its derivatives) which underscores its viability. Meanwhile, using UUIDs for opaque IDs leverages a well-understood standard for uniqueness.

Our system is designed to scale horizontally and remain performant under heavy load. Here are key strategies and considerations for scalability:

In summary, this design is comprehensive and robust: it can easily scale to the required throughput and beyond by adding nodes (thanks to the distributed nature of ID generation) and meets the <200ms latency goal by keeping operations local and lightweight. We avoid single failure points by eliminating central databases in the hot path and using redundancy for the coordinator and nodes. The result is a globally unique ID generation system that provides two flavors of IDs while maintaining consistency, performance, and reliability at scale. Each component’s role is well-defined, and the overall system follows proven patterns used by high-scale platforms like Twitter and others, giving us confidence in its viability for our requirements.

🤖 Don't fully get this? Learn it with Claude

Stuck on Designing Unique ID Generator? Open Claude, copy a block below, and it'll teach you this exact concept — visually and interactively.

🪜 Hint ladder (no spoilers)

Progressively stronger hints — you still solve it.

I'm working on the problem **Designing Unique ID Generator** (System Design). Give me a HINT LADDER: start with the tiniest nudge, then wait. Only reveal the next, stronger hint when I ask. Do NOT show the full solution unless I type 'show solution'. Keep me doing the thinking. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
🎨 Explain the approach visually

See the technique, not just code.

Explain the optimal approach to **Designing Unique ID Generator** with a VISUAL walkthrough: trace it on a small concrete example using ASCII art / a step-by-step diagram, narrate what changes each step, then give time & space complexity with a one-line derivation. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
🔍 Review my solution

Catch bugs, edge cases, sub-optimality.

I'll paste my solution to **Designing Unique ID Generator**. Review it for correctness, missed edge cases, and time/space complexity, then coach me toward the optimal — don't just rewrite it. Ask me to paste my code now. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
🔁 Drill the pattern

Lock in recognition with look-alikes.

Give me 2 problems that use the SAME underlying pattern as **Designing Unique ID Generator**. For each, let me attempt first, then review my answer and name the trigger signal that reveals the pattern. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.

📝 My notes