What is a Snowflake ID?

A Snowflake ID is a 64-bit integer composed of a sign bit, a timestamp in milliseconds, a machine identifier, and a per-millisecond sequence number. Twitter's original layout used 41 bits for the timestamp (about 69 years), 10 bits for the machine ID (1,024 nodes), and 12 bits for the sequence (4,096 IDs per millisecond per node). The composition makes IDs globally unique without coordination on the hot path and roughly time-ordered, because the timestamp occupies the high bits.

Why not just use UUIDs?

Random UUIDv4 is convenient and needs no coordination, but it produces 128-bit identifiers with no ordering, which makes them larger to store and bad for clustered-index databases because each insert lands in a random page. Snowflake-style 64-bit IDs are smaller, sort-friendly for time-range scans, and a strong index choice. UUIDv7, which embeds a millisecond timestamp in the high bits of a 128-bit UUID, is a modern compromise when you accept the larger size.

How does Snowflake stay decentralised?

Each node is assigned a unique machine ID exactly once, at startup, through a coordination service like ZooKeeper or etcd. After that, ID generation is pure local arithmetic - read the clock, combine with the machine ID and an in-process sequence counter, return the integer. There is no network round trip on the generation path, which is what lets a node mint millions of IDs per second.

What happens when the system clock goes backwards?

Wall clocks can step backwards because of NTP correction, leap seconds, or operator changes, and a backwards step could let the generator produce IDs that collide with already-issued ones. The standard defences are to read time from a monotonic clock (which never decreases) rather than the wall clock, to refuse to mint when the observed time is less than the last seen timestamp until it catches up, and to configure NTP to slew time gradually rather than step it.

What happens when more than 4,096 IDs are requested in the same millisecond?

The 12-bit sequence in the original layout caps each node at 4,096 IDs per millisecond. When that bucket fills, the generator blocks until the next millisecond boundary, at which point the sequence resets. If a workload sustains higher per-node throughput, the right answer is to widen the sequence bits at the cost of fewer machine IDs - or to add more nodes, since aggregate throughput scales linearly with node count.

Are Snowflake IDs globally monotonically increasing?

No, they are k-sortable, not strictly monotonic. Within one node, IDs increase strictly, but two nodes minting at the same millisecond can produce IDs whose order does not match true wall-clock order, because their machine ID bits differ. K-sortability is enough for almost every practical use - time-range queries, paginated history - without paying the coordination cost a true global counter would require.

Design a Unique ID Generator: System Design Interview 2026

A unique ID generator is the smallest service in this series and one of the most consequential. Almost every other system here implicitly assumed it existed - the orders, payments, messages, jobs, and shortened URLs all need identifiers that are unique, fast to mint, and ideally sortable. The challenge is that doing this at platform scale rules out the obvious answer: a single database sequence becomes a contention point, a centralised counter becomes a network hop, and random UUIDs solve uniqueness but trade away the ordering that makes indexes efficient. Snowflake-style IDs answer all three constraints by encoding uniqueness in the ID structure itself.

This walkthrough assumes the 6-step system design framework and applies it at senior depth. It is Part 13 of an extended system design series, opening Tier 5.

The Problem
Step 1 - Clarify Requirements
Step 2 - Estimate Scale
Step 3 - API and Data Model
Step 4 - High-Level Design
Step 5 - Deep Dive: The Snowflake Layout, Decentralisation, and Clock Handling
Step 6 - Bottlenecks and Trade-offs
Reference Architecture
Common Mistakes in the Interview
Quick Reference
Related Articles

The Problem

We are designing a service - really a small library - that mints globally unique 64-bit identifiers, used everywhere across a large platform. The canonical example is Twitter's Snowflake, with close relatives at Instagram, Discord, and (in 128-bit form) UUIDv7.

The senior framing is that uniqueness is the only hard requirement; everything interesting in the design follows from refusing to pay a coordination round-trip per ID. The naive answers - a database sequence, a single counter service, a random UUID - each trade away exactly one of the three properties you usually want: scalability, ordering, or compactness. Snowflake declines that trade by packing the answer into the ID itself.

Step 1 - Clarify Requirements

Functional requirements:

Generate globally unique IDs at very high throughput.
IDs should be roughly time-ordered (k-sortable), so they make good clustered-index keys and time-range scans are cheap.
Compact - 64 bits, to fit a BIGINT column.

Out of scope (name, then defer): human-friendly short codes (see Part 1 for that), and security-sensitive non-enumerability (Snowflake IDs are somewhat enumerable; if that is a problem, encrypt or wrap them).

Non-functional requirements:

High per-node throughput - millions of IDs/sec.
Microsecond-level latency - no network on the hot path.
No hot-path coordination - every node mints locally.
Tolerate clock drift, NTP corrections, and leap seconds without producing duplicates.
Tolerate node restart without losing uniqueness.

The clarifying question: how strict is the ordering? Strict global monotonicity demands a single sequencer and serialises the entire platform on it. Almost no real workload needs that. K-sortability - roughly time-ordered, within a millisecond out of order between nodes - is sufficient for indexing and time-range queries and is what Snowflake delivers cheaply. State this trade explicitly.

Step 2 - Estimate Scale

Unlike the rest of the series, the storage and bandwidth numbers here are trivial; the meaningful figure is per-node throughput.

Throughput. Assume a platform that mints ~10 million IDs/sec in aggregate. With the canonical 12-bit sequence (4,096 IDs/ms/node), a single node sustains 4,096 x 1,000 = ~4 million IDs/sec. So three or four ID-generating nodes cover the entire platform - and most production deployments embed the generator as a library inside every service, so there is no separate fleet at all.

Storage. None. IDs are computed from a clock, a machine ID, and an in-process counter; nothing is persisted per ID.

Bit budget. A 64-bit signed integer leaves 63 usable bits. Spending 41 on a millisecond timestamp covers 2^41 / (1000 x 86400 x 365) ≈ ~69 years from a chosen custom epoch - longer than most services live. The remaining 22 bits split between machine ID and sequence, and the split is the only meaningful tuning decision.

Step 3 - API and Data Model

The "API" is a single function:

nextId() -> int64

There is no persistent data model on the request path. Each generating node holds three tiny pieces of state in memory:

Field	Purpose
`machineId`	A unique-per-node integer, claimed once at startup
`lastTimestampMs`	The most recent millisecond observed
`sequence`	The per-millisecond counter, reset each new millisecond

The machine ID itself is the only thing the system needs a coordination service for - and only at startup, never on the generation path.

Step 4 - High-Level Design

flowchart TD
    Boot[Node startup] --> Claim{Claim machine ID<br/>via coordination service}
    Claim -->|got id N| Cache[Cache machine ID in process]
    Cache --> Lib[Snowflake library<br/>nextId is pure local arithmetic]
    Svc1([Service A]) --> Lib1[lib]
    Svc2([Service B]) --> Lib2[lib]
    Lib1 -.no network.-> Out1[ID]
    Lib2 -.no network.-> Out2[ID]
    Coord[(Coordination Service<br/>etcd / ZooKeeper)]
    Claim -.consults.-> Coord

Figure 1. The architecture is shaped by what is NOT on the hot path. Coordination happens once at startup - the node claims a machine ID via the coordination service - and from then on nextId() is pure local arithmetic with no network. This is what makes a few hundred nodes able to mint tens of millions of IDs per second across the platform.

The coordination service is on the startup path only. Once a node has its machine ID, generation is in-process arithmetic against the monotonic clock and a sequence counter - no network, no shared state, no lock.

Step 5 - Deep Dive: The Snowflake Layout, Decentralisation, and Clock Handling

This is the core. Three ideas carry it: how the bits are laid out, why the design refuses any coordination per ID, and how clocks - the only fragile dependency - are made safe.

Part A - The bit layout

flowchart LR
    S["Bit 0<br/>sign (0)"] --- T["Bits 1-41<br/>timestamp ms (~69 yrs)"] --- M["Bits 42-51<br/>machine ID (1,024)"] --- Q["Bits 52-63<br/>sequence (4,096/ms)"]

Figure 2. The canonical 64-bit Snowflake layout. The timestamp occupies the high bits, so integer comparison roughly matches chronological order; the machine ID disambiguates nodes; the per-millisecond sequence disambiguates within a node. The 41/10/12 split is a one-way decision because every consumer decodes it - widen it before you ship, not after.

A 64-bit signed integer is partitioned into four contiguous fields, with the timestamp occupying the high bits so the integer comparison id1 < id2 mostly matches "id1 was minted earlier than id2". The canonical Twitter split:

Bits	Field	Capacity
0	Sign	always 0 - keeps IDs positive
1-41 (41 bits)	Timestamp ms since custom epoch	~69 years
42-51 (10 bits)	Machine ID	1,024 distinct nodes
52-63 (12 bits)	Per-ms sequence	4,096 IDs / ms / node

The split is the only knob worth tuning, and the trade is direct: widen machineId to scale node count, widen sequence to scale per-node throughput, widen timestamp to extend the epoch. The choice is a one-way decision - changing the layout after IDs are in the wild breaks every consumer that decoded them - so spend a minute on your numbers up front, the same way Part 1 sized its short-code length for decades.

Variants worth naming: Sonyflake reshuffles the split, Instagram-style IDs pack a shard ID and pair to a Postgres BIGINT, and UUIDv7 moves the same idea into 128 bits - a millisecond timestamp in the high bits, the remainder random - trading size for "drop-in for UUID consumers". A senior answer mentions these as the available shapes rather than treating Snowflake as the only option.

Part B - Decentralisation - the absence of a hot path

Every alternative pays a coordination cost per ID:

A database sequence serialises every minted ID on one row.
A centralised counter service adds a network round trip per ID.
A block allocator (the Part 1 approach) batches that hop but still pays it periodically.
Random UUIDv4 avoids coordination but is 128 bits and unordered.

Snowflake declines all of them by encoding uniqueness in the value itself. Two IDs cannot collide as long as their (timestamp, machineId, sequence) triples differ, and uniqueness across nodes is guaranteed structurally - the machine ID is different - without any per-ID agreement. The coordination service is on the startup path only, and a node restart amounts to reclaiming or reissuing one small lease.

This is exactly the philosophy of Part 7's ephemeral geospatial index and Part 9's precomputed top-k - do the coordination once, off the hot path, so the hot path is unconditionally fast.

Machine ID assignment

The integrity of the whole scheme rests on machine IDs being unique. Three options, in order of robustness:

Coordination service lease (recommended). At startup, the node claims an unused ID in etcd, ZooKeeper, or Consul via a lease; the lease renews while the node lives. If the node dies, the lease expires and the ID is reusable.
Static configuration. Cheap and convenient - and the canonical way to accidentally assign the same ID to two nodes during a sloppy deploy. Avoid unless the platform is tiny.
Derived from network identity (last octet of the IP, hostname suffix). Convenient but fragile - IP reuse, container rescheduling, and overlay networks can violate uniqueness silently.

A duplicate machine ID is catastrophic and silent - the two nodes generate the same (timestamp, machineId, sequence) triples and your platform gets duplicate IDs with no error. Fail fast on assignment: if a node cannot positively claim an unused ID, it must refuse to start.

Part C - Clock handling

The bit layout assumes the timestamp never decreases. Real clocks violate this. NTP corrections, leap seconds, virtual-machine pauses, and operator intervention can step the wall clock backwards, and a backwards step lets the generator emit an ID it has emitted before.

flowchart TD
    Call(["nextId() called"]) --> Now[now = monotonic clock ms]
    Now --> Cmp{now == lastTimestampMs?}
    Cmp -->|yes| Inc[sequence = sequence + 1]
    Inc --> Full{sequence overflow?}
    Full -->|no| Pack[Pack timestamp + machineId + sequence]
    Full -->|yes| Wait[Wait until next ms]
    Wait --> Now
    Cmp -->|no, now > last| Reset[sequence = 0; lastTimestampMs = now]
    Reset --> Pack
    Cmp -->|now < last - clock regressed| Block[Block / error - never mint a stale ID]
    Pack --> Out([Return 64-bit ID])

Figure 3. The full nextId() algorithm. The same-millisecond and clock-regressed branches are what make the generator safe: same-ms increments the sequence, regression refuses to mint until the clock catches up. Without the regression branch a backwards NTP step would silently re-issue an ID combination - a duplicate with no error.

Four defences cooperate:

Read a monotonic clock, not the wall clock - the operating system's monotonic clock never moves backwards (CLOCK_MONOTONIC on Linux). The wall clock is only acceptable if it is the encoded timestamp's true source.
Refuse to mint when now < lastTimestampMs. Block until the clock catches up, or fail the call. Never quietly emit an ID with the older timestamp - that is how duplicates happen.
Configure NTP to slew, not step. Slewing nudges the clock gradually; stepping jumps it discontinuously. Production setups uniformly prefer slew.
Handle leap seconds. Many large operators (Google, AWS) smear leap seconds across a day so the clock never moves backwards. If your platform does not, your generator must.

Sequence exhaustion within a millisecond - more than 4,096 calls in one ms on one node - is handled the same way: block until the next millisecond and reset the counter. Stealing from the next millisecond is an option but rarely worth its complexity at the throughputs Snowflake supports.

The concrete scenario showing the refuse-to-mint defence in action - normal generation, then a clock that regresses:

sequenceDiagram
    participant App as Service code
    participant Lib as Snowflake library
    participant Clk as Clock
 
    App->>Lib: nextId()
    Lib->>Clk: now()
    Clk-->>Lib: t = 1000
    Note over Lib: last = 1000, seq = 0
    Lib-->>App: id (1000, machine, 0)
    App->>Lib: nextId()
    Lib->>Clk: now()
    Clk-->>Lib: t = 1000 (same ms)
    Note over Lib: same ms - increment seq
    Lib-->>App: id (1000, machine, 1)
    App->>Lib: nextId()
    Lib->>Clk: now()
    Clk-->>Lib: t = 999 (regressed!)
    Note over Lib: now < last - REFUSE TO MINT
    Lib-->>App: block until clock recovers
    Lib->>Clk: now()
    Clk-->>Lib: t = 1001
    Note over Lib: ok - reset seq, advance last
    Lib-->>App: id (1001, machine, 0)

Figure 4. The refuse-to-mint defence in action: two normal generations followed by a clock regression. The third nextId() sees now < last and blocks rather than emitting an ID with the stale timestamp; once the clock recovers, generation resumes with a fresh sequence. This is the single mechanism that stops NTP corrections from silently producing duplicates.

Without the refuse-to-mint check, the third call could re-issue an ID triple already produced - a silent duplicate.

Consistency model

Within a node, IDs are strictly monotonic by construction. Across nodes, IDs are k-sortable: any two IDs whose timestamps differ are correctly ordered by integer comparison, but two IDs minted in the same millisecond on different nodes can compare in either direction (their machine IDs decide the tie, which has no temporal meaning). K-sortability is enough for nearly every use case, and recovering true global order would require a single sequencer the whole design exists to avoid.

Failure modes

Clock regression. Handled by the monotonic-clock + refuse-to-mint discipline above. Alert on any regression event.
Sequence exhaustion. Block until the next millisecond. If it happens often, the per-node throughput exceeds the layout's budget - widen the sequence bits or scale out.
Duplicate machine ID. Catastrophic and silent - prevent it by claiming IDs through a coordination service and failing the boot if no ID is available.
Machine ID exhaustion. 10 bits = 1,024 nodes is a hard cap; if you grow past it, the layout must change before the wall is hit.
Node restart. Reclaim the previous lease if available, or claim a fresh one - either is correct provided uniqueness invariants hold.

Multi-region

Carve the machine ID space per region: dedicate the high bits of machineId to region (3 bits = 8 regions, 7 bits = 128 machines per region). Coordination services are local to a region, IDs across regions stay globally unique because machine-ID ranges are disjoint, and no cross-region hop ever happens on the generation path. This is the cleanest multi-region story in the whole series - the structure of the ID does the work.

Evolution path

Stage	Approach
Launch	A single database sequence - fine while one process can serialise inserts
Growth	Snowflake-style IDs with coordination-service-leased machine IDs
Scale	Per-region machine-ID ranges, monotonic clock everywhere, NTP slewing

Commit to the 64-bit width and the bit layout on day one - both are one-way decisions because every consumer decodes them. Defer multi-region partitioning until you are actually multi-region.

Observability

Track IDs minted per second per node, sequence exhaustion events (rare; sustained occurrence means the layout is undersized), clock-regression alerts (treat every one as an incident), machine-ID-claim failures at boot, and lease-renewal errors. A reasonable SLO is zero duplicate IDs and zero clock-regression mints, both observed by sampling and verification.

Step 6 - Bottlenecks and Trade-offs

Sequence width caps per-node throughput at 2^seq_bits IDs/ms; widen it at the cost of fewer machine IDs.
Machine-ID width caps node count at 2^id_bits; widen at the cost of sequence space.
Clock regression is the canonical failure mode and is defended by the monotonic-clock plus refuse-to-mint discipline.
Coordination at startup is the only true dependency; the service must be available when nodes boot.
K-sortability versus strict global order is a deliberate trade: strict order costs a sequencer; k-sortability is free.

Reference Architecture

The pattern this problem teaches, reusable well beyond IDs:

Encode uniqueness in the produced value itself by combining a time component, a per-node component, and a per-node sequence. Pay coordination once at startup, never on the hot path. Accept k-sortability as the price of decentralisation.

flowchart LR
    subgraph Once["Startup - coordinated"]
        C[Coordination service] -->|lease machine ID| N[Node]
    end
    subgraph Hot["Hot path - local arithmetic"]
        T[Monotonic clock] --> Pack[Pack: time + machineId + seq]
        SeqC[Sequence counter] --> Pack
        Mach[Machine ID] --> Pack
        Pack --> ID[(64-bit ID)]
    end

Figure 5. The reference architecture: coordination happens once at startup; the hot path is pure local arithmetic packing time, machine ID, and sequence into the produced value. The pattern - encode uniqueness in the value's structure so the hot path needs no agreement - recurs in distributed trace IDs, log record IDs, and the (jobId, scheduledAt) execution key of Part 12.

The same idea recurs in distributed trace IDs (timestamp plus randomness), log record IDs, in ordered-event identifiers, and in Part 12's (jobId, scheduledAt) execution-key composition. Whenever you can fold the answer into the value's structure, you remove the most expensive thing distributed systems do: agreeing.

Common Mistakes in the Interview

Reaching for UUIDv4 when ordering matters, then complaining about clustered-index fragmentation.
A centralised counter service or a database sequence as the only design at scale - the contention bottleneck the entire scheme exists to avoid.
Ignoring clock regression, then claiming uniqueness despite an NTP step backwards.
Sharing machine IDs accidentally through sloppy static configuration - silent duplicates.
Choosing bit widths without numbers - throughput and node count are quantitative decisions.
Reading the wall clock instead of a monotonic clock for the timestamp source.

Quick Reference

Topic	Key Point
Core pattern	64-bit ID = sign + timestamp ms + machine ID + per-ms sequence
Canonical split	41 + 10 + 12 bits = ~69 years, 1,024 nodes, 4,096 IDs/ms/node
Decentralisation	Coordination at startup only; generation is local arithmetic
Machine ID	Claim via coordination-service lease; fail boot if unavailable
Clock	Monotonic clock; refuse to mint when time regresses; slew NTP, never step
Sequence exhaustion	Block until the next millisecond, then reset
Ordering	K-sortable across nodes; strictly monotonic within a node
Layout choice	One-way decision - decoders depend on it; size for decades up front
Multi-region	Partition the machine-ID space by region; no cross-region hop
Failure to avoid	Duplicate machine IDs - catastrophic and silent

System Design Interview Problems: A Senior's Roadmap - the full series index and pattern library.
System Design Interview Guide: The 6-Step Framework - the method this walkthrough applies.
Design a URL Shortener - Part 1; a different ID-generation strategy (block allocation) for a different problem.
Design a Payment System - Part 11; IDs as idempotency keys at every boundary.
Design a Job Scheduler - Part 12; coordination services and leases on the startup path.
Design a Key-Value Store - Part 14; the Dynamo-style AP store this generator's IDs naturally key into.

This is Part 13 of an extended system design series, opening Tier 5. Next: Design a Key-Value Store.

Design a Unique ID Generator: System Design Interview 2026

Table of Contents

The Problem

Step 1 - Clarify Requirements

Step 2 - Estimate Scale

Step 3 - API and Data Model

Step 4 - High-Level Design

Step 5 - Deep Dive: The Snowflake Layout, Decentralisation, and Clock Handling

Part A - The bit layout

Part B - Decentralisation - the absence of a hot path

Machine ID assignment

Part C - Clock handling

Consistency model

Failure modes

Multi-region

Evolution path

Observability

Step 6 - Bottlenecks and Trade-offs

Reference Architecture

Common Mistakes in the Interview

Quick Reference

Ready to ace your interview?

Table of Contents

The Problem

Step 1 - Clarify Requirements

Step 2 - Estimate Scale

Step 3 - API and Data Model

Step 4 - High-Level Design

Step 5 - Deep Dive: The Snowflake Layout, Decentralisation, and Clock Handling

Part A - The bit layout

Part B - Decentralisation - the absence of a hot path

Machine ID assignment

Part C - Clock handling

Consistency model

Failure modes

Multi-region

Evolution path

Observability

Step 6 - Bottlenecks and Trade-offs

Reference Architecture

Common Mistakes in the Interview

Quick Reference

Related Articles

Ready to ace your interview?