
Designing a Distributed ID Generation System

Build a globally unique ID generator — covering Twitter Snowflake, UUIDs, ULID, and database sequences for distributed systems.


srikanthtelkalapally888@gmail.com


Distributed systems need globally unique IDs that can be generated without coordination.

Requirements

  • Globally unique across all nodes
  • Time-ordered (for pagination)
  • High throughput (100K+ IDs/sec)
  • No single point of failure
  • Short enough to be practical

Option 1: UUID v4

550e8400-e29b-41d4-a716-446655440000
128 bits, of which 122 are random (6 are fixed version and variant bits)

Pros: Simple, no coordination
Cons: Not time-sortable, large (36 chars as text, 16 bytes as binary), random values fragment B-tree indexes
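
As a quick illustration of the "no coordination" point, here is a minimal sketch using Python's standard `uuid` module. The variable name `ids` is only for this example.

```python
import uuid

# Generate a few random (version 4) UUIDs; no shared state or network call is needed.
ids = [uuid.uuid4() for _ in range(3)]

for u in ids:
    # str(u) is the 36-character hyphenated form; u.bytes is the raw 16 bytes.
    print(str(u), len(str(u)), len(u.bytes), u.version)
```

Sorting these values says nothing about creation order, which is why UUID v4 is a poor fit for time-based pagination.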

Option 2: Twitter Snowflake

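64-bit integer:
[1 bit: unused sign bit] [41 bits: timestamp ms] [10 bits: machine ID] [12 bits: sequence]

Timestamp: ms since a custom epoch (Twitter uses 1288834974657, i.e. 2010-11-04 UTC); 41 bits last about 69.7 years
Machine ID: Unique per node, 0-1023 (ZooKeeper assigns)
Sequence: 0-4095 per ms per machine

Max throughput: 4096 IDs/ms per machine (about 4.1 million IDs/sec)
Sortable: YES (time prefix); strictly ordered within one machine, roughly ordered across machines
Example fields (timestamp_machine_sequence): 1753789234567_001_0042; the real ID packs them into a single 64-bit integer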
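
The bit layout can be sketched as a small generator. This is an illustrative sketch, not Twitter's implementation: the class name `SnowflakeGenerator`, the `decode` helper, and the injectable `clock` parameter are assumptions made for this example, and a backward clock step is handled here simply by raising.

```python
import threading
import time

TWITTER_EPOCH_MS = 1288834974657           # custom epoch, 2010-11-04 UTC
MACHINE_BITS = 10
SEQUENCE_BITS = 12
MAX_MACHINE_ID = (1 << MACHINE_BITS) - 1   # 1023
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1    # 4095


class SnowflakeGenerator:
    """Hypothetical Snowflake-style generator for one machine."""

    def __init__(self, machine_id, epoch_ms=TWITTER_EPOCH_MS, clock=None):
        if not 0 <= machine_id <= MAX_MACHINE_ID:
            raise ValueError("machine_id must fit in 10 bits")
        self.machine_id = machine_id
        self.epoch_ms = epoch_ms
        # The clock is injectable for testing; it defaults to wall-clock milliseconds.
        self._clock = clock or (lambda: time.time_ns() // 1_000_000)
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # All 4096 values used in this millisecond: spin until the next one.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            return (((now - self.epoch_ms) << (MACHINE_BITS + SEQUENCE_BITS))
                    | (self.machine_id << SEQUENCE_BITS)
                    | self._sequence)


def decode(snowflake_id, epoch_ms=TWITTER_EPOCH_MS):
    """Split an ID back into (unix_ms, machine_id, sequence)."""
    sequence = snowflake_id & MAX_SEQUENCE
    machine_id = (snowflake_id >> SEQUENCE_BITS) & MAX_MACHINE_ID
    unix_ms = (snowflake_id >> (MACHINE_BITS + SEQUENCE_BITS)) + epoch_ms
    return unix_ms, machine_id, sequence


gen = SnowflakeGenerator(machine_id=1)
print(decode(gen.next_id()))
```

Because the timestamp occupies the high bits, comparing two IDs as plain integers compares their creation times first, then machine ID, then sequence.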

Option 3: ULID (Universally Unique Lexicographically Sortable ID)

01ARZ3NDEKTSV4RRFFQ69G5FAV
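26 characters, Crockford base32

[48 bits: timestamp ms] [80 bits: random]

Sortable: YES (lexicographic order follows the timestamp)
URL-safe: YES
Monotonic: Optional; within the same millisecond the random part is incremented instead of regenerated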
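
A rough sketch of how the 48-bit timestamp and 80 random bits become 26 characters. The function names `encode_ulid` and `new_ulid` are made up for this example, and the optional monotonic mode is left out.

```python
import os
import time

CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"  # base32 alphabet without I, L, O, U


def encode_ulid(timestamp_ms, randomness):
    """Pack a 48-bit timestamp and 80 random bits into 26 base32 characters."""
    if not 0 <= timestamp_ms < (1 << 48):
        raise ValueError("timestamp must fit in 48 bits")
    if len(randomness) != 10:
        raise ValueError("randomness must be exactly 10 bytes (80 bits)")
    value = (timestamp_ms << 80) | int.from_bytes(randomness, "big")
    chars = []
    for _ in range(26):
        # Take 5 bits at a time, lowest bits first, then reverse at the end.
        chars.append(CROCKFORD[value & 0x1F])
        value >>= 5
    return "".join(reversed(chars))


def new_ulid():
    """Build a ULID from the current wall-clock time and OS randomness."""
    return encode_ulid(time.time_ns() // 1_000_000, os.urandom(10))


print(new_ulid())
```

The first 10 characters carry the timestamp and the last 16 carry the randomness, so plain string comparison orders ULIDs by creation millisecond.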

Option 4: Database Sequence

CREATE SEQUENCE global_id_seq START 1;

SELECT nextval('global_id_seq');

Pros: Simple, perfectly sequential
Cons: Single DB = bottleneck, not distributed

Option 5: Segment's KSUID

160-bit: [32-bit timestamp, seconds since a custom epoch] [128-bit random payload]
Base62 encoded: 27 chars
Sortable (1-second resolution), URL-safe
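
A minimal sketch of the encoding, assuming the published KSUID layout of a 4-byte timestamp (seconds since 1400000000) followed by a 16-byte random payload, 20 bytes in total. The function name `encode_ksuid` is hypothetical.

```python
import os
import time

KSUID_EPOCH = 1_400_000_000  # seconds; KSUID timestamps count from this offset
BASE62 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def encode_ksuid(unix_seconds, payload):
    """Encode a 32-bit timestamp plus a 16-byte payload as 27 base62 characters."""
    if len(payload) != 16:
        raise ValueError("payload must be exactly 16 bytes")
    ts = unix_seconds - KSUID_EPOCH
    if not 0 <= ts < (1 << 32):
        raise ValueError("timestamp out of range for 32 bits")
    value = int.from_bytes(ts.to_bytes(4, "big") + payload, "big")
    chars = []
    while value:
        value, rem = divmod(value, 62)
        chars.append(BASE62[rem])
    # Left-pad to a fixed width so string order matches numeric order.
    return "".join(reversed(chars)).rjust(27, "0")


print(encode_ksuid(int(time.time()), os.urandom(16)))
```

The fixed 27-character width and an alphabet in ASCII order are what make string comparison agree with timestamp order.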

Decision Matrix

Simple system, no ordering needed: UUID v4
Time-sorted, high scale:           Snowflake
URL-safe + sortable:               ULID or KSUID
Small system, single database:     DB sequence

Snowflake Machine ID Assignment

Approach 1: Assign via ZooKeeper at startup
Approach 2: Use last 10 bits of IP address (only safe if those bits are unique across the fleet)
Approach 3: Random + collision detection
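
Approach 2 can be sketched in a few lines. The helper name `machine_id_from_ip` is made up for this example, and the addresses are illustrative. Note how two hosts in different subnets can end up with the same machine ID.

```python
import ipaddress


def machine_id_from_ip(ip, bits=10):
    """Return the low `bits` bits of an IPv4 address as an integer."""
    return int(ipaddress.IPv4Address(ip)) & ((1 << bits) - 1)


print(machine_id_from_ip("192.168.0.17"))  # 17
print(machine_id_from_ip("10.0.1.5"))      # 261
print(machine_id_from_ip("10.0.5.5"))      # 261, collides with 10.0.1.5
```

The collision happens because only the last 10 bits survive, so this approach needs a network plan that keeps those bits unique, or a fallback to collision detection.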

Clock Drift Issue

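Machine clock goes backward (e.g. after an NTP correction):
  Wait until the clock catches up to the last timestamp used
  OR throw an exception and refuse to generate IDs
  OR keep the last timestamp and keep incrementing the sequence (stays monotonic until the sequence runs out)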
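
The first two strategies are often combined: wait out a small backward step, fail on a large one. A sketch under that assumption follows. The function name `wait_for_clock`, the 5 ms threshold, and the injectable `clock` and `sleep` parameters are all choices made for this example.

```python
import time


def wait_for_clock(last_ms, clock, max_wait_ms=5, sleep=time.sleep):
    """Return a timestamp >= last_ms, or raise if the clock is too far behind."""
    now = clock()
    if now >= last_ms:
        return now
    drift = last_ms - now
    if drift > max_wait_ms:
        # Large backward jump: failing loudly is safer than stalling or risking duplicates.
        raise RuntimeError(f"clock moved backwards by {drift} ms")
    while now < last_ms:
        # Small drift: pause briefly and re-read the clock until it catches up.
        sleep(0.001)
        now = clock()
    return now


# Simulated clock that is 2 ms behind the last issued timestamp (100).
ticks = iter([98, 99, 100])
print(wait_for_clock(100, lambda: next(ticks), sleep=lambda s: None))  # 100
```

A generator would call this with its last issued timestamp before building each ID, so a timestamp is never reused after the clock steps back.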

Conclusion

Snowflake-style IDs are a common choice for high-scale systems: time-ordered, 64-bit, and generated with no coordination after machine ID assignment.
