Designing a Distributed ID Generation System
Build a globally unique ID generator: Twitter Snowflake, UUIDs, ULID, KSUID, and database sequences compared for distributed systems.
srikanthtelkalapally888@gmail.com
Distributed systems need globally unique IDs that can be generated without coordination.
Requirements
- Globally unique across all nodes
- Time-ordered (for pagination)
- High throughput (100K+ IDs/sec)
- No single point of failure
- Short enough to be practical
Option 1: UUID v4
550e8400-e29b-41d4-a716-446655440000
128 bits, of which 122 are random (6 are fixed version and variant bits)
Pros: Simple, no coordination
Cons: Not sortable, large (16 bytes, or 36 chars as text), random inserts fragment B-tree indexes
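As a quick illustration of the properties listed above, here is a minimal sketch using Python's standard `uuid` module. The variable names are illustrative only.

```python
import uuid

# Generate a random (version 4) UUID; no coordination between nodes is needed.
new_id = uuid.uuid4()

print(new_id)                 # canonical text form, 8-4-4-4-12 hex digits
print(len(str(new_id)))       # 36 characters as text
print(len(new_id.bytes) * 8)  # 128 bits in binary form
print(new_id.version)         # 4

# Two IDs created back to back carry no ordering information,
# so this comparison is effectively a coin flip.
first, second = uuid.uuid4(), uuid.uuid4()
print(first < second)
```

Storing the 16-byte binary form rather than the 36-character string reduces the size cost, but it does not make the values sortable by creation time.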
Option 2: Twitter Snowflake
64-bit integer:
[1 bit: unused sign bit] [41 bits: timestamp ms] [10 bits: machine ID] [12 bits: sequence]
Timestamp: ms since a custom epoch (Twitter's is 2010-11-04); 41 bits last about 69 years
Machine ID: Unique per node (ZooKeeper assigns)
Sequence: 0-4095 per ms per machine
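Max throughput: 4096 IDs/ms per machine (about 4.1 million IDs/sec)
Sortable: YES (time prefix); IDs from different machines within the same ms are only roughly ordered
Example (fields shown separately; a real ID packs them into one integer): 1753789234567_001_0042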
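The bit layout above comes down to shift-and-mask arithmetic. The sketch below assumes the 41/10/12 split, with the top bit left as zero so the result stays positive in a signed 64-bit column. The names `pack` and `unpack` are illustrative and do not come from any library.

```python
TIMESTAMP_BITS = 41
MACHINE_BITS = 10
SEQUENCE_BITS = 12

MACHINE_SHIFT = SEQUENCE_BITS                   # machine ID sits above the sequence
TIMESTAMP_SHIFT = SEQUENCE_BITS + MACHINE_BITS  # timestamp sits above both

MAX_TIMESTAMP = (1 << TIMESTAMP_BITS) - 1
MAX_MACHINE_ID = (1 << MACHINE_BITS) - 1  # 1023
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1   # 4095


def pack(ms_since_epoch, machine_id, sequence):
    """Combine the three fields into one non-negative 64-bit integer."""
    if not 0 <= ms_since_epoch <= MAX_TIMESTAMP:
        raise ValueError("timestamp out of range")
    if not 0 <= machine_id <= MAX_MACHINE_ID:
        raise ValueError("machine_id out of range")
    if not 0 <= sequence <= MAX_SEQUENCE:
        raise ValueError("sequence out of range")
    return (
        (ms_since_epoch << TIMESTAMP_SHIFT)
        | (machine_id << MACHINE_SHIFT)
        | sequence
    )


def unpack(snowflake):
    """Split an ID back into (ms_since_epoch, machine_id, sequence)."""
    return (
        snowflake >> TIMESTAMP_SHIFT,
        (snowflake >> MACHINE_SHIFT) & MAX_MACHINE_ID,
        snowflake & MAX_SEQUENCE,
    )


example = pack(ms_since_epoch=1_000, machine_id=1, sequence=42)
print(example)          # 4194308138
print(unpack(example))  # (1000, 1, 42)
```

Because the timestamp occupies the most significant bits, comparing two IDs as plain integers compares their timestamps first, which is what makes the IDs time-sortable.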
Option 3: ULID (Universally Unique Lexicographically Sortable ID)
01ARZ3NDEKTSV4RRFFQ69G5FAV
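26 characters, Crockford base32
[48 bits: timestamp, ms since Unix epoch] [80 bits: random]
Sortable: YES
URL-safe: YES
Monotonic: Optional; within the same millisecond a generator can increment the random part instead of redrawing it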
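To make the ULID layout concrete, here is a minimal encoder sketch using only the standard library. It assumes the Crockford base32 alphabet and does not implement the monotonic mode. The function name `new_ulid` and its parameters are hypothetical and exist mainly to make the example testable.

```python
import os
import time

# Crockford base32 alphabet: digits plus letters, excluding I, L, O, and U.
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def new_ulid(timestamp_ms=None, randomness=None):
    """Encode a 48-bit ms timestamp and 80 random bits as 26 characters."""
    if timestamp_ms is None:
        timestamp_ms = time.time_ns() // 1_000_000
    if randomness is None:
        randomness = os.urandom(10)  # 10 bytes = 80 bits
    if not 0 <= timestamp_ms < (1 << 48):
        raise ValueError("timestamp must fit in 48 bits")
    if len(randomness) != 10:
        raise ValueError("randomness must be exactly 10 bytes")

    # Timestamp goes in the high bits so string order follows time order.
    value = (timestamp_ms << 80) | int.from_bytes(randomness, "big")

    chars = []
    for _ in range(26):  # 26 chars x 5 bits = 130 bits; the top 2 are zero
        chars.append(CROCKFORD[value & 0x1F])
        value >>= 5
    return "".join(reversed(chars))


print(new_ulid())
```

The first 10 characters encode the timestamp and the last 16 encode the random part, so two ULIDs from different milliseconds sort correctly as plain strings.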
Option 4: Database Sequence
CREATE SEQUENCE global_id_seq START 1;
SELECT nextval('global_id_seq');
Pros: Simple, perfectly sequential
Cons: Single DB = bottleneck, not distributed
Option 5: Segment's KSUID
160-bit: [32-bit timestamp, seconds] [128-bit random]
Base62 encoded: 27 chars
Sortable (to one-second resolution), URL-safe
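The KSUID encoding can be sketched in a few lines. This is a minimal illustration rather than Segment's implementation: it assumes a 4-byte timestamp counted in seconds from an offset of 1400000000 (as I understand the reference implementation to use), a 16-byte random payload, and a base62 alphabet ordered digits, uppercase, lowercase. The function name `new_ksuid` and its parameters are hypothetical.

```python
import os
import time

KSUID_EPOCH = 1_400_000_000  # assumed offset, in Unix seconds
BASE62 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def new_ksuid(unix_seconds=None, payload=None):
    """Encode a 32-bit timestamp plus a 128-bit payload as 27 base62 chars."""
    if unix_seconds is None:
        unix_seconds = int(time.time())
    if payload is None:
        payload = os.urandom(16)  # 16 bytes = 128 bits
    if len(payload) != 16:
        raise ValueError("payload must be exactly 16 bytes")

    # 4 timestamp bytes followed by 16 payload bytes: 20 bytes = 160 bits.
    raw = (unix_seconds - KSUID_EPOCH).to_bytes(4, "big") + payload
    value = int.from_bytes(raw, "big")

    chars = []
    while value:
        value, remainder = divmod(value, 62)
        chars.append(BASE62[remainder])
    # Left-pad to a fixed width so string order matches numeric order.
    return "".join(reversed(chars)).rjust(27, "0")


print(new_ksuid())
```

The fixed 27-character width matters: without the padding, a shorter string could sort after a longer one even though it encodes a smaller value.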
Decision Matrix
Simple system: UUID v4
Time-sorted, high scale: Snowflake
URL-safe + sortable: ULID or KSUID
Small system: DB sequence
Snowflake Machine ID Assignment
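Approach 1: Assign via ZooKeeper at startup
Approach 2: Use last 10 bits of IP address (safe only if those bits are unique across the fleet, e.g. within one /22 subnet)
Approach 3: Random + collision detection (pick a random ID, then check a shared registry before using it)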
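The IP-based approach is the easiest to show in code, and also the easiest to get wrong. The sketch below uses Python's standard `ipaddress` module; the function name `machine_id_from_ip` and the sample addresses are made up for illustration.

```python
import ipaddress

MACHINE_ID_MASK = (1 << 10) - 1  # keep the lowest 10 bits: 0..1023


def machine_id_from_ip(ip):
    """Derive a 10-bit machine ID from the low bits of an IP address."""
    return int(ipaddress.ip_address(ip)) & MACHINE_ID_MASK


print(machine_id_from_ip("10.0.0.5"))  # 5
print(machine_id_from_ip("10.0.1.5"))  # 261

# Addresses that differ only above the low 10 bits collide:
print(machine_id_from_ip("10.0.4.5"))  # 5, same as 10.0.0.5
```

The collision in the last line is why this approach only works when every node shares the same upper 22 address bits. Otherwise two nodes can emit identical IDs in the same millisecond.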
Clock Drift Issue
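Machine clock goes backward (e.g. after an NTP correction), so new IDs could duplicate or sort before earlier ones:
Wait until clock catches up (fine for drift of a few ms)
OR throw exception (reject requests until the clock recovers)
OR use sequence bits (monotonic): keep the last timestamp and keep incrementing the sequence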
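The first two strategies can be combined in one generator: wait out small backward jumps and raise on large ones. The sketch below is one possible arrangement, not a reference implementation. The class name, the `max_wait_ms` threshold, and the injectable `clock` parameter are all assumptions made for illustration; the default epoch is Twitter's published value in milliseconds.

```python
import threading
import time


class ClockMovedBackwards(Exception):
    """Raised when the clock is behind the last issued timestamp by too much."""


class SnowflakeGenerator:
    def __init__(self, machine_id, epoch_ms=1288834974657, clock=None, max_wait_ms=5):
        if not 0 <= machine_id < 1024:
            raise ValueError("machine_id must fit in 10 bits")
        self._machine_id = machine_id
        self._epoch_ms = epoch_ms
        # The clock is injectable so the drift handling can be exercised in tests.
        self._clock = clock or (lambda: time.time_ns() // 1_000_000)
        self._max_wait_ms = max_wait_ms
        self._last_ms = -1
        self._sequence = 0
        self._lock = threading.Lock()

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                if self._last_ms - now > self._max_wait_ms:
                    # Large backward jump: refuse rather than risk duplicates.
                    raise ClockMovedBackwards(f"clock is {self._last_ms - now} ms behind")
                # Small backward jump: wait until the clock catches up.
                while now < self._last_ms:
                    time.sleep(0.001)
                    now = self._clock()
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & 0xFFF
                if self._sequence == 0:
                    # All 4096 values used in this ms: spin until the next one.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            return ((now - self._epoch_ms) << 22) | (self._machine_id << 12) | self._sequence


generator = SnowflakeGenerator(machine_id=1)
print(generator.next_id())
```

Holding the lock while sleeping keeps the sketch short but blocks every caller during the wait. A production generator would need to decide whether that stall is acceptable or whether failing fast is the better default.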
Conclusion
Snowflake-style IDs are a widely used choice for high-scale systems: time-ordered, 64-bit, and generated with no coordination once each node has a unique machine ID.