Designing a Smart Queue Management System
Build an intelligent queue with priority handling, dead letter queues, poison message detection, back-pressure, and consumer scaling.
srikanthtelkalapally888@gmail.com
A robust message queue needs more than enqueue and dequeue: it also needs priority handling, failure handling, and back-pressure.
Queue Architectures
Point-to-Point (Queue):
Producer → Queue → One Consumer
(Competing consumers for load distribution)
Pub/Sub (Topic):
Producer → Topic → Many Consumers
(Fan-out, each consumer gets every message)
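The difference between the two delivery models can be shown with a small in-memory sketch. This is an illustration only: the function names are hypothetical, and round-robin assignment stands in for "whichever competing consumer is free next".

```python
from itertools import cycle


def point_to_point(messages, consumers):
    """Each message is delivered to exactly one of the competing consumers."""
    delivered = {name: [] for name in consumers}
    # Round-robin is a stand-in for real competing-consumer scheduling.
    for msg, name in zip(messages, cycle(consumers)):
        delivered[name].append(msg)
    return delivered


def pub_sub(messages, subscribers):
    """Every subscriber receives its own copy of every message (fan-out)."""
    return {name: list(messages) for name in subscribers}


jobs = ["m1", "m2", "m3", "m4"]
print(point_to_point(jobs, ["c1", "c2"]))   # work is split between c1 and c2
print(pub_sub(jobs, ["billing", "audit"]))  # both subscribers see all four
```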
Priority Queue
Queues by priority:
high_priority.fifo → Payment jobs (p99 < 1s)
normal.fifo → Email jobs (p99 < 10s)
low_priority.fifo → Report jobs (best effort)
Workers:
Check high_priority first
Fall through if empty
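The worker polling rule above could look roughly like this. The sketch uses in-memory deques in place of real queues, and `next_job` is a hypothetical helper name.

```python
from collections import deque

# Queue names follow the tiers listed above; list order encodes priority.
PRIORITY_ORDER = ["high_priority.fifo", "normal.fifo", "low_priority.fifo"]


def next_job(queues):
    """Return (queue_name, job) from the highest-priority non-empty queue, or None."""
    for name in PRIORITY_ORDER:
        q = queues[name]
        if q:  # fall through to the next tier when this one is empty
            return name, q.popleft()
    return None


queues = {
    "high_priority.fifo": deque(["payment-1"]),
    "normal.fifo": deque(["email-1", "email-2"]),
    "low_priority.fifo": deque(["report-1"]),
}

while (item := next_job(queues)) is not None:
    print(item)
```

Strict priority like this can starve the lower tiers while high-priority work keeps arriving, so a real deployment may want dedicated workers per tier or a weighted poll.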
Dead Letter Queue (DLQ)
Message processing fails:
Retry 1: immediate
Retry 2: 30 seconds later
Retry 3: 5 minutes later
Retry 4: 1 hour later
After retry 4 fails: → Dead Letter Queue
DLQ: Manual inspection + replay
# AWS SQS DLQ config (region and account ID are placeholders)
RedrivePolicy:
  deadLetterTargetArn: arn:aws:sqs:<region>:<account-id>:dlq-orders
  maxReceiveCount: 5
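As a rough sketch, the retry schedule could be driven by a delay table with a dead-letter list as the final stop. `process_with_retries` and the shape of the dead-letter record are assumptions for illustration. A production worker would usually re-enqueue with a delay instead of sleeping in place.

```python
# Delay in seconds before each attempt, mirroring the schedule above.
RETRY_DELAYS = [0, 30, 5 * 60, 60 * 60]


def process_with_retries(message, handler, dead_letters, sleep=lambda s: None):
    """Call handler(message) once per entry in RETRY_DELAYS; dead-letter on exhaustion.

    `sleep` is injectable so the example runs instantly; pass time.sleep
    to wait for real.
    """
    last_error = None
    for delay in RETRY_DELAYS:
        sleep(delay)
        try:
            return handler(message)
        except Exception as exc:  # broad on purpose: any failure triggers a retry
            last_error = exc
    # All attempts failed: park the message for manual inspection and replay.
    dead_letters.append(
        {"message": message, "error": repr(last_error), "attempts": len(RETRY_DELAYS)}
    )
    return None


def always_fails(msg):
    raise ValueError("bad payload")


dlq = []
process_with_retries({"id": "order-42"}, always_fails, dlq)
print(dlq)
```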
Poison Message Detection
A poison message is one that crashes the consumer every time it is processed:
Detect: Same message_id retried > N times
Action:
Move to poison queue
Alert engineering team
Consumer continues (not stuck)
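One way to sketch the detect-and-quarantine steps is a per-message attempt counter. `PoisonGuard` and `MAX_ATTEMPTS` are hypothetical names, and the threshold of 3 is an assumed value for N.

```python
from collections import Counter

MAX_ATTEMPTS = 3  # stands in for N above; the value is an assumption


class PoisonGuard:
    def __init__(self, max_attempts=MAX_ATTEMPTS, alert=print):
        self.max_attempts = max_attempts
        self.alert = alert            # hook for paging or notifying the team
        self.attempts = Counter()     # delivery count per message_id
        self.poison_queue = []

    def handle(self, message, handler):
        """Return True if handled, False if it failed or was quarantined."""
        msg_id = message["message_id"]
        self.attempts[msg_id] += 1
        if self.attempts[msg_id] > self.max_attempts:
            # Seen too many times: quarantine and move on.
            self.poison_queue.append(message)
            self.alert(f"poison message quarantined: {msg_id}")
            del self.attempts[msg_id]
            return False
        try:
            handler(message)
        except Exception:
            return False  # caller re-enqueues; the count is kept
        del self.attempts[msg_id]
        return True
```

An in-process counter does not survive a real consumer crash, so in practice the count would need to live in the broker or an external store. SQS, for example, exposes an `ApproximateReceiveCount` attribute on each message.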
Back-Pressure
Problem: Producer faster than consumer
→ Queue grows → OOM
Solutions:
Rate limit producer when queue depth > threshold
Return HTTP 429 to producer (explicit back-pressure)
Auto-scale consumers when queue depth > 1000
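The producer-side and consumer-side responses can both be keyed off queue depth, as in the sketch below. Only the 1000-message scale-out trigger comes from the text. `MAX_DEPTH`, `PER_CONSUMER`, the consumer cap, and the function names are assumed for illustration.

```python
import math

MAX_DEPTH = 5000        # reject new work above this depth (assumed value)
SCALE_THRESHOLD = 1000  # scale-out trigger from the text
PER_CONSUMER = 500      # backlog one consumer is assumed to absorb


def admit(queue_depth):
    """HTTP-style status for a producer request: 202 accepted, 429 slow down."""
    return 429 if queue_depth > MAX_DEPTH else 202


def desired_consumers(queue_depth, current, max_consumers=20):
    """Scale out once depth passes the threshold; this sketch never scales in."""
    if queue_depth <= SCALE_THRESHOLD:
        return current
    target = math.ceil(queue_depth / PER_CONSUMER)
    return max(current, min(target, max_consumers))


print(admit(120), admit(9000))        # a shallow queue admits, a deep one pushes back
print(desired_consumers(3000, current=2))
```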
Exactly-Once Processing
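Kafka transactional producer (consume-transform-produce loop):
init_transactions()  # once at startup; requires a transactional.id
begin_transaction()
produce(topic, message)
send_offsets_to_transaction(offsets, consumer_group_metadata)  # the "commit offset" step
commit_transaction()  # call abort_transaction() instead on failure
→ Atomic: the produced message and the consumed offset commit together or not at all (holds for Kafka-to-Kafka flows; external side effects still need idempotent handling)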
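To make the atomicity guarantee concrete, here is a toy in-memory model of the transaction. It is not the Kafka client API: `ToyTransaction` and `process` are hypothetical, and the model only shows that the output message and the offset become visible together or not at all.

```python
class ToyTransaction:
    """In-memory stand-in for a transactional producer (not the Kafka API)."""

    def __init__(self, log, offsets):
        self.log = log          # committed output messages
        self.offsets = offsets  # committed consumer offsets per partition
        self._pending_msgs = []
        self._pending_offsets = {}

    def produce(self, message):
        self._pending_msgs.append(message)  # buffered, not yet visible

    def send_offsets(self, partition, offset):
        self._pending_offsets[partition] = offset

    def commit(self):
        # Both effects become visible in one step.
        self.log.extend(self._pending_msgs)
        self.offsets.update(self._pending_offsets)
        self._reset()

    def abort(self):
        self._reset()  # drop buffered output and offsets together

    def _reset(self):
        self._pending_msgs = []
        self._pending_offsets = {}


def process(record, log, offsets, transform):
    txn = ToyTransaction(log, offsets)
    try:
        txn.produce(transform(record["value"]))
        # Commit the offset of the NEXT record to read, as Kafka does.
        txn.send_offsets(record["partition"], record["offset"] + 1)
        txn.commit()
        return True
    except Exception:
        txn.abort()
        return False
```

If `transform` raises, neither the output nor the offset is recorded, so the record is reprocessed later without leaving a duplicate behind.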
Queue Metrics
Monitor:
Queue depth → Is processing keeping up?
Consumer lag → How far behind?
DLQ depth → Failure rate
Throughput → Messages/second
Age of oldest message → Processing stall?
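The metrics above can be derived from a periodic snapshot of queue state. The sketch below assumes a hypothetical `QueueSnapshot` record. A real system would read these values from the broker's own monitoring API.

```python
from dataclasses import dataclass


@dataclass
class QueueSnapshot:
    enqueue_times: list      # enqueue timestamp (seconds) of each waiting message
    latest_offset: int       # last offset written by producers
    committed_offset: int    # last offset committed by the consumer group
    dlq_depth: int
    processed_in_window: int
    window_seconds: float


def queue_metrics(snap, now):
    """Compute the five health metrics from one snapshot."""
    return {
        "queue_depth": len(snap.enqueue_times),
        "consumer_lag": snap.latest_offset - snap.committed_offset,
        "dlq_depth": snap.dlq_depth,
        "throughput_per_s": snap.processed_in_window / snap.window_seconds,
        # Age of the oldest waiting message; 0 when the queue is empty.
        "oldest_age_s": now - min(snap.enqueue_times) if snap.enqueue_times else 0.0,
    }


snap = QueueSnapshot([100.0, 130.0, 160.0], 500, 470, 2, 120, 60.0)
print(queue_metrics(snap, now=200.0))
```

A growing `oldest_age_s` alongside flat throughput is a typical sign of a stalled consumer, whereas growing depth with healthy throughput points to a producer surge.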
Conclusion
Smart queues need priority routing, exponential backoff + DLQ for failures, poison message handling, and auto-scaling consumers based on queue depth.