Designing a Log-Structured File System
Understand log-structured file systems — append-only writes, garbage collection, segment cleaning, crash recovery, and their influence on modern databases.
srikanthtelkalapally888@gmail.com
Log-structured file systems write all changes sequentially to a log, enabling high write throughput and simple crash recovery.
Traditional File Systems Problem
Traditional (ext4, NTFS):
Random writes scatter across disk
Metadata + data written in separate locations
Slow for write-heavy workloads
Crash recovery needs a journal or, on older non-journaled file systems, a full consistency scan (fsck)
Log-Structured Approach
All writes → Append to end of log (sequential)
Log: [inode][data] (write 1) → [inode][data] (write 2) → [checkpoint] ...
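On an HDD, sequential writes avoid seeks and rotational delay, so they can be orders of magnitude faster than small random writes. On an SSD the gap is smaller, but sequential writes still give more predictable performance.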
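To make the HDD claim concrete, here is a back-of-the-envelope calculation. The block size, access time, and bandwidth are illustrative assumptions, not measurements of any particular drive, and the per-block transfer time is ignored because it is tiny next to the seek.

```python
def random_write_throughput(block_bytes, access_seconds):
    """Bytes per second when every block pays one seek plus rotational delay."""
    return block_bytes / access_seconds

BLOCK_BYTES = 4096             # assumed block size
ACCESS_SECONDS = 0.010         # assumed seek + rotational delay per random write
SEQUENTIAL_BPS = 100_000_000   # assumed sustained sequential bandwidth (100 MB/s)

random_bps = random_write_throughput(BLOCK_BYTES, ACCESS_SECONDS)
speedup = SEQUENTIAL_BPS / random_bps

print(f"random: {random_bps / 1e6:.2f} MB/s, sequential speedup: {speedup:.0f}x")
```

Under these assumptions, random 4 KB writes manage roughly 0.4 MB/s against 100 MB/s sequentially, a gap of a little over two orders of magnitude.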
Data Structure
Segment (1MB chunk):
[segment_summary][file_data][inodes]
Log = Series of segments written sequentially
Inode Map:
Maps inode_number → current disk location
Cached in memory, checkpointed to disk
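The structures above could be modeled roughly as follows. This is a minimal in-memory sketch, not the on-disk layout of any real file system; the names Segment, BlockAddress, and InodeMap, and the shape of the summary entries, are assumptions made for illustration.

```python
from dataclasses import dataclass, field

SEGMENT_SIZE = 1 << 20  # 1 MB, matching the segment size above

@dataclass(frozen=True)
class BlockAddress:
    segment_id: int
    offset: int  # byte offset inside the segment payload

@dataclass
class Segment:
    segment_id: int
    # One (kind, inode_number, offset, length) tuple per entry in the payload.
    summary: list = field(default_factory=list)
    payload: bytearray = field(default_factory=bytearray)

    def append(self, kind, inode_number, data):
        """Append one entry; return its address, or None if it does not fit."""
        if len(self.payload) + len(data) > SEGMENT_SIZE:
            return None
        offset = len(self.payload)
        self.payload += data
        self.summary.append((kind, inode_number, offset, len(data)))
        return BlockAddress(self.segment_id, offset)

class InodeMap:
    """inode_number -> address of the most recent copy of that inode."""

    def __init__(self):
        self._locations = {}

    def update(self, inode_number, address):
        self._locations[inode_number] = address

    def lookup(self, inode_number):
        return self._locations[inode_number]
```

Writing a new version of an inode appends it to the segment and repoints the map; the old copy stays in the log as dead data until cleaning reclaims it.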
Write Path
1. Buffer writes in memory (write buffer ~1MB)
2. When full: write segment to disk (sequential)
3. Update in-memory inode map
4. Periodically checkpoint inode map to disk
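The four steps above can be sketched as a small class. This is an in-memory simulation under stated assumptions: a Python list stands in for the disk, LogWriter and its parameters are hypothetical names, and the checkpoint is just a copy of the inode map taken every few segments.

```python
class LogWriter:
    def __init__(self, buffer_limit=1 << 20, checkpoint_every=4):
        self.buffer_limit = buffer_limit          # flush threshold in bytes
        self.checkpoint_every = checkpoint_every  # segments between checkpoints
        self.buffer = []          # pending (inode_number, data) writes
        self.buffered_bytes = 0
        self.log = []             # stand-in for the disk: flushed segments, in order
        self.inode_map = {}       # inode_number -> (segment_index, entry_index)
        self.checkpoint = {}      # last inode map snapshot "on disk"

    def write(self, inode_number, data):
        # Step 1: buffer the write in memory.
        self.buffer.append((inode_number, data))
        self.buffered_bytes += len(data)
        if self.buffered_bytes >= self.buffer_limit:
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        # Step 2: write the whole buffer as one sequential segment.
        segment_index = len(self.log)
        self.log.append(list(self.buffer))
        # Step 3: point the in-memory inode map at the new locations.
        for entry_index, (inode_number, _) in enumerate(self.buffer):
            self.inode_map[inode_number] = (segment_index, entry_index)
        self.buffer.clear()
        self.buffered_bytes = 0
        # Step 4: periodically checkpoint the inode map.
        if len(self.log) % self.checkpoint_every == 0:
            self.checkpoint = dict(self.inode_map)
```

Note that a write is not durable until its segment is flushed, so anything still in the buffer at crash time is lost. A real system has to choose how much recent data it is willing to risk.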
Read Path
1. Lookup inode number → location in inode map
2. Read inode → data block locations
3. Read data blocks
Challenge: Data can be anywhere in log → Random reads
Solution: Read cache (LRU buffer cache)
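A rough sketch of the read path with an LRU cache in front of the disk. The disk is assumed to be a plain dict from address to contents, and an inode is assumed to be a tuple of data block addresses; LRUCache and read_file are illustrative names.

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # address -> contents, least recently used first
        self.hits = 0
        self.misses = 0

    def get(self, address, load):
        """Return cached contents, calling load(address) on a miss."""
        if address in self.entries:
            self.entries.move_to_end(address)  # mark as most recently used
            self.hits += 1
            return self.entries[address]
        self.misses += 1
        value = load(address)
        self.entries[address] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)   # evict the least recently used entry
        return value

def read_file(inode_number, inode_map, disk, cache):
    # Step 1: inode number -> inode location.
    inode_address = inode_map[inode_number]
    # Step 2: read the inode, which lists the data block addresses.
    inode = cache.get(inode_address, disk.__getitem__)
    # Step 3: read each data block, going to "disk" only on a cache miss.
    return b"".join(cache.get(addr, disk.__getitem__) for addr in inode)
```

The first read of a file pays one miss for the inode and one per data block. Repeated reads are served from the cache as long as the working set fits within its capacity.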
Garbage Collection (Segment Cleaning)
Problem: Old versions of files remain in old log segments
Cleaning:
1. Select segments with most dead data
2. Read live blocks from segment
3. Write live blocks to new segment at end of log
4. Mark old segment free
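The cleaning steps above might look like this in a toy model. It uses the greedy "most dead data" policy from step 1; production cleaners often also weigh how old a segment's data is. The data layout (segments as lists of (block_id, data) pairs, live_location as the authoritative map) and all names are assumptions for illustration.

```python
def dead_fraction(segment_id, segments, live_location):
    """Fraction of a segment's blocks that are no longer the live copy."""
    blocks = segments[segment_id]
    if not blocks:
        return 1.0
    dead = sum(
        1
        for index, (block_id, _) in enumerate(blocks)
        if live_location.get(block_id) != (segment_id, index)
    )
    return dead / len(blocks)

def clean_one_segment(segments, live_location, next_segment_id):
    """Reclaim the segment with the most dead data; return its id."""
    # Step 1: select the segment with the highest dead fraction.
    victim = max(segments, key=lambda s: dead_fraction(s, segments, live_location))
    # Step 2: read the blocks that are still live.
    survivors = [
        (block_id, data)
        for index, (block_id, data) in enumerate(segments[victim])
        if live_location.get(block_id) == (victim, index)
    ]
    # Step 3: rewrite live blocks into a new segment at the end of the log.
    if survivors:
        segments[next_segment_id] = survivors
        for index, (block_id, _) in enumerate(survivors):
            live_location[block_id] = (next_segment_id, index)
    # Step 4: the old segment is now free for reuse.
    del segments[victim]
    return victim
```

Cleaning is not free: every live block in the victim is read and written again, so segments that are mostly dead are the cheapest to reclaim.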
Crash Recovery
Checkpoint:
Periodically write full inode map to fixed locations
Record checkpoint timestamp
Recovery:
1. Find latest valid checkpoint
2. Restore inode map from checkpoint
3. Roll forward from checkpoint position in log
4. Replay segments written after checkpoint
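The recovery steps above can be sketched as a single function. This is a simplified model: each checkpoint is assumed to be a dict carrying a timestamp, the log position it covers, a snapshot of the inode map, and a boolean valid flag. A real system would establish validity with something like a checksum rather than a stored flag.

```python
def recover(checkpoints, log):
    """Rebuild the inode map after a crash.

    checkpoints: list of dicts with keys "timestamp", "log_position",
                 "inode_map", and "valid" (hypothetical layout).
    log: list of segments; each segment is a list of
         (inode_number, address) records in write order.
    """
    # Step 1: find the latest checkpoint that was written completely.
    valid = [c for c in checkpoints if c["valid"]]
    latest = max(valid, key=lambda c: c["timestamp"], default=None)

    # Step 2: restore the inode map from it (or start empty if none exists).
    if latest is None:
        inode_map, position = {}, 0
    else:
        inode_map, position = dict(latest["inode_map"]), latest["log_position"]

    # Steps 3 and 4: roll forward, replaying every segment written
    # after the checkpoint so later writes override the snapshot.
    for segment in log[position:]:
        for inode_number, address in segment:
            inode_map[inode_number] = address
    return inode_map
```

Recovery time depends on how much log was written since the last checkpoint, not on the size of the file system, which is why it stays short compared with a full-disk scan.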
Influence on Modern Systems
LevelDB / RocksDB: LSM tree (log-structured)
Cassandra: Log-structured storage
PostgreSQL WAL: Changes are appended to a write-ahead log before data files are updated
ZFS: Copy-on-write (similar principle)
Kafka: Append-only log (same idea!)
Conclusion
Log-structured file systems sacrifice read locality for write performance and simple recovery. Their core ideas — sequential writes, append-only logs, garbage collection — appear throughout modern databases and storage engines.