# LSM in a Week

[CI status](https://github.com/skyzh/mini-lsm/actions/workflows/main.yml)

Build a simple key-value storage engine in a week! Then extend your LSM engine in the second and third weeks.
							
## [Tutorial](https://skyzh.github.io/mini-lsm)

The Mini-LSM book is available at [https://skyzh.github.io/mini-lsm](https://skyzh.github.io/mini-lsm). You may follow this guide and implement the Mini-LSM storage engine. The tutorial has 3 weeks (parts), each of which consists of 7 days (chapters).
							
## Community

You may join skyzh's Discord server and study with the mini-lsm community.

[Join skyzh's Discord server](https://skyzh.dev/join/discord)

**Add Your Solution**

If you have finished at least one full week of this tutorial, you can add your solution to the community solution list at [SOLUTIONS.md](./SOLUTIONS.md). Submit a pull request, and we might do a quick review of your code in return for your hard work.
							
## Development

**For Students**

You should modify code in the `mini-lsm-starter` directory.

```
cargo x install-tools
cargo x copy-test --week 1 --day 1
cargo x scheck
cargo run --bin mini-lsm-cli
cargo run --bin compaction-simulator
```
							
**For Course Developers**

You should modify `mini-lsm` and `mini-lsm-mvcc`.

```
cargo x install-tools
cargo x check
cargo x book
```

If you change the public API in the reference solution, you may also need to synchronize it to the starter crate.
To do this, use `cargo x sync`.
							
## Code Structure

* mini-lsm: the final solution code for <= week 2
* mini-lsm-mvcc: the final solution code for week 3 MVCC
* mini-lsm-starter: the starter code
* mini-lsm-book: the tutorial

We have another repo mini-lsm-solution-checkpoint at [https://github.com/skyzh/mini-lsm-solution-checkpoint](https://github.com/skyzh/mini-lsm-solution-checkpoint). In this repo, each commit corresponds to a chapter in the tutorial. We will not update the solution checkpoint very often.
							
## Demo

You can run the reference solution yourself to get an overview of the system before you start.

```
cargo run --bin mini-lsm-cli-ref
cargo run --bin mini-lsm-cli-mvcc-ref
```

We also have a compaction simulator for experimenting with your compaction algorithm implementation:

```
cargo run --bin compaction-simulator-ref
cargo run --bin compaction-simulator-mvcc-ref
```
							
## Tutorial Structure

We have 3 weeks + 1 extra week (in progress) for this tutorial.

* Week 1: Storage Format + Engine Skeleton
* Week 2: Compaction and Persistence
* Week 3: Multi-Version Concurrency Control
* The Extra Week / Rest of Your Life: Optimizations (unlikely to be available in 2024...)

| Week + Chapter | Topic                                                       |
| -------------- | ----------------------------------------------------------- |
| 1.1            | Memtable                                                    |
| 1.2            | Merge Iterator                                              |
| 1.3            | Block                                                       |
| 1.4            | Sorted String Table (SST)                                   |
| 1.5            | Read Path                                                   |
| 1.6            | Write Path                                                  |
| 1.7            | SST Optimizations: Prefix Key Encoding + Bloom Filters      |
| 2.1            | Compaction Implementation                                   |
| 2.2            | Simple Compaction Strategy (Traditional Leveled Compaction) |
| 2.3            | Tiered Compaction Strategy (RocksDB Universal Compaction)   |
| 2.4            | Leveled Compaction Strategy (RocksDB Leveled Compaction)    |
| 2.5            | Manifest                                                    |
| 2.6            | Write-Ahead Log (WAL)                                       |
| 2.7            | Batch Write and Checksums                                   |
| 3.1            | Timestamp Key Encoding                                      |
| 3.2            | Snapshot Read - Memtables and Timestamps                    |
| 3.3            | Snapshot Read - Transaction API                             |
| 3.4            | Watermark and Garbage Collection                            |
| 3.5            | Transactions and Optimistic Concurrency Control             |
| 3.6            | Serializable Snapshot Isolation                             |
| 3.7            | Compaction Filters                                          |
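
Chapter 1.1 in the table above covers the memtable, the in-memory structure that absorbs writes before they reach disk. As a rough orientation only, the Rust sketch below shows one common way to model such a structure: an ordered map in which a delete is stored as a tombstone instead of removing the key. `SketchMemtable` and its methods are hypothetical names chosen for this illustration. They are not the Mini-LSM starter API, and the book's actual design may differ.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory write buffer; all names here are illustrative
/// assumptions, not part of the Mini-LSM crates.
///
/// A value of `None` is a tombstone: the key was deleted, and the marker
/// must be kept so it can shadow older values stored elsewhere.
#[derive(Default)]
pub struct SketchMemtable {
    entries: BTreeMap<Vec<u8>, Option<Vec<u8>>>,
}

impl SketchMemtable {
    pub fn new() -> Self {
        Self::default()
    }

    /// Insert a key or overwrite its current value.
    pub fn put(&mut self, key: &[u8], value: &[u8]) {
        self.entries.insert(key.to_vec(), Some(value.to_vec()));
    }

    /// Record a deletion as a tombstone instead of removing the entry.
    pub fn delete(&mut self, key: &[u8]) {
        self.entries.insert(key.to_vec(), None);
    }

    /// Three-way lookup:
    /// `None` = key unknown to this memtable,
    /// `Some(None)` = key deleted here,
    /// `Some(Some(v))` = live value.
    pub fn get(&self, key: &[u8]) -> Option<Option<&[u8]>> {
        self.entries.get(key).map(|v| v.as_deref())
    }

    /// Live key-value pairs in ascending key order, tombstones skipped.
    pub fn live_entries(&self) -> impl Iterator<Item = (&[u8], &[u8])> + '_ {
        self.entries
            .iter()
            .filter_map(|(k, v)| v.as_deref().map(|v| (k.as_slice(), v)))
    }
}
```

The three-way result of `get` is the point of the sketch: only when the memtable does not know the key at all would an engine need to consult older data, while a tombstone answers "deleted" immediately.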
							 
						 
					
						
							
								
									
										
										
										
							
## License

The Mini-LSM starter code and solution are licensed under the Apache 2.0 license. The author reserves the full copyright of the tutorial materials (markdown files and figures).