# LSM in a Week
[CI](https://github.com/skyzh/mini-lsm/actions/workflows/main.yml)

Build a simple key-value storage engine in a week!
## Tutorial
The tutorial is available at [https://skyzh.github.io/mini-lsm](https://skyzh.github.io/mini-lsm). You can use the provided starter
code to kick off your project, and follow the tutorial to implement the LSM tree.
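
For orientation before starting the tutorial, here is a toy sketch of the core idea behind an LSM tree: writes go to an in-memory sorted table, full tables are frozen into immutable sorted runs, reads check the newest data first, and compaction merges runs. This is not the tutorial's API or the reference solution. `ToyLsm`, its fields, and its methods are hypothetical names chosen for illustration, and the frozen runs stay in memory here, whereas a real engine would write them to disk.

```rust
use std::collections::BTreeMap;

/// Toy LSM (hypothetical, for illustration only): one mutable memtable
/// plus a list of frozen, sorted runs. A `None` value is a tombstone
/// that marks a deleted key.
struct ToyLsm {
    memtable: BTreeMap<String, Option<String>>,
    /// Frozen runs, ordered oldest to newest.
    runs: Vec<BTreeMap<String, Option<String>>>,
    /// Freeze the memtable once it holds this many entries.
    memtable_limit: usize,
}

impl ToyLsm {
    fn new(memtable_limit: usize) -> Self {
        Self {
            memtable: BTreeMap::new(),
            runs: Vec::new(),
            memtable_limit,
        }
    }

    fn put(&mut self, key: &str, value: &str) {
        self.write(key, Some(value.to_string()));
    }

    fn delete(&mut self, key: &str) {
        // Deletes are writes too: they insert a tombstone.
        self.write(key, None);
    }

    fn write(&mut self, key: &str, value: Option<String>) {
        self.memtable.insert(key.to_string(), value);
        if self.memtable.len() >= self.memtable_limit {
            // Freeze the full memtable into an immutable run and start a new one.
            let frozen = std::mem::take(&mut self.memtable);
            self.runs.push(frozen);
        }
    }

    fn get(&self, key: &str) -> Option<&str> {
        // Newest data wins: check the memtable, then runs from newest to oldest.
        let sources = std::iter::once(&self.memtable).chain(self.runs.iter().rev());
        for table in sources {
            if let Some(entry) = table.get(key) {
                // A tombstone (`None`) hides any older version of the key.
                return entry.as_deref();
            }
        }
        None
    }

    /// Merge all frozen runs into one, keeping only the newest version
    /// of each key and dropping tombstones.
    fn compact(&mut self) {
        let mut merged = BTreeMap::new();
        for run in self.runs.drain(..) {
            // Runs are visited oldest to newest, so newer entries overwrite older ones.
            merged.extend(run);
        }
        // Every run took part in the merge, so tombstones have nothing left to hide.
        merged.retain(|_, value| value.is_some());
        if !merged.is_empty() {
            self.runs.push(merged);
        }
    }
}
```

With a limit of 2, for example, two `put` calls fill and freeze the first memtable, and a later `put` or `delete` on the same keys shadows the frozen versions until `compact` merges the runs into one.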
## Community
You may join skyzh's Discord server and study with the mini-lsm community: [https://skyzh.dev/join/discord](https://skyzh.dev/join/discord).
## Development
```
cargo x install-tools
cargo x check
cargo x book
```
If you change the public API in the reference solution, you may also need to synchronize the change to the starter crate.
To do this, run `cargo x sync`.
## Structure
* mini-lsm: the final solution code
* mini-lsm-starter: the starter code
* mini-lsm-book: the tutorial

A separate repo, mini-lsm-solution-checkpoint, is available at [https://github.com/skyzh/mini-lsm-solution-checkpoint](https://github.com/skyzh/mini-lsm-solution-checkpoint). Each commit in that repo corresponds to a chapter in the tutorial. We will not update the solution checkpoint very often.
## Progress
We are working on a new version of the mini-lsm tutorial that is split into 3 weeks.
* Week 1: Storage Format + Engine Skeleton
* Week 2: Compaction and Persistence
* Week 3: Multi-Version Concurrency Control
* The Extra Week / Rest of Your Life: Optimizations (unlikely to be available in 2024...)

✅: finished \
🚧: WIP and will likely be available soon
| Week + Chapter | Topic                                           | Solution | Starter Code | Writeup |
| -------------- | ----------------------------------------------- | -------- | ------------ | ------- |
| 1.1            | Memtables                                       | ✅        | ✅            | ✅       |
| 1.2            | Merge Iterators                                 | ✅        | ✅            | ✅       |
| 1.3            | Block Format                                    | ✅        | ✅            | ✅       |
| 1.4            | Table Format                                    | ✅        | ✅            | ✅       |
| 1.5            | Storage Engine - Read Path                      | ✅        | ✅            | ✅       |
| 1.6            | Storage Engine - Write Path                     | ✅        | ✅            | ✅       |
| 1.7            | Bloom Filter and Key Compression                | ✅        | ✅            | ✅       |
| 2.1            | Compaction Implementation                       | ✅        | ✅            | ✅       |
| 2.2            | Compaction Strategy - Simple                    | ✅        | ✅            | ✅       |
| 2.3            | Compaction Strategy - Tiered                    | ✅        | ✅            | ✅       |
| 2.4            | Compaction Strategy - Leveled                   | ✅        | ✅            | ✅       |
| 2.5            | Manifest                                        | ✅        | ✅            | 🚧       |
| 2.6            | Write-Ahead Log                                 | ✅        | ✅            | 🚧       |
| 2.7            | Batch Write + Checksum                          |          |              |         |
| 3.1            | Timestamp Key Encoding + New Block Format       |          |              |         |
| 3.2            | Prefix Bloom Filter                             |          |              |         |
| 3.3            | Snapshot Read                                   |          |              |         |
| 3.4            | Watermark and Garbage Collection                |          |              |         |
| 3.5            | Transactions and Optimistic Concurrency Control |          |              |         |
| 3.6            | Serializable Snapshot Isolation                 |          |              |         |
| 3.7            | TTL (Time-to-Live) Entries                      |          |              |         |
| 4.1            | Benchmarking                                    |          |              |         |
| 4.2            | Block Compression                               |          |              |         |
| 4.3            | Trivial Move and Parallel Compaction            |          |              |         |
| 4.4            | Alternative Block Encodings                     |          |              |         |
| 4.5            | Rate Limiter and I/O Optimizations              |          |              |         |
| 4.6            | Build Your Own Block Cache                      |          |              |         |
| 4.7            | Build Your Own SkipList                         |          |              |         |
| 4.8            | Async Engine                                    |          |              |         |
| 4.9            | Key-Value Separation                            |          |              |         |
| 4.10           | Column Families                                 |          |              |         |
| 4.11           | Sharding                                        |          |              |         |
| 4.12           | SQL over Mini-LSM                               |          |              |         |
## License
The Mini-LSM starter code and solution are licensed under the Apache 2.0 license. The author reserves the full copyright of the tutorial materials (markdown files and figures).