diff --git a/README.md b/README.md
index fb76b65..89c32af 100644
--- a/README.md
+++ b/README.md
@@ -87,7 +87,7 @@ We are working on chapter 3 and more test cases for all existing contents.
 | 3.1 | Timestamp Key Encoding | ✅ | ✅ | ✅ |
 | 3.2 | Snapshot Read - Blocks, Memtables, and SSTs | ✅ | ✅ | ✅ |
 | 3.3 | Snapshot Read - Engine Read Path | ✅ | ✅ | ✅ |
-| 3.4 | Watermark and Garbage Collection | ✅ | 🚧 | 🚧 |
+| 3.4 | Watermark and Garbage Collection | ✅ | ✅ | ✅ |
 | 3.5 | Transactions and Optimistic Concurrency Control | ✅ | 🚧 | 🚧 |
 | 3.6 | Serializable Snapshot Isolation | ✅ | 🚧 | 🚧 |
 | 3.7 | Compaction Filter | 🚧 | | |
diff --git a/mini-lsm-book/src/SUMMARY.md b/mini-lsm-book/src/SUMMARY.md
index 30820e0..38d8eea 100644
--- a/mini-lsm-book/src/SUMMARY.md
+++ b/mini-lsm-book/src/SUMMARY.md
@@ -22,14 +22,14 @@
   - [Write-Ahead Log (WAL)](./week2-06-wal.md)
   - [Snack Time: Batch Write and Checksums](./week2-07-snacks.md)
 
-- [Week 3 Overview: MVCC (WIP)](./week3-overview.md)
+- [Week 3 Overview: MVCC](./week3-overview.md)
   - [Timestamp Encoding + Refactor](./week3-01-ts-key-refactor.md)
   - [Snapshots - Memtables and Timestamps](./week3-02-snapshot-read-part-1.md)
   - [Snapshots - Transaction API](./week3-03-snapshot-read-part-2.md)
   - [Watermark and GC](./week3-04-watermark.md)
-  - [Transaction and OCC](./week3-05-txn-occ.md)
-  - [Serializable Snapshot Isolation](./week3-06-serializable.md)
-  - [Snack Time: Compaction Filter](./week3-07-compaction-filter.md)
+  - [Transaction and OCC (WIP)](./week3-05-txn-occ.md)
+  - [Serializable Snapshot Isolation (WIP)](./week3-06-serializable.md)
+  - [Snack Time: Compaction Filter (WIP)](./week3-07-compaction-filter.md)
 - [The Rest of Your Life (TBD)](./week4-overview.md)
 
 ---
diff --git a/mini-lsm-book/src/week3-04-watermark.md b/mini-lsm-book/src/week3-04-watermark.md
index e7f7579..dab67da 100644
--- a/mini-lsm-book/src/week3-04-watermark.md
+++ b/mini-lsm-book/src/week3-04-watermark.md
@@ -1,10 +1,64 @@
 # Watermark and Garbage Collection
 
+In this chapter, you will implement the structures needed to track the lowest read timestamp in use, and garbage-collect unused versions from SSTs during compaction.
+
 ## Task 1: Implement Watermark
 
+In this task, you will need to modify:
+
+```
+src/mvcc/watermark.rs
+```
+
+The watermark is the structure that tracks the lowest `read_ts` in the system. When a new transaction is created, it should call `add_reader` to register its read timestamp for tracking. When a transaction aborts or commits, it should remove itself from the watermark. The watermark structure returns the lowest `read_ts` in the system when `watermark()` is called. If there are no ongoing transactions, it returns `None`.
+
+You may implement the watermark using a `BTreeMap` that maps each `read_ts` to the number of snapshots currently using that read timestamp. The map should never contain an entry with 0 readers.
+
 ## Task 2: Maintain Watermark in Transactions
 
+In this task, you will need to modify:
+
+```
+src/mvcc/txn.rs
+src/mvcc.rs
+```
+
+You will need to add the `read_ts` to the watermark when a transaction starts, and remove it when `drop` is called for the transaction.
+
 ## Task 3: Garbage Collection in Compaction
 
+Now that we have a watermark for the system, we can clean up unused versions during the compaction process.
+
+* If a version of a key is above the watermark, keep it.
+* Among all versions of a key at or below the watermark, keep only the latest one.
+
+For example, if we have watermark=3 and the following data:
+
+```
+a@4=del <- above watermark
+a@3=3 <- latest version below or equal to watermark
+a@2=2 <- can be removed, no one will read it
+a@1=1 <- can be removed, no one will read it
+b@1=1 <- latest version below or equal to watermark
+c@4=4 <- above watermark
+d@3=del <- can be removed if compacting to bottom-most level
+d@2=2 <- can be removed
+```
+
+If we do a compaction over these keys, we will get:
+
+```
+a@4=del
+a@3=3
+b@1=1
+c@4=4
+d@3=del (can be removed if compacting to bottom-most level)
+```
+
+Assume these are all the keys in the engine. If we do a scan at ts=3, we will get `a=3,b=1` both before and after compaction, because `c@4` is not yet visible at ts=3. If we do a scan at ts=4, we will get `b=1,c=4` both before and after compaction. Compaction *will not* and *should not* affect transactions with read timestamp >= watermark.
+
+## Bonus Tasks
+
+* **O(1) Watermark.** You may implement an amortized O(1) watermark structure by using a hash map or a cyclic queue.
 
 {{#include copyright.md}}
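The two retention rules in Task 3 can be sketched as a small standalone function that reproduces the watermark=3 example from the chapter. This is only an illustration under stated assumptions: `Entry`, `gc_versions`, and `example_entries` are hypothetical names that are not part of the mini-lsm codebase, and the input is assumed to be sorted by key ascending and then by timestamp descending, as a compaction iterator would yield it.

```rust
/// One versioned entry: (user key, commit timestamp, value).
/// `None` stands for a delete tombstone. Hypothetical type for this sketch.
type Entry = (&'static str, u64, Option<&'static str>);

/// Hypothetical helper: decides which versions survive a compaction.
/// Assumes `entries` is sorted by key ascending, then timestamp descending.
fn gc_versions(entries: &[Entry], watermark: u64, bottom_level: bool) -> Vec<Entry> {
    let mut kept = Vec::new();
    // The key for which a version at or below the watermark was already seen.
    let mut settled_key: Option<&'static str> = None;
    for &(key, ts, value) in entries {
        if ts > watermark {
            // Rule 1: versions above the watermark are always kept.
            kept.push((key, ts, value));
            continue;
        }
        if settled_key == Some(key) {
            // A newer version at or below the watermark exists, so no
            // reader can ever observe this one.
            continue;
        }
        // Rule 2: this is the latest version at or below the watermark.
        settled_key = Some(key);
        if value.is_none() && bottom_level {
            // A tombstone has nothing left to shadow at the bottom-most level.
            continue;
        }
        kept.push((key, ts, value));
    }
    kept
}

/// The example data from the chapter, in iterator order.
fn example_entries() -> Vec<Entry> {
    vec![
        ("a", 4, None),
        ("a", 3, Some("3")),
        ("a", 2, Some("2")),
        ("a", 1, Some("1")),
        ("b", 1, Some("1")),
        ("c", 4, Some("4")),
        ("d", 3, None),
        ("d", 2, Some("2")),
    ]
}
```

Calling `gc_versions(&example_entries(), 3, false)` keeps `a@4`, `a@3`, `b@1`, `c@4`, and `d@3`; passing `true` for `bottom_level` additionally drops the `d@3` tombstone. Tracking only the most recently settled key is enough here because all versions of a key are adjacent in the sorted input.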