finish 3.2 3.3

Signed-off-by: Alex Chi Z <iskyzh@gmail.com>
This commit is contained in:
Alex Chi Z
2024-01-29 22:44:49 +08:00
parent 3cecf09d59
commit 1e49ba8a07
5 changed files with 184 additions and 17 deletions


We are working on chapter 3 and more test cases for all existing contents.
| Week + Chapter | Topic | Solution | Starter Code | Writeup |
| -------------- | ----------------------------------------------- | -------- | ------------ | ------- |
| 3.1 | Timestamp Key Encoding | ✅ | ✅ | ✅ |
| 3.2            | Snapshot Read - Blocks, Memtables, and SSTs     | ✅        | ✅            | ✅       |
| 3.3            | Snapshot Read - Engine Read Path                | ✅        | ✅            | ✅       |
| 3.4 | Watermark and Garbage Collection | ✅ | 🚧 | 🚧 |
| 3.5            | Transactions and Optimistic Concurrency Control | ✅        | 🚧           | 🚧      |
| 3.6            | Serializable Snapshot Isolation                 | ✅        | 🚧           | 🚧      |
| 3.7 | Compaction Filter | 🚧 | | |
## License


<!-- ![Chapter Overview](./lsm-tutorial/week2-07-overview.svg) -->
In the previous chapter, you already built a full LSM-based storage engine. At the end of this week, we will implement some easy but important optimizations of the storage engine. Welcome to Mini-LSM's week 2 snack time!
In this chapter, you will:


```rust,no_run
pub struct Key<T: AsRef<[u8]>>(T);
```
...to:
```rust,no_run
pub struct Key<T: AsRef<[u8]>>(T /* user key */, u64 /* timestamp */);
```
...where we have a timestamp associated with the keys. We only use this key representation internally in the system. On the user interface side, we do not ask users to provide a timestamp, and therefore some structures still use `&[u8]` instead of `KeySlice` in the engine. We will cover the places where we need to change the signature of the functions later. For now, you only need to run,


During the refactor, you might need to change the signature of some functions from `&self` to `self: &Arc<Self>` as necessary.
## Task 1: MemTable, Write-Ahead Log, and Read Path
In this task, you will need to modify:
```
src/wal.rs
src/mem_table.rs
src/lsm_storage.rs
```
We have already made most of the keys in the engine `KeySlice`s, which contain a bytes key and a timestamp. However, some parts of our system still do not consider the timestamps. In this first task, you will need to modify your memtable and WAL implementation to take timestamps into account.
You will need to first change the type of the `SkipMap` stored in your memtable.
```rust,no_run
pub struct MemTable {
// map: Arc<SkipMap<Bytes, Bytes>>,
map: Arc<SkipMap<KeyBytes, Bytes>>, // Bytes -> KeyBytes
// ...
}
```
After that, you can continue to fix all compiler errors so as to complete this task.
**MemTable::get**
We keep the `get` interface so that the test cases can still probe a specific version of a key in the memtable. This interface should not be used in your read path after finishing this task. We store `KeyBytes`, which is `(Bytes, u64)`, in the skiplist, while the user probes with a `KeySlice`, which is `(&[u8], u64)`. We have to find a way to convert the latter into a reference to the former so that we can retrieve the data from the skiplist.
To do this, you may use unsafe code to force cast the `&[u8]` to be static and use `Bytes::from_static` to create a bytes object from a static slice. This is sound because `Bytes` will not try to free the memory of the slice as it is assumed static.
<details>
<summary>Spoilers: Convert u8 slice to Bytes</summary>
```rust,no_run
Bytes::from_static(unsafe { std::mem::transmute(key.key_ref()) })
```
</details>
This was not a problem before because we stored `Bytes` and probed with `&[u8]`, and `Bytes` implements `Borrow<[u8]>`.
**MemTable::put**
The signature should be changed to `fn put(&self, key: KeySlice, value: &[u8])`, and you will need to convert a key slice to a `KeyBytes` in your implementation.
**MemTable::scan**
The signature should be changed to `fn scan(&self, lower: Bound<KeySlice>, upper: Bound<KeySlice>) -> MemTableIterator`. You will need to convert `KeySlice` to `KeyBytes` and use these as `SkipMap::range` parameters.
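Since `SkipMap::range` needs bounds of the stored key type, each bound has to be converted element-wise. A minimal sketch of this conversion, using plain tuples as stand-ins for `KeySlice` and `KeyBytes` (the real engine wraps these in key structs):

```rust
use std::ops::Bound;

// Stand-ins for illustration: `(&[u8], u64)` plays the role of `KeySlice`,
// `(Vec<u8>, u64)` the role of `KeyBytes`.
fn map_key_bound(bound: Bound<(&[u8], u64)>) -> Bound<(Vec<u8>, u64)> {
    match bound {
        Bound::Included((key, ts)) => Bound::Included((key.to_vec(), ts)),
        Bound::Excluded((key, ts)) => Bound::Excluded((key.to_vec(), ts)),
        Bound::Unbounded => Bound::Unbounded,
    }
}

fn main() {
    let lower = map_key_bound(Bound::Included((b"a".as_slice(), u64::MAX)));
    assert_eq!(lower, Bound::Included((b"a".to_vec(), u64::MAX)));
    let upper: Bound<(Vec<u8>, u64)> = map_key_bound(Bound::Unbounded);
    assert_eq!(upper, Bound::Unbounded);
    println!("bounds converted");
}
```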
**MemTable::flush**
Instead of using the default timestamp, you should now use the key timestamp when flushing the memtable to the SST.
**MemTableIterator**
It should now store `(KeyBytes, Bytes)` and the return key type should be `KeySlice`.
**Wal::recover** and **Wal::put**
The write-ahead log should now accept a key slice instead of a user key slice. When serializing and deserializing a WAL record, you should write the timestamp into the WAL file and compute the checksum over the timestamp and all other fields you had before.
**LsmStorageInner::get**
Previously, we implemented `get` by first probing the memtables and then scanning the SSTs. Now that the memtable uses the new key-ts APIs, we need to re-implement the `get` interface. The easiest way to do this is to create a merge iterator over everything we have -- memtables, immutable memtables, L0 SSTs, and SSTs on other levels -- the same as what you have done in `scan`, except that we filter SSTs with the bloom filter first.
**LsmStorageInner::scan**
You will need to incorporate the new memtable APIs, and you should set the scan range to `(user_key_begin, TS_RANGE_BEGIN)` and `(user_key_end, TS_RANGE_END)`. Note that when you handle an excluded boundary, you will need to correctly position the iterator at the next user key (instead of another version of the excluded key with a different timestamp).
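To see why this range covers every version of a user key, here is a sketch using a `BTreeMap` with `Reverse<u64>` as a stand-in for the engine's comparator (ascending user key, descending timestamp). The two constants mirror `TS_RANGE_BEGIN`/`TS_RANGE_END` under the assumption that they correspond to the largest and smallest timestamps:

```rust
use std::cmp::Reverse;
use std::collections::BTreeMap;
use std::ops::Bound;

// Assumption for this sketch: TS_RANGE_BEGIN is the largest timestamp and
// TS_RANGE_END the smallest, so `(key, TS_RANGE_BEGIN)` sorts before every
// version of `key` and `(key, TS_RANGE_END)` after every version.
const TS_RANGE_BEGIN: u64 = u64::MAX;
const TS_RANGE_END: u64 = 0;

type VersionedMap = BTreeMap<(Vec<u8>, Reverse<u64>), Vec<u8>>;

// Collect the timestamps of all versions of `key`, newest first.
fn scan_versions(map: &VersionedMap, key: &[u8]) -> Vec<u64> {
    let lower = Bound::Included((key.to_vec(), Reverse(TS_RANGE_BEGIN)));
    let upper = Bound::Included((key.to_vec(), Reverse(TS_RANGE_END)));
    map.range((lower, upper)).map(|(k, _)| (k.1).0).collect()
}

fn main() {
    let mut map = VersionedMap::new();
    map.insert((b"a".to_vec(), Reverse(2)), b"a2".to_vec());
    map.insert((b"a".to_vec(), Reverse(1)), b"a1".to_vec());
    map.insert((b"b".to_vec(), Reverse(3)), b"b3".to_vec());
    // All versions of `a`, newest first; `b` is outside the range.
    assert_eq!(scan_versions(&map, b"a"), vec![2, 1]);
    println!("ok");
}
```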
## Task 2: Write Path
In this task, you will need to modify:
```
src/lsm_storage.rs
```
We have an `mvcc` field in `LsmStorageInner` that includes all data structures we need to use for multi-version concurrency control in this week. When you open a directory and initialize the storage engine, you will need to create that structure.
In your `write_batch` implementation, you will need to obtain a commit timestamp for all keys in a write batch. You can get the timestamp by using `self.mvcc().latest_commit_ts() + 1` at the beginning of the logic, and `self.mvcc().update_commit_ts(ts)` at the end of the logic to increment the next commit timestamp. To ensure all write batches have different timestamps and new keys are placed on top of old keys, you will need to hold a write lock `self.mvcc().write_lock.lock()` at the beginning of the function, so that only one thread can write to the storage engine at the same time.
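The timestamp logic above can be sketched as follows; the `Mvcc` struct here is a simplified stand-in for the engine's real MVCC structure, keeping only the two members the text mentions:

```rust
use std::sync::Mutex;

// Simplified stand-in for the engine's MVCC state.
struct Mvcc {
    write_lock: Mutex<()>,        // serializes writers
    latest_commit_ts: Mutex<u64>, // last committed timestamp
}

impl Mvcc {
    fn new() -> Self {
        Mvcc { write_lock: Mutex::new(()), latest_commit_ts: Mutex::new(0) }
    }
    fn latest_commit_ts(&self) -> u64 {
        *self.latest_commit_ts.lock().unwrap()
    }
    fn update_commit_ts(&self, ts: u64) {
        *self.latest_commit_ts.lock().unwrap() = ts;
    }
}

// Sketch of `write_batch`: one commit timestamp for the whole batch.
fn write_batch(
    mvcc: &Mvcc,
    batch: &[(&[u8], &[u8])],
    out: &mut Vec<(Vec<u8>, u64, Vec<u8>)>,
) -> u64 {
    let _guard = mvcc.write_lock.lock().unwrap(); // only one writer at a time
    let ts = mvcc.latest_commit_ts() + 1;         // commit ts for every key in the batch
    for (key, value) in batch {
        out.push((key.to_vec(), ts, value.to_vec()));
    }
    mvcc.update_commit_ts(ts); // publish the new commit timestamp
    ts
}

fn main() {
    let mvcc = Mvcc::new();
    let mut storage = Vec::new();
    let t1 = write_batch(&mvcc, &[(b"a", b"1"), (b"b", b"2")], &mut storage);
    let t2 = write_batch(&mvcc, &[(b"a", b"3")], &mut storage);
    // Every batch gets a fresh, strictly larger timestamp.
    assert_eq!((t1, t2), (1, 2));
    println!("ok");
}
```

Because the write lock is held across reading and updating `latest_commit_ts`, no two batches can observe the same timestamp.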
## Task 3: MVCC Compaction
In this task, you will need to modify:
```
src/compact.rs
```
In previous chapters, we only kept the latest version of a key, and removed the key entirely when compacting a deleted key to the bottom level. With MVCC, we now have timestamps associated with the keys, and we cannot use the same logic for compaction.
In this chapter, you may simply drop the removal logic. You may ignore `compact_to_bottom_level` for now, and you should keep ALL versions of a key during the compaction.
Also, you will need to implement the compaction algorithm so that all versions of the same key (i.e., the same user key with different timestamps) are placed in the same SST file, *even if* this exceeds the SST size limit. This ensures that if a key is found in an SST on some level, it will not appear in any other SST file on that level, which simplifies many parts of the system.
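A minimal sketch of the splitting rule, with entry counts standing in for byte sizes and plain `(key, ts)` tuples standing in for SST entries:

```rust
// Split compaction output into files, but never split between two versions of
// the same user key. Sizes and types are simplified stand-ins.
fn split_into_ssts(entries: &[(&[u8], u64)], target_per_file: usize) -> Vec<Vec<(Vec<u8>, u64)>> {
    let mut files: Vec<Vec<(Vec<u8>, u64)>> = Vec::new();
    let mut current: Vec<(Vec<u8>, u64)> = Vec::new();
    for (key, ts) in entries {
        let key_changed = current.last().map(|(k, _)| k.as_slice() != *key).unwrap_or(true);
        // Only start a new file on a user-key boundary, even if the current
        // file has already exceeded the target size.
        if current.len() >= target_per_file && key_changed {
            files.push(std::mem::take(&mut current));
        }
        current.push((key.to_vec(), *ts));
    }
    if !current.is_empty() {
        files.push(current);
    }
    files
}

fn main() {
    // Three versions of `a` followed by one version of `b`, target 2 entries/file.
    let entries: Vec<(&[u8], u64)> =
        vec![(b"a".as_slice(), 3), (b"a".as_slice(), 2), (b"a".as_slice(), 1), (b"b".as_slice(), 1)];
    let files = split_into_ssts(&entries, 2);
    // All versions of `a` stay in the first file even though it exceeds the target.
    assert_eq!(files.len(), 2);
    assert_eq!(files[0].len(), 3);
    println!("ok");
}
```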
## Task 4: LSM Iterator
In this task, you will need to modify:
```
src/lsm_iterator.rs
```
In the previous chapter, we implemented the LSM iterator to treat the same key with different timestamps as different keys. Now, we will need to refactor the LSM iterator so that it returns only the latest version of a key when multiple versions are retrieved from the child iterator.
You will need to record a `prev_key` in the iterator. If we have already returned the latest version of a key to the user, we can skip all of its older versions and proceed to the next key.
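The `prev_key` idea can be sketched eagerly over a sorted run of versions (the real iterator does this lazily inside `next`, and also has to handle deletions):

```rust
// Collapse a (key, ts, value) stream, sorted by key ascending / ts descending,
// to only the newest version of each key, tracking `prev_key`.
fn latest_versions(entries: &[(&[u8], u64, &[u8])]) -> Vec<(Vec<u8>, u64)> {
    let mut out = Vec::new();
    let mut prev_key: Option<Vec<u8>> = None;
    for (key, ts, _value) in entries {
        if prev_key.as_deref() == Some(*key) {
            continue; // an older version of a key we already emitted: skip it
        }
        prev_key = Some(key.to_vec());
        out.push((key.to_vec(), *ts));
    }
    out
}

fn main() {
    let entries: Vec<(&[u8], u64, &[u8])> = vec![
        (b"a".as_slice(), 3, b"a3".as_slice()),
        (b"a".as_slice(), 1, b"a1".as_slice()), // older version of `a`, skipped
        (b"b".as_slice(), 2, b"b2".as_slice()),
    ];
    assert_eq!(latest_versions(&entries), vec![(b"a".to_vec(), 3), (b"b".to_vec(), 2)]);
    println!("ok");
}
```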
At this point, you should pass all tests in previous chapters except persistence tests (2.5 and 2.6).
## Test Your Understanding
* What is the difference of `get` in the MVCC engine and the engine you built in week 2?
* In week 2, `get` stops at the first memtable/level where the key is found. Can you do the same in the MVCC version?
* How do you convert `KeySlice` to `&KeyBytes`? Is it a safe/sound operation?
* Why do we need to take a write lock in the write path?
We do not provide reference answers to the questions, and feel free to discuss them in the Discord community.
## Bonus Tasks
* **Early Stop for Memtable Gets**. Instead of creating a merge iterator over all memtables and SSTs, we can implement `get` as follows: If we find a version of a key in the memtable, we can stop searching. The same applies to SSTs.
{{#include copyright.md}}


At the end of the day, your engine will be able to give the user a consistent view of the storage key space.
During the refactor, you might need to change the signature of some functions from `&self` to `self: &Arc<Self>` as necessary.
## Task 1: LSM Iterator with Read Timestamp
The goal of this chapter is to have something like:
```rust,no_run
let snapshot1 = engine.new_txn();
// write something to the engine
let snapshot2 = engine.new_txn();
// write something to the engine
snapshot1.get(/* ... */); // we can retrieve a consistent snapshot of a previous state of the engine
```
To achieve this, we can record the read timestamp (which is the latest committed timestamp) when creating the transaction. When we do a read operation through the transaction, we will only read versions of keys with timestamps less than or equal to the read timestamp.
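The visibility rule can be sketched with a per-key version list (newest first) standing in for the iterator state:

```rust
// Reading at `read_ts` returns the newest version with ts <= read_ts.
// `versions` holds (ts, value) pairs for a single key, newest first.
fn get_at(versions: &[(u64, &[u8])], read_ts: u64) -> Option<Vec<u8>> {
    versions
        .iter()
        .find(|(ts, _)| *ts <= read_ts) // first visible version is the newest one
        .map(|(_, v)| v.to_vec())
}

fn main() {
    // The key was written at ts=1 and overwritten at ts=3.
    let versions: Vec<(u64, &[u8])> = vec![(3, b"new".as_slice()), (1, b"old".as_slice())];
    assert_eq!(get_at(&versions, 2), Some(b"old".to_vec())); // snapshot taken at ts=2
    assert_eq!(get_at(&versions, 3), Some(b"new".to_vec()));
    assert_eq!(get_at(&versions, 0), None); // key did not exist yet
    println!("ok");
}
```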
In this task, you will need to modify:
```
src/lsm_iterator.rs
```
To do this, you will need to record a read timestamp in `LsmIterator`.
```rust,no_run
impl LsmIterator {
pub(crate) fn new(
iter: LsmIteratorInner,
end_bound: Bound<Bytes>,
read_ts: u64,
) -> Result<Self> {
// ...
}
}
```
And you will need to change your LSM iterator `next` logic to find the correct key.
## Task 2: Multi-Version Scan and Get
In this task, you will need to modify:
```
src/mvcc.rs
src/mvcc/txn.rs
src/lsm_storage.rs
```
Now that we have `read_ts` in the LSM iterator, we can implement `scan` and `get` on the transaction structure, so that we can read the data in the storage engine as of a given point in time.
We recommend creating helper functions like `scan_with_ts(/* original parameters */, read_ts: u64)` and `get_with_ts` in your `LsmStorageInner` structure if necessary. The original `get`/`scan` on the storage engine should be implemented by creating a transaction (snapshot) and doing a `get`/`scan` over that transaction. The call path would look like:
```
LsmStorageInner::scan -> new_txn and Transaction::scan -> LsmStorageInner::scan_with_ts
```
To create a transaction in `LsmStorageInner::scan`, we will need to provide an `Arc<LsmStorageInner>` to the transaction constructor. Therefore, we can change the signature of `scan` to take `self: &Arc<Self>` instead of simply `&self`, so that we can create a transaction with `let txn = self.mvcc().new_txn(self.clone(), /* ... */)`.
You will also need to change your `scan` function to return a `TxnIterator`. We must ensure the snapshot stays alive while the user iterates the engine, and therefore `TxnIterator` stores the snapshot object. Inside `TxnIterator`, we can store a `FusedIterator<LsmIterator>` for now. We will change it to something else later when we implement OCC.
You do not need to implement `Transaction::put/delete` for now, and all modifications will still go through the engine.
## Task 3: Store Largest Timestamp in SST
In this task, you will need to modify:
```
src/table.rs
src/table/builder.rs
```
In your SST encoding, you should store the largest timestamp after the block metadata, and recover it when loading the SST. This helps the engine determine the latest commit timestamp when recovering.
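A sketch of the idea, assuming the timestamp is appended as 8 big-endian bytes that can be read back from the end of the file (the real SST layout also records block meta offsets, bloom filters, and checksums, all omitted here):

```rust
// Toy layout: the largest timestamp is the last 8 bytes of the encoded SST.
fn append_max_ts(buf: &mut Vec<u8>, max_ts: u64) {
    buf.extend_from_slice(&max_ts.to_be_bytes());
}

fn read_max_ts(buf: &[u8]) -> u64 {
    let tail = &buf[buf.len() - 8..];
    u64::from_be_bytes(tail.try_into().unwrap())
}

fn main() {
    let mut sst = b"blocks and metadata...".to_vec(); // pretend-encoded SST body
    append_max_ts(&mut sst, 42);
    assert_eq!(read_max_ts(&sst), 42); // round-trips on load
    println!("ok");
}
```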
## Task 4: Recover Commit Timestamp
Now that we have largest timestamp information in the SSTs and timestamp information in the WAL, we can obtain the largest timestamp committed before the engine starts, and use that timestamp as the latest committed timestamp when creating the `mvcc` object.
If WAL is not enabled, you can simply compute the latest committed timestamp by finding the largest timestamp among SSTs. If WAL is enabled, you should further iterate all recovered memtables and find the largest timestamp.
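The recovery rule is a plain maximum over the recovered timestamps. A sketch, where the slices are hypothetical stand-ins for what you collect from SST metadata and recovered memtables:

```rust
// The latest committed timestamp is the max over per-SST maxima and, if WAL is
// enabled, the timestamps of recovered memtable entries.
fn recover_latest_ts(sst_max_ts: &[u64], memtable_ts: &[u64]) -> u64 {
    sst_max_ts
        .iter()
        .chain(memtable_ts.iter())
        .copied()
        .max()
        .unwrap_or(0) // fresh engine: start from timestamp 0
}

fn main() {
    assert_eq!(recover_latest_ts(&[5, 9, 3], &[]), 9);       // WAL disabled
    assert_eq!(recover_latest_ts(&[5, 9, 3], &[12, 7]), 12); // WAL enabled
    assert_eq!(recover_latest_ts(&[], &[]), 0);
    println!("ok");
}
```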
In this task, you will need to modify:
```
src/lsm_storage.rs
```
We do not have test cases for this section. You should pass all persistence tests from previous chapters (including 2.5 and 2.6) after finishing this section.
## Test Your Understanding
We do not provide reference answers to the questions, and feel free to discuss them in the Discord community.
{{#include copyright.md}}