update toc for week 2

Signed-off-by: Alex Chi <iskyzh@gmail.com>
Alex Chi
2024-01-22 01:10:50 +08:00
parent fa27def116
commit d8dd95a1d6
8 changed files with 50 additions and 2 deletions


@@ -8,6 +8,14 @@ In this chapter, you will:
* Implement the logic to update the LSM states and manage SST files on the filesystem.
* Update LSM read path to incorporate the LSM levels.
## Task 1: Compaction Implementation
## Task 2: Update the LSM State
## Task 3: Concat Iterator
## Task 4: Integrate with the Read Path
## Test Your Understanding
* What are the definitions of read/write/space amplifications? (This is covered in the overview chapter)


@@ -7,6 +7,12 @@ In this chapter, you will:
* Implement a simple leveled compaction strategy and simulate it on the compaction simulator.
* Start compaction as a background task and implement a compaction trigger in the system.
## Task 1: Simple Level Compaction
## Task 2: Compaction Simulation
## Task 3: Integrate with the Read Path
## Test Your Understanding
* Is it correct that a key will only be purged from the LSM tree if the user requests to delete it and it has been compacted in the bottom-most level?


@@ -9,6 +9,12 @@ In this chapter, you will:
The tiered compaction we talk about in this chapter is the same as RocksDB's universal compaction. We will use these two terms interchangeably.
## Task 1: Universal Compaction
## Task 2: Compaction Simulation
## Task 3: Integrate with the Read Path
## Test Your Understanding
* What are the pros/cons of universal compaction compared with simple leveled/tiered compaction?


@@ -7,6 +7,12 @@ In this chapter, you will:
* Implement a leveled compaction strategy and simulate it on the compaction simulator.
* Incorporate leveled compaction strategy into the system.
## Task 1: Leveled Compaction
## Task 2: Compaction Simulation
## Task 3: Integrate with the Read Path
## Test Your Understanding
* Can finding a good key split point for compaction reduce write amplification, or does it not matter at all?


@@ -7,4 +7,10 @@ In this chapter, you will:
* Implement encoding and decoding of the manifest file.
* Recover from the manifest when the system restarts.
## Task 1: Manifest Encoding
## Task 2: Write Manifests
## Task 3: Recover from the State
{{#include copyright.md}}
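The tasks above can be sketched end to end with a toy record codec. The layout below (a one-byte tag followed by a little-endian id) is a hypothetical format for illustration, not necessarily the chapter's exact encoding: every state change is appended to the manifest as one record, and on restart the records are decoded and replayed in order.

```rust
// Hypothetical manifest record types: a memtable flush producing an SST,
// and the creation of a new memtable. Tags and layout are assumptions.
#[derive(Debug, PartialEq)]
enum ManifestRecord {
    Flush(usize),       // memtable with this id was flushed to an SST
    NewMemtable(usize), // a new memtable with this id was created
}

fn encode(record: &ManifestRecord) -> Vec<u8> {
    // 1-byte tag + 8-byte little-endian id
    let (tag, id) = match record {
        ManifestRecord::Flush(id) => (0u8, *id as u64),
        ManifestRecord::NewMemtable(id) => (1u8, *id as u64),
    };
    let mut buf = vec![tag];
    buf.extend_from_slice(&id.to_le_bytes());
    buf
}

fn decode(buf: &[u8]) -> ManifestRecord {
    let id = u64::from_le_bytes(buf[1..9].try_into().unwrap()) as usize;
    match buf[0] {
        0 => ManifestRecord::Flush(id),
        _ => ManifestRecord::NewMemtable(id),
    }
}
```

Recovery then becomes a fold over the decoded records to rebuild the LSM state before opening the engine for requests.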


@@ -7,6 +7,12 @@ In this chapter, you will:
* Implement encoding and decoding of the write-ahead log file.
* Recover memtables from the WALs when the system restarts.
## Task 1: WAL Encoding
## Task 2: Write WALs
## Task 3: Recover from the WALs
## Test Your Understanding
* When can you tell the user that their modifications (put/delete) have been persisted?
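A minimal sketch of a WAL entry layout, under the assumption of a simple length-prefixed format (the chapter's actual wire format may differ): each write is appended as `key_len | key | value_len | value`, and a delete can be represented as a put with an empty value. On restart, the file is scanned entry by entry to rebuild the memtable.

```rust
// Hypothetical WAL entry: u16 length prefixes, little-endian.
fn encode_entry(key: &[u8], value: &[u8]) -> Vec<u8> {
    let mut buf = Vec::new();
    buf.extend_from_slice(&(key.len() as u16).to_le_bytes());
    buf.extend_from_slice(key);
    buf.extend_from_slice(&(value.len() as u16).to_le_bytes());
    buf.extend_from_slice(value);
    buf
}

fn decode_entry(buf: &[u8]) -> (Vec<u8>, Vec<u8>) {
    let klen = u16::from_le_bytes(buf[0..2].try_into().unwrap()) as usize;
    let key = buf[2..2 + klen].to_vec();
    let voff = 2 + klen;
    let vlen = u16::from_le_bytes(buf[voff..voff + 2].try_into().unwrap()) as usize;
    let value = buf[voff + 2..voff + 2 + vlen].to_vec();
    (key, value)
}
```

Note that appending an entry alone does not answer the persistence question: the bytes are only durable once the file has been synced to disk, so the answer to the user can be given only after the sync.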


@@ -9,6 +9,16 @@ In this chapter, you will:
* Implement the batch write interface.
* Add checksums to the blocks, SST metadata, manifest, and WALs.
## Task 1: Write Batch Interface
## Task 2: Block Checksum
## Task 3: SST Checksum
## Task 4: WAL Checksum
## Task 5: Manifest Checksum
## Test Your Understanding
* Consider the case that an LSM storage engine only provides `write_batch` as the write interface (instead of single put + delete). Is it possible to implement it as follows: there is a single write thread with an mpsc channel receiver to get the changes, and all threads send write batches to the write thread. The write thread is the single point to write to the database. What are the pros/cons of this implementation? (Congrats: if you do this, you get BadgerDB!)
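The single-writer design the question describes can be sketched in a few lines (a simplified illustration with an in-memory map standing in for the engine, not the course's reference solution): all threads send batches over an mpsc channel, and one thread applies them in arrival order, so each batch is applied atomically with respect to other batches.

```rust
use std::collections::BTreeMap;
use std::sync::mpsc;
use std::thread;

// One operation inside a write batch.
enum Op {
    Put(Vec<u8>, Vec<u8>),
    Delete(Vec<u8>),
}

// The single writer: drains batches from the channel and applies them
// in order. Returns the final store once all senders are dropped.
fn run_writer(rx: mpsc::Receiver<Vec<Op>>) -> BTreeMap<Vec<u8>, Vec<u8>> {
    let mut store = BTreeMap::new();
    for batch in rx {
        for op in batch {
            match op {
                Op::Put(k, v) => {
                    store.insert(k, v);
                }
                Op::Delete(k) => {
                    store.remove(&k);
                }
            }
        }
    }
    store
}
```

The upside is that writes need no locking on the store itself; the downside is that the single thread serializes all writes and can become the bottleneck, and callers that need to wait for durability must get an acknowledgment back from the writer.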


@@ -39,7 +39,7 @@ SST 6: key range 06000 - key 10010, 1000 keys
The 3 new SSTs are created by merging SST 1, 2, and 3. We can get a sorted run of 3000 keys and then split it into 3 files, so as to avoid having a super large SST file. Now our LSM state has 3 non-overlapping SSTs, and we only need to access SST 4 to find key 02333.
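The merge-and-split step above can be sketched with in-memory vectors standing in for SSTs (a simplification: real SSTs are iterated and merged streaming, and version resolution is more involved than a `dedup`):

```rust
// Full compaction sketch: merge overlapping runs into one sorted
// sequence, then split into fixed-size chunks so no output SST
// grows too large. Keys are u32s for illustration.
fn full_compaction(runs: Vec<Vec<u32>>, sst_size: usize) -> Vec<Vec<u32>> {
    let mut merged: Vec<u32> = runs.into_iter().flatten().collect();
    merged.sort_unstable();
    merged.dedup(); // keep one version per key (simplified)
    merged.chunks(sst_size).map(|c| c.to_vec()).collect()
}
```

The outputs are non-overlapping by construction, which is exactly what lets the read path probe a single SST per level.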
## Two Extremes of Compaction and Write Amplification
So from the above example, we have 2 naive ways of handling the LSM structure -- not doing compactions at all, and always doing full compaction when new SSTs are flushed.
@@ -59,7 +59,7 @@ Compaction strategies usually aim to control the number of sorted runs, so as to
In leveled compaction, the user can specify a maximum number of levels, which is the number of sorted runs in the system (except L0). For example, RocksDB usually keeps 6 levels (sorted runs) in leveled compaction mode. During the compaction process, SSTs from two adjacent levels will be merged and then the produced SSTs will be put to the lower level of the two levels. The sorted runs (levels) grow exponentially in size -- the lower level will be < some number x > times the size of the upper level.
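The exponential growth is easy to make concrete. Assuming a base level size and a growth ratio x (both numbers below are illustrative, not RocksDB defaults for any particular workload), the target size of level i is simply base times x to the i:

```rust
// Target sizes for each level under an exponential growth policy.
// base_mb and ratio are assumed parameters for illustration.
fn level_targets(base_mb: u64, ratio: u64, levels: usize) -> Vec<u64> {
    (0..levels).map(|i| base_mb * ratio.pow(i as u32)).collect()
}
```

With a ratio of 10 and 6 levels, the bottom level holds the vast majority of the data, which is why most lookups that reach the levels terminate there.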
In tiered compaction, the engine dynamically adjusts the number of sorted runs by merging them or letting new SSTs be flushed as a new sorted run (a tier), so as to minimize write amplification. The number of tiers can be high if the compaction strategy does not choose to merge tiers, therefore making read amplification high. In this tutorial, we will implement RocksDB's universal compaction, which is a kind of tiered compaction strategy.
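Two of the triggers such a strategy typically uses can be sketched as follows (a simplification of universal compaction; the parameter names are assumptions, not RocksDB's exact option names): compact when there are too many tiers, or when the estimated space amplification -- the total size of all tiers except the last, divided by the last (largest) tier -- exceeds a percentage threshold.

```rust
// Tiered compaction trigger sketch. tier_sizes is ordered from the
// newest tier to the oldest (largest) tier.
fn should_compact(tier_sizes: &[u64], max_tiers: usize, max_space_amp_percent: u64) -> bool {
    // Trigger 1: too many sorted runs hurts read amplification.
    if tier_sizes.len() > max_tiers {
        return true;
    }
    // Trigger 2: estimated space amplification is too high.
    if let Some((&last, rest)) = tier_sizes.split_last() {
        if last > 0 && rest.iter().sum::<u64>() * 100 / last >= max_space_amp_percent {
            return true;
        }
    }
    false
}
```

The compaction simulator in this chapter is a good place to observe how these thresholds trade write amplification against read and space amplification.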
## Space Amplification