@@ -168,7 +168,13 @@ The implementation should be similar to simple leveled compaction. Remember to c
* Is it true that with a lower `level_size_multiplier`, you can always get a lower write amplification?
* What needs to be done if a user not using compaction at all decides to migrate to leveled compaction?
* Some people propose to do intra-L0 compaction (compact L0 tables and still put them back in L0) before pushing them to lower levels. What might be the benefits of doing so? (Might be related: [PebblesDB SOSP'17](https://www.cs.utexas.edu/~rak/papers/sosp17-pebblesdb.pdf))
* Consider the case that the upper level has two tables of `[100, 200], [201, 300]` and the lower level has `[50, 150], [151, 250], [251, 350]`. In this case, do you still want to compact one file in the upper level at a time? Why?

We do not provide reference answers to the questions; feel free to discuss them in the Discord community.

## Bonus Tasks
* **SST Ingestion.** A common optimization for data migration / batch import in LSM trees is to ask the upstream system to generate SST files of its data, and then place these files directly into the LSM state without going through the write path (a sketch follows after this list).
* **SST Selection.** Instead of always selecting the oldest SST, you may think of other heuristics for choosing the SST to compact (see the second sketch below).
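
As mentioned in the **SST Ingestion** task above, ingestion boils down to registering an externally built file in the LSM state. Below is a minimal sketch of that idea; `SstMeta`, `LsmState`, and `ingest` are made-up names rather than the tutorial's actual types, and the sketch ignores id allocation and manifest logging:

```rust
/// Hypothetical, simplified view of the LSM state: `levels[0]` is L0 and may
/// contain overlapping tables; level 1 and below are sorted, non-overlapping runs.
struct SstMeta {
    first_key: Vec<u8>,
    last_key: Vec<u8>,
}

struct LsmState {
    levels: Vec<Vec<SstMeta>>,
}

impl LsmState {
    /// Place an externally built SST at the deepest level whose tables do not
    /// overlap its key range, so nothing has to be rewritten and reads stay
    /// correct under the non-overlapping invariant.
    fn ingest(&mut self, sst: SstMeta) {
        let target = self
            .levels
            .iter()
            .enumerate()
            .skip(1) // levels 1+ must stay non-overlapping
            .rev()
            .find(|(_, run)| {
                run.iter()
                    .all(|t| sst.last_key < t.first_key || sst.first_key > t.last_key)
            })
            .map(|(idx, _)| idx);
        match target {
            Some(idx) => {
                self.levels[idx].push(sst);
                self.levels[idx].sort_by(|a, b| a.first_key.cmp(&b.first_key));
            }
            // Every level overlaps the new file: fall back to L0, where
            // overlapping tables are allowed.
            None => self.levels[0].push(sst),
        }
    }
}
```

A real implementation would also assign the file an SST id, sync it to disk, and record the state change in the manifest so the ingestion survives restarts.
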
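
For the **SST Selection** task, one possible heuristic (sketched with hypothetical types, and assuming every SST has a non-zero size) is to pick the upper-level SST that overlaps the fewest lower-level bytes relative to its own size, which tends to reduce write amplification per compaction:

```rust
struct SstInfo {
    first_key: Vec<u8>,
    last_key: Vec<u8>,
    size_bytes: u64,
}

/// Pick the upper-level SST with the smallest ratio of overlapping bytes in
/// the lower level to its own size: compacting it rewrites the least data
/// for each byte removed from the upper level.
fn pick_sst_to_compact<'a>(upper: &'a [SstInfo], lower: &[SstInfo]) -> Option<&'a SstInfo> {
    let cost = |sst: &SstInfo| -> f64 {
        let overlap: u64 = lower
            .iter()
            .filter(|t| !(t.last_key < sst.first_key || t.first_key > sst.last_key))
            .map(|t| t.size_bytes)
            .sum();
        overlap as f64 / sst.size_bytes as f64
    };
    upper
        .iter()
        .min_by(|a, b| cost(a).partial_cmp(&cost(b)).unwrap())
}
```

A production version would cache the overlap sizes instead of recomputing them for every comparison, and might mix in other signals such as file age or tombstone density.
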
{{#include copyright.md}}
@@ -108,6 +108,7 @@ We do not have test cases for this section. You should pass all persistence test
* So far, we have assumed that our SST files use a monotonically increasing id as the file name. Is it okay to use `<level>_<begin_key>_<end_key>_<max_ts>.sst` as the SST file name? What might be the potential problems with that?
* Consider an alternative implementation of transaction/snapshot. In our implementation, we have `read_ts` in our iterators and transaction context, so that the user can always access a consistent view of one version of the database based on the timestamp. Is it viable to store the current LSM state directly in the transaction context in order to gain a consistent snapshot? (i.e., all SST ids, their level information, and all memtables + ts) What are the pros/cons of that? What if the engine does not have memtables? What if the engine is running on a distributed storage system like S3 object store?
* Consider that you are implementing a backup utility of the MVCC Mini-LSM engine. Is it enough to simply copy all SST files out without backing up the LSM state? Why or why not?

We do not provide reference answers to the questions; feel free to discuss them in the Discord community.
@@ -70,6 +70,11 @@ d@3=del (can be removed if compacting to bottom-most level)
Assume these are all keys in the engine. If we do a scan at ts=3, we will get `a=3,b=1,c=4` before/after compaction. If we do a scan at ts=4, we will get `b=1,c=4` before/after compaction. Compaction *will not* and *should not* affect transactions with read timestamp >= watermark.
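
The retention rule behind this guarantee can be sketched as a per-key filter; `versions` is assumed to hold all versions of one key, newest first, with `None` standing for a delete tombstone (illustrative names, not the tutorial's actual interfaces):

```rust
/// Keep every version above the watermark, plus the newest version at or
/// below it; older versions below the watermark are garbage.
fn retain_versions(
    versions: &[(u64, Option<Vec<u8>>)],
    watermark: u64,
    bottom_most: bool,
) -> Vec<(u64, Option<Vec<u8>>)> {
    let mut kept = Vec::new();
    let mut below_watermark_kept = false;
    for (ts, value) in versions {
        if *ts > watermark {
            // A transaction with read_ts >= watermark may still read this.
            kept.push((*ts, value.clone()));
        } else if !below_watermark_kept {
            below_watermark_kept = true;
            // The newest version at or below the watermark is what readers at
            // the watermark see; a tombstone here can only be dropped when
            // compacting to the bottom-most level.
            if !(value.is_none() && bottom_most) {
                kept.push((*ts, value.clone()));
            }
        }
    }
    kept
}
```

Running this over the `d` key with `watermark = 3` (as the example's scans imply) and `bottom_most = true` drops the `d@3=del` tombstone, matching the annotation above.
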
## Test Your Understanding
* In our implementation, we manage watermarks by ourselves with the lifecycle of `Transaction` (so-called un-managed mode). If the user intends to manage key timestamps and the watermarks by themselves (i.e., when they have their own timestamp generator), what do you need to do in the `write_batch`/`get`/`scan` APIs to validate their requests? Is there any architectural assumption we had that might be hard to maintain in this case?
* Why do we need to store an `Arc` of `Transaction` inside a transaction iterator?

## Bonus Tasks
* **O(1) Watermark.** You may implement an amortized O(1) watermark structure by using a hash map or a cyclic queue (see the sketch below).
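
A sketch of the cyclic-queue flavor follows. It leans on one assumption that un-managed mode satisfies: read timestamps are handed out in non-decreasing order, so a queue of distinct timestamps stays sorted on its own:

```rust
use std::collections::hash_map::Entry;
use std::collections::{HashMap, VecDeque};

#[derive(Default)]
struct Watermark {
    counts: HashMap<u64, usize>, // active readers per timestamp
    order: VecDeque<u64>,        // distinct timestamps, oldest at the front
}

impl Watermark {
    fn add_reader(&mut self, ts: u64) {
        match self.counts.entry(ts) {
            Entry::Occupied(mut e) => *e.get_mut() += 1,
            Entry::Vacant(e) => {
                e.insert(1);
                self.order.push_back(ts); // timestamps arrive in order
            }
        }
    }

    fn remove_reader(&mut self, ts: u64) {
        *self.counts.get_mut(&ts).expect("unknown ts") -= 1;
        // Pop fully released timestamps from the front. Every timestamp is
        // pushed and popped at most once, so operations are amortized O(1).
        while let Some(&front) = self.order.front() {
            if self.counts[&front] > 0 {
                break;
            }
            self.counts.remove(&front);
            self.order.pop_front();
        }
    }

    /// The lowest read timestamp still in use, if any reader is active.
    fn watermark(&self) -> Option<u64> {
        self.order.front().copied()
    }
}
```

`watermark()` is O(1) because the front of the queue is always the smallest timestamp that still has an active reader.
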
@@ -44,5 +44,10 @@ Your commit implementation should simply collect all key-value pairs from the lo

## Test Your Understanding
* With all the things we have implemented up to this point, does the system satisfy snapshot isolation? If not, what else do we need to do to support snapshot isolation? (Note: snapshot isolation is different from serializable snapshot isolation we will talk about in the next chapter)
* What if the user wants to batch import data (e.g., 1TB)? If they use the transaction API to do that, what advice would you give them? Is there any opportunity to optimize for this case?

## Bonus Tasks
* **Spill to Disk.** If the private workspace of a transaction gets too large, you may flush some of the data to disk (see the sketch below).
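
The general shape could look like the sketch below, with made-up names and a made-up on-disk encoding: keep the private workspace in memory up to a byte budget, and flush it as a sorted run once it grows past that. Commit would then merge `mem` with all runs, later runs winning on duplicate keys:

```rust
use std::collections::BTreeMap;
use std::fs::File;
use std::io::{self, BufWriter, Write};
use std::path::PathBuf;

struct SpillableWorkspace {
    mem: BTreeMap<Vec<u8>, Vec<u8>>, // in-memory portion of the workspace
    mem_bytes: usize,
    budget: usize, // spill once `mem` holds more than this many bytes
    spill_dir: PathBuf,
    runs: Vec<PathBuf>, // sorted runs already flushed to disk, oldest first
}

impl SpillableWorkspace {
    fn put(&mut self, key: Vec<u8>, value: Vec<u8>) -> io::Result<()> {
        self.mem_bytes += key.len() + value.len();
        self.mem.insert(key, value);
        if self.mem_bytes > self.budget {
            self.spill()?;
        }
        Ok(())
    }

    /// Flush the in-memory map as one sorted run of length-prefixed
    /// key/value pairs, then reset the in-memory state.
    fn spill(&mut self) -> io::Result<()> {
        let path = self.spill_dir.join(format!("run-{}.tmp", self.runs.len()));
        let mut w = BufWriter::new(File::create(&path)?);
        for (k, v) in &self.mem {
            w.write_all(&(k.len() as u32).to_le_bytes())?;
            w.write_all(k)?;
            w.write_all(&(v.len() as u32).to_le_bytes())?;
            w.write_all(v)?;
        }
        w.flush()?;
        self.runs.push(path);
        self.mem.clear();
        self.mem_bytes = 0;
        Ok(())
    }
}
```

Note that `mem_bytes` only approximates memory use (it ignores overwritten keys), which is usually acceptable for a spill threshold.
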
{{#include copyright.md}}