more interesting questions

Signed-off-by: Alex Chi <iskyzh@gmail.com>
2024-05-21 23:47:50 -04:00
parent 7d69cab60b
commit 14518aa7a8
4 changed files with 5 additions and 0 deletions
--- a/mini-lsm-book/src/week2-05-manifest.md
+++ b/mini-lsm-book/src/week2-05-manifest.md
@@ -90,6 +90,8 @@ get 1500
 * When do you need to call `fsync`? Why do you need to fsync the directory?
 * What are the places you will need to write to the manifest?
 * Consider an alternative implementation of an LSM engine that does not use a manifest file. Instead, it records the level/tier information in the header of each file, scans the storage directory every time it restarts, and recovers the LSM state solely from the files present in the directory. Is it possible to correctly maintain the LSM state in this implementation, and what might be the problems/challenges with that?
+* Currently, we create all SST/concat iterators before creating the merge iterator, which means that we have to load the first block of the first SST in every level into memory before scanning starts. We have the start/end key of each SST in the manifest. Is it possible to leverage this information to delay loading the data blocks and return the first key-value pair faster?
+* Is it possible not to store the tier/level information in the manifest? That is, we store only the list of SSTs in the manifest, without the level information, and rebuild the tiers/levels from the key range and timestamp information (SST metadata).

 ## Bonus Tasks