add w1d1 and update starter code
Signed-off-by: Alex Chi <iskyzh@gmail.com>
		| @@ -10,8 +10,128 @@ In this chapter, you will: | |||||||
|  |  | ||||||
| ## Task 1: SkipList Memtable | ## Task 1: SkipList Memtable | ||||||
|  |  | ||||||
| ## Task 2: Write Path - Freezing a Memtable | In this task, you will need to modify: | ||||||
|  |  | ||||||
| ## Task 3: Read Path - Get | ``` | ||||||
|  | src/mem_table.rs | ||||||
|  | ``` | ||||||
|  |  | ||||||
|  | First, let us implement the in-memory structure of an LSM storage engine: the memtable. We choose [crossbeam's skiplist implementation](link) as the data structure of the memtable because it supports lock-free concurrent reads and writes. We will not cover in depth how a skiplist works; in a nutshell, it is an ordered key-value map that easily allows concurrent reads and writes. | ||||||
|  |  | ||||||
|  | crossbeam-skiplist provides interfaces similar to those of the Rust standard library's `BTreeMap`: insert, get, and iter. The only difference is that the modification interfaces (i.e., `insert`) only require an immutable reference to the skiplist, instead of a mutable one. Therefore, you should not need to take any mutex when implementing the memtable structure. | ||||||
|  |  | ||||||
|  | You will also notice that the `MemTable` structure does not have a `delete` interface. In the mini-lsm implementation, deletion is represented as a key corresponding to an empty value. | ||||||
|  |  | ||||||
|  | In this task, you will need to implement `MemTable::get` and `MemTable::put` to enable modifications of the memtable. | ||||||
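To make the shape of the interface concrete, here is a minimal sketch using only the standard library. `SketchMemTable` is a hypothetical name, and the `RwLock<BTreeMap>` inside it is an assumption made only so the example compiles without external crates; the actual memtable wraps crossbeam's `SkipMap`, which needs no lock at all. The point is that `put` and `get` both take `&self`, and that a deletion is stored as an empty value.

```rust
use std::collections::BTreeMap;
use std::sync::RwLock;

/// Hypothetical stand-in for the memtable. The real one wraps a lock-free
/// skiplist; the `RwLock` here only exists because std has no such map.
pub struct SketchMemTable {
    map: RwLock<BTreeMap<Vec<u8>, Vec<u8>>>,
}

impl SketchMemTable {
    pub fn create() -> Self {
        Self {
            map: RwLock::new(BTreeMap::new()),
        }
    }

    /// Takes `&self`, mirroring an insert through a shared reference.
    pub fn put(&self, key: &[u8], value: &[u8]) {
        self.map
            .write()
            .unwrap()
            .insert(key.to_vec(), value.to_vec());
    }

    /// Returns the stored value; an empty value marks a deleted key.
    pub fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.map.read().unwrap().get(key).cloned()
    }
}
```

With this shape, overwriting a key simply replaces its value, and "deleting" a key is a `put` with an empty slice.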
|  |  | ||||||
|  | ## Task 2: A Single Memtable in the Engine | ||||||
|  |  | ||||||
|  | In this task, you will need to modify: | ||||||
|  |  | ||||||
|  | ``` | ||||||
|  | src/lsm_storage.rs | ||||||
|  | src/mem_table.rs | ||||||
|  | ``` | ||||||
|  |  | ||||||
|  | Now, we will add our first data structure, the memtable, to the LSM state. In `LsmStorageState::create`, you will find that when an LSM structure is created, we initialize a memtable with id 0. This is the **mutable memtable** in the initial state. At any point in time, the engine has exactly one mutable memtable. A memtable usually has a size limit (e.g., 256MB), and it will be frozen to an immutable memtable when it reaches the size limit. | ||||||
|  |  | ||||||
|  | Taking a look at `lsm_storage.rs`, you will find there are two structures that represent a storage engine: `MiniLsm` and `LsmStorageInner`. `MiniLsm` is a thin wrapper for `LsmStorageInner`. You will implement most of the functionality in `LsmStorageInner`, until week 2 compaction. | ||||||
|  |  | ||||||
|  | `LsmStorageState` stores the current structure of the LSM storage engine. For now, we will only use the `memtable` field, which stores the current memtable. In this task, you will need to implement `LsmStorageInner::get`, `LsmStorageInner::put`, and `LsmStorageInner::delete`. All of them should directly dispatch the request to the current memtable. | ||||||
|  |  | ||||||
|  | Your `delete` implementation should simply put an empty slice for that key, and we call it a *delete tombstone*. Your `get` implementation should handle this case correspondingly. | ||||||
|  |  | ||||||
|  | To access the memtable, you will need to take the `state` lock. As our memtable implementation only requires an immutable reference for `put`, you ONLY need to take the read lock on `state` in order to modify the memtable. This allows concurrent access to the memtable from multiple threads. | ||||||
|  |  | ||||||
|  | ## Task 3: Write Path - Freezing a Memtable | ||||||
|  |  | ||||||
|  | A memtable cannot grow indefinitely, so we need to freeze it (and later flush it to disk) when it reaches the size limit. You can find the memtable size limit, which is equal to the SST size limit, in `LsmStorageOptions`. This is not a hard limit; you should freeze the memtable on a best-effort basis. | ||||||
|  |  | ||||||
|  | In this task, you will need to compute the approximate memtable size when putting or deleting a key in the memtable. This can be computed by simply adding the total number of bytes of keys and values when `put` is called. If a key is put twice, though the skiplist only contains the latest value, you may count it twice in the approximate memtable size. Once a memtable reaches the limit, you should call `force_freeze_memtable` to freeze the memtable and create a new one. | ||||||
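The size accounting described above can be sketched with an atomic counter. `SizeTracker` and `record_put` are hypothetical names used only for illustration; in the starter code the counter is the `approximate_size` field of `MemTable`. This is one possible way to do it under those assumptions, not the reference solution.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical helper that tracks the approximate number of bytes written.
pub struct SizeTracker {
    approximate_size: AtomicUsize,
}

impl SizeTracker {
    pub fn new() -> Self {
        Self {
            approximate_size: AtomicUsize::new(0),
        }
    }

    /// Records one `put` and returns the new estimate. Overwrites are
    /// counted again on purpose: the figure is an upper-bound estimate.
    pub fn record_put(&self, key: &[u8], value: &[u8]) -> usize {
        let added = key.len() + value.len();
        self.approximate_size.fetch_add(added, Ordering::Relaxed) + added
    }

    pub fn approximate_size(&self) -> usize {
        self.approximate_size.load(Ordering::Relaxed)
    }
}
```

A delete is a `put` with an empty value, so it still adds the key's length to the estimate.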
|  |  | ||||||
|  | Because there could be multiple threads getting data into the storage engine, `force_freeze_memtable` might be called concurrently from multiple threads. You will need to think about how to avoid race conditions in this case. | ||||||
|  |  | ||||||
|  | There are multiple places where you may want to modify the LSM state: freeze a mutable memtable, flush memtable to SST, and GC/compaction. During all of these modifications, there could be I/O operations. An intuitive way to structure the locking strategy is to: | ||||||
|  |  | ||||||
|  | ```rust,no_run | ||||||
|  | fn freeze_memtable(&self) { | ||||||
|  |     let mut state = self.state.write(); | ||||||
|  |     state.imm_memtables.insert(0, /* something */); | ||||||
|  |     state.memtable = MemTable::create(); | ||||||
|  | } | ||||||
|  | ``` | ||||||
|  |  | ||||||
|  | ...that is, you modify everything while holding the LSM state's write lock. | ||||||
|  |  | ||||||
|  | This works fine for now. However, consider the case where you want to create a write-ahead log file for every memtable you create. | ||||||
|  |  | ||||||
|  | ```rust,no_run | ||||||
|  | fn freeze_memtable(&self) -> Result<()> { | ||||||
|  |     let mut state = self.state.write(); | ||||||
|  |     state.imm_memtables.insert(0, /* something */); | ||||||
|  |     state.memtable = MemTable::create_with_wal()?; // <- could take several milliseconds | ||||||
|  |     Ok(()) | ||||||
|  | } | ||||||
|  | ``` | ||||||
|  |  | ||||||
|  | Now when we freeze the memtable, no other thread can access the LSM state for several milliseconds, which creates a latency spike. | ||||||
|  |  | ||||||
|  | To solve this problem, we can move I/O operations outside of the lock region. | ||||||
|  |  | ||||||
|  | ```rust,no_run | ||||||
|  | fn freeze_memtable(&self) -> Result<()> { | ||||||
|  |     let memtable = MemTable::create_with_wal()?; // <- could take several milliseconds | ||||||
|  |     { | ||||||
|  |         let mut state = self.state.write(); | ||||||
|  |         state.imm_memtables.insert(0, /* something */); | ||||||
|  |         state.memtable = memtable; | ||||||
|  |     } | ||||||
|  |     Ok(()) | ||||||
|  | } | ||||||
|  | ``` | ||||||
|  |  | ||||||
|  | Then, we do not have costly operations within the state write lock region. Now, consider the case where the memtable is about to reach the capacity limit and two threads each put a key into it, both discovering afterwards that the memtable has reached the limit. Both will do a size check on the memtable and decide to freeze it. In this case, we might create an empty memtable that is then immediately frozen. | ||||||
|  |  | ||||||
|  | To solve the problem, all state modifications should be synchronized through the state lock. | ||||||
|  |  | ||||||
|  | ```rust,no_run | ||||||
|  | fn put(&self, key: &[u8], value: &[u8]) -> Result<()> { | ||||||
|  |     // put things into the memtable, check capacity, and drop the read lock on LSM state | ||||||
|  |     if memtable_reaches_capacity_on_put { | ||||||
|  |         let state_lock = self.state_lock.lock(); | ||||||
|  |         if /* check again current memtable reaches capacity */ { | ||||||
|  |             self.freeze_memtable(&state_lock)?; | ||||||
|  |         } | ||||||
|  |     } | ||||||
|  |     Ok(()) | ||||||
|  | } | ||||||
|  | ``` | ||||||
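The check-lock-check-again pattern above can be exercised in a self-contained sketch. Everything here is a simplified assumption: `SketchEngine` and `SketchState` are hypothetical, a plain byte counter stands in for the memtable, and the standard library's `RwLock` and `Mutex` replace the `parking_lot` ones used in the starter code. It is meant to show the locking order, not the real implementation.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Mutex, MutexGuard, RwLock};

/// Hypothetical state: byte counters stand in for real memtables.
struct SketchState {
    memtable_size: AtomicUsize,
    /// Sizes of frozen memtables, from latest to earliest.
    imm_memtable_sizes: Vec<usize>,
}

pub struct SketchEngine {
    state: RwLock<SketchState>,
    state_lock: Mutex<()>,
    limit: usize,
}

impl SketchEngine {
    pub fn new(limit: usize) -> Self {
        Self {
            state: RwLock::new(SketchState {
                memtable_size: AtomicUsize::new(0),
                imm_memtable_sizes: Vec::new(),
            }),
            state_lock: Mutex::new(()),
            limit,
        }
    }

    pub fn put(&self, key: &[u8], value: &[u8]) {
        let added = key.len() + value.len();
        let size = {
            // Only the read lock is needed to write into the memtable.
            let guard = self.state.read().unwrap();
            let new_size = guard.memtable_size.fetch_add(added, Ordering::Relaxed) + added;
            new_size
        }; // read lock dropped here
        if size >= self.limit {
            let state_lock = self.state_lock.lock().unwrap();
            // Check again: another thread may have frozen the memtable
            // while this thread was waiting for `state_lock`.
            let still_full =
                self.state.read().unwrap().memtable_size.load(Ordering::Relaxed) >= self.limit;
            if still_full {
                self.freeze(&state_lock);
            }
        }
    }

    /// The guard parameter proves that the caller holds `state_lock`.
    fn freeze(&self, _state_lock: &MutexGuard<'_, ()>) {
        // Any slow setup (e.g. creating a WAL file) would happen here,
        // before the write lock is taken.
        let fresh = AtomicUsize::new(0);
        let mut guard = self.state.write().unwrap();
        let old = std::mem::replace(&mut guard.memtable_size, fresh);
        guard.imm_memtable_sizes.insert(0, old.into_inner());
    }

    pub fn num_frozen(&self) -> usize {
        self.state.read().unwrap().imm_memtable_sizes.len()
    }
}
```

The second size check under `state_lock` is what prevents two threads from both freezing: the loser of the race sees a fresh, small memtable and skips the freeze.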
|  |  | ||||||
|  | You will notice this kind of pattern very often in future chapters. For example, for L0 flush, | ||||||
|  |  | ||||||
|  | ```rust,no_run | ||||||
|  | fn force_flush_next_imm_memtable(&self) { | ||||||
|  |     let state_lock = self.state_lock.lock(); | ||||||
|  |     // get the oldest memtable and drop the read lock on LSM state | ||||||
|  |     // write the contents to the disk | ||||||
|  |     // get the write lock on LSM state and update the state | ||||||
|  | } | ||||||
|  | ``` | ||||||
|  |  | ||||||
|  | This ensures only one thread will be able to modify the LSM state while still allowing concurrent access to the LSM storage. | ||||||
|  |  | ||||||
|  | In this task, you will need to modify `put` and `delete` to respect the soft capacity limit on the memtable. When it reaches the limit, call `force_freeze_memtable` to freeze the memtable. Note that we do not have test cases for this concurrent scenario, and you will need to think about all possible race conditions on your own. Also, remember to check your lock regions to ensure the critical sections are as small as possible. | ||||||
|  |  | ||||||
|  | You can simply assign the next memtable id as `self.next_sst_id()`. Note that the `imm_memtables` stores the memtables from the latest one to the earliest one. That is to say, `imm_memtables.first()` should be the last frozen memtable. | ||||||
|  |  | ||||||
|  | ## Task 4: Read Path - Get | ||||||
|  |  | ||||||
|  | Now that you have multiple memtables, you may modify your read path `get` function to get the latest version of a key. Ensure that you probe the memtables from the latest one to the earliest one. | ||||||
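The probing order can be sketched as follows. `SketchTable` and `sketch_get` are hypothetical names, and a `BTreeMap` stands in for each memtable so the example needs no external crates. The sketch assumes, as the chapter states, that the immutable memtables are stored from latest to earliest and that an empty value is a delete tombstone.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for one memtable.
pub type SketchTable = BTreeMap<Vec<u8>, Vec<u8>>;

/// Probes the mutable memtable first, then the immutable ones from latest
/// to earliest. The first hit is the newest version of the key.
pub fn sketch_get(
    current: &SketchTable,
    imm_memtables: &[SketchTable],
    key: &[u8],
) -> Option<Vec<u8>> {
    for table in std::iter::once(current).chain(imm_memtables.iter()) {
        if let Some(value) = table.get(key) {
            // An empty value is a tombstone: the key was deleted, so stop
            // here instead of falling through to older memtables.
            return if value.is_empty() {
                None
            } else {
                Some(value.clone())
            };
        }
    }
    None
}
```

Stopping at a tombstone matters: continuing the search would resurrect an older value that the delete was meant to hide.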
|  |  | ||||||
|  | ## Test Your Understanding | ||||||
|  |  | ||||||
|  | * Why doesn't the memtable provide a `delete` API? | ||||||
|  | * Is it possible to use other data structures as the memtable in LSM? What are the pros/cons of using the skiplist? | ||||||
|  | * Why do we need a combination of `state` and `state_lock`? Can we only use `state.read()` and `state.write()`? | ||||||
|  | * Why does the order to store and to probe the memtables matter? | ||||||
|  |  | ||||||
|  | We do not provide reference answers to the questions. Feel free to discuss them in the Discord community. | ||||||
|  |  | ||||||
| {{#include copyright.md}} | {{#include copyright.md}} | ||||||
|   | |||||||
| @@ -8,12 +8,12 @@ use std::sync::Arc; | |||||||
|  |  | ||||||
| use anyhow::Result; | use anyhow::Result; | ||||||
| use bytes::Bytes; | use bytes::Bytes; | ||||||
| use parking_lot::{Mutex, RwLock}; | use parking_lot::{Mutex, MutexGuard, RwLock}; | ||||||
|  |  | ||||||
| use crate::block::Block; | use crate::block::Block; | ||||||
| use crate::compact::{ | use crate::compact::{ | ||||||
|     CompactionController, CompactionOptions, LeveledCompactionOptions, |     CompactionController, CompactionOptions, LeveledCompactionController, LeveledCompactionOptions, | ||||||
|     SimpleLeveledCompactionOptions, |     SimpleLeveledCompactionController, SimpleLeveledCompactionOptions, TieredCompactionController, | ||||||
| }; | }; | ||||||
| use crate::lsm_iterator::{FusedIterator, LsmIterator}; | use crate::lsm_iterator::{FusedIterator, LsmIterator}; | ||||||
| use crate::manifest::Manifest; | use crate::manifest::Manifest; | ||||||
| @@ -22,13 +22,14 @@ use crate::table::SsTable; | |||||||
|  |  | ||||||
| pub type BlockCache = moka::sync::Cache<(usize, usize), Arc<Block>>; | pub type BlockCache = moka::sync::Cache<(usize, usize), Arc<Block>>; | ||||||
|  |  | ||||||
|  | /// Represents the state of the storage engine. | ||||||
| #[derive(Clone)] | #[derive(Clone)] | ||||||
| pub struct LsmStorageState { | pub struct LsmStorageState { | ||||||
|     /// The current memtable. |     /// The current memtable. | ||||||
|     pub memtable: Arc<MemTable>, |     pub memtable: Arc<MemTable>, | ||||||
|     /// Immutable memtables, from earliest to latest. |     /// Immutable memtables, from latest to earliest. | ||||||
|     pub imm_memtables: Vec<Arc<MemTable>>, |     pub imm_memtables: Vec<Arc<MemTable>>, | ||||||
|     /// L0 SSTs, from earliest to latest. |     /// L0 SSTs, from latest to earliest. | ||||||
|     pub l0_sstables: Vec<usize>, |     pub l0_sstables: Vec<usize>, | ||||||
|     /// SsTables sorted by key range; L1 - L_max for leveled compaction, or tiers for tiered |     /// SsTables sorted by key range; L1 - L_max for leveled compaction, or tiers for tiered | ||||||
|     /// compaction. |     /// compaction. | ||||||
| @@ -58,8 +59,11 @@ impl LsmStorageState { | |||||||
| } | } | ||||||
|  |  | ||||||
| pub struct LsmStorageOptions { | pub struct LsmStorageOptions { | ||||||
|  |     // Block size in bytes | ||||||
|     pub block_size: usize, |     pub block_size: usize, | ||||||
|  |     // SST size in bytes, also the approximate memtable capacity limit | ||||||
|     pub target_sst_size: usize, |     pub target_sst_size: usize, | ||||||
|  |     // Maximum number of memtables in memory, flush to L0 when exceeding this limit | ||||||
|     pub num_memtable_limit: usize, |     pub num_memtable_limit: usize, | ||||||
|     pub compaction_options: CompactionOptions, |     pub compaction_options: CompactionOptions, | ||||||
|     pub enable_wal: bool, |     pub enable_wal: bool, | ||||||
| @@ -72,7 +76,7 @@ impl LsmStorageOptions { | |||||||
|             target_sst_size: 2 << 20, |             target_sst_size: 2 << 20, | ||||||
|             compaction_options: CompactionOptions::NoCompaction, |             compaction_options: CompactionOptions::NoCompaction, | ||||||
|             enable_wal: false, |             enable_wal: false, | ||||||
|             num_memtable_limit: 3, |             num_memtable_limit: 50, | ||||||
|         } |         } | ||||||
|     } |     } | ||||||
| } | } | ||||||
| @@ -86,12 +90,15 @@ pub(crate) struct LsmStorageInner { | |||||||
|     next_sst_id: AtomicUsize, |     next_sst_id: AtomicUsize, | ||||||
|     pub(crate) options: Arc<LsmStorageOptions>, |     pub(crate) options: Arc<LsmStorageOptions>, | ||||||
|     pub(crate) compaction_controller: CompactionController, |     pub(crate) compaction_controller: CompactionController, | ||||||
|     pub(crate) manifest: Manifest, |     pub(crate) manifest: Option<Manifest>, | ||||||
| } | } | ||||||
|  |  | ||||||
|  | /// A thin wrapper for `LsmStorageInner` and the user interface for MiniLSM. | ||||||
| pub struct MiniLsm { | pub struct MiniLsm { | ||||||
|     pub(crate) inner: Arc<LsmStorageInner>, |     pub(crate) inner: Arc<LsmStorageInner>, | ||||||
|  |     /// Notifies the compaction thread to stop working. (In week 2) | ||||||
|     compaction_notifier: crossbeam_channel::Sender<()>, |     compaction_notifier: crossbeam_channel::Sender<()>, | ||||||
|  |     /// The handle for the compaction thread. (In week 2) | ||||||
|     compaction_thread: Mutex<Option<std::thread::JoinHandle<()>>>, |     compaction_thread: Mutex<Option<std::thread::JoinHandle<()>>>, | ||||||
| } | } | ||||||
|  |  | ||||||
| @@ -106,8 +113,17 @@ impl MiniLsm { | |||||||
|         unimplemented!() |         unimplemented!() | ||||||
|     } |     } | ||||||
|  |  | ||||||
|     pub fn open(_path: impl AsRef<Path>, _options: LsmStorageOptions) -> Result<Arc<Self>> { |     /// Start the storage engine by either loading an existing directory or creating a new one if the directory does | ||||||
|         unimplemented!() |     /// not exist. | ||||||
|  |     pub fn open(path: impl AsRef<Path>, options: LsmStorageOptions) -> Result<Arc<Self>> { | ||||||
|  |         let inner = Arc::new(LsmStorageInner::open(path, options)?); | ||||||
|  |         let (tx, rx) = crossbeam_channel::unbounded(); | ||||||
|  |         let compaction_thread = inner.spawn_compaction_thread(rx)?; | ||||||
|  |         Ok(Arc::new(Self { | ||||||
|  |             inner, | ||||||
|  |             compaction_notifier: tx, | ||||||
|  |             compaction_thread: Mutex::new(compaction_thread), | ||||||
|  |         })) | ||||||
|     } |     } | ||||||
|  |  | ||||||
|     pub fn get(&self, key: &[u8]) -> Result<Option<Bytes>> { |     pub fn get(&self, key: &[u8]) -> Result<Option<Bytes>> { | ||||||
| @@ -131,7 +147,8 @@ impl MiniLsm { | |||||||
|     } |     } | ||||||
|  |  | ||||||
|     pub fn force_flush(&self) -> Result<()> { |     pub fn force_flush(&self) -> Result<()> { | ||||||
|         self.inner.force_freeze_memtable()?; |         self.inner | ||||||
|  |             .force_freeze_memtable(&self.inner.state_lock.lock())?; | ||||||
|         self.inner.force_flush_next_imm_memtable() |         self.inner.force_flush_next_imm_memtable() | ||||||
|     } |     } | ||||||
|  |  | ||||||
| @@ -146,8 +163,37 @@ impl LsmStorageInner { | |||||||
|             .fetch_add(1, std::sync::atomic::Ordering::SeqCst) |             .fetch_add(1, std::sync::atomic::Ordering::SeqCst) | ||||||
|     } |     } | ||||||
|  |  | ||||||
|     pub(crate) fn open(_path: impl AsRef<Path>, _options: LsmStorageOptions) -> Result<Self> { |     /// Start the storage engine by either loading an existing directory or creating a new one if the directory does | ||||||
|         unimplemented!() |     /// not exist. | ||||||
|  |     pub(crate) fn open(path: impl AsRef<Path>, options: LsmStorageOptions) -> Result<Self> { | ||||||
|  |         let path = path.as_ref(); | ||||||
|  |         let state = LsmStorageState::create(&options); | ||||||
|  |  | ||||||
|  |         let compaction_controller = match &options.compaction_options { | ||||||
|  |             CompactionOptions::Leveled(options) => { | ||||||
|  |                 CompactionController::Leveled(LeveledCompactionController::new(options.clone())) | ||||||
|  |             } | ||||||
|  |             CompactionOptions::Tiered(options) => { | ||||||
|  |                 CompactionController::Tiered(TieredCompactionController::new(options.clone())) | ||||||
|  |             } | ||||||
|  |             CompactionOptions::Simple(options) => CompactionController::Simple( | ||||||
|  |                 SimpleLeveledCompactionController::new(options.clone()), | ||||||
|  |             ), | ||||||
|  |             CompactionOptions::NoCompaction => CompactionController::NoCompaction, | ||||||
|  |         }; | ||||||
|  |  | ||||||
|  |         let storage = Self { | ||||||
|  |             state: Arc::new(RwLock::new(Arc::new(state))), | ||||||
|  |             state_lock: Mutex::new(()), | ||||||
|  |             path: path.to_path_buf(), | ||||||
|  |             block_cache: Arc::new(BlockCache::new(1024)), | ||||||
|  |             next_sst_id: AtomicUsize::new(1), | ||||||
|  |             compaction_controller, | ||||||
|  |             manifest: None, | ||||||
|  |             options: options.into(), | ||||||
|  |         }; | ||||||
|  |  | ||||||
|  |         Ok(storage) | ||||||
|     } |     } | ||||||
|  |  | ||||||
|     /// Get a key from the storage. In day 7, this can be further optimized by using a bloom filter. |     /// Get a key from the storage. In day 7, this can be further optimized by using a bloom filter. | ||||||
| @@ -185,8 +231,8 @@ impl LsmStorageInner { | |||||||
|         unimplemented!() |         unimplemented!() | ||||||
|     } |     } | ||||||
|  |  | ||||||
|     /// Force freeze the current memetable to an immutable memtable |     /// Force freeze the current memtable to an immutable memtable | ||||||
|     pub fn force_freeze_memtable(&self) -> Result<()> { |     pub fn force_freeze_memtable(&self, _state_lock_observer: &MutexGuard<'_, ()>) -> Result<()> { | ||||||
|         unimplemented!() |         unimplemented!() | ||||||
|     } |     } | ||||||
|  |  | ||||||
|   | |||||||
| @@ -2,6 +2,7 @@ | |||||||
|  |  | ||||||
| use std::ops::Bound; | use std::ops::Bound; | ||||||
| use std::path::Path; | use std::path::Path; | ||||||
|  | use std::sync::atomic::AtomicUsize; | ||||||
| use std::sync::Arc; | use std::sync::Arc; | ||||||
|  |  | ||||||
| use anyhow::Result; | use anyhow::Result; | ||||||
| @@ -13,13 +14,18 @@ use crate::iterators::StorageIterator; | |||||||
| use crate::table::SsTableBuilder; | use crate::table::SsTableBuilder; | ||||||
| use crate::wal::Wal; | use crate::wal::Wal; | ||||||
|  |  | ||||||
| /// A basic mem-table based on crossbeam-skiplist | /// A basic mem-table based on crossbeam-skiplist. | ||||||
|  | /// | ||||||
|  | /// An initial implementation of memtable is part of week 1, day 1. It will be incrementally implemented in other | ||||||
|  | /// chapters of week 1 and week 2. | ||||||
| pub struct MemTable { | pub struct MemTable { | ||||||
|     map: Arc<SkipMap<Bytes, Bytes>>, |     map: Arc<SkipMap<Bytes, Bytes>>, | ||||||
|     wal: Option<Wal>, |     wal: Option<Wal>, | ||||||
|     id: usize, |     id: usize, | ||||||
|  |     approximate_size: Arc<AtomicUsize>, | ||||||
| } | } | ||||||
|  |  | ||||||
|  | /// Create a bound of `Bytes` from a bound of `&[u8]`. | ||||||
| pub(crate) fn map_bound(bound: Bound<&[u8]>) -> Bound<Bytes> { | pub(crate) fn map_bound(bound: Bound<&[u8]>) -> Bound<Bytes> { | ||||||
|     match bound { |     match bound { | ||||||
|         Bound::Included(x) => Bound::Included(Bytes::copy_from_slice(x)), |         Bound::Included(x) => Bound::Included(Bytes::copy_from_slice(x)), | ||||||
| @@ -50,6 +56,9 @@ impl MemTable { | |||||||
|     } |     } | ||||||
|  |  | ||||||
|     /// Put a key-value pair into the mem-table. |     /// Put a key-value pair into the mem-table. | ||||||
|  |     /// | ||||||
|  |     /// In week 1, day 1, simply put the key-value pair into the skipmap. | ||||||
|  |     /// In week 2, day 6, also flush the data to WAL. | ||||||
|     pub fn put(&self, _key: &[u8], _value: &[u8]) -> Result<()> { |     pub fn put(&self, _key: &[u8], _value: &[u8]) -> Result<()> { | ||||||
|         unimplemented!() |         unimplemented!() | ||||||
|     } |     } | ||||||
| @@ -74,18 +83,29 @@ impl MemTable { | |||||||
|     pub fn id(&self) -> usize { |     pub fn id(&self) -> usize { | ||||||
|         self.id |         self.id | ||||||
|     } |     } | ||||||
|  |  | ||||||
|  |     pub fn approximate_size(&self) -> usize { | ||||||
|  |         self.approximate_size | ||||||
|  |             .load(std::sync::atomic::Ordering::Relaxed) | ||||||
|  |     } | ||||||
| } | } | ||||||
|  |  | ||||||
| type SkipMapRangeIter<'a> = | type SkipMapRangeIter<'a> = | ||||||
|     crossbeam_skiplist::map::Range<'a, Bytes, (Bound<Bytes>, Bound<Bytes>), Bytes, Bytes>; |     crossbeam_skiplist::map::Range<'a, Bytes, (Bound<Bytes>, Bound<Bytes>), Bytes, Bytes>; | ||||||
|  |  | ||||||
| /// An iterator over a range of `SkipMap`. | /// An iterator over a range of `SkipMap`. This is a self-referential structure and please refer to week 1, day 2 | ||||||
|  | /// chapter for more information. | ||||||
|  | /// | ||||||
|  | /// This is part of week 1, day 2. | ||||||
| #[self_referencing] | #[self_referencing] | ||||||
| pub struct MemTableIterator { | pub struct MemTableIterator { | ||||||
|  |     /// Stores a reference to the skipmap. | ||||||
|     map: Arc<SkipMap<Bytes, Bytes>>, |     map: Arc<SkipMap<Bytes, Bytes>>, | ||||||
|  |     /// Stores a skipmap iterator that refers to the lifetime of `MemTableIterator` itself. | ||||||
|     #[borrows(map)] |     #[borrows(map)] | ||||||
|     #[not_covariant] |     #[not_covariant] | ||||||
|     iter: SkipMapRangeIter<'this>, |     iter: SkipMapRangeIter<'this>, | ||||||
|  |     /// Stores the current key-value pair. | ||||||
|     item: (Bytes, Bytes), |     item: (Bytes, Bytes), | ||||||
| } | } | ||||||
|  |  | ||||||
| @@ -106,6 +126,3 @@ impl StorageIterator for MemTableIterator { | |||||||
|         unimplemented!() |         unimplemented!() | ||||||
|     } |     } | ||||||
| } | } | ||||||
|  |  | ||||||
| #[cfg(test)] |  | ||||||
| mod tests; |  | ||||||
|   | |||||||
| @@ -1 +0,0 @@ | |||||||
| //! Please copy `mini-lsm/src/mem_table/tests.rs` here so that you can run tests. |  | ||||||
| @@ -1 +1 @@ | |||||||
| pub mod day4_tests; |  | ||||||
|   | |||||||
| @@ -1 +0,0 @@ | |||||||
| //! Please copy `mini-lsm/src/tests/day4_tests.rs` here so that you can run tests. |  | ||||||
| @@ -1 +1 @@ | |||||||
| pub mod day4_tests; |  | ||||||
|   | |||||||
							
								
								
									
mini-lsm/src/tests/day1.rs (Normal file, 125 lines)
							| @@ -0,0 +1,125 @@ | |||||||
|  | use tempfile::tempdir; | ||||||
|  |  | ||||||
|  | use crate::{ | ||||||
|  |     lsm_storage::{LsmStorageInner, LsmStorageOptions}, | ||||||
|  |     mem_table::MemTable, | ||||||
|  | }; | ||||||
|  |  | ||||||
|  | #[test] | ||||||
|  | fn test_task1_memtable_get() { | ||||||
|  |     let memtable = MemTable::create(0); | ||||||
|  |     memtable.put(b"key1", b"value1").unwrap(); | ||||||
|  |     memtable.put(b"key2", b"value2").unwrap(); | ||||||
|  |     memtable.put(b"key3", b"value3").unwrap(); | ||||||
|  |     assert_eq!(&memtable.get(b"key1").unwrap()[..], b"value1"); | ||||||
|  |     assert_eq!(&memtable.get(b"key2").unwrap()[..], b"value2"); | ||||||
|  |     assert_eq!(&memtable.get(b"key3").unwrap()[..], b"value3"); | ||||||
|  | } | ||||||
|  |  | ||||||
|  | #[test] | ||||||
|  | fn test_task1_memtable_overwrite() { | ||||||
|  |     let memtable = MemTable::create(0); | ||||||
|  |     memtable.put(b"key1", b"value1").unwrap(); | ||||||
|  |     memtable.put(b"key2", b"value2").unwrap(); | ||||||
|  |     memtable.put(b"key3", b"value3").unwrap(); | ||||||
|  |     memtable.put(b"key1", b"value11").unwrap(); | ||||||
|  |     memtable.put(b"key2", b"value22").unwrap(); | ||||||
|  |     memtable.put(b"key3", b"value33").unwrap(); | ||||||
|  |     assert_eq!(&memtable.get(b"key1").unwrap()[..], b"value11"); | ||||||
|  |     assert_eq!(&memtable.get(b"key2").unwrap()[..], b"value22"); | ||||||
|  |     assert_eq!(&memtable.get(b"key3").unwrap()[..], b"value33"); | ||||||
|  | } | ||||||
|  |  | ||||||
|  | #[test] | ||||||
|  | fn test_task2_storage_integration() { | ||||||
|  |     let dir = tempdir().unwrap(); | ||||||
|  |     let storage = | ||||||
|  |         LsmStorageInner::open(dir.path(), LsmStorageOptions::default_for_week1_test()).unwrap(); | ||||||
|  |     assert_eq!(&storage.get(b"0").unwrap(), &None); | ||||||
|  |     storage.put(b"1", b"233").unwrap(); | ||||||
|  |     storage.put(b"2", b"2333").unwrap(); | ||||||
|  |     storage.put(b"3", b"23333").unwrap(); | ||||||
|  |     assert_eq!(&storage.get(b"1").unwrap().unwrap()[..], b"233"); | ||||||
|  |     assert_eq!(&storage.get(b"2").unwrap().unwrap()[..], b"2333"); | ||||||
|  |     assert_eq!(&storage.get(b"3").unwrap().unwrap()[..], b"23333"); | ||||||
|  |     storage.delete(b"2").unwrap(); | ||||||
|  |     assert!(storage.get(b"2").unwrap().is_none()); | ||||||
|  |     storage.delete(b"0").unwrap(); // should NOT report any error | ||||||
|  | } | ||||||
|  |  | ||||||
|  | #[test] | ||||||
|  | fn test_task3_storage_integration() { | ||||||
|  |     let dir = tempdir().unwrap(); | ||||||
|  |     let storage = | ||||||
|  |         LsmStorageInner::open(dir.path(), LsmStorageOptions::default_for_week1_test()).unwrap(); | ||||||
|  |     storage.put(b"1", b"233").unwrap(); | ||||||
|  |     storage.put(b"2", b"2333").unwrap(); | ||||||
|  |     storage.put(b"3", b"23333").unwrap(); | ||||||
|  |     storage | ||||||
|  |         .force_freeze_memtable(&storage.state_lock.lock()) | ||||||
|  |         .unwrap(); | ||||||
|  |     assert_eq!(storage.state.read().imm_memtables.len(), 1); | ||||||
|  |     let previous_approximate_size = storage.state.read().imm_memtables[0].approximate_size(); | ||||||
|  |     assert!(previous_approximate_size >= 15); | ||||||
|  |     storage.put(b"1", b"2333").unwrap(); | ||||||
|  |     storage.put(b"2", b"23333").unwrap(); | ||||||
|  |     storage.put(b"3", b"233333").unwrap(); | ||||||
|  |     storage | ||||||
|  |         .force_freeze_memtable(&storage.state_lock.lock()) | ||||||
|  |         .unwrap(); | ||||||
|  |     assert_eq!(storage.state.read().imm_memtables.len(), 2); | ||||||
|  |     assert!( | ||||||
|  |         storage.state.read().imm_memtables[1].approximate_size() == previous_approximate_size, | ||||||
|  |         "wrong order of memtables?" | ||||||
|  |     ); | ||||||
|  |     assert!(storage.state.read().imm_memtables[0].approximate_size() > previous_approximate_size); | ||||||
|  | } | ||||||
|  |  | ||||||
|  | #[test] | ||||||
|  | fn test_task3_freeze_on_capacity() { | ||||||
|  |     let dir = tempdir().unwrap(); | ||||||
|  |     let mut options = LsmStorageOptions::default_for_week1_test(); | ||||||
|  |     options.target_sst_size = 1024; | ||||||
|  |     options.num_memtable_limit = 1000; | ||||||
|  |     let storage = LsmStorageInner::open(dir.path(), options).unwrap(); | ||||||
|  |     for _ in 0..1000 { | ||||||
|  |         storage.put(b"1", b"2333").unwrap(); | ||||||
|  |     } | ||||||
|  |     let num_imm_memtables = storage.state.read().imm_memtables.len(); | ||||||
|  |     assert!(num_imm_memtables >= 1, "no memtable frozen?"); | ||||||
|  |     for _ in 0..1000 { | ||||||
|  |         storage.delete(b"1").unwrap(); | ||||||
|  |     } | ||||||
|  |     assert!( | ||||||
|  |         storage.state.read().imm_memtables.len() > num_imm_memtables, | ||||||
|  |         "no more memtable frozen?" | ||||||
|  |     ); | ||||||
|  | } | ||||||
|  |  | ||||||
|  | #[test] | ||||||
|  | fn test_task4_storage_integration() { | ||||||
|  |     let dir = tempdir().unwrap(); | ||||||
|  |     let storage = | ||||||
|  |         LsmStorageInner::open(dir.path(), LsmStorageOptions::default_for_week1_test()).unwrap(); | ||||||
|  |     assert_eq!(&storage.get(b"0").unwrap(), &None); | ||||||
|  |     storage.put(b"1", b"233").unwrap(); | ||||||
|  |     storage.put(b"2", b"2333").unwrap(); | ||||||
|  |     storage.put(b"3", b"23333").unwrap(); | ||||||
|  |     storage | ||||||
|  |         .force_freeze_memtable(&storage.state_lock.lock()) | ||||||
|  |         .unwrap(); | ||||||
|  |     storage.delete(b"1").unwrap(); | ||||||
|  |     storage.delete(b"2").unwrap(); | ||||||
|  |     storage.put(b"3", b"2333").unwrap(); | ||||||
|  |     storage.put(b"4", b"23333").unwrap(); | ||||||
|  |     storage | ||||||
|  |         .force_freeze_memtable(&storage.state_lock.lock()) | ||||||
|  |         .unwrap(); | ||||||
|  |     storage.put(b"1", b"233333").unwrap(); | ||||||
|  |     storage.put(b"3", b"233333").unwrap(); | ||||||
|  |     assert_eq!(storage.state.read().imm_memtables.len(), 2); | ||||||
|  |     assert_eq!(&storage.get(b"1").unwrap().unwrap()[..], b"233333"); | ||||||
|  |     assert_eq!(&storage.get(b"2").unwrap(), &None); | ||||||
|  |     assert_eq!(&storage.get(b"3").unwrap().unwrap()[..], b"233333"); | ||||||
|  |     assert_eq!(&storage.get(b"4").unwrap().unwrap()[..], b"23333"); | ||||||
|  | } | ||||||