update week 2 starter code

Signed-off-by: Alex Chi <iskyzh@gmail.com>
This commit is contained in:
Alex Chi
2024-01-22 22:05:47 +08:00
parent d694f8fb00
commit 39924ee538
5 changed files with 59 additions and 4 deletions

View File

@@ -64,7 +64,7 @@ In this task, you will need to modify,
src/iterators/concat.rs
```
Now that you have created sorted runs in your system, it is possible to do a simple optimization over the read path. You do not always need to create merge iterators for your SSTs. If SSTs belong to one sorted run, you can create a concat iterator that simply iterates the keys in each SST in order, because SSTs in one sorted run do not contain overlapping key ranges and they are sorted by their first key.
Now that you have created sorted runs in your system, it is possible to do a simple optimization over the read path. You do not always need to create merge iterators for your SSTs. If SSTs belong to one sorted run, you can create a concat iterator that simply iterates the keys in each SST in order, because SSTs in one sorted run do not contain overlapping key ranges and they are sorted by their first key. We do not want to create all SST iterators in advance (because it will lead to one block read), and therefore we only store SST objects in this iterator.
## Task 3: Integrate with the Read Path
@@ -88,6 +88,7 @@ You will need to implement `num_active_iterators` for concat iterator so that th
* Is it correct that a key will take some storage space even if a user requests to delete it?
* Given that compaction takes a lot of write bandwidth and read bandwidth and may interfere with foreground operations, it is a good idea to postpone compaction when there are large write flow. It is even beneficial to stop/pause existing compaction tasks in this situation. What do you think of this idea? (Read the Slik paper!)
* Is it a good idea to use/fill the block cache for compactions? Or is it better to fully bypass the block cache when compaction?
* Does it make sense to have a `struct ConcatIterator<I: StorageIterator>` in the system?
* Some researchers/engineers propose to offload compaction to a remote server or a serverless lambda function. What are the benefits, and what might be the potential challenges and performance impacts of doing remote compaction? (Think of the point when a compaction completes and the block cache...)
We do not provide reference answers to the questions, and feel free to discuss about them in the Discord community.

View File

@@ -23,13 +23,16 @@ pub enum CompactionTask {
Leveled(LeveledCompactionTask),
Tiered(TieredCompactionTask),
Simple(SimpleLeveledCompactionTask),
ForceFullCompaction(Vec<usize>),
ForceFullCompaction {
l0_sstables: Vec<usize>,
l1_sstables: Vec<usize>,
},
}
impl CompactionTask {
fn compact_to_bottom_level(&self) -> bool {
match self {
CompactionTask::ForceFullCompaction(_) => true,
CompactionTask::ForceFullCompaction { .. } => true,
CompactionTask::Leveled(task) => task.is_lower_level_bottom_level,
CompactionTask::Simple(task) => task.is_lower_level_bottom_level,
CompactionTask::Tiered(task) => task.bottom_tier_included,

View File

@@ -1,3 +1,4 @@
pub mod concat_iterator;
pub mod merge_iterator;
pub mod two_merge_iterator;

View File

@@ -0,0 +1,49 @@
#![allow(unused_variables)] // TODO(you): remove this lint after implementing this mod
#![allow(dead_code)] // TODO(you): remove this lint after implementing this mod
use std::sync::Arc;
use anyhow::Result;
use super::StorageIterator;
use crate::table::{SsTable, SsTableIterator};
/// Concat multiple iterators ordered in key order and their key ranges do not overlap. We do not want to create the
/// iterators when initializing this iterator to reduce the overhead of seeking.
pub struct SstConcatIterator {
current: Option<SsTableIterator>,
next_sst_idx: usize,
sstables: Vec<Arc<SsTable>>,
}
impl SstConcatIterator {
pub fn create_and_seek_to_first(sstables: Vec<Arc<SsTable>>) -> Result<Self> {
unimplemented!()
}
pub fn create_and_seek_to_key(sstables: Vec<Arc<SsTable>>, key: &[u8]) -> Result<Self> {
unimplemented!()
}
}
impl StorageIterator for SstConcatIterator {
fn key(&self) -> &[u8] {
unimplemented!()
}
fn value(&self) -> &[u8] {
unimplemented!()
}
fn is_valid(&self) -> bool {
unimplemented!()
}
fn next(&mut self) -> Result<()> {
unimplemented!()
}
fn num_active_iterators(&self) -> usize {
1
}
}

View File

@@ -46,7 +46,8 @@ impl LsmStorageState {
..=*max_levels)
.map(|level| (level, Vec::new()))
.collect::<Vec<_>>(),
CompactionOptions::Tiered(_) | CompactionOptions::NoCompaction => Vec::new(),
CompactionOptions::Tiered(_) => Vec::new(),
CompactionOptions::NoCompaction => vec![(0, Vec::new())],
};
Self {
memtable: Arc::new(MemTable::create(0)),