
# Compaction Implementation

![Chapter Overview](./lsm-tutorial/week2-01-overview.svg)

In this chapter, you will:

* Implement the compaction logic that combines some files and produces new files.
* Implement the logic to update the LSM states and manage SST files on the filesystem.
* Update the LSM read path to incorporate the LSM levels.
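
To make the first goal concrete, here is a minimal sketch of what "combining some files and producing new files" can look like. It is an illustration under stated assumptions, not the structure you are required to follow: `SortedRun`, `compact`, `max_entries`, and `bottom_level` are hypothetical names, a sorted in-memory vector stands in for an SST file, and an empty value is assumed to mark a deleted key. It also buffers everything in a `BTreeMap` for brevity, whereas a real implementation would more likely stream entries through a merge iterator.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for one SST file: (key, value) pairs sorted by key.
/// Assumption of this sketch: an empty value is a delete tombstone.
type SortedRun = Vec<(Vec<u8>, Vec<u8>)>;

/// Merge `runs` (ordered newest first) into new runs holding at most
/// `max_entries` entries each. When `bottom_level` is true, no older data
/// can exist below the output, so tombstones are dropped.
fn compact(runs: &[SortedRun], max_entries: usize, bottom_level: bool) -> Vec<SortedRun> {
    assert!(max_entries > 0, "each output run must hold at least one entry");

    // Walk from the oldest run to the newest so that a newer version of a
    // key overwrites the older one in the map.
    let mut merged: BTreeMap<Vec<u8>, Vec<u8>> = BTreeMap::new();
    for run in runs.iter().rev() {
        for (key, value) in run {
            merged.insert(key.clone(), value.clone());
        }
    }

    // Emit the merged entries in key order, splitting into new runs
    // whenever the current one reaches the size limit.
    let mut output = Vec::new();
    let mut current = SortedRun::new();
    for (key, value) in merged {
        if bottom_level && value.is_empty() {
            continue; // the tombstone has nothing left to shadow
        }
        current.push((key, value));
        if current.len() == max_entries {
            output.push(std::mem::take(&mut current));
        }
    }
    if !current.is_empty() {
        output.push(current);
    }
    output
}
```

Calling `compact(&[newer, older], 2, true)` returns runs that keep only the newest version of each key. Passing `false` for `bottom_level` keeps the tombstones, which is needed whenever older versions of a key may still live in files that are not part of this compaction.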

## Task 1: Compaction Implementation

## Task 2: Update the LSM State

## Task 3: Concat Iterator
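
As a rough illustration of the idea behind a concat iterator: when a set of tables is sorted by key range and the ranges do not overlap, the tables can be read one after another without a merge heap. The sketch below assumes exactly that precondition. `Table` and `ConcatIter` are hypothetical names, an in-memory vector stands in for an SST, and seeking to a key is left out.

```rust
/// Hypothetical stand-in for an SST: (key, value) pairs sorted by key.
type Table = Vec<(String, String)>;

/// Iterates over tables that are ordered by key range and do not overlap,
/// so visiting them back to back yields globally sorted output.
struct ConcatIter<'a> {
    tables: &'a [Table],
    table_idx: usize,
    entry_idx: usize,
}

impl<'a> ConcatIter<'a> {
    fn new(tables: &'a [Table]) -> Self {
        let mut iter = Self { tables, table_idx: 0, entry_idx: 0 };
        iter.skip_exhausted();
        iter
    }

    /// Advance to the next table while the current one has no entries left.
    /// Afterwards the position is either valid or past the last table.
    fn skip_exhausted(&mut self) {
        while self.table_idx < self.tables.len()
            && self.entry_idx >= self.tables[self.table_idx].len()
        {
            self.table_idx += 1;
            self.entry_idx = 0;
        }
    }
}

impl<'a> Iterator for ConcatIter<'a> {
    type Item = (&'a str, &'a str);

    fn next(&mut self) -> Option<Self::Item> {
        // Copy the slice reference so the returned entry borrows from the
        // tables (lifetime 'a) rather than from the iterator itself.
        let tables: &'a [Table] = self.tables;
        let table = tables.get(self.table_idx)?;
        let (key, value) = &table[self.entry_idx];
        self.entry_idx += 1;
        self.skip_exhausted();
        Some((key.as_str(), value.as_str()))
    }
}
```

Compared with merging every table through a heap, this keeps only one table position active at a time. The trade-off is that it is only correct when the non-overlap precondition holds.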

## Task 4: Integrate with the Read Path

## Test Your Understanding

* What are the definitions of read/write/space amplifications? (This is covered in the overview chapter)
* What are the ways to accurately compute the read/write/space amplifications, and what are the ways to estimate them?
* Is it correct that a key will take some storage space even if a user requests to delete it?
* Given that compaction takes a lot of read and write bandwidth and may interfere with foreground operations, it is a good idea to postpone compaction when there is a large write flow. It is even beneficial to stop/pause existing compaction tasks in this situation. What do you think of this idea? (Read the SILK paper!)
* Is it a good idea to use/fill the block cache for compactions? Or is it better to fully bypass the block cache during compaction?
* Some researchers/engineers propose to offload compaction to a remote server or a serverless lambda function. What are the benefits, and what might be the potential challenges and performance impacts of doing remote compaction? (Think of the point when a compaction completes and the block cache...)

We do not provide reference answers to the questions. Feel free to discuss them in the Discord community.

{{#include copyright.md}}