Signed-off-by: Alex Chi Z <iskyzh@gmail.com>
Alex Chi Z
2024-01-30 14:48:03 +08:00
parent 9eab75ec1a
commit acc3c959aa
9 changed files with 345 additions and 24 deletions

@@ -89,7 +89,7 @@ We are working on chapter 3 and more test cases for all existing contents.
| 3.3 | Snapshot Read - Engine Read Path | ✅ | ✅ | ✅ | | 3.3 | Snapshot Read - Engine Read Path | ✅ | ✅ | ✅ |
| 3.4 | Watermark and Garbage Collection | ✅ | ✅ | ✅ | | 3.4 | Watermark and Garbage Collection | ✅ | ✅ | ✅ |
| 3.5 | Transactions and Optimistic Concurrency Control | ✅ | ✅ | ✅ | | 3.5 | Transactions and Optimistic Concurrency Control | ✅ | ✅ | ✅ |
| 3.6 | Serializable Snapshot Isolation | ✅ | 🚧 | 🚧 | | 3.6 | Serializable Snapshot Isolation | ✅ | | |
| 3.7 | Compaction Filter | 🚧 | | | | 3.7 | Compaction Filter | 🚧 | | |
## License ## License

@@ -1,22 +1,118 @@
# Serializable Snapshot Isolation # (A Partial) Serializable Snapshot Isolation
Now, we are going to add a conflict detection algorithm at the transaction commit time, so as to make the engine serializable. Now, we are going to add a conflict detection algorithm at the transaction commit time, so as to give the engine some level of serializability.
To run test cases,
```
cargo x copy-test --week 3 --day 6
cargo x scheck
```
Let us go through an example of serializability. Consider the following two transactions in the engine:
```
txn1: put("key1", get("key2"))
txn2: put("key2", get("key1"))
```
The initial state of the database is `key1=1, key2=2`. Serializable means that the outcome of the execution is the same as executing the transactions one by one, serially, in some order. If we execute txn1 then txn2, we will get `key1=2, key2=2`. If we execute txn2 then txn1, we will get `key1=1, key2=1`.
However, with our current implementation, if the execution of these two transactions overlaps:
```
txn1: get key2 <- 2
txn2: get key1 <- 1
txn1: put key1=2, commit
txn2: put key2=1, commit
```
We will get `key1=2, key2=1`. This cannot be produced with a serial execution of these two transactions. This phenomenon is called write skew.
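To make the anomaly concrete, here is a minimal sketch that replays the schedules above on a plain `BTreeMap`. The `Db` alias and the functions `initial`, `txn1`, `txn2`, `serial`, and `overlapped` are hypothetical names used only for illustration; they are not part of Mini-LSM.

```rust
use std::collections::BTreeMap;

// Hypothetical in-memory stand-in for the engine state; not the Mini-LSM API.
type Db = BTreeMap<&'static str, i32>;

fn initial() -> Db {
    BTreeMap::from([("key1", 1), ("key2", 2)])
}

// txn1: put("key1", get("key2")), reading from `snapshot` and writing to `db`.
fn txn1(snapshot: &Db, db: &mut Db) {
    let value = snapshot["key2"];
    db.insert("key1", value);
}

// txn2: put("key2", get("key1")), reading from `snapshot` and writing to `db`.
fn txn2(snapshot: &Db, db: &mut Db) {
    let value = snapshot["key1"];
    db.insert("key2", value);
}

// Serial execution: the second transaction sees the writes of the first one.
fn serial(txn1_first: bool) -> Db {
    let mut db = initial();
    if txn1_first {
        let snap = db.clone();
        txn1(&snap, &mut db);
        let snap = db.clone();
        txn2(&snap, &mut db);
    } else {
        let snap = db.clone();
        txn2(&snap, &mut db);
        let snap = db.clone();
        txn1(&snap, &mut db);
    }
    db
}

// Overlapped execution: both transactions read the same initial snapshot.
fn overlapped() -> Db {
    let mut db = initial();
    let snap = db.clone();
    txn1(&snap, &mut db);
    txn2(&snap, &mut db);
    db
}
```

Comparing `overlapped()` with `serial(true)` and `serial(false)` shows that the overlapped result matches neither serial outcome.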
With serializable validation, we can ensure the modifications to the database correspond to a serial execution order, and therefore, users may run critical workloads that require serializable execution over the system. For example, if a user runs bank transfer workloads on Mini-LSM, they would expect the total sum of money to stay the same at any point in time. We cannot guarantee this invariant without serializable checks.
One technique of serializable validation is to record the read set and write set of each transaction in the system. We do the validation before committing a transaction (optimistic concurrency control). If the read set of the transaction overlaps with the write set of any transaction committed after its read timestamp, then we fail the validation, and abort the transaction.
Back to the above example, suppose txn1 and txn2 both started at timestamp = 1.
```
txn1: get key2 <- 2
txn2: get key1 <- 1
txn1: put key1=2, commit ts = 2
txn2: put key2=1, start serializable verification
```
When we validate txn2, we will go through all transactions committed before its own expected commit timestamp and after its read timestamp (in this case, 1 < ts < 3). The only transaction satisfying the criteria is txn1. The write set of txn1 is `key1`, and the read set of txn2 is `key1`. As they overlap, we should abort txn2.
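The check described above can be sketched as a single function. This is only an illustration under stated assumptions: `CommittedTxns` and `validate` are hypothetical names, and the write sets are stored as `u32` key hashes keyed by commit timestamp.

```rust
use std::collections::{BTreeMap, HashSet};

// Hypothetical shape: commit timestamp -> write set (key hashes) of that transaction.
type CommittedTxns = BTreeMap<u64, HashSet<u32>>;

/// Returns true if a transaction with the given read set may commit.
/// Only transactions with `read_ts < commit_ts < expected_commit_ts` are checked.
/// Assumes `expected_commit_ts > read_ts`, which holds for a monotonic timestamp.
fn validate(
    committed: &CommittedTxns,
    read_ts: u64,
    expected_commit_ts: u64,
    read_set: &HashSet<u32>,
) -> bool {
    committed
        .range((read_ts + 1)..expected_commit_ts)
        .all(|(_, write_set)| write_set.is_disjoint(read_set))
}
```

With txn1 committed at timestamp 2 and a write set containing the hash of `key1`, calling `validate` for txn2 with `read_ts = 1`, `expected_commit_ts = 3`, and a read set containing the same hash returns `false`.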
## Task 1: Track Read Set in Get and Write Set ## Task 1: Track Read Set in Get and Write Set
In this task, you will need to modify:
```
src/mvcc/txn.rs
src/mvcc.rs
```
When `get` is called, you should add the key to the read set of the transaction. In our implementation, we store the hashes of the keys, so as to reduce memory usage and make probing the read set faster, though this might cause false positives when two keys have the same hash. You can use `farmhash::hash32` to generate the hash for a key. Note that even if `get` reports that a key is not found, this key should still be tracked in the read set.
In `LsmMvccInner::new_txn`, you should create an empty read/write set for the transaction if `serializable=true`.
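As a rough illustration of the bookkeeping, the sketch below tracks both sets on a toy transaction. `SketchTxn` and its fields are hypothetical and much simpler than the real `Transaction`, and `hash32` here is a standard-library stand-in for `farmhash::hash32` so that the example needs no external crate.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::{BTreeMap, HashSet};
use std::hash::{Hash, Hasher};
use std::sync::Mutex;

// Stand-in for `farmhash::hash32`; any 32-bit hash works for the sketch.
fn hash32(key: &[u8]) -> u32 {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    hasher.finish() as u32
}

// Hypothetical, simplified transaction; the real one has more fields.
struct SketchTxn {
    snapshot: BTreeMap<Vec<u8>, Vec<u8>>,
    local: Mutex<BTreeMap<Vec<u8>, Vec<u8>>>,
    // (write set, read set); `None` when `serializable` is false.
    key_hashes: Option<Mutex<(HashSet<u32>, HashSet<u32>)>>,
}

impl SketchTxn {
    fn new(snapshot: BTreeMap<Vec<u8>, Vec<u8>>, serializable: bool) -> Self {
        Self {
            snapshot,
            local: Mutex::new(BTreeMap::new()),
            key_hashes: serializable.then(|| Mutex::new((HashSet::new(), HashSet::new()))),
        }
    }

    fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        // Record the read before the lookup, so that a miss is tracked as well.
        if let Some(sets) = &self.key_hashes {
            sets.lock().unwrap().1.insert(hash32(key));
        }
        if let Some(value) = self.local.lock().unwrap().get(key) {
            return Some(value.clone());
        }
        self.snapshot.get(key).cloned()
    }

    fn put(&self, key: &[u8], value: &[u8]) {
        self.local.lock().unwrap().insert(key.to_vec(), value.to_vec());
        if let Some(sets) = &self.key_hashes {
            sets.lock().unwrap().0.insert(hash32(key));
        }
    }
}
```

Recording the hash before the lookup is one simple way to make sure that reads of missing keys end up in the read set too.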
## Task 2: Track Read Set in Scan ## Task 2: Track Read Set in Scan
## Task 3: Serializable Verification In this task, you will need to modify:
```
src/mvcc/txn.rs
```
In this tutorial, we only guarantee full serializability for `get` requests. You still need to track the read set for scans, but in some specific cases, you might still get a non-serializable result.
To understand why this is hard, let us go through the following example.
```
txn1: put("key1", len(scan(..)))
txn2: put("key2", len(scan(..)))
```
If the database starts with an initial state of `a=1,b=2`, we should get either `a=1,b=2,key1=2,key2=3` or `a=1,b=2,key1=3,key2=2`. However, if the transaction execution is as follows:
```
txn1: len(scan(..)) = 2
txn2: len(scan(..)) = 2
txn1: put key1 = 2, commit, read set = {a, b}, write set = {key1}
txn2: put key2 = 2, commit, read set = {a, b}, write set = {key2}
```
This passes our serializable validation and does not correspond to any serial order of execution! Therefore, a fully-working serializable validation will need to track key ranges, and using key hashes can accelerate the serializable check if only `get` is called. Please refer to the bonus tasks on how you can implement serializable checks correctly.
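The following sketch replays this schedule with a per-key conflict check to show why it slips through. The names `passes_key_check` and `replay` are hypothetical, and plain key strings stand in for key hashes.

```rust
use std::collections::{BTreeMap, HashSet};

// Hypothetical per-key conflict check; keys stand in for their hashes.
fn passes_key_check<'a>(
    read_set: &HashSet<&'a str>,
    committed_write_sets: &[HashSet<&'a str>],
) -> bool {
    committed_write_sets
        .iter()
        .all(|write_set| write_set.is_disjoint(read_set))
}

// Replays the schedule; returns the final state and whether txn2 passed validation.
fn replay() -> (BTreeMap<&'static str, usize>, bool) {
    let mut db = BTreeMap::from([("a", 1), ("b", 2)]);
    let snapshot = db.clone(); // both transactions scan this snapshot

    // The read set only contains the keys that the scan actually returned.
    let read_set: HashSet<&str> = snapshot.keys().copied().collect();
    let len = snapshot.len();

    // txn1 commits first; nothing was committed after its read timestamp.
    db.insert("key1", len);
    let committed = vec![HashSet::from(["key1"])];

    // txn2 is validated against txn1's write set {key1}, a key it never returned.
    let txn2_ok = passes_key_check(&read_set, &committed);
    if txn2_ok {
        db.insert("key2", len);
    }
    (db, txn2_ok)
}
```

In this replay txn2 passes the check and both `key1` and `key2` end up as 2, which matches neither serial outcome.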
## Task 3: Engine Interface and Serializable Validation
In this task, you will need to modify:
```
src/mvcc/txn.rs
src/lsm_storage.rs
```
Now, we can go ahead and implement the validation in the commit phase. You should take the `commit_lock` every time we process a transaction commit. This ensures only one transaction goes into the transaction verification and commit phase.
You will need to go through all transactions with a commit timestamp within the range `(read_ts, expected_commit_ts)` (both bounds exclusive), and see if the read set of the current transaction overlaps with the write set of any transaction satisfying the criteria. If we can commit the transaction, submit a write batch, and insert the write set of this transaction into `self.inner.mvcc().committed_txns`, where the key is the commit timestamp.
You can skip the check if `write_set` is empty. A read-only transaction can always be committed.
You should also modify the `put`, `delete`, and `write_batch` interface in `LsmStorageInner`. We recommend you define a helper function `write_batch_inner` that processes a write batch. If `options.serializable = true`, `put`, `delete`, and the user-facing `write_batch` should create a transaction instead of directly creating a write batch. Your write batch helper function should also return a `u64` commit timestamp so that `Transaction::commit` can correctly store the committed transaction data into the MVCC structure.
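Putting the steps of this task together, a commit path might look roughly like the sketch below. `SketchMvcc` is a hypothetical, simplified stand-in for the MVCC state: incrementing the timestamp replaces the real `write_batch_inner` call, and errors are plain strings instead of `anyhow` errors.

```rust
use std::collections::{BTreeMap, HashSet};
use std::sync::Mutex;

// Hypothetical, simplified MVCC state; field names loosely follow the text.
struct SketchMvcc {
    commit_lock: Mutex<()>,
    latest_commit_ts: Mutex<u64>,
    committed_txns: Mutex<BTreeMap<u64, HashSet<u32>>>,
}

impl SketchMvcc {
    fn new() -> Self {
        Self {
            commit_lock: Mutex::new(()),
            latest_commit_ts: Mutex::new(0),
            committed_txns: Mutex::new(BTreeMap::new()),
        }
    }

    /// Validates and commits; returns the commit timestamp or an error on conflict.
    fn commit(
        &self,
        read_ts: u64,
        read_set: &HashSet<u32>,
        write_set: HashSet<u32>,
    ) -> Result<u64, String> {
        // Only one transaction at a time may validate and commit.
        let _guard = self.commit_lock.lock().unwrap();
        let mut committed = self.committed_txns.lock().unwrap();
        // A read-only transaction skips the check and can always commit.
        if !write_set.is_empty() {
            // While the commit lock is held, every entry above `read_ts` is
            // also below the expected commit timestamp.
            for (_, other_writes) in committed.range((read_ts + 1)..) {
                if !other_writes.is_disjoint(read_set) {
                    return Err("serializable check failed".to_string());
                }
            }
        }
        // Stand-in for `write_batch_inner`: allocate the next commit timestamp.
        let mut latest = self.latest_commit_ts.lock().unwrap();
        *latest += 1;
        let ts = *latest;
        // Remember the write set so that later commits can validate against it.
        committed.insert(ts, write_set);
        Ok(ts)
    }
}
```

Holding the commit lock across both validation and the write keeps the check and the timestamp allocation atomic with respect to other commits.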
## Test Your Understanding ## Test Your Understanding
* If you have some experience with building a relational database, you may think about the following question: assume that we build a database based on Mini-LSM where we store each row in the relation table as a key-value pair (key: primary key, value: serialized row) and enable serializable verification, does the database system directly gain ANSI serializable isolation level capability? Why or why not? * If you have some experience with building a relational database, you may think about the following question: assume that we build a database based on Mini-LSM where we store each row in the relation table as a key-value pair (key: primary key, value: serialized row) and enable serializable verification, does the database system directly gain ANSI serializable isolation level capability? Why or why not?
* The thing we implement here is actually write snapshot-isolation (see [A critique of snapshot isolation](https://dl.acm.org/doi/abs/10.1145/2168836.2168853)) that guarantees serializability. Are there any cases where the execution is serializable, but will be rejected by the write snapshot-isolation validation?
* There are databases that claim they have serializable snapshot isolation support by only tracking the keys accessed in gets and scans. Do they really prevent write skews caused by phantoms? (Okay... Actually, I'm talking about [BadgerDB](https://dgraph.io/blog/post/badger-txn/).)
We do not provide reference answers to the questions. Feel free to discuss them in the Discord community. We do not provide reference answers to the questions. Feel free to discuss them in the Discord community.
## Bonus Tasks ## Bonus Tasks
* **Read-Only Transactions.** With serializable enabled, we will need to keep track of the read set for a transaction. * **Read-Only Transactions.** With serializable enabled, we will need to keep track of the read set for a transaction.
* **Precision/Predicate Locking.** The read set can be maintained using a range instead of a single key. This would be useful when a user scans the full key space. * **Precision/Predicate Locking.** The read set can be maintained using a range instead of a single key. This would be useful when a user scans the full key space. This will also enable serializable verification for scan.
{{#include copyright.md}} {{#include copyright.md}}

@@ -243,15 +243,11 @@ impl MiniLsm {
})) }))
} }
pub fn new_txn(&self) -> Result<Arc<Transaction>> {
self.inner.new_txn()
}
pub fn get(&self, key: &[u8]) -> Result<Option<Bytes>> { pub fn get(&self, key: &[u8]) -> Result<Option<Bytes>> {
self.inner.get(key) self.inner.get(key)
} }
pub fn write_batch<T: AsRef<[u8]>>(&self, batch: &[WriteBatchRecord<T>]) -> Result<u64> { pub fn write_batch<T: AsRef<[u8]>>(&self, batch: &[WriteBatchRecord<T>]) -> Result<()> {
self.inner.write_batch(batch) self.inner.write_batch(batch)
} }
@@ -267,6 +263,10 @@ impl MiniLsm {
self.inner.sync() self.inner.sync()
} }
pub fn new_txn(&self) -> Result<Arc<Transaction>> {
self.inner.new_txn()
}
pub fn scan(&self, lower: Bound<&[u8]>, upper: Bound<&[u8]>) -> Result<TxnIterator> { pub fn scan(&self, lower: Bound<&[u8]>, upper: Bound<&[u8]>) -> Result<TxnIterator> {
self.inner.scan(lower, upper) self.inner.scan(lower, upper)
} }
@@ -302,8 +302,6 @@ impl LsmStorageInner {
self.manifest.as_ref().unwrap() self.manifest.as_ref().unwrap()
} }
/// Start the storage engine by either loading an existing directory or creating a new one if the directory does
/// not exist.
/// Start the storage engine by either loading an existing directory or creating a new one if the directory does /// Start the storage engine by either loading an existing directory or creating a new one if the directory does
/// not exist. /// not exist.
pub(crate) fn open(path: impl AsRef<Path>, options: LsmStorageOptions) -> Result<Self> { pub(crate) fn open(path: impl AsRef<Path>, options: LsmStorageOptions) -> Result<Self> {
@@ -443,9 +441,6 @@ impl LsmStorageInner {
self.state.read().memtable.sync_wal() self.state.read().memtable.sync_wal()
} }
pub fn new_txn(self: &Arc<Self>) -> Result<Arc<Transaction>> {
Ok(self.mvcc().new_txn(self.clone(), self.options.serializable))
}
/// Get a key from the storage. In day 7, this can be further optimized by using a bloom filter. /// Get a key from the storage. In day 7, this can be further optimized by using a bloom filter.
pub fn get(self: &Arc<Self>, key: &[u8]) -> Result<Option<Bytes>> { pub fn get(self: &Arc<Self>, key: &[u8]) -> Result<Option<Bytes>> {
let txn = self.mvcc().new_txn(self.clone(), self.options.serializable); let txn = self.mvcc().new_txn(self.clone(), self.options.serializable);
@@ -531,7 +526,7 @@ impl LsmStorageInner {
Ok(None) Ok(None)
} }
pub fn write_batch<T: AsRef<[u8]>>(&self, batch: &[WriteBatchRecord<T>]) -> Result<u64> { pub fn write_batch_inner<T: AsRef<[u8]>>(&self, batch: &[WriteBatchRecord<T>]) -> Result<u64> {
let _lck = self.mvcc().write_lock.lock(); let _lck = self.mvcc().write_lock.lock();
let ts = self.mvcc().latest_commit_ts() + 1; let ts = self.mvcc().latest_commit_ts() + 1;
for record in batch { for record in batch {
@@ -566,10 +561,33 @@ impl LsmStorageInner {
Ok(ts) Ok(ts)
} }
pub fn write_batch<T: AsRef<[u8]>>(
self: &Arc<Self>,
batch: &[WriteBatchRecord<T>],
) -> Result<()> {
if !self.options.serializable {
self.write_batch_inner(batch)?;
} else {
let txn = self.mvcc().new_txn(self.clone(), self.options.serializable);
for record in batch {
match record {
WriteBatchRecord::Del(key) => {
txn.delete(key.as_ref());
}
WriteBatchRecord::Put(key, value) => {
txn.put(key.as_ref(), value.as_ref());
}
}
}
txn.commit()?;
}
Ok(())
}
/// Put a key-value pair into the storage by writing into the current memtable. /// Put a key-value pair into the storage by writing into the current memtable.
pub fn put(self: &Arc<Self>, key: &[u8], value: &[u8]) -> Result<()> { pub fn put(self: &Arc<Self>, key: &[u8], value: &[u8]) -> Result<()> {
if !self.options.serializable { if !self.options.serializable {
self.write_batch(&[WriteBatchRecord::Put(key, value)])?; self.write_batch_inner(&[WriteBatchRecord::Put(key, value)])?;
} else { } else {
let txn = self.mvcc().new_txn(self.clone(), self.options.serializable); let txn = self.mvcc().new_txn(self.clone(), self.options.serializable);
txn.put(key, value); txn.put(key, value);
@@ -581,7 +599,7 @@ impl LsmStorageInner {
/// Remove a key from the storage by writing an empty value. /// Remove a key from the storage by writing an empty value.
pub fn delete(self: &Arc<Self>, key: &[u8]) -> Result<()> { pub fn delete(self: &Arc<Self>, key: &[u8]) -> Result<()> {
if !self.options.serializable { if !self.options.serializable {
self.write_batch(&[WriteBatchRecord::Del(key)])?; self.write_batch_inner(&[WriteBatchRecord::Del(key)])?;
} else { } else {
let txn = self.mvcc().new_txn(self.clone(), self.options.serializable); let txn = self.mvcc().new_txn(self.clone(), self.options.serializable);
txn.delete(key); txn.delete(key);
@@ -720,6 +738,10 @@ impl LsmStorageInner {
Ok(()) Ok(())
} }
pub fn new_txn(self: &Arc<Self>) -> Result<Arc<Transaction>> {
Ok(self.mvcc().new_txn(self.clone(), self.options.serializable))
}
/// Create an iterator over a range of keys. /// Create an iterator over a range of keys.
pub fn scan<'a>( pub fn scan<'a>(
self: &'a Arc<Self>, self: &'a Arc<Self>,

@@ -1,3 +1,6 @@
#![allow(unused_variables)] // TODO(you): remove this lint after implementing this mod
#![allow(dead_code)] // TODO(you): remove this lint after implementing this mod
pub mod txn; pub mod txn;
pub mod watermark; pub mod watermark;
@@ -23,6 +26,7 @@ pub(crate) struct CommittedTxnData {
pub(crate) struct LsmMvccInner { pub(crate) struct LsmMvccInner {
pub(crate) write_lock: Mutex<()>, pub(crate) write_lock: Mutex<()>,
pub(crate) commit_lock: Mutex<()>,
pub(crate) ts: Arc<Mutex<(u64, Watermark)>>, pub(crate) ts: Arc<Mutex<(u64, Watermark)>>,
pub(crate) committed_txns: Arc<Mutex<BTreeMap<u64, CommittedTxnData>>>, pub(crate) committed_txns: Arc<Mutex<BTreeMap<u64, CommittedTxnData>>>,
} }
@@ -31,6 +35,7 @@ impl LsmMvccInner {
pub fn new(initial_ts: u64) -> Self { pub fn new(initial_ts: u64) -> Self {
Self { Self {
write_lock: Mutex::new(()), write_lock: Mutex::new(()),
commit_lock: Mutex::new(()),
ts: Arc::new(Mutex::new((initial_ts, Watermark::new()))), ts: Arc::new(Mutex::new((initial_ts, Watermark::new()))),
committed_txns: Arc::new(Mutex::new(BTreeMap::new())), committed_txns: Arc::new(Mutex::new(BTreeMap::new())),
} }

@@ -7,7 +7,7 @@ use std::{
}, },
}; };
use anyhow::Result; use anyhow::{bail, Result};
use bytes::Bytes; use bytes::Bytes;
use crossbeam_skiplist::{map::Entry, SkipMap}; use crossbeam_skiplist::{map::Entry, SkipMap};
use ouroboros::self_referencing; use ouroboros::self_referencing;
@@ -18,6 +18,7 @@ use crate::{
lsm_iterator::{FusedIterator, LsmIterator}, lsm_iterator::{FusedIterator, LsmIterator},
lsm_storage::{LsmStorageInner, WriteBatchRecord}, lsm_storage::{LsmStorageInner, WriteBatchRecord},
mem_table::map_bound, mem_table::map_bound,
mvcc::CommittedTxnData,
}; };
pub struct Transaction { pub struct Transaction {
@@ -34,6 +35,11 @@ impl Transaction {
if self.committed.load(Ordering::SeqCst) { if self.committed.load(Ordering::SeqCst) {
panic!("cannot operate on committed txn!"); panic!("cannot operate on committed txn!");
} }
if let Some(guard) = &self.key_hashes {
let mut guard = guard.lock();
let (_, read_set) = &mut *guard;
read_set.insert(farmhash::hash32(key));
}
if let Some(entry) = self.local_storage.get(key) { if let Some(entry) = self.local_storage.get(key) {
if entry.value().is_empty() { if entry.value().is_empty() {
return Ok(None); return Ok(None);
@@ -75,7 +81,7 @@ impl Transaction {
if let Some(key_hashes) = &self.key_hashes { if let Some(key_hashes) = &self.key_hashes {
let mut key_hashes = key_hashes.lock(); let mut key_hashes = key_hashes.lock();
let (write_hashes, _) = &mut *key_hashes; let (write_hashes, _) = &mut *key_hashes;
write_hashes.insert(crc32fast::hash(key)); write_hashes.insert(farmhash::hash32(key));
} }
} }
@@ -88,7 +94,7 @@ impl Transaction {
if let Some(key_hashes) = &self.key_hashes { if let Some(key_hashes) = &self.key_hashes {
let mut key_hashes = key_hashes.lock(); let mut key_hashes = key_hashes.lock();
let (write_hashes, _) = &mut *key_hashes; let (write_hashes, _) = &mut *key_hashes;
write_hashes.insert(crc32fast::hash(key)); write_hashes.insert(farmhash::hash32(key));
} }
} }
@@ -96,6 +102,29 @@ impl Transaction {
self.committed self.committed
.compare_exchange(false, true, Ordering::SeqCst, Ordering::SeqCst) .compare_exchange(false, true, Ordering::SeqCst, Ordering::SeqCst)
.expect("cannot operate on committed txn!"); .expect("cannot operate on committed txn!");
let _commit_lock = self.inner.mvcc().commit_lock.lock();
let serializability_check;
if let Some(guard) = &self.key_hashes {
let guard = guard.lock();
let (write_set, read_set) = &*guard;
println!(
"commit txn: write_set: {:?}, read_set: {:?}",
write_set, read_set
);
if !write_set.is_empty() {
let committed_txns = self.inner.mvcc().committed_txns.lock();
for (_, txn_data) in committed_txns.range((self.read_ts + 1)..) {
for key_hash in read_set {
if txn_data.key_hashes.contains(key_hash) {
bail!("serializable check failed");
}
}
}
}
serializability_check = true;
} else {
serializability_check = false;
}
let batch = self let batch = self
.local_storage .local_storage
.iter() .iter()
@@ -107,7 +136,32 @@ impl Transaction {
} }
}) })
.collect::<Vec<_>>(); .collect::<Vec<_>>();
self.inner.write_batch(&batch)?; let ts = self.inner.write_batch_inner(&batch)?;
if serializability_check {
let mut committed_txns = self.inner.mvcc().committed_txns.lock();
let mut key_hashes = self.key_hashes.as_ref().unwrap().lock();
let (write_set, _) = &mut *key_hashes;
let old_data = committed_txns.insert(
ts,
CommittedTxnData {
key_hashes: std::mem::take(write_set),
read_ts: self.read_ts,
commit_ts: ts,
},
);
assert!(old_data.is_none());
// remove unneeded txn data
let watermark = self.inner.mvcc().watermark();
while let Some(entry) = committed_txns.first_entry() {
if *entry.key() < watermark {
entry.remove();
} else {
break;
}
}
}
Ok(()) Ok(())
} }
} }
@@ -164,7 +218,7 @@ impl StorageIterator for TxnLocalIterator {
} }
pub struct TxnIterator { pub struct TxnIterator {
_txn: Arc<Transaction>, txn: Arc<Transaction>,
iter: TwoMergeIterator<TxnLocalIterator, FusedIterator<LsmIterator>>, iter: TwoMergeIterator<TxnLocalIterator, FusedIterator<LsmIterator>>,
} }
@@ -173,8 +227,11 @@ impl TxnIterator {
txn: Arc<Transaction>, txn: Arc<Transaction>,
iter: TwoMergeIterator<TxnLocalIterator, FusedIterator<LsmIterator>>, iter: TwoMergeIterator<TxnLocalIterator, FusedIterator<LsmIterator>>,
) -> Result<Self> { ) -> Result<Self> {
let mut iter = Self { _txn: txn, iter }; let mut iter = Self { txn, iter };
iter.skip_deletes()?; iter.skip_deletes()?;
if iter.is_valid() {
iter.add_to_read_set(iter.key());
}
Ok(iter) Ok(iter)
} }
@@ -184,6 +241,14 @@ impl TxnIterator {
} }
Ok(()) Ok(())
} }
fn add_to_read_set(&self, key: &[u8]) {
if let Some(guard) = &self.txn.key_hashes {
let mut guard = guard.lock();
let (_, read_set) = &mut *guard;
read_set.insert(farmhash::hash32(key));
}
}
} }
impl StorageIterator for TxnIterator { impl StorageIterator for TxnIterator {
@@ -204,6 +269,9 @@ impl StorageIterator for TxnIterator {
fn next(&mut self) -> Result<()> { fn next(&mut self) -> Result<()> {
self.iter.next()?; self.iter.next()?;
self.skip_deletes()?; self.skip_deletes()?;
if self.is_valid() {
self.add_to_read_set(self.key());
}
Ok(()) Ok(())
} }

@@ -17,3 +17,4 @@ mod week3_day2;
mod week3_day3; mod week3_day3;
mod week3_day4; mod week3_day4;
mod week3_day5; mod week3_day5;
mod week3_day6;

@@ -46,6 +46,8 @@ fn test_txn_integration() {
], ],
); );
let txn4 = storage.new_txn().unwrap(); let txn4 = storage.new_txn().unwrap();
assert_eq!(txn4.get(b"test1").unwrap(), Some(Bytes::from("233")));
assert_eq!(txn4.get(b"test2").unwrap(), Some(Bytes::from("233")));
check_lsm_iter_result_by_key( check_lsm_iter_result_by_key(
&mut txn4.scan(Bound::Unbounded, Bound::Unbounded).unwrap(), &mut txn4.scan(Bound::Unbounded, Bound::Unbounded).unwrap(),
vec![ vec![
@@ -53,4 +55,21 @@ fn test_txn_integration() {
(Bytes::from("test2"), Bytes::from("233")), (Bytes::from("test2"), Bytes::from("233")),
], ],
); );
txn4.put(b"test2", b"2333");
assert_eq!(txn4.get(b"test1").unwrap(), Some(Bytes::from("233")));
assert_eq!(txn4.get(b"test2").unwrap(), Some(Bytes::from("2333")));
check_lsm_iter_result_by_key(
&mut txn4.scan(Bound::Unbounded, Bound::Unbounded).unwrap(),
vec![
(Bytes::from("test1"), Bytes::from("233")),
(Bytes::from("test2"), Bytes::from("2333")),
],
);
txn4.delete(b"test2");
assert_eq!(txn4.get(b"test1").unwrap(), Some(Bytes::from("233")));
assert_eq!(txn4.get(b"test2").unwrap(), None);
check_lsm_iter_result_by_key(
&mut txn4.scan(Bound::Unbounded, Bound::Unbounded).unwrap(),
vec![(Bytes::from("test1"), Bytes::from("233"))],
);
} }

@@ -0,0 +1,108 @@
use std::ops::Bound;
use bytes::Bytes;
use tempfile::tempdir;
use crate::{
compact::CompactionOptions,
iterators::StorageIterator,
lsm_storage::{LsmStorageOptions, MiniLsm},
};
#[test]
fn test_serializable_1() {
let dir = tempdir().unwrap();
let mut options = LsmStorageOptions::default_for_week2_test(CompactionOptions::NoCompaction);
options.serializable = true;
let storage = MiniLsm::open(&dir, options.clone()).unwrap();
storage.put(b"key1", b"1").unwrap();
storage.put(b"key2", b"2").unwrap();
let txn1 = storage.new_txn().unwrap();
let txn2 = storage.new_txn().unwrap();
txn1.put(b"key1", &txn1.get(b"key2").unwrap().unwrap());
txn2.put(b"key2", &txn2.get(b"key1").unwrap().unwrap());
txn1.commit().unwrap();
assert!(txn2.commit().is_err());
drop(txn2);
assert_eq!(storage.get(b"key1").unwrap(), Some(Bytes::from("2")));
assert_eq!(storage.get(b"key2").unwrap(), Some(Bytes::from("2")));
}
#[test]
fn test_serializable_2() {
let dir = tempdir().unwrap();
let mut options = LsmStorageOptions::default_for_week2_test(CompactionOptions::NoCompaction);
options.serializable = true;
let storage = MiniLsm::open(&dir, options.clone()).unwrap();
let txn1 = storage.new_txn().unwrap();
let txn2 = storage.new_txn().unwrap();
txn1.put(b"key1", b"1");
txn2.put(b"key1", b"2");
txn1.commit().unwrap();
txn2.commit().unwrap();
assert_eq!(storage.get(b"key1").unwrap(), Some(Bytes::from("2")));
}
#[test]
fn test_serializable_3_ts_range() {
let dir = tempdir().unwrap();
let mut options = LsmStorageOptions::default_for_week2_test(CompactionOptions::NoCompaction);
options.serializable = true;
let storage = MiniLsm::open(&dir, options.clone()).unwrap();
storage.put(b"key1", b"1").unwrap();
storage.put(b"key2", b"2").unwrap();
let txn1 = storage.new_txn().unwrap();
txn1.put(b"key1", &txn1.get(b"key2").unwrap().unwrap());
txn1.commit().unwrap();
let txn2 = storage.new_txn().unwrap();
txn2.put(b"key2", &txn2.get(b"key1").unwrap().unwrap());
txn2.commit().unwrap();
drop(txn2);
assert_eq!(storage.get(b"key1").unwrap(), Some(Bytes::from("2")));
assert_eq!(storage.get(b"key2").unwrap(), Some(Bytes::from("2")));
}
#[test]
fn test_serializable_4_scan() {
let dir = tempdir().unwrap();
let mut options = LsmStorageOptions::default_for_week2_test(CompactionOptions::NoCompaction);
options.serializable = true;
let storage = MiniLsm::open(&dir, options.clone()).unwrap();
storage.put(b"key1", b"1").unwrap();
storage.put(b"key2", b"2").unwrap();
let txn1 = storage.new_txn().unwrap();
let txn2 = storage.new_txn().unwrap();
txn1.put(b"key1", &txn1.get(b"key2").unwrap().unwrap());
txn1.commit().unwrap();
let mut iter = txn2.scan(Bound::Unbounded, Bound::Unbounded).unwrap();
while iter.is_valid() {
iter.next().unwrap();
}
txn2.put(b"key2", b"1");
assert!(txn2.commit().is_err());
drop(txn2);
assert_eq!(storage.get(b"key1").unwrap(), Some(Bytes::from("2")));
assert_eq!(storage.get(b"key2").unwrap(), Some(Bytes::from("2")));
}
#[test]
fn test_serializable_5_read_only() {
let dir = tempdir().unwrap();
let mut options = LsmStorageOptions::default_for_week2_test(CompactionOptions::NoCompaction);
options.serializable = true;
let storage = MiniLsm::open(&dir, options.clone()).unwrap();
storage.put(b"key1", b"1").unwrap();
storage.put(b"key2", b"2").unwrap();
let txn1 = storage.new_txn().unwrap();
txn1.put(b"key1", &txn1.get(b"key2").unwrap().unwrap());
txn1.commit().unwrap();
let txn2 = storage.new_txn().unwrap();
txn2.get(b"key1").unwrap().unwrap();
let mut iter = txn2.scan(Bound::Unbounded, Bound::Unbounded).unwrap();
while iter.is_valid() {
iter.next().unwrap();
}
txn2.commit().unwrap();
assert_eq!(storage.get(b"key1").unwrap(), Some(Bytes::from("2")));
assert_eq!(storage.get(b"key2").unwrap(), Some(Bytes::from("2")));
}

@@ -25,6 +25,7 @@ pub(crate) struct CommittedTxnData {
pub(crate) struct LsmMvccInner { pub(crate) struct LsmMvccInner {
pub(crate) write_lock: Mutex<()>, pub(crate) write_lock: Mutex<()>,
pub(crate) commit_lock: Mutex<()>,
pub(crate) ts: Arc<Mutex<(u64, Watermark)>>, pub(crate) ts: Arc<Mutex<(u64, Watermark)>>,
pub(crate) committed_txns: Arc<Mutex<BTreeMap<u64, CommittedTxnData>>>, pub(crate) committed_txns: Arc<Mutex<BTreeMap<u64, CommittedTxnData>>>,
} }
@@ -33,6 +34,7 @@ impl LsmMvccInner {
pub fn new(initial_ts: u64) -> Self { pub fn new(initial_ts: u64) -> Self {
Self { Self {
write_lock: Mutex::new(()), write_lock: Mutex::new(()),
commit_lock: Mutex::new(()),
ts: Arc::new(Mutex::new((initial_ts, Watermark::new()))), ts: Arc::new(Mutex::new((initial_ts, Watermark::new()))),
committed_txns: Arc::new(Mutex::new(BTreeMap::new())), committed_txns: Arc::new(Mutex::new(BTreeMap::new())),
} }