Skip to content

Java API: Add SizeApproximationOptions to the JNI binding#10640

Closed
bremac wants to merge 16 commits into
facebook:mainfrom
sightmachine:java-approximate-sizes
Closed

Java API: Add SizeApproximationOptions to the JNI binding#10640
bremac wants to merge 16 commits into
facebook:mainfrom
sightmachine:java-approximate-sizes

Conversation

@bremac

@bremac bremac commented Sep 5, 2022

Copy link
Copy Markdown
Contributor

This commit adds the SizeApproximationOptions structure to the JNI binding, plus the corresponding overload for RocksDB#getApproximateSizes(). With this change, it is possible to set the size estimation error margin from Java.

This patch is pretty straightforward, though I'm not terribly happy about the length of the two tests in RocksDBTest.java. Let me know if I should break them down further, and I'd be happy to oblige.

@bremac bremac changed the title Add SizeApproximationOptions to the JNI binding Sep 5, 2022
@bremac bremac force-pushed the java-approximate-sizes branch from 373d357 to dad13ff Compare September 5, 2022 22:45
@bremac bremac marked this pull request as draft September 5, 2022 22:56
@bremac bremac force-pushed the java-approximate-sizes branch from dad13ff to 815db69 Compare September 5, 2022 23:04
@bremac bremac marked this pull request as ready for review September 5, 2022 23:46
@vrdhn vrdhn deleted the java-approximate-sizes branch January 10, 2023 05:17
hx235 and others added 16 commits January 23, 2023 11:11
Summary:
PR facebook#11020 fixed a case where it was easy to deadlock the DB with LockWAL() but introduced a bug showing up as a rare assertion failure in the stress test. Specifically, `assert(w->state == STATE_INIT)` in `WriteThread::LinkOne()` called from `BeginWriteStall()`, `DelayWrite()`, `WriteImplWALOnly()`. I haven't been about to generate a unit test that reproduces this failure but I believe the root cause is that DelayWrite() was never meant to be re-entrant, only called from the DB's write_thread_ leader. facebook#11020 introduced a call to DelayWrite() from the nonmem_write_thread_ group leader.

This fix is to make DelayWrite() apply to the specific write queue that it is being called from (inject a dummy write stall entry to the head of the appropriate write queue). WriteController is re-entrant, based on polling and state changes signalled with bg_cv_, so can manage stalling two queues. The only anticipated complication (called out by Andrew in previous PR) is that we don't want timed write delays being injected in parallel for the two queues, because that dimishes the intended throttling effect. Thus, we only allow timed delays for the primary write queue.

HISTORY not updated because this is intended for the same release where the bug was introduced.

Pull Request resolved: facebook#11130

Test Plan:
Although I was not able to reproduce the assertion failure, I was able to reproduce a distinct flaw with what I believe is the same root cause: a kind of deadlock if both write queues need to wake up from stopped writes. Only one will be waiting on bg_cv_ (the other waiting in `LinkOne()` for the write queue to open up), so a single SignalAll() will only unblock one of the queues, with the other re-instating the stop until another signal on bg_cv_. A simple unit test is added for this case.

Will also run crash_test_with_multiops_wc_txn for a while looking for issues.

Reviewed By: ajkr

Differential Revision: D42749330

Pulled By: pdillinger

fbshipit-source-id: 4317dd899a93d57c26fd5af7143038f82d4d4d1b
…uest/Job owns flush_reason instead of CFD (facebook#11111)

Summary:
**Context:**
Concurrent flushes on the same CF can set on `ColumnFamilyData::flush_reason` before each other flush finishes. An symptom is one CF has different flush_reason with others though all of them are in an atomic flush  `db_stress: db/db_impl/db_impl_compaction_flush.cc:423: rocksdb::Status rocksdb::DBImpl::AtomicFlushMemTablesToOutputFiles(const rocksdb::autovector<rocksdb::DBImpl::BGFlushArg>&, bool*, rocksdb::JobContext*, rocksdb::LogBuffer*, rocksdb::Env::Priority): Assertion cfd->GetFlushReason() == cfds[0]->GetFlushReason() failed. `

**Summary:**
Suggested by ltamasi, we now refactor and let FlushRequest/Job to own flush_reason as there is no good way to define `ColumnFamilyData::flush_reason` in face of concurrent flushes on the same CF (which wasn't the case a long time ago when `ColumnFamilyData::flush_reason ` first introduced`)

**Tets:**
- new unit test
- make check
- aggressive crash test rehearsal

Pull Request resolved: facebook#11111

Reviewed By: ajkr

Differential Revision: D42644600

Pulled By: hx235

fbshipit-source-id: 8589c8184869d3415e5b780c887f877818a5ebaf
…ction (facebook#11165)

Summary:
This PR adds logic to the `RunManualCompaction()` loop to check for cancellation before waiting on any conflicting compactions to finish. In case of cancellation, `RunManualCompaction()` no longer waits on conflicting compactions

Pull Request resolved: facebook#11165

Test Plan: repro test case

Reviewed By: cbi42

Differential Revision: D42864058

Pulled By: ajkr

fbshipit-source-id: ea4dd1a8f294abe212905495a8fbe8f07fca3f5a
Summary: Pull Request resolved: facebook#11136

Test Plan: the provided unit test used to fail due to `GetMergeOperands()` returning `Status::MergeInProgress()`; it passes now because the `GetMergeOperands()` call returns `Status::OK()`

Reviewed By: pdillinger

Differential Revision: D42759198

Pulled By: ajkr

fbshipit-source-id: 878f9f40ccc1d7e2fe7b1352814bae3a49c19939
…ook#11171)

Summary:
Currently, we incorrectly return a Status::Corruption to the MultiGet caller if the file system ReadAsync cannot issue a read and returns an error for some reason, such as IOStatus::NotSupported(). In this PR, we copy the ReadAsync error to the request status so it can be returned to the user.

Tests:
Update existing unit tests and add a new one for this scenario

Pull Request resolved: facebook#11171

Reviewed By: akankshamahajan15

Differential Revision: D42950057

Pulled By: anand1976

fbshipit-source-id: 85ffcb015fa6c064c311f8a28488fec78c487869
Summary:
Fix a bug in the calculation of the input buffer address/offset in log_reader.cc. The bug is when consecutive fragments of a compressed record are located at the same offset in the log reader buffer, the second fragment input buffer is treated as a leftover from the previous input buffer. As a result, the offset in the `ZSTD_inBuffer` is not reset.

Pull Request resolved: facebook#11198

Test Plan: Add a unit test in log_test.cc that fails without the fix and passes with it.

Reviewed By: ajkr, cbi42

Differential Revision: D43102692

Pulled By: anand1976

fbshipit-source-id: aa2648f4802c33991b76a3233c5a58d4cc9e77fd
We're not interested in triggering the extremely slow upstream CircleCI
configuration, since it takes forever to run and burns a lot of cash.

This commit renames it so that we can add our own configuration without
conflicts.
Summary:
While PR#9749 nominally added support for XXH3 in the Java API, it did not update the `toCppChecksumType` method. As a result, setting the checksum type to XXH3 actually set it to CRC32c instead.

This commit adds the missing entry to portal.h, and also updates the tests so that they verify the options passed to RocksDB, instead of simply checking that the getter returns the value set by the setter.

Pull Request resolved: facebook#10862

Reviewed By: pdillinger

Differential Revision: D40665031

Pulled By: ajkr

fbshipit-source-id: 2834419b3361a4bac47db3b858951fb451b5bdc8
This import breaks the build since it doesn't work with the Junit 5 bundle.
Fortunately, nothing actually _uses_ the import, so we can just delete it.
This commit adds the SizeApproximationOptions structure to the JNI binding,
plus the corresponding overload for RocksDB#getApproximateSizes(). With this
change, it is possible to set the size estimation error margin from Java.
@bremac bremac force-pushed the java-approximate-sizes branch from 8c8f4a2 to 75df9a5 Compare March 10, 2023 01:31
@bremac bremac closed this Mar 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

6 participants