update to latest memchr + upgrade to Rust 2018 + bump MSRV to Rust 1.41 #767

Merged
BurntSushi merged 6 commits into master from ag/rust-2018-memmem
May 1, 2021

Conversation

BurntSushi
Member

The main motivation for this PR is to use the new memmem implementation in memchr 2.4 (not quite released at time of writing, but in a PR). This lets us delete regex's own bespoke substring search implementations ("FreqyPacked" along with Boyer-Moore). The main benefit of the new implementation is that it should roughly match the speed of the old algorithms while sustaining that speed in many more cases. That is, it should have far fewer weaknesses. Plus, the algorithm is now available for anyone to use without bringing in regex.

While we're here, we (finally) move to Rust 2018 and bump the MSRV to Rust 1.41 (since that's what's in Debian Stable). There's no particular reason why I waited so long to do this. It was never my intent to support such an old version of Rust for so long. There was just never a strong impetus to upgrade. But with Rust 2021 around the bend, it seems appropriate to at least migrate to Rust 2018. Hopefully we'll get to Rust 2021 sooner.

(The plan is to merge this PR once I do a similar change to the aho-corasick crate.)

BurntSushi added a commit to BurntSushi/aho-corasick that referenced this pull request Apr 30, 2021
This is in line with similar changes to the regex and memchr crates:
BurntSushi/memchr#82
and
rust-lang/regex#767
BurntSushi force-pushed the ag/rust-2018-memmem branch from 69f66d9 to a09f8d0 on April 30, 2021 23:39
This removes the ad hoc FreqyPacked searcher and the implementation of
Boyer-Moore, and replaces it with a new implementation of memmem in the
memchr crate. (Introduced in memchr 2.4.) Since memchr 2.4 also moves to
Rust 2018, we'll do the same in subsequent commits. (Finally.)

The benchmarks look about as expected. Latency on some of the smaller
benchmarks has worsened slightly, by a nanosecond or two. The top
throughput speed has also decreased, but some other benchmarks
(especially ones with frequent literal matches) have improved
dramatically.
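
For readers unfamiliar with this style of substring search, here is a minimal std-only sketch of the general idea behind frequency-guided searchers: pick the needle byte that is assumed to be rarest, scan the haystack for that byte, and verify the full needle at each candidate. This is neither the code removed from regex nor memchr's memmem. The `rank` heuristic and the function names are hypothetical, and a real implementation would likely use a vectorized byte scan and a measured frequency table.

```rust
/// Hypothetical rarity rank: lower means "assumed rarer". This toy heuristic
/// only exists for illustration; it is not a measured frequency table.
fn rank(b: u8) -> u8 {
    match b {
        b' ' | b'e' | b't' | b'a' | b'o' => 255,
        b'a'..=b'z' => 200,
        b'A'..=b'Z' | b'0'..=b'9' => 100,
        _ => 50,
    }
}

/// Returns the offset of the first occurrence of `needle` in `haystack`.
fn find_with_rare_byte(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    if needle.is_empty() {
        return Some(0);
    }
    if needle.len() > haystack.len() {
        return None;
    }
    // Offset and value of the needle byte with the lowest rank.
    let (rare_i, &rare) = needle
        .iter()
        .enumerate()
        .min_by_key(|&(_, &b)| rank(b))?;
    // The rare byte of any match must lie in haystack[rare_i..=last].
    let last = haystack.len() - needle.len() + rare_i;
    let mut pos = rare_i;
    while pos <= last {
        // Scan for the rare byte; stop if it does not occur again.
        let off = haystack[pos..=last].iter().position(|&b| b == rare)?;
        // Align the needle so its rare byte sits on the candidate, then verify.
        let start = pos + off - rare_i;
        if &haystack[start..start + needle.len()] == needle {
            return Some(start);
        }
        pos += off + 1;
    }
    None
}
```

The point of the rare-byte choice is that the verification step runs less often when the scanned byte seldom occurs in the haystack. A needle made only of common bytes gets little benefit from a heuristic like this.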
This commit does a number of manual fixups to the code after the
previous two commits were done via 'cargo fix' automatically.

Actually, this contains more 'cargo fix' annotations, since I had
forgotten to add 'edition = "2018"' to all sub-crates.
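
As a rough illustration of the kind of rewrite involved in an edition migration like this, the sketch below uses a few idioms commonly adopted with Rust 2018: `dyn Trait` for trait objects, the anonymous lifetime `'_`, and `?` where older code often used `try!`. The types and functions are hypothetical and are not taken from the regex codebase.

```rust
use std::fmt;
use std::num::ParseIntError;

// Hypothetical wrapper type, used only to show the idioms.
struct Literal<'a>(&'a str);

// Anonymous lifetime `'_` instead of naming a lifetime that is never used.
impl fmt::Display for Literal<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "lit({})", self.0)
    }
}

// `dyn Trait` makes trait objects explicit (older code wrote `Box<fmt::Display>`).
fn render(items: &[Box<dyn fmt::Display>]) -> String {
    items
        .iter()
        .map(|item| item.to_string())
        .collect::<Vec<_>>()
        .join(" ")
}

// `?` propagates errors where older code often used the `try!` macro.
fn parse_span(s: &str) -> Result<(usize, usize), ParseIntError> {
    let mut parts = s.splitn(2, "..");
    let start: usize = parts.next().unwrap_or("").trim().parse()?;
    let end: usize = parts.next().unwrap_or("").trim().parse()?;
    Ok((start, end))
}
```

For example, `parse_span("3..7")` yields `Ok((3, 7))`, while `parse_span("3..x")` returns the parse error from the second field.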
This was long overdue, and we were motivated by memchr's move to Rust
2018 in BurntSushi/memchr#82.

Rust 1.41.1 was selected because it's the current version of Rust in
Debian Stable. It also feels old enough to assure wide support.
It looks like 'cargo fix' didn't do this.
BurntSushi force-pushed the ag/rust-2018-memmem branch from a09f8d0 to dada2ce on April 30, 2021 23:54
BurntSushi merged commit a2a393f into master on May 1, 2021
BurntSushi deleted the ag/rust-2018-memmem branch on May 1, 2021 00:04

BurntSushi
Member Author

This PR is on crates.io in regex 1.5.0.
