Mirror of https://github.com/BurntSushi/ripgrep.git, synced 2025-08-17 13:13:57 -07:00
Compare commits: grep-print...ag/partial (9 commits)

| SHA1 |
|---|
| a872d33714 |
| f08f274c5f |
| db7e828989 |
| fb6cad7152 |
| 8e1d40ed7d |
| b1c064d5af |
| 26a83c6301 |
| 5e50a3c43c |
| 85417e52e9 |
@@ -63,13 +63,13 @@ matrix:
 # Minimum Rust supported channel. We enable these to make sure ripgrep
 # continues to work on the advertised minimum Rust version.
 - os: linux
-rust: 1.34.0
+rust: 1.32.0
 env: TARGET=x86_64-unknown-linux-gnu
 - os: linux
-rust: 1.34.0
+rust: 1.32.0
 env: TARGET=x86_64-unknown-linux-musl
 - os: linux
-rust: 1.34.0
+rust: 1.32.0
 env: TARGET=arm-unknown-linux-gnueabihf GCC_VERSION=4.8
 addons:
 apt:
80
CHANGELOG.md
@@ -1,20 +1,6 @@
-11.0.0 (TBD)
+0.11.0 (TBD)
 ============
-ripgrep 11 is a new major version release of ripgrep that contains many bug
-fixes, some performance improvements and a few feature enhancements. Notably,
-ripgrep's user experience for binary file filtering has been improved. See the
-[guide's new section on binary data](GUIDE.md#binary-data) for more details.
-
-This release also marks a change in ripgrep's versioning. Where as the previous
-version was `0.10.0`, this version is `11.0.0`. Moving forward, ripgrep's
-major version will be increased a few times per year. ripgrep will continue to
-be conservative with respect to backwards compatibility, but may occasionally
-introduce breaking changes, which will always be documented in this CHANGELOG.
-See [issue 1172](https://github.com/BurntSushi/ripgrep/issues/1172) for a bit
-more detail on why this versioning change was made.
-
-This release increases the **minimum supported Rust version** from 1.28.0 to
-1.34.0.
+TODO.
 
 **BREAKING CHANGES**:
 
@@ -25,91 +11,45 @@ This release increases the **minimum supported Rust version** from 1.28.0 to
 error (e.g., regex syntax error). One exception to this is if ripgrep is run
 with `-q/--quiet`. In that case, if an error occurs and a match is found,
 then ripgrep will exit with a `0` exit status code.
-* Supplying the `-u/--unrestricted` flag three times is now equivalent to
-supplying `--no-ignore --hidden --binary`. Previously, `-uuu` was equivalent
-to `--no-ignore --hidden --text`. The difference is that `--binary` disables
-binary file filtering without potentially dumping binary data into your
-terminal. That is, `rg -uuu foo` should now be equivalent to `grep -r foo`.
 * The `avx-accel` feature of ripgrep has been removed since it is no longer
 necessary. All uses of AVX in ripgrep are now enabled automatically via
 runtime CPU feature detection. The `simd-accel` feature does remain
 available, however, it does increase compilation times substantially at the
 moment.
 
-Performance improvements:
-
-* [PERF #497](https://github.com/BurntSushi/ripgrep/issues/497),
-[PERF #838](https://github.com/BurntSushi/ripgrep/issues/838):
-Make `rg -F -f dictionary-of-literals` much faster.
-
 Feature enhancements:
 
-* Added or improved file type filtering for Apache Thrift, ASP, Bazel, Brotli,
-BuildStream, bzip2, C, C++, Cython, gzip, Java, Make, Postscript, QML, Tex,
-XML, xz, zig and zstd.
-* [FEATURE #855](https://github.com/BurntSushi/ripgrep/issues/855):
-Add `--binary` flag for disabling binary file filtering.
-* [FEATURE #1078](https://github.com/BurntSushi/ripgrep/pull/1078):
-Add `--max-columns-preview` flag for showing a preview of long lines.
 * [FEATURE #1099](https://github.com/BurntSushi/ripgrep/pull/1099):
 Add support for Brotli and Zstd to the `-z/--search-zip` flag.
 * [FEATURE #1138](https://github.com/BurntSushi/ripgrep/pull/1138):
 Add `--no-ignore-dot` flag for ignoring `.ignore` files.
-* [FEATURE #1155](https://github.com/BurntSushi/ripgrep/pull/1155):
-Add `--auto-hybrid-regex` flag for automatically falling back to PCRE2.
 * [FEATURE #1159](https://github.com/BurntSushi/ripgrep/pull/1159):
 ripgrep's exit status logic should now match GNU grep. See updated man page.
-* [FEATURE #1164](https://github.com/BurntSushi/ripgrep/pull/1164):
-Add `--ignore-file-case-insensitive` for case insensitive ignore globs.
-* [FEATURE #1185](https://github.com/BurntSushi/ripgrep/pull/1185):
-Add `-I` flag as a short option for the `--no-filename` flag.
-* [FEATURE #1207](https://github.com/BurntSushi/ripgrep/pull/1207):
-Add `none` value to `-E/--encoding` to forcefully disable all transcoding.
-* [FEATURE da9d7204](https://github.com/BurntSushi/ripgrep/commit/da9d7204):
-Add `--pcre2-version` for querying showing PCRE2 version information.
+* [FEATURE #1170](https://github.com/BurntSushi/ripgrep/pull/1170):
+Add `--ignore-file-case-insensitive` for case insensitive .ignore globs.
 
 Bug fixes:
 
-* [BUG #306](https://github.com/BurntSushi/ripgrep/issues/306),
-[BUG #855](https://github.com/BurntSushi/ripgrep/issues/855):
-Improve the user experience for ripgrep's binary file filtering.
 * [BUG #373](https://github.com/BurntSushi/ripgrep/issues/373),
 [BUG #1098](https://github.com/BurntSushi/ripgrep/issues/1098):
 `**` is now accepted as valid syntax anywhere in a glob.
 * [BUG #916](https://github.com/BurntSushi/ripgrep/issues/916):
 ripgrep no longer hangs when searching `/proc` with a zombie process present.
-* [BUG #1052](https://github.com/BurntSushi/ripgrep/issues/1052):
-Fix bug where ripgrep could panic when transcoding UTF-16 files.
-* [BUG #1055](https://github.com/BurntSushi/ripgrep/issues/1055):
-Suggest `-U/--multiline` when a pattern contains a `\n`.
-* [BUG #1063](https://github.com/BurntSushi/ripgrep/issues/1063):
-Always strip a BOM if it's present, even for UTF-8.
-* [BUG #1064](https://github.com/BurntSushi/ripgrep/issues/1064):
-Fix inner literal detection that could lead to incorrect matches.
-* [BUG #1079](https://github.com/BurntSushi/ripgrep/issues/1079):
-Fixes a bug where the order of globs could result in missing a match.
-* [BUG #1089](https://github.com/BurntSushi/ripgrep/issues/1089):
-Fix another bug where ripgrep could panic when transcoding UTF-16 files.
 * [BUG #1091](https://github.com/BurntSushi/ripgrep/issues/1091):
 Add note about inverted flags to the man page.
-* [BUG #1093](https://github.com/BurntSushi/ripgrep/pull/1093):
-Fix handling of literal slashes in gitignore patterns.
 * [BUG #1095](https://github.com/BurntSushi/ripgrep/issues/1095):
 Fix corner cases involving the `--crlf` flag.
-* [BUG #1101](https://github.com/BurntSushi/ripgrep/issues/1101):
-Fix AsciiDoc escaping for man page output.
 * [BUG #1103](https://github.com/BurntSushi/ripgrep/issues/1103):
 Clarify what `--encoding auto` does.
 * [BUG #1106](https://github.com/BurntSushi/ripgrep/issues/1106):
 `--files-with-matches` and `--files-without-match` work with one file.
+* [BUG #1093](https://github.com/BurntSushi/ripgrep/pull/1093):
+Fix handling of literal slashes in gitignore patterns.
 * [BUG #1121](https://github.com/BurntSushi/ripgrep/issues/1121):
-Fix bug that was triggering Windows antimalware when using the `--files`
-flag.
+Fix bug that was triggering Windows antimalware when using the --files flag.
 * [BUG #1125](https://github.com/BurntSushi/ripgrep/issues/1125),
 [BUG #1159](https://github.com/BurntSushi/ripgrep/issues/1159):
 ripgrep shouldn't panic for `rg -h | rg` and should emit correct exit status.
-* [BUG #1144](https://github.com/BurntSushi/ripgrep/issues/1144):
-Fixes a bug where line numbers could be wrong on big-endian machines.
 * [BUG #1154](https://github.com/BurntSushi/ripgrep/issues/1154):
 Windows files with "hidden" attribute are now treated as hidden.
 * [BUG #1173](https://github.com/BurntSushi/ripgrep/issues/1173):
@@ -118,12 +58,6 @@ Bug fixes:
 Fix handling of repeated `**` patterns in gitignore files.
 * [BUG #1176](https://github.com/BurntSushi/ripgrep/issues/1176):
 Fix bug where `-F`/`-x` weren't applied to patterns given via `-f`.
-* [BUG #1189](https://github.com/BurntSushi/ripgrep/issues/1189):
-Document cases where ripgrep may use a lot of memory.
-* [BUG #1203](https://github.com/BurntSushi/ripgrep/issues/1203):
-Fix a matching bug related to the suffix literal optimization.
-* [BUG 8f14cb18](https://github.com/BurntSushi/ripgrep/commit/8f14cb18):
-Increase the default stack size for PCRE2's JIT.
 
 
 0.10.0 (2018-09-07)
117
Cargo.lock
generated
@@ -58,7 +58,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
 
 [[package]]
 name = "cc"
-version = "1.0.35"
+version = "1.0.34"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 
 [[package]]
@@ -68,12 +68,12 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
 
 [[package]]
 name = "clap"
-version = "2.33.0"
+version = "2.32.0"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 dependencies = [
 "bitflags 1.0.4 (registry+https://github.com/rust-lang/crates.io-index)",
-"strsim 0.8.0 (registry+https://github.com/rust-lang/crates.io-index)",
-"textwrap 0.11.0 (registry+https://github.com/rust-lang/crates.io-index)",
+"strsim 0.7.0 (registry+https://github.com/rust-lang/crates.io-index)",
+"textwrap 0.10.0 (registry+https://github.com/rust-lang/crates.io-index)",
 "unicode-width 0.1.5 (registry+https://github.com/rust-lang/crates.io-index)",
 ]
 
@@ -114,7 +114,7 @@ dependencies = [
 
 [[package]]
 name = "encoding_rs_io"
-version = "0.1.6"
+version = "0.1.5"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 dependencies = [
 "encoding_rs 0.8.17 (registry+https://github.com/rust-lang/crates.io-index)",
@@ -132,17 +132,17 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
 
 [[package]]
 name = "glob"
-version = "0.3.0"
+version = "0.2.11"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 
 [[package]]
 name = "globset"
-version = "0.4.3"
+version = "0.4.2"
 dependencies = [
 "aho-corasick 0.7.3 (registry+https://github.com/rust-lang/crates.io-index)",
 "bstr 0.1.2 (registry+https://github.com/rust-lang/crates.io-index)",
 "fnv 1.0.6 (registry+https://github.com/rust-lang/crates.io-index)",
-"glob 0.3.0 (registry+https://github.com/rust-lang/crates.io-index)",
+"glob 0.2.11 (registry+https://github.com/rust-lang/crates.io-index)",
 "log 0.4.6 (registry+https://github.com/rust-lang/crates.io-index)",
 "regex 1.1.5 (registry+https://github.com/rust-lang/crates.io-index)",
 ]
@@ -152,11 +152,11 @@ name = "grep"
 version = "0.2.3"
 dependencies = [
 "grep-cli 0.1.1",
-"grep-matcher 0.1.2",
-"grep-pcre2 0.1.3",
-"grep-printer 0.1.2",
-"grep-regex 0.1.3",
-"grep-searcher 0.1.4",
+"grep-matcher 0.1.1",
+"grep-pcre2 0.1.2",
+"grep-printer 0.1.1",
+"grep-regex 0.1.2",
+"grep-searcher 0.1.3",
 "termcolor 1.0.4 (registry+https://github.com/rust-lang/crates.io-index)",
 "walkdir 2.2.7 (registry+https://github.com/rust-lang/crates.io-index)",
 ]
@@ -167,7 +167,7 @@ version = "0.1.1"
 dependencies = [
 "atty 0.2.11 (registry+https://github.com/rust-lang/crates.io-index)",
 "bstr 0.1.2 (registry+https://github.com/rust-lang/crates.io-index)",
-"globset 0.4.3",
+"globset 0.4.2",
 "lazy_static 1.3.0 (registry+https://github.com/rust-lang/crates.io-index)",
 "log 0.4.6 (registry+https://github.com/rust-lang/crates.io-index)",
 "regex 1.1.5 (registry+https://github.com/rust-lang/crates.io-index)",
@@ -178,7 +178,7 @@ dependencies = [
 
 [[package]]
 name = "grep-matcher"
-version = "0.1.2"
+version = "0.1.1"
 dependencies = [
 "memchr 2.2.0 (registry+https://github.com/rust-lang/crates.io-index)",
 "regex 1.1.5 (registry+https://github.com/rust-lang/crates.io-index)",
@@ -186,21 +186,21 @@ dependencies = [
 
 [[package]]
 name = "grep-pcre2"
-version = "0.1.3"
+version = "0.1.2"
 dependencies = [
-"grep-matcher 0.1.2",
-"pcre2 0.2.0 (registry+https://github.com/rust-lang/crates.io-index)",
+"grep-matcher 0.1.1",
+"pcre2 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)",
 ]
 
 [[package]]
 name = "grep-printer"
-version = "0.1.2"
+version = "0.1.1"
 dependencies = [
 "base64 0.10.1 (registry+https://github.com/rust-lang/crates.io-index)",
 "bstr 0.1.2 (registry+https://github.com/rust-lang/crates.io-index)",
-"grep-matcher 0.1.2",
-"grep-regex 0.1.3",
-"grep-searcher 0.1.4",
+"grep-matcher 0.1.1",
+"grep-regex 0.1.2",
+"grep-searcher 0.1.3",
 "serde 1.0.90 (registry+https://github.com/rust-lang/crates.io-index)",
 "serde_derive 1.0.90 (registry+https://github.com/rust-lang/crates.io-index)",
 "serde_json 1.0.39 (registry+https://github.com/rust-lang/crates.io-index)",
@@ -209,10 +209,9 @@ dependencies = [
 
 [[package]]
 name = "grep-regex"
-version = "0.1.3"
+version = "0.1.2"
 dependencies = [
-"aho-corasick 0.7.3 (registry+https://github.com/rust-lang/crates.io-index)",
-"grep-matcher 0.1.2",
+"grep-matcher 0.1.1",
 "log 0.4.6 (registry+https://github.com/rust-lang/crates.io-index)",
 "regex 1.1.5 (registry+https://github.com/rust-lang/crates.io-index)",
 "regex-syntax 0.6.6 (registry+https://github.com/rust-lang/crates.io-index)",
@@ -222,14 +221,14 @@ dependencies = [
 
 [[package]]
 name = "grep-searcher"
-version = "0.1.4"
+version = "0.1.3"
 dependencies = [
 "bstr 0.1.2 (registry+https://github.com/rust-lang/crates.io-index)",
 "bytecount 0.5.1 (registry+https://github.com/rust-lang/crates.io-index)",
 "encoding_rs 0.8.17 (registry+https://github.com/rust-lang/crates.io-index)",
-"encoding_rs_io 0.1.6 (registry+https://github.com/rust-lang/crates.io-index)",
-"grep-matcher 0.1.2",
-"grep-regex 0.1.3",
+"encoding_rs_io 0.1.5 (registry+https://github.com/rust-lang/crates.io-index)",
+"grep-matcher 0.1.1",
+"grep-regex 0.1.2",
 "log 0.4.6 (registry+https://github.com/rust-lang/crates.io-index)",
 "memmap 0.7.0 (registry+https://github.com/rust-lang/crates.io-index)",
 "regex 1.1.5 (registry+https://github.com/rust-lang/crates.io-index)",
@@ -237,10 +236,10 @@ dependencies = [
 
 [[package]]
 name = "ignore"
-version = "0.4.7"
+version = "0.4.6"
 dependencies = [
 "crossbeam-channel 0.3.8 (registry+https://github.com/rust-lang/crates.io-index)",
-"globset 0.4.3",
+"globset 0.4.2",
 "lazy_static 1.3.0 (registry+https://github.com/rust-lang/crates.io-index)",
 "log 0.4.6 (registry+https://github.com/rust-lang/crates.io-index)",
 "memchr 2.2.0 (registry+https://github.com/rust-lang/crates.io-index)",
@@ -307,21 +306,21 @@ dependencies = [
 
 [[package]]
 name = "pcre2"
-version = "0.2.0"
+version = "0.1.1"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 dependencies = [
 "libc 0.2.51 (registry+https://github.com/rust-lang/crates.io-index)",
 "log 0.4.6 (registry+https://github.com/rust-lang/crates.io-index)",
-"pcre2-sys 0.2.0 (registry+https://github.com/rust-lang/crates.io-index)",
+"pcre2-sys 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)",
 "thread_local 0.3.6 (registry+https://github.com/rust-lang/crates.io-index)",
 ]
 
 [[package]]
 name = "pcre2-sys"
-version = "0.2.0"
+version = "0.1.1"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 dependencies = [
-"cc 1.0.35 (registry+https://github.com/rust-lang/crates.io-index)",
+"cc 1.0.34 (registry+https://github.com/rust-lang/crates.io-index)",
 "libc 0.2.51 (registry+https://github.com/rust-lang/crates.io-index)",
 "pkg-config 0.3.14 (registry+https://github.com/rust-lang/crates.io-index)",
 ]
@@ -341,7 +340,7 @@ dependencies = [
 
 [[package]]
 name = "quote"
-version = "0.6.12"
+version = "0.6.11"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 dependencies = [
 "proc-macro2 0.4.27 (registry+https://github.com/rust-lang/crates.io-index)",
@@ -453,7 +452,7 @@ dependencies = [
 
 [[package]]
 name = "redox_syscall"
-version = "0.1.54"
+version = "0.1.52"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 
 [[package]]
@@ -461,7 +460,7 @@ name = "redox_termios"
 version = "0.1.1"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 dependencies = [
-"redox_syscall 0.1.54 (registry+https://github.com/rust-lang/crates.io-index)",
+"redox_syscall 0.1.52 (registry+https://github.com/rust-lang/crates.io-index)",
 ]
 
 [[package]]
@@ -505,9 +504,9 @@ name = "ripgrep"
 version = "0.10.0"
 dependencies = [
 "bstr 0.1.2 (registry+https://github.com/rust-lang/crates.io-index)",
-"clap 2.33.0 (registry+https://github.com/rust-lang/crates.io-index)",
+"clap 2.32.0 (registry+https://github.com/rust-lang/crates.io-index)",
 "grep 0.2.3",
-"ignore 0.4.7",
+"ignore 0.4.6",
 "lazy_static 1.3.0 (registry+https://github.com/rust-lang/crates.io-index)",
 "log 0.4.6 (registry+https://github.com/rust-lang/crates.io-index)",
 "num_cpus 1.10.0 (registry+https://github.com/rust-lang/crates.io-index)",
@@ -542,8 +541,8 @@ version = "1.0.90"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 dependencies = [
 "proc-macro2 0.4.27 (registry+https://github.com/rust-lang/crates.io-index)",
-"quote 0.6.12 (registry+https://github.com/rust-lang/crates.io-index)",
-"syn 0.15.31 (registry+https://github.com/rust-lang/crates.io-index)",
+"quote 0.6.11 (registry+https://github.com/rust-lang/crates.io-index)",
+"syn 0.15.30 (registry+https://github.com/rust-lang/crates.io-index)",
 ]
 
 [[package]]
@@ -563,16 +562,16 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
 
 [[package]]
 name = "strsim"
-version = "0.8.0"
+version = "0.7.0"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 
 [[package]]
 name = "syn"
-version = "0.15.31"
+version = "0.15.30"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 dependencies = [
 "proc-macro2 0.4.27 (registry+https://github.com/rust-lang/crates.io-index)",
-"quote 0.6.12 (registry+https://github.com/rust-lang/crates.io-index)",
+"quote 0.6.11 (registry+https://github.com/rust-lang/crates.io-index)",
 "unicode-xid 0.1.0 (registry+https://github.com/rust-lang/crates.io-index)",
 ]
 
@@ -584,7 +583,7 @@ dependencies = [
 "cfg-if 0.1.7 (registry+https://github.com/rust-lang/crates.io-index)",
 "libc 0.2.51 (registry+https://github.com/rust-lang/crates.io-index)",
 "rand 0.6.5 (registry+https://github.com/rust-lang/crates.io-index)",
-"redox_syscall 0.1.54 (registry+https://github.com/rust-lang/crates.io-index)",
+"redox_syscall 0.1.52 (registry+https://github.com/rust-lang/crates.io-index)",
 "remove_dir_all 0.5.1 (registry+https://github.com/rust-lang/crates.io-index)",
 "winapi 0.3.7 (registry+https://github.com/rust-lang/crates.io-index)",
 ]
@@ -603,13 +602,13 @@ version = "1.5.1"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 dependencies = [
 "libc 0.2.51 (registry+https://github.com/rust-lang/crates.io-index)",
-"redox_syscall 0.1.54 (registry+https://github.com/rust-lang/crates.io-index)",
+"redox_syscall 0.1.52 (registry+https://github.com/rust-lang/crates.io-index)",
 "redox_termios 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)",
 ]
 
 [[package]]
 name = "textwrap"
-version = "0.11.0"
+version = "0.10.0"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 dependencies = [
 "unicode-width 0.1.5 (registry+https://github.com/rust-lang/crates.io-index)",
@@ -698,17 +697,17 @@ dependencies = [
 "checksum bstr 0.1.2 (registry+https://github.com/rust-lang/crates.io-index)" = "6c8203ca06c502958719dae5f653a79e0cc6ba808ed02beffbf27d09610f2143"
 "checksum bytecount 0.5.1 (registry+https://github.com/rust-lang/crates.io-index)" = "be0fdd54b507df8f22012890aadd099979befdba27713c767993f8380112ca7c"
 "checksum byteorder 1.3.1 (registry+https://github.com/rust-lang/crates.io-index)" = "a019b10a2a7cdeb292db131fc8113e57ea2a908f6e7894b0c3c671893b65dbeb"
-"checksum cc 1.0.35 (registry+https://github.com/rust-lang/crates.io-index)" = "5e5f3fee5eeb60324c2781f1e41286bdee933850fff9b3c672587fed5ec58c83"
+"checksum cc 1.0.34 (registry+https://github.com/rust-lang/crates.io-index)" = "30f813bf45048a18eda9190fd3c6b78644146056740c43172a5a3699118588fd"
 "checksum cfg-if 0.1.7 (registry+https://github.com/rust-lang/crates.io-index)" = "11d43355396e872eefb45ce6342e4374ed7bc2b3a502d1b28e36d6e23c05d1f4"
-"checksum clap 2.33.0 (registry+https://github.com/rust-lang/crates.io-index)" = "5067f5bb2d80ef5d68b4c87db81601f0b75bca627bc2ef76b141d7b846a3c6d9"
+"checksum clap 2.32.0 (registry+https://github.com/rust-lang/crates.io-index)" = "b957d88f4b6a63b9d70d5f454ac8011819c6efa7727858f458ab71c756ce2d3e"
 "checksum cloudabi 0.0.3 (registry+https://github.com/rust-lang/crates.io-index)" = "ddfc5b9aa5d4507acaf872de71051dfd0e309860e88966e1051e462a077aac4f"
 "checksum crossbeam-channel 0.3.8 (registry+https://github.com/rust-lang/crates.io-index)" = "0f0ed1a4de2235cabda8558ff5840bffb97fcb64c97827f354a451307df5f72b"
|
"checksum crossbeam-channel 0.3.8 (registry+https://github.com/rust-lang/crates.io-index)" = "0f0ed1a4de2235cabda8558ff5840bffb97fcb64c97827f354a451307df5f72b"
|
||||||
"checksum crossbeam-utils 0.6.5 (registry+https://github.com/rust-lang/crates.io-index)" = "f8306fcef4a7b563b76b7dd949ca48f52bc1141aa067d2ea09565f3e2652aa5c"
|
"checksum crossbeam-utils 0.6.5 (registry+https://github.com/rust-lang/crates.io-index)" = "f8306fcef4a7b563b76b7dd949ca48f52bc1141aa067d2ea09565f3e2652aa5c"
|
||||||
"checksum encoding_rs 0.8.17 (registry+https://github.com/rust-lang/crates.io-index)" = "4155785c79f2f6701f185eb2e6b4caf0555ec03477cb4c70db67b465311620ed"
|
"checksum encoding_rs 0.8.17 (registry+https://github.com/rust-lang/crates.io-index)" = "4155785c79f2f6701f185eb2e6b4caf0555ec03477cb4c70db67b465311620ed"
|
||||||
"checksum encoding_rs_io 0.1.6 (registry+https://github.com/rust-lang/crates.io-index)" = "9619ee7a2bf4e777e020b95c1439abaf008f8ea8041b78a0552c4f1bcf4df32c"
|
"checksum encoding_rs_io 0.1.5 (registry+https://github.com/rust-lang/crates.io-index)" = "f94ef2bcdb2f5d58e982ef565baa1ecfd04b7cb653d0bf1b49af1dd472faa8d8"
|
||||||
"checksum fnv 1.0.6 (registry+https://github.com/rust-lang/crates.io-index)" = "2fad85553e09a6f881f739c29f0b00b0f01357c743266d478b68951ce23285f3"
|
"checksum fnv 1.0.6 (registry+https://github.com/rust-lang/crates.io-index)" = "2fad85553e09a6f881f739c29f0b00b0f01357c743266d478b68951ce23285f3"
|
||||||
"checksum fuchsia-cprng 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)" = "a06f77d526c1a601b7c4cdd98f54b5eaabffc14d5f2f0296febdc7f357c6d3ba"
|
"checksum fuchsia-cprng 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)" = "a06f77d526c1a601b7c4cdd98f54b5eaabffc14d5f2f0296febdc7f357c6d3ba"
|
||||||
"checksum glob 0.3.0 (registry+https://github.com/rust-lang/crates.io-index)" = "9b919933a397b79c37e33b77bb2aa3dc8eb6e165ad809e58ff75bc7db2e34574"
|
"checksum glob 0.2.11 (registry+https://github.com/rust-lang/crates.io-index)" = "8be18de09a56b60ed0edf84bc9df007e30040691af7acd1c41874faac5895bfb"
|
||||||
"checksum itoa 0.4.3 (registry+https://github.com/rust-lang/crates.io-index)" = "1306f3464951f30e30d12373d31c79fbd52d236e5e896fd92f96ec7babbbe60b"
|
"checksum itoa 0.4.3 (registry+https://github.com/rust-lang/crates.io-index)" = "1306f3464951f30e30d12373d31c79fbd52d236e5e896fd92f96ec7babbbe60b"
|
||||||
"checksum lazy_static 1.3.0 (registry+https://github.com/rust-lang/crates.io-index)" = "bc5729f27f159ddd61f4df6228e827e86643d4d3e7c32183cb30a1c08f604a14"
|
"checksum lazy_static 1.3.0 (registry+https://github.com/rust-lang/crates.io-index)" = "bc5729f27f159ddd61f4df6228e827e86643d4d3e7c32183cb30a1c08f604a14"
|
||||||
"checksum libc 0.2.51 (registry+https://github.com/rust-lang/crates.io-index)" = "bedcc7a809076656486ffe045abeeac163da1b558e963a31e29fbfbeba916917"
|
"checksum libc 0.2.51 (registry+https://github.com/rust-lang/crates.io-index)" = "bedcc7a809076656486ffe045abeeac163da1b558e963a31e29fbfbeba916917"
|
||||||
@@ -717,11 +716,11 @@ dependencies = [
|
|||||||
"checksum memmap 0.7.0 (registry+https://github.com/rust-lang/crates.io-index)" = "6585fd95e7bb50d6cc31e20d4cf9afb4e2ba16c5846fc76793f11218da9c475b"
|
"checksum memmap 0.7.0 (registry+https://github.com/rust-lang/crates.io-index)" = "6585fd95e7bb50d6cc31e20d4cf9afb4e2ba16c5846fc76793f11218da9c475b"
|
||||||
"checksum num_cpus 1.10.0 (registry+https://github.com/rust-lang/crates.io-index)" = "1a23f0ed30a54abaa0c7e83b1d2d87ada7c3c23078d1d87815af3e3b6385fbba"
|
"checksum num_cpus 1.10.0 (registry+https://github.com/rust-lang/crates.io-index)" = "1a23f0ed30a54abaa0c7e83b1d2d87ada7c3c23078d1d87815af3e3b6385fbba"
|
||||||
"checksum packed_simd 0.3.3 (registry+https://github.com/rust-lang/crates.io-index)" = "a85ea9fc0d4ac0deb6fe7911d38786b32fc11119afd9e9d38b84ff691ce64220"
|
"checksum packed_simd 0.3.3 (registry+https://github.com/rust-lang/crates.io-index)" = "a85ea9fc0d4ac0deb6fe7911d38786b32fc11119afd9e9d38b84ff691ce64220"
|
||||||
"checksum pcre2 0.2.0 (registry+https://github.com/rust-lang/crates.io-index)" = "a08c8195dd1d8a2a1b5e2af94bf0c4c3c195c2359930442a016bf123196f7155"
|
"checksum pcre2 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)" = "3ae0a2682105ec5ca0ee5910bbc7e926386d348a05166348f74007942983c319"
|
||||||
"checksum pcre2-sys 0.2.0 (registry+https://github.com/rust-lang/crates.io-index)" = "1e0092a7eae1c569cf7dbec61eef956516df93eb4afda8f600ccb16980aca849"
|
"checksum pcre2-sys 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)" = "a9027f9474e4e13d3b965538aafcaebe48c803488ad76b3c97ef061a8324695f"
|
||||||
"checksum pkg-config 0.3.14 (registry+https://github.com/rust-lang/crates.io-index)" = "676e8eb2b1b4c9043511a9b7bea0915320d7e502b0a079fb03f9635a5252b18c"
|
"checksum pkg-config 0.3.14 (registry+https://github.com/rust-lang/crates.io-index)" = "676e8eb2b1b4c9043511a9b7bea0915320d7e502b0a079fb03f9635a5252b18c"
|
||||||
"checksum proc-macro2 0.4.27 (registry+https://github.com/rust-lang/crates.io-index)" = "4d317f9caece796be1980837fd5cb3dfec5613ebdb04ad0956deea83ce168915"
|
"checksum proc-macro2 0.4.27 (registry+https://github.com/rust-lang/crates.io-index)" = "4d317f9caece796be1980837fd5cb3dfec5613ebdb04ad0956deea83ce168915"
|
||||||
"checksum quote 0.6.12 (registry+https://github.com/rust-lang/crates.io-index)" = "faf4799c5d274f3868a4aae320a0a182cbd2baee377b378f080e16a23e9d80db"
|
"checksum quote 0.6.11 (registry+https://github.com/rust-lang/crates.io-index)" = "cdd8e04bd9c52e0342b406469d494fcb033be4bdbe5c606016defbb1681411e1"
|
||||||
"checksum rand 0.6.5 (registry+https://github.com/rust-lang/crates.io-index)" = "6d71dacdc3c88c1fde3885a3be3fbab9f35724e6ce99467f7d9c5026132184ca"
|
"checksum rand 0.6.5 (registry+https://github.com/rust-lang/crates.io-index)" = "6d71dacdc3c88c1fde3885a3be3fbab9f35724e6ce99467f7d9c5026132184ca"
|
||||||
"checksum rand_chacha 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)" = "556d3a1ca6600bfcbab7c7c91ccb085ac7fbbcd70e008a98742e7847f4f7bcef"
|
"checksum rand_chacha 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)" = "556d3a1ca6600bfcbab7c7c91ccb085ac7fbbcd70e008a98742e7847f4f7bcef"
|
||||||
"checksum rand_core 0.3.1 (registry+https://github.com/rust-lang/crates.io-index)" = "7a6fdeb83b075e8266dcc8762c22776f6877a63111121f5f8c7411e5be7eed4b"
|
"checksum rand_core 0.3.1 (registry+https://github.com/rust-lang/crates.io-index)" = "7a6fdeb83b075e8266dcc8762c22776f6877a63111121f5f8c7411e5be7eed4b"
|
||||||
@@ -733,7 +732,7 @@ dependencies = [
|
|||||||
"checksum rand_pcg 0.1.2 (registry+https://github.com/rust-lang/crates.io-index)" = "abf9b09b01790cfe0364f52bf32995ea3c39f4d2dd011eac241d2914146d0b44"
|
"checksum rand_pcg 0.1.2 (registry+https://github.com/rust-lang/crates.io-index)" = "abf9b09b01790cfe0364f52bf32995ea3c39f4d2dd011eac241d2914146d0b44"
|
||||||
"checksum rand_xorshift 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)" = "cbf7e9e623549b0e21f6e97cf8ecf247c1a8fd2e8a992ae265314300b2455d5c"
|
"checksum rand_xorshift 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)" = "cbf7e9e623549b0e21f6e97cf8ecf247c1a8fd2e8a992ae265314300b2455d5c"
|
||||||
"checksum rdrand 0.4.0 (registry+https://github.com/rust-lang/crates.io-index)" = "678054eb77286b51581ba43620cc911abf02758c91f93f479767aed0f90458b2"
|
"checksum rdrand 0.4.0 (registry+https://github.com/rust-lang/crates.io-index)" = "678054eb77286b51581ba43620cc911abf02758c91f93f479767aed0f90458b2"
|
||||||
"checksum redox_syscall 0.1.54 (registry+https://github.com/rust-lang/crates.io-index)" = "12229c14a0f65c4f1cb046a3b52047cdd9da1f4b30f8a39c5063c8bae515e252"
|
"checksum redox_syscall 0.1.52 (registry+https://github.com/rust-lang/crates.io-index)" = "d32b3053e5ced86e4bc0411fec997389532bf56b000e66cb4884eeeb41413d69"
|
||||||
"checksum redox_termios 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)" = "7e891cfe48e9100a70a3b6eb652fef28920c117d366339687bd5576160db0f76"
|
"checksum redox_termios 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)" = "7e891cfe48e9100a70a3b6eb652fef28920c117d366339687bd5576160db0f76"
|
||||||
"checksum regex 1.1.5 (registry+https://github.com/rust-lang/crates.io-index)" = "559008764a17de49a3146b234641644ed37d118d1ef641a0bb573d146edc6ce0"
|
"checksum regex 1.1.5 (registry+https://github.com/rust-lang/crates.io-index)" = "559008764a17de49a3146b234641644ed37d118d1ef641a0bb573d146edc6ce0"
|
||||||
"checksum regex-automata 0.1.6 (registry+https://github.com/rust-lang/crates.io-index)" = "a25a7daa2eea48550e9946133d6cc9621020d29cc7069089617234bf8b6a8693"
|
"checksum regex-automata 0.1.6 (registry+https://github.com/rust-lang/crates.io-index)" = "a25a7daa2eea48550e9946133d6cc9621020d29cc7069089617234bf8b6a8693"
|
||||||
@@ -745,12 +744,12 @@ dependencies = [
|
|||||||
"checksum serde_derive 1.0.90 (registry+https://github.com/rust-lang/crates.io-index)" = "58fc82bec244f168b23d1963b45c8bf5726e9a15a9d146a067f9081aeed2de79"
|
"checksum serde_derive 1.0.90 (registry+https://github.com/rust-lang/crates.io-index)" = "58fc82bec244f168b23d1963b45c8bf5726e9a15a9d146a067f9081aeed2de79"
|
||||||
"checksum serde_json 1.0.39 (registry+https://github.com/rust-lang/crates.io-index)" = "5a23aa71d4a4d43fdbfaac00eff68ba8a06a51759a89ac3304323e800c4dd40d"
|
"checksum serde_json 1.0.39 (registry+https://github.com/rust-lang/crates.io-index)" = "5a23aa71d4a4d43fdbfaac00eff68ba8a06a51759a89ac3304323e800c4dd40d"
|
||||||
"checksum smallvec 0.6.9 (registry+https://github.com/rust-lang/crates.io-index)" = "c4488ae950c49d403731982257768f48fada354a5203fe81f9bb6f43ca9002be"
|
"checksum smallvec 0.6.9 (registry+https://github.com/rust-lang/crates.io-index)" = "c4488ae950c49d403731982257768f48fada354a5203fe81f9bb6f43ca9002be"
|
||||||
"checksum strsim 0.8.0 (registry+https://github.com/rust-lang/crates.io-index)" = "8ea5119cdb4c55b55d432abb513a0429384878c15dde60cc77b1c99de1a95a6a"
|
"checksum strsim 0.7.0 (registry+https://github.com/rust-lang/crates.io-index)" = "bb4f380125926a99e52bc279241539c018323fab05ad6368b56f93d9369ff550"
|
||||||
"checksum syn 0.15.31 (registry+https://github.com/rust-lang/crates.io-index)" = "d2b4cfac95805274c6afdb12d8f770fa2d27c045953e7b630a81801953699a9a"
|
"checksum syn 0.15.30 (registry+https://github.com/rust-lang/crates.io-index)" = "66c8865bf5a7cbb662d8b011950060b3c8743dca141b054bf7195b20d314d8e2"
|
||||||
"checksum tempfile 3.0.7 (registry+https://github.com/rust-lang/crates.io-index)" = "b86c784c88d98c801132806dadd3819ed29d8600836c4088e855cdf3e178ed8a"
|
"checksum tempfile 3.0.7 (registry+https://github.com/rust-lang/crates.io-index)" = "b86c784c88d98c801132806dadd3819ed29d8600836c4088e855cdf3e178ed8a"
|
||||||
"checksum termcolor 1.0.4 (registry+https://github.com/rust-lang/crates.io-index)" = "4096add70612622289f2fdcdbd5086dc81c1e2675e6ae58d6c4f62a16c6d7f2f"
|
"checksum termcolor 1.0.4 (registry+https://github.com/rust-lang/crates.io-index)" = "4096add70612622289f2fdcdbd5086dc81c1e2675e6ae58d6c4f62a16c6d7f2f"
|
||||||
"checksum termion 1.5.1 (registry+https://github.com/rust-lang/crates.io-index)" = "689a3bdfaab439fd92bc87df5c4c78417d3cbe537487274e9b0b2dce76e92096"
|
"checksum termion 1.5.1 (registry+https://github.com/rust-lang/crates.io-index)" = "689a3bdfaab439fd92bc87df5c4c78417d3cbe537487274e9b0b2dce76e92096"
|
||||||
"checksum textwrap 0.11.0 (registry+https://github.com/rust-lang/crates.io-index)" = "d326610f408c7a4eb6f51c37c330e496b08506c9457c9d34287ecc38809fb060"
|
"checksum textwrap 0.10.0 (registry+https://github.com/rust-lang/crates.io-index)" = "307686869c93e71f94da64286f9a9524c0f308a9e1c87a583de8e9c9039ad3f6"
|
||||||
"checksum thread_local 0.3.6 (registry+https://github.com/rust-lang/crates.io-index)" = "c6b53e329000edc2b34dbe8545fd20e55a333362d0a321909685a19bd28c3f1b"
|
"checksum thread_local 0.3.6 (registry+https://github.com/rust-lang/crates.io-index)" = "c6b53e329000edc2b34dbe8545fd20e55a333362d0a321909685a19bd28c3f1b"
|
||||||
"checksum ucd-util 0.1.3 (registry+https://github.com/rust-lang/crates.io-index)" = "535c204ee4d8434478593480b8f86ab45ec9aae0e83c568ca81abf0fd0e88f86"
|
"checksum ucd-util 0.1.3 (registry+https://github.com/rust-lang/crates.io-index)" = "535c204ee4d8434478593480b8f86ab45ec9aae0e83c568ca81abf0fd0e88f86"
|
||||||
"checksum unicode-width 0.1.5 (registry+https://github.com/rust-lang/crates.io-index)" = "882386231c45df4700b275c7ff55b6f3698780a650026380e72dabe76fa46526"
|
"checksum unicode-width 0.1.5 (registry+https://github.com/rust-lang/crates.io-index)" = "882386231c45df4700b275c7ff55b6f3698780a650026380e72dabe76fa46526"
|
||||||
|
@@ -48,7 +48,7 @@ members = [
|
|||||||
[dependencies]
|
[dependencies]
|
||||||
bstr = "0.1.2"
|
bstr = "0.1.2"
|
||||||
grep = { version = "0.2.3", path = "grep" }
|
grep = { version = "0.2.3", path = "grep" }
|
||||||
ignore = { version = "0.4.7", path = "ignore" }
|
ignore = { version = "0.4.4", path = "ignore" }
|
||||||
lazy_static = "1.1.0"
|
lazy_static = "1.1.0"
|
||||||
log = "0.4.5"
|
log = "0.4.5"
|
||||||
num_cpus = "1.8.0"
|
num_cpus = "1.8.0"
|
||||||
|
106
GUIDE.md
@@ -18,7 +18,6 @@ translatable to any command line shell environment.
|
|||||||
* [Replacements](#replacements)
|
* [Replacements](#replacements)
|
||||||
* [Configuration file](#configuration-file)
|
* [Configuration file](#configuration-file)
|
||||||
* [File encoding](#file-encoding)
|
* [File encoding](#file-encoding)
|
||||||
* [Binary data](#binary-data)
|
|
||||||
* [Common options](#common-options)
|
* [Common options](#common-options)
|
||||||
|
|
||||||
|
|
||||||
@@ -538,9 +537,8 @@ formatting peculiarities:
|
|||||||
|
|
||||||
```
|
```
|
||||||
$ cat $HOME/.ripgreprc
|
$ cat $HOME/.ripgreprc
|
||||||
# Don't let ripgrep vomit really long lines to my terminal, and show a preview.
|
# Don't let ripgrep vomit really long lines to my terminal.
|
||||||
--max-columns=150
|
--max-columns=150
|
||||||
--max-columns-preview
|
|
||||||
|
|
||||||
# Add my 'web' type.
|
# Add my 'web' type.
|
||||||
--type-add
|
--type-add
|
||||||
@@ -605,7 +603,7 @@ topic, but we can try to summarize its relevancy to ripgrep:
|
|||||||
* Files are generally just a bundle of bytes. There is no reliable way to know
|
* Files are generally just a bundle of bytes. There is no reliable way to know
|
||||||
their encoding.
|
their encoding.
|
||||||
* Either the encoding of the pattern must match the encoding of the files being
|
* Either the encoding of the pattern must match the encoding of the files being
|
||||||
searched, or a form of transcoding must be performed that converts either the
|
searched, or a form of transcoding must be performed converts either the
|
||||||
pattern or the file to the same encoding as the other.
|
pattern or the file to the same encoding as the other.
|
||||||
* ripgrep tends to work best on plain text files, and among plain text files,
|
* ripgrep tends to work best on plain text files, and among plain text files,
|
||||||
the most popular encodings likely consist of ASCII, latin1 or UTF-8. As
|
the most popular encodings likely consist of ASCII, latin1 or UTF-8. As
|
||||||
@@ -628,15 +626,12 @@ given, which is the default:
|
|||||||
they correspond to a UTF-16 BOM, then ripgrep will transcode the contents of
|
they correspond to a UTF-16 BOM, then ripgrep will transcode the contents of
|
||||||
the file from UTF-16 to UTF-8, and then execute the search on the transcoded
|
the file from UTF-16 to UTF-8, and then execute the search on the transcoded
|
||||||
version of the file. (This incurs a performance penalty since transcoding
|
version of the file. (This incurs a performance penalty since transcoding
|
||||||
is slower than regex searching.) If the file contains invalid UTF-16, then
|
is slower than regex searching.)
|
||||||
the Unicode replacement codepoint is substituted in place of invalid code
|
|
||||||
units.
|
|
||||||
* To handle other cases, ripgrep provides a `-E/--encoding` flag, which permits
|
* To handle other cases, ripgrep provides a `-E/--encoding` flag, which permits
|
||||||
you to specify an encoding from the
|
you to specify an encoding from the
|
||||||
[Encoding Standard](https://encoding.spec.whatwg.org/#concept-encoding-get).
|
[Encoding Standard](https://encoding.spec.whatwg.org/#concept-encoding-get).
|
||||||
ripgrep will assume *all* files searched are the encoding specified (unless
|
ripgrep will assume *all* files searched are the encoding specified and
|
||||||
the file has a BOM) and will perform a transcoding step just like in the
|
will perform a transcoding step just like in the UTF-16 case described above.
|
||||||
UTF-16 case described above.
|
|
||||||
|
|
||||||
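The BOM sniffing and transcoding step described in the list above can be sketched roughly as follows. This is a minimal illustration using only the Rust standard library, not ripgrep's actual implementation: the names `Utf16Bom`, `sniff_utf16_bom` and `transcode_if_utf16` are hypothetical, and the sketch decodes the whole buffer at once rather than streaming.

```rust
/// Byte order indicated by a UTF-16 byte order mark (hypothetical helper type).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Utf16Bom {
    Le,
    Be,
}

/// Look at the first two bytes and report a UTF-16 BOM, if there is one.
fn sniff_utf16_bom(bytes: &[u8]) -> Option<Utf16Bom> {
    match bytes {
        [0xFF, 0xFE, ..] => Some(Utf16Bom::Le),
        [0xFE, 0xFF, ..] => Some(Utf16Bom::Be),
        _ => None,
    }
}

/// Transcode UTF-16 (with BOM) to UTF-8. Returns None when no BOM is present,
/// in which case the caller would search the bytes as they are.
fn transcode_if_utf16(bytes: &[u8]) -> Option<String> {
    let bom = sniff_utf16_bom(bytes)?;
    // Skip the 2-byte BOM; a trailing odd byte is dropped by chunks_exact.
    let units: Vec<u16> = bytes[2..]
        .chunks_exact(2)
        .map(|pair| match bom {
            Utf16Bom::Le => u16::from_le_bytes([pair[0], pair[1]]),
            Utf16Bom::Be => u16::from_be_bytes([pair[0], pair[1]]),
        })
        .collect();
    // Invalid code units (e.g. lone surrogates) become U+FFFD.
    Some(String::from_utf16_lossy(&units))
}
```

Calling `transcode_if_utf16` on a buffer without a BOM returns `None`, which mirrors the case where no transcoding happens and the raw bytes are searched directly.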
By default, ripgrep will not require its input be valid UTF-8. That is, ripgrep
|
By default, ripgrep will not require its input be valid UTF-8. That is, ripgrep
|
||||||
can and will search arbitrary bytes. The key here is that if you're searching
|
can and will search arbitrary bytes. The key here is that if you're searching
|
||||||
@@ -646,26 +641,9 @@ pattern won't find anything. With all that said, this mode of operation is
|
|||||||
important, because it lets you find ASCII or UTF-8 *within* files that are
|
important, because it lets you find ASCII or UTF-8 *within* files that are
|
||||||
otherwise arbitrary bytes.
|
otherwise arbitrary bytes.
|
||||||
|
|
||||||
As a special case, the `-E/--encoding` flag supports the value `none`, which
|
|
||||||
will completely disable all encoding related logic, including BOM sniffing.
|
|
||||||
When `-E/--encoding` is set to `none`, ripgrep will search the raw bytes of
|
|
||||||
the underlying file with no transcoding step. For example, here's how you might
|
|
||||||
search the raw UTF-16 encoding of the string `Шерлок`:
|
|
||||||
|
|
||||||
```
|
|
||||||
$ rg '(?-u)\(\x045\x04@\x04;\x04>\x04:\x04' -E none -a some-utf16-file
|
|
||||||
```
|
|
||||||
|
|
||||||
Of course, that's just an example meant to show how one can drop down into
|
|
||||||
raw bytes. Namely, the simpler command works as you might expect automatically:
|
|
||||||
|
|
||||||
```
|
|
||||||
$ rg 'Шерлок' some-utf16-file
|
|
||||||
```
|
|
||||||
|
|
||||||
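The raw-byte pattern in the `-E none` example above is the UTF-16LE encoding of `Шерлок`, written with a mix of printable ASCII and `\x04` escapes. A small sketch of how such a pattern could be derived is shown below; `utf16le_bytes` and `raw_byte_pattern` are hypothetical helper names, and the generated pattern hex-escapes every byte instead of leaving printable ones as-is.

```rust
/// Encode a string as UTF-16LE bytes (no BOM).
fn utf16le_bytes(s: &str) -> Vec<u8> {
    let mut out = Vec::new();
    for unit in s.encode_utf16() {
        out.extend_from_slice(&unit.to_le_bytes());
    }
    out
}

/// Build a regex that matches the given bytes literally, with Unicode
/// mode disabled so that `\xNN` means a single raw byte.
fn raw_byte_pattern(bytes: &[u8]) -> String {
    let mut pattern = String::from("(?-u)");
    for b in bytes {
        pattern.push_str(&format!("\\x{:02x}", b));
    }
    pattern
}
```

For `Шерлок`, the encoded bytes are `28 04 35 04 40 04 3b 04 3e 04 3a 04`, which is why the hand-written pattern reads `\(\x045\x04@\x04;\x04>\x04:\x04`: `0x28` is `(`, `0x35` is `5`, `0x40` is `@`, and so on.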
Finally, it is possible to disable ripgrep's Unicode support from within the
|
Finally, it is possible to disable ripgrep's Unicode support from within the
|
||||||
regular expression. For example, let's say you wanted `.` to match any byte
|
pattern regular expression. For example, let's say you wanted `.` to match any
|
||||||
rather than any Unicode codepoint. (You might want this while searching a
|
byte rather than any Unicode codepoint. (You might want this while searching a
|
||||||
binary file, since `.` by default will not match invalid UTF-8.) You could do
|
binary file, since `.` by default will not match invalid UTF-8.) You could do
|
||||||
this by disabling Unicode via a regular expression flag:
|
this by disabling Unicode via a regular expression flag:
|
||||||
|
|
||||||
@@ -682,76 +660,6 @@ $ rg '\w(?-u:\w)\w'
|
|||||||
```
|
```
|
||||||
|
|
||||||
|
|
||||||
### Binary data
|
|
||||||
|
|
||||||
In addition to skipping hidden files and files in your `.gitignore` by default,
|
|
||||||
ripgrep also attempts to skip binary files. ripgrep does this by default
|
|
||||||
because binary files (like PDFs or images) are typically not things you want to
|
|
||||||
search when searching for regex matches. Moreover, if content in a binary file
|
|
||||||
did match, then it's possible for undesirable binary data to be printed to your
|
|
||||||
terminal and wreak havoc.
|
|
||||||
|
|
||||||
Unfortunately, unlike skipping hidden files and respecting your `.gitignore`
|
|
||||||
rules, a file cannot as easily be classified as binary. In order to figure out
|
|
||||||
whether a file is binary, the most effective heuristic that balances
|
|
||||||
correctness with performance is to simply look for `NUL` bytes. At that point,
|
|
||||||
the determination is simple: a file is considered "binary" if and only if it
|
|
||||||
contains a `NUL` byte somewhere in its contents.
|
|
||||||
|
|
||||||
The issue is that while most binary files will have a `NUL` byte toward the
|
|
||||||
beginning of its contents, this is not necessarily true. The `NUL` byte might
|
|
||||||
be the very last byte in a large file, but that file is still considered
|
|
||||||
binary. While this leads to a fair amount of complexity inside ripgrep's
|
|
||||||
implementation, it also results in some unintuitive user experiences.
|
|
||||||
|
|
||||||
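The `NUL` byte heuristic described above is simple enough to sketch in a few lines. The following is an illustration under stated assumptions, not ripgrep's code: `looks_binary` and `looks_binary_prefix` are hypothetical names, and the `limit` parameter stands in for whatever prefix size a real implementation would inspect.

```rust
/// A buffer is treated as binary if and only if it contains a NUL byte.
fn looks_binary(bytes: &[u8]) -> bool {
    bytes.contains(&0)
}

/// Same heuristic, but only the first `limit` bytes are inspected.
/// A NUL byte that appears after the prefix goes unnoticed.
fn looks_binary_prefix(bytes: &[u8], limit: usize) -> bool {
    let end = bytes.len().min(limit);
    bytes[..end].contains(&0)
}
```

The second function shows why the position of the `NUL` byte matters: a file whose only `NUL` byte is near the end is binary by the full-scan definition, yet a check limited to a prefix would classify it as text.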
At a high level, ripgrep operates in three different modes with respect to
|
|
||||||
binary files:
|
|
||||||
|
|
||||||
1. The default mode is to attempt to remove binary files from a search
|
|
||||||
completely. This is meant to mirror how ripgrep removes hidden files and
|
|
||||||
files in your `.gitignore` automatically. That is, as soon as a file is
|
|
||||||
detected as binary, searching stops. If a match was already printed (because
|
|
||||||
it was detected long before a `NUL` byte), then ripgrep will print a warning
|
|
||||||
message indicating that the search stopped prematurely. This default mode
|
|
||||||
**only applies to files searched by ripgrep as a result of recursive
|
|
||||||
directory traversal**, which is consistent with ripgrep's other automatic
|
|
||||||
filtering. For example, `rg foo .file` will search `.file` even though it
|
|
||||||
is hidden. Similarly, `rg foo binary-file` searches `binary-file` in "binary"
|
|
||||||
mode automatically.
|
|
||||||
2. Binary mode is similar to the default mode, except it will not always
|
|
||||||
stop searching after it sees a `NUL` byte. Namely, in this mode, ripgrep
|
|
||||||
will continue searching a file that is known to be binary until the first
|
|
||||||
of two conditions is met: 1) the end of the file has been reached or 2) a
|
|
||||||
match is or has been seen. This means that in binary mode, if ripgrep
|
|
||||||
reports no matches, then there are no matches in the file. When a match does
|
|
||||||
occur, ripgrep prints a message similar to one it prints when in its default
|
|
||||||
mode indicating that the search has stopped prematurely. This mode can be
|
|
||||||
forcefully enabled for all files with the `--binary` flag. The purpose of
|
|
||||||
binary mode is to provide a way to discover matches in all files, but to
|
|
||||||
avoid having binary data dumped into your terminal.
|
|
||||||
3. Text mode completely disables all binary detection and searches all files
|
|
||||||
as if they were text. This is useful when searching a file that is
|
|
||||||
predominantly text but contains a `NUL` byte, or if you are specifically
|
|
||||||
trying to search binary data. This mode can be enabled with the `-a/--text`
|
|
||||||
flag. Note that when using this mode on very large binary files, it is
|
|
||||||
possible for ripgrep to use a lot of memory.
|
|
||||||
|
|
||||||
Unfortunately, there is one additional complexity in ripgrep that can make it
|
|
||||||
difficult to reason about binary files. That is, the way binary detection works
|
|
||||||
depends on the way that ripgrep searches your files. Specifically:
|
|
||||||
|
|
||||||
* When ripgrep uses memory maps, then binary detection is only performed on the
|
|
||||||
first few kilobytes of the file in addition to every matching line.
|
|
||||||
* When ripgrep doesn't use memory maps, then binary detection is performed on
|
|
||||||
all bytes searched.
|
|
||||||
|
|
||||||
This means that whether a file is detected as binary or not can change based
|
|
||||||
on the internal search strategy used by ripgrep. If you prefer to keep
|
|
||||||
ripgrep's binary file detection consistent, then you can disable memory maps
|
|
||||||
via the `--no-mmap` flag. (The cost will be a small performance regression when
|
|
||||||
searching very large files on some platforms.)
|
|
||||||
|
|
||||||
|
|
||||||
### Common options
|
### Common options
|
||||||
|
|
||||||
ripgrep has a lot of flags. Too many to keep in your head at once. This section
|
ripgrep has a lot of flags. Too many to keep in your head at once. This section
|
||||||
|
@@ -11,7 +11,6 @@ and grep.
|
|||||||
[](https://travis-ci.org/BurntSushi/ripgrep)
|
[](https://travis-ci.org/BurntSushi/ripgrep)
|
||||||
[](https://ci.appveyor.com/project/BurntSushi/ripgrep)
|
[](https://ci.appveyor.com/project/BurntSushi/ripgrep)
|
||||||
[](https://crates.io/crates/ripgrep)
|
[](https://crates.io/crates/ripgrep)
|
||||||
[](https://repology.org/project/ripgrep/badges)
|
|
||||||
|
|
||||||
Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org).
|
Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org).
|
||||||
|
|
||||||
@@ -340,7 +339,7 @@ If you're a **NetBSD** user, then you can install ripgrep from
|
|||||||
|
|
||||||
If you're a **Rust programmer**, ripgrep can be installed with `cargo`.
|
If you're a **Rust programmer**, ripgrep can be installed with `cargo`.
|
||||||
|
|
||||||
* Note that the minimum supported version of Rust for ripgrep is **1.34.0**,
|
* Note that the minimum supported version of Rust for ripgrep is **1.32.0**,
|
||||||
although ripgrep may work with older versions.
|
although ripgrep may work with older versions.
|
||||||
* Note that the binary may be bigger than expected because it contains debug
|
* Note that the binary may be bigger than expected because it contains debug
|
||||||
symbols. This is intentional. To remove debug symbols and therefore reduce
|
symbols. This is intentional. To remove debug symbols and therefore reduce
|
||||||
@@ -350,6 +349,9 @@ If you're a **Rust programmer**, ripgrep can be installed with `cargo`.
|
|||||||
$ cargo install ripgrep
|
$ cargo install ripgrep
|
||||||
```
|
```
|
||||||
|
|
||||||
|
When compiling with Rust 1.27 or newer, this will automatically enable SIMD
|
||||||
|
optimizations for search.
|
||||||
|
|
||||||
ripgrep isn't currently in any other package repositories.
|
ripgrep isn't currently in any other package repositories.
|
||||||
[I'd like to change that](https://github.com/BurntSushi/ripgrep/issues/10).
|
[I'd like to change that](https://github.com/BurntSushi/ripgrep/issues/10).
|
||||||
|
|
||||||
@@ -358,7 +360,7 @@ ripgrep isn't currently in any other package repositories.
|
|||||||
|
|
||||||
ripgrep is written in Rust, so you'll need to grab a
|
ripgrep is written in Rust, so you'll need to grab a
|
||||||
[Rust installation](https://www.rust-lang.org/) in order to compile it.
|
[Rust installation](https://www.rust-lang.org/) in order to compile it.
|
||||||
ripgrep compiles with Rust 1.34.0 (stable) or newer. In general, ripgrep tracks
|
ripgrep compiles with Rust 1.32.0 (stable) or newer. In general, ripgrep tracks
|
||||||
the latest stable release of the Rust compiler.
|
the latest stable release of the Rust compiler.
|
||||||
|
|
||||||
To build ripgrep:
|
To build ripgrep:
|
||||||
|
15
complete/_rg
@@ -43,7 +43,6 @@ _rg() {
|
|||||||
+ '(exclusive)' # Misc. fully exclusive options
|
+ '(exclusive)' # Misc. fully exclusive options
|
||||||
'(: * -)'{-h,--help}'[display help information]'
|
'(: * -)'{-h,--help}'[display help information]'
|
||||||
'(: * -)'{-V,--version}'[display version information]'
|
'(: * -)'{-V,--version}'[display version information]'
|
||||||
'(: * -)'--pcre2-version'[print the version of PCRE2 used by ripgrep, if available]'
|
|
||||||
|
|
||||||
+ '(buffered)' # buffering options
|
+ '(buffered)' # buffering options
|
||||||
'--line-buffered[force line buffering]'
|
'--line-buffered[force line buffering]'
|
||||||
@@ -86,7 +85,7 @@ _rg() {
|
|||||||
|
|
||||||
+ '(file-name)' # File-name options
|
+ '(file-name)' # File-name options
|
||||||
{-H,--with-filename}'[show file name for matches]'
|
{-H,--with-filename}'[show file name for matches]'
|
||||||
{-I,--no-filename}"[don't show file name for matches]"
|
"--no-filename[don't show file name for matches]"
|
||||||
|
|
||||||
+ '(file-system)' # File system options
|
+ '(file-system)' # File system options
|
||||||
"--one-file-system[don't descend into directories on other file systems]"
|
"--one-file-system[don't descend into directories on other file systems]"
|
||||||
@@ -112,10 +111,6 @@ _rg() {
|
|||||||
'--hidden[search hidden files and directories]'
|
'--hidden[search hidden files and directories]'
|
||||||
$no"--no-hidden[don't search hidden files and directories]"
|
$no"--no-hidden[don't search hidden files and directories]"
|
||||||
|
|
||||||
+ '(hybrid)' # hybrid regex options
|
|
||||||
'--auto-hybrid-regex[dynamically use PCRE2 if necessary]'
|
|
||||||
$no"--no-auto-hybrid-regex[don't dynamically use PCRE2 if necessary]"
|
|
||||||
|
|
||||||
+ '(ignore)' # Ignore-file options
|
+ '(ignore)' # Ignore-file options
|
||||||
"(--no-ignore-global --no-ignore-parent --no-ignore-vcs --no-ignore-dot)--no-ignore[don't respect ignore files]"
|
"(--no-ignore-global --no-ignore-parent --no-ignore-vcs --no-ignore-dot)--no-ignore[don't respect ignore files]"
|
||||||
$no'(--ignore-global --ignore-parent --ignore-vcs --ignore-dot)--ignore[respect ignore files]'
|
$no'(--ignore-global --ignore-parent --ignore-vcs --ignore-dot)--ignore[respect ignore files]'
|
||||||
@@ -153,10 +148,6 @@ _rg() {
|
|||||||
$no"--no-crlf[don't use CRLF as line terminator]"
|
$no"--no-crlf[don't use CRLF as line terminator]"
|
||||||
'(text)--null-data[use NUL as line terminator]'
|
'(text)--null-data[use NUL as line terminator]'
|
||||||
|
|
||||||
+ '(max-columns-preview)' # max column preview options
|
|
||||||
'--max-columns-preview[show preview for long lines (with -M)]'
|
|
||||||
$no"--no-max-columns-preview[don't show preview for long lines (with -M)]"
|
|
||||||
|
|
||||||
+ '(max-depth)' # Directory-depth options
|
+ '(max-depth)' # Directory-depth options
|
||||||
'--max-depth=[specify max number of directories to descend]:number of directories'
|
'--max-depth=[specify max number of directories to descend]:number of directories'
|
||||||
'!--maxdepth=:number of directories'
|
'!--maxdepth=:number of directories'
|
||||||
@@ -236,8 +227,6 @@ _rg() {
|
|||||||
|
|
||||||
+ '(text)' # Binary-search options
|
+ '(text)' # Binary-search options
|
||||||
{-a,--text}'[search binary files as if they were text]'
|
{-a,--text}'[search binary files as if they were text]'
|
||||||
"--binary[search binary files, don't print binary data]"
|
|
||||||
$no"--no-binary[don't search binary files]"
|
|
||||||
$no"(--null-data)--no-text[don't search binary files as if they were text]"
|
$no"(--null-data)--no-text[don't search binary files as if they were text]"
|
||||||
|
|
||||||
+ '(threads)' # Thread-count options
|
+ '(threads)' # Thread-count options
|
||||||
@@ -389,7 +378,7 @@ _rg_encodings() {
|
|||||||
shift{-,_}jis csshiftjis {,x-}sjis ms_kanji ms932
|
shift{-,_}jis csshiftjis {,x-}sjis ms_kanji ms932
|
||||||
utf{,-}8 utf-16{,be,le} unicode-1-1-utf-8
|
utf{,-}8 utf-16{,be,le} unicode-1-1-utf-8
|
||||||
windows-{31j,874,949,125{0..8}} dos-874 tis-620 ansi_x3.4-1968
|
windows-{31j,874,949,125{0..8}} dos-874 tis-620 ansi_x3.4-1968
|
||||||
x-user-defined auto none
|
x-user-defined auto
|
||||||
)
|
)
|
||||||
|
|
||||||
_wanted encodings expl encoding compadd -a "$@" - _encodings
|
_wanted encodings expl encoding compadd -a "$@" - _encodings
|
||||||
|
@@ -41,9 +41,6 @@ configuration file. The file can specify one shell argument per line. Lines
|
|||||||
starting with *#* are ignored. For more details, see the man page or the
|
starting with *#* are ignored. For more details, see the man page or the
|
||||||
*README*.
|
*README*.
|
||||||
|
|
||||||
Tip: to disable all smart filtering and make ripgrep behave a bit more like
|
|
||||||
classical grep, use *rg -uuu*.
|
|
||||||
|
|
||||||
|
|
||||||
REGEX SYNTAX
|
REGEX SYNTAX
|
||||||
------------
|
------------
|
||||||
@@ -192,21 +189,6 @@ file that is simultaneously truncated. This behavior can be avoided by passing
|
|||||||
the *--no-mmap* flag which will forcefully disable the use of memory maps in
|
the *--no-mmap* flag which will forcefully disable the use of memory maps in
|
||||||
all cases.
|
all cases.
|
||||||
|
|
||||||
ripgrep may use a large amount of memory depending on a few factors. Firstly,
|
|
||||||
if ripgrep uses parallelism for search (the default), then the entire output
|
|
||||||
for each individual file is buffered into memory in order to prevent
|
|
||||||
interleaving matches in the output. To avoid this, you can disable parallelism
|
|
||||||
with the *-j1* flag. Secondly, ripgrep always needs to have at least a single
|
|
||||||
line in memory in order to execute a search. A file with a very long line can
|
|
||||||
thus cause ripgrep to use a lot of memory. Generally, this only occurs when
|
|
||||||
searching binary data with the *-a* flag enabled. (When the *-a* flag isn't
|
|
||||||
enabled, ripgrep will replace all NUL bytes with line terminators, which
|
|
||||||
typically prevents exorbitant memory usage.) Thirdly, when ripgrep searches
|
|
||||||
a large file using a memory map, the process will report its resident memory
|
|
||||||
usage as the size of the file. However, this does not mean ripgrep actually
|
|
||||||
needed to use that much memory; the operating system will generally handle this
|
|
||||||
for you.
|
|
||||||
|
|
||||||
|
|
||||||
VERSION
|
VERSION
|
||||||
-------
|
-------
|
||||||
|
@@ -1,6 +1,6 @@
|
|||||||
[package]
|
[package]
|
||||||
name = "globset"
|
name = "globset"
|
||||||
version = "0.4.3" #:version
|
version = "0.4.2" #:version
|
||||||
authors = ["Andrew Gallant <jamslam@gmail.com>"]
|
authors = ["Andrew Gallant <jamslam@gmail.com>"]
|
||||||
description = """
|
description = """
|
||||||
Cross platform single glob and glob set matching. Glob set matching is the
|
Cross platform single glob and glob set matching. Glob set matching is the
|
||||||
@@ -26,7 +26,7 @@ log = "0.4.5"
|
|||||||
regex = "1.1.5"
|
regex = "1.1.5"
|
||||||
|
|
||||||
[dev-dependencies]
|
[dev-dependencies]
|
||||||
glob = "0.3.0"
|
glob = "0.2.11"
|
||||||
|
|
||||||
[features]
|
[features]
|
||||||
simd-accel = []
|
simd-accel = []
|
||||||
|
@@ -15,7 +15,7 @@ license = "Unlicense/MIT"
|
|||||||
[dependencies]
|
[dependencies]
|
||||||
atty = "0.2.11"
|
atty = "0.2.11"
|
||||||
bstr = "0.1.2"
|
bstr = "0.1.2"
|
||||||
globset = { version = "0.4.3", path = "../globset" }
|
globset = { version = "0.4.2", path = "../globset" }
|
||||||
lazy_static = "1.1.0"
|
lazy_static = "1.1.0"
|
||||||
log = "0.4.5"
|
log = "0.4.5"
|
||||||
regex = "1.1"
|
regex = "1.1"
|
||||||
|
@@ -1,6 +1,6 @@
|
|||||||
[package]
|
[package]
|
||||||
name = "grep-matcher"
|
name = "grep-matcher"
|
||||||
version = "0.1.2" #:version
|
version = "0.1.1" #:version
|
||||||
authors = ["Andrew Gallant <jamslam@gmail.com>"]
|
authors = ["Andrew Gallant <jamslam@gmail.com>"]
|
||||||
description = """
|
description = """
|
||||||
A trait for regular expressions, with a focus on line oriented search.
|
A trait for regular expressions, with a focus on line oriented search.
|
||||||
|
@@ -1,6 +1,6 @@
|
|||||||
[package]
|
[package]
|
||||||
name = "grep-pcre2"
|
name = "grep-pcre2"
|
||||||
version = "0.1.3" #:version
|
version = "0.1.2" #:version
|
||||||
authors = ["Andrew Gallant <jamslam@gmail.com>"]
|
authors = ["Andrew Gallant <jamslam@gmail.com>"]
|
||||||
description = """
|
description = """
|
||||||
Use PCRE2 with the 'grep' crate.
|
Use PCRE2 with the 'grep' crate.
|
||||||
@@ -13,5 +13,5 @@ keywords = ["regex", "grep", "pcre", "backreference", "look"]
|
|||||||
license = "Unlicense/MIT"
|
license = "Unlicense/MIT"
|
||||||
|
|
||||||
[dependencies]
|
[dependencies]
|
||||||
grep-matcher = { version = "0.1.2", path = "../grep-matcher" }
|
grep-matcher = { version = "0.1.1", path = "../grep-matcher" }
|
||||||
pcre2 = "0.2.0"
|
pcre2 = "0.1.1"
|
||||||
|
@@ -10,7 +10,6 @@ extern crate pcre2;
|
|||||||
|
|
||||||
pub use error::{Error, ErrorKind};
|
pub use error::{Error, ErrorKind};
|
||||||
pub use matcher::{RegexCaptures, RegexMatcher, RegexMatcherBuilder};
|
pub use matcher::{RegexCaptures, RegexMatcher, RegexMatcherBuilder};
|
||||||
pub use pcre2::{is_jit_available, version};
|
|
||||||
|
|
||||||
mod error;
|
mod error;
|
||||||
mod matcher;
|
mod matcher;
|
||||||
|
@@ -227,27 +227,6 @@ impl RegexMatcherBuilder {
|
|||||||
self.builder.jit_if_available(yes);
|
self.builder.jit_if_available(yes);
|
||||||
self
|
self
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Set the maximum size of PCRE2's JIT stack, in bytes. If the JIT is
|
|
||||||
/// not enabled, then this has no effect.
|
|
||||||
///
|
|
||||||
/// When `None` is given, no custom JIT stack will be created, and instead,
|
|
||||||
/// the default JIT stack is used. When the default is used, its maximum
|
|
||||||
/// size is 32 KB.
|
|
||||||
///
|
|
||||||
/// When this is set, then a new JIT stack will be created with the given
|
|
||||||
/// maximum size as its limit.
|
|
||||||
///
|
|
||||||
/// Increasing the stack size can be useful for larger regular expressions.
|
|
||||||
///
|
|
||||||
/// By default, this is set to `None`.
|
|
||||||
pub fn max_jit_stack_size(
|
|
||||||
&mut self,
|
|
||||||
bytes: Option<usize>,
|
|
||||||
) -> &mut RegexMatcherBuilder {
|
|
||||||
self.builder.max_jit_stack_size(bytes);
|
|
||||||
self
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
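The doc comment on `max_jit_stack_size` in the hunk above describes an optional limit: `None` keeps the default JIT stack (stated there as 32 KB at most), and a value requests a custom stack with that maximum. The following is only a minimal sketch of that setter pattern using a hypothetical stand-in type, `SketchBuilder`; it does not call PCRE2, and `effective_stack_limit` is an illustrative helper that does not exist in `grep-pcre2`.

```rust
/// Hypothetical stand-in for a matcher builder. This is NOT the
/// `RegexMatcherBuilder` from grep-pcre2; it only mirrors the shape of
/// the `jit`/`max_jit_stack_size` setters described in the doc comment.
#[derive(Debug, Default)]
struct SketchBuilder {
    jit: bool,
    max_jit_stack_size: Option<usize>,
}

impl SketchBuilder {
    fn new() -> SketchBuilder {
        SketchBuilder::default()
    }

    /// Enable or disable the (imaginary) JIT.
    fn jit(&mut self, yes: bool) -> &mut SketchBuilder {
        self.jit = yes;
        self
    }

    /// `None` means "use the default JIT stack"; `Some(n)` asks for a
    /// custom stack whose maximum size is `n` bytes.
    fn max_jit_stack_size(&mut self, bytes: Option<usize>) -> &mut SketchBuilder {
        self.max_jit_stack_size = bytes;
        self
    }

    /// Illustrative helper (an assumption of this sketch): the limit that
    /// would apply. With the JIT disabled the setting has no effect.
    fn effective_stack_limit(&self) -> Option<usize> {
        if !self.jit {
            return None;
        }
        // 32 KB is the default maximum quoted in the doc comment.
        Some(self.max_jit_stack_size.unwrap_or(32 * 1024))
    }
}

fn main() {
    let mut builder = SketchBuilder::new();
    builder.jit(true).max_jit_stack_size(Some(1 << 20));
    println!("{:?}", builder.effective_stack_limit());
}
```

Returning `&mut Self` from each setter is what allows the chained call in `main`, matching the style of the other builder methods in this file.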
|
|
||||||
/// An implementation of the `Matcher` trait using PCRE2.
|
/// An implementation of the `Matcher` trait using PCRE2.
|
||||||
|
@@ -1,6 +1,6 @@
|
|||||||
[package]
|
[package]
|
||||||
name = "grep-printer"
|
name = "grep-printer"
|
||||||
version = "0.1.2" #:version
|
version = "0.1.1" #:version
|
||||||
authors = ["Andrew Gallant <jamslam@gmail.com>"]
|
authors = ["Andrew Gallant <jamslam@gmail.com>"]
|
||||||
description = """
|
description = """
|
||||||
An implementation of the grep crate's Sink trait that provides standard
|
An implementation of the grep crate's Sink trait that provides standard
|
||||||
@@ -20,12 +20,12 @@ serde1 = ["base64", "serde", "serde_derive", "serde_json"]
|
|||||||
[dependencies]
|
[dependencies]
|
||||||
base64 = { version = "0.10.0", optional = true }
|
base64 = { version = "0.10.0", optional = true }
|
||||||
bstr = "0.1.2"
|
bstr = "0.1.2"
|
||||||
grep-matcher = { version = "0.1.2", path = "../grep-matcher" }
|
grep-matcher = { version = "0.1.1", path = "../grep-matcher" }
|
||||||
grep-searcher = { version = "0.1.4", path = "../grep-searcher" }
|
grep-searcher = { version = "0.1.1", path = "../grep-searcher" }
|
||||||
termcolor = "1.0.4"
|
termcolor = "1.0.4"
|
||||||
serde = { version = "1.0.77", optional = true }
|
serde = { version = "1.0.77", optional = true }
|
||||||
serde_derive = { version = "1.0.77", optional = true }
|
serde_derive = { version = "1.0.77", optional = true }
|
||||||
serde_json = { version = "1.0.27", optional = true }
|
serde_json = { version = "1.0.27", optional = true }
|
||||||
|
|
||||||
[dev-dependencies]
|
[dev-dependencies]
|
||||||
grep-regex = { version = "0.1.3", path = "../grep-regex" }
|
grep-regex = { version = "0.1.1", path = "../grep-regex" }
|
||||||
|
@@ -5,7 +5,6 @@ use std::path::Path;
|
|||||||
use std::sync::Arc;
|
use std::sync::Arc;
|
||||||
use std::time::Instant;
|
use std::time::Instant;
|
||||||
|
|
||||||
use bstr::BStr;
|
|
||||||
use grep_matcher::{Match, Matcher};
|
use grep_matcher::{Match, Matcher};
|
||||||
use grep_searcher::{
|
use grep_searcher::{
|
||||||
LineStep, Searcher,
|
LineStep, Searcher,
|
||||||
@@ -17,7 +16,10 @@ use termcolor::{ColorSpec, NoColor, WriteColor};
|
|||||||
use color::ColorSpecs;
|
use color::ColorSpecs;
|
||||||
use counter::CounterWriter;
|
use counter::CounterWriter;
|
||||||
use stats::Stats;
|
use stats::Stats;
|
||||||
use util::{PrinterPath, Replacer, Sunk, trim_ascii_prefix};
|
use util::{
|
||||||
|
PrinterPath, Replacer, Sunk,
|
||||||
|
trim_ascii_prefix, trim_ascii_prefix_range,
|
||||||
|
};
|
||||||
|
|
||||||
/// The configuration for the standard printer.
|
/// The configuration for the standard printer.
|
||||||
///
|
///
|
||||||
@@ -34,7 +36,6 @@ struct Config {
|
|||||||
per_match: bool,
|
per_match: bool,
|
||||||
replacement: Arc<Option<Vec<u8>>>,
|
replacement: Arc<Option<Vec<u8>>>,
|
||||||
max_columns: Option<u64>,
|
max_columns: Option<u64>,
|
||||||
max_columns_preview: bool,
|
|
||||||
max_matches: Option<u64>,
|
max_matches: Option<u64>,
|
||||||
column: bool,
|
column: bool,
|
||||||
byte_offset: bool,
|
byte_offset: bool,
|
||||||
@@ -58,7 +59,6 @@ impl Default for Config {
|
|||||||
per_match: false,
|
per_match: false,
|
||||||
replacement: Arc::new(None),
|
replacement: Arc::new(None),
|
||||||
max_columns: None,
|
max_columns: None,
|
||||||
max_columns_preview: false,
|
|
||||||
max_matches: None,
|
max_matches: None,
|
||||||
column: false,
|
column: false,
|
||||||
byte_offset: false,
|
byte_offset: false,
|
||||||
@@ -263,21 +263,6 @@ impl StandardBuilder {
|
|||||||
self
|
self
|
||||||
}
|
}
|
||||||
|
|
||||||
/// When enabled, if a line is found to be over the configured maximum
|
|
||||||
/// column limit (measured in terms of bytes), then a preview of the long
|
|
||||||
/// line will be printed instead.
|
|
||||||
///
|
|
||||||
/// The preview will correspond to the first `N` *grapheme clusters* of
|
|
||||||
/// the line, where `N` is the limit configured by `max_columns`.
|
|
||||||
///
|
|
||||||
/// If no limit is set, then enabling this has no effect.
|
|
||||||
///
|
|
||||||
/// This is disabled by default.
|
|
||||||
pub fn max_columns_preview(&mut self, yes: bool) -> &mut StandardBuilder {
|
|
||||||
self.config.max_columns_preview = yes;
|
|
||||||
self
|
|
||||||
}
|
|
||||||
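The `max_columns_preview` doc comment above says that an over-long line is replaced by a preview of its first `N` grapheme clusters, where `N` is the `max_columns` limit and the limit itself is measured in bytes. A rough, standard-library-only sketch of that idea follows. It approximates grapheme clusters with `char`s (Unicode scalar values), which is a simplification and not what the printer does, and the function name `preview_line` and the exact `>` comparison are assumptions of this sketch.

```rust
/// Return the text to print for `line` under an optional column limit.
///
/// Assumptions of this sketch: the limit is compared against the byte
/// length of the line, and the preview keeps the first `limit` chars
/// (a stand-in for grapheme clusters).
fn preview_line(line: &str, max_columns: Option<usize>) -> String {
    let limit = match max_columns {
        // No limit configured: the preview setting has no effect.
        None => return line.to_string(),
        Some(limit) => limit,
    };
    if line.len() <= limit {
        return line.to_string();
    }
    // Byte offset just past the first `limit` chars. If the line has
    // fewer chars than that (multi-byte text), keep the whole line.
    let end = line
        .char_indices()
        .nth(limit)
        .map(|(i, _)| i)
        .unwrap_or(line.len());
    format!("{} [... omitted end of long line]", &line[..end])
}

fn main() {
    let line = "but Doctor Watson has to have it taken out for him and dusted,";
    println!("{}", preview_line(line, Some(46)));
    println!("{}", preview_line(line, None));
}
```

Slicing at an offset taken from `char_indices` keeps the cut on a UTF-8 boundary; a grapheme-aware version would additionally avoid splitting combining sequences.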
|
|
||||||
/// Set the maximum amount of matching lines that are printed.
|
/// Set the maximum amount of matching lines that are printed.
|
||||||
///
|
///
|
||||||
/// If multi line search is enabled and a match spans multiple lines, then
|
/// If multi line search is enabled and a match spans multiple lines, then
|
||||||
@@ -758,11 +743,6 @@ impl<'p, 's, M: Matcher, W: WriteColor> Sink for StandardSink<'p, 's, M, W> {
|
|||||||
stats.add_matches(self.standard.matches.len() as u64);
|
stats.add_matches(self.standard.matches.len() as u64);
|
||||||
stats.add_matched_lines(mat.lines().count() as u64);
|
stats.add_matched_lines(mat.lines().count() as u64);
|
||||||
}
|
}
|
||||||
if searcher.binary_detection().convert_byte().is_some() {
|
|
||||||
if self.binary_byte_offset.is_some() {
|
|
||||||
return Ok(false);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
StandardImpl::from_match(searcher, self, mat).sink()?;
|
StandardImpl::from_match(searcher, self, mat).sink()?;
|
||||||
Ok(!self.should_quit())
|
Ok(!self.should_quit())
|
||||||
@@ -784,12 +764,6 @@ impl<'p, 's, M: Matcher, W: WriteColor> Sink for StandardSink<'p, 's, M, W> {
|
|||||||
self.record_matches(ctx.bytes())?;
|
self.record_matches(ctx.bytes())?;
|
||||||
self.replace(ctx.bytes())?;
|
self.replace(ctx.bytes())?;
|
||||||
}
|
}
|
||||||
if searcher.binary_detection().convert_byte().is_some() {
|
|
||||||
if self.binary_byte_offset.is_some() {
|
|
||||||
return Ok(false);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
StandardImpl::from_context(searcher, self, ctx).sink()?;
|
StandardImpl::from_context(searcher, self, ctx).sink()?;
|
||||||
Ok(!self.should_quit())
|
Ok(!self.should_quit())
|
||||||
}
|
}
|
||||||
@@ -802,15 +776,6 @@ impl<'p, 's, M: Matcher, W: WriteColor> Sink for StandardSink<'p, 's, M, W> {
|
|||||||
Ok(true)
|
Ok(true)
|
||||||
}
|
}
|
||||||
|
|
||||||
fn binary_data(
|
|
||||||
&mut self,
|
|
||||||
_searcher: &Searcher,
|
|
||||||
binary_byte_offset: u64,
|
|
||||||
) -> Result<bool, io::Error> {
|
|
||||||
self.binary_byte_offset = Some(binary_byte_offset);
|
|
||||||
Ok(true)
|
|
||||||
}
|
|
||||||
|
|
||||||
fn begin(
|
fn begin(
|
||||||
&mut self,
|
&mut self,
|
||||||
_searcher: &Searcher,
|
_searcher: &Searcher,
|
||||||
@@ -828,12 +793,10 @@ impl<'p, 's, M: Matcher, W: WriteColor> Sink for StandardSink<'p, 's, M, W> {
|
|||||||
|
|
||||||
fn finish(
|
fn finish(
|
||||||
&mut self,
|
&mut self,
|
||||||
searcher: &Searcher,
|
_searcher: &Searcher,
|
||||||
finish: &SinkFinish,
|
finish: &SinkFinish,
|
||||||
) -> Result<(), io::Error> {
|
) -> Result<(), io::Error> {
|
||||||
if let Some(offset) = self.binary_byte_offset {
|
self.binary_byte_offset = finish.binary_byte_offset();
|
||||||
StandardImpl::new(searcher, self).write_binary_message(offset)?;
|
|
||||||
}
|
|
||||||
if let Some(stats) = self.stats.as_mut() {
|
if let Some(stats) = self.stats.as_mut() {
|
||||||
stats.add_elapsed(self.start_time.elapsed());
|
stats.add_elapsed(self.start_time.elapsed());
|
||||||
stats.add_searches(1);
|
stats.add_searches(1);
|
||||||
@@ -1037,11 +1000,43 @@ impl<'a, M: Matcher, W: WriteColor> StandardImpl<'a, M, W> {
|
|||||||
)?;
|
)?;
|
||||||
count += 1;
|
count += 1;
|
||||||
if self.exceeds_max_columns(&bytes[line]) {
|
if self.exceeds_max_columns(&bytes[line]) {
|
||||||
self.write_exceeded_line(bytes, line, matches, &mut midx)?;
|
self.write_exceeded_line()?;
|
||||||
} else {
|
continue;
|
||||||
self.write_colored_matches(bytes, line, matches, &mut midx)?;
|
|
||||||
self.write_line_term()?;
|
|
||||||
}
|
}
|
||||||
|
if self.has_line_terminator(&bytes[line]) {
|
||||||
|
line = line.with_end(line.end() - 1);
|
||||||
|
}
|
||||||
|
if self.config().trim_ascii {
|
||||||
|
line = self.trim_ascii_prefix_range(bytes, line);
|
||||||
|
}
|
||||||
|
|
||||||
|
while !line.is_empty() {
|
||||||
|
if matches[midx].end() <= line.start() {
|
||||||
|
if midx + 1 < matches.len() {
|
||||||
|
midx += 1;
|
||||||
|
continue;
|
||||||
|
} else {
|
||||||
|
self.end_color_match()?;
|
||||||
|
self.write(&bytes[line])?;
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
let m = matches[midx];
|
||||||
|
|
||||||
|
if line.start() < m.start() {
|
||||||
|
let upto = cmp::min(line.end(), m.start());
|
||||||
|
self.end_color_match()?;
|
||||||
|
self.write(&bytes[line.with_end(upto)])?;
|
||||||
|
line = line.with_start(upto);
|
||||||
|
} else {
|
||||||
|
let upto = cmp::min(line.end(), m.end());
|
||||||
|
self.start_color_match()?;
|
||||||
|
self.write(&bytes[line.with_end(upto)])?;
|
||||||
|
line = line.with_start(upto);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
self.end_color_match()?;
|
||||||
|
self.write_line_term()?;
|
||||||
}
|
}
|
||||||
Ok(())
|
Ok(())
|
||||||
}
|
}
|
||||||
@@ -1056,8 +1051,12 @@ impl<'a, M: Matcher, W: WriteColor> StandardImpl<'a, M, W> {
|
|||||||
let mut stepper = LineStep::new(line_term, 0, bytes.len());
|
let mut stepper = LineStep::new(line_term, 0, bytes.len());
|
||||||
while let Some((start, end)) = stepper.next(bytes) {
|
while let Some((start, end)) = stepper.next(bytes) {
|
||||||
let mut line = Match::new(start, end);
|
let mut line = Match::new(start, end);
|
||||||
self.trim_line_terminator(bytes, &mut line);
|
if self.has_line_terminator(&bytes[line]) {
|
||||||
self.trim_ascii_prefix(bytes, &mut line);
|
line = line.with_end(line.end() - 1);
|
||||||
|
}
|
||||||
|
if self.config().trim_ascii {
|
||||||
|
line = self.trim_ascii_prefix_range(bytes, line);
|
||||||
|
}
|
||||||
while !line.is_empty() {
|
while !line.is_empty() {
|
||||||
if matches[midx].end() <= line.start() {
|
if matches[midx].end() <= line.start() {
|
||||||
if midx + 1 < matches.len() {
|
if midx + 1 < matches.len() {
|
||||||
@@ -1080,19 +1079,14 @@ impl<'a, M: Matcher, W: WriteColor> StandardImpl<'a, M, W> {
|
|||||||
Some(m.start() as u64 + 1),
|
Some(m.start() as u64 + 1),
|
||||||
)?;
|
)?;
|
||||||
|
|
||||||
let this_line = line.with_end(upto);
|
let buf = &bytes[line.with_end(upto)];
|
||||||
line = line.with_start(upto);
|
line = line.with_start(upto);
|
||||||
if self.exceeds_max_columns(&bytes[this_line]) {
|
if self.exceeds_max_columns(&buf) {
|
||||||
self.write_exceeded_line(
|
self.write_exceeded_line()?;
|
||||||
bytes,
|
continue;
|
||||||
this_line,
|
|
||||||
matches,
|
|
||||||
&mut midx,
|
|
||||||
)?;
|
|
||||||
} else {
|
|
||||||
self.write_spec(spec, &bytes[this_line])?;
|
|
||||||
self.write_line_term()?;
|
|
||||||
}
|
}
|
||||||
|
self.write_spec(spec, buf)?;
|
||||||
|
self.write_line_term()?;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
count += 1;
|
count += 1;
|
||||||
@@ -1123,11 +1117,15 @@ impl<'a, M: Matcher, W: WriteColor> StandardImpl<'a, M, W> {
|
|||||||
)?;
|
)?;
|
||||||
count += 1;
|
count += 1;
|
||||||
if self.exceeds_max_columns(&bytes[line]) {
|
if self.exceeds_max_columns(&bytes[line]) {
|
||||||
self.write_exceeded_line(bytes, line, &[m], &mut 0)?;
|
self.write_exceeded_line()?;
|
||||||
continue;
|
continue;
|
||||||
}
|
}
|
||||||
self.trim_line_terminator(bytes, &mut line);
|
if self.has_line_terminator(&bytes[line]) {
|
||||||
self.trim_ascii_prefix(bytes, &mut line);
|
line = line.with_end(line.end() - 1);
|
||||||
|
}
|
||||||
|
if self.config().trim_ascii {
|
||||||
|
line = self.trim_ascii_prefix_range(bytes, line);
|
||||||
|
}
|
||||||
|
|
||||||
while !line.is_empty() {
|
while !line.is_empty() {
|
||||||
if m.end() <= line.start() {
|
if m.end() <= line.start() {
|
||||||
@@ -1184,10 +1182,7 @@ impl<'a, M: Matcher, W: WriteColor> StandardImpl<'a, M, W> {
|
|||||||
line: &[u8],
|
line: &[u8],
|
||||||
) -> io::Result<()> {
|
) -> io::Result<()> {
|
||||||
if self.exceeds_max_columns(line) {
|
if self.exceeds_max_columns(line) {
|
||||||
let range = Match::new(0, line.len());
|
self.write_exceeded_line()?;
|
||||||
self.write_exceeded_line(
|
|
||||||
line, range, self.sunk.matches(), &mut 0,
|
|
||||||
)?;
|
|
||||||
} else {
|
} else {
|
||||||
self.write_trim(line)?;
|
self.write_trim(line)?;
|
||||||
if !self.has_line_terminator(line) {
|
if !self.has_line_terminator(line) {
|
||||||
@@ -1200,114 +1195,50 @@ impl<'a, M: Matcher, W: WriteColor> StandardImpl<'a, M, W> {
|
|||||||
fn write_colored_line(
|
fn write_colored_line(
|
||||||
&self,
|
&self,
|
||||||
matches: &[Match],
|
matches: &[Match],
|
||||||
bytes: &[u8],
|
line: &[u8],
|
||||||
) -> io::Result<()> {
|
) -> io::Result<()> {
|
||||||
// If we know we aren't going to emit color, then we can go faster.
|
// If we know we aren't going to emit color, then we can go faster.
|
||||||
let spec = self.config().colors.matched();
|
let spec = self.config().colors.matched();
|
||||||
if !self.wtr().borrow().supports_color() || spec.is_none() {
|
if !self.wtr().borrow().supports_color() || spec.is_none() {
|
||||||
return self.write_line(bytes);
|
return self.write_line(line);
|
||||||
|
}
|
||||||
|
if self.exceeds_max_columns(line) {
|
||||||
|
return self.write_exceeded_line();
|
||||||
}
|
}
|
||||||
|
|
||||||
let line = Match::new(0, bytes.len());
|
let mut last_written =
|
||||||
if self.exceeds_max_columns(bytes) {
|
if !self.config().trim_ascii {
|
||||||
self.write_exceeded_line(bytes, line, matches, &mut 0)
|
0
|
||||||
} else {
|
} else {
|
||||||
self.write_colored_matches(bytes, line, matches, &mut 0)?;
|
self.trim_ascii_prefix_range(
|
||||||
self.write_line_term()?;
|
line,
|
||||||
Ok(())
|
Match::new(0, line.len()),
|
||||||
}
|
).start()
|
||||||
}
|
|
||||||
|
|
||||||
/// Write the `line` portion of `bytes`, with appropriate coloring for
|
|
||||||
/// each `match`, starting at `match_index`.
|
|
||||||
///
|
|
||||||
/// This accounts for trimming any whitespace prefix and will *never* print
|
|
||||||
/// a line terminator. If a match exceeds the range specified by `line`,
|
|
||||||
/// then only the part of the match within `line` (if any) is printed.
|
|
||||||
fn write_colored_matches(
|
|
||||||
&self,
|
|
||||||
bytes: &[u8],
|
|
||||||
mut line: Match,
|
|
||||||
matches: &[Match],
|
|
||||||
match_index: &mut usize,
|
|
||||||
) -> io::Result<()> {
|
|
||||||
self.trim_line_terminator(bytes, &mut line);
|
|
||||||
self.trim_ascii_prefix(bytes, &mut line);
|
|
||||||
if matches.is_empty() {
|
|
||||||
self.write(&bytes[line])?;
|
|
||||||
return Ok(());
|
|
||||||
}
|
|
||||||
while !line.is_empty() {
|
|
||||||
if matches[*match_index].end() <= line.start() {
|
|
||||||
if *match_index + 1 < matches.len() {
|
|
||||||
*match_index += 1;
|
|
||||||
continue;
|
|
||||||
} else {
|
|
||||||
self.end_color_match()?;
|
|
||||||
self.write(&bytes[line])?;
|
|
||||||
break;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
let m = matches[*match_index];
|
|
||||||
if line.start() < m.start() {
|
|
||||||
let upto = cmp::min(line.end(), m.start());
|
|
||||||
self.end_color_match()?;
|
|
||||||
self.write(&bytes[line.with_end(upto)])?;
|
|
||||||
line = line.with_start(upto);
|
|
||||||
} else {
|
|
||||||
let upto = cmp::min(line.end(), m.end());
|
|
||||||
self.start_color_match()?;
|
|
||||||
self.write(&bytes[line.with_end(upto)])?;
|
|
||||||
line = line.with_start(upto);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
self.end_color_match()?;
|
|
||||||
Ok(())
|
|
||||||
}
|
|
||||||
|
|
||||||
fn write_exceeded_line(
|
|
||||||
&self,
|
|
||||||
bytes: &[u8],
|
|
||||||
mut line: Match,
|
|
||||||
matches: &[Match],
|
|
||||||
match_index: &mut usize,
|
|
||||||
) -> io::Result<()> {
|
|
||||||
if self.config().max_columns_preview {
|
|
||||||
let original = line;
|
|
||||||
let end = BStr::new(&bytes[line])
|
|
||||||
.grapheme_indices()
|
|
||||||
.map(|(_, end, _)| end)
|
|
||||||
.take(self.config().max_columns.unwrap_or(0) as usize)
|
|
||||||
.last()
|
|
||||||
.unwrap_or(0) + line.start();
|
|
||||||
line = line.with_end(end);
|
|
||||||
self.write_colored_matches(bytes, line, matches, match_index)?;
|
|
||||||
|
|
||||||
if matches.is_empty() {
|
|
||||||
self.write(b" [... omitted end of long line]")?;
|
|
||||||
} else {
|
|
||||||
let remaining = matches
|
|
||||||
.iter()
|
|
||||||
.filter(|m| {
|
|
||||||
m.start() >= line.end() && m.start() < original.end()
|
|
||||||
})
|
|
||||||
.count();
|
|
||||||
let tense =
|
|
||||||
if remaining == 1 {
|
|
||||||
"match"
|
|
||||||
} else {
|
|
||||||
"matches"
|
|
||||||
};
|
};
|
||||||
write!(
|
for mut m in matches.iter().map(|&m| m) {
|
||||||
self.wtr().borrow_mut(),
|
if last_written < m.start() {
|
||||||
" [... {} more {}]",
|
self.end_color_match()?;
|
||||||
remaining, tense,
|
self.write(&line[last_written..m.start()])?;
|
||||||
)?;
|
} else if last_written < m.end() {
|
||||||
|
m = m.with_start(last_written);
|
||||||
|
} else {
|
||||||
|
continue;
|
||||||
}
|
}
|
||||||
|
if !m.is_empty() {
|
||||||
|
self.start_color_match()?;
|
||||||
|
self.write(&line[m])?;
|
||||||
|
}
|
||||||
|
last_written = m.end();
|
||||||
|
}
|
||||||
|
self.end_color_match()?;
|
||||||
|
self.write(&line[last_written..])?;
|
||||||
|
if !self.has_line_terminator(line) {
|
||||||
self.write_line_term()?;
|
self.write_line_term()?;
|
||||||
return Ok(());
|
|
||||||
}
|
}
|
||||||
|
Ok(())
|
||||||
|
}
|
||||||
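The right-hand `write_colored_line` above walks the matches in order and tracks `last_written`, so that overlapping or already-covered matches are clipped or skipped rather than written twice. As an illustration only, the sketch below reproduces that loop with plain strings, using `[` and `]` in place of color escapes. The name `mark_matches` and the use of `Range<usize>` instead of `Match` are assumptions of this sketch; it also assumes the matches are sorted by start, lie within the line, and fall on UTF-8 boundaries.

```rust
use std::ops::Range;

/// Wrap each matched region of `line` in brackets, mirroring the
/// `last_written` bookkeeping of the loop in the diff.
fn mark_matches(line: &str, matches: &[Range<usize>]) -> String {
    let mut out = String::new();
    let mut last_written = 0;
    for m in matches {
        let mut start = m.start;
        if last_written < m.start {
            // Unmatched text between the previous match and this one.
            out.push_str(&line[last_written..m.start]);
        } else if last_written < m.end {
            // This match overlaps what was already written: clip it.
            start = last_written;
        } else {
            // Entirely covered by earlier output: nothing left to write.
            continue;
        }
        if start < m.end {
            out.push('[');
            out.push_str(&line[start..m.end]);
            out.push(']');
        }
        last_written = m.end;
    }
    // Trailing unmatched text after the final match.
    out.push_str(&line[last_written..]);
    out
}

fn main() {
    println!("{}", mark_matches("foo bar baz", &[0..3, 8..11]));
    println!("{}", mark_matches("abcdef", &[0..3, 2..5]));
}
```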
|
|
||||||
|
fn write_exceeded_line(&self) -> io::Result<()> {
|
||||||
if self.sunk.original_matches().is_empty() {
|
if self.sunk.original_matches().is_empty() {
|
||||||
if self.is_context() {
|
if self.is_context() {
|
||||||
self.write(b"[Omitted long context line]")?;
|
self.write(b"[Omitted long context line]")?;
|
||||||
@@ -1383,38 +1314,6 @@ impl<'a, M: Matcher, W: WriteColor> StandardImpl<'a, M, W> {
|
|||||||
Ok(())
|
Ok(())
|
||||||
}
|
}
|
||||||
|
|
||||||
fn write_binary_message(&self, offset: u64) -> io::Result<()> {
|
|
||||||
if self.sink.match_count == 0 {
|
|
||||||
return Ok(());
|
|
||||||
}
|
|
||||||
|
|
||||||
let bin = self.searcher.binary_detection();
|
|
||||||
if let Some(byte) = bin.quit_byte() {
|
|
||||||
self.write(b"WARNING: stopped searching binary file ")?;
|
|
||||||
if let Some(path) = self.path() {
|
|
||||||
self.write_spec(self.config().colors.path(), path.as_bytes())?;
|
|
||||||
self.write(b" ")?;
|
|
||||||
}
|
|
||||||
let remainder = format!(
|
|
||||||
"after match (found {:?} byte around offset {})\n",
|
|
||||||
BStr::new(&[byte]), offset,
|
|
||||||
);
|
|
||||||
self.write(remainder.as_bytes())?;
|
|
||||||
} else if let Some(byte) = bin.convert_byte() {
|
|
||||||
self.write(b"Binary file ")?;
|
|
||||||
if let Some(path) = self.path() {
|
|
||||||
self.write_spec(self.config().colors.path(), path.as_bytes())?;
|
|
||||||
self.write(b" ")?;
|
|
||||||
}
|
|
||||||
let remainder = format!(
|
|
||||||
"matches (found {:?} byte around offset {})\n",
|
|
||||||
BStr::new(&[byte]), offset,
|
|
||||||
);
|
|
||||||
self.write(remainder.as_bytes())?;
|
|
||||||
}
|
|
||||||
Ok(())
|
|
||||||
}
|
|
||||||
|
|
||||||
fn write_context_separator(&self) -> io::Result<()> {
|
fn write_context_separator(&self) -> io::Result<()> {
|
||||||
if let Some(ref sep) = *self.config().separator_context {
|
if let Some(ref sep) = *self.config().separator_context {
|
||||||
self.write(sep)?;
|
self.write(sep)?;
|
||||||
@@ -1490,26 +1389,13 @@ impl<'a, M: Matcher, W: WriteColor> StandardImpl<'a, M, W> {
|
|||||||
if !self.config().trim_ascii {
|
if !self.config().trim_ascii {
|
||||||
return self.write(buf);
|
return self.write(buf);
|
||||||
}
|
}
|
||||||
let mut range = Match::new(0, buf.len());
|
self.write(self.trim_ascii_prefix(buf))
|
||||||
self.trim_ascii_prefix(buf, &mut range);
|
|
||||||
self.write(&buf[range])
|
|
||||||
}
|
}
|
||||||
|
|
||||||
fn write(&self, buf: &[u8]) -> io::Result<()> {
|
fn write(&self, buf: &[u8]) -> io::Result<()> {
|
||||||
self.wtr().borrow_mut().write_all(buf)
|
self.wtr().borrow_mut().write_all(buf)
|
||||||
}
|
}
|
||||||
|
|
||||||
fn trim_line_terminator(&self, buf: &[u8], line: &mut Match) {
|
|
||||||
let lineterm = self.searcher.line_terminator();
|
|
||||||
if lineterm.is_suffix(&buf[*line]) {
|
|
||||||
let mut end = line.end() - 1;
|
|
||||||
if lineterm.is_crlf() && buf[end - 1] == b'\r' {
|
|
||||||
end -= 1;
|
|
||||||
}
|
|
||||||
*line = line.with_end(end);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
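The left-hand `trim_line_terminator` above shortens a line range by one byte when it ends with the line terminator, and by one more when CRLF handling is on and the preceding byte is `\r`. A self-contained sketch of the same idea on a plain byte slice is shown below. The `crlf: bool` parameter stands in for `lineterm.is_crlf()`, the terminator is assumed to be `\n`, and the `end > 0` guard is an addition of this sketch so that a lone `"\n"` cannot index out of bounds.

```rust
/// Strip a trailing `\n` from `line`, and a preceding `\r` as well when
/// `crlf` is true. Lines without a terminator are returned unchanged.
fn trim_line_terminator(line: &[u8], crlf: bool) -> &[u8] {
    if !line.ends_with(b"\n") {
        return line;
    }
    let mut end = line.len() - 1;
    // Guard against an empty remainder before looking at the prior byte.
    if crlf && end > 0 && line[end - 1] == b'\r' {
        end -= 1;
    }
    &line[..end]
}

fn main() {
    let trimmed = trim_line_terminator(b"and exhibited clearly\r\n", true);
    println!("{}", String::from_utf8_lossy(trimmed));
}
```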
|
|
||||||
fn has_line_terminator(&self, buf: &[u8]) -> bool {
|
fn has_line_terminator(&self, buf: &[u8]) -> bool {
|
||||||
self.searcher.line_terminator().is_suffix(buf)
|
self.searcher.line_terminator().is_suffix(buf)
|
||||||
}
|
}
|
||||||
@@ -1565,12 +1451,14 @@ impl<'a, M: Matcher, W: WriteColor> StandardImpl<'a, M, W> {
|
|||||||
///
|
///
|
||||||
/// This stops trimming a prefix as soon as it sees non-whitespace or a
|
/// This stops trimming a prefix as soon as it sees non-whitespace or a
|
||||||
/// line terminator.
|
/// line terminator.
|
||||||
fn trim_ascii_prefix(&self, slice: &[u8], range: &mut Match) {
|
fn trim_ascii_prefix_range(&self, slice: &[u8], range: Match) -> Match {
|
||||||
if !self.config().trim_ascii {
|
trim_ascii_prefix_range(self.searcher.line_terminator(), slice, range)
|
||||||
return;
|
|
||||||
}
|
}
|
||||||
let lineterm = self.searcher.line_terminator();
|
|
||||||
*range = trim_ascii_prefix(lineterm, slice, *range)
|
/// Trim prefix ASCII spaces from the given slice and return the
|
||||||
|
/// corresponding sub-slice.
|
||||||
|
fn trim_ascii_prefix<'s>(&self, slice: &'s [u8]) -> &'s [u8] {
|
||||||
|
trim_ascii_prefix(self.searcher.line_terminator(), slice)
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -2337,31 +2225,6 @@ but Doctor Watson has to have it taken out for him and dusted,
|
|||||||
assert_eq_printed!(expected, got);
|
assert_eq_printed!(expected, got);
|
||||||
}
|
}
|
||||||
|
|
||||||
#[test]
|
|
||||||
fn max_columns_preview() {
|
|
||||||
let matcher = RegexMatcher::new("exhibited|dusted").unwrap();
|
|
||||||
let mut printer = StandardBuilder::new()
|
|
||||||
.max_columns(Some(46))
|
|
||||||
.max_columns_preview(true)
|
|
||||||
.build(NoColor::new(vec![]));
|
|
||||||
SearcherBuilder::new()
|
|
||||||
.line_number(false)
|
|
||||||
.build()
|
|
||||||
.search_reader(
|
|
||||||
&matcher,
|
|
||||||
SHERLOCK.as_bytes(),
|
|
||||||
printer.sink(&matcher),
|
|
||||||
)
|
|
||||||
.unwrap();
|
|
||||||
|
|
||||||
let got = printer_contents(&mut printer);
|
|
||||||
let expected = "\
|
|
||||||
but Doctor Watson has to have it taken out for [... omitted end of long line]
|
|
||||||
and exhibited clearly, with a label attached.
|
|
||||||
";
|
|
||||||
assert_eq_printed!(expected, got);
|
|
||||||
}
|
|
||||||
|
|
||||||
#[test]
|
#[test]
|
||||||
fn max_columns_with_count() {
|
fn max_columns_with_count() {
|
||||||
let matcher = RegexMatcher::new("cigar|ash|dusted").unwrap();
|
let matcher = RegexMatcher::new("cigar|ash|dusted").unwrap();
|
||||||
@@ -2387,86 +2250,6 @@ but Doctor Watson has to have it taken out for him and dusted,
|
|||||||
assert_eq_printed!(expected, got);
|
assert_eq_printed!(expected, got);
|
||||||
}
|
}
|
||||||
|
|
||||||
#[test]
|
|
||||||
fn max_columns_with_count_preview_no_match() {
|
|
||||||
let matcher = RegexMatcher::new("exhibited|has to have it").unwrap();
|
|
||||||
let mut printer = StandardBuilder::new()
|
|
||||||
.stats(true)
|
|
||||||
.max_columns(Some(46))
|
|
||||||
.max_columns_preview(true)
|
|
||||||
.build(NoColor::new(vec![]));
|
|
||||||
SearcherBuilder::new()
|
|
||||||
.line_number(false)
|
|
||||||
.build()
|
|
||||||
.search_reader(
|
|
||||||
&matcher,
|
|
||||||
SHERLOCK.as_bytes(),
|
|
||||||
printer.sink(&matcher),
|
|
||||||
)
|
|
||||||
.unwrap();
|
|
||||||
|
|
||||||
let got = printer_contents(&mut printer);
|
|
||||||
let expected = "\
|
|
||||||
but Doctor Watson has to have it taken out for [... 0 more matches]
|
|
||||||
and exhibited clearly, with a label attached.
|
|
||||||
";
|
|
||||||
assert_eq_printed!(expected, got);
|
|
||||||
}
|
|
||||||
|
|
||||||
#[test]
|
|
||||||
fn max_columns_with_count_preview_one_match() {
|
|
||||||
let matcher = RegexMatcher::new("exhibited|dusted").unwrap();
|
|
||||||
let mut printer = StandardBuilder::new()
|
|
||||||
.stats(true)
|
|
||||||
.max_columns(Some(46))
|
|
||||||
.max_columns_preview(true)
|
|
||||||
.build(NoColor::new(vec![]));
|
|
||||||
SearcherBuilder::new()
|
|
||||||
.line_number(false)
|
|
||||||
.build()
|
|
||||||
.search_reader(
|
|
||||||
&matcher,
|
|
||||||
SHERLOCK.as_bytes(),
|
|
||||||
printer.sink(&matcher),
|
|
||||||
)
|
|
||||||
.unwrap();
|
|
||||||
|
|
||||||
let got = printer_contents(&mut printer);
|
|
||||||
let expected = "\
|
|
||||||
but Doctor Watson has to have it taken out for [... 1 more match]
|
|
||||||
and exhibited clearly, with a label attached.
|
|
||||||
";
|
|
||||||
assert_eq_printed!(expected, got);
|
|
||||||
}
|
|
||||||
|
|
||||||
#[test]
|
|
||||||
fn max_columns_with_count_preview_two_matches() {
|
|
||||||
let matcher = RegexMatcher::new(
|
|
||||||
"exhibited|dusted|has to have it",
|
|
||||||
).unwrap();
|
|
||||||
let mut printer = StandardBuilder::new()
|
|
||||||
.stats(true)
|
|
||||||
.max_columns(Some(46))
|
|
||||||
.max_columns_preview(true)
|
|
||||||
.build(NoColor::new(vec![]));
|
|
||||||
SearcherBuilder::new()
|
|
||||||
.line_number(false)
|
|
||||||
.build()
|
|
||||||
.search_reader(
|
|
||||||
&matcher,
|
|
||||||
SHERLOCK.as_bytes(),
|
|
||||||
printer.sink(&matcher),
|
|
||||||
)
|
|
||||||
.unwrap();
|
|
||||||
|
|
||||||
let got = printer_contents(&mut printer);
|
|
||||||
let expected = "\
|
|
||||||
but Doctor Watson has to have it taken out for [... 1 more match]
|
|
||||||
and exhibited clearly, with a label attached.
|
|
||||||
";
|
|
||||||
assert_eq_printed!(expected, got);
|
|
||||||
}
|
|
||||||
|
|
||||||
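The expected strings in the preview tests above follow one pattern: the line is cut at the column limit and a suffix reports how many matches begin in the part that was cut. A minimal sketch of that rule follows, assuming byte offsets for match starts and a cut counted in `char`s rather than whatever unit the printer uses internally. The name `preview_line` and its signature are hypothetical and are not part of the `grep-printer` API.

```rust
/// Hypothetical helper: truncate `line` to `max_columns` characters and
/// append a note counting the matches that start at or after the cut.
/// `match_starts` holds the byte offset at which each match begins.
fn preview_line(line: &str, match_starts: &[usize], max_columns: usize) -> String {
    // Byte offset just past the first `max_columns` characters, if the
    // line is long enough to need truncation at all.
    let end = match line.char_indices().nth(max_columns) {
        Some((idx, _)) => idx,
        None => return line.to_string(),
    };
    // Only matches that begin in the omitted tail are counted.
    let remaining = match_starts.iter().filter(|&&start| start >= end).count();
    let noun = if remaining == 1 { "match" } else { "matches" };
    format!("{} [... {} more {}]", &line[..end], remaining, noun)
}
```

With a limit of 46 and the line from the tests, a match on `dusted` falls after the cut and is counted, while a match on `has to have it` falls before the cut and is not.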
#[test]
|
#[test]
|
||||||
fn max_columns_multi_line() {
|
fn max_columns_multi_line() {
|
||||||
let matcher = RegexMatcher::new("(?s)ash.+dusted").unwrap();
|
let matcher = RegexMatcher::new("(?s)ash.+dusted").unwrap();
|
||||||
@@ -2492,36 +2275,6 @@ but Doctor Watson has to have it taken out for him and dusted,
|
|||||||
assert_eq_printed!(expected, got);
|
assert_eq_printed!(expected, got);
|
||||||
}
|
}
|
||||||
|
|
||||||
#[test]
|
|
||||||
fn max_columns_multi_line_preview() {
|
|
||||||
let matcher = RegexMatcher::new(
|
|
||||||
"(?s)clew|cigar ash.+have it|exhibited",
|
|
||||||
).unwrap();
|
|
||||||
let mut printer = StandardBuilder::new()
|
|
||||||
.stats(true)
|
|
||||||
.max_columns(Some(46))
|
|
||||||
.max_columns_preview(true)
|
|
||||||
.build(NoColor::new(vec![]));
|
|
||||||
SearcherBuilder::new()
|
|
||||||
.line_number(false)
|
|
||||||
.multi_line(true)
|
|
||||||
.build()
|
|
||||||
.search_reader(
|
|
||||||
&matcher,
|
|
||||||
SHERLOCK.as_bytes(),
|
|
||||||
printer.sink(&matcher),
|
|
||||||
)
|
|
||||||
.unwrap();
|
|
||||||
|
|
||||||
let got = printer_contents(&mut printer);
|
|
||||||
let expected = "\
|
|
||||||
can extract a clew from a wisp of straw or a f [... 1 more match]
|
|
||||||
but Doctor Watson has to have it taken out for [... 0 more matches]
|
|
||||||
and exhibited clearly, with a label attached.
|
|
||||||
";
|
|
||||||
assert_eq_printed!(expected, got);
|
|
||||||
}
|
|
||||||
|
|
||||||
#[test]
|
#[test]
|
||||||
fn max_matches() {
|
fn max_matches() {
|
||||||
let matcher = RegexMatcher::new("Sherlock").unwrap();
|
let matcher = RegexMatcher::new("Sherlock").unwrap();
|
||||||
@@ -2811,40 +2564,8 @@ Holmeses, success in the province of detective work must always
|
|||||||
assert_eq_printed!(expected, got);
|
assert_eq_printed!(expected, got);
|
||||||
}
|
}
|
||||||
|
|
||||||
#[test]
|
|
||||||
fn only_matching_max_columns_preview() {
|
|
||||||
let matcher = RegexMatcher::new("Doctor Watsons|Sherlock").unwrap();
|
|
||||||
let mut printer = StandardBuilder::new()
|
|
||||||
.only_matching(true)
|
|
||||||
.max_columns(Some(10))
|
|
||||||
.max_columns_preview(true)
|
|
||||||
.column(true)
|
|
||||||
.build(NoColor::new(vec![]));
|
|
||||||
SearcherBuilder::new()
|
|
||||||
.line_number(true)
|
|
||||||
.build()
|
|
||||||
.search_reader(
|
|
||||||
&matcher,
|
|
||||||
SHERLOCK.as_bytes(),
|
|
||||||
printer.sink(&matcher),
|
|
||||||
)
|
|
||||||
.unwrap();
|
|
||||||
|
|
||||||
let got = printer_contents(&mut printer);
|
|
||||||
let expected = "\
|
|
||||||
1:9:Doctor Wat [... 0 more matches]
|
|
||||||
1:57:Sherlock
|
|
||||||
3:49:Sherlock
|
|
||||||
";
|
|
||||||
assert_eq_printed!(expected, got);
|
|
||||||
}
|
|
||||||
|
|
||||||
#[test]
|
#[test]
|
||||||
fn only_matching_max_columns_multi_line1() {
|
fn only_matching_max_columns_multi_line1() {
|
||||||
// The `(?s:.{0})` trick fools the matcher into thinking that it
|
|
||||||
// can match across multiple lines without actually doing so. This is
|
|
||||||
// so we can test multi-line handling in the case of a match on only
|
|
||||||
// one line.
|
|
||||||
let matcher = RegexMatcher::new(
|
let matcher = RegexMatcher::new(
|
||||||
r"(?s:.{0})(Doctor Watsons|Sherlock)"
|
r"(?s:.{0})(Doctor Watsons|Sherlock)"
|
||||||
).unwrap();
|
).unwrap();
|
||||||
@@ -2873,41 +2594,6 @@ Holmeses, success in the province of detective work must always
|
|||||||
assert_eq_printed!(expected, got);
|
assert_eq_printed!(expected, got);
|
||||||
}
|
}
|
||||||
|
|
||||||
#[test]
|
|
||||||
fn only_matching_max_columns_preview_multi_line1() {
|
|
||||||
// The `(?s:.{0})` trick fools the matcher into thinking that it
|
|
||||||
// can match across multiple lines without actually doing so. This is
|
|
||||||
// so we can test multi-line handling in the case of a match on only
|
|
||||||
// one line.
|
|
||||||
let matcher = RegexMatcher::new(
|
|
||||||
r"(?s:.{0})(Doctor Watsons|Sherlock)"
|
|
||||||
).unwrap();
|
|
||||||
let mut printer = StandardBuilder::new()
|
|
||||||
.only_matching(true)
|
|
||||||
.max_columns(Some(10))
|
|
||||||
.max_columns_preview(true)
|
|
||||||
.column(true)
|
|
||||||
.build(NoColor::new(vec![]));
|
|
||||||
SearcherBuilder::new()
|
|
||||||
.multi_line(true)
|
|
||||||
.line_number(true)
|
|
||||||
.build()
|
|
||||||
.search_reader(
|
|
||||||
&matcher,
|
|
||||||
SHERLOCK.as_bytes(),
|
|
||||||
printer.sink(&matcher),
|
|
||||||
)
|
|
||||||
.unwrap();
|
|
||||||
|
|
||||||
let got = printer_contents(&mut printer);
|
|
||||||
let expected = "\
|
|
||||||
1:9:Doctor Wat [... 0 more matches]
|
|
||||||
1:57:Sherlock
|
|
||||||
3:49:Sherlock
|
|
||||||
";
|
|
||||||
assert_eq_printed!(expected, got);
|
|
||||||
}
|
|
||||||
|
|
||||||
#[test]
|
#[test]
|
||||||
fn only_matching_max_columns_multi_line2() {
|
fn only_matching_max_columns_multi_line2() {
|
||||||
let matcher = RegexMatcher::new(
|
let matcher = RegexMatcher::new(
|
||||||
@@ -2939,38 +2625,6 @@ Holmeses, success in the province of detective work must always
|
|||||||
assert_eq_printed!(expected, got);
|
assert_eq_printed!(expected, got);
|
||||||
}
|
}
|
||||||
|
|
||||||
#[test]
|
|
||||||
fn only_matching_max_columns_preview_multi_line2() {
|
|
||||||
let matcher = RegexMatcher::new(
|
|
||||||
r"(?s)Watson.+?(Holmeses|clearly)"
|
|
||||||
).unwrap();
|
|
||||||
let mut printer = StandardBuilder::new()
|
|
||||||
.only_matching(true)
|
|
||||||
.max_columns(Some(50))
|
|
||||||
.max_columns_preview(true)
|
|
||||||
.column(true)
|
|
||||||
.build(NoColor::new(vec![]));
|
|
||||||
SearcherBuilder::new()
|
|
||||||
.multi_line(true)
|
|
||||||
.line_number(true)
|
|
||||||
.build()
|
|
||||||
.search_reader(
|
|
||||||
&matcher,
|
|
||||||
SHERLOCK.as_bytes(),
|
|
||||||
printer.sink(&matcher),
|
|
||||||
)
|
|
||||||
.unwrap();
|
|
||||||
|
|
||||||
let got = printer_contents(&mut printer);
|
|
||||||
let expected = "\
|
|
||||||
1:16:Watsons of this world, as opposed to the Sherlock
|
|
||||||
2:16:Holmeses
|
|
||||||
5:12:Watson has to have it taken out for him and dusted [... 0 more matches]
|
|
||||||
6:12:and exhibited clearly
|
|
||||||
";
|
|
||||||
assert_eq_printed!(expected, got);
|
|
||||||
}
|
|
||||||
|
|
||||||
#[test]
|
#[test]
|
||||||
fn per_match() {
|
fn per_match() {
|
||||||
let matcher = RegexMatcher::new("Doctor Watsons|Sherlock").unwrap();
|
let matcher = RegexMatcher::new("Doctor Watsons|Sherlock").unwrap();
|
||||||
@@ -3166,61 +2820,6 @@ Holmeses, success in the province of detective work must always
|
|||||||
assert_eq_printed!(expected, got);
|
assert_eq_printed!(expected, got);
|
||||||
}
|
}
|
||||||
|
|
||||||
#[test]
|
|
||||||
fn replacement_max_columns_preview1() {
|
|
||||||
let matcher = RegexMatcher::new(r"Sherlock|Doctor (\w+)").unwrap();
|
|
||||||
let mut printer = StandardBuilder::new()
|
|
||||||
.max_columns(Some(67))
|
|
||||||
.max_columns_preview(true)
|
|
||||||
.replacement(Some(b"doctah $1 MD".to_vec()))
|
|
||||||
.build(NoColor::new(vec![]));
|
|
||||||
SearcherBuilder::new()
|
|
||||||
.line_number(true)
|
|
||||||
.build()
|
|
||||||
.search_reader(
|
|
||||||
&matcher,
|
|
||||||
SHERLOCK.as_bytes(),
|
|
||||||
printer.sink(&matcher),
|
|
||||||
)
|
|
||||||
.unwrap();
|
|
||||||
|
|
||||||
let got = printer_contents(&mut printer);
|
|
||||||
let expected = "\
|
|
||||||
1:For the doctah Watsons MD of this world, as opposed to the doctah [... 0 more matches]
|
|
||||||
3:be, to a very large extent, the result of luck. doctah MD Holmes
|
|
||||||
5:but doctah Watson MD has to have it taken out for him and dusted,
|
|
||||||
";
|
|
||||||
assert_eq_printed!(expected, got);
|
|
||||||
}
|
|
||||||
|
|
||||||
#[test]
|
|
||||||
fn replacement_max_columns_preview2() {
|
|
||||||
let matcher = RegexMatcher::new(
|
|
||||||
"exhibited|dusted|has to have it",
|
|
||||||
).unwrap();
|
|
||||||
let mut printer = StandardBuilder::new()
|
|
||||||
.max_columns(Some(43))
|
|
||||||
.max_columns_preview(true)
|
|
||||||
.replacement(Some(b"xxx".to_vec()))
|
|
||||||
.build(NoColor::new(vec![]));
|
|
||||||
SearcherBuilder::new()
|
|
||||||
.line_number(false)
|
|
||||||
.build()
|
|
||||||
.search_reader(
|
|
||||||
&matcher,
|
|
||||||
SHERLOCK.as_bytes(),
|
|
||||||
printer.sink(&matcher),
|
|
||||||
)
|
|
||||||
.unwrap();
|
|
||||||
|
|
||||||
let got = printer_contents(&mut printer);
|
|
||||||
let expected = "\
|
|
||||||
but Doctor Watson xxx taken out for him and [... 1 more match]
|
|
||||||
and xxx clearly, with a label attached.
|
|
||||||
";
|
|
||||||
assert_eq_printed!(expected, got);
|
|
||||||
}
|
|
||||||
|
|
||||||
#[test]
|
#[test]
|
||||||
fn replacement_only_matching() {
|
fn replacement_only_matching() {
|
||||||
let matcher = RegexMatcher::new(r"Sherlock|Doctor (\w+)").unwrap();
|
let matcher = RegexMatcher::new(r"Sherlock|Doctor (\w+)").unwrap();
|
||||||
|
@@ -636,34 +636,6 @@ impl<'p, 's, M: Matcher, W: WriteColor> Sink for SummarySink<'p, 's, M, W> {
|
|||||||
stats.add_bytes_searched(finish.byte_count());
|
stats.add_bytes_searched(finish.byte_count());
|
||||||
stats.add_bytes_printed(self.summary.wtr.borrow().count());
|
stats.add_bytes_printed(self.summary.wtr.borrow().count());
|
||||||
}
|
}
|
||||||
// If our binary detection method says to quit after seeing binary
|
|
||||||
// data, then we shouldn't print any results at all, even if we've
|
|
||||||
// found a match before detecting binary data. The intent here is to
|
|
||||||
// keep BinaryDetection::quit as a form of filter. Otherwise, we can
|
|
||||||
// present a matching file with a smaller number of matches than
|
|
||||||
// there might be, which can be quite misleading.
|
|
||||||
//
|
|
||||||
// If our binary detection method is to convert binary data, then we
|
|
||||||
// don't quit and therefore search the entire contents of the file.
|
|
||||||
//
|
|
||||||
// There is an unfortunate inconsistency here. Namely, when using
|
|
||||||
// Quiet or PathWithMatch, then the printer can quit after the first
|
|
||||||
// match seen, which could be long before seeing binary data. This
|
|
||||||
// means that using PathWithMatch can print a path where as using
|
|
||||||
// Count might not print it at all because of binary data.
|
|
||||||
//
|
|
||||||
// It's not possible to fix this without also potentially significantly
|
|
||||||
// impacting the performance of Quiet or PathWithMatch, so we accept
|
|
||||||
// the bug.
|
|
||||||
if self.binary_byte_offset.is_some()
|
|
||||||
&& searcher.binary_detection().quit_byte().is_some()
|
|
||||||
{
|
|
||||||
// Squash the match count. The statistics reported will still
|
|
||||||
// contain the match count, but the "official" match count should
|
|
||||||
// be zero.
|
|
||||||
self.match_count = 0;
|
|
||||||
return Ok(());
|
|
||||||
}
|
|
||||||
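The comment in the hunk above describes a filter: when binary data was seen and the detection mode is one that quits on binary data, the reported match count is squashed to zero. A small sketch of that decision follows, with plain `Option` values standing in for the sink and searcher state. The function name `reported_match_count` is hypothetical.

```rust
/// Hypothetical stand-in for the check in the hunk above.
/// `binary_byte_offset` is `Some` once binary data has been seen, and
/// `quit_byte` is `Some` when binary detection is configured to quit.
fn reported_match_count(
    match_count: u64,
    binary_byte_offset: Option<u64>,
    quit_byte: Option<u8>,
) -> u64 {
    if binary_byte_offset.is_some() && quit_byte.is_some() {
        // Treat the file as filtered out: report no matches at all.
        return 0;
    }
    match_count
}
```

In this sketch, the count survives when binary data is converted instead of causing a quit, because `quit_byte` is then `None`.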
|
|
||||||
let show_count =
|
let show_count =
|
||||||
!self.summary.config.exclude_zero
|
!self.summary.config.exclude_zero
|
||||||
|
@@ -346,7 +346,7 @@ impl Serialize for NiceDuration {
|
|||||||
///
|
///
|
||||||
/// This stops trimming a prefix as soon as it sees non-whitespace or a line
|
/// This stops trimming a prefix as soon as it sees non-whitespace or a line
|
||||||
/// terminator.
|
/// terminator.
|
||||||
pub fn trim_ascii_prefix(
|
pub fn trim_ascii_prefix_range(
|
||||||
line_term: LineTerminator,
|
line_term: LineTerminator,
|
||||||
slice: &[u8],
|
slice: &[u8],
|
||||||
range: Match,
|
range: Match,
|
||||||
@@ -366,3 +366,14 @@ pub fn trim_ascii_prefix(
|
|||||||
.count();
|
.count();
|
||||||
range.with_start(range.start() + count)
|
range.with_start(range.start() + count)
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/// Trim prefix ASCII spaces from the given slice and return the corresponding
|
||||||
|
/// sub-slice.
|
||||||
|
pub fn trim_ascii_prefix(line_term: LineTerminator, slice: &[u8]) -> &[u8] {
|
||||||
|
let range = trim_ascii_prefix_range(
|
||||||
|
line_term,
|
||||||
|
slice,
|
||||||
|
Match::new(0, slice.len()),
|
||||||
|
);
|
||||||
|
&slice[range]
|
||||||
|
}
|
||||||
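The doc comments above say that trimming stops at the first non-whitespace byte or at a line terminator. A standalone sketch of that behaviour follows, assuming a single-byte line terminator in place of `LineTerminator` and using `u8::is_ascii_whitespace` as the whitespace test, which may not match the exact byte set the crate checks. It is not the crate's implementation.

```rust
/// Sketch only: trim leading ASCII whitespace from `slice`, but never
/// consume the line terminator byte itself.
fn trim_ascii_prefix(line_term: u8, slice: &[u8]) -> &[u8] {
    let count = slice
        .iter()
        // Stop at the first byte that is not whitespace, or that is the
        // line terminator (which is itself whitespace for `\n`).
        .take_while(|&&b| b.is_ascii_whitespace() && b != line_term)
        .count();
    &slice[count..]
}
```

Stopping at the terminator keeps the trimmed result within the current line, so a blank line followed by an indented line does not have the second line's indentation removed.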
|
@@ -1,6 +1,6 @@
|
|||||||
[package]
|
[package]
|
||||||
name = "grep-regex"
|
name = "grep-regex"
|
||||||
version = "0.1.3" #:version
|
version = "0.1.2" #:version
|
||||||
authors = ["Andrew Gallant <jamslam@gmail.com>"]
|
authors = ["Andrew Gallant <jamslam@gmail.com>"]
|
||||||
description = """
|
description = """
|
||||||
Use Rust's regex library with the 'grep' crate.
|
Use Rust's regex library with the 'grep' crate.
|
||||||
@@ -13,9 +13,8 @@ keywords = ["regex", "grep", "search", "pattern", "line"]
|
|||||||
license = "Unlicense/MIT"
|
license = "Unlicense/MIT"
|
||||||
|
|
||||||
[dependencies]
|
[dependencies]
|
||||||
aho-corasick = "0.7.3"
|
|
||||||
grep-matcher = { version = "0.1.2", path = "../grep-matcher" }
|
|
||||||
log = "0.4.5"
|
log = "0.4.5"
|
||||||
|
grep-matcher = { version = "0.1.1", path = "../grep-matcher" }
|
||||||
regex = "1.1"
|
regex = "1.1"
|
||||||
regex-syntax = "0.6.5"
|
regex-syntax = "0.6.5"
|
||||||
thread_local = "0.3.6"
|
thread_local = "0.3.6"
|
||||||
|
@@ -1,13 +1,12 @@
|
|||||||
use grep_matcher::{ByteSet, LineTerminator};
|
use grep_matcher::{ByteSet, LineTerminator};
|
||||||
use regex::bytes::{Regex, RegexBuilder};
|
use regex::bytes::{Regex, RegexBuilder};
|
||||||
use regex_syntax::ast::{self, Ast};
|
use regex_syntax::ast::{self, Ast};
|
||||||
use regex_syntax::hir::{self, Hir};
|
use regex_syntax::hir::Hir;
|
||||||
|
|
||||||
use ast::AstAnalysis;
|
use ast::AstAnalysis;
|
||||||
use crlf::crlfify;
|
use crlf::crlfify;
|
||||||
use error::Error;
|
use error::Error;
|
||||||
use literal::LiteralSets;
|
use literal::LiteralSets;
|
||||||
use multi::alternation_literals;
|
|
||||||
use non_matching::non_matching_bytes;
|
use non_matching::non_matching_bytes;
|
||||||
use strip::strip_from_match;
|
use strip::strip_from_match;
|
||||||
|
|
||||||
@@ -68,17 +67,19 @@ impl Config {
|
|||||||
/// If there was a problem parsing the given expression then an error
|
/// If there was a problem parsing the given expression then an error
|
||||||
/// is returned.
|
/// is returned.
|
||||||
pub fn hir(&self, pattern: &str) -> Result<ConfiguredHIR, Error> {
|
pub fn hir(&self, pattern: &str) -> Result<ConfiguredHIR, Error> {
|
||||||
let ast = self.ast(pattern)?;
|
let analysis = self.analysis(pattern)?;
|
||||||
let analysis = self.analysis(&ast)?;
|
let expr = ::regex_syntax::ParserBuilder::new()
|
||||||
let expr = hir::translate::TranslatorBuilder::new()
|
.nest_limit(self.nest_limit)
|
||||||
|
.octal(self.octal)
|
||||||
.allow_invalid_utf8(true)
|
.allow_invalid_utf8(true)
|
||||||
.case_insensitive(self.is_case_insensitive(&analysis))
|
.ignore_whitespace(self.ignore_whitespace)
|
||||||
|
.case_insensitive(self.is_case_insensitive(&analysis)?)
|
||||||
.multi_line(self.multi_line)
|
.multi_line(self.multi_line)
|
||||||
.dot_matches_new_line(self.dot_matches_new_line)
|
.dot_matches_new_line(self.dot_matches_new_line)
|
||||||
.swap_greed(self.swap_greed)
|
.swap_greed(self.swap_greed)
|
||||||
.unicode(self.unicode)
|
.unicode(self.unicode)
|
||||||
.build()
|
.build()
|
||||||
.translate(pattern, &ast)
|
.parse(pattern)
|
||||||
.map_err(Error::regex)?;
|
.map_err(Error::regex)?;
|
||||||
let expr = match self.line_terminator {
|
let expr = match self.line_terminator {
|
||||||
None => expr,
|
None => expr,
|
||||||
@@ -98,34 +99,21 @@ impl Config {
|
|||||||
fn is_case_insensitive(
|
fn is_case_insensitive(
|
||||||
&self,
|
&self,
|
||||||
analysis: &AstAnalysis,
|
analysis: &AstAnalysis,
|
||||||
) -> bool {
|
) -> Result<bool, Error> {
|
||||||
if self.case_insensitive {
|
if self.case_insensitive {
|
||||||
return true;
|
return Ok(true);
|
||||||
}
|
}
|
||||||
if !self.case_smart {
|
if !self.case_smart {
|
||||||
return false;
|
return Ok(false);
|
||||||
}
|
}
|
||||||
analysis.any_literal() && !analysis.any_uppercase()
|
Ok(analysis.any_literal() && !analysis.any_uppercase())
|
||||||
}
|
|
||||||
|
|
||||||
/// Returns true if and only if this config is simple enough such that
|
|
||||||
/// if the pattern is a simple alternation of literals, then it can be
|
|
||||||
/// constructed via a plain Aho-Corasick automaton.
|
|
||||||
///
|
|
||||||
/// Note that it is OK to return true even when settings like `multi_line`
|
|
||||||
/// are enabled, since if multi-line can impact the match semantics of a
|
|
||||||
/// regex, then it is by definition not a simple alternation of literals.
|
|
||||||
pub fn can_plain_aho_corasick(&self) -> bool {
|
|
||||||
!self.word
|
|
||||||
&& !self.case_insensitive
|
|
||||||
&& !self.case_smart
|
|
||||||
}
|
}
|
||||||
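Both sides of `is_case_insensitive` above encode the same smart-case rule: an explicit case-insensitive flag wins, smart case must be enabled for the pattern to matter, and only then does the pattern analysis decide. A minimal sketch with plain booleans follows. The structs `CaseConfig` and `PatternInfo` are hypothetical stand-ins for `Config` and `AstAnalysis`.

```rust
/// Hypothetical stand-in for the relevant `Config` fields.
struct CaseConfig {
    case_insensitive: bool,
    case_smart: bool,
}

/// Hypothetical stand-in for the two `AstAnalysis` queries used above.
struct PatternInfo {
    any_literal: bool,
    any_uppercase: bool,
}

fn is_case_insensitive(cfg: &CaseConfig, info: &PatternInfo) -> bool {
    if cfg.case_insensitive {
        return true;
    }
    if !cfg.case_smart {
        return false;
    }
    // Smart case: insensitive only if the pattern contains a literal
    // and none of its literals are uppercase.
    info.any_literal && !info.any_uppercase
}
```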
|
|
||||||
/// Perform analysis on the AST of this pattern.
|
/// Perform analysis on the AST of this pattern.
|
||||||
///
|
///
|
||||||
/// This returns an error if the given pattern failed to parse.
|
/// This returns an error if the given pattern failed to parse.
|
||||||
fn analysis(&self, ast: &Ast) -> Result<AstAnalysis, Error> {
|
fn analysis(&self, pattern: &str) -> Result<AstAnalysis, Error> {
|
||||||
Ok(AstAnalysis::from_ast(ast))
|
Ok(AstAnalysis::from_ast(&self.ast(pattern)?))
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Parse the given pattern into its abstract syntax.
|
/// Parse the given pattern into its abstract syntax.
|
||||||
@@ -185,15 +173,6 @@ impl ConfiguredHIR {
|
|||||||
self.pattern_to_regex(&self.expr.to_string())
|
self.pattern_to_regex(&self.expr.to_string())
|
||||||
}
|
}
|
||||||
|
|
||||||
/// If this HIR corresponds to an alternation of literals with no
|
|
||||||
/// capturing groups, then this returns those literals.
|
|
||||||
pub fn alternation_literals(&self) -> Option<Vec<Vec<u8>>> {
|
|
||||||
if !self.config.can_plain_aho_corasick() {
|
|
||||||
return None;
|
|
||||||
}
|
|
||||||
alternation_literals(&self.expr)
|
|
||||||
}
|
|
||||||
|
|
||||||
/// Applies the given function to the concrete syntax of this HIR and then
|
/// Applies the given function to the concrete syntax of this HIR and then
|
||||||
/// generates a new HIR based on the result of the function in a way that
|
/// generates a new HIR based on the result of the function in a way that
|
||||||
/// preserves the configuration.
|
/// preserves the configuration.
|
||||||
|
@@ -76,9 +76,7 @@ impl Matcher for CRLFMatcher {
|
|||||||
caps: &mut RegexCaptures,
|
caps: &mut RegexCaptures,
|
||||||
) -> Result<bool, NoError> {
|
) -> Result<bool, NoError> {
|
||||||
caps.strip_crlf(false);
|
caps.strip_crlf(false);
|
||||||
let r = self.regex.captures_read_at(
|
let r = self.regex.captures_read_at(caps.locations(), haystack, at);
|
||||||
caps.locations_mut(), haystack, at,
|
|
||||||
);
|
|
||||||
if !r.is_some() {
|
if !r.is_some() {
|
||||||
return Ok(false);
|
return Ok(false);
|
||||||
}
|
}
|
||||||
|
@@ -4,7 +4,6 @@ An implementation of `grep-matcher`'s `Matcher` trait for Rust's regex engine.
|
|||||||
|
|
||||||
#![deny(missing_docs)]
|
#![deny(missing_docs)]
|
||||||
|
|
||||||
extern crate aho_corasick;
|
|
||||||
extern crate grep_matcher;
|
extern crate grep_matcher;
|
||||||
#[macro_use]
|
#[macro_use]
|
||||||
extern crate log;
|
extern crate log;
|
||||||
@@ -22,7 +21,6 @@ mod crlf;
|
|||||||
mod error;
|
mod error;
|
||||||
mod literal;
|
mod literal;
|
||||||
mod matcher;
|
mod matcher;
|
||||||
mod multi;
|
|
||||||
mod non_matching;
|
mod non_matching;
|
||||||
mod strip;
|
mod strip;
|
||||||
mod util;
|
mod util;
|
||||||
|
@@ -8,7 +8,6 @@ use regex::bytes::{CaptureLocations, Regex};
|
|||||||
use config::{Config, ConfiguredHIR};
|
use config::{Config, ConfiguredHIR};
|
||||||
use crlf::CRLFMatcher;
|
use crlf::CRLFMatcher;
|
||||||
use error::Error;
|
use error::Error;
|
||||||
use multi::MultiLiteralMatcher;
|
|
||||||
use word::WordMatcher;
|
use word::WordMatcher;
|
||||||
|
|
||||||
/// A builder for constructing a `Matcher` using regular expressions.
|
/// A builder for constructing a `Matcher` using regular expressions.
|
||||||
@@ -62,29 +61,6 @@ impl RegexMatcherBuilder {
|
|||||||
})
|
})
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Build a new matcher from a plain alternation of literals.
|
|
||||||
///
|
|
||||||
/// Depending on the configuration set by the builder, this may be able to
|
|
||||||
/// build a matcher substantially faster than by joining the patterns with
|
|
||||||
/// a `|` and calling `build`.
|
|
||||||
pub fn build_literals<B: AsRef<str>>(
|
|
||||||
&self,
|
|
||||||
literals: &[B],
|
|
||||||
) -> Result<RegexMatcher, Error> {
|
|
||||||
let slices: Vec<_> = literals.iter().map(|s| s.as_ref()).collect();
|
|
||||||
if !self.config.can_plain_aho_corasick() || literals.len() < 40 {
|
|
||||||
return self.build(&slices.join("|"));
|
|
||||||
}
|
|
||||||
let matcher = MultiLiteralMatcher::new(&slices)?;
|
|
||||||
let imp = RegexMatcherImpl::MultiLiteral(matcher);
|
|
||||||
Ok(RegexMatcher {
|
|
||||||
config: self.config.clone(),
|
|
||||||
matcher: imp,
|
|
||||||
fast_line_regex: None,
|
|
||||||
non_matching_bytes: ByteSet::empty(),
|
|
||||||
})
|
|
||||||
}
|
|
||||||
|
|
||||||
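The `build_literals` hunk above chooses between two construction paths: join the literals with `|` and build a regex, or hand them to a dedicated multi-literal matcher when the configuration is plain enough and there are at least 40 literals. A sketch of only that decision follows. The `Strategy` enum and `Flags` struct are hypothetical, and no matcher is actually built here.

```rust
/// Hypothetical result type naming the two paths in the hunk above.
#[derive(Debug, PartialEq)]
enum Strategy {
    /// Literals joined with `|` and compiled as one regex pattern.
    Regex(String),
    /// Literals passed as-is to a multi-literal matcher.
    MultiLiteral(Vec<String>),
}

/// Hypothetical stand-in for the `Config` fields consulted above.
struct Flags {
    word: bool,
    case_insensitive: bool,
    case_smart: bool,
}

fn can_plain_aho_corasick(f: &Flags) -> bool {
    !f.word && !f.case_insensitive && !f.case_smart
}

fn choose_strategy(f: &Flags, literals: &[&str]) -> Strategy {
    // Threshold of 40 copied from the hunk above.
    if !can_plain_aho_corasick(f) || literals.len() < 40 {
        return Strategy::Regex(literals.join("|"));
    }
    Strategy::MultiLiteral(literals.iter().map(|s| s.to_string()).collect())
}
```

As in the original hunk, the joined pattern is not escaped, so a caller on the regex path must supply literals that are already valid regex syntax.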
/// Set the value for the case insensitive (`i`) flag.
|
/// Set the value for the case insensitive (`i`) flag.
|
||||||
///
|
///
|
||||||
/// When enabled, letters in the pattern will match both upper case and
|
/// When enabled, letters in the pattern will match both upper case and
|
||||||
@@ -372,8 +348,6 @@ impl RegexMatcher {
|
|||||||
enum RegexMatcherImpl {
|
enum RegexMatcherImpl {
|
||||||
/// The standard matcher used for all regular expressions.
|
/// The standard matcher used for all regular expressions.
|
||||||
Standard(StandardMatcher),
|
Standard(StandardMatcher),
|
||||||
/// A matcher for an alternation of plain literals.
|
|
||||||
MultiLiteral(MultiLiteralMatcher),
|
|
||||||
/// A matcher that strips `\r` from the end of matches.
|
/// A matcher that strips `\r` from the end of matches.
|
||||||
///
|
///
|
||||||
/// This is only used when the CRLF hack is enabled and the regex is line
|
/// This is only used when the CRLF hack is enabled and the regex is line
|
||||||
@@ -396,23 +370,16 @@ impl RegexMatcherImpl {
|
|||||||
} else if expr.needs_crlf_stripped() {
|
} else if expr.needs_crlf_stripped() {
|
||||||
Ok(RegexMatcherImpl::CRLF(CRLFMatcher::new(expr)?))
|
Ok(RegexMatcherImpl::CRLF(CRLFMatcher::new(expr)?))
|
||||||
} else {
|
} else {
|
||||||
if let Some(lits) = expr.alternation_literals() {
|
|
||||||
if lits.len() >= 40 {
|
|
||||||
let matcher = MultiLiteralMatcher::new(&lits)?;
|
|
||||||
return Ok(RegexMatcherImpl::MultiLiteral(matcher));
|
|
||||||
}
|
|
||||||
}
|
|
||||||
Ok(RegexMatcherImpl::Standard(StandardMatcher::new(expr)?))
|
Ok(RegexMatcherImpl::Standard(StandardMatcher::new(expr)?))
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Return the underlying regex object used.
|
/// Return the underlying regex object used.
|
||||||
fn regex(&self) -> String {
|
fn regex(&self) -> &Regex {
|
||||||
match *self {
|
match *self {
|
||||||
RegexMatcherImpl::Word(ref x) => x.regex().to_string(),
|
RegexMatcherImpl::Word(ref x) => x.regex(),
|
||||||
RegexMatcherImpl::CRLF(ref x) => x.regex().to_string(),
|
RegexMatcherImpl::CRLF(ref x) => x.regex(),
|
||||||
RegexMatcherImpl::MultiLiteral(_) => "<N/A>".to_string(),
|
RegexMatcherImpl::Standard(ref x) => &x.regex,
|
||||||
RegexMatcherImpl::Standard(ref x) => x.regex.to_string(),
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
@@ -432,7 +399,6 @@ impl Matcher for RegexMatcher {
|
|||||||
use self::RegexMatcherImpl::*;
|
use self::RegexMatcherImpl::*;
|
||||||
match self.matcher {
|
match self.matcher {
|
||||||
Standard(ref m) => m.find_at(haystack, at),
|
Standard(ref m) => m.find_at(haystack, at),
|
||||||
MultiLiteral(ref m) => m.find_at(haystack, at),
|
|
||||||
CRLF(ref m) => m.find_at(haystack, at),
|
CRLF(ref m) => m.find_at(haystack, at),
|
||||||
Word(ref m) => m.find_at(haystack, at),
|
Word(ref m) => m.find_at(haystack, at),
|
||||||
}
|
}
|
||||||
@@ -442,7 +408,6 @@ impl Matcher for RegexMatcher {
|
|||||||
use self::RegexMatcherImpl::*;
|
use self::RegexMatcherImpl::*;
|
||||||
match self.matcher {
|
match self.matcher {
|
||||||
Standard(ref m) => m.new_captures(),
|
Standard(ref m) => m.new_captures(),
|
||||||
MultiLiteral(ref m) => m.new_captures(),
|
|
||||||
CRLF(ref m) => m.new_captures(),
|
CRLF(ref m) => m.new_captures(),
|
||||||
Word(ref m) => m.new_captures(),
|
Word(ref m) => m.new_captures(),
|
||||||
}
|
}
|
||||||
@@ -452,7 +417,6 @@ impl Matcher for RegexMatcher {
|
|||||||
use self::RegexMatcherImpl::*;
|
use self::RegexMatcherImpl::*;
|
||||||
match self.matcher {
|
match self.matcher {
|
||||||
Standard(ref m) => m.capture_count(),
|
Standard(ref m) => m.capture_count(),
|
||||||
MultiLiteral(ref m) => m.capture_count(),
|
|
||||||
CRLF(ref m) => m.capture_count(),
|
CRLF(ref m) => m.capture_count(),
|
||||||
Word(ref m) => m.capture_count(),
|
Word(ref m) => m.capture_count(),
|
||||||
}
|
}
|
||||||
@@ -462,7 +426,6 @@ impl Matcher for RegexMatcher {
|
|||||||
use self::RegexMatcherImpl::*;
|
use self::RegexMatcherImpl::*;
|
||||||
match self.matcher {
|
match self.matcher {
|
||||||
Standard(ref m) => m.capture_index(name),
|
Standard(ref m) => m.capture_index(name),
|
||||||
MultiLiteral(ref m) => m.capture_index(name),
|
|
||||||
CRLF(ref m) => m.capture_index(name),
|
CRLF(ref m) => m.capture_index(name),
|
||||||
Word(ref m) => m.capture_index(name),
|
Word(ref m) => m.capture_index(name),
|
||||||
}
|
}
|
||||||
@@ -472,7 +435,6 @@ impl Matcher for RegexMatcher {
|
|||||||
use self::RegexMatcherImpl::*;
|
use self::RegexMatcherImpl::*;
|
||||||
match self.matcher {
|
match self.matcher {
|
||||||
Standard(ref m) => m.find(haystack),
|
Standard(ref m) => m.find(haystack),
|
||||||
MultiLiteral(ref m) => m.find(haystack),
|
|
||||||
CRLF(ref m) => m.find(haystack),
|
CRLF(ref m) => m.find(haystack),
|
||||||
Word(ref m) => m.find(haystack),
|
Word(ref m) => m.find(haystack),
|
||||||
}
|
}
|
||||||
@@ -488,7 +450,6 @@ impl Matcher for RegexMatcher {
|
|||||||
use self::RegexMatcherImpl::*;
|
use self::RegexMatcherImpl::*;
|
||||||
match self.matcher {
|
match self.matcher {
|
||||||
Standard(ref m) => m.find_iter(haystack, matched),
|
Standard(ref m) => m.find_iter(haystack, matched),
|
||||||
MultiLiteral(ref m) => m.find_iter(haystack, matched),
|
|
||||||
CRLF(ref m) => m.find_iter(haystack, matched),
|
CRLF(ref m) => m.find_iter(haystack, matched),
|
||||||
Word(ref m) => m.find_iter(haystack, matched),
|
Word(ref m) => m.find_iter(haystack, matched),
|
||||||
}
|
}
|
||||||
@@ -504,7 +465,6 @@ impl Matcher for RegexMatcher {
|
|||||||
use self::RegexMatcherImpl::*;
|
use self::RegexMatcherImpl::*;
|
||||||
match self.matcher {
|
match self.matcher {
|
||||||
Standard(ref m) => m.try_find_iter(haystack, matched),
|
Standard(ref m) => m.try_find_iter(haystack, matched),
|
||||||
MultiLiteral(ref m) => m.try_find_iter(haystack, matched),
|
|
||||||
CRLF(ref m) => m.try_find_iter(haystack, matched),
|
CRLF(ref m) => m.try_find_iter(haystack, matched),
|
||||||
Word(ref m) => m.try_find_iter(haystack, matched),
|
Word(ref m) => m.try_find_iter(haystack, matched),
|
||||||
}
|
}
|
||||||
@@ -518,7 +478,6 @@ impl Matcher for RegexMatcher {
|
|||||||
use self::RegexMatcherImpl::*;
|
use self::RegexMatcherImpl::*;
|
||||||
match self.matcher {
|
match self.matcher {
|
||||||
Standard(ref m) => m.captures(haystack, caps),
|
Standard(ref m) => m.captures(haystack, caps),
|
||||||
MultiLiteral(ref m) => m.captures(haystack, caps),
|
|
||||||
CRLF(ref m) => m.captures(haystack, caps),
|
CRLF(ref m) => m.captures(haystack, caps),
|
||||||
Word(ref m) => m.captures(haystack, caps),
|
Word(ref m) => m.captures(haystack, caps),
|
||||||
}
|
}
|
||||||
@@ -535,7 +494,6 @@ impl Matcher for RegexMatcher {
|
|||||||
use self::RegexMatcherImpl::*;
|
use self::RegexMatcherImpl::*;
|
||||||
match self.matcher {
|
match self.matcher {
|
||||||
Standard(ref m) => m.captures_iter(haystack, caps, matched),
|
Standard(ref m) => m.captures_iter(haystack, caps, matched),
|
||||||
MultiLiteral(ref m) => m.captures_iter(haystack, caps, matched),
|
|
||||||
CRLF(ref m) => m.captures_iter(haystack, caps, matched),
|
CRLF(ref m) => m.captures_iter(haystack, caps, matched),
|
||||||
Word(ref m) => m.captures_iter(haystack, caps, matched),
|
Word(ref m) => m.captures_iter(haystack, caps, matched),
|
||||||
}
|
}
|
||||||
@@ -552,9 +510,6 @@ impl Matcher for RegexMatcher {
|
|||||||
use self::RegexMatcherImpl::*;
|
use self::RegexMatcherImpl::*;
|
||||||
match self.matcher {
|
match self.matcher {
|
||||||
Standard(ref m) => m.try_captures_iter(haystack, caps, matched),
|
Standard(ref m) => m.try_captures_iter(haystack, caps, matched),
|
||||||
MultiLiteral(ref m) => {
|
|
||||||
m.try_captures_iter(haystack, caps, matched)
|
|
||||||
}
|
|
||||||
CRLF(ref m) => m.try_captures_iter(haystack, caps, matched),
|
CRLF(ref m) => m.try_captures_iter(haystack, caps, matched),
|
||||||
Word(ref m) => m.try_captures_iter(haystack, caps, matched),
|
Word(ref m) => m.try_captures_iter(haystack, caps, matched),
|
||||||
}
|
}
|
||||||
@@ -569,7 +524,6 @@ impl Matcher for RegexMatcher {
|
|||||||
use self::RegexMatcherImpl::*;
|
use self::RegexMatcherImpl::*;
|
||||||
match self.matcher {
|
match self.matcher {
|
||||||
Standard(ref m) => m.captures_at(haystack, at, caps),
|
Standard(ref m) => m.captures_at(haystack, at, caps),
|
||||||
MultiLiteral(ref m) => m.captures_at(haystack, at, caps),
|
|
||||||
CRLF(ref m) => m.captures_at(haystack, at, caps),
|
CRLF(ref m) => m.captures_at(haystack, at, caps),
|
||||||
Word(ref m) => m.captures_at(haystack, at, caps),
|
Word(ref m) => m.captures_at(haystack, at, caps),
|
||||||
}
|
}
|
||||||
@@ -586,7 +540,6 @@ impl Matcher for RegexMatcher {
|
|||||||
use self::RegexMatcherImpl::*;
|
use self::RegexMatcherImpl::*;
|
||||||
match self.matcher {
|
match self.matcher {
|
||||||
Standard(ref m) => m.replace(haystack, dst, append),
|
Standard(ref m) => m.replace(haystack, dst, append),
|
||||||
MultiLiteral(ref m) => m.replace(haystack, dst, append),
|
|
||||||
CRLF(ref m) => m.replace(haystack, dst, append),
|
CRLF(ref m) => m.replace(haystack, dst, append),
|
||||||
Word(ref m) => m.replace(haystack, dst, append),
|
Word(ref m) => m.replace(haystack, dst, append),
|
||||||
}
|
}
|
||||||
@@ -606,9 +559,6 @@ impl Matcher for RegexMatcher {
|
|||||||
Standard(ref m) => {
|
Standard(ref m) => {
|
||||||
m.replace_with_captures(haystack, caps, dst, append)
|
m.replace_with_captures(haystack, caps, dst, append)
|
||||||
}
|
}
|
||||||
MultiLiteral(ref m) => {
|
|
||||||
m.replace_with_captures(haystack, caps, dst, append)
|
|
||||||
}
|
|
||||||
CRLF(ref m) => {
|
CRLF(ref m) => {
|
||||||
m.replace_with_captures(haystack, caps, dst, append)
|
m.replace_with_captures(haystack, caps, dst, append)
|
||||||
}
|
}
|
||||||
@@ -622,7 +572,6 @@ impl Matcher for RegexMatcher {
|
|||||||
use self::RegexMatcherImpl::*;
|
use self::RegexMatcherImpl::*;
|
||||||
match self.matcher {
|
match self.matcher {
|
||||||
Standard(ref m) => m.is_match(haystack),
|
Standard(ref m) => m.is_match(haystack),
|
||||||
MultiLiteral(ref m) => m.is_match(haystack),
|
|
||||||
CRLF(ref m) => m.is_match(haystack),
|
CRLF(ref m) => m.is_match(haystack),
|
||||||
Word(ref m) => m.is_match(haystack),
|
Word(ref m) => m.is_match(haystack),
|
||||||
}
|
}
|
||||||
@@ -636,7 +585,6 @@ impl Matcher for RegexMatcher {
|
|||||||
use self::RegexMatcherImpl::*;
|
use self::RegexMatcherImpl::*;
|
||||||
match self.matcher {
|
match self.matcher {
|
||||||
Standard(ref m) => m.is_match_at(haystack, at),
|
Standard(ref m) => m.is_match_at(haystack, at),
|
||||||
MultiLiteral(ref m) => m.is_match_at(haystack, at),
|
|
||||||
CRLF(ref m) => m.is_match_at(haystack, at),
|
CRLF(ref m) => m.is_match_at(haystack, at),
|
||||||
Word(ref m) => m.is_match_at(haystack, at),
|
Word(ref m) => m.is_match_at(haystack, at),
|
||||||
}
|
}
|
||||||
@@ -649,7 +597,6 @@ impl Matcher for RegexMatcher {
|
|||||||
use self::RegexMatcherImpl::*;
|
use self::RegexMatcherImpl::*;
|
||||||
match self.matcher {
|
match self.matcher {
|
||||||
Standard(ref m) => m.shortest_match(haystack),
|
Standard(ref m) => m.shortest_match(haystack),
|
||||||
MultiLiteral(ref m) => m.shortest_match(haystack),
|
|
||||||
CRLF(ref m) => m.shortest_match(haystack),
|
CRLF(ref m) => m.shortest_match(haystack),
|
||||||
Word(ref m) => m.shortest_match(haystack),
|
Word(ref m) => m.shortest_match(haystack),
|
||||||
}
|
}
|
||||||
@@ -663,7 +610,6 @@ impl Matcher for RegexMatcher {
|
|||||||
use self::RegexMatcherImpl::*;
|
use self::RegexMatcherImpl::*;
|
||||||
match self.matcher {
|
match self.matcher {
|
||||||
Standard(ref m) => m.shortest_match_at(haystack, at),
|
Standard(ref m) => m.shortest_match_at(haystack, at),
|
||||||
MultiLiteral(ref m) => m.shortest_match_at(haystack, at),
|
|
||||||
CRLF(ref m) => m.shortest_match_at(haystack, at),
|
CRLF(ref m) => m.shortest_match_at(haystack, at),
|
||||||
Word(ref m) => m.shortest_match_at(haystack, at),
|
Word(ref m) => m.shortest_match_at(haystack, at),
|
||||||
}
|
}
|
||||||
@@ -764,9 +710,7 @@ impl Matcher for StandardMatcher {
|
|||||||
at: usize,
|
at: usize,
|
||||||
caps: &mut RegexCaptures,
|
caps: &mut RegexCaptures,
|
||||||
) -> Result<bool, NoError> {
|
) -> Result<bool, NoError> {
|
||||||
Ok(self.regex.captures_read_at(
|
Ok(self.regex.captures_read_at(&mut caps.locs, haystack, at).is_some())
|
||||||
&mut caps.locations_mut(), haystack, at,
|
|
||||||
).is_some())
|
|
||||||
}
|
}
|
||||||
|
|
||||||
fn shortest_match_at(
|
fn shortest_match_at(
|
||||||
@@ -793,15 +737,7 @@ impl Matcher for StandardMatcher {
|
|||||||
/// index of the group using the corresponding matcher's `capture_index`
|
/// index of the group using the corresponding matcher's `capture_index`
|
||||||
/// method, and then use that index with `RegexCaptures::get`.
|
/// method, and then use that index with `RegexCaptures::get`.
|
||||||
#[derive(Clone, Debug)]
|
#[derive(Clone, Debug)]
|
||||||
pub struct RegexCaptures(RegexCapturesImp);
|
pub struct RegexCaptures {
|
||||||
|
|
||||||
#[derive(Clone, Debug)]
|
|
||||||
enum RegexCapturesImp {
|
|
||||||
AhoCorasick {
|
|
||||||
/// The start and end of the match, corresponding to capture group 0.
|
|
||||||
mat: Option<Match>,
|
|
||||||
},
|
|
||||||
Regex {
|
|
||||||
/// Where the locations are stored.
|
/// Where the locations are stored.
|
||||||
locs: CaptureLocations,
|
locs: CaptureLocations,
|
||||||
/// These captures behave as if the capturing groups begin at the given
|
/// These captures behave as if the capturing groups begin at the given
|
||||||
@@ -809,68 +745,46 @@ enum RegexCapturesImp {
|
|||||||
/// indexed like normal.
|
/// indexed like normal.
|
||||||
///
|
///
|
||||||
/// This is useful when building matchers that wrap arbitrary regular
|
/// This is useful when building matchers that wrap arbitrary regular
|
||||||
/// expressions. For example, `WordMatcher` takes an existing regex
|
/// expressions. For example, `WordMatcher` takes an existing regex `re`
|
||||||
/// `re` and creates `(?:^|\W)(re)(?:$|\W)`, but hides the fact that
|
/// and creates `(?:^|\W)(re)(?:$|\W)`, but hides the fact that the regex
|
||||||
/// the regex has been wrapped from the caller. In order to do this,
|
/// has been wrapped from the caller. In order to do this, the matcher
|
||||||
/// the matcher and the capturing groups must behave as if `(re)` is
|
/// and the capturing groups must behave as if `(re)` is the `0`th capture
|
||||||
/// the `0`th capture group.
|
/// group.
|
||||||
offset: usize,
|
offset: usize,
|
||||||
/// When enabled, the end of a match has `\r` stripped from it, if one
|
/// When enabled, the end of a match has `\r` stripped from it, if one
|
||||||
/// exists.
|
/// exists.
|
||||||
strip_crlf: bool,
|
strip_crlf: bool,
|
||||||
},
|
|
||||||
}
|
}
|
||||||
|
|
||||||
impl Captures for RegexCaptures {
|
impl Captures for RegexCaptures {
|
||||||
fn len(&self) -> usize {
|
fn len(&self) -> usize {
|
||||||
match self.0 {
|
self.locs.len().checked_sub(self.offset).unwrap()
|
||||||
RegexCapturesImp::AhoCorasick { .. } => 1,
|
|
||||||
RegexCapturesImp::Regex { ref locs, offset, .. } => {
|
|
||||||
locs.len().checked_sub(offset).unwrap()
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
|
|
||||||
fn get(&self, i: usize) -> Option<Match> {
|
fn get(&self, i: usize) -> Option<Match> {
|
||||||
match self.0 {
|
if !self.strip_crlf {
|
||||||
RegexCapturesImp::AhoCorasick { mat, .. } => {
|
let actual = i.checked_add(self.offset).unwrap();
|
||||||
if i == 0 {
|
return self.locs.pos(actual).map(|(s, e)| Match::new(s, e));
|
||||||
mat
|
|
||||||
} else {
|
|
||||||
None
|
|
||||||
}
|
|
||||||
}
|
|
||||||
RegexCapturesImp::Regex { ref locs, offset, strip_crlf } => {
|
|
||||||
if !strip_crlf {
|
|
||||||
let actual = i.checked_add(offset).unwrap();
|
|
||||||
return locs.pos(actual).map(|(s, e)| Match::new(s, e));
|
|
||||||
}
|
}
|
||||||
|
|
||||||
// currently don't support capture offsetting with CRLF
|
// currently don't support capture offsetting with CRLF stripping
|
||||||
// stripping
|
assert_eq!(self.offset, 0);
|
||||||
assert_eq!(offset, 0);
|
let m = match self.locs.pos(i).map(|(s, e)| Match::new(s, e)) {
|
||||||
let m = match locs.pos(i).map(|(s, e)| Match::new(s, e)) {
|
|
||||||
None => return None,
|
None => return None,
|
||||||
Some(m) => m,
|
Some(m) => m,
|
||||||
};
|
};
|
||||||
// If the end position of this match corresponds to the end
|
// If the end position of this match corresponds to the end position
|
||||||
// position of the overall match, then we apply our CRLF
|
// of the overall match, then we apply our CRLF stripping. Otherwise,
|
||||||
// stripping. Otherwise, we cannot assume stripping is correct.
|
// we cannot assume stripping is correct.
|
||||||
if i == 0 || m.end() == locs.pos(0).unwrap().1 {
|
if i == 0 || m.end() == self.locs.pos(0).unwrap().1 {
|
||||||
Some(m.with_end(m.end() - 1))
|
Some(m.with_end(m.end() - 1))
|
||||||
} else {
|
} else {
|
||||||
Some(m)
|
Some(m)
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
|
|
||||||
impl RegexCaptures {
|
impl RegexCaptures {
|
||||||
pub(crate) fn simple() -> RegexCaptures {
|
|
||||||
RegexCaptures(RegexCapturesImp::AhoCorasick { mat: None })
|
|
||||||
}
|
|
||||||
|
|
||||||
pub(crate) fn new(locs: CaptureLocations) -> RegexCaptures {
|
pub(crate) fn new(locs: CaptureLocations) -> RegexCaptures {
|
||||||
RegexCaptures::with_offset(locs, 0)
|
RegexCaptures::with_offset(locs, 0)
|
||||||
}
|
}
|
||||||
@@ -879,53 +793,15 @@ impl RegexCaptures {
|
|||||||
locs: CaptureLocations,
|
locs: CaptureLocations,
|
||||||
offset: usize,
|
offset: usize,
|
||||||
) -> RegexCaptures {
|
) -> RegexCaptures {
|
||||||
RegexCaptures(RegexCapturesImp::Regex {
|
RegexCaptures { locs, offset, strip_crlf: false }
|
||||||
locs, offset, strip_crlf: false,
|
|
||||||
})
|
|
||||||
}
|
}
|
||||||
|
|
||||||
pub(crate) fn locations(&self) -> &CaptureLocations {
|
pub(crate) fn locations(&mut self) -> &mut CaptureLocations {
|
||||||
match self.0 {
|
&mut self.locs
|
||||||
RegexCapturesImp::AhoCorasick { .. } => {
|
|
||||||
panic!("getting locations for simple captures is invalid")
|
|
||||||
}
|
|
||||||
RegexCapturesImp::Regex { ref locs, .. } => {
|
|
||||||
locs
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
pub(crate) fn locations_mut(&mut self) -> &mut CaptureLocations {
|
|
||||||
match self.0 {
|
|
||||||
RegexCapturesImp::AhoCorasick { .. } => {
|
|
||||||
panic!("getting locations for simple captures is invalid")
|
|
||||||
}
|
|
||||||
RegexCapturesImp::Regex { ref mut locs, .. } => {
|
|
||||||
locs
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
|
|
||||||
pub(crate) fn strip_crlf(&mut self, yes: bool) {
|
pub(crate) fn strip_crlf(&mut self, yes: bool) {
|
||||||
match self.0 {
|
self.strip_crlf = yes;
|
||||||
RegexCapturesImp::AhoCorasick { .. } => {
|
|
||||||
panic!("setting strip_crlf for simple captures is invalid")
|
|
||||||
}
|
|
||||||
RegexCapturesImp::Regex { ref mut strip_crlf, .. } => {
|
|
||||||
*strip_crlf = yes;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
pub(crate) fn set_simple(&mut self, one: Option<Match>) {
|
|
||||||
match self.0 {
|
|
||||||
RegexCapturesImp::AhoCorasick { ref mut mat } => {
|
|
||||||
*mat = one;
|
|
||||||
}
|
|
||||||
RegexCapturesImp::Regex { .. } => {
|
|
||||||
panic!("setting simple captures for regex is invalid")
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
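The offset and CRLF handling in `RegexCaptures::get` above can be illustrated with a small standalone sketch. It is not the crate's code: `Caps` and `Span` are hypothetical stand-ins for `RegexCaptures` and `Match`, and a plain `Vec` replaces the regex crate's `CaptureLocations`. As in the diff, it assumes `strip_crlf` is only set when the overall match is known to end in `\r`.

```rust
/// Hypothetical stand-in for `Match`: a half-open byte range.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Span {
    start: usize,
    end: usize,
}

/// Hypothetical stand-in for `RegexCaptures`.
struct Caps {
    /// One optional (start, end) pair per capture group, group 0 first.
    locs: Vec<Option<(usize, usize)>>,
    /// Number of leading groups hidden from the caller.
    offset: usize,
    /// Whether a trailing `\r` should be trimmed from the match end.
    strip_crlf: bool,
}

impl Caps {
    /// Number of groups visible to the caller.
    fn len(&self) -> usize {
        self.locs.len().checked_sub(self.offset).unwrap()
    }

    /// Look up group `i` as the caller numbers it.
    fn get(&self, i: usize) -> Option<Span> {
        if !self.strip_crlf {
            // Shift the caller's index past the hidden wrapper groups.
            let actual = i.checked_add(self.offset).unwrap();
            return self
                .locs
                .get(actual)
                .copied()
                .flatten()
                .map(|(s, e)| Span { start: s, end: e });
        }
        // Like the original, offsetting and CRLF stripping are not combined.
        assert_eq!(self.offset, 0);
        let (s, e) = self.locs.get(i).copied().flatten()?;
        let overall_end = self.locs[0].unwrap().1;
        // Only groups that end where the overall match ends lose the `\r`.
        if i == 0 || e == overall_end {
            Some(Span { start: s, end: e - 1 })
        } else {
            Some(Span { start: s, end: e })
        }
    }
}
```

With `offset: 1`, the caller's group `0` maps to the engine's group `1`, which is how a wrapper such as `(?:^|\W)(re)(?:$|\W)` can present `(re)` as the whole match.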
@@ -1,127 +0,0 @@
|
|||||||
use aho_corasick::{AhoCorasick, AhoCorasickBuilder, MatchKind};
|
|
||||||
use grep_matcher::{Matcher, Match, NoError};
|
|
||||||
use regex_syntax::hir::Hir;
|
|
||||||
|
|
||||||
use error::Error;
|
|
||||||
use matcher::RegexCaptures;
|
|
||||||
|
|
||||||
/// A matcher for an alternation of literals.
|
|
||||||
///
|
|
||||||
/// Ideally, this optimization would be pushed down into the regex engine, but
|
|
||||||
/// making this work correctly there would require quite a bit of refactoring.
|
|
||||||
/// Moreover, doing it one layer above lets us do things like, "if we
|
|
||||||
/// specifically only want to search for literals, then don't bother with
|
|
||||||
/// regex parsing at all."
|
|
||||||
#[derive(Clone, Debug)]
|
|
||||||
pub struct MultiLiteralMatcher {
|
|
||||||
/// The Aho-Corasick automaton.
|
|
||||||
ac: AhoCorasick,
|
|
||||||
}
|
|
||||||
|
|
||||||
impl MultiLiteralMatcher {
|
|
||||||
/// Create a new multi-literal matcher from the given literals.
|
|
||||||
pub fn new<B: AsRef<[u8]>>(
|
|
||||||
literals: &[B],
|
|
||||||
) -> Result<MultiLiteralMatcher, Error> {
|
|
||||||
let ac = AhoCorasickBuilder::new()
|
|
||||||
.match_kind(MatchKind::LeftmostFirst)
|
|
||||||
.auto_configure(literals)
|
|
||||||
.build_with_size::<usize, _, _>(literals)
|
|
||||||
.map_err(Error::regex)?;
|
|
||||||
Ok(MultiLiteralMatcher { ac })
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
impl Matcher for MultiLiteralMatcher {
|
|
||||||
type Captures = RegexCaptures;
|
|
||||||
type Error = NoError;
|
|
||||||
|
|
||||||
fn find_at(
|
|
||||||
&self,
|
|
||||||
haystack: &[u8],
|
|
||||||
at: usize,
|
|
||||||
) -> Result<Option<Match>, NoError> {
|
|
||||||
match self.ac.find(&haystack[at..]) {
|
|
||||||
None => Ok(None),
|
|
||||||
Some(m) => Ok(Some(Match::new(at + m.start(), at + m.end()))),
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
fn new_captures(&self) -> Result<RegexCaptures, NoError> {
|
|
||||||
Ok(RegexCaptures::simple())
|
|
||||||
}
|
|
||||||
|
|
||||||
fn capture_count(&self) -> usize {
|
|
||||||
1
|
|
||||||
}
|
|
||||||
|
|
||||||
fn capture_index(&self, _: &str) -> Option<usize> {
|
|
||||||
None
|
|
||||||
}
|
|
||||||
|
|
||||||
fn captures_at(
|
|
||||||
&self,
|
|
||||||
haystack: &[u8],
|
|
||||||
at: usize,
|
|
||||||
caps: &mut RegexCaptures,
|
|
||||||
) -> Result<bool, NoError> {
|
|
||||||
caps.set_simple(None);
|
|
||||||
let mat = self.find_at(haystack, at)?;
|
|
||||||
caps.set_simple(mat);
|
|
||||||
Ok(mat.is_some())
|
|
||||||
}
|
|
||||||
|
|
||||||
// We specifically do not implement other methods like find_iter. Namely,
|
|
||||||
// the iter methods are guaranteed to be correct by virtue of implementing
|
|
||||||
// find_at above.
|
|
||||||
}
|
|
||||||
|
|
||||||
/// Alternation literals checks if the given HIR is a simple alternation of
|
|
||||||
/// literals, and if so, returns them. Otherwise, this returns None.
|
|
||||||
pub fn alternation_literals(expr: &Hir) -> Option<Vec<Vec<u8>>> {
|
|
||||||
use regex_syntax::hir::{HirKind, Literal};
|
|
||||||
|
|
||||||
// This is pretty hacky, but basically, if `is_alternation_literal` is
|
|
||||||
// true, then we can make several assumptions about the structure of our
|
|
||||||
// HIR. This is what justifies the `unreachable!` statements below.
|
|
||||||
|
|
||||||
if !expr.is_alternation_literal() {
|
|
||||||
return None;
|
|
||||||
}
|
|
||||||
let alts = match *expr.kind() {
|
|
||||||
HirKind::Alternation(ref alts) => alts,
|
|
||||||
_ => return None, // one literal isn't worth it
|
|
||||||
};
|
|
||||||
|
|
||||||
let extendlit = |lit: &Literal, dst: &mut Vec<u8>| {
|
|
||||||
match *lit {
|
|
||||||
Literal::Unicode(c) => {
|
|
||||||
let mut buf = [0; 4];
|
|
||||||
dst.extend_from_slice(c.encode_utf8(&mut buf).as_bytes());
|
|
||||||
}
|
|
||||||
Literal::Byte(b) => {
|
|
||||||
dst.push(b);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
};
|
|
||||||
|
|
||||||
let mut lits = vec![];
|
|
||||||
for alt in alts {
|
|
||||||
let mut lit = vec![];
|
|
||||||
match *alt.kind() {
|
|
||||||
HirKind::Empty => {}
|
|
||||||
HirKind::Literal(ref x) => extendlit(x, &mut lit),
|
|
||||||
HirKind::Concat(ref exprs) => {
|
|
||||||
for e in exprs {
|
|
||||||
match *e.kind() {
|
|
||||||
HirKind::Literal(ref x) => extendlit(x, &mut lit),
|
|
||||||
_ => unreachable!("expected literal, got {:?}", e),
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
_ => unreachable!("expected literal or concat, got {:?}", alt),
|
|
||||||
}
|
|
||||||
lits.push(lit);
|
|
||||||
}
|
|
||||||
Some(lits)
|
|
||||||
}
|
|
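The removed `find_at` above searches the tail `haystack[at..]` and then shifts the result back by `at`, so that callers always see offsets relative to the full haystack. The sketch below shows only that translation step. It is an illustration under stated assumptions: the naive scan in `find` is a hypothetical stand-in for the Aho-Corasick automaton (it mimics leftmost-first semantics but not the automaton's performance), and matches are plain `(start, end)` tuples instead of `Match` values.

```rust
/// Naive leftmost-first search over a set of literals.
/// Stand-in for the automaton: the earliest start position wins, and at
/// equal start positions the literal listed first wins.
fn find(literals: &[&[u8]], haystack: &[u8]) -> Option<(usize, usize)> {
    for start in 0..=haystack.len() {
        for lit in literals {
            if haystack[start..].starts_with(lit) {
                return Some((start, start + lit.len()));
            }
        }
    }
    None
}

/// Search starting at `at`, reporting offsets relative to the whole haystack.
fn find_at(
    literals: &[&[u8]],
    haystack: &[u8],
    at: usize,
) -> Option<(usize, usize)> {
    // The inner search only sees the tail, so its offsets are relative to
    // `at`; add `at` back to both ends before returning.
    find(literals, &haystack[at..]).map(|(s, e)| (at + s, at + e))
}
```

Forgetting the `at +` adjustment would report matches relative to the slice, which is the kind of bug the comment about implementing only `find_at` and deriving the iterators from it is meant to avoid.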
@@ -103,9 +103,7 @@ impl Matcher for WordMatcher {
|
|||||||
at: usize,
|
at: usize,
|
||||||
caps: &mut RegexCaptures,
|
caps: &mut RegexCaptures,
|
||||||
) -> Result<bool, NoError> {
|
) -> Result<bool, NoError> {
|
||||||
let r = self.regex.captures_read_at(
|
let r = self.regex.captures_read_at(caps.locations(), haystack, at);
|
||||||
caps.locations_mut(), haystack, at,
|
|
||||||
);
|
|
||||||
Ok(r.is_some())
|
Ok(r.is_some())
|
||||||
}
|
}
|
||||||
|
|
||||||
|
@@ -1,6 +1,6 @@
|
|||||||
[package]
|
[package]
|
||||||
name = "grep-searcher"
|
name = "grep-searcher"
|
||||||
version = "0.1.4" #:version
|
version = "0.1.3" #:version
|
||||||
authors = ["Andrew Gallant <jamslam@gmail.com>"]
|
authors = ["Andrew Gallant <jamslam@gmail.com>"]
|
||||||
description = """
|
description = """
|
||||||
Fast line oriented regex searching as a library.
|
Fast line oriented regex searching as a library.
|
||||||
@@ -16,13 +16,13 @@ license = "Unlicense/MIT"
|
|||||||
bstr = { version = "0.1.2", default-features = false, features = ["std"] }
|
bstr = { version = "0.1.2", default-features = false, features = ["std"] }
|
||||||
bytecount = "0.5"
|
bytecount = "0.5"
|
||||||
encoding_rs = "0.8.14"
|
encoding_rs = "0.8.14"
|
||||||
encoding_rs_io = "0.1.6"
|
encoding_rs_io = "0.1.4"
|
||||||
grep-matcher = { version = "0.1.2", path = "../grep-matcher" }
|
grep-matcher = { version = "0.1.1", path = "../grep-matcher" }
|
||||||
log = "0.4.5"
|
log = "0.4.5"
|
||||||
memmap = "0.7"
|
memmap = "0.7"
|
||||||
|
|
||||||
[dev-dependencies]
|
[dev-dependencies]
|
||||||
grep-regex = { version = "0.1.3", path = "../grep-regex" }
|
grep-regex = { version = "0.1.1", path = "../grep-regex" }
|
||||||
regex = "1.1"
|
regex = "1.1"
|
||||||
|
|
||||||
[features]
|
[features]
|
||||||
|
@@ -317,14 +317,6 @@ pub struct LineBuffer {
|
|||||||
}
|
}
|
||||||
|
|
||||||
impl LineBuffer {
|
impl LineBuffer {
|
||||||
/// Set the binary detection method used on this line buffer.
|
|
||||||
///
|
|
||||||
/// This permits dynamically changing the binary detection strategy on
|
|
||||||
/// an existing line buffer without needing to create a new one.
|
|
||||||
pub fn set_binary_detection(&mut self, binary: BinaryDetection) {
|
|
||||||
self.config.binary = binary;
|
|
||||||
}
|
|
||||||
|
|
||||||
/// Reset this buffer, such that it can be used with a new reader.
|
/// Reset this buffer, such that it can be used with a new reader.
|
||||||
fn clear(&mut self) {
|
fn clear(&mut self) {
|
||||||
self.pos = 0;
|
self.pos = 0;
|
||||||
|
@@ -90,13 +90,6 @@ impl<'s, M: Matcher, S: Sink> Core<'s, M, S> {
|
|||||||
self.sink_matched(buf, range)
|
self.sink_matched(buf, range)
|
||||||
}
|
}
|
||||||
|
|
||||||
pub fn binary_data(
|
|
||||||
&mut self,
|
|
||||||
binary_byte_offset: u64,
|
|
||||||
) -> Result<bool, S::Error> {
|
|
||||||
self.sink.binary_data(&self.searcher, binary_byte_offset)
|
|
||||||
}
|
|
||||||
|
|
||||||
pub fn begin(&mut self) -> Result<bool, S::Error> {
|
pub fn begin(&mut self) -> Result<bool, S::Error> {
|
||||||
self.sink.begin(&self.searcher)
|
self.sink.begin(&self.searcher)
|
||||||
}
|
}
|
||||||
@@ -148,28 +141,19 @@ impl<'s, M: Matcher, S: Sink> Core<'s, M, S> {
|
|||||||
consumed
|
consumed
|
||||||
}
|
}
|
||||||
|
|
||||||
pub fn detect_binary(
|
pub fn detect_binary(&mut self, buf: &[u8], range: &Range) -> bool {
|
||||||
&mut self,
|
|
||||||
buf: &[u8],
|
|
||||||
range: &Range,
|
|
||||||
) -> Result<bool, S::Error> {
|
|
||||||
if self.binary_byte_offset.is_some() {
|
if self.binary_byte_offset.is_some() {
|
||||||
return Ok(self.config.binary.quit_byte().is_some());
|
return true;
|
||||||
}
|
}
|
||||||
let binary_byte = match self.config.binary.0 {
|
let binary_byte = match self.config.binary.0 {
|
||||||
BinaryDetection::Quit(b) => b,
|
BinaryDetection::Quit(b) => b,
|
||||||
BinaryDetection::Convert(b) => b,
|
_ => return false,
|
||||||
_ => return Ok(false),
|
|
||||||
};
|
};
|
||||||
if let Some(i) = B(&buf[*range]).find_byte(binary_byte) {
|
if let Some(i) = B(&buf[*range]).find_byte(binary_byte) {
|
||||||
let offset = range.start() + i;
|
self.binary_byte_offset = Some(range.start() + i);
|
||||||
self.binary_byte_offset = Some(offset);
|
true
|
||||||
if !self.binary_data(offset as u64)? {
|
|
||||||
return Ok(true);
|
|
||||||
}
|
|
||||||
Ok(self.config.binary.quit_byte().is_some())
|
|
||||||
} else {
|
} else {
|
||||||
Ok(false)
|
false
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -432,7 +416,7 @@ impl<'s, M: Matcher, S: Sink> Core<'s, M, S> {
|
|||||||
buf: &[u8],
|
buf: &[u8],
|
||||||
range: &Range,
|
range: &Range,
|
||||||
) -> Result<bool, S::Error> {
|
) -> Result<bool, S::Error> {
|
||||||
if self.binary && self.detect_binary(buf, range)? {
|
if self.binary && self.detect_binary(buf, range) {
|
||||||
return Ok(false);
|
return Ok(false);
|
||||||
}
|
}
|
||||||
if !self.sink_break_context(range.start())? {
|
if !self.sink_break_context(range.start())? {
|
||||||
@@ -464,7 +448,7 @@ impl<'s, M: Matcher, S: Sink> Core<'s, M, S> {
|
|||||||
buf: &[u8],
|
buf: &[u8],
|
||||||
range: &Range,
|
range: &Range,
|
||||||
) -> Result<bool, S::Error> {
|
) -> Result<bool, S::Error> {
|
||||||
if self.binary && self.detect_binary(buf, range)? {
|
if self.binary && self.detect_binary(buf, range) {
|
||||||
return Ok(false);
|
return Ok(false);
|
||||||
}
|
}
|
||||||
self.count_lines(buf, range.start());
|
self.count_lines(buf, range.start());
|
||||||
@@ -494,7 +478,7 @@ impl<'s, M: Matcher, S: Sink> Core<'s, M, S> {
|
|||||||
) -> Result<bool, S::Error> {
|
) -> Result<bool, S::Error> {
|
||||||
assert!(self.after_context_left >= 1);
|
assert!(self.after_context_left >= 1);
|
||||||
|
|
||||||
if self.binary && self.detect_binary(buf, range)? {
|
if self.binary && self.detect_binary(buf, range) {
|
||||||
return Ok(false);
|
return Ok(false);
|
||||||
}
|
}
|
||||||
self.count_lines(buf, range.start());
|
self.count_lines(buf, range.start());
|
||||||
@@ -523,7 +507,7 @@ impl<'s, M: Matcher, S: Sink> Core<'s, M, S> {
|
|||||||
buf: &[u8],
|
buf: &[u8],
|
||||||
range: &Range,
|
range: &Range,
|
||||||
) -> Result<bool, S::Error> {
|
) -> Result<bool, S::Error> {
|
||||||
if self.binary && self.detect_binary(buf, range)? {
|
if self.binary && self.detect_binary(buf, range) {
|
||||||
return Ok(false);
|
return Ok(false);
|
||||||
}
|
}
|
||||||
self.count_lines(buf, range.start());
|
self.count_lines(buf, range.start());
|
||||||
|
@@ -51,7 +51,6 @@ where M: Matcher,
|
|||||||
fn fill(&mut self) -> Result<bool, S::Error> {
|
fn fill(&mut self) -> Result<bool, S::Error> {
|
||||||
assert!(self.rdr.buffer()[self.core.pos()..].is_empty());
|
assert!(self.rdr.buffer()[self.core.pos()..].is_empty());
|
||||||
|
|
||||||
let already_binary = self.rdr.binary_byte_offset().is_some();
|
|
||||||
let old_buf_len = self.rdr.buffer().len();
|
let old_buf_len = self.rdr.buffer().len();
|
||||||
let consumed = self.core.roll(self.rdr.buffer());
|
let consumed = self.core.roll(self.rdr.buffer());
|
||||||
self.rdr.consume(consumed);
|
self.rdr.consume(consumed);
|
||||||
@@ -59,14 +58,7 @@ where M: Matcher,
|
|||||||
Err(err) => return Err(S::Error::error_io(err)),
|
Err(err) => return Err(S::Error::error_io(err)),
|
||||||
Ok(didread) => didread,
|
Ok(didread) => didread,
|
||||||
};
|
};
|
||||||
if !already_binary {
|
if !didread || self.rdr.binary_byte_offset().is_some() {
|
||||||
if let Some(offset) = self.rdr.binary_byte_offset() {
|
|
||||||
if !self.core.binary_data(offset)? {
|
|
||||||
return Ok(false);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
if !didread || self.should_binary_quit() {
|
|
||||||
return Ok(false);
|
return Ok(false);
|
||||||
}
|
}
|
||||||
// If rolling the buffer didn't result in consuming anything and if
|
// If rolling the buffer didn't result in consuming anything and if
|
||||||
@@ -79,11 +71,6 @@ where M: Matcher,
|
|||||||
}
|
}
|
||||||
Ok(true)
|
Ok(true)
|
||||||
}
|
}
|
||||||
|
|
||||||
fn should_binary_quit(&self) -> bool {
|
|
||||||
self.rdr.binary_byte_offset().is_some()
|
|
||||||
&& self.config.binary.quit_byte().is_some()
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
|
|
||||||
#[derive(Debug)]
|
#[derive(Debug)]
|
||||||
@@ -116,7 +103,7 @@ impl<'s, M: Matcher, S: Sink> SliceByLine<'s, M, S> {
|
|||||||
DEFAULT_BUFFER_CAPACITY,
|
DEFAULT_BUFFER_CAPACITY,
|
||||||
);
|
);
|
||||||
let binary_range = Range::new(0, binary_upto);
|
let binary_range = Range::new(0, binary_upto);
|
||||||
if !self.core.detect_binary(self.slice, &binary_range)? {
|
if !self.core.detect_binary(self.slice, &binary_range) {
|
||||||
while
|
while
|
||||||
!self.slice[self.core.pos()..].is_empty()
|
!self.slice[self.core.pos()..].is_empty()
|
||||||
&& self.core.match_by_line(self.slice)?
|
&& self.core.match_by_line(self.slice)?
|
||||||
@@ -168,7 +155,7 @@ impl<'s, M: Matcher, S: Sink> MultiLine<'s, M, S> {
|
|||||||
DEFAULT_BUFFER_CAPACITY,
|
DEFAULT_BUFFER_CAPACITY,
|
||||||
);
|
);
|
||||||
let binary_range = Range::new(0, binary_upto);
|
let binary_range = Range::new(0, binary_upto);
|
||||||
if !self.core.detect_binary(self.slice, &binary_range)? {
|
if !self.core.detect_binary(self.slice, &binary_range) {
|
||||||
let mut keepgoing = true;
|
let mut keepgoing = true;
|
||||||
while !self.slice[self.core.pos()..].is_empty() && keepgoing {
|
while !self.slice[self.core.pos()..].is_empty() && keepgoing {
|
||||||
keepgoing = self.sink()?;
|
keepgoing = self.sink()?;
|
||||||
|
@@ -75,41 +75,25 @@ impl BinaryDetection {
|
|||||||
BinaryDetection(line_buffer::BinaryDetection::Quit(binary_byte))
|
BinaryDetection(line_buffer::BinaryDetection::Quit(binary_byte))
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Binary detection is performed by looking for the given byte, and
|
// TODO(burntsushi): Figure out how to make binary conversion work. This
|
||||||
/// replacing it with the line terminator configured on the searcher.
|
// permits implementing GNU grep's default behavior, which is to zap NUL
|
||||||
/// (If the searcher is configured to use `CRLF` as the line terminator,
|
// bytes but still execute a search (if a match is detected, then GNU grep
|
||||||
/// then this byte is replaced by just `LF`.)
|
// stops and reports that a match was found but doesn't print the matching
|
||||||
///
|
// line itself).
|
||||||
/// When searching is performed using a fixed size buffer, then the
|
//
|
||||||
/// contents of that buffer are always searched for the presence of this
|
// This behavior is pretty simple to implement using the line buffer (and
|
||||||
/// byte and replaced with the line terminator. In effect, the caller is
|
// in fact, it is already implemented and tested), since there's a fixed
|
||||||
/// guaranteed to never observe this byte while searching.
|
// size buffer that we can easily write to. The issue arises when searching
|
||||||
///
|
// a `&[u8]` (whether on the heap or via a memory map), since this isn't
|
||||||
/// When searching is performed with the entire contents mapped into
|
// something we can easily write to.
|
||||||
/// memory, then this setting has no effect and is ignored.
|
|
||||||
pub fn convert(binary_byte: u8) -> BinaryDetection {
|
/// The given byte is searched in all contents read by the line buffer. If
|
||||||
|
/// it occurs, then it is replaced by the line terminator. The line buffer
|
||||||
|
/// guarantees that this byte will never be observable by callers.
|
||||||
|
#[allow(dead_code)]
|
||||||
|
fn convert(binary_byte: u8) -> BinaryDetection {
|
||||||
BinaryDetection(line_buffer::BinaryDetection::Convert(binary_byte))
|
BinaryDetection(line_buffer::BinaryDetection::Convert(binary_byte))
|
||||||
}
|
}
|
||||||
|
|
||||||
/// If this binary detection uses the "quit" strategy, then this returns
|
|
||||||
/// the byte that will cause a search to quit. In any other case, this
|
|
||||||
/// returns `None`.
|
|
||||||
pub fn quit_byte(&self) -> Option<u8> {
|
|
||||||
match self.0 {
|
|
||||||
line_buffer::BinaryDetection::Quit(b) => Some(b),
|
|
||||||
_ => None,
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
/// If this binary detection uses the "convert" strategy, then this returns
|
|
||||||
/// the byte that will be replaced by the line terminator. In any other
|
|
||||||
/// case, this returns `None`.
|
|
||||||
pub fn convert_byte(&self) -> Option<u8> {
|
|
||||||
match self.0 {
|
|
||||||
line_buffer::BinaryDetection::Convert(b) => Some(b),
|
|
||||||
_ => None,
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
|
|
||||||
/// An encoding to use when searching.
|
/// An encoding to use when searching.
|
||||||
@@ -171,8 +155,6 @@ pub struct Config {
|
|||||||
/// An encoding that, when present, causes the searcher to transcode all
|
/// An encoding that, when present, causes the searcher to transcode all
|
||||||
/// input from the encoding to UTF-8.
|
/// input from the encoding to UTF-8.
|
||||||
encoding: Option<Encoding>,
|
encoding: Option<Encoding>,
|
||||||
/// Whether to do automatic transcoding based on a BOM or not.
|
|
||||||
bom_sniffing: bool,
|
|
||||||
}
|
}
|
||||||
|
|
||||||
impl Default for Config {
|
impl Default for Config {
|
||||||
@@ -189,7 +171,6 @@ impl Default for Config {
|
|||||||
binary: BinaryDetection::default(),
|
binary: BinaryDetection::default(),
|
||||||
multi_line: false,
|
multi_line: false,
|
||||||
encoding: None,
|
encoding: None,
|
||||||
bom_sniffing: true,
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
@@ -322,15 +303,12 @@ impl SearcherBuilder {
|
|||||||
config.before_context = 0;
|
config.before_context = 0;
|
||||||
config.after_context = 0;
|
config.after_context = 0;
|
||||||
}
|
}
|
||||||
|
|
||||||
let mut decode_builder = DecodeReaderBytesBuilder::new();
|
let mut decode_builder = DecodeReaderBytesBuilder::new();
|
||||||
decode_builder
|
decode_builder
|
||||||
.encoding(self.config.encoding.as_ref().map(|e| e.0))
|
.encoding(self.config.encoding.as_ref().map(|e| e.0))
|
||||||
.utf8_passthru(true)
|
.utf8_passthru(true)
|
||||||
.strip_bom(self.config.bom_sniffing)
|
.strip_bom(true)
|
||||||
.bom_override(true)
|
.bom_override(true);
|
||||||
.bom_sniffing(self.config.bom_sniffing);
|
|
||||||
|
|
||||||
Searcher {
|
Searcher {
|
||||||
config: config,
|
config: config,
|
||||||
decode_builder: decode_builder,
|
decode_builder: decode_builder,
|
||||||
@@ -528,13 +506,12 @@ impl SearcherBuilder {
|
|||||||
/// transcoding process encounters an error, then bytes are replaced with
|
/// transcoding process encounters an error, then bytes are replaced with
|
||||||
/// the Unicode replacement codepoint.
|
/// the Unicode replacement codepoint.
|
||||||
///
|
///
|
||||||
/// When no encoding is specified (the default), then BOM sniffing is
|
/// When no encoding is specified (the default), then BOM sniffing is used
|
||||||
/// used (if it's enabled, which it is, by default) to determine whether
|
/// to determine whether the source data is UTF-8 or UTF-16, and
|
||||||
/// the source data is UTF-8 or UTF-16, and transcoding will be performed
|
/// transcoding will be performed automatically. If no BOM could be found,
|
||||||
/// automatically. If no BOM could be found, then the source data is
|
/// then the source data is searched _as if_ it were UTF-8. However, so
|
||||||
/// searched _as if_ it were UTF-8. However, so long as the source data is
|
/// long as the source data is at least ASCII compatible, then it is
|
||||||
/// at least ASCII compatible, then it is possible for a search to produce
|
/// possible for a search to produce useful results.
|
||||||
/// useful results.
|
|
||||||
pub fn encoding(
|
pub fn encoding(
|
||||||
&mut self,
|
&mut self,
|
||||||
encoding: Option<Encoding>,
|
encoding: Option<Encoding>,
|
||||||
@@ -542,23 +519,6 @@ impl SearcherBuilder {
|
|||||||
self.config.encoding = encoding;
|
self.config.encoding = encoding;
|
||||||
self
|
self
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Enable automatic transcoding based on BOM sniffing.
|
|
||||||
///
|
|
||||||
/// When this is enabled and an explicit encoding is not set, then this
|
|
||||||
/// searcher will try to detect the encoding of the bytes being searched
|
|
||||||
/// by sniffing its byte-order mark (BOM). In particular, when this is
|
|
||||||
/// enabled, UTF-16 encoded files will be searched seamlessly.
|
|
||||||
///
|
|
||||||
/// When this is disabled and if an explicit encoding is not set, then
|
|
||||||
/// the bytes from the source stream will be passed through unchanged,
|
|
||||||
/// including its BOM, if one is present.
|
|
||||||
///
|
|
||||||
/// This is enabled by default.
|
|
||||||
pub fn bom_sniffing(&mut self, yes: bool) -> &mut SearcherBuilder {
|
|
||||||
self.config.bom_sniffing = yes;
|
|
||||||
self
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
|
|
||||||
/// A searcher executes searches over a haystack and writes results to a caller
|
/// A searcher executes searches over a haystack and writes results to a caller
|
||||||
@@ -755,12 +715,6 @@ impl Searcher {
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Set the binary detection method used on this searcher.
|
|
||||||
pub fn set_binary_detection(&mut self, detection: BinaryDetection) {
|
|
||||||
self.config.binary = detection.clone();
|
|
||||||
self.line_buffer.borrow_mut().set_binary_detection(detection.0);
|
|
||||||
}
|
|
||||||
|
|
||||||
/// Check that the searcher's configuration and the matcher are consistent
|
/// Check that the searcher's configuration and the matcher are consistent
|
||||||
/// with each other.
|
/// with each other.
|
||||||
fn check_config<M: Matcher>(&self, matcher: M) -> Result<(), ConfigError> {
|
fn check_config<M: Matcher>(&self, matcher: M) -> Result<(), ConfigError> {
|
||||||
@@ -784,8 +738,7 @@ impl Searcher {
|
|||||||
|
|
||||||
/// Returns true if and only if the given slice needs to be transcoded.
|
/// Returns true if and only if the given slice needs to be transcoded.
|
||||||
fn slice_needs_transcoding(&self, slice: &[u8]) -> bool {
|
fn slice_needs_transcoding(&self, slice: &[u8]) -> bool {
|
||||||
self.config.encoding.is_some()
|
self.config.encoding.is_some() || slice_has_utf16_bom(slice)
|
||||||
|| (self.config.bom_sniffing && slice_has_utf16_bom(slice))
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -800,12 +753,6 @@ impl Searcher {
|
|||||||
self.config.line_term
|
self.config.line_term
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Returns the type of binary detection configured on this searcher.
|
|
||||||
#[inline]
|
|
||||||
pub fn binary_detection(&self) -> &BinaryDetection {
|
|
||||||
&self.config.binary
|
|
||||||
}
|
|
||||||
|
|
||||||
/// Returns true if and only if this searcher is configured to invert its
|
/// Returns true if and only if this searcher is configured to invert its
|
||||||
/// search results. That is, matching lines are lines that do **not** match
|
/// search results. That is, matching lines are lines that do **not** match
|
||||||
/// the searcher's matcher.
|
/// the searcher's matcher.
|
||||||
|
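Note on the `slice_needs_transcoding` hunk above: the base side only consults the BOM when `bom_sniffing` is enabled, while the other side always checks it. The following is a minimal sketch of that gating, not the crate's code. `has_utf16_bom`, `SketchConfig` and `needs_transcoding` are hypothetical names, and the BOM check is a plain prefix comparison assumed here as a stand-in for the real `slice_has_utf16_bom`, whose body is not shown in this diff.

```rust
/// Returns true if `slice` begins with a UTF-16 byte-order mark:
/// FF FE (little-endian) or FE FF (big-endian).
/// Hypothetical stand-in for `slice_has_utf16_bom`.
fn has_utf16_bom(slice: &[u8]) -> bool {
    slice.starts_with(&[0xFF, 0xFE]) || slice.starts_with(&[0xFE, 0xFF])
}

/// Hypothetical, reduced view of the searcher configuration.
struct SketchConfig {
    /// True when the caller set an explicit encoding.
    explicit_encoding: bool,
    /// True when BOM sniffing is enabled (the default on the base side).
    bom_sniffing: bool,
}

/// Mirrors the base-side condition: transcode when an explicit encoding is
/// set, or when sniffing is enabled and a UTF-16 BOM is present.
fn needs_transcoding(config: &SketchConfig, slice: &[u8]) -> bool {
    config.explicit_encoding || (config.bom_sniffing && has_utf16_bom(slice))
}
```

With sniffing disabled and no explicit encoding, a slice that starts with a BOM is passed through unchanged, which matches the doc comment removed in the `bom_sniffing` hunk.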
@@ -167,28 +167,6 @@ pub trait Sink {
|
|||||||
Ok(true)
|
Ok(true)
|
||||||
}
|
}
|
||||||
|
|
||||||
/// This method is called whenever binary detection is enabled and binary
|
|
||||||
/// data is found. If binary data is found, then this is called at least
|
|
||||||
/// once for the first occurrence with the absolute byte offset at which
|
|
||||||
/// the binary data begins.
|
|
||||||
///
|
|
||||||
/// If this returns `true`, then searching continues. If this returns
|
|
||||||
/// `false`, then searching is stopped immediately and `finish` is called.
|
|
||||||
///
|
|
||||||
/// If this returns an error, then searching is stopped immediately,
|
|
||||||
/// `finish` is not called and the error is bubbled back up to the caller
|
|
||||||
/// of the searcher.
|
|
||||||
///
|
|
||||||
/// By default, it does nothing and returns `true`.
|
|
||||||
#[inline]
|
|
||||||
fn binary_data(
|
|
||||||
&mut self,
|
|
||||||
_searcher: &Searcher,
|
|
||||||
_binary_byte_offset: u64,
|
|
||||||
) -> Result<bool, Self::Error> {
|
|
||||||
Ok(true)
|
|
||||||
}
|
|
||||||
|
|
||||||
/// This method is called when a search has begun, before any search is
|
/// This method is called when a search has begun, before any search is
|
||||||
/// executed. By default, this does nothing.
|
/// executed. By default, this does nothing.
|
||||||
///
|
///
|
||||||
@@ -250,15 +228,6 @@ impl<'a, S: Sink> Sink for &'a mut S {
|
|||||||
(**self).context_break(searcher)
|
(**self).context_break(searcher)
|
||||||
}
|
}
|
||||||
|
|
||||||
#[inline]
|
|
||||||
fn binary_data(
|
|
||||||
&mut self,
|
|
||||||
searcher: &Searcher,
|
|
||||||
binary_byte_offset: u64,
|
|
||||||
) -> Result<bool, S::Error> {
|
|
||||||
(**self).binary_data(searcher, binary_byte_offset)
|
|
||||||
}
|
|
||||||
|
|
||||||
#[inline]
|
#[inline]
|
||||||
fn begin(
|
fn begin(
|
||||||
&mut self,
|
&mut self,
|
||||||
@@ -306,15 +275,6 @@ impl<S: Sink + ?Sized> Sink for Box<S> {
|
|||||||
(**self).context_break(searcher)
|
(**self).context_break(searcher)
|
||||||
}
|
}
|
||||||
|
|
||||||
#[inline]
|
|
||||||
fn binary_data(
|
|
||||||
&mut self,
|
|
||||||
searcher: &Searcher,
|
|
||||||
binary_byte_offset: u64,
|
|
||||||
) -> Result<bool, S::Error> {
|
|
||||||
(**self).binary_data(searcher, binary_byte_offset)
|
|
||||||
}
|
|
||||||
|
|
||||||
#[inline]
|
#[inline]
|
||||||
fn begin(
|
fn begin(
|
||||||
&mut self,
|
&mut self,
|
||||||
|
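Note on the `binary_data` hunks above: the base side adds a defaulted callback to `Sink` and forwards it through `&mut S` and `Box<S>`. The sketch below shows the same shape on a reduced trait so the contract is easy to see. `BinarySink`, `IgnoreBinary` and `StopOnBinary` are hypothetical names, and the `searcher` argument is dropped for brevity. This is not the grep-searcher trait itself.

```rust
/// Reduced stand-in for the `Sink` trait, keeping only the binary callback.
trait BinarySink {
    type Error;

    /// Called with the absolute byte offset at which binary data begins.
    /// `Ok(true)` continues the search, `Ok(false)` stops it.
    /// The default does nothing and continues.
    fn binary_data(&mut self, _binary_byte_offset: u64) -> Result<bool, Self::Error> {
        Ok(true)
    }
}

/// Forwarding impl, in the same style as the `&'a mut S` impl in the diff.
impl<'a, S: BinarySink> BinarySink for &'a mut S {
    type Error = S::Error;

    fn binary_data(&mut self, binary_byte_offset: u64) -> Result<bool, S::Error> {
        (**self).binary_data(binary_byte_offset)
    }
}

/// A sink that relies on the default and therefore keeps searching.
struct IgnoreBinary;

impl BinarySink for IgnoreBinary {
    type Error = std::io::Error;
}

/// A sink that records the first binary offset and asks the search to stop.
struct StopOnBinary {
    first_offset: Option<u64>,
}

impl BinarySink for StopOnBinary {
    type Error = std::io::Error;

    fn binary_data(&mut self, binary_byte_offset: u64) -> Result<bool, Self::Error> {
        if self.first_offset.is_none() {
            self.first_offset = Some(binary_byte_offset);
        }
        Ok(false)
    }
}
```

Because the method has a default body, removing it (as the other side of this comparison does) only affects implementors that overrode it.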
@@ -14,11 +14,11 @@ license = "Unlicense/MIT"
|
|||||||
|
|
||||||
[dependencies]
|
[dependencies]
|
||||||
grep-cli = { version = "0.1.1", path = "../grep-cli" }
|
grep-cli = { version = "0.1.1", path = "../grep-cli" }
|
||||||
grep-matcher = { version = "0.1.2", path = "../grep-matcher" }
|
grep-matcher = { version = "0.1.1", path = "../grep-matcher" }
|
||||||
grep-pcre2 = { version = "0.1.3", path = "../grep-pcre2", optional = true }
|
grep-pcre2 = { version = "0.1.2", path = "../grep-pcre2", optional = true }
|
||||||
grep-printer = { version = "0.1.2", path = "../grep-printer" }
|
grep-printer = { version = "0.1.1", path = "../grep-printer" }
|
||||||
grep-regex = { version = "0.1.3", path = "../grep-regex" }
|
grep-regex = { version = "0.1.1", path = "../grep-regex" }
|
||||||
grep-searcher = { version = "0.1.4", path = "../grep-searcher" }
|
grep-searcher = { version = "0.1.1", path = "../grep-searcher" }
|
||||||
|
|
||||||
[dev-dependencies]
|
[dev-dependencies]
|
||||||
termcolor = "1.0.4"
|
termcolor = "1.0.4"
|
||||||
|
@@ -1,6 +1,6 @@
|
|||||||
[package]
|
[package]
|
||||||
name = "ignore"
|
name = "ignore"
|
||||||
version = "0.4.7" #:version
|
version = "0.4.6" #:version
|
||||||
authors = ["Andrew Gallant <jamslam@gmail.com>"]
|
authors = ["Andrew Gallant <jamslam@gmail.com>"]
|
||||||
description = """
|
description = """
|
||||||
A fast library for efficiently matching ignore files such as `.gitignore`
|
A fast library for efficiently matching ignore files such as `.gitignore`
|
||||||
@@ -19,7 +19,7 @@ bench = false
|
|||||||
|
|
||||||
[dependencies]
|
[dependencies]
|
||||||
crossbeam-channel = "0.3.6"
|
crossbeam-channel = "0.3.6"
|
||||||
globset = { version = "0.4.3", path = "../globset" }
|
globset = { version = "0.4.2", path = "../globset" }
|
||||||
lazy_static = "1.1"
|
lazy_static = "1.1"
|
||||||
log = "0.4.5"
|
log = "0.4.5"
|
||||||
memchr = "2.1"
|
memchr = "2.1"
|
||||||
|
@@ -111,7 +111,7 @@ const DEFAULT_TYPES: &'static [(&'static str, &'static [&'static str])] = &[
|
|||||||
("brotli", &["*.br"]),
|
("brotli", &["*.br"]),
|
||||||
("buildstream", &["*.bst"]),
|
("buildstream", &["*.bst"]),
|
||||||
("bzip2", &["*.bz2", "*.tbz2"]),
|
("bzip2", &["*.bz2", "*.tbz2"]),
|
||||||
("c", &["*.[chH]", "*.[chH].in", "*.cats"]),
|
("c", &["*.c", "*.h", "*.H", "*.cats"]),
|
||||||
("cabal", &["*.cabal"]),
|
("cabal", &["*.cabal"]),
|
||||||
("cbor", &["*.cbor"]),
|
("cbor", &["*.cbor"]),
|
||||||
("ceylon", &["*.ceylon"]),
|
("ceylon", &["*.ceylon"]),
|
||||||
@@ -121,8 +121,8 @@ const DEFAULT_TYPES: &'static [(&'static str, &'static [&'static str])] = &[
|
|||||||
("creole", &["*.creole"]),
|
("creole", &["*.creole"]),
|
||||||
("config", &["*.cfg", "*.conf", "*.config", "*.ini"]),
|
("config", &["*.cfg", "*.conf", "*.config", "*.ini"]),
|
||||||
("cpp", &[
|
("cpp", &[
|
||||||
"*.[ChH]", "*.cc", "*.[ch]pp", "*.[ch]xx", "*.hh", "*.inl",
|
"*.C", "*.cc", "*.cpp", "*.cxx",
|
||||||
"*.[ChH].in", "*.cc.in", "*.[ch]pp.in", "*.[ch]xx.in", "*.hh.in",
|
"*.h", "*.H", "*.hh", "*.hpp", "*.hxx", "*.inl",
|
||||||
]),
|
]),
|
||||||
("crystal", &["Projectfile", "*.cr"]),
|
("crystal", &["Projectfile", "*.cr"]),
|
||||||
("cs", &["*.cs"]),
|
("cs", &["*.cs"]),
|
||||||
@@ -156,7 +156,7 @@ const DEFAULT_TYPES: &'static [(&'static str, &'static [&'static str])] = &[
|
|||||||
("hs", &["*.hs", "*.lhs"]),
|
("hs", &["*.hs", "*.lhs"]),
|
||||||
("html", &["*.htm", "*.html", "*.ejs"]),
|
("html", &["*.htm", "*.html", "*.ejs"]),
|
||||||
("idris", &["*.idr", "*.lidr"]),
|
("idris", &["*.idr", "*.lidr"]),
|
||||||
("java", &["*.java", "*.jsp", "*.jspx", "*.properties"]),
|
("java", &["*.java", "*.jsp"]),
|
||||||
("jinja", &["*.j2", "*.jinja", "*.jinja2"]),
|
("jinja", &["*.j2", "*.jinja", "*.jinja2"]),
|
||||||
("js", &[
|
("js", &[
|
||||||
"*.js", "*.jsx", "*.vue",
|
"*.js", "*.jsx", "*.vue",
|
||||||
@@ -196,16 +196,14 @@ const DEFAULT_TYPES: &'static [(&'static str, &'static [&'static str])] = &[
|
|||||||
"OFL-*[0-9]*",
|
"OFL-*[0-9]*",
|
||||||
]),
|
]),
|
||||||
("lisp", &["*.el", "*.jl", "*.lisp", "*.lsp", "*.sc", "*.scm"]),
|
("lisp", &["*.el", "*.jl", "*.lisp", "*.lsp", "*.sc", "*.scm"]),
|
||||||
("lock", &["*.lock", "package-lock.json"]),
|
|
||||||
("log", &["*.log"]),
|
("log", &["*.log"]),
|
||||||
("lua", &["*.lua"]),
|
("lua", &["*.lua"]),
|
||||||
("lzma", &["*.lzma"]),
|
("lzma", &["*.lzma"]),
|
||||||
("lz4", &["*.lz4"]),
|
("lz4", &["*.lz4"]),
|
||||||
("m4", &["*.ac", "*.m4"]),
|
("m4", &["*.ac", "*.m4"]),
|
||||||
("make", &[
|
("make", &[
|
||||||
"[Gg][Nn][Uu]makefile", "[Mm]akefile",
|
"gnumakefile", "Gnumakefile", "GNUmakefile",
|
||||||
"[Gg][Nn][Uu]makefile.am", "[Mm]akefile.am",
|
"makefile", "Makefile",
|
||||||
"[Gg][Nn][Uu]makefile.in", "[Mm]akefile.in",
|
|
||||||
"*.mk", "*.mak"
|
"*.mk", "*.mak"
|
||||||
]),
|
]),
|
||||||
("mako", &["*.mako", "*.mao"]),
|
("mako", &["*.mako", "*.mao"]),
|
||||||
@@ -301,10 +299,7 @@ const DEFAULT_TYPES: &'static [(&'static str, &'static [&'static str])] = &[
|
|||||||
("vimscript", &["*.vim"]),
|
("vimscript", &["*.vim"]),
|
||||||
("wiki", &["*.mediawiki", "*.wiki"]),
|
("wiki", &["*.mediawiki", "*.wiki"]),
|
||||||
("webidl", &["*.idl", "*.webidl", "*.widl"]),
|
("webidl", &["*.idl", "*.webidl", "*.widl"]),
|
||||||
("xml", &[
|
("xml", &["*.xml", "*.xml.dist"]),
|
||||||
"*.xml", "*.xml.dist", "*.dtd", "*.xsl", "*.xslt", "*.xsd", "*.xjb",
|
|
||||||
"*.rng", "*.sch",
|
|
||||||
]),
|
|
||||||
("xz", &["*.xz", "*.txz"]),
|
("xz", &["*.xz", "*.txz"]),
|
||||||
("yacc", &["*.y"]),
|
("yacc", &["*.y"]),
|
||||||
("yaml", &["*.yaml", "*.yml"]),
|
("yaml", &["*.yaml", "*.yml"]),
|
||||||
|
175
src/app.rs
@@ -27,9 +27,6 @@ configuration file. The file can specify one shell argument per line. Lines
|
|||||||
starting with '#' are ignored. For more details, see the man page or the
|
starting with '#' are ignored. For more details, see the man page or the
|
||||||
README.
|
README.
|
||||||
|
|
||||||
Tip: to disable all smart filtering and make ripgrep behave a bit more like
|
|
||||||
classical grep, use 'rg -uuu'.
|
|
||||||
|
|
||||||
Project home page: https://github.com/BurntSushi/ripgrep
|
Project home page: https://github.com/BurntSushi/ripgrep
|
||||||
|
|
||||||
Use -h for short descriptions and --help for more details.";
|
Use -h for short descriptions and --help for more details.";
|
||||||
@@ -547,9 +544,7 @@ pub fn all_args_and_flags() -> Vec<RGArg> {
|
|||||||
// flags are hidden and merely mentioned in the docs of the corresponding
|
// flags are hidden and merely mentioned in the docs of the corresponding
|
||||||
// "positive" flag.
|
// "positive" flag.
|
||||||
flag_after_context(&mut args);
|
flag_after_context(&mut args);
|
||||||
flag_auto_hybrid_regex(&mut args);
|
|
||||||
flag_before_context(&mut args);
|
flag_before_context(&mut args);
|
||||||
flag_binary(&mut args);
|
|
||||||
flag_block_buffered(&mut args);
|
flag_block_buffered(&mut args);
|
||||||
flag_byte_offset(&mut args);
|
flag_byte_offset(&mut args);
|
||||||
flag_case_sensitive(&mut args);
|
flag_case_sensitive(&mut args);
|
||||||
@@ -583,7 +578,6 @@ pub fn all_args_and_flags() -> Vec<RGArg> {
|
|||||||
flag_line_number(&mut args);
|
flag_line_number(&mut args);
|
||||||
flag_line_regexp(&mut args);
|
flag_line_regexp(&mut args);
|
||||||
flag_max_columns(&mut args);
|
flag_max_columns(&mut args);
|
||||||
flag_max_columns_preview(&mut args);
|
|
||||||
flag_max_count(&mut args);
|
flag_max_count(&mut args);
|
||||||
flag_max_depth(&mut args);
|
flag_max_depth(&mut args);
|
||||||
flag_max_filesize(&mut args);
|
flag_max_filesize(&mut args);
|
||||||
@@ -606,7 +600,6 @@ pub fn all_args_and_flags() -> Vec<RGArg> {
|
|||||||
flag_path_separator(&mut args);
|
flag_path_separator(&mut args);
|
||||||
flag_passthru(&mut args);
|
flag_passthru(&mut args);
|
||||||
flag_pcre2(&mut args);
|
flag_pcre2(&mut args);
|
||||||
flag_pcre2_version(&mut args);
|
|
||||||
flag_pre(&mut args);
|
flag_pre(&mut args);
|
||||||
flag_pre_glob(&mut args);
|
flag_pre_glob(&mut args);
|
||||||
flag_pretty(&mut args);
|
flag_pretty(&mut args);
|
||||||
@@ -653,7 +646,7 @@ will be provided. Namely, the following is equivalent to the above:
|
|||||||
let arg = RGArg::positional("pattern", "PATTERN")
|
let arg = RGArg::positional("pattern", "PATTERN")
|
||||||
.help(SHORT).long_help(LONG)
|
.help(SHORT).long_help(LONG)
|
||||||
.required_unless(&[
|
.required_unless(&[
|
||||||
"file", "files", "regexp", "type-list", "pcre2-version",
|
"file", "files", "regexp", "type-list",
|
||||||
]);
|
]);
|
||||||
args.push(arg);
|
args.push(arg);
|
||||||
}
|
}
|
||||||
@@ -684,50 +677,6 @@ This overrides the --context flag.
|
|||||||
args.push(arg);
|
args.push(arg);
|
||||||
}
|
}
|
||||||
|
|
||||||
fn flag_auto_hybrid_regex(args: &mut Vec<RGArg>) {
|
|
||||||
const SHORT: &str = "Dynamically use PCRE2 if necessary.";
|
|
||||||
const LONG: &str = long!("\
|
|
||||||
When this flag is used, ripgrep will dynamically choose between supported regex
|
|
||||||
engines depending on the features used in a pattern. When ripgrep chooses a
|
|
||||||
regex engine, it applies that choice for every regex provided to ripgrep (e.g.,
|
|
||||||
via multiple -e/--regexp or -f/--file flags).
|
|
||||||
|
|
||||||
As an example of how this flag might behave, ripgrep will attempt to use
|
|
||||||
its default finite automata based regex engine whenever the pattern can be
|
|
||||||
successfully compiled with that regex engine. If PCRE2 is enabled and if the
|
|
||||||
pattern given could not be compiled with the default regex engine, then PCRE2
|
|
||||||
will be automatically used for searching. If PCRE2 isn't available, then this
|
|
||||||
flag has no effect because there is only one regex engine to choose from.
|
|
||||||
|
|
||||||
In the future, ripgrep may adjust its heuristics for how it decides which
|
|
||||||
regex engine to use. In general, the heuristics will be limited to a static
|
|
||||||
analysis of the patterns, and not to any specific runtime behavior observed
|
|
||||||
while searching files.
|
|
||||||
|
|
||||||
The primary downside of using this flag is that it may not always be obvious
|
|
||||||
which regex engine ripgrep uses, and thus, the match semantics or performance
|
|
||||||
profile of ripgrep may subtly and unexpectedly change. However, in many cases,
|
|
||||||
all regex engines will agree on what constitutes a match and it can be nice
|
|
||||||
to transparently support more advanced regex features like look-around and
|
|
||||||
backreferences without explicitly needing to enable them.
|
|
||||||
|
|
||||||
This flag can be disabled with --no-auto-hybrid-regex.
|
|
||||||
");
|
|
||||||
let arg = RGArg::switch("auto-hybrid-regex")
|
|
||||||
.help(SHORT).long_help(LONG)
|
|
||||||
.overrides("no-auto-hybrid-regex")
|
|
||||||
.overrides("pcre2")
|
|
||||||
.overrides("no-pcre2");
|
|
||||||
args.push(arg);
|
|
||||||
|
|
||||||
let arg = RGArg::switch("no-auto-hybrid-regex")
|
|
||||||
.hidden()
|
|
||||||
.overrides("auto-hybrid-regex")
|
|
||||||
.overrides("pcre2")
|
|
||||||
.overrides("no-pcre2");
|
|
||||||
args.push(arg);
|
|
||||||
}
|
|
||||||
|
|
||||||
fn flag_before_context(args: &mut Vec<RGArg>) {
|
fn flag_before_context(args: &mut Vec<RGArg>) {
|
||||||
const SHORT: &str = "Show NUM lines before each match.";
|
const SHORT: &str = "Show NUM lines before each match.";
|
||||||
const LONG: &str = long!("\
|
const LONG: &str = long!("\
|
||||||
@@ -742,55 +691,6 @@ This overrides the --context flag.
|
|||||||
args.push(arg);
|
args.push(arg);
|
||||||
}
|
}
|
||||||
|
|
||||||
fn flag_binary(args: &mut Vec<RGArg>) {
|
|
||||||
const SHORT: &str = "Search binary files.";
|
|
||||||
const LONG: &str = long!("\
|
|
||||||
Enabling this flag will cause ripgrep to search binary files. By default,
|
|
||||||
ripgrep attempts to automatically skip binary files in order to improve the
|
|
||||||
relevance of results and make the search faster.
|
|
||||||
|
|
||||||
Binary files are heuristically detected based on whether they contain a NUL
|
|
||||||
byte or not. By default (without this flag set), once a NUL byte is seen,
|
|
||||||
ripgrep will stop searching the file. Usually, NUL bytes occur in the beginning
|
|
||||||
of most binary files. If a NUL byte occurs after a match, then ripgrep will
|
|
||||||
still stop searching the rest of the file, but a warning will be printed.
|
|
||||||
|
|
||||||
In contrast, when this flag is provided, ripgrep will continue searching a file
|
|
||||||
even if a NUL byte is found. In particular, if a NUL byte is found then ripgrep
|
|
||||||
will continue searching until either a match is found or the end of the file is
|
|
||||||
reached, whichever comes sooner. If a match is found, then ripgrep will stop
|
|
||||||
and print a warning saying that the search stopped prematurely.
|
|
||||||
|
|
||||||
If you want ripgrep to search a file without any special NUL byte handling at
|
|
||||||
all (and potentially print binary data to stdout), then you should use the
|
|
||||||
'-a/--text' flag.
|
|
||||||
|
|
||||||
The '--binary' flag is a flag for controlling ripgrep's automatic filtering
|
|
||||||
mechanism. As such, it does not need to be used when searching a file
|
|
||||||
explicitly or when searching stdin. That is, it is only applicable when
|
|
||||||
recursively searching a directory.
|
|
||||||
|
|
||||||
Note that when the '-u/--unrestricted' flag is provided for a third time, then
|
|
||||||
this flag is automatically enabled.
|
|
||||||
|
|
||||||
This flag can be disabled with '--no-binary'. It overrides the '-a/--text'
|
|
||||||
flag.
|
|
||||||
");
|
|
||||||
let arg = RGArg::switch("binary")
|
|
||||||
.help(SHORT).long_help(LONG)
|
|
||||||
.overrides("no-binary")
|
|
||||||
.overrides("text")
|
|
||||||
.overrides("no-text");
|
|
||||||
args.push(arg);
|
|
||||||
|
|
||||||
let arg = RGArg::switch("no-binary")
|
|
||||||
.hidden()
|
|
||||||
.overrides("binary")
|
|
||||||
.overrides("text")
|
|
||||||
.overrides("no-text");
|
|
||||||
args.push(arg);
|
|
||||||
}
|
|
||||||
|
|
||||||
fn flag_block_buffered(args: &mut Vec<RGArg>) {
|
fn flag_block_buffered(args: &mut Vec<RGArg>) {
|
||||||
const SHORT: &str = "Force block buffering.";
|
const SHORT: &str = "Force block buffering.";
|
||||||
const LONG: &str = long!("\
|
const LONG: &str = long!("\
|
||||||
@@ -1084,9 +984,7 @@ Specify the text encoding that ripgrep will use on all files searched. The
|
|||||||
default value is 'auto', which will cause ripgrep to do a best effort automatic
|
default value is 'auto', which will cause ripgrep to do a best effort automatic
|
||||||
detection of encoding on a per-file basis. Automatic detection in this case
|
detection of encoding on a per-file basis. Automatic detection in this case
|
||||||
only applies to files that begin with a UTF-8 or UTF-16 byte-order mark (BOM).
|
only applies to files that begin with a UTF-8 or UTF-16 byte-order mark (BOM).
|
||||||
No other automatic detection is performed. One can also specify 'none' which
|
No other automatic detection is performend.
|
||||||
will then completely disable BOM sniffing and always result in searching the
|
|
||||||
raw bytes, including a BOM if it's present, regardless of its encoding.
|
|
||||||
|
|
||||||
Other supported values can be found in the list of labels here:
|
Other supported values can be found in the list of labels here:
|
||||||
https://encoding.spec.whatwg.org/#concept-encoding-get
|
https://encoding.spec.whatwg.org/#concept-encoding-get
|
||||||
@@ -1490,30 +1388,6 @@ When this flag is omitted or is set to 0, then it has no effect.
|
|||||||
args.push(arg);
|
args.push(arg);
|
||||||
}
|
}
|
||||||
|
|
||||||
fn flag_max_columns_preview(args: &mut Vec<RGArg>) {
|
|
||||||
const SHORT: &str = "Print a preview for lines exceeding the limit.";
|
|
||||||
const LONG: &str = long!("\
|
|
||||||
When the '--max-columns' flag is used, ripgrep will by default completely
|
|
||||||
replace any line that is too long with a message indicating that a matching
|
|
||||||
line was removed. When this flag is combined with '--max-columns', a preview
|
|
||||||
of the line (corresponding to the limit size) is shown instead, where the part
|
|
||||||
of the line exceeding the limit is not shown.
|
|
||||||
|
|
||||||
If the '--max-columns' flag is not set, then this has no effect.
|
|
||||||
|
|
||||||
This flag can be disabled with '--no-max-columns-preview'.
|
|
||||||
");
|
|
||||||
let arg = RGArg::switch("max-columns-preview")
|
|
||||||
.help(SHORT).long_help(LONG)
|
|
||||||
.overrides("no-max-columns-preview");
|
|
||||||
args.push(arg);
|
|
||||||
|
|
||||||
let arg = RGArg::switch("no-max-columns-preview")
|
|
||||||
.hidden()
|
|
||||||
.overrides("max-columns-preview");
|
|
||||||
args.push(arg);
|
|
||||||
}
|
|
||||||
|
|
||||||
fn flag_max_count(args: &mut Vec<RGArg>) {
|
fn flag_max_count(args: &mut Vec<RGArg>) {
|
||||||
const SHORT: &str = "Limit the number of matches.";
|
const SHORT: &str = "Limit the number of matches.";
|
||||||
const LONG: &str = long!("\
|
const LONG: &str = long!("\
|
||||||
@@ -1983,28 +1857,12 @@ This flag can be disabled with --no-pcre2.
|
|||||||
");
|
");
|
||||||
let arg = RGArg::switch("pcre2").short("P")
|
let arg = RGArg::switch("pcre2").short("P")
|
||||||
.help(SHORT).long_help(LONG)
|
.help(SHORT).long_help(LONG)
|
||||||
.overrides("no-pcre2")
|
.overrides("no-pcre2");
|
||||||
.overrides("auto-hybrid-regex")
|
|
||||||
.overrides("no-auto-hybrid-regex");
|
|
||||||
args.push(arg);
|
args.push(arg);
|
||||||
|
|
||||||
let arg = RGArg::switch("no-pcre2")
|
let arg = RGArg::switch("no-pcre2")
|
||||||
.hidden()
|
.hidden()
|
||||||
.overrides("pcre2")
|
.overrides("pcre2");
|
||||||
.overrides("auto-hybrid-regex")
|
|
||||||
.overrides("no-auto-hybrid-regex");
|
|
||||||
args.push(arg);
|
|
||||||
}
|
|
||||||
|
|
||||||
fn flag_pcre2_version(args: &mut Vec<RGArg>) {
|
|
||||||
const SHORT: &str = "Print the version of PCRE2 that ripgrep uses.";
|
|
||||||
const LONG: &str = long!("\
|
|
||||||
When this flag is present, ripgrep will print the version of PCRE2 in use,
|
|
||||||
along with other information, and then exit. If PCRE2 is not available, then
|
|
||||||
ripgrep will print an error message and exit with an error code.
|
|
||||||
");
|
|
||||||
let arg = RGArg::switch("pcre2-version")
|
|
||||||
.help(SHORT).long_help(LONG);
|
|
||||||
args.push(arg);
|
args.push(arg);
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -2014,13 +1872,12 @@ fn flag_pre(args: &mut Vec<RGArg>) {
|
|||||||
For each input FILE, search the standard output of COMMAND FILE rather than the
|
For each input FILE, search the standard output of COMMAND FILE rather than the
|
||||||
contents of FILE. This option expects the COMMAND program to either be an
|
contents of FILE. This option expects the COMMAND program to either be an
|
||||||
absolute path or to be available in your PATH. Either an empty string COMMAND
|
absolute path or to be available in your PATH. Either an empty string COMMAND
|
||||||
or the '--no-pre' flag will disable this behavior.
|
or the `--no-pre` flag will disable this behavior.
|
||||||
|
|
||||||
WARNING: When this flag is set, ripgrep will unconditionally spawn a
|
WARNING: When this flag is set, ripgrep will unconditionally spawn a
|
||||||
process for every file that is searched. Therefore, this can incur an
|
process for every file that is searched. Therefore, this can incur an
|
||||||
unnecessarily large performance penalty if you don't otherwise need the
|
unnecessarily large performance penalty if you don't otherwise need the
|
||||||
flexibility offered by this flag. One possible mitigation to this is to use
|
flexibility offered by this flag.
|
||||||
the '--pre-glob' flag to limit which files a preprocessor is run with.
|
|
||||||
|
|
||||||
A preprocessor is not run when ripgrep is searching stdin.
|
A preprocessor is not run when ripgrep is searching stdin.
|
||||||
|
|
||||||
@@ -2349,23 +2206,20 @@ escape codes to be printed that alter the behavior of your terminal.
|
|||||||
When binary file detection is enabled it is imperfect. In general, it uses
|
When binary file detection is enabled it is imperfect. In general, it uses
|
||||||
a simple heuristic. If a NUL byte is seen during search, then the file is
|
a simple heuristic. If a NUL byte is seen during search, then the file is
|
||||||
considered binary and search stops (unless this flag is present).
|
considered binary and search stops (unless this flag is present).
|
||||||
Alternatively, if the '--binary' flag is used, then ripgrep will only quit
|
|
||||||
when it sees a NUL byte after it sees a match (or searches the entire file).
|
|
||||||
|
|
||||||
This flag can be disabled with '--no-text'. It overrides the '--binary' flag.
|
Note that when the `-u/--unrestricted` flag is provided for a third time, then
|
||||||
|
this flag is automatically enabled.
|
||||||
|
|
||||||
|
This flag can be disabled with --no-text.
|
||||||
");
|
");
|
||||||
let arg = RGArg::switch("text").short("a")
|
let arg = RGArg::switch("text").short("a")
|
||||||
.help(SHORT).long_help(LONG)
|
.help(SHORT).long_help(LONG)
|
||||||
.overrides("no-text")
|
.overrides("no-text");
|
||||||
.overrides("binary")
|
|
||||||
.overrides("no-binary");
|
|
||||||
args.push(arg);
|
args.push(arg);
|
||||||
|
|
||||||
let arg = RGArg::switch("no-text")
|
let arg = RGArg::switch("no-text")
|
||||||
.hidden()
|
.hidden()
|
||||||
.overrides("text")
|
.overrides("text");
|
||||||
.overrides("binary")
|
|
||||||
.overrides("no-binary");
|
|
||||||
args.push(arg);
|
args.push(arg);
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -2494,7 +2348,8 @@ Reduce the level of \"smart\" searching. A single -u won't respect .gitignore
|
|||||||
(etc.) files. Two -u flags will additionally search hidden files and
|
(etc.) files. Two -u flags will additionally search hidden files and
|
||||||
directories. Three -u flags will additionally search binary files.
|
directories. Three -u flags will additionally search binary files.
|
||||||
|
|
||||||
'rg -uuu' is roughly equivalent to 'grep -r'.
|
-uu is roughly equivalent to grep -r and -uuu is roughly equivalent to grep -a
|
||||||
|
-r.
|
||||||
");
|
");
|
||||||
let arg = RGArg::switch("unrestricted").short("u")
|
let arg = RGArg::switch("unrestricted").short("u")
|
||||||
.help(SHORT).long_help(LONG)
|
.help(SHORT).long_help(LONG)
|
||||||
@@ -2536,7 +2391,7 @@ ripgrep is explicitly instructed to search one file or stdin.
|
|||||||
|
|
||||||
This flag overrides --with-filename.
|
This flag overrides --with-filename.
|
||||||
");
|
");
|
||||||
let arg = RGArg::switch("no-filename").short("I")
|
let arg = RGArg::switch("no-filename")
|
||||||
.help(NO_SHORT).long_help(NO_LONG)
|
.help(NO_SHORT).long_help(NO_LONG)
|
||||||
.overrides("with-filename");
|
.overrides("with-filename");
|
||||||
args.push(arg);
|
args.push(arg);
|
||||||
|
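Note on the `--binary` and `-a/--text` help text above: both describe a heuristic keyed on the first NUL byte. The sketch below illustrates only the two simplest cases from that text: the default, where searching stops at the first NUL byte, and `-a/--text`, where NUL bytes get no special handling. `BinaryMode`, `first_nul_offset` and `searchable_region` are hypothetical names. The real searcher works on buffers incrementally rather than on a whole slice, and the intermediate `--binary` behavior is not modeled here.

```rust
/// Hypothetical, simplified binary-handling modes.
#[derive(Clone, Copy)]
enum BinaryMode {
    /// Treat everything as text (roughly `-a/--text`).
    None,
    /// Stop at the first NUL byte (the default described in the help text).
    Quit,
}

/// Returns the offset of the first NUL byte, if any.
fn first_nul_offset(haystack: &[u8]) -> Option<usize> {
    haystack.iter().position(|&b| b == 0)
}

/// Returns the prefix of `haystack` that would be searched under `mode`.
fn searchable_region(haystack: &[u8], mode: BinaryMode) -> &[u8] {
    match mode {
        BinaryMode::None => haystack,
        BinaryMode::Quit => match first_nul_offset(haystack) {
            Some(offset) => &haystack[..offset],
            None => haystack,
        },
    }
}
```

A file whose first byte is NUL yields an empty region in `Quit` mode, which is why most binary files are skipped almost immediately.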
174
src/args.rs
@@ -73,8 +73,6 @@ pub enum Command {
|
|||||||
/// List all file type definitions configured, including the default file
|
/// List all file type definitions configured, including the default file
|
||||||
/// types and any additional file types added to the command line.
|
/// types and any additional file types added to the command line.
|
||||||
Types,
|
Types,
|
||||||
/// Print the version of PCRE2 in use.
|
|
||||||
PCRE2Version,
|
|
||||||
}
|
}
|
||||||
|
|
||||||
impl Command {
|
impl Command {
|
||||||
@@ -84,11 +82,7 @@ impl Command {
|
|||||||
|
|
||||||
match *self {
|
match *self {
|
||||||
Search | SearchParallel => true,
|
Search | SearchParallel => true,
|
||||||
| SearchNever
|
SearchNever | Files | FilesParallel | Types => false,
|
||||||
| Files
|
|
||||||
| FilesParallel
|
|
||||||
| Types
|
|
||||||
| PCRE2Version => false,
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
@@ -241,9 +235,7 @@ impl Args {
|
|||||||
let threads = self.matches().threads()?;
|
let threads = self.matches().threads()?;
|
||||||
let one_thread = is_one_search || threads == 1;
|
let one_thread = is_one_search || threads == 1;
|
||||||
|
|
||||||
Ok(if self.matches().is_present("pcre2-version") {
|
Ok(if self.matches().is_present("type-list") {
|
||||||
Command::PCRE2Version
|
|
||||||
} else if self.matches().is_present("type-list") {
|
|
||||||
Command::Types
|
Command::Types
|
||||||
} else if self.matches().is_present("files") {
|
} else if self.matches().is_present("files") {
|
||||||
if one_thread {
|
if one_thread {
|
||||||
@@ -294,18 +286,15 @@ impl Args {
|
|||||||
&self,
|
&self,
|
||||||
wtr: W,
|
wtr: W,
|
||||||
) -> Result<SearchWorker<W>> {
|
) -> Result<SearchWorker<W>> {
|
||||||
let matches = self.matches();
|
|
||||||
let matcher = self.matcher().clone();
|
let matcher = self.matcher().clone();
|
||||||
let printer = self.printer(wtr)?;
|
let printer = self.printer(wtr)?;
|
||||||
let searcher = matches.searcher(self.paths())?;
|
let searcher = self.matches().searcher(self.paths())?;
|
||||||
let mut builder = SearchWorkerBuilder::new();
|
let mut builder = SearchWorkerBuilder::new();
|
||||||
builder
|
builder
|
||||||
.json_stats(matches.is_present("json"))
|
.json_stats(self.matches().is_present("json"))
|
||||||
.preprocessor(matches.preprocessor())
|
.preprocessor(self.matches().preprocessor())
|
||||||
.preprocessor_globs(matches.preprocessor_globs()?)
|
.preprocessor_globs(self.matches().preprocessor_globs()?)
|
||||||
.search_zip(matches.is_present("search-zip"))
|
.search_zip(self.matches().is_present("search-zip"));
|
||||||
.binary_detection_implicit(matches.binary_detection_implicit())
|
|
||||||
.binary_detection_explicit(matches.binary_detection_explicit());
|
|
||||||
Ok(builder.build(matcher, searcher, printer))
|
Ok(builder.build(matcher, searcher, printer))
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -494,37 +483,6 @@ impl SortByKind {
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Encoding mode the searcher will use.
|
|
||||||
#[derive(Clone, Debug)]
|
|
||||||
enum EncodingMode {
|
|
||||||
/// Use an explicit encoding forcefully, but let BOM sniffing override it.
|
|
||||||
Some(Encoding),
|
|
||||||
/// Use only BOM sniffing to auto-detect an encoding.
|
|
||||||
Auto,
|
|
||||||
/// Use no explicit encoding and disable all BOM sniffing. This will
|
|
||||||
/// always result in searching the raw bytes, regardless of their
|
|
||||||
/// true encoding.
|
|
||||||
Disabled,
|
|
||||||
}
|
|
||||||
|
|
||||||
impl EncodingMode {
|
|
||||||
/// Checks if an explicit encoding has been set. Returns false for
|
|
||||||
/// automatic BOM sniffing and no sniffing.
|
|
||||||
///
|
|
||||||
/// This is only used to determine whether PCRE2 needs to have its own
|
|
||||||
/// UTF-8 checking enabled. If we have an explicit encoding set, then
|
|
||||||
/// we're always guaranteed to get UTF-8, so we can disable PCRE2's check.
|
|
||||||
/// Otherwise, we have no such guarantee, and must enable PCRE2' UTF-8
|
|
||||||
/// check.
|
|
||||||
#[cfg(feature = "pcre2")]
|
|
||||||
fn has_explicit_encoding(&self) -> bool {
|
|
||||||
match self {
|
|
||||||
EncodingMode::Some(_) => true,
|
|
||||||
_ => false
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
impl ArgMatches {
|
impl ArgMatches {
|
||||||
/// Create an ArgMatches from clap's parse result.
|
/// Create an ArgMatches from clap's parse result.
|
||||||
fn new(clap_matches: clap::ArgMatches<'static>) -> ArgMatches {
|
fn new(clap_matches: clap::ArgMatches<'static>) -> ArgMatches {
|
||||||
@@ -599,25 +557,6 @@ impl ArgMatches {
|
|||||||
if self.is_present("pcre2") {
|
if self.is_present("pcre2") {
|
||||||
let matcher = self.matcher_pcre2(patterns)?;
|
let matcher = self.matcher_pcre2(patterns)?;
|
||||||
Ok(PatternMatcher::PCRE2(matcher))
|
Ok(PatternMatcher::PCRE2(matcher))
|
||||||
} else if self.is_present("auto-hybrid-regex") {
|
|
||||||
let rust_err = match self.matcher_rust(patterns) {
|
|
||||||
Ok(matcher) => return Ok(PatternMatcher::RustRegex(matcher)),
|
|
||||||
Err(err) => err,
|
|
||||||
};
|
|
||||||
log::debug!(
|
|
||||||
"error building Rust regex in hybrid mode:\n{}", rust_err,
|
|
||||||
);
|
|
||||||
let pcre_err = match self.matcher_pcre2(patterns) {
|
|
||||||
Ok(matcher) => return Ok(PatternMatcher::PCRE2(matcher)),
|
|
||||||
Err(err) => err,
|
|
||||||
};
|
|
||||||
Err(From::from(format!(
|
|
||||||
"regex could not be compiled with either the default regex \
|
|
||||||
engine or with PCRE2.\n\n\
|
|
||||||
default regex engine error:\n{}\n{}\n{}\n\n\
|
|
||||||
PCRE2 regex engine error:\n{}",
|
|
||||||
"~".repeat(79), rust_err, "~".repeat(79), pcre_err,
|
|
||||||
)))
|
|
||||||
} else {
|
} else {
|
||||||
let matcher = match self.matcher_rust(patterns) {
|
let matcher = match self.matcher_rust(patterns) {
|
||||||
Ok(matcher) => matcher,
|
Ok(matcher) => matcher,
|
||||||
@@ -686,13 +625,7 @@ impl ArgMatches {
|
|||||||
if let Some(limit) = self.dfa_size_limit()? {
|
if let Some(limit) = self.dfa_size_limit()? {
|
||||||
builder.dfa_size_limit(limit);
|
builder.dfa_size_limit(limit);
|
||||||
}
|
}
|
||||||
let res =
|
match builder.build(&patterns.join("|")) {
|
||||||
if self.is_present("fixed-strings") {
|
|
||||||
builder.build_literals(patterns)
|
|
||||||
} else {
|
|
||||||
builder.build(&patterns.join("|"))
|
|
||||||
};
|
|
||||||
match res {
|
|
||||||
Ok(m) => Ok(m),
|
Ok(m) => Ok(m),
|
||||||
Err(err) => Err(From::from(suggest_multiline(err.to_string()))),
|
Err(err) => Err(From::from(suggest_multiline(err.to_string()))),
|
||||||
}
|
}
|
||||||
@@ -712,17 +645,12 @@ impl ArgMatches {
|
|||||||
.word(self.is_present("word-regexp"));
|
.word(self.is_present("word-regexp"));
|
||||||
// For whatever reason, the JIT craps out during regex compilation with
|
// For whatever reason, the JIT craps out during regex compilation with
|
||||||
// a "no more memory" error on 32 bit systems. So don't use it there.
|
// a "no more memory" error on 32 bit systems. So don't use it there.
|
||||||
if cfg!(target_pointer_width = "64") {
|
if !cfg!(target_pointer_width = "32") {
|
||||||
builder
|
builder.jit_if_available(true);
|
||||||
.jit_if_available(true)
|
|
||||||
// The PCRE2 docs say that 32KB is the default, and that 1MB
|
|
||||||
// should be big enough for anything. But let's crank it to
|
|
||||||
// 10MB.
|
|
||||||
.max_jit_stack_size(Some(10 * (1<<20)));
|
|
||||||
}
|
}
|
||||||
if self.pcre2_unicode() {
|
if self.pcre2_unicode() {
|
||||||
builder.utf(true).ucp(true);
|
builder.utf(true).ucp(true);
|
||||||
if self.encoding()?.has_explicit_encoding() {
|
if self.encoding()?.is_some() {
|
||||||
// SAFETY: If an encoding was specified, then we're guaranteed
|
// SAFETY: If an encoding was specified, then we're guaranteed
|
||||||
// to get valid UTF-8, so we can disable PCRE2's UTF checking.
|
// to get valid UTF-8, so we can disable PCRE2's UTF checking.
|
||||||
// (Feeding invalid UTF-8 to PCRE2 is undefined behavior.)
|
// (Feeding invalid UTF-8 to PCRE2 is undefined behavior.)
|
||||||
@@ -778,7 +706,6 @@ impl ArgMatches {
|
|||||||
.per_match(self.is_present("vimgrep"))
|
.per_match(self.is_present("vimgrep"))
|
||||||
.replacement(self.replacement())
|
.replacement(self.replacement())
|
||||||
.max_columns(self.max_columns()?)
|
.max_columns(self.max_columns()?)
|
||||||
.max_columns_preview(self.max_columns_preview())
|
|
||||||
.max_matches(self.max_count()?)
|
.max_matches(self.max_count()?)
|
||||||
.column(self.column())
|
.column(self.column())
|
||||||
.byte_offset(self.is_present("byte-offset"))
|
.byte_offset(self.is_present("byte-offset"))
|
||||||
@@ -838,16 +765,9 @@ impl ArgMatches {
|
|||||||
.before_context(ctx_before)
|
.before_context(ctx_before)
|
||||||
.after_context(ctx_after)
|
.after_context(ctx_after)
|
||||||
.passthru(self.is_present("passthru"))
|
.passthru(self.is_present("passthru"))
|
||||||
.memory_map(self.mmap_choice(paths));
|
.memory_map(self.mmap_choice(paths))
|
||||||
match self.encoding()? {
|
.binary_detection(self.binary_detection())
|
||||||
EncodingMode::Some(enc) => {
|
.encoding(self.encoding()?);
|
||||||
builder.encoding(Some(enc));
|
|
||||||
}
|
|
||||||
EncodingMode::Auto => {} // default for the searcher
|
|
||||||
EncodingMode::Disabled => {
|
|
||||||
builder.bom_sniffing(false);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
Ok(builder.build())
|
Ok(builder.build())
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -897,42 +817,19 @@ impl ArgMatches {
|
|||||||
///
|
///
|
||||||
/// Methods are sorted alphabetically.
|
/// Methods are sorted alphabetically.
|
||||||
impl ArgMatches {
|
impl ArgMatches {
|
||||||
/// Returns the form of binary detection to perform on files that are
|
/// Returns the form of binary detection to perform.
|
||||||
/// implicitly searched via recursive directory traversal.
|
fn binary_detection(&self) -> BinaryDetection {
|
||||||
fn binary_detection_implicit(&self) -> BinaryDetection {
|
|
||||||
let none =
|
let none =
|
||||||
self.is_present("text")
|
self.is_present("text")
|
||||||
|
|| self.unrestricted_count() >= 3
|
||||||
|| self.is_present("null-data");
|
|| self.is_present("null-data");
|
||||||
let convert =
|
|
||||||
self.is_present("binary")
|
|
||||||
|| self.unrestricted_count() >= 3;
|
|
||||||
if none {
|
if none {
|
||||||
BinaryDetection::none()
|
BinaryDetection::none()
|
||||||
} else if convert {
|
|
||||||
BinaryDetection::convert(b'\x00')
|
|
||||||
} else {
|
} else {
|
||||||
BinaryDetection::quit(b'\x00')
|
BinaryDetection::quit(b'\x00')
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Returns the form of binary detection to perform on files that are
|
|
||||||
/// explicitly searched via the user invoking ripgrep on a particular
|
|
||||||
/// file or files or stdin.
|
|
||||||
///
|
|
||||||
/// In general, this should never be BinaryDetection::quit, since that acts
|
|
||||||
/// as a filter (but quitting immediately once a NUL byte is seen), and we
|
|
||||||
/// should never filter out files that the user wants to explicitly search.
|
|
||||||
fn binary_detection_explicit(&self) -> BinaryDetection {
|
|
||||||
let none =
|
|
||||||
self.is_present("text")
|
|
||||||
|| self.is_present("null-data");
|
|
||||||
if none {
|
|
||||||
BinaryDetection::none()
|
|
||||||
} else {
|
|
||||||
BinaryDetection::convert(b'\x00')
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
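The two removed methods above choose a binary detection mode from the CLI flags, once for implicitly searched files and once for explicitly named ones. A minimal sketch of that decision table follows. It assumes a hypothetical `Flags` struct and a stand-in `Detection` enum; neither is ripgrep's real type, and the real code returns `BinaryDetection` values from the `grep` crate.

```rust
/// Stand-in for the three binary detection behaviors (hypothetical type,
/// not the real `BinaryDetection`).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Detection {
    /// Perform no binary detection at all.
    None,
    /// Stop searching once the given byte is seen.
    Quit(u8),
    /// Keep searching, treating the given byte as a line terminator.
    Convert(u8),
}

/// Hypothetical bundle of the flags consulted by the removed methods.
struct Flags {
    text: bool,
    null_data: bool,
    binary: bool,
    unrestricted_count: u32,
}

/// Mode for files found via recursive directory traversal.
fn detection_implicit(f: &Flags) -> Detection {
    if f.text || f.null_data {
        Detection::None
    } else if f.binary || f.unrestricted_count >= 3 {
        Detection::Convert(b'\x00')
    } else {
        Detection::Quit(b'\x00')
    }
}

/// Mode for files (or stdin) named explicitly by the user. This never
/// yields `Quit`, so explicit files are never filtered out.
fn detection_explicit(f: &Flags) -> Detection {
    if f.text || f.null_data {
        Detection::None
    } else {
        Detection::Convert(b'\x00')
    }
}
```

With all flags unset, the sketch yields `Quit` for implicit files and `Convert` for explicit ones, which mirrors the split that the removed `binary_detection_implicit` and `binary_detection_explicit` methods express.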
||||||
/// Returns true if the command line configuration implies that a match
|
/// Returns true if the command line configuration implies that a match
|
||||||
/// can never be shown.
|
/// can never be shown.
|
||||||
fn can_never_match(&self, patterns: &[String]) -> bool {
|
fn can_never_match(&self, patterns: &[String]) -> bool {
|
||||||
@@ -1055,30 +952,24 @@ impl ArgMatches {
|
|||||||
u64_to_usize("dfa-size-limit", r)
|
u64_to_usize("dfa-size-limit", r)
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Returns the encoding mode to use.
|
/// Returns the type of encoding to use.
|
||||||
///
|
///
|
||||||
/// This only returns an encoding if one is explicitly specified. Otherwise
|
/// This only returns an encoding if one is explicitly specified. When no
|
||||||
/// if set to automatic, the Searcher will do BOM sniffing for UTF-16
|
/// encoding is present, the Searcher will still do BOM sniffing for UTF-16
|
||||||
/// and transcode seamlessly. If disabled, no BOM sniffing nor transcoding
|
/// and transcode seamlessly.
|
||||||
/// will occur.
|
fn encoding(&self) -> Result<Option<Encoding>> {
|
||||||
fn encoding(&self) -> Result<EncodingMode> {
|
|
||||||
if self.is_present("no-encoding") {
|
if self.is_present("no-encoding") {
|
||||||
return Ok(EncodingMode::Auto);
|
return Ok(None);
|
||||||
}
|
}
|
||||||
|
|
||||||
let label = match self.value_of_lossy("encoding") {
|
let label = match self.value_of_lossy("encoding") {
|
||||||
None if self.pcre2_unicode() => "utf-8".to_string(),
|
None if self.pcre2_unicode() => "utf-8".to_string(),
|
||||||
None => return Ok(EncodingMode::Auto),
|
None => return Ok(None),
|
||||||
Some(label) => label,
|
Some(label) => label,
|
||||||
};
|
};
|
||||||
|
|
||||||
if label == "auto" {
|
if label == "auto" {
|
||||||
return Ok(EncodingMode::Auto);
|
return Ok(None);
|
||||||
} else if label == "none" {
|
|
||||||
return Ok(EncodingMode::Disabled);
|
|
||||||
}
|
}
|
||||||
|
Ok(Some(Encoding::new(&label)?))
|
||||||
Ok(EncodingMode::Some(Encoding::new(&label)?))
|
|
||||||
}
|
}
|
||||||
|
|
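The removed side of the `encoding` method above maps the `--encoding` label onto a three-state mode rather than an `Option`. The following is a rough sketch of that mapping, under stated assumptions: `EncodingMode::Some` holds a plain `String` here instead of a real `Encoding`, label validation is omitted, and the `no_encoding` and `pcre2_unicode` parameters are hypothetical stand-ins for the corresponding CLI checks.

```rust
/// Simplified stand-in for the removed enum. The real `Some` variant wraps
/// an `Encoding`; this sketch keeps only the label.
#[derive(Clone, Debug, PartialEq)]
enum EncodingMode {
    /// Use an explicit encoding, but let BOM sniffing override it.
    Some(String),
    /// Use only BOM sniffing to auto-detect an encoding.
    Auto,
    /// Search raw bytes with no BOM sniffing and no transcoding.
    Disabled,
}

/// Resolve the mode from an optional label (hypothetical free function;
/// the original is a method on `ArgMatches`).
fn encoding_mode(
    label: Option<&str>,
    no_encoding: bool,
    pcre2_unicode: bool,
) -> EncodingMode {
    if no_encoding {
        return EncodingMode::Auto;
    }
    let label = match label {
        // With PCRE2 in Unicode mode, default to an explicit UTF-8 label.
        None if pcre2_unicode => "utf-8",
        None => return EncodingMode::Auto,
        Some(label) => label,
    };
    match label {
        "auto" => EncodingMode::Auto,
        "none" => EncodingMode::Disabled,
        other => EncodingMode::Some(other.to_string()),
    }
}
```

The extra `Disabled` state is what distinguishes `--encoding none` from `--encoding auto`: the former turns BOM sniffing off entirely, while the latter leaves the searcher's default in place.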
||||||
/// Return the file separator to use based on the CLI configuration.
|
/// Return the file separator to use based on the CLI configuration.
|
||||||
@@ -1175,12 +1066,6 @@ impl ArgMatches {
|
|||||||
Ok(self.usize_of_nonzero("max-columns")?.map(|n| n as u64))
|
Ok(self.usize_of_nonzero("max-columns")?.map(|n| n as u64))
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Returns true if and only if a preview should be shown for lines that
|
|
||||||
/// exceed the maximum column limit.
|
|
||||||
fn max_columns_preview(&self) -> bool {
|
|
||||||
self.is_present("max-columns-preview")
|
|
||||||
}
|
|
||||||
|
|
||||||
/// The maximum number of matches permitted.
|
/// The maximum number of matches permitted.
|
||||||
fn max_count(&self) -> Result<Option<u64>> {
|
fn max_count(&self) -> Result<Option<u64>> {
|
||||||
Ok(self.usize_of("max-count")?.map(|n| n as u64))
|
Ok(self.usize_of("max-count")?.map(|n| n as u64))
|
||||||
@@ -1310,8 +1195,7 @@ impl ArgMatches {
|
|||||||
!cli::is_readable_stdin()
|
!cli::is_readable_stdin()
|
||||||
|| (self.is_present("file") && file_is_stdin)
|
|| (self.is_present("file") && file_is_stdin)
|
||||||
|| self.is_present("files")
|
|| self.is_present("files")
|
||||||
|| self.is_present("type-list")
|
|| self.is_present("type-list");
|
||||||
|| self.is_present("pcre2-version");
|
|
||||||
if search_cwd {
|
if search_cwd {
|
||||||
Path::new("./").to_path_buf()
|
Path::new("./").to_path_buf()
|
||||||
} else {
|
} else {
|
||||||
@@ -1770,12 +1654,12 @@ where I: IntoIterator<Item=T>,
|
|||||||
if err.use_stderr() {
|
if err.use_stderr() {
|
||||||
return Err(err.into());
|
return Err(err.into());
|
||||||
}
|
}
|
||||||
// Explicitly ignore any error returned by write!. The most likely error
|
// Explicitly ignore any error returned by writeln!. The most likely error
|
||||||
// at this point is a broken pipe error, in which case, we want to ignore
|
// at this point is a broken pipe error, in which case, we want to ignore
|
||||||
// it and exit quietly.
|
// it and exit quietly.
|
||||||
//
|
//
|
||||||
// (This is the point of this helper function. clap's functionality for
|
// (This is the point of this helper function. clap's functionality for
|
||||||
// doing this will panic on a broken pipe error.)
|
// doing this will panic on a broken pipe error.)
|
||||||
let _ = write!(io::stdout(), "{}", err);
|
let _ = writeln!(io::stdout(), "{}", err);
|
||||||
process::exit(0);
|
process::exit(0);
|
||||||
}
|
}
|
||||||
|
28
src/main.rs
@@ -39,7 +39,6 @@ fn try_main(args: Args) -> Result<()> {
|
|||||||
Files => files(&args),
|
Files => files(&args),
|
||||||
FilesParallel => files_parallel(&args),
|
FilesParallel => files_parallel(&args),
|
||||||
Types => types(&args),
|
Types => types(&args),
|
||||||
PCRE2Version => pcre2_version(&args),
|
|
||||||
}?;
|
}?;
|
||||||
if matched && (args.quiet() || !messages::errored()) {
|
if matched && (args.quiet() || !messages::errored()) {
|
||||||
process::exit(0)
|
process::exit(0)
|
||||||
@@ -276,30 +275,3 @@ fn types(args: &Args) -> Result<bool> {
|
|||||||
}
|
}
|
||||||
Ok(count > 0)
|
Ok(count > 0)
|
||||||
}
|
}
|
||||||
|
|
||||||
/// The top-level entry point for --pcre2-version.
|
|
||||||
fn pcre2_version(args: &Args) -> Result<bool> {
|
|
||||||
#[cfg(feature = "pcre2")]
|
|
||||||
fn imp(args: &Args) -> Result<bool> {
|
|
||||||
use grep::pcre2;
|
|
||||||
|
|
||||||
let mut stdout = args.stdout();
|
|
||||||
|
|
||||||
let (major, minor) = pcre2::version();
|
|
||||||
writeln!(stdout, "PCRE2 {}.{} is available", major, minor)?;
|
|
||||||
|
|
||||||
if cfg!(target_pointer_width = "64") && pcre2::is_jit_available() {
|
|
||||||
writeln!(stdout, "JIT is available")?;
|
|
||||||
}
|
|
||||||
Ok(true)
|
|
||||||
}
|
|
||||||
|
|
||||||
#[cfg(not(feature = "pcre2"))]
|
|
||||||
fn imp(args: &Args) -> Result<bool> {
|
|
||||||
let mut stdout = args.stdout();
|
|
||||||
writeln!(stdout, "PCRE2 is not available in this build of ripgrep.")?;
|
|
||||||
Ok(false)
|
|
||||||
}
|
|
||||||
|
|
||||||
imp(args)
|
|
||||||
}
|
|
||||||
|
@@ -10,7 +10,7 @@ use grep::matcher::Matcher;
|
|||||||
use grep::pcre2::{RegexMatcher as PCRE2RegexMatcher};
|
use grep::pcre2::{RegexMatcher as PCRE2RegexMatcher};
|
||||||
use grep::printer::{JSON, Standard, Summary, Stats};
|
use grep::printer::{JSON, Standard, Summary, Stats};
|
||||||
use grep::regex::{RegexMatcher as RustRegexMatcher};
|
use grep::regex::{RegexMatcher as RustRegexMatcher};
|
||||||
use grep::searcher::{BinaryDetection, Searcher};
|
use grep::searcher::Searcher;
|
||||||
use ignore::overrides::Override;
|
use ignore::overrides::Override;
|
||||||
use serde_json as json;
|
use serde_json as json;
|
||||||
use serde_json::json;
|
use serde_json::json;
|
||||||
@@ -27,8 +27,6 @@ struct Config {
|
|||||||
preprocessor: Option<PathBuf>,
|
preprocessor: Option<PathBuf>,
|
||||||
preprocessor_globs: Override,
|
preprocessor_globs: Override,
|
||||||
search_zip: bool,
|
search_zip: bool,
|
||||||
binary_implicit: BinaryDetection,
|
|
||||||
binary_explicit: BinaryDetection,
|
|
||||||
}
|
}
|
||||||
|
|
||||||
impl Default for Config {
|
impl Default for Config {
|
||||||
@@ -38,8 +36,6 @@ impl Default for Config {
|
|||||||
preprocessor: None,
|
preprocessor: None,
|
||||||
preprocessor_globs: Override::empty(),
|
preprocessor_globs: Override::empty(),
|
||||||
search_zip: false,
|
search_zip: false,
|
||||||
binary_implicit: BinaryDetection::none(),
|
|
||||||
binary_explicit: BinaryDetection::none(),
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
@@ -138,37 +134,6 @@ impl SearchWorkerBuilder {
|
|||||||
self.config.search_zip = yes;
|
self.config.search_zip = yes;
|
||||||
self
|
self
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Set the binary detection that should be used when searching files
|
|
||||||
/// found via a recursive directory search.
|
|
||||||
///
|
|
||||||
/// Generally, this binary detection may be `BinaryDetection::quit` if
|
|
||||||
/// we want to skip binary files completely.
|
|
||||||
///
|
|
||||||
/// By default, no binary detection is performed.
|
|
||||||
pub fn binary_detection_implicit(
|
|
||||||
&mut self,
|
|
||||||
detection: BinaryDetection,
|
|
||||||
) -> &mut SearchWorkerBuilder {
|
|
||||||
self.config.binary_implicit = detection;
|
|
||||||
self
|
|
||||||
}
|
|
||||||
|
|
||||||
/// Set the binary detection that should be used when searching files
|
|
||||||
/// explicitly supplied by an end user.
|
|
||||||
///
|
|
||||||
/// Generally, this binary detection should NOT be `BinaryDetection::quit`,
|
|
||||||
/// since we never want to automatically filter files supplied by the end
|
|
||||||
/// user.
|
|
||||||
///
|
|
||||||
/// By default, no binary detection is performed.
|
|
||||||
pub fn binary_detection_explicit(
|
|
||||||
&mut self,
|
|
||||||
detection: BinaryDetection,
|
|
||||||
) -> &mut SearchWorkerBuilder {
|
|
||||||
self.config.binary_explicit = detection;
|
|
||||||
self
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
|
|
||||||
/// The result of executing a search.
|
/// The result of executing a search.
|
||||||
@@ -343,14 +308,6 @@ impl<W: WriteColor> SearchWorker<W> {
|
|||||||
|
|
||||||
/// Search the given subject using the appropriate strategy.
|
/// Search the given subject using the appropriate strategy.
|
||||||
fn search_impl(&mut self, subject: &Subject) -> io::Result<SearchResult> {
|
fn search_impl(&mut self, subject: &Subject) -> io::Result<SearchResult> {
|
||||||
let bin =
|
|
||||||
if subject.is_explicit() {
|
|
||||||
self.config.binary_explicit.clone()
|
|
||||||
} else {
|
|
||||||
self.config.binary_implicit.clone()
|
|
||||||
};
|
|
||||||
self.searcher.set_binary_detection(bin);
|
|
||||||
|
|
||||||
let path = subject.path();
|
let path = subject.path();
|
||||||
if subject.is_stdin() {
|
if subject.is_stdin() {
|
||||||
let stdin = io::stdin();
|
let stdin = io::stdin();
|
||||||
|
@@ -59,12 +59,17 @@ impl SubjectBuilder {
|
|||||||
if let Some(ignore_err) = subj.dent.error() {
|
if let Some(ignore_err) = subj.dent.error() {
|
||||||
ignore_message!("{}", ignore_err);
|
ignore_message!("{}", ignore_err);
|
||||||
}
|
}
|
||||||
// If this entry was explicitly provided by an end user, then we always
|
// If this entry represents stdin, then we always search it.
|
||||||
// want to search it.
|
if subj.dent.is_stdin() {
|
||||||
if subj.is_explicit() {
|
|
||||||
return Some(subj);
|
return Some(subj);
|
||||||
}
|
}
|
||||||
// At this point, we only want to search something if it's explicitly a
|
// If this subject has a depth of 0, then it was provided explicitly
|
||||||
|
// by an end user (or via a shell glob). In this case, we always want
|
||||||
|
// to search it if it even smells like a file (e.g., a symlink).
|
||||||
|
if subj.dent.depth() == 0 && !subj.is_dir() {
|
||||||
|
return Some(subj);
|
||||||
|
}
|
||||||
|
// At this point, we only want to search something if it's explicitly a
|
||||||
// file. This omits symlinks. (If ripgrep was configured to follow
|
// file. This omits symlinks. (If ripgrep was configured to follow
|
||||||
// symlinks, then they have already been followed by the directory
|
// symlinks, then they have already been followed by the directory
|
||||||
// traversal.)
|
// traversal.)
|
||||||
@@ -122,26 +127,6 @@ impl Subject {
|
|||||||
self.dent.is_stdin()
|
self.dent.is_stdin()
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Returns true if and only if this entry corresponds to a subject to
|
|
||||||
/// search that was explicitly supplied by an end user.
|
|
||||||
///
|
|
||||||
/// Generally, this corresponds to either stdin or an explicit file path
|
|
||||||
/// argument. e.g., in `rg foo some-file ./some-dir/`, `some-file` is
|
|
||||||
/// an explicit subject, but, e.g., `./some-dir/some-other-file` is not.
|
|
||||||
///
|
|
||||||
/// However, note that ripgrep does not see through shell globbing. e.g.,
|
|
||||||
/// in `rg foo ./some-dir/*`, `./some-dir/some-other-file` will be treated
|
|
||||||
/// as an explicit subject.
|
|
||||||
pub fn is_explicit(&self) -> bool {
|
|
||||||
// stdin is obvious. When an entry has a depth of 0, that means it
|
|
||||||
// was explicitly provided to our directory iterator, which means it
|
|
||||||
// was in turn explicitly provided by the end user. The !is_dir check
|
|
||||||
// means that we want to search files even if they're symlinks, again,
|
|
||||||
// because they were explicitly provided. (And we never want to try
|
|
||||||
// to search a directory.)
|
|
||||||
self.is_stdin() || (self.dent.depth() == 0 && !self.is_dir())
|
|
||||||
}
|
|
||||||
|
|
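The removed `is_explicit` method above reduces to one boolean expression over three properties of a directory entry. A small sketch of that rule, assuming a hypothetical `Entry` struct in place of the real `Subject` and its wrapped directory entry:

```rust
/// Hypothetical, flattened view of the properties `is_explicit` consults.
struct Entry {
    /// True if the entry represents stdin.
    is_stdin: bool,
    /// Depth in the directory traversal; 0 means it was passed directly.
    depth: usize,
    /// True if the entry is a directory after following symlinks.
    is_dir: bool,
}

/// An entry is explicit if it is stdin, or if it was given directly on the
/// command line (depth 0) and is not a directory.
fn is_explicit(e: &Entry) -> bool {
    e.is_stdin || (e.depth == 0 && !e.is_dir)
}
```

For `rg foo some-file ./some-dir/`, `some-file` has depth 0 and counts as explicit, while anything discovered inside `./some-dir/` has a depth of at least 1 and does not.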
||||||
/// Returns true if and only if this subject points to a directory after
|
/// Returns true if and only if this subject points to a directory after
|
||||||
/// following symbolic links.
|
/// following symbolic links.
|
||||||
fn is_dir(&self) -> bool {
|
fn is_dir(&self) -> bool {
|
||||||
|
315
tests/binary.rs
@@ -1,315 +0,0 @@
|
|||||||
use crate::util::{Dir, TestCommand};
|
|
||||||
|
|
||||||
// This file contains a smattering of tests specifically for checking ripgrep's
|
|
||||||
// handling of binary files. There's quite a bit of discussion on this in this
|
|
||||||
// bug report: https://github.com/BurntSushi/ripgrep/issues/306
|
|
||||||
|
|
||||||
// Our haystack is the first 500 lines of Gutenberg's copy of "A Study in
|
|
||||||
// Scarlet," with a NUL byte at line 237: `abcdef\x00`.
|
|
||||||
//
|
|
||||||
// The position and size of the haystack are, unfortunately, significant. In
|
|
||||||
// particular, the NUL byte is specifically inserted at some point *after* the
|
|
||||||
// first 8192 bytes, which corresponds to the initial capacity of the buffer
|
|
||||||
// that ripgrep uses to read files. (grep for DEFAULT_BUFFER_CAPACITY.) The
|
|
||||||
// position of the NUL byte ensures that we can execute some search on the
|
|
||||||
// initial buffer contents without ever detecting any binary data. Moreover,
|
|
||||||
// when using a memory map for searching, only the first 8192 bytes are
|
|
||||||
// scanned for a NUL byte, so no binary bytes are detected at all when using
|
|
||||||
// a memory map (unless our query matches line 237).
|
|
||||||
//
|
|
||||||
// One last note: in the tests below, we use --no-mmap heavily because binary
|
|
||||||
// detection with memory maps is a bit different. Namely, NUL bytes are only
|
|
||||||
// searched for in the first few KB of the file and in a match. Normally, NUL
|
|
||||||
// bytes are searched for everywhere.
|
|
||||||
//
|
|
||||||
// TODO: Add tests for binary file detection when using memory maps.
|
|
||||||
const HAY: &'static [u8] = include_bytes!("./data/sherlock-nul.txt");
|
|
||||||
|
|
||||||
// This tests that ripgrep prints a warning message if it finds and prints a
|
|
||||||
// match in a binary file before detecting that it is a binary file. The point
|
|
||||||
// here is to notify the user that the search of the file is only partially
|
|
||||||
// complete.
|
|
||||||
//
|
|
||||||
// This applies to files that are *implicitly* searched via a recursive
|
|
||||||
// directory traversal. In particular, this results in a WARNING message being
|
|
||||||
// printed. We make our file "implicit" by doing a recursive search with a glob
|
|
||||||
// that matches our file.
|
|
||||||
rgtest!(after_match1_implicit, |dir: Dir, mut cmd: TestCommand| {
|
|
||||||
dir.create_bytes("hay", HAY);
|
|
||||||
cmd.args(&[
|
|
||||||
"--no-mmap", "-n", "Project Gutenberg EBook", "-g", "hay",
|
|
||||||
]);
|
|
||||||
|
|
||||||
let expected = "\
|
|
||||||
hay:1:The Project Gutenberg EBook of A Study In Scarlet, by Arthur Conan Doyle
|
|
||||||
WARNING: stopped searching binary file hay after match (found \"\\u{0}\" byte around offset 9741)
|
|
||||||
";
|
|
||||||
eqnice!(expected, cmd.stdout());
|
|
||||||
});
|
|
||||||
|
|
||||||
// Like after_match1_implicit, except we provide a file to search
|
|
||||||
// explicitly. This results in identical behavior, but a different message.
|
|
||||||
rgtest!(after_match1_explicit, |dir: Dir, mut cmd: TestCommand| {
|
|
||||||
dir.create_bytes("hay", HAY);
|
|
||||||
cmd.args(&[
|
|
||||||
"--no-mmap", "-n", "Project Gutenberg EBook", "hay",
|
|
||||||
]);
|
|
||||||
|
|
||||||
let expected = "\
|
|
||||||
1:The Project Gutenberg EBook of A Study In Scarlet, by Arthur Conan Doyle
|
|
||||||
Binary file matches (found \"\\u{0}\" byte around offset 9741)
|
|
||||||
";
|
|
||||||
eqnice!(expected, cmd.stdout());
|
|
||||||
});
|
|
||||||
|
|
||||||
// Like after_match1_explicit, except we feed our content on stdin.
|
|
||||||
rgtest!(after_match1_stdin, |_: Dir, mut cmd: TestCommand| {
|
|
||||||
cmd.args(&[
|
|
||||||
"--no-mmap", "-n", "Project Gutenberg EBook",
|
|
||||||
]);
|
|
||||||
|
|
||||||
let expected = "\
|
|
||||||
1:The Project Gutenberg EBook of A Study In Scarlet, by Arthur Conan Doyle
|
|
||||||
Binary file matches (found \"\\u{0}\" byte around offset 9741)
|
|
||||||
";
|
|
||||||
eqnice!(expected, cmd.pipe(HAY));
|
|
||||||
});
|
|
||||||
|
|
||||||
// Like after_match1_implicit, but provides the --binary flag, which
|
|
||||||
// disables binary filtering. Thus, this matches the behavior of ripgrep as
|
|
||||||
// if the file were given explicitly.
|
|
||||||
rgtest!(after_match1_implicit_binary, |dir: Dir, mut cmd: TestCommand| {
|
|
||||||
dir.create_bytes("hay", HAY);
|
|
||||||
cmd.args(&[
|
|
||||||
"--no-mmap", "-n", "--binary", "Project Gutenberg EBook", "-g", "hay",
|
|
||||||
]);
|
|
||||||
|
|
||||||
let expected = "\
|
|
||||||
hay:1:The Project Gutenberg EBook of A Study In Scarlet, by Arthur Conan Doyle
|
|
||||||
Binary file hay matches (found \"\\u{0}\" byte around offset 9741)
|
|
||||||
";
|
|
||||||
eqnice!(expected, cmd.stdout());
|
|
||||||
});
|
|
||||||
|
|
||||||
// Like after_match1_implicit, but enables -a/--text, so no binary
|
|
||||||
// detection should be performed.
|
|
||||||
rgtest!(after_match1_implicit_text, |dir: Dir, mut cmd: TestCommand| {
|
|
||||||
dir.create_bytes("hay", HAY);
|
|
||||||
cmd.args(&[
|
|
||||||
"--no-mmap", "-n", "--text", "Project Gutenberg EBook", "-g", "hay",
|
|
||||||
]);
|
|
||||||
|
|
||||||
let expected = "\
|
|
||||||
hay:1:The Project Gutenberg EBook of A Study In Scarlet, by Arthur Conan Doyle
|
|
||||||
";
|
|
||||||
eqnice!(expected, cmd.stdout());
|
|
||||||
});
|
|
||||||
|
|
||||||
// Like after_match1_explicit, but enables -a/--text, so no binary
|
|
||||||
// detection should be performed.
|
|
||||||
rgtest!(after_match1_explicit_text, |dir: Dir, mut cmd: TestCommand| {
|
|
||||||
dir.create_bytes("hay", HAY);
|
|
||||||
cmd.args(&[
|
|
||||||
"--no-mmap", "-n", "--text", "Project Gutenberg EBook", "hay",
|
|
||||||
]);
|
|
||||||
|
|
||||||
let expected = "\
|
|
||||||
1:The Project Gutenberg EBook of A Study In Scarlet, by Arthur Conan Doyle
|
|
||||||
";
|
|
||||||
eqnice!(expected, cmd.stdout());
|
|
||||||
});
|
|
||||||
|
|
||||||
// Like after_match1_implicit, except this asks ripgrep to print all matching
|
|
||||||
// files.
|
|
||||||
//
|
|
||||||
// This is an interesting corner case that one might consider a bug, however,
|
|
||||||
// it's unlikely to be fixed. Namely, ripgrep probably shouldn't print `hay`
|
|
||||||
// as a matching file since it is in fact a binary file, and thus should be
|
|
||||||
// filtered out by default. However, the --files-with-matches flag will print
|
|
||||||
// out the path of a matching file as soon as a match is seen and then stop
|
|
||||||
// searching completely. Therefore, the NUL byte is never actually detected.
|
|
||||||
//
|
|
||||||
// The only way to fix this would be to kill ripgrep's performance in this case
|
|
||||||
// and continue searching the entire file for a NUL byte. (Similarly if the
|
|
||||||
// --quiet flag is set. See the next test.)
|
|
||||||
rgtest!(after_match1_implicit_path, |dir: Dir, mut cmd: TestCommand| {
|
|
||||||
dir.create_bytes("hay", HAY);
|
|
||||||
cmd.args(&[
|
|
||||||
"--no-mmap", "-l", "Project Gutenberg EBook", "-g", "hay",
|
|
||||||
]);
|
|
||||||
eqnice!("hay\n", cmd.stdout());
|
|
||||||
});
|
|
||||||
|
|
||||||
// Like after_match1_implicit_path, except this indicates that a match was
|
|
||||||
// found with no other output. (This is the same bug described above, but
|
|
||||||
// manifests as an exit code with no output.)
|
|
||||||
rgtest!(after_match1_implicit_quiet, |dir: Dir, mut cmd: TestCommand| {
|
|
||||||
dir.create_bytes("hay", HAY);
|
|
||||||
cmd.args(&[
|
|
||||||
"--no-mmap", "-q", "Project Gutenberg EBook", "-g", "hay",
|
|
||||||
]);
|
|
||||||
eqnice!("", cmd.stdout());
|
|
||||||
});
|
|
||||||
|
|
||||||
// This sets up the same test as after_match1_implicit_path, but instead of
|
|
||||||
// just printing the matching files, this includes the full count of matches.
|
|
||||||
// In this case, we need to search the entire file, so ripgrep correctly
|
|
||||||
// detects the binary data and suppresses output.
|
|
||||||
rgtest!(after_match1_implicit_count, |dir: Dir, mut cmd: TestCommand| {
|
|
||||||
dir.create_bytes("hay", HAY);
|
|
||||||
cmd.args(&[
|
|
||||||
"--no-mmap", "-c", "Project Gutenberg EBook", "-g", "hay",
|
|
||||||
]);
|
|
||||||
cmd.assert_err();
|
|
||||||
});
|
|
||||||
|
|
||||||
// Like after_match1_implicit_count, except the --binary flag is provided,
|
|
||||||
// which makes ripgrep disable binary data filtering even for implicit files.
|
|
||||||
rgtest!(after_match1_implicit_count_binary, |dir: Dir, mut cmd: TestCommand| {
|
|
||||||
dir.create_bytes("hay", HAY);
|
|
||||||
cmd.args(&[
|
|
||||||
"--no-mmap", "-c", "--binary",
|
|
||||||
"Project Gutenberg EBook",
|
|
||||||
"-g", "hay",
|
|
||||||
]);
|
|
||||||
eqnice!("hay:1\n", cmd.stdout());
|
|
||||||
});
|
|
||||||
|
|
||||||
// Like after_match1_implicit_count, except the file path is provided
|
|
||||||
// explicitly, so binary filtering is disabled and a count is correctly
|
|
||||||
// reported.
|
|
||||||
rgtest!(after_match1_explicit_count, |dir: Dir, mut cmd: TestCommand| {
|
|
||||||
dir.create_bytes("hay", HAY);
|
|
||||||
cmd.args(&[
|
|
||||||
"--no-mmap", "-c", "Project Gutenberg EBook", "hay",
|
|
||||||
]);
|
|
||||||
eqnice!("1\n", cmd.stdout());
|
|
||||||
});
|
|
||||||
|
|
||||||
// This tests that a match way before the NUL byte is shown, but a match after
|
|
||||||
// the NUL byte is not.
|
|
||||||
rgtest!(after_match2_implicit, |dir: Dir, mut cmd: TestCommand| {
|
|
||||||
dir.create_bytes("hay", HAY);
|
|
||||||
cmd.args(&[
|
|
||||||
"--no-mmap", "-n",
|
|
||||||
"Project Gutenberg EBook|a medical student",
|
|
||||||
"-g", "hay",
|
|
||||||
]);
|
|
||||||
|
|
||||||
let expected = "\
|
|
||||||
hay:1:The Project Gutenberg EBook of A Study In Scarlet, by Arthur Conan Doyle
|
|
||||||
WARNING: stopped searching binary file hay after match (found \"\\u{0}\" byte around offset 9741)
|
|
||||||
";
|
|
||||||
eqnice!(expected, cmd.stdout());
|
|
||||||
});
|
|
||||||
|
|
||||||
// Like after_match2_implicit, but enables -a/--text, so no binary
|
|
||||||
// detection should be performed.
|
|
||||||
rgtest!(after_match2_implicit_text, |dir: Dir, mut cmd: TestCommand| {
|
|
||||||
dir.create_bytes("hay", HAY);
|
|
||||||
cmd.args(&[
|
|
||||||
"--no-mmap", "-n", "--text",
|
|
||||||
"Project Gutenberg EBook|a medical student",
|
|
||||||
"-g", "hay",
|
|
||||||
]);
|
|
||||||
|
|
||||||
let expected = "\
|
|
||||||
hay:1:The Project Gutenberg EBook of A Study In Scarlet, by Arthur Conan Doyle
|
|
||||||
hay:236:\"And yet you say he is not a medical student?\"
|
|
||||||
";
|
|
||||||
eqnice!(expected, cmd.stdout());
|
|
||||||
});
|
|
||||||
|
|
||||||
// This tests that ripgrep *silently* quits before finding a match that occurs
|
|
||||||
// after a NUL byte.
|
|
||||||
rgtest!(before_match1_implicit, |dir: Dir, mut cmd: TestCommand| {
|
|
||||||
dir.create_bytes("hay", HAY);
|
|
||||||
cmd.args(&[
|
|
||||||
"--no-mmap", "-n", "Heaven", "-g", "hay",
|
|
||||||
]);
|
|
||||||
cmd.assert_err();
|
|
||||||
});
|
|
||||||
|
|
||||||
// This tests that ripgrep *does not* silently quit before finding a match that
|
|
||||||
// occurs after a NUL byte when a file is explicitly searched.
|
|
||||||
rgtest!(before_match1_explicit, |dir: Dir, mut cmd: TestCommand| {
|
|
||||||
dir.create_bytes("hay", HAY);
|
|
||||||
cmd.args(&[
|
|
||||||
"--no-mmap", "-n", "Heaven", "hay",
|
|
||||||
]);
|
|
||||||
|
|
||||||
let expected = "\
|
|
||||||
Binary file matches (found \"\\u{0}\" byte around offset 9741)
|
|
||||||
";
|
|
||||||
eqnice!(expected, cmd.stdout());
|
|
||||||
});

// Like before_match1_implicit, but enables the --binary flag, which
// disables binary filtering. Thus, this matches the behavior of ripgrep as if
// the file were given explicitly.
rgtest!(before_match1_implicit_binary, |dir: Dir, mut cmd: TestCommand| {
    dir.create_bytes("hay", HAY);
    cmd.args(&[
        "--no-mmap", "-n", "--binary", "Heaven", "-g", "hay",
    ]);

    let expected = "\
Binary file hay matches (found \"\\u{0}\" byte around offset 9741)
";
    eqnice!(expected, cmd.stdout());
});

// Like before_match1_implicit, but enables -a/--text, so no binary
// detection should be performed.
rgtest!(before_match1_implicit_text, |dir: Dir, mut cmd: TestCommand| {
    dir.create_bytes("hay", HAY);
    cmd.args(&[
        "--no-mmap", "-n", "--text", "Heaven", "-g", "hay",
    ]);

    let expected = "\
hay:238:\"No. Heaven knows what the objects of his studies are. But here we
";
    eqnice!(expected, cmd.stdout());
});

// This tests that ripgrep *silently* quits before finding a match that occurs
// before a NUL byte, but within the same buffer as the NUL byte.
rgtest!(before_match2_implicit, |dir: Dir, mut cmd: TestCommand| {
    dir.create_bytes("hay", HAY);
    cmd.args(&[
        "--no-mmap", "-n", "a medical student", "-g", "hay",
    ]);
    cmd.assert_err();
});

// This tests that ripgrep *does not* silently quit before finding a match that
// occurs before a NUL byte, but within the same buffer as the NUL byte. Even
// though the match occurs before the NUL byte, ripgrep still doesn't print it
// because it has already scanned ahead to detect the NUL byte. (This matches
// the behavior of GNU grep.)
rgtest!(before_match2_explicit, |dir: Dir, mut cmd: TestCommand| {
    dir.create_bytes("hay", HAY);
    cmd.args(&[
        "--no-mmap", "-n", "a medical student", "hay",
    ]);

    let expected = "\
Binary file matches (found \"\\u{0}\" byte around offset 9741)
";
    eqnice!(expected, cmd.stdout());
});

// Like before_match2_implicit, but enables -a/--text, so no binary
// detection should be performed.
rgtest!(before_match2_implicit_text, |dir: Dir, mut cmd: TestCommand| {
    dir.create_bytes("hay", HAY);
    cmd.args(&[
        "--no-mmap", "-n", "--text", "a medical student", "-g", "hay",
    ]);

    let expected = "\
hay:236:\"And yet you say he is not a medical student?\"
";
    eqnice!(expected, cmd.stdout());
});
@@ -1,500 +0,0 @@
The Project Gutenberg EBook of A Study In Scarlet, by Arthur Conan Doyle

This eBook is for the use of anyone anywhere at no cost and with
almost no restrictions whatsoever. You may copy it, give it away or
re-use it under the terms of the Project Gutenberg License included
with this eBook or online at www.gutenberg.org


Title: A Study In Scarlet

Author: Arthur Conan Doyle

Posting Date: July 12, 2008 [EBook #244]
Release Date: April, 1995
[Last updated: February 17, 2013]

Language: English


*** START OF THIS PROJECT GUTENBERG EBOOK A STUDY IN SCARLET ***




Produced by Roger Squires





A STUDY IN SCARLET.

By A. Conan Doyle

[1]



Original Transcriber's Note: This etext is prepared directly
from an 1887 edition, and care has been taken to duplicate the
original exactly, including typographical and punctuation
vagaries.

Additions to the text include adding the underscore character to
indicate italics, and textual end-notes in square braces.

Project Gutenberg Editor's Note: In reproofing and moving old PG
files such as this to the present PG directory system it is the
policy to reformat the text to conform to present PG Standards.
In this case however, in consideration of the note above of the
original transcriber describing his care to try to duplicate the
original 1887 edition as to typography and punctuation vagaries,
no changes have been made in this ascii text file. However, in
the Latin-1 file and this html file, present standards are
followed and the several French and Spanish words have been
given their proper accents.

Part II, The Country of the Saints, deals much with the Mormon Church.




A STUDY IN SCARLET.





PART I.

(_Being a reprint from the reminiscences of_ JOHN H. WATSON, M.D., _late
of the Army Medical Department._) [2]




CHAPTER I. MR. SHERLOCK HOLMES.


IN the year 1878 I took my degree of Doctor of Medicine of the
University of London, and proceeded to Netley to go through the course
prescribed for surgeons in the army. Having completed my studies there,
I was duly attached to the Fifth Northumberland Fusiliers as Assistant
Surgeon. The regiment was stationed in India at the time, and before
I could join it, the second Afghan war had broken out. On landing at
Bombay, I learned that my corps had advanced through the passes, and
was already deep in the enemy's country. I followed, however, with many
other officers who were in the same situation as myself, and succeeded
in reaching Candahar in safety, where I found my regiment, and at once
entered upon my new duties.

The campaign brought honours and promotion to many, but for me it had
nothing but misfortune and disaster. I was removed from my brigade and
attached to the Berkshires, with whom I served at the fatal battle of
Maiwand. There I was struck on the shoulder by a Jezail bullet, which
shattered the bone and grazed the subclavian artery. I should have
fallen into the hands of the murderous Ghazis had it not been for the
devotion and courage shown by Murray, my orderly, who threw me across a
pack-horse, and succeeded in bringing me safely to the British lines.

Worn with pain, and weak from the prolonged hardships which I had
undergone, I was removed, with a great train of wounded sufferers, to
the base hospital at Peshawar. Here I rallied, and had already improved
so far as to be able to walk about the wards, and even to bask a little
upon the verandah, when I was struck down by enteric fever, that curse
of our Indian possessions. For months my life was despaired of, and
when at last I came to myself and became convalescent, I was so weak and
emaciated that a medical board determined that not a day should be lost
in sending me back to England. I was dispatched, accordingly, in the
troopship "Orontes," and landed a month later on Portsmouth jetty, with
my health irretrievably ruined, but with permission from a paternal
government to spend the next nine months in attempting to improve it.

I had neither kith nor kin in England, and was therefore as free as
air--or as free as an income of eleven shillings and sixpence a day will
permit a man to be. Under such circumstances, I naturally gravitated to
London, that great cesspool into which all the loungers and idlers of
the Empire are irresistibly drained. There I stayed for some time at
a private hotel in the Strand, leading a comfortless, meaningless
existence, and spending such money as I had, considerably more freely
than I ought. So alarming did the state of my finances become, that
I soon realized that I must either leave the metropolis and rusticate
somewhere in the country, or that I must make a complete alteration in
my style of living. Choosing the latter alternative, I began by making
up my mind to leave the hotel, and to take up my quarters in some less
pretentious and less expensive domicile.

On the very day that I had come to this conclusion, I was standing at
the Criterion Bar, when some one tapped me on the shoulder, and turning
round I recognized young Stamford, who had been a dresser under me at
Barts. The sight of a friendly face in the great wilderness of London is
a pleasant thing indeed to a lonely man. In old days Stamford had never
been a particular crony of mine, but now I hailed him with enthusiasm,
and he, in his turn, appeared to be delighted to see me. In the
exuberance of my joy, I asked him to lunch with me at the Holborn, and
we started off together in a hansom.

"Whatever have you been doing with yourself, Watson?" he asked in
undisguised wonder, as we rattled through the crowded London streets.
"You are as thin as a lath and as brown as a nut."

I gave him a short sketch of my adventures, and had hardly concluded it
by the time that we reached our destination.

"Poor devil!" he said, commiseratingly, after he had listened to my
misfortunes. "What are you up to now?"

"Looking for lodgings." [3] I answered. "Trying to solve the problem
as to whether it is possible to get comfortable rooms at a reasonable
price."

"That's a strange thing," remarked my companion; "you are the second man
to-day that has used that expression to me."

"And who was the first?" I asked.

"A fellow who is working at the chemical laboratory up at the hospital.
He was bemoaning himself this morning because he could not get someone
to go halves with him in some nice rooms which he had found, and which
were too much for his purse."

"By Jove!" I cried, "if he really wants someone to share the rooms and
the expense, I am the very man for him. I should prefer having a partner
to being alone."

Young Stamford looked rather strangely at me over his wine-glass. "You
don't know Sherlock Holmes yet," he said; "perhaps you would not care
for him as a constant companion."

"Why, what is there against him?"

"Oh, I didn't say there was anything against him. He is a little queer
in his ideas--an enthusiast in some branches of science. As far as I
know he is a decent fellow enough."

"A medical student, I suppose?" said I.

"No--I have no idea what he intends to go in for. I believe he is well
up in anatomy, and he is a first-class chemist; but, as far as I know,
he has never taken out any systematic medical classes. His studies are
very desultory and eccentric, but he has amassed a lot of out-of-the way
knowledge which would astonish his professors."

"Did you never ask him what he was going in for?" I asked.

"No; he is not a man that it is easy to draw out, though he can be
communicative enough when the fancy seizes him."

"I should like to meet him," I said. "If I am to lodge with anyone, I
should prefer a man of studious and quiet habits. I am not strong
enough yet to stand much noise or excitement. I had enough of both in
Afghanistan to last me for the remainder of my natural existence. How
could I meet this friend of yours?"

"He is sure to be at the laboratory," returned my companion. "He either
avoids the place for weeks, or else he works there from morning to
night. If you like, we shall drive round together after luncheon."

"Certainly," I answered, and the conversation drifted away into other
channels.

As we made our way to the hospital after leaving the Holborn, Stamford
gave me a few more particulars about the gentleman whom I proposed to
take as a fellow-lodger.

"You mustn't blame me if you don't get on with him," he said; "I know
nothing more of him than I have learned from meeting him occasionally in
the laboratory. You proposed this arrangement, so you must not hold me
responsible."

"If we don't get on it will be easy to part company," I answered. "It
seems to me, Stamford," I added, looking hard at my companion, "that you
have some reason for washing your hands of the matter. Is this fellow's
temper so formidable, or what is it? Don't be mealy-mouthed about it."

"It is not easy to express the inexpressible," he answered with a laugh.
"Holmes is a little too scientific for my tastes--it approaches to
cold-bloodedness. I could imagine his giving a friend a little pinch of
the latest vegetable alkaloid, not out of malevolence, you understand,
but simply out of a spirit of inquiry in order to have an accurate idea
of the effects. To do him justice, I think that he would take it himself
with the same readiness. He appears to have a passion for definite and
exact knowledge."

"Very right too."

"Yes, but it may be pushed to excess. When it comes to beating the
subjects in the dissecting-rooms with a stick, it is certainly taking
rather a bizarre shape."

"Beating the subjects!"

"Yes, to verify how far bruises may be produced after death. I saw him
at it with my own eyes."

"And yet you say he is not a medical student?"
abcdef |