Compare commits

..

103 Commits

Author SHA1 Message Date
Andrew Gallant
535cd0d312 ci: switch to GitHub Actions 2019-08-31 09:04:09 -04:00
Andrew Gallant
5011dba2fd ignore: remove unused parameter 2019-08-28 20:21:34 -04:00
Andrew Gallant
e14f9195e5 deps: update everything 2019-08-28 20:18:47 -04:00
Andrew Gallant
ef0e7af56a deps: update bstr to 0.2.7
The new bstr release contains a small performance bug fix where some
trivial methods weren't being inlined.
2019-08-11 10:41:05 -04:00
Todd Walton
b266818aa5 doc: use XDG_CONFIG_HOME in comments
XDG_CONFIG_DIR does not actually exist.

PR #1347
2019-08-09 13:37:37 -04:00
LawAbidingCactus
81415ae52d doc: update to reflect glob matching behavior change
Specifically, paths contains a `/` are not allowed to match any
other slash in the path, even as a prefix. So `!.git` is the correct
incantation for ignoring a `.git` directory that occurs anywhere 
in the path.
2019-08-07 13:47:18 -04:00
Andrew Gallant
5c4584aa7c grep-regex-0.1.5 2019-08-06 09:51:13 -04:00
Andrew Gallant
0972c6e7c7 grep-searcher-0.1.6 2019-08-06 09:50:52 -04:00
Andrew Gallant
0a372bf2e4 deps: update ignore 2019-08-06 09:50:35 -04:00
Andrew Gallant
345124a7fa ignore-0.4.10 2019-08-06 09:47:45 -04:00
Andrew Gallant
31807f805a deps: drop tempfile
We were only using it to create temporary directories for `ignore`
tests, but it pulls in a bunch of dependencies and we don't really need
randomness. So just use our own simple wrapper instead.
2019-08-06 09:46:05 -04:00
Andrew Gallant
4de227fd9a deps: update everything
Mostly this just updates regex and its assorted dependencies. This does
drop utf8-ranges and ucd-util, in accordance with changes to
regex-syntax and regex.
2019-08-05 13:50:55 -04:00
jimbo1qaz
d7ce274722 readme: Debian Buster is stable now
PR #1338
2019-08-04 08:06:10 -04:00
Andrew Gallant
5b10328f41 changelog: update with bug fix 2019-08-02 07:37:27 -04:00
Andrew Gallant
813c676eca searcher: fix roll buffer bug
This commit fixes a subtle bug in how the line buffer was rolling its
contents. Specifically, when ripgrep searches without memory maps,
it uses a "roll" buffer for incremental line oriented search without
needing to read the entire file into memory at once. The roll buffer
works by reading a chunk of bytes from the file into memory, and then
searching everything in that buffer up to the last `\n` byte. The bytes
*after* the last `\n` byte are preserved, since they likely correspond
to *part* of the next line. Once ripgrep is done searching the buffer,
it "rolls" the buffer such that the start of the next line is at the
beginning of the buffer, and then ripgrep reads more data into the
buffer starting at the (possibly) partial end of that line.

The implication of this strategy, necessarily so, is that a buffer must
be big enough to fit a single line in memory. This is because the regex
engine needs a contiguous block of memory to search, so there is no way
to search anything smaller than a single line. So if a file contains a
single line with 7.5 million bytes, then the buffer will grow to be at
least that size. (Many files have super long lines like this, but they
tend to be *binary* files, which ripgrep will detect and stop searching
unless the user forces it with the `-a/--text` flag. So in practice,
they aren't usually a problem. However, in this case, #1335 found a case
where a plain text file had a line with 7.5 million bytes.)

Now, for performance reasons, ripgrep reuses these buffers across its
search. Typically, it will create `N` of these line buffers when it
starts (where `N` is the number of threads it is using), and then reuse
them without creating any new ones as it searches through files.

This means that if you search a file with a very long line, that buffer
will expand to be big enough to store that line. ripgrep never contracts
these buffers, so once it searches the next file, ripgrep will continue
to use this large buffer. While it might be prudent to contract these
buffers in some circumstances, this isn't otherwise inherently a
problem. The memory has already been allocated, and there isn't much
cost to using it, other than the fact that ripgrep hangs on to it and
never gives it back to the OS.

However, the `roll` implementation described above had a really
important bug in it that was impacted by the size of the buffer.
Specifically, it used the following to "roll" the partial line at the
end of the buffer to the beginning:

    self.buf.copy_within_str(self.pos.., 0);

Which means that if the buffer is very large, ripgrep will copy
*everything* from `self.pos` (which might be very small, e.g., for small
files) to the end of the buffer, and move it to the beginning of the
buffer. This will happen repeatedly each time the buffer is used to
search small files, which winds up being quite a large slow down if the
line was exceptionally large (say, megabytes).

It turns out that copying everything is completely unnecessary. We only
need to copy the remainder of the last read to the beginning of the
buffer. Everything *after* the last read in the buffer is just free
space that can be filled for the next read. So, all we need to do is
copy just those bytes:

    self.buf.copy_within_str(self.pos..self.end, 0);

... which is typically much much smaller than the rest of the buffer.

This was likely also causing small performance losses in other cases as
well. For example, when searching a lot of small files, ripgrep would
likely do a lot more copying than necessary. Although, given that the
default buffer size is 8KB, this extra copying was likely pretty small,
and was thus harder to observe.

Fixes #1335
2019-08-02 07:23:27 -04:00
Andrew Gallant
f625d72b6f pkg: update brew tap to 11.0.2 2019-08-01 19:39:53 -04:00
Andrew Gallant
3de31f7527 ci: fix musl deployment
The docker image that the Linux binary is now built in does not have
ASCII doc installed, so setup Cross to point to my own image with those
tools installed.
2019-08-01 18:41:44 -04:00
Andrew Gallant
e402d6c260 ripgrep: release 11.0.2 2019-08-01 18:02:15 -04:00
Andrew Gallant
48b5bdc441 src: remove old directories
termcolor has had its own repository for a while now. No need for these
redirects any more.
2019-08-01 17:49:28 -04:00
Andrew Gallant
709ca91f50 ignore: release 0.4.9 2019-08-01 17:48:37 -04:00
Andrew Gallant
9c220f9a9b grep-regex: release 0.1.4 2019-08-01 17:47:45 -04:00
Andrew Gallant
9085bed139 grep-matcher: release 0.1.3 2019-08-01 17:46:59 -04:00
Andrew Gallant
931ab35f76 changelog: start work on 11.0.2 release 2019-08-01 17:42:38 -04:00
Andrew Gallant
b5e5979ff1 deps: update everything
This drops `spin` and `autocfg`, yay.
2019-08-01 17:42:38 -04:00
Andrew Gallant
052c857da0 doc: mention .ignore and .rgignore more prominently
Fixes #1284
2019-08-01 17:37:46 -04:00
Andrew Gallant
5e84e784c8 doc: add translations section
We note that they may not be up to date and are unofficial.

Fixes #1246
2019-08-01 17:37:46 -04:00
Andrew Gallant
01e8e11621 doc: improve PCRE2 failure mode documentation
If a user tries to search for an explicit `\n` character in a PCRE2
regex, ripgrep won't report an error and instead will (likely) silently
fail to match.

Fixes #1261
2019-08-01 17:32:44 -04:00
Ninan John
9268ff8e8d ripgrep: fix bug when CWD has directory named -
Specifically, when searching stdin, if the current directory has a
directory named `-`, then the `--with-filename` flag would automatically
be turned on. This is because `--with-filename` is automatically enabled
when ripgrep is given a single path that is a directory. When ripgrep is
given empty arguments, and if it is searching stdin, then its default
path list is just simple `["-"]`. The `is_dir` check passes, and
`--with-filename` gets enabled.

This commit fixes the problem by checking whether the path is `-` first.
If so, then we assume it isn't a directory. This is fine, since if it is
a directory and one asks to search it explicitly, then ripgrep will
interpret `-` as stdin anyway (which is arguably a bug on its own, but
probably not one worth fixing).

Fixes #1223, Closes #1292
2019-08-01 17:27:23 -04:00
dana
c2cb0a4de4 ripgrep: add --glob-case-insensitive
This flag forces -g/--glob patterns to be treated case-insensitively, as with
--iglob patterns.

Fixes #1293
2019-08-01 17:08:58 -04:00
Andrew Gallant
adb9332f52 regex: fix -F aho-corasick optimization
It turns out that when the -F flag was used, if any of the patterns
contained a regex meta character (such as `.`), then we winded up
escaping the pattern first before handing it off to Aho-Corasick, which
treats all patterns literally.

We continue to apply band-aides here and just avoid Aho-Corasick if
there is an escape in any of the literal patterns. This is unfortunate,
but making this work better requires more refactoring, and the right
solution is to get this optimization pushed down into the regex engine.

Fixes #1334
2019-08-01 16:58:12 -04:00
Matthew Davidson
bc37c32717 ignore/types: add edn type from Clojure ecosystem
PR #1330
2019-07-29 16:43:28 -04:00
Andrew Gallant
08ae4da2b7 deps: update them
There are some nice removals. It looks like rand has slimmed down, and
smallvec is gone now as well.
2019-07-25 07:52:33 -04:00
Andrew Gallant
7ac95c1f50 deps: bump ignore 2019-07-24 12:56:47 -04:00
Andrew Gallant
7a6903bd4e ignore-0.4.8 2019-07-24 12:56:01 -04:00
Tiziano Santoro
9801fae29f ignore: support compilation on wasm
Currently the crate assumes that exactly one of `cfg(windows)` or
`cfg(unix)` is true, but this is not actually the case, for instance
when compiling for `wasm32`.

Implement the missing functions so that the crate can compile on other
platforms, even though those functions will always return an error.

PR #1327
2019-07-24 12:55:37 -04:00
Miloš Stojanović
abdf7140d7 readme: fix broken link to Scoop bucket
PR #1324
2019-07-20 12:03:46 -04:00
Conrad Olega
b83e7968ef ignore/types: add Robot Framework
PR #1322
2019-07-14 08:12:34 -04:00
Hugo Locurcio
8ebc113847 doc: improve docs for --replace flag
Specifically, we document shell-specific caveats related to the `--replace`
flag.

PR #1318
2019-07-04 11:42:35 -04:00
Andrew Gallant
785c1f1766 release: globset, grep-cli, grep-printer, grep-searcher 2019-06-26 16:53:30 -04:00
Andrew Gallant
8b734cb490 deps: update everything 2019-06-26 16:51:06 -04:00
Andrew Gallant
b93762ea7a bstr: update everything to bstr 0.2 2019-06-26 16:47:33 -04:00
Andrew Gallant
34677d2622 search: a few small touchups 2019-06-18 20:23:47 -04:00
Andrew Gallant
d1389db2e3 search: better errors for preprocessor commands
If a preprocessor command could not be started, we now show some
additional context with the error message. Previously, it showed
something like this:

  some/file: No such file or directory (os error 2)

Which is itself pretty misleading. Now it shows:

  some/file: preprocessor command could not start: '"nonexist" "some/file"': No such file or directory (os error 2)

Fixes #1302
2019-06-16 19:02:02 -04:00
Andrew Gallant
50bcb7409e deps: update everything 2019-06-16 18:38:45 -04:00
Andrew Gallant
7b9972c308 style: fix deprecations
Use `dyn` for trait objects and use `..=` for inclusive ranges.
2019-06-16 18:37:51 -04:00
Hitesh Jasani
9f000c2910 ignore/types: add more nim types
PR #1297
2019-06-12 14:02:28 -04:00
skierpage
392682d352 doc: point regex doc link to the latest version
The latest doc is different, e.g. adds "symmetric differences" under
https://docs.rs/regex/*/regex/#character-classes

PR #1287
2019-06-01 08:44:55 -04:00
Andrew Gallant
7d3f794588 ignore: remove .git check in some cases
When we know we aren't going to process gitignores, we shouldn't waste
the syscall in every directory to check for a git repo.
2019-05-29 18:06:11 -04:00
bruce-one
290fd2a7b6 readme: mention Zstandard and Brotli
Also alphabetise the list.

PR #1288
2019-05-29 13:37:31 -04:00
Fabian Würfl
d1e4d28f30 readme: remove outdated statement
Issue #10 already states that "ripgrep is now in most or all of the major
package repositories."

PR #1280
2019-05-14 18:44:50 -04:00
Andrew Gallant
5ce2d7351d ci: use cross for musl x86_64 builds
This is necessary because jemalloc + musl + Ubuntu 16.04 is apparently
broken.

Moreover, jemalloc doesn't support i686, so we accept the performance
regression there.

See also: https://github.com/gnzlbg/jemallocator/issues/124
2019-04-25 11:12:14 -04:00
Andrew Gallant
9dcfd9a205 deps: bump pcre2-sys to 0.2.1
This brings in a bug fix that no longer tries to run `git` to update the
submodule if the `git` command doesn't exist.

This is useful is more restricted build contexts where `git` isn't
installed. Such as in the docker image used for running `cross`.
2019-04-25 11:12:14 -04:00
Andrew Gallant
36b276c6d0 printer: remove unnecessary mut 2019-04-24 17:22:27 -04:00
Andrew Gallant
03bf37ff4a alloc: use jemalloc when building with musl
It turns out that musl's allocator is slow enough to cause a fairly
noticeable performance regression when ripgrep is built as a static
binary with musl. We fix this by using jemalloc when building with musl.

We continue to use the default system allocator in all other scenarios.
Namely, glibc's allocator doesn't noticeably regress performance compared
to jemalloc. But we could add more targets to this logic if other
system allocators (macOS, Windows) prove to be slow.

This wasn't necessary before because rustc recently stopped using jemalloc
by default.

Fixes #1268
2019-04-24 17:21:38 -04:00
Andrew Gallant
e7829c05d3 cli: fix bug where last byte was stripped
In an effort to strip line terminators, we assumed their existence. But
a pattern file may not end with a line terminator, so we shouldn't
unconditionally strip them.

We fix this by moving to bstr's line handling, which does this for us
automatically.
2019-04-19 07:11:44 -04:00
Rory O’Kane
a6222939f9 readme: mention --pcre2 as long form of -P
This is for consistency with the short and long flags given in other
bullet points. I originally assumed there was no long flag for `-P`
because none was given here.

PR #1254
2019-04-16 21:22:48 -04:00
Rory O’Kane
6ffd434232 readme: mention --auto-hybrid-regex in advantages
This feature solves a major reason I was skeptical of using ripgrep, so
I think it’s good to mention it in the section about why one should use
it.

I use backreferences a lot, so I had previously thought that ripgrep
would provide no speed advantage over ag, since I would always have
`-P` enabled. But when I saw `--auto-hybrid-regex` in the 11.0.0
changelog, I learned that ripgrep can use it to speed up simple queries
while still allowing me to write backreferences.

PR #1253
2019-04-16 17:21:40 -04:00
Andrew Gallant
1f1cd9b467 pkg: update brew tap to 11.0.1 2019-04-16 13:39:56 -04:00
Andrew Gallant
973de50c9e ripgrep: release 11.0.1, take 2 2019-04-16 13:11:28 -04:00
Andrew Gallant
5f8805a496 ripgrep: release 11.0.1 2019-04-16 13:10:29 -04:00
Andrew Gallant
fdde2bcd38 deps: update regex to 1.1.6
This brings in a fix for a regression introduced in ripgrep 11.

Fixes #1247
2019-04-16 08:34:30 -04:00
Gerard de Melo
7b3fe6b325 doc: fix typo in FAQ
PR #1248
2019-04-16 08:32:30 -04:00
Max Horn
b3dd3ae203 ignore/types: add GAP
Add support for file types used by the GAP language, a research system
computational discrete algebra, see <https://www.gap-system.org>

PR #1249
2019-04-16 08:31:58 -04:00
Andrew Gallant
f3083e4574 readme: remove brew tap instructions
The brew tap isn't really needed any more, since SIMD is now
automatically enabled in all binaries.
2019-04-15 18:32:33 -04:00
Andrew Gallant
d03e30707e pkg: update brew tap to 11.0.0 2019-04-15 18:32:10 -04:00
Andrew Gallant
d7f57d9aab ripgrep: release 11.0.0 2019-04-15 18:09:40 -04:00
Andrew Gallant
1a2a24ea74 grep: release 0.2.4 2019-04-15 18:03:46 -04:00
Andrew Gallant
d66610b295 grep-cli: release 0.1.2 2019-04-15 18:02:44 -04:00
Andrew Gallant
019ae1989b grep-printer: release 0.1.2 2019-04-15 18:00:49 -04:00
Andrew Gallant
36d3f235dc grep-searcher: release 0.1.4 2019-04-15 17:59:22 -04:00
Andrew Gallant
79018eb693 grep-pcre2: release 0.1.3 2019-04-15 17:57:03 -04:00
Andrew Gallant
44cd344438 grep-regex: release 0.1.3 2019-04-15 17:56:04 -04:00
Andrew Gallant
e493e54b9b grep-matcher: release 0.1.2 2019-04-15 17:53:29 -04:00
Andrew Gallant
8e8215aa65 ignore: release 0.4.7 2019-04-15 17:50:37 -04:00
Andrew Gallant
3fe701498e doc: add note about --pre-glob
There was a performance warning in the --pre docs, but didn't mention
--pre-glob as a possible mitigation to it.
2019-04-15 17:47:48 -04:00
Andrew Gallant
e79085e9e4 release: globset 0.4.3 2019-04-15 14:07:03 -04:00
Andrew Gallant
764c197022 complete: fix typo 2019-04-15 07:04:57 -04:00
Andrew Gallant
ef1611b5f5 ripgrep: max-column-preview --> max-columns-preview
Credit to @okdana for catching this. This naming is a bit more
consistent with the existing --max-columns flag.
2019-04-15 06:51:51 -04:00
Andrew Gallant
45d12abbc5 changelog: small fixups 2019-04-14 20:21:55 -04:00
Andrew Gallant
5fde8391f9 changelog: backfill it
I went through every commit since the 0.10.0 release and added anything
that I thought was missing.
2019-04-14 20:04:01 -04:00
Marco Herrn
3edb11c513 ignore/types: add additional java files
- .jspx for XHTML JSP files
- .properties for Java Properties files (resource bundles, etc.)

Closes #1242
2019-04-14 19:38:24 -04:00
Andrew Gallant
ed144be775 ci: bump MSRV to 1.34.0 2019-04-14 19:29:27 -04:00
Andrew Gallant
967e7ad0de ripgrep: add --auto-hybrid-regex flag
This flag, when set, will automatically dispatch to PCRE2 if the given
regex cannot be compiled by Rust's regex engine. If both engines fail to
compile the regex, then both errors are surfaced.

Closes #1155
2019-04-14 19:29:27 -04:00
Andrew Gallant
9952ba2068 deps: update glob dev-dependency 2019-04-14 19:29:27 -04:00
Andrew Gallant
b751758d60 deps: update everything 2019-04-14 19:29:27 -04:00
Andrew Gallant
8f14cb18a5 ripgrep: increase pcre2's default JIT stack size
The default stack size is 32KB, and this increases it to 10MB. 32KB is
pretty paltry in the environments in which ripgrep runs, and 10MB is
easily afforded as a maximum size. (The size limit we set for Rust's
regex engine is considerably larger.)

This was motivated due to the fack that JIT stack limits have been
observed to be hit in the wild:
https://github.com/Microsoft/vscode/issues/64606
2019-04-14 19:29:27 -04:00
Andrew Gallant
da9d720431 ripgrep: add --pcre2-version flag
This flag will output details about the version of PCRE2 that ripgrep
is using (if any).
2019-04-14 19:29:27 -04:00
Andrew Gallant
a9d71a0368 pcre2: add a few re-exports
This adds the top-level is_jit_available and version free functions from
the underlying pcre2 crate, and also forwards the max_jit_stack_size
option.
2019-04-14 19:29:27 -04:00
Andrew Gallant
f3646242cc deps: use pcre2 0.2.0
This comes with PCRE 10.32 and a few new options we'll use in subsequent
commits.
2019-04-14 19:29:27 -04:00
Andrew Gallant
601f212a0b ripgrep: add -I as a short option for --no-filename
This flag is commonly used in pipelines and it can be annoying to write
it out every time you need it.

Ideally, we would use -h for this to match GNU grep, but -h is used to
print help output.

Closes #1185
2019-04-14 19:29:27 -04:00
Andrew Gallant
5a565354f8 versioning: next version will be ripgrep 11
This sets up the release announcement and briefly describes the
versioning change. The actual version change itself won't happen until
the release.

Closes #1172
2019-04-14 19:29:27 -04:00
Andrew Gallant
2a6532ae71 doc: note cases of exorbitant memory usage
Fixes #1189
2019-04-14 19:29:27 -04:00
Andrew Gallant
ece1f50cfe printer: support previews for long lines
This commit adds support for showing a preview of long lines. While the
default still remains as completely suppressing the entire line, this
new functionality will show the first N graphemes of a matching line,
including the number of matches that are suppressed.

This was unfortunately a fairly invasive change to the printer that
required a bit of refactoring. On the bright side, the single line
and multi-line coloring are now more unified than they were before.

Closes #1078
2019-04-14 19:29:27 -04:00
Andrew Gallant
a7d26c8f14 binary: rejigger ripgrep's handling of binary files
This commit attempts to surface binary filtering in a slightly more
user friendly way. Namely, before, ripgrep would silently stop
searching a file if it detected a NUL byte, even if it had previously
printed a match. This can lead to the user quite reasonably assuming
that there are no more matches, since a partial search is fairly
unintuitive. (ripgrep has this behavior by default because it really
wants to NOT search binary files at all, just like it doesn't search
gitignored or hidden files.)

With this commit, if a match has already been printed and ripgrep detects
a NUL byte, then it will print a warning message indicating that the search
stopped prematurely.

Moreover, this commit adds a new flag, --binary, which causes ripgrep to
stop filtering binary files, but in a way that still avoids dumping
binary data into terminals. That is, the --binary flag makes ripgrep
behave more like grep's default behavior.

For files explicitly specified in a search, e.g., `rg foo some-file`,
then no binary filtering is applied (just like no gitignore and no
hidden file filtering is applied). Instead, ripgrep behaves as if you
gave the --binary flag for all explicitly given files.

This was a fairly invasive change, and potentially increases the UX
complexity of ripgrep around binary files. (Before, there were two
binary modes, where as now there are three.) However, ripgrep is now a
bit louder with warning messages when binary file detection might
otherwise be hiding potential matches, so hopefully this is a net
improvement.

Finally, the `-uuu` convenience now maps to `--no-ignore --hidden
--binary`, since this is closer to the actualy intent of the
`--unrestricted` flag, i.e., to reduce ripgrep's smart filtering. As a
consequence, `rg -uuu foo` should now search roughly the same number of
bytes as `grep -r foo`, and `rg -uuua foo` should search roughly the
same number of bytes as `grep -ra foo`. (The "roughly" weasel word is
used because grep's and ripgrep's binary file detection might differ
somewhat---perhaps based on buffer sizes---which can impact exactly what
is and isn't searched.)

See the numerous tests in tests/binary.rs for intended behavior.

Fixes #306, Fixes #855
2019-04-14 19:29:27 -04:00
Andrew Gallant
bd222ae93f regex: fix HIR analysis bug
An alternate can be empty at this point, so we must handle it. We didn't
before because the regex engine actually disallows empty alternates,
however, this code runs before the regex compiler rejects the regex.
2019-04-14 19:29:27 -04:00
hupfdule
4359d8aac0 ignore/types: add more extensions for xml
This includes:

    *.dtd for Document Type Definitions
    *.xsl and *.xslt for XSL Transformation descriptions
    *.xsd for XML Schema definitions
    *.xjb for JAXB bindings
    *.rng for Relax NG files
    *.sch for Schematron files

PR #1243
2019-04-09 15:17:57 -04:00
tonypai
308819fb1f ignore/types: add lock files
Treat anything with a `.lock` extension as a lock file, with
an extra rule or two for special cases, e.g., package-lock.json.
2019-04-09 10:24:48 -04:00
Andrew Gallant
09108b7fda regex: make multi-literal searcher faster
This makes the case of searching for a dictionary of a very large number
of literals much much faster. (~10x or so.) In particular, we achieve this
by short-circuiting the construction of a full regex when we know we have
a simple alternation of literals. Building the regex for a large dictionary
(>100,000 literals) turns out to be quite slow, even if it internally will
dispatch to Aho-Corasick.

Even that isn't quite enough. It turns out that even *parsing* such a regex
is quite slow. So when the -F/--fixed-strings flag is set, we short
circuit regex parsing completely and jump straight to Aho-Corasick.

We aren't quite as fast as GNU grep here, but it's much closer (less than
2x slower).

In general, this is somewhat of a hack. In particular, it seems plausible
that this optimization could be implemented entirely in the regex engine.
Unfortunately, the regex engine's internals are just not amenable to this
at all, so it would require a larger refactoring effort. For now, it's
good enough to add this fairly simple hack at a higher level.

Unfortunately, if you don't pass -F/--fixed-strings, then ripgrep will
be slower, because of the aforementioned missing optimization. Moreover,
passing flags like `-i` or `-S` will cause ripgrep to abandon this
optimization and fall back to something potentially much slower. Again,
this fix really needs to happen inside the regex engine, although we
might be able to special case -i when the input literals are pure ASCII
via Aho-Corasick's `ascii_case_insensitive`.

Fixes #497, Fixes #838
2019-04-07 19:11:03 -04:00
Andrew Gallant
743d64f2e4 deps: update to clap 2.33 2019-04-06 10:35:08 -04:00
lesnyrumcajs
5962abc465 searcher: add option to disable BOM sniffing
This commit adds a new encoding feature where the -E/--encoding flag
will now accept a value of 'none'. When given this value, all encoding
related machinery is disabled and ripgrep will search the raw bytes of
the file, including the BOM if it's present.

Closes #1207, Closes #1208
2019-04-06 10:35:08 -04:00
dana
1604a18db3 ignore/types: add *.am and *.in for C/C++/make
PR #1205
2019-04-06 08:02:04 -04:00
luzpaz
9eeb0b01ce readme: add Repology badge
This adds a badge to the README.md file indicating to users that click
on it if their os/distro carries that latest version of ripgrep.

PR #1213
2019-04-06 08:00:40 -04:00
dana
df4400209a ripgrep: remove extra new-line after Clap output
PR #1222
2019-04-06 07:59:36 -04:00
73 changed files with 3634 additions and 1251 deletions

98
.github/workflows/ci.yml vendored Normal file
View File

@@ -0,0 +1,98 @@
name: ci
on:
pull_request:
push:
branches:
- master
schedule:
cron: '00 01 * * *'
jobs:
test:
name: test
runs-on: ${{ matrix.os }}
strategy:
matrix:
# The docs seem to suggest that we can have a matrix with just an
# include directive, but it result in a "matrix must define at least
# one vector" error in the CI system.
build:
- pinned-glibc
- pinned-musl
- stable
- beta
# We test musl with nightly because every once in a while, this will
# catch an upstream regression.
- nightly-glibc
- nightly-musl
- macos
- win-msvc-32
- win-msvc-64
- win-gnu-32
- win-gnu-64
include:
# - build: pinned-glibc
# os: ubuntu-18.04
# rust: 1.34.0
# target: x86_64-unknown-linux-gnu
- build: pinned-musl
os: ubuntu-18.04
rust: 1.34.0
target: x86_64-unknown-linux-musl
- build: stable
os: ubuntu-18.04
rust: stable
target: x86_64-unknown-linux-gnu
# - build: beta
# os: ubuntu-18.04
# rust: beta
# target: x86_64-unknown-linux-gnu
# - build: nightly-glibc
# os: ubuntu-18.04
# rust: nightly
# target: x86_64-unknown-linux-gnu
# - build: nightly-musl
# os: ubuntu-18.04
# rust: nightly
# target: x86_64-unknown-linux-musl
# - build: macos
# os: macOS-10.14
# rust: stable
# target: x86_64-apple-darwin
# - build: win-msvc-32
# os: windows-2019
# rust: stable
# target: i686-pc-windows-msvc
# - build: win-msvc-64
# os: windows-2019
# rust: stable
# target: x86_64-pc-windows-msvc
# - build: win-gnu-32
# os: windows-2019
# rust: stable-i686-gnu
# target: i686-pc-windows-gnu
# - build: win-gnu-64
# os: windows-2019
# rust: stable-x86_64-gnu
# target: x86_64-pc-windows-gnu
steps:
- name: Checkout repository
uses: actions/checkout@v1
with:
fetch-depth: 1
- name: Install Rust
uses: hecrj/setup-rust-action@v1
with:
rust-version: ${{ matrix.rust }}
- name: Install Rust Target
run: rustup target add ${{ matrix.target }}
- name: Install musl-gcc
if: contains(matrix.target, 'musl')
run: |
apt-get install musl-tools
- name: Build everything
run: cargo build --verbose --target ${{ matrix.target }} --all --features pcre2
- name: Test zsh auto-completions
if: matrix.build == 'stable'
run: ./ci/test_complete.sh
- name: Run tests
run: cargo test --verbose --target ${{ matrix.target }} --all --features pcre2

View File

@@ -63,13 +63,13 @@ matrix:
# Minimum Rust supported channel. We enable these to make sure ripgrep
# continues to work on the advertised minimum Rust version.
- os: linux
rust: 1.32.0
rust: 1.34.0
env: TARGET=x86_64-unknown-linux-gnu
- os: linux
rust: 1.32.0
rust: 1.34.0
env: TARGET=x86_64-unknown-linux-musl
- os: linux
rust: 1.32.0
rust: 1.34.0
env: TARGET=arm-unknown-linux-gnueabihf GCC_VERSION=4.8
addons:
apt:

View File

@@ -1,6 +1,73 @@
0.11.0 (TBD)
============
TODO.
TBD
===
TODO
Bug fixes:
* [BUG #1335](https://github.com/BurntSushi/ripgrep/issues/1335):
Fixes a performance bug when searching plain text files with very long lines.
11.0.2 (2019-08-01)
===================
ripgrep 11.0.2 is a new patch release that fixes a few bugs, including a
performance regression and a matching bug when using the `-F/--fixed-strings`
flag.
Feature enhancements:
* [FEATURE #1293](https://github.com/BurntSushi/ripgrep/issues/1293):
Added `--glob-case-insensitive` flag that makes `--glob` behave as `--iglob`.
Bug fixes:
* [BUG #1246](https://github.com/BurntSushi/ripgrep/issues/1246):
Add translations to README, starting with an unofficial Chinese translation.
* [BUG #1259](https://github.com/BurntSushi/ripgrep/issues/1259):
Fix bug where the last byte of a `-f file` was stripped if it wasn't a `\n`.
* [BUG #1261](https://github.com/BurntSushi/ripgrep/issues/1261):
Document that no error is reported when searching for `\n` with `-P/--pcre2`.
* [BUG #1284](https://github.com/BurntSushi/ripgrep/issues/1284):
Mention `.ignore` and `.rgignore` more prominently in the README.
* [BUG #1292](https://github.com/BurntSushi/ripgrep/issues/1292):
Fix bug where `--with-filename` was sometimes enabled incorrectly.
* [BUG #1268](https://github.com/BurntSushi/ripgrep/issues/1268):
Fix major performance regression in GitHub `x86_64-linux` binary release.
* [BUG #1302](https://github.com/BurntSushi/ripgrep/issues/1302):
Show better error messages when a non-existent preprocessor command is given.
* [BUG #1334](https://github.com/BurntSushi/ripgrep/issues/1334):
Fix match regression with `-F` flag when patterns contain meta characters.
11.0.1 (2019-04-16)
===================
ripgrep 11.0.1 is a new patch release that fixes a search regression introduced
in the previous 11.0.0 release. In particular, ripgrep can enter an infinite
loop for some search patterns when searching invalid UTF-8.
Bug fixes:
* [BUG #1247](https://github.com/BurntSushi/ripgrep/issues/1247):
Fix search bug that can cause ripgrep to enter an infinite loop.
11.0.0 (2019-04-15)
===================
ripgrep 11 is a new major version release of ripgrep that contains many bug
fixes, some performance improvements and a few feature enhancements. Notably,
ripgrep's user experience for binary file filtering has been improved. See the
[guide's new section on binary data](GUIDE.md#binary-data) for more details.
This release also marks a change in ripgrep's versioning. Where as the previous
version was `0.10.0`, this version is `11.0.0`. Moving forward, ripgrep's
major version will be increased a few times per year. ripgrep will continue to
be conservative with respect to backwards compatibility, but may occasionally
introduce breaking changes, which will always be documented in this CHANGELOG.
See [issue 1172](https://github.com/BurntSushi/ripgrep/issues/1172) for a bit
more detail on why this versioning change was made.
This release increases the **minimum supported Rust version** from 1.28.0 to
1.34.0.
**BREAKING CHANGES**:
@@ -11,45 +78,91 @@ TODO.
error (e.g., regex syntax error). One exception to this is if ripgrep is run
with `-q/--quiet`. In that case, if an error occurs and a match is found,
then ripgrep will exit with a `0` exit status code.
* Supplying the `-u/--unrestricted` flag three times is now equivalent to
supplying `--no-ignore --hidden --binary`. Previously, `-uuu` was equivalent
to `--no-ignore --hidden --text`. The difference is that `--binary` disables
binary file filtering without potentially dumping binary data into your
terminal. That is, `rg -uuu foo` should now be equivalent to `grep -r foo`.
* The `avx-accel` feature of ripgrep has been removed since it is no longer
necessary. All uses of AVX in ripgrep are now enabled automatically via
runtime CPU feature detection. The `simd-accel` feature does remain
available, however, it does increase compilation times substantially at the
moment.
runtime CPU feature detection. The `simd-accel` feature does remain available
(only for enabling SIMD for transcoding), however, it does increase
compilation times substantially at the moment.
Performance improvements:
* [PERF #497](https://github.com/BurntSushi/ripgrep/issues/497),
[PERF #838](https://github.com/BurntSushi/ripgrep/issues/838):
Make `rg -F -f dictionary-of-literals` much faster.
Feature enhancements:
* Added or improved file type filtering for Apache Thrift, ASP, Bazel, Brotli,
BuildStream, bzip2, C, C++, Cython, gzip, Java, Make, Postscript, QML, Tex,
XML, xz, zig and zstd.
* [FEATURE #855](https://github.com/BurntSushi/ripgrep/issues/855):
Add `--binary` flag for disabling binary file filtering.
* [FEATURE #1078](https://github.com/BurntSushi/ripgrep/pull/1078):
Add `--max-columns-preview` flag for showing a preview of long lines.
* [FEATURE #1099](https://github.com/BurntSushi/ripgrep/pull/1099):
Add support for Brotli and Zstd to the `-z/--search-zip` flag.
* [FEATURE #1138](https://github.com/BurntSushi/ripgrep/pull/1138):
Add `--no-ignore-dot` flag for ignoring `.ignore` files.
* [FEATURE #1155](https://github.com/BurntSushi/ripgrep/pull/1155):
Add `--auto-hybrid-regex` flag for automatically falling back to PCRE2.
* [FEATURE #1159](https://github.com/BurntSushi/ripgrep/pull/1159):
ripgrep's exit status logic should now match GNU grep. See updated man page.
* [FEATURE #1170](https://github.com/BurntSushi/ripgrep/pull/1170):
Add `--ignore-file-case-insensitive` for case insensitive .ignore globs.
* [FEATURE #1164](https://github.com/BurntSushi/ripgrep/pull/1164):
Add `--ignore-file-case-insensitive` for case insensitive ignore globs.
* [FEATURE #1185](https://github.com/BurntSushi/ripgrep/pull/1185):
Add `-I` flag as a short option for the `--no-filename` flag.
* [FEATURE #1207](https://github.com/BurntSushi/ripgrep/pull/1207):
Add `none` value to `-E/--encoding` to forcefully disable all transcoding.
* [FEATURE da9d7204](https://github.com/BurntSushi/ripgrep/commit/da9d7204):
Add `--pcre2-version` for querying showing PCRE2 version information.
Bug fixes:
* [BUG #306](https://github.com/BurntSushi/ripgrep/issues/306),
[BUG #855](https://github.com/BurntSushi/ripgrep/issues/855):
Improve the user experience for ripgrep's binary file filtering.
* [BUG #373](https://github.com/BurntSushi/ripgrep/issues/373),
[BUG #1098](https://github.com/BurntSushi/ripgrep/issues/1098):
`**` is now accepted as valid syntax anywhere in a glob.
* [BUG #916](https://github.com/BurntSushi/ripgrep/issues/916):
ripgrep no longer hangs when searching `/proc` with a zombie process present.
* [BUG #1052](https://github.com/BurntSushi/ripgrep/issues/1052):
Fix bug where ripgrep could panic when transcoding UTF-16 files.
* [BUG #1055](https://github.com/BurntSushi/ripgrep/issues/1055):
Suggest `-U/--multiline` when a pattern contains a `\n`.
* [BUG #1063](https://github.com/BurntSushi/ripgrep/issues/1063):
Always strip a BOM if it's present, even for UTF-8.
* [BUG #1064](https://github.com/BurntSushi/ripgrep/issues/1064):
Fix inner literal detection that could lead to incorrect matches.
* [BUG #1079](https://github.com/BurntSushi/ripgrep/issues/1079):
Fixes a bug where the order of globs could result in missing a match.
* [BUG #1089](https://github.com/BurntSushi/ripgrep/issues/1089):
Fix another bug where ripgrep could panic when transcoding UTF-16 files.
* [BUG #1091](https://github.com/BurntSushi/ripgrep/issues/1091):
Add note about inverted flags to the man page.
* [BUG #1093](https://github.com/BurntSushi/ripgrep/pull/1093):
Fix handling of literal slashes in gitignore patterns.
* [BUG #1095](https://github.com/BurntSushi/ripgrep/issues/1095):
Fix corner cases involving the `--crlf` flag.
* [BUG #1101](https://github.com/BurntSushi/ripgrep/issues/1101):
Fix AsciiDoc escaping for man page output.
* [BUG #1103](https://github.com/BurntSushi/ripgrep/issues/1103):
Clarify what `--encoding auto` does.
* [BUG #1106](https://github.com/BurntSushi/ripgrep/issues/1106):
`--files-with-matches` and `--files-without-match` work with one file.
* [BUG #1093](https://github.com/BurntSushi/ripgrep/pull/1093):
Fix handling of literal slashes in gitignore patterns.
* [BUG #1121](https://github.com/BurntSushi/ripgrep/issues/1121):
Fix bug that was triggering Windows antimalware when using the --files flag.
Fix bug that was triggering Windows antimalware when using the `--files`
flag.
* [BUG #1125](https://github.com/BurntSushi/ripgrep/issues/1125),
[BUG #1159](https://github.com/BurntSushi/ripgrep/issues/1159):
ripgrep shouldn't panic for `rg -h | rg` and should emit correct exit status.
* [BUG #1144](https://github.com/BurntSushi/ripgrep/issues/1144):
Fixes a bug where line numbers could be wrong on big-endian machines.
* [BUG #1154](https://github.com/BurntSushi/ripgrep/issues/1154):
Windows files with "hidden" attribute are now treated as hidden.
* [BUG #1173](https://github.com/BurntSushi/ripgrep/issues/1173):
@@ -58,6 +171,12 @@ Bug fixes:
Fix handling of repeated `**` patterns in gitignore files.
* [BUG #1176](https://github.com/BurntSushi/ripgrep/issues/1176):
Fix bug where `-F`/`-x` weren't applied to patterns given via `-f`.
* [BUG #1189](https://github.com/BurntSushi/ripgrep/issues/1189):
Document cases where ripgrep may use a lot of memory.
* [BUG #1203](https://github.com/BurntSushi/ripgrep/issues/1203):
Fix a matching bug related to the suffix literal optimization.
* [BUG 8f14cb18](https://github.com/BurntSushi/ripgrep/commit/8f14cb18):
Increase the default stack size for PCRE2's JIT.
0.10.0 (2018-09-07)

612
Cargo.lock generated
View File

@@ -2,48 +2,42 @@
# It is not intended for manual editing.
[[package]]
name = "aho-corasick"
version = "0.7.3"
version = "0.7.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"memchr 2.2.0 (registry+https://github.com/rust-lang/crates.io-index)",
"memchr 2.2.1 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "atty"
version = "0.2.11"
version = "0.2.13"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"libc 0.2.51 (registry+https://github.com/rust-lang/crates.io-index)",
"termion 1.5.1 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi 0.3.7 (registry+https://github.com/rust-lang/crates.io-index)",
"libc 0.2.62 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi 0.3.8 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "autocfg"
version = "0.1.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "base64"
version = "0.10.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"byteorder 1.3.1 (registry+https://github.com/rust-lang/crates.io-index)",
"byteorder 1.3.2 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "bitflags"
version = "1.0.4"
version = "1.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "bstr"
version = "0.1.2"
version = "0.2.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"lazy_static 1.3.0 (registry+https://github.com/rust-lang/crates.io-index)",
"memchr 2.2.0 (registry+https://github.com/rust-lang/crates.io-index)",
"regex-automata 0.1.6 (registry+https://github.com/rust-lang/crates.io-index)",
"lazy_static 1.4.0 (registry+https://github.com/rust-lang/crates.io-index)",
"memchr 2.2.1 (registry+https://github.com/rust-lang/crates.io-index)",
"regex-automata 0.1.8 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
@@ -53,17 +47,17 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "byteorder"
version = "1.3.1"
version = "1.3.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "cc"
version = "1.0.34"
version = "1.0.41"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "cfg-if"
version = "0.1.7"
version = "0.1.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
@@ -71,36 +65,27 @@ name = "clap"
version = "2.33.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"bitflags 1.0.4 (registry+https://github.com/rust-lang/crates.io-index)",
"bitflags 1.1.0 (registry+https://github.com/rust-lang/crates.io-index)",
"strsim 0.8.0 (registry+https://github.com/rust-lang/crates.io-index)",
"textwrap 0.11.0 (registry+https://github.com/rust-lang/crates.io-index)",
"unicode-width 0.1.5 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "cloudabi"
version = "0.0.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"bitflags 1.0.4 (registry+https://github.com/rust-lang/crates.io-index)",
"unicode-width 0.1.6 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "crossbeam-channel"
version = "0.3.8"
version = "0.3.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"crossbeam-utils 0.6.5 (registry+https://github.com/rust-lang/crates.io-index)",
"smallvec 0.6.9 (registry+https://github.com/rust-lang/crates.io-index)",
"crossbeam-utils 0.6.6 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "crossbeam-utils"
version = "0.6.5"
version = "0.6.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"cfg-if 0.1.7 (registry+https://github.com/rust-lang/crates.io-index)",
"lazy_static 1.3.0 (registry+https://github.com/rust-lang/crates.io-index)",
"cfg-if 0.1.9 (registry+https://github.com/rust-lang/crates.io-index)",
"lazy_static 1.4.0 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
@@ -108,7 +93,7 @@ name = "encoding_rs"
version = "0.8.17"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"cfg-if 0.1.7 (registry+https://github.com/rust-lang/crates.io-index)",
"cfg-if 0.1.9 (registry+https://github.com/rust-lang/crates.io-index)",
"packed_simd 0.3.3 (registry+https://github.com/rust-lang/crates.io-index)",
]
@@ -126,157 +111,175 @@ version = "1.0.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "fuchsia-cprng"
version = "0.1.1"
name = "fs_extra"
version = "1.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "glob"
version = "0.2.11"
version = "0.3.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "globset"
version = "0.4.2"
version = "0.4.4"
dependencies = [
"aho-corasick 0.7.3 (registry+https://github.com/rust-lang/crates.io-index)",
"bstr 0.1.2 (registry+https://github.com/rust-lang/crates.io-index)",
"aho-corasick 0.7.6 (registry+https://github.com/rust-lang/crates.io-index)",
"bstr 0.2.7 (registry+https://github.com/rust-lang/crates.io-index)",
"fnv 1.0.6 (registry+https://github.com/rust-lang/crates.io-index)",
"glob 0.2.11 (registry+https://github.com/rust-lang/crates.io-index)",
"log 0.4.6 (registry+https://github.com/rust-lang/crates.io-index)",
"regex 1.1.5 (registry+https://github.com/rust-lang/crates.io-index)",
"glob 0.3.0 (registry+https://github.com/rust-lang/crates.io-index)",
"log 0.4.8 (registry+https://github.com/rust-lang/crates.io-index)",
"regex 1.2.1 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "grep"
version = "0.2.3"
version = "0.2.4"
dependencies = [
"grep-cli 0.1.1",
"grep-matcher 0.1.1",
"grep-pcre2 0.1.2",
"grep-printer 0.1.1",
"grep-regex 0.1.2",
"grep-searcher 0.1.3",
"termcolor 1.0.4 (registry+https://github.com/rust-lang/crates.io-index)",
"walkdir 2.2.7 (registry+https://github.com/rust-lang/crates.io-index)",
"grep-cli 0.1.3",
"grep-matcher 0.1.3",
"grep-pcre2 0.1.3",
"grep-printer 0.1.3",
"grep-regex 0.1.5",
"grep-searcher 0.1.6",
"termcolor 1.0.5 (registry+https://github.com/rust-lang/crates.io-index)",
"walkdir 2.2.9 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "grep-cli"
version = "0.1.1"
version = "0.1.3"
dependencies = [
"atty 0.2.11 (registry+https://github.com/rust-lang/crates.io-index)",
"bstr 0.1.2 (registry+https://github.com/rust-lang/crates.io-index)",
"globset 0.4.2",
"lazy_static 1.3.0 (registry+https://github.com/rust-lang/crates.io-index)",
"log 0.4.6 (registry+https://github.com/rust-lang/crates.io-index)",
"regex 1.1.5 (registry+https://github.com/rust-lang/crates.io-index)",
"same-file 1.0.4 (registry+https://github.com/rust-lang/crates.io-index)",
"termcolor 1.0.4 (registry+https://github.com/rust-lang/crates.io-index)",
"atty 0.2.13 (registry+https://github.com/rust-lang/crates.io-index)",
"bstr 0.2.7 (registry+https://github.com/rust-lang/crates.io-index)",
"globset 0.4.4",
"lazy_static 1.4.0 (registry+https://github.com/rust-lang/crates.io-index)",
"log 0.4.8 (registry+https://github.com/rust-lang/crates.io-index)",
"regex 1.2.1 (registry+https://github.com/rust-lang/crates.io-index)",
"same-file 1.0.5 (registry+https://github.com/rust-lang/crates.io-index)",
"termcolor 1.0.5 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi-util 0.1.2 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "grep-matcher"
version = "0.1.1"
version = "0.1.3"
dependencies = [
"memchr 2.2.0 (registry+https://github.com/rust-lang/crates.io-index)",
"regex 1.1.5 (registry+https://github.com/rust-lang/crates.io-index)",
"memchr 2.2.1 (registry+https://github.com/rust-lang/crates.io-index)",
"regex 1.2.1 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "grep-pcre2"
version = "0.1.2"
version = "0.1.3"
dependencies = [
"grep-matcher 0.1.1",
"pcre2 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)",
"grep-matcher 0.1.3",
"pcre2 0.2.1 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "grep-printer"
version = "0.1.1"
version = "0.1.3"
dependencies = [
"base64 0.10.1 (registry+https://github.com/rust-lang/crates.io-index)",
"bstr 0.1.2 (registry+https://github.com/rust-lang/crates.io-index)",
"grep-matcher 0.1.1",
"grep-regex 0.1.2",
"grep-searcher 0.1.3",
"serde 1.0.90 (registry+https://github.com/rust-lang/crates.io-index)",
"serde_derive 1.0.90 (registry+https://github.com/rust-lang/crates.io-index)",
"serde_json 1.0.39 (registry+https://github.com/rust-lang/crates.io-index)",
"termcolor 1.0.4 (registry+https://github.com/rust-lang/crates.io-index)",
"bstr 0.2.7 (registry+https://github.com/rust-lang/crates.io-index)",
"grep-matcher 0.1.3",
"grep-regex 0.1.5",
"grep-searcher 0.1.6",
"serde 1.0.99 (registry+https://github.com/rust-lang/crates.io-index)",
"serde_derive 1.0.99 (registry+https://github.com/rust-lang/crates.io-index)",
"serde_json 1.0.40 (registry+https://github.com/rust-lang/crates.io-index)",
"termcolor 1.0.5 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "grep-regex"
version = "0.1.2"
version = "0.1.5"
dependencies = [
"grep-matcher 0.1.1",
"log 0.4.6 (registry+https://github.com/rust-lang/crates.io-index)",
"regex 1.1.5 (registry+https://github.com/rust-lang/crates.io-index)",
"regex-syntax 0.6.6 (registry+https://github.com/rust-lang/crates.io-index)",
"aho-corasick 0.7.6 (registry+https://github.com/rust-lang/crates.io-index)",
"grep-matcher 0.1.3",
"log 0.4.8 (registry+https://github.com/rust-lang/crates.io-index)",
"regex 1.2.1 (registry+https://github.com/rust-lang/crates.io-index)",
"regex-syntax 0.6.11 (registry+https://github.com/rust-lang/crates.io-index)",
"thread_local 0.3.6 (registry+https://github.com/rust-lang/crates.io-index)",
"utf8-ranges 1.0.2 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "grep-searcher"
version = "0.1.3"
version = "0.1.6"
dependencies = [
"bstr 0.1.2 (registry+https://github.com/rust-lang/crates.io-index)",
"bstr 0.2.7 (registry+https://github.com/rust-lang/crates.io-index)",
"bytecount 0.5.1 (registry+https://github.com/rust-lang/crates.io-index)",
"encoding_rs 0.8.17 (registry+https://github.com/rust-lang/crates.io-index)",
"encoding_rs_io 0.1.6 (registry+https://github.com/rust-lang/crates.io-index)",
"grep-matcher 0.1.1",
"grep-regex 0.1.2",
"log 0.4.6 (registry+https://github.com/rust-lang/crates.io-index)",
"grep-matcher 0.1.3",
"grep-regex 0.1.5",
"log 0.4.8 (registry+https://github.com/rust-lang/crates.io-index)",
"memmap 0.7.0 (registry+https://github.com/rust-lang/crates.io-index)",
"regex 1.1.5 (registry+https://github.com/rust-lang/crates.io-index)",
"regex 1.2.1 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "ignore"
version = "0.4.6"
version = "0.4.10"
dependencies = [
"crossbeam-channel 0.3.8 (registry+https://github.com/rust-lang/crates.io-index)",
"globset 0.4.2",
"lazy_static 1.3.0 (registry+https://github.com/rust-lang/crates.io-index)",
"log 0.4.6 (registry+https://github.com/rust-lang/crates.io-index)",
"memchr 2.2.0 (registry+https://github.com/rust-lang/crates.io-index)",
"regex 1.1.5 (registry+https://github.com/rust-lang/crates.io-index)",
"same-file 1.0.4 (registry+https://github.com/rust-lang/crates.io-index)",
"tempfile 3.0.7 (registry+https://github.com/rust-lang/crates.io-index)",
"crossbeam-channel 0.3.9 (registry+https://github.com/rust-lang/crates.io-index)",
"globset 0.4.4",
"lazy_static 1.4.0 (registry+https://github.com/rust-lang/crates.io-index)",
"log 0.4.8 (registry+https://github.com/rust-lang/crates.io-index)",
"memchr 2.2.1 (registry+https://github.com/rust-lang/crates.io-index)",
"regex 1.2.1 (registry+https://github.com/rust-lang/crates.io-index)",
"same-file 1.0.5 (registry+https://github.com/rust-lang/crates.io-index)",
"thread_local 0.3.6 (registry+https://github.com/rust-lang/crates.io-index)",
"walkdir 2.2.7 (registry+https://github.com/rust-lang/crates.io-index)",
"walkdir 2.2.9 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi-util 0.1.2 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "itoa"
version = "0.4.3"
version = "0.4.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "jemalloc-sys"
version = "0.3.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"cc 1.0.41 (registry+https://github.com/rust-lang/crates.io-index)",
"fs_extra 1.1.0 (registry+https://github.com/rust-lang/crates.io-index)",
"libc 0.2.62 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "jemallocator"
version = "0.3.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"jemalloc-sys 0.3.2 (registry+https://github.com/rust-lang/crates.io-index)",
"libc 0.2.62 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "lazy_static"
version = "1.3.0"
version = "1.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "libc"
version = "0.2.51"
version = "0.2.62"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "log"
version = "0.4.6"
version = "0.4.8"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"cfg-if 0.1.7 (registry+https://github.com/rust-lang/crates.io-index)",
"cfg-if 0.1.9 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "memchr"
version = "2.2.0"
version = "2.2.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
@@ -284,16 +287,16 @@ name = "memmap"
version = "0.7.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"libc 0.2.51 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi 0.3.7 (registry+https://github.com/rust-lang/crates.io-index)",
"libc 0.2.62 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi 0.3.8 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "num_cpus"
version = "1.10.0"
version = "1.10.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"libc 0.2.51 (registry+https://github.com/rust-lang/crates.io-index)",
"libc 0.2.62 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
@@ -301,230 +304,102 @@ name = "packed_simd"
version = "0.3.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"cfg-if 0.1.7 (registry+https://github.com/rust-lang/crates.io-index)",
"cfg-if 0.1.9 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "pcre2"
version = "0.1.1"
version = "0.2.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"libc 0.2.51 (registry+https://github.com/rust-lang/crates.io-index)",
"log 0.4.6 (registry+https://github.com/rust-lang/crates.io-index)",
"pcre2-sys 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)",
"libc 0.2.62 (registry+https://github.com/rust-lang/crates.io-index)",
"log 0.4.8 (registry+https://github.com/rust-lang/crates.io-index)",
"pcre2-sys 0.2.2 (registry+https://github.com/rust-lang/crates.io-index)",
"thread_local 0.3.6 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "pcre2-sys"
version = "0.1.1"
version = "0.2.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"cc 1.0.34 (registry+https://github.com/rust-lang/crates.io-index)",
"libc 0.2.51 (registry+https://github.com/rust-lang/crates.io-index)",
"pkg-config 0.3.14 (registry+https://github.com/rust-lang/crates.io-index)",
"cc 1.0.41 (registry+https://github.com/rust-lang/crates.io-index)",
"libc 0.2.62 (registry+https://github.com/rust-lang/crates.io-index)",
"pkg-config 0.3.15 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "pkg-config"
version = "0.3.14"
version = "0.3.15"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "proc-macro2"
version = "0.4.27"
version = "1.0.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"unicode-xid 0.1.0 (registry+https://github.com/rust-lang/crates.io-index)",
"unicode-xid 0.2.0 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "quote"
version = "0.6.11"
version = "1.0.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"proc-macro2 0.4.27 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "rand"
version = "0.6.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"autocfg 0.1.2 (registry+https://github.com/rust-lang/crates.io-index)",
"libc 0.2.51 (registry+https://github.com/rust-lang/crates.io-index)",
"rand_chacha 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)",
"rand_core 0.4.0 (registry+https://github.com/rust-lang/crates.io-index)",
"rand_hc 0.1.0 (registry+https://github.com/rust-lang/crates.io-index)",
"rand_isaac 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)",
"rand_jitter 0.1.3 (registry+https://github.com/rust-lang/crates.io-index)",
"rand_os 0.1.3 (registry+https://github.com/rust-lang/crates.io-index)",
"rand_pcg 0.1.2 (registry+https://github.com/rust-lang/crates.io-index)",
"rand_xorshift 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi 0.3.7 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "rand_chacha"
version = "0.1.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"autocfg 0.1.2 (registry+https://github.com/rust-lang/crates.io-index)",
"rand_core 0.3.1 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "rand_core"
version = "0.3.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"rand_core 0.4.0 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "rand_core"
version = "0.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "rand_hc"
version = "0.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"rand_core 0.3.1 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "rand_isaac"
version = "0.1.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"rand_core 0.3.1 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "rand_jitter"
version = "0.1.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"libc 0.2.51 (registry+https://github.com/rust-lang/crates.io-index)",
"rand_core 0.4.0 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi 0.3.7 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "rand_os"
version = "0.1.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"cloudabi 0.0.3 (registry+https://github.com/rust-lang/crates.io-index)",
"fuchsia-cprng 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)",
"libc 0.2.51 (registry+https://github.com/rust-lang/crates.io-index)",
"rand_core 0.4.0 (registry+https://github.com/rust-lang/crates.io-index)",
"rdrand 0.4.0 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi 0.3.7 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "rand_pcg"
version = "0.1.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"autocfg 0.1.2 (registry+https://github.com/rust-lang/crates.io-index)",
"rand_core 0.4.0 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "rand_xorshift"
version = "0.1.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"rand_core 0.3.1 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "rdrand"
version = "0.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"rand_core 0.3.1 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "redox_syscall"
version = "0.1.52"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "redox_termios"
version = "0.1.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"redox_syscall 0.1.52 (registry+https://github.com/rust-lang/crates.io-index)",
"proc-macro2 1.0.1 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "regex"
version = "1.1.5"
version = "1.2.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"aho-corasick 0.7.3 (registry+https://github.com/rust-lang/crates.io-index)",
"memchr 2.2.0 (registry+https://github.com/rust-lang/crates.io-index)",
"regex-syntax 0.6.6 (registry+https://github.com/rust-lang/crates.io-index)",
"aho-corasick 0.7.6 (registry+https://github.com/rust-lang/crates.io-index)",
"memchr 2.2.1 (registry+https://github.com/rust-lang/crates.io-index)",
"regex-syntax 0.6.11 (registry+https://github.com/rust-lang/crates.io-index)",
"thread_local 0.3.6 (registry+https://github.com/rust-lang/crates.io-index)",
"utf8-ranges 1.0.2 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "regex-automata"
version = "0.1.6"
version = "0.1.8"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"byteorder 1.3.1 (registry+https://github.com/rust-lang/crates.io-index)",
"byteorder 1.3.2 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "regex-syntax"
version = "0.6.6"
version = "0.6.11"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"ucd-util 0.1.3 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "remove_dir_all"
version = "0.5.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"winapi 0.3.7 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "ripgrep"
version = "0.10.0"
version = "11.0.2"
dependencies = [
"bstr 0.1.2 (registry+https://github.com/rust-lang/crates.io-index)",
"bstr 0.2.7 (registry+https://github.com/rust-lang/crates.io-index)",
"clap 2.33.0 (registry+https://github.com/rust-lang/crates.io-index)",
"grep 0.2.3",
"ignore 0.4.6",
"lazy_static 1.3.0 (registry+https://github.com/rust-lang/crates.io-index)",
"log 0.4.6 (registry+https://github.com/rust-lang/crates.io-index)",
"num_cpus 1.10.0 (registry+https://github.com/rust-lang/crates.io-index)",
"regex 1.1.5 (registry+https://github.com/rust-lang/crates.io-index)",
"serde 1.0.90 (registry+https://github.com/rust-lang/crates.io-index)",
"serde_derive 1.0.90 (registry+https://github.com/rust-lang/crates.io-index)",
"serde_json 1.0.39 (registry+https://github.com/rust-lang/crates.io-index)",
"termcolor 1.0.4 (registry+https://github.com/rust-lang/crates.io-index)",
"grep 0.2.4",
"ignore 0.4.10",
"jemallocator 0.3.2 (registry+https://github.com/rust-lang/crates.io-index)",
"lazy_static 1.4.0 (registry+https://github.com/rust-lang/crates.io-index)",
"log 0.4.8 (registry+https://github.com/rust-lang/crates.io-index)",
"num_cpus 1.10.1 (registry+https://github.com/rust-lang/crates.io-index)",
"regex 1.2.1 (registry+https://github.com/rust-lang/crates.io-index)",
"serde 1.0.99 (registry+https://github.com/rust-lang/crates.io-index)",
"serde_derive 1.0.99 (registry+https://github.com/rust-lang/crates.io-index)",
"serde_json 1.0.40 (registry+https://github.com/rust-lang/crates.io-index)",
"termcolor 1.0.5 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "ryu"
version = "0.2.7"
version = "1.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "same-file"
version = "1.0.4"
version = "1.0.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"winapi-util 0.1.2 (registry+https://github.com/rust-lang/crates.io-index)",
@@ -532,34 +407,29 @@ dependencies = [
[[package]]
name = "serde"
version = "1.0.90"
version = "1.0.99"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "serde_derive"
version = "1.0.90"
version = "1.0.99"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"proc-macro2 0.4.27 (registry+https://github.com/rust-lang/crates.io-index)",
"quote 0.6.11 (registry+https://github.com/rust-lang/crates.io-index)",
"syn 0.15.30 (registry+https://github.com/rust-lang/crates.io-index)",
"proc-macro2 1.0.1 (registry+https://github.com/rust-lang/crates.io-index)",
"quote 1.0.2 (registry+https://github.com/rust-lang/crates.io-index)",
"syn 1.0.5 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "serde_json"
version = "1.0.39"
version = "1.0.40"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"itoa 0.4.3 (registry+https://github.com/rust-lang/crates.io-index)",
"ryu 0.2.7 (registry+https://github.com/rust-lang/crates.io-index)",
"serde 1.0.90 (registry+https://github.com/rust-lang/crates.io-index)",
"itoa 0.4.4 (registry+https://github.com/rust-lang/crates.io-index)",
"ryu 1.0.0 (registry+https://github.com/rust-lang/crates.io-index)",
"serde 1.0.99 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "smallvec"
version = "0.6.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "strsim"
version = "0.8.0"
@@ -567,43 +437,20 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "syn"
version = "0.15.30"
version = "1.0.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"proc-macro2 0.4.27 (registry+https://github.com/rust-lang/crates.io-index)",
"quote 0.6.11 (registry+https://github.com/rust-lang/crates.io-index)",
"unicode-xid 0.1.0 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "tempfile"
version = "3.0.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"cfg-if 0.1.7 (registry+https://github.com/rust-lang/crates.io-index)",
"libc 0.2.51 (registry+https://github.com/rust-lang/crates.io-index)",
"rand 0.6.5 (registry+https://github.com/rust-lang/crates.io-index)",
"redox_syscall 0.1.52 (registry+https://github.com/rust-lang/crates.io-index)",
"remove_dir_all 0.5.1 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi 0.3.7 (registry+https://github.com/rust-lang/crates.io-index)",
"proc-macro2 1.0.1 (registry+https://github.com/rust-lang/crates.io-index)",
"quote 1.0.2 (registry+https://github.com/rust-lang/crates.io-index)",
"unicode-xid 0.2.0 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "termcolor"
version = "1.0.4"
version = "1.0.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"wincolor 1.0.1 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "termion"
version = "1.5.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"libc 0.2.51 (registry+https://github.com/rust-lang/crates.io-index)",
"redox_syscall 0.1.52 (registry+https://github.com/rust-lang/crates.io-index)",
"redox_termios 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)",
"wincolor 1.0.2 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
@@ -611,7 +458,7 @@ name = "textwrap"
version = "0.11.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"unicode-width 0.1.5 (registry+https://github.com/rust-lang/crates.io-index)",
"unicode-width 0.1.6 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
@@ -619,42 +466,32 @@ name = "thread_local"
version = "0.3.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"lazy_static 1.3.0 (registry+https://github.com/rust-lang/crates.io-index)",
"lazy_static 1.4.0 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "ucd-util"
version = "0.1.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "unicode-width"
version = "0.1.5"
version = "0.1.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "unicode-xid"
version = "0.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "utf8-ranges"
version = "1.0.2"
version = "0.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "walkdir"
version = "2.2.7"
version = "2.2.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"same-file 1.0.4 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi 0.3.7 (registry+https://github.com/rust-lang/crates.io-index)",
"same-file 1.0.5 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi 0.3.8 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi-util 0.1.2 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "winapi"
version = "0.3.7"
version = "0.3.8"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"winapi-i686-pc-windows-gnu 0.4.0 (registry+https://github.com/rust-lang/crates.io-index)",
@@ -671,7 +508,7 @@ name = "winapi-util"
version = "0.1.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"winapi 0.3.7 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi 0.3.8 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
@@ -681,83 +518,64 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "wincolor"
version = "1.0.1"
version = "1.0.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"winapi 0.3.7 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi 0.3.8 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi-util 0.1.2 (registry+https://github.com/rust-lang/crates.io-index)",
]
[metadata]
"checksum aho-corasick 0.7.3 (registry+https://github.com/rust-lang/crates.io-index)" = "e6f484ae0c99fec2e858eb6134949117399f222608d84cadb3f58c1f97c2364c"
"checksum atty 0.2.11 (registry+https://github.com/rust-lang/crates.io-index)" = "9a7d5b8723950951411ee34d271d99dddcc2035a16ab25310ea2c8cfd4369652"
"checksum autocfg 0.1.2 (registry+https://github.com/rust-lang/crates.io-index)" = "a6d640bee2da49f60a4068a7fae53acde8982514ab7bae8b8cea9e88cbcfd799"
"checksum aho-corasick 0.7.6 (registry+https://github.com/rust-lang/crates.io-index)" = "58fb5e95d83b38284460a5fda7d6470aa0b8844d283a0b614b8535e880800d2d"
"checksum atty 0.2.13 (registry+https://github.com/rust-lang/crates.io-index)" = "1803c647a3ec87095e7ae7acfca019e98de5ec9a7d01343f611cf3152ed71a90"
"checksum base64 0.10.1 (registry+https://github.com/rust-lang/crates.io-index)" = "0b25d992356d2eb0ed82172f5248873db5560c4721f564b13cb5193bda5e668e"
"checksum bitflags 1.0.4 (registry+https://github.com/rust-lang/crates.io-index)" = "228047a76f468627ca71776ecdebd732a3423081fcf5125585bcd7c49886ce12"
"checksum bstr 0.1.2 (registry+https://github.com/rust-lang/crates.io-index)" = "6c8203ca06c502958719dae5f653a79e0cc6ba808ed02beffbf27d09610f2143"
"checksum bitflags 1.1.0 (registry+https://github.com/rust-lang/crates.io-index)" = "3d155346769a6855b86399e9bc3814ab343cd3d62c7e985113d46a0ec3c281fd"
"checksum bstr 0.2.7 (registry+https://github.com/rust-lang/crates.io-index)" = "94cdf78eb7e94c566c1f5dbe2abf8fc70a548fc902942a48c4b3a98b48ca9ade"
"checksum bytecount 0.5.1 (registry+https://github.com/rust-lang/crates.io-index)" = "be0fdd54b507df8f22012890aadd099979befdba27713c767993f8380112ca7c"
"checksum byteorder 1.3.1 (registry+https://github.com/rust-lang/crates.io-index)" = "a019b10a2a7cdeb292db131fc8113e57ea2a908f6e7894b0c3c671893b65dbeb"
"checksum cc 1.0.34 (registry+https://github.com/rust-lang/crates.io-index)" = "30f813bf45048a18eda9190fd3c6b78644146056740c43172a5a3699118588fd"
"checksum cfg-if 0.1.7 (registry+https://github.com/rust-lang/crates.io-index)" = "11d43355396e872eefb45ce6342e4374ed7bc2b3a502d1b28e36d6e23c05d1f4"
"checksum byteorder 1.3.2 (registry+https://github.com/rust-lang/crates.io-index)" = "a7c3dd8985a7111efc5c80b44e23ecdd8c007de8ade3b96595387e812b957cf5"
"checksum cc 1.0.41 (registry+https://github.com/rust-lang/crates.io-index)" = "8dae9c4b8fedcae85592ba623c4fd08cfdab3e3b72d6df780c6ead964a69bfff"
"checksum cfg-if 0.1.9 (registry+https://github.com/rust-lang/crates.io-index)" = "b486ce3ccf7ffd79fdeb678eac06a9e6c09fc88d33836340becb8fffe87c5e33"
"checksum clap 2.33.0 (registry+https://github.com/rust-lang/crates.io-index)" = "5067f5bb2d80ef5d68b4c87db81601f0b75bca627bc2ef76b141d7b846a3c6d9"
"checksum cloudabi 0.0.3 (registry+https://github.com/rust-lang/crates.io-index)" = "ddfc5b9aa5d4507acaf872de71051dfd0e309860e88966e1051e462a077aac4f"
"checksum crossbeam-channel 0.3.8 (registry+https://github.com/rust-lang/crates.io-index)" = "0f0ed1a4de2235cabda8558ff5840bffb97fcb64c97827f354a451307df5f72b"
"checksum crossbeam-utils 0.6.5 (registry+https://github.com/rust-lang/crates.io-index)" = "f8306fcef4a7b563b76b7dd949ca48f52bc1141aa067d2ea09565f3e2652aa5c"
"checksum crossbeam-channel 0.3.9 (registry+https://github.com/rust-lang/crates.io-index)" = "c8ec7fcd21571dc78f96cc96243cab8d8f035247c3efd16c687be154c3fa9efa"
"checksum crossbeam-utils 0.6.6 (registry+https://github.com/rust-lang/crates.io-index)" = "04973fa96e96579258a5091af6003abde64af786b860f18622b82e026cca60e6"
"checksum encoding_rs 0.8.17 (registry+https://github.com/rust-lang/crates.io-index)" = "4155785c79f2f6701f185eb2e6b4caf0555ec03477cb4c70db67b465311620ed"
"checksum encoding_rs_io 0.1.6 (registry+https://github.com/rust-lang/crates.io-index)" = "9619ee7a2bf4e777e020b95c1439abaf008f8ea8041b78a0552c4f1bcf4df32c"
"checksum fnv 1.0.6 (registry+https://github.com/rust-lang/crates.io-index)" = "2fad85553e09a6f881f739c29f0b00b0f01357c743266d478b68951ce23285f3"
"checksum fuchsia-cprng 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)" = "a06f77d526c1a601b7c4cdd98f54b5eaabffc14d5f2f0296febdc7f357c6d3ba"
"checksum glob 0.2.11 (registry+https://github.com/rust-lang/crates.io-index)" = "8be18de09a56b60ed0edf84bc9df007e30040691af7acd1c41874faac5895bfb"
"checksum itoa 0.4.3 (registry+https://github.com/rust-lang/crates.io-index)" = "1306f3464951f30e30d12373d31c79fbd52d236e5e896fd92f96ec7babbbe60b"
"checksum lazy_static 1.3.0 (registry+https://github.com/rust-lang/crates.io-index)" = "bc5729f27f159ddd61f4df6228e827e86643d4d3e7c32183cb30a1c08f604a14"
"checksum libc 0.2.51 (registry+https://github.com/rust-lang/crates.io-index)" = "bedcc7a809076656486ffe045abeeac163da1b558e963a31e29fbfbeba916917"
"checksum log 0.4.6 (registry+https://github.com/rust-lang/crates.io-index)" = "c84ec4b527950aa83a329754b01dbe3f58361d1c5efacd1f6d68c494d08a17c6"
"checksum memchr 2.2.0 (registry+https://github.com/rust-lang/crates.io-index)" = "2efc7bc57c883d4a4d6e3246905283d8dae951bb3bd32f49d6ef297f546e1c39"
"checksum fs_extra 1.1.0 (registry+https://github.com/rust-lang/crates.io-index)" = "5f2a4a2034423744d2cc7ca2068453168dcdb82c438419e639a26bd87839c674"
"checksum glob 0.3.0 (registry+https://github.com/rust-lang/crates.io-index)" = "9b919933a397b79c37e33b77bb2aa3dc8eb6e165ad809e58ff75bc7db2e34574"
"checksum itoa 0.4.4 (registry+https://github.com/rust-lang/crates.io-index)" = "501266b7edd0174f8530248f87f99c88fbe60ca4ef3dd486835b8d8d53136f7f"
"checksum jemalloc-sys 0.3.2 (registry+https://github.com/rust-lang/crates.io-index)" = "0d3b9f3f5c9b31aa0f5ed3260385ac205db665baa41d49bb8338008ae94ede45"
"checksum jemallocator 0.3.2 (registry+https://github.com/rust-lang/crates.io-index)" = "43ae63fcfc45e99ab3d1b29a46782ad679e98436c3169d15a167a1108a724b69"
"checksum lazy_static 1.4.0 (registry+https://github.com/rust-lang/crates.io-index)" = "e2abad23fbc42b3700f2f279844dc832adb2b2eb069b2df918f455c4e18cc646"
"checksum libc 0.2.62 (registry+https://github.com/rust-lang/crates.io-index)" = "34fcd2c08d2f832f376f4173a231990fa5aef4e99fb569867318a227ef4c06ba"
"checksum log 0.4.8 (registry+https://github.com/rust-lang/crates.io-index)" = "14b6052be84e6b71ab17edffc2eeabf5c2c3ae1fdb464aae35ac50c67a44e1f7"
"checksum memchr 2.2.1 (registry+https://github.com/rust-lang/crates.io-index)" = "88579771288728879b57485cc7d6b07d648c9f0141eb955f8ab7f9d45394468e"
"checksum memmap 0.7.0 (registry+https://github.com/rust-lang/crates.io-index)" = "6585fd95e7bb50d6cc31e20d4cf9afb4e2ba16c5846fc76793f11218da9c475b"
"checksum num_cpus 1.10.0 (registry+https://github.com/rust-lang/crates.io-index)" = "1a23f0ed30a54abaa0c7e83b1d2d87ada7c3c23078d1d87815af3e3b6385fbba"
"checksum num_cpus 1.10.1 (registry+https://github.com/rust-lang/crates.io-index)" = "bcef43580c035376c0705c42792c294b66974abbfd2789b511784023f71f3273"
"checksum packed_simd 0.3.3 (registry+https://github.com/rust-lang/crates.io-index)" = "a85ea9fc0d4ac0deb6fe7911d38786b32fc11119afd9e9d38b84ff691ce64220"
"checksum pcre2 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)" = "3ae0a2682105ec5ca0ee5910bbc7e926386d348a05166348f74007942983c319"
"checksum pcre2-sys 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)" = "a9027f9474e4e13d3b965538aafcaebe48c803488ad76b3c97ef061a8324695f"
"checksum pkg-config 0.3.14 (registry+https://github.com/rust-lang/crates.io-index)" = "676e8eb2b1b4c9043511a9b7bea0915320d7e502b0a079fb03f9635a5252b18c"
"checksum proc-macro2 0.4.27 (registry+https://github.com/rust-lang/crates.io-index)" = "4d317f9caece796be1980837fd5cb3dfec5613ebdb04ad0956deea83ce168915"
"checksum quote 0.6.11 (registry+https://github.com/rust-lang/crates.io-index)" = "cdd8e04bd9c52e0342b406469d494fcb033be4bdbe5c606016defbb1681411e1"
"checksum rand 0.6.5 (registry+https://github.com/rust-lang/crates.io-index)" = "6d71dacdc3c88c1fde3885a3be3fbab9f35724e6ce99467f7d9c5026132184ca"
"checksum rand_chacha 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)" = "556d3a1ca6600bfcbab7c7c91ccb085ac7fbbcd70e008a98742e7847f4f7bcef"
"checksum rand_core 0.3.1 (registry+https://github.com/rust-lang/crates.io-index)" = "7a6fdeb83b075e8266dcc8762c22776f6877a63111121f5f8c7411e5be7eed4b"
"checksum rand_core 0.4.0 (registry+https://github.com/rust-lang/crates.io-index)" = "d0e7a549d590831370895ab7ba4ea0c1b6b011d106b5ff2da6eee112615e6dc0"
"checksum rand_hc 0.1.0 (registry+https://github.com/rust-lang/crates.io-index)" = "7b40677c7be09ae76218dc623efbf7b18e34bced3f38883af07bb75630a21bc4"
"checksum rand_isaac 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)" = "ded997c9d5f13925be2a6fd7e66bf1872597f759fd9dd93513dd7e92e5a5ee08"
"checksum rand_jitter 0.1.3 (registry+https://github.com/rust-lang/crates.io-index)" = "7b9ea758282efe12823e0d952ddb269d2e1897227e464919a554f2a03ef1b832"
"checksum rand_os 0.1.3 (registry+https://github.com/rust-lang/crates.io-index)" = "7b75f676a1e053fc562eafbb47838d67c84801e38fc1ba459e8f180deabd5071"
"checksum rand_pcg 0.1.2 (registry+https://github.com/rust-lang/crates.io-index)" = "abf9b09b01790cfe0364f52bf32995ea3c39f4d2dd011eac241d2914146d0b44"
"checksum rand_xorshift 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)" = "cbf7e9e623549b0e21f6e97cf8ecf247c1a8fd2e8a992ae265314300b2455d5c"
"checksum rdrand 0.4.0 (registry+https://github.com/rust-lang/crates.io-index)" = "678054eb77286b51581ba43620cc911abf02758c91f93f479767aed0f90458b2"
"checksum redox_syscall 0.1.52 (registry+https://github.com/rust-lang/crates.io-index)" = "d32b3053e5ced86e4bc0411fec997389532bf56b000e66cb4884eeeb41413d69"
"checksum redox_termios 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)" = "7e891cfe48e9100a70a3b6eb652fef28920c117d366339687bd5576160db0f76"
"checksum regex 1.1.5 (registry+https://github.com/rust-lang/crates.io-index)" = "559008764a17de49a3146b234641644ed37d118d1ef641a0bb573d146edc6ce0"
"checksum regex-automata 0.1.6 (registry+https://github.com/rust-lang/crates.io-index)" = "a25a7daa2eea48550e9946133d6cc9621020d29cc7069089617234bf8b6a8693"
"checksum regex-syntax 0.6.6 (registry+https://github.com/rust-lang/crates.io-index)" = "dcfd8681eebe297b81d98498869d4aae052137651ad7b96822f09ceb690d0a96"
"checksum remove_dir_all 0.5.1 (registry+https://github.com/rust-lang/crates.io-index)" = "3488ba1b9a2084d38645c4c08276a1752dcbf2c7130d74f1569681ad5d2799c5"
"checksum ryu 0.2.7 (registry+https://github.com/rust-lang/crates.io-index)" = "eb9e9b8cde282a9fe6a42dd4681319bfb63f121b8a8ee9439c6f4107e58a46f7"
"checksum same-file 1.0.4 (registry+https://github.com/rust-lang/crates.io-index)" = "8f20c4be53a8a1ff4c1f1b2bd14570d2f634628709752f0702ecdd2b3f9a5267"
"checksum serde 1.0.90 (registry+https://github.com/rust-lang/crates.io-index)" = "aa5f7c20820475babd2c077c3ab5f8c77a31c15e16ea38687b4c02d3e48680f4"
"checksum serde_derive 1.0.90 (registry+https://github.com/rust-lang/crates.io-index)" = "58fc82bec244f168b23d1963b45c8bf5726e9a15a9d146a067f9081aeed2de79"
"checksum serde_json 1.0.39 (registry+https://github.com/rust-lang/crates.io-index)" = "5a23aa71d4a4d43fdbfaac00eff68ba8a06a51759a89ac3304323e800c4dd40d"
"checksum smallvec 0.6.9 (registry+https://github.com/rust-lang/crates.io-index)" = "c4488ae950c49d403731982257768f48fada354a5203fe81f9bb6f43ca9002be"
"checksum pcre2 0.2.1 (registry+https://github.com/rust-lang/crates.io-index)" = "603da5e101220b9b306bf28e4f1f8703458ce2f64d2787b374e1a19494317180"
"checksum pcre2-sys 0.2.2 (registry+https://github.com/rust-lang/crates.io-index)" = "876c72d05059d23a84bd9fcdc3b1d31c50ea7fe00fe1522b4e68cd3608db8d5b"
"checksum pkg-config 0.3.15 (registry+https://github.com/rust-lang/crates.io-index)" = "a7c1d2cfa5a714db3b5f24f0915e74fcdf91d09d496ba61329705dda7774d2af"
"checksum proc-macro2 1.0.1 (registry+https://github.com/rust-lang/crates.io-index)" = "4c5c2380ae88876faae57698be9e9775e3544decad214599c3a6266cca6ac802"
"checksum quote 1.0.2 (registry+https://github.com/rust-lang/crates.io-index)" = "053a8c8bcc71fcce321828dc897a98ab9760bef03a4fc36693c231e5b3216cfe"
"checksum regex 1.2.1 (registry+https://github.com/rust-lang/crates.io-index)" = "88c3d9193984285d544df4a30c23a4e62ead42edf70a4452ceb76dac1ce05c26"
"checksum regex-automata 0.1.8 (registry+https://github.com/rust-lang/crates.io-index)" = "92b73c2a1770c255c240eaa4ee600df1704a38dc3feaa6e949e7fcd4f8dc09f9"
"checksum regex-syntax 0.6.11 (registry+https://github.com/rust-lang/crates.io-index)" = "b143cceb2ca5e56d5671988ef8b15615733e7ee16cd348e064333b251b89343f"
"checksum ryu 1.0.0 (registry+https://github.com/rust-lang/crates.io-index)" = "c92464b447c0ee8c4fb3824ecc8383b81717b9f1e74ba2e72540aef7b9f82997"
"checksum same-file 1.0.5 (registry+https://github.com/rust-lang/crates.io-index)" = "585e8ddcedc187886a30fa705c47985c3fa88d06624095856b36ca0b82ff4421"
"checksum serde 1.0.99 (registry+https://github.com/rust-lang/crates.io-index)" = "fec2851eb56d010dc9a21b89ca53ee75e6528bab60c11e89d38390904982da9f"
"checksum serde_derive 1.0.99 (registry+https://github.com/rust-lang/crates.io-index)" = "cb4dc18c61206b08dc98216c98faa0232f4337e1e1b8574551d5bad29ea1b425"
"checksum serde_json 1.0.40 (registry+https://github.com/rust-lang/crates.io-index)" = "051c49229f282f7c6f3813f8286cc1e3323e8051823fce42c7ea80fe13521704"
"checksum strsim 0.8.0 (registry+https://github.com/rust-lang/crates.io-index)" = "8ea5119cdb4c55b55d432abb513a0429384878c15dde60cc77b1c99de1a95a6a"
"checksum syn 0.15.30 (registry+https://github.com/rust-lang/crates.io-index)" = "66c8865bf5a7cbb662d8b011950060b3c8743dca141b054bf7195b20d314d8e2"
"checksum tempfile 3.0.7 (registry+https://github.com/rust-lang/crates.io-index)" = "b86c784c88d98c801132806dadd3819ed29d8600836c4088e855cdf3e178ed8a"
"checksum termcolor 1.0.4 (registry+https://github.com/rust-lang/crates.io-index)" = "4096add70612622289f2fdcdbd5086dc81c1e2675e6ae58d6c4f62a16c6d7f2f"
"checksum termion 1.5.1 (registry+https://github.com/rust-lang/crates.io-index)" = "689a3bdfaab439fd92bc87df5c4c78417d3cbe537487274e9b0b2dce76e92096"
"checksum syn 1.0.5 (registry+https://github.com/rust-lang/crates.io-index)" = "66850e97125af79138385e9b88339cbcd037e3f28ceab8c5ad98e64f0f1f80bf"
"checksum termcolor 1.0.5 (registry+https://github.com/rust-lang/crates.io-index)" = "96d6098003bde162e4277c70665bd87c326f5a0c3f3fbfb285787fa482d54e6e"
"checksum textwrap 0.11.0 (registry+https://github.com/rust-lang/crates.io-index)" = "d326610f408c7a4eb6f51c37c330e496b08506c9457c9d34287ecc38809fb060"
"checksum thread_local 0.3.6 (registry+https://github.com/rust-lang/crates.io-index)" = "c6b53e329000edc2b34dbe8545fd20e55a333362d0a321909685a19bd28c3f1b"
"checksum ucd-util 0.1.3 (registry+https://github.com/rust-lang/crates.io-index)" = "535c204ee4d8434478593480b8f86ab45ec9aae0e83c568ca81abf0fd0e88f86"
"checksum unicode-width 0.1.5 (registry+https://github.com/rust-lang/crates.io-index)" = "882386231c45df4700b275c7ff55b6f3698780a650026380e72dabe76fa46526"
"checksum unicode-xid 0.1.0 (registry+https://github.com/rust-lang/crates.io-index)" = "fc72304796d0818e357ead4e000d19c9c174ab23dc11093ac919054d20a6a7fc"
"checksum utf8-ranges 1.0.2 (registry+https://github.com/rust-lang/crates.io-index)" = "796f7e48bef87609f7ade7e06495a87d5cd06c7866e6a5cbfceffc558a243737"
"checksum walkdir 2.2.7 (registry+https://github.com/rust-lang/crates.io-index)" = "9d9d7ed3431229a144296213105a390676cc49c9b6a72bd19f3176c98e129fa1"
"checksum winapi 0.3.7 (registry+https://github.com/rust-lang/crates.io-index)" = "f10e386af2b13e47c89e7236a7a14a086791a2b88ebad6df9bf42040195cf770"
"checksum unicode-width 0.1.6 (registry+https://github.com/rust-lang/crates.io-index)" = "7007dbd421b92cc6e28410fe7362e2e0a2503394908f417b68ec8d1c364c4e20"
"checksum unicode-xid 0.2.0 (registry+https://github.com/rust-lang/crates.io-index)" = "826e7639553986605ec5979c7dd957c7895e93eabed50ab2ffa7f6128a75097c"
"checksum walkdir 2.2.9 (registry+https://github.com/rust-lang/crates.io-index)" = "9658c94fa8b940eab2250bd5a457f9c48b748420d71293b165c8cdbe2f55f71e"
"checksum winapi 0.3.8 (registry+https://github.com/rust-lang/crates.io-index)" = "8093091eeb260906a183e6ae1abdba2ef5ef2257a21801128899c3fc699229c6"
"checksum winapi-i686-pc-windows-gnu 0.4.0 (registry+https://github.com/rust-lang/crates.io-index)" = "ac3b87c63620426dd9b991e5ce0329eff545bccbbb34f3be09ff6fb6ab51b7b6"
"checksum winapi-util 0.1.2 (registry+https://github.com/rust-lang/crates.io-index)" = "7168bab6e1daee33b4557efd0e95d5ca70a03706d39fa5f3fe7a236f584b03c9"
"checksum winapi-x86_64-pc-windows-gnu 0.4.0 (registry+https://github.com/rust-lang/crates.io-index)" = "712e227841d057c1ee1cd2fb22fa7e5a5461ae8e48fa2ca79ec42cfc1931183f"
"checksum wincolor 1.0.1 (registry+https://github.com/rust-lang/crates.io-index)" = "561ed901ae465d6185fa7864d63fbd5720d0ef718366c9a4dc83cf6170d7e9ba"
"checksum wincolor 1.0.2 (registry+https://github.com/rust-lang/crates.io-index)" = "96f5016b18804d24db43cebf3c77269e7569b8954a8464501c216cc5e070eaa9"

View File

@@ -1,11 +1,11 @@
[package]
name = "ripgrep"
version = "0.10.0" #:version
version = "11.0.2" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
ripgrep is a line-oriented search tool that recursively searches your current
directory for a regex pattern while respecting your gitignore rules. ripgrep
has first class support on Windows, macOS and Linux
has first class support on Windows, macOS and Linux.
"""
documentation = "https://github.com/BurntSushi/ripgrep"
homepage = "https://github.com/BurntSushi/ripgrep"
@@ -46,9 +46,9 @@ members = [
]
[dependencies]
bstr = "0.1.2"
grep = { version = "0.2.3", path = "grep" }
ignore = { version = "0.4.4", path = "ignore" }
bstr = "0.2.0"
grep = { version = "0.2.4", path = "grep" }
ignore = { version = "0.4.7", path = "ignore" }
lazy_static = "1.1.0"
log = "0.4.5"
num_cpus = "1.8.0"
@@ -61,6 +61,9 @@ version = "2.32.0"
default-features = false
features = ["suggestions"]
[target.'cfg(all(target_env = "musl", target_pointer_width = "64"))'.dependencies.jemallocator]
version = "0.3.0"
[build-dependencies]
lazy_static = "1.1.0"

2
Cross.toml Normal file
View File

@@ -0,0 +1,2 @@
[target.x86_64-unknown-linux-musl]
image = "burntsushi/x86_64-unknown-linux-musl:v0.1.14"

2
FAQ.md
View File

@@ -935,7 +935,7 @@ for the previous section apply.
* Are you writing portable shell scripts intended to work in a variety of
environments? Great, probably not a good idea to use ripgrep! ripgrep is has
nowhere near the ubquity of grep, so if you do use ripgrep, you might need
nowhere near the ubiquity of grep, so if you do use ripgrep, you might need
to futz with the installation process more than you would with grep.
* Do you care about POSIX compatibility? If so, then you can't use ripgrep
because it never was, isn't and never will be POSIX compatible.

View File

@@ -18,6 +18,7 @@ translatable to any command line shell environment.
* [Replacements](#replacements)
* [Configuration file](#configuration-file)
* [File encoding](#file-encoding)
* [Binary data](#binary-data)
* [Common options](#common-options)
@@ -109,7 +110,7 @@ colors, you'll notice that `faster` will be highlighted instead of just the
It is beyond the scope of this guide to provide a full tutorial on regular
expressions, but ripgrep's specific syntax is documented here:
https://docs.rs/regex/0.2.5/regex/#syntax
https://docs.rs/regex/*/regex/#syntax
### Recursive search
@@ -537,8 +538,9 @@ formatting peculiarities:
```
$ cat $HOME/.ripgreprc
# Don't let ripgrep vomit really long lines to my terminal.
# Don't let ripgrep vomit really long lines to my terminal, and show a preview.
--max-columns=150
--max-columns-preview
# Add my 'web' type.
--type-add
@@ -680,6 +682,76 @@ $ rg '\w(?-u:\w)\w'
```
### Binary data
In addition to skipping hidden files and files in your `.gitignore` by default,
ripgrep also attempts to skip binary files. ripgrep does this by default
because binary files (like PDFs or images) are typically not things you want to
search when searching for regex matches. Moreover, if content in a binary file
did match, then it's possible for undesirable binary data to be printed to your
terminal and wreak havoc.
Unfortunately, unlike skipping hidden files and respecting your `.gitignore`
rules, a file cannot as easily be classified as binary. In order to figure out
whether a file is binary, the most effective heuristic that balances
correctness with performance is to simply look for `NUL` bytes. At that point,
the determination is simple: a file is considered "binary" if and only if it
contains a `NUL` byte somewhere in its contents.
The issue is that while most binary files will have a `NUL` byte toward the
beginning of its contents, this is not necessarily true. The `NUL` byte might
be the very last byte in a large file, but that file is still considered
binary. While this leads to a fair amount of complexity inside ripgrep's
implementation, it also results in some unintuitive user experiences.
At a high level, ripgrep operates in three different modes with respect to
binary files:
1. The default mode is to attempt to remove binary files from a search
completely. This is meant to mirror how ripgrep removes hidden files and
files in your `.gitignore` automatically. That is, as soon as a file is
detected as binary, searching stops. If a match was already printed (because
it was detected long before a `NUL` byte), then ripgrep will print a warning
message indicating that the search stopped prematurely. This default mode
**only applies to files searched by ripgrep as a result of recursive
directory traversal**, which is consistent with ripgrep's other automatic
filtering. For example, `rg foo .file` will search `.file` even though it
is hidden. Similarly, `rg foo binary-file` search `binary-file` in "binary"
mode automatically.
2. Binary mode is similar to the default mode, except it will not always
stop searching after it sees a `NUL` byte. Namely, in this mode, ripgrep
will continue searching a file that is known to be binary until the first
of two conditions is met: 1) the end of the file has been reached or 2) a
match is or has been seen. This means that in binary mode, if ripgrep
reports no matches, then there are no matches in the file. When a match does
occur, ripgrep prints a message similar to one it prints when in its default
mode indicating that the search has stopped prematurely. This mode can be
forcefully enabled for all files with the `--binary` flag. The purpose of
binary mode is to provide a way to discover matches in all files, but to
avoid having binary data dumped into your terminal.
3. Text mode completely disables all binary detection and searches all files
as if they were text. This is useful when searching a file that is
predominantly text but contains a `NUL` byte, or if you are specifically
trying to search binary data. This mode can be enabled with the `-a/--text`
flag. Note that when using this mode on very large binary files, it is
possible for ripgrep to use a lot of memory.
Unfortunately, there is one additional complexity in ripgrep that can make it
difficult to reason about binary files. That is, the way binary detection works
depends on the way that ripgrep searches your files. Specifically:
* When ripgrep uses memory maps, then binary detection is only performed on the
first few kilobytes of the file in addition to every matching line.
* When ripgrep doesn't use memory maps, then binary detection is performed on
all bytes searched.
This means that whether a file is detected as binary or not can change based
on the internal search strategy used by ripgrep. If you prefer to keep
ripgrep's binary file detection consistent, then you can disable memory maps
via the `--no-mmap` flag. (The cost will be a small performance regression when
searching very large files on some platforms.)
### Common options
ripgrep has a lot of flags. Too many to keep in your head at once. This section

View File

@@ -11,6 +11,7 @@ and grep.
[![Linux build status](https://travis-ci.org/BurntSushi/ripgrep.svg)](https://travis-ci.org/BurntSushi/ripgrep)
[![Windows build status](https://ci.appveyor.com/api/projects/status/github/BurntSushi/ripgrep?svg=true)](https://ci.appveyor.com/project/BurntSushi/ripgrep)
[![Crates.io](https://img.shields.io/crates/v/ripgrep.svg)](https://crates.io/crates/ripgrep)
[![Packaging status](https://repology.org/badge/tiny-repos/ripgrep.svg)](https://repology.org/project/ripgrep/badges)
Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org).
@@ -28,6 +29,7 @@ Please see the [CHANGELOG](CHANGELOG.md) for a release history.
* [Configuration files](GUIDE.md#configuration-file)
* [Shell completions](FAQ.md#complete)
* [Building](#building)
* [Translations](#translations)
### Screenshot of search results
@@ -91,11 +93,11 @@ increases the times to `2.640s` for ripgrep and `10.277s` for GNU grep.
[the FAQ](FAQ.md#posix4ever) for more details on whether ripgrep can truly
replace grep.)
* Like other tools specialized to code search, ripgrep defaults to recursive
directory search and won't search files ignored by your `.gitignore` files.
It also ignores hidden and binary files by default. ripgrep also implements
full support for `.gitignore`, whereas there are many bugs related to that
functionality in other code search tools claiming to provide the same
functionality.
directory search and won't search files ignored by your
`.gitignore`/`.ignore`/`.rgignore` files. It also ignores hidden and binary
files by default. ripgrep also implements full support for `.gitignore`,
whereas there are many bugs related to that functionality in other code
search tools claiming to provide the same functionality.
* ripgrep can search specific types of files. For example, `rg -tpy foo`
limits your search to Python files and `rg -Tjs foo` excludes Javascript
files from your search. ripgrep can be taught about new file types with
@@ -107,13 +109,14 @@ increases the times to `2.640s` for ripgrep and `10.277s` for GNU grep.
* ripgrep has optional support for switching its regex engine to use PCRE2.
Among other things, this makes it possible to use look-around and
backreferences in your patterns, which are not supported in ripgrep's default
regex engine. PCRE2 support is enabled with `-P`.
regex engine. PCRE2 support can be enabled with `-P/--pcre2` (use PCRE2
always) or `--auto-hybrid-regex` (use PCRE2 only if needed).
* ripgrep supports searching files in text encodings other than UTF-8, such
as UTF-16, latin-1, GBK, EUC-JP, Shift_JIS and more. (Some support for
automatically detecting UTF-16 is provided. Other text encodings must be
specifically specified with the `-E/--encoding` flag.)
* ripgrep supports searching files compressed in a common format (gzip, xz,
lzma, bzip2 or lz4) with the `-z/--search-zip` flag.
* ripgrep supports searching files compressed in a common format (brotli,
bzip2, gzip, lz4, lzma, xz, or zstandard) with the `-z/--search-zip` flag.
* ripgrep supports arbitrary input preprocessing filters which could be PDF
text extraction, less supported decompression, decrypting, automatic encoding
detection and so on.
@@ -207,14 +210,6 @@ from homebrew-core, (compiled with rust stable, no SIMD):
$ brew install ripgrep
```
or you can install a binary compiled with rust nightly (including SIMD and all
optimizations) by utilizing a custom tap:
```
$ brew tap burntsushi/ripgrep https://github.com/BurntSushi/ripgrep.git
$ brew install ripgrep-bin
```
If you're a **MacPorts** user, then you can install ripgrep from the
[official ports](https://www.macports.org/ports.php?by=name&substr=ripgrep):
@@ -230,7 +225,7 @@ $ choco install ripgrep
```
If you're a **Windows Scoop** user, then you can install ripgrep from the
[official bucket](https://github.com/lukesampson/scoop/blob/master/bucket/ripgrep.json):
[official bucket](https://github.com/ScoopInstaller/Main/blob/master/bucket/ripgrep.json):
```
$ scoop install ripgrep
@@ -293,11 +288,11 @@ then ripgrep can be installed using a binary `.deb` file provided in each
[ripgrep release](https://github.com/BurntSushi/ripgrep/releases).
```
$ curl -LO https://github.com/BurntSushi/ripgrep/releases/download/0.10.0/ripgrep_0.10.0_amd64.deb
$ sudo dpkg -i ripgrep_0.10.0_amd64.deb
$ curl -LO https://github.com/BurntSushi/ripgrep/releases/download/11.0.2/ripgrep_11.0.2_amd64.deb
$ sudo dpkg -i ripgrep_11.0.2_amd64.deb
```
If you run Debian Buster (currently Debian testing) or Debian sid, ripgrep is
If you run Debian Buster (currently Debian stable) or Debian sid, ripgrep is
[officially maintained by Debian](https://tracker.debian.org/pkg/rust-ripgrep).
```
$ sudo apt-get install ripgrep
@@ -339,7 +334,7 @@ If you're a **NetBSD** user, then you can install ripgrep from
If you're a **Rust programmer**, ripgrep can be installed with `cargo`.
* Note that the minimum supported version of Rust for ripgrep is **1.32.0**,
* Note that the minimum supported version of Rust for ripgrep is **1.34.0**,
although ripgrep may work with older versions.
* Note that the binary may be bigger than expected because it contains debug
symbols. This is intentional. To remove debug symbols and therefore reduce
@@ -349,18 +344,12 @@ If you're a **Rust programmer**, ripgrep can be installed with `cargo`.
$ cargo install ripgrep
```
When compiling with Rust 1.27 or newer, this will automatically enable SIMD
optimizations for search.
ripgrep isn't currently in any other package repositories.
[I'd like to change that](https://github.com/BurntSushi/ripgrep/issues/10).
### Building
ripgrep is written in Rust, so you'll need to grab a
[Rust installation](https://www.rust-lang.org/) in order to compile it.
ripgrep compiles with Rust 1.32.0 (stable) or newer. In general, ripgrep tracks
ripgrep compiles with Rust 1.34.0 (stable) or newer. In general, ripgrep tracks
the latest stable release of the Rust compiler.
To build ripgrep:
@@ -430,3 +419,11 @@ $ cargo test --all
```
from the repository root.
### Translations
The following is a list of known translations of ripgrep's documentation. These
are unofficially maintained and may not be up to date.
* [Chinese](https://github.com/chinanf-boy/ripgrep-zh#%E6%9B%B4%E6%96%B0-)

View File

@@ -8,12 +8,13 @@ set -ex
# Generate artifacts for release
mk_artifacts() {
CARGO="$(builder)"
if is_arm; then
cargo build --target "$TARGET" --release
"$CARGO" build --target "$TARGET" --release
else
# Technically, MUSL builds will force PCRE2 to get statically compiled,
# but we also want PCRE2 statically build for macOS binaries.
PCRE2_SYS_STATIC=1 cargo build --target "$TARGET" --release --features 'pcre2'
PCRE2_SYS_STATIC=1 "$CARGO" build --target "$TARGET" --release --features 'pcre2'
fi
}

View File

@@ -0,0 +1,5 @@
FROM japaric/x86_64-unknown-linux-musl:v0.1.14
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
libxslt1-dev asciidoc docbook-xsl xsltproc libxml2-utils

View File

@@ -7,11 +7,13 @@ set -ex
. "$(dirname $0)/utils.sh"
main() {
CARGO="$(builder)"
# Test a normal debug build.
if is_arm; then
cargo build --target "$TARGET" --verbose
"$CARGO" build --target "$TARGET" --verbose
else
cargo build --target "$TARGET" --verbose --all --features 'pcre2'
"$CARGO" build --target "$TARGET" --verbose --all --features 'pcre2'
fi
# Show the output of the most recent build.rs stderr.
@@ -44,7 +46,7 @@ main() {
"$(dirname "${0}")/test_complete.sh"
# Run tests for ripgrep and all sub-crates.
cargo test --target "$TARGET" --verbose --all --features 'pcre2'
"$CARGO" test --target "$TARGET" --verbose --all --features 'pcre2'
}
main

View File

@@ -55,6 +55,13 @@ gcc_prefix() {
esac
}
is_musl() {
case "$TARGET" in
*-musl) return 0 ;;
*) return 1 ;;
esac
}
is_x86() {
case "$(architecture)" in
amd64|i386) return 0 ;;
@@ -62,6 +69,13 @@ is_x86() {
esac
}
is_x86_64() {
case "$(architecture)" in
amd64) return 0 ;;
*) return 1 ;;
esac
}
is_arm() {
case "$(architecture)" in
armhf) return 0 ;;
@@ -82,3 +96,12 @@ is_osx() {
*) return 1 ;;
esac
}
builder() {
if is_musl && is_x86_64; then
cargo install cross
echo "cross"
else
echo "cargo"
fi
}

View File

@@ -43,6 +43,7 @@ _rg() {
+ '(exclusive)' # Misc. fully exclusive options
'(: * -)'{-h,--help}'[display help information]'
'(: * -)'{-V,--version}'[display version information]'
'(: * -)'--pcre2-version'[print the version of PCRE2 used by ripgrep, if available]'
+ '(buffered)' # buffering options
'--line-buffered[force line buffering]'
@@ -85,7 +86,7 @@ _rg() {
+ '(file-name)' # File-name options
{-H,--with-filename}'[show file name for matches]'
"--no-filename[don't show file name for matches]"
{-I,--no-filename}"[don't show file name for matches]"
+ '(file-system)' # File system options
"--one-file-system[don't descend into directories on other file systems]"
@@ -103,6 +104,10 @@ _rg() {
'*'{-g+,--glob=}'[include/exclude files matching specified glob]:glob'
'*--iglob=[include/exclude files matching specified case-insensitive glob]:glob'
+ '(glob-case-insensitive)' # File-glob case sensitivity options
'--glob-case-insensitive[treat -g/--glob patterns case insensitively]'
$no'--no-glob-case-insensitive[treat -g/--glob patterns case sensitively]'
+ '(heading)' # Heading options
'(pretty-vimgrep)--heading[show matches grouped by file name]'
"(pretty-vimgrep)--no-heading[don't show matches grouped by file name]"
@@ -111,6 +116,10 @@ _rg() {
'--hidden[search hidden files and directories]'
$no"--no-hidden[don't search hidden files and directories]"
+ '(hybrid)' # hybrid regex options
'--auto-hybrid-regex[dynamically use PCRE2 if necessary]'
$no"--no-auto-hybrid-regex[don't dynamically use PCRE2 if necessary]"
+ '(ignore)' # Ignore-file options
"(--no-ignore-global --no-ignore-parent --no-ignore-vcs --no-ignore-dot)--no-ignore[don't respect ignore files]"
$no'(--ignore-global --ignore-parent --ignore-vcs --ignore-dot)--ignore[respect ignore files]'
@@ -148,6 +157,10 @@ _rg() {
$no"--no-crlf[don't use CRLF as line terminator]"
'(text)--null-data[use NUL as line terminator]'
+ '(max-columns-preview)' # max column preview options
'--max-columns-preview[show preview for long lines (with -M)]'
$no"--no-max-columns-preview[don't show preview for long lines (with -M)]"
+ '(max-depth)' # Directory-depth options
'--max-depth=[specify max number of directories to descend]:number of directories'
'!--maxdepth=:number of directories'
@@ -227,6 +240,8 @@ _rg() {
+ '(text)' # Binary-search options
{-a,--text}'[search binary files as if they were text]'
"--binary[search binary files, don't print binary data]"
$no"--no-binary[don't search binary files]"
$no"(--null-data)--no-text[don't search binary files as if they were text]"
+ '(threads)' # Thread-count options

View File

@@ -41,6 +41,9 @@ configuration file. The file can specify one shell argument per line. Lines
starting with *#* are ignored. For more details, see the man page or the
*README*.
Tip: to disable all smart filtering and make ripgrep behave a bit more like
classical grep, use *rg -uuu*.
REGEX SYNTAX
------------
@@ -140,16 +143,16 @@ would behave identically to the following command
same with using globs
--glob=!git/*
--glob=!.git
or
--glob
!git/*
!.git
would behave identically to the following command
rg --glob '!git/*' foo
rg --glob '!.git' foo
ripgrep also provides a flag, *--no-config*, that when present will suppress
any and all support for configuration. This includes any future support
@@ -189,6 +192,21 @@ file that is simultaneously truncated. This behavior can be avoided by passing
the *--no-mmap* flag which will forcefully disable the use of memory maps in
all cases.
ripgrep may use a large amount of memory depending on a few factors. Firstly,
if ripgrep uses parallelism for search (the default), then the entire output
for each individual file is buffered into memory in order to prevent
interleaving matches in the output. To avoid this, you can disable parallelism
with the *-j1* flag. Secondly, ripgrep always needs to have at least a single
line in memory in order to execute a search. A file with a very long line can
thus cause ripgrep to use a lot of memory. Generally, this only occurs when
searching binary data with the *-a* flag enabled. (When the *-a* flag isn't
enabled, ripgrep will replace all NUL bytes with line terminators, which
typically prevents exorbitant memory usage.) Thirdly, when ripgrep searches
a large file using a memory map, the process will report its resident memory
usage as the size of the file. However, this does not mean ripgrep actually
needed to use that much memory; the operating system will generally handle this
for you.
VERSION
-------

View File

@@ -1,6 +1,6 @@
[package]
name = "globset"
version = "0.4.2" #:version
version = "0.4.4" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
Cross platform single glob and glob set matching. Glob set matching is the
@@ -20,13 +20,13 @@ bench = false
[dependencies]
aho-corasick = "0.7.3"
bstr = { version = "0.1.2", default-features = false, features = ["std"] }
bstr = { version = "0.2.0", default-features = false, features = ["std"] }
fnv = "1.0.6"
log = "0.4.5"
regex = "1.1.5"
[dev-dependencies]
glob = "0.2.11"
glob = "0.3.0"
[features]
simd-accel = []

View File

@@ -120,7 +120,7 @@ impl GlobMatcher {
/// Tests whether the given path matches this pattern or not.
pub fn is_match_candidate(&self, path: &Candidate) -> bool {
self.re.is_match(path.path.as_bytes())
self.re.is_match(&path.path)
}
}
@@ -145,7 +145,7 @@ impl GlobStrategic {
/// Tests whether the given path matches this pattern or not.
fn is_match_candidate(&self, candidate: &Candidate) -> bool {
let byte_path = candidate.path.as_bytes();
let byte_path = &*candidate.path;
match self.strategy {
MatchStrategy::Literal(ref lit) => lit.as_bytes() == byte_path,

View File

@@ -119,7 +119,7 @@ use std::path::Path;
use std::str;
use aho_corasick::AhoCorasick;
use bstr::{B, BStr, BString};
use bstr::{B, ByteSlice, ByteVec};
use regex::bytes::{Regex, RegexBuilder, RegexSet};
use pathutil::{file_name, file_name_ext, normalize_path};
@@ -490,15 +490,15 @@ impl GlobSetBuilder {
/// path against multiple globs or sets of globs.
#[derive(Clone, Debug)]
pub struct Candidate<'a> {
path: Cow<'a, BStr>,
basename: Cow<'a, BStr>,
ext: Cow<'a, BStr>,
path: Cow<'a, [u8]>,
basename: Cow<'a, [u8]>,
ext: Cow<'a, [u8]>,
}
impl<'a> Candidate<'a> {
/// Create a new candidate for matching from the given path.
pub fn new<P: AsRef<Path> + ?Sized>(path: &'a P) -> Candidate<'a> {
let path = normalize_path(BString::from_path_lossy(path.as_ref()));
let path = normalize_path(Vec::from_path_lossy(path.as_ref()));
let basename = file_name(&path).unwrap_or(Cow::Borrowed(B("")));
let ext = file_name_ext(&basename).unwrap_or(Cow::Borrowed(B("")));
Candidate {
@@ -508,7 +508,7 @@ impl<'a> Candidate<'a> {
}
}
fn path_prefix(&self, max: usize) -> &BStr {
fn path_prefix(&self, max: usize) -> &[u8] {
if self.path.len() <= max {
&*self.path
} else {
@@ -516,7 +516,7 @@ impl<'a> Candidate<'a> {
}
}
fn path_suffix(&self, max: usize) -> &BStr {
fn path_suffix(&self, max: usize) -> &[u8] {
if self.path.len() <= max {
&*self.path
} else {

View File

@@ -1,15 +1,15 @@
use std::borrow::Cow;
use bstr::BStr;
use bstr::{ByteSlice, ByteVec};
/// The final component of the path, if it is a normal file.
///
/// If the path terminates in ., .., or consists solely of a root of prefix,
/// file_name will return None.
pub fn file_name<'a>(path: &Cow<'a, BStr>) -> Option<Cow<'a, BStr>> {
pub fn file_name<'a>(path: &Cow<'a, [u8]>) -> Option<Cow<'a, [u8]>> {
if path.is_empty() {
return None;
} else if path.last() == Some(b'.') {
} else if path.last_byte() == Some(b'.') {
return None;
}
let last_slash = path.rfind_byte(b'/').map(|i| i + 1).unwrap_or(0);
@@ -39,7 +39,7 @@ pub fn file_name<'a>(path: &Cow<'a, BStr>) -> Option<Cow<'a, BStr>> {
/// a pattern like `*.rs` is obviously trying to match files with a `rs`
/// extension, but it also matches files like `.rs`, which doesn't have an
/// extension according to std::path::Path::extension.
pub fn file_name_ext<'a>(name: &Cow<'a, BStr>) -> Option<Cow<'a, BStr>> {
pub fn file_name_ext<'a>(name: &Cow<'a, [u8]>) -> Option<Cow<'a, [u8]>> {
if name.is_empty() {
return None;
}
@@ -60,7 +60,7 @@ pub fn file_name_ext<'a>(name: &Cow<'a, BStr>) -> Option<Cow<'a, BStr>> {
/// Normalizes a path to use `/` as a separator everywhere, even on platforms
/// that recognize other characters as separators.
#[cfg(unix)]
pub fn normalize_path(path: Cow<BStr>) -> Cow<BStr> {
pub fn normalize_path(path: Cow<[u8]>) -> Cow<[u8]> {
// UNIX only uses /, so we're good.
path
}
@@ -68,7 +68,7 @@ pub fn normalize_path(path: Cow<BStr>) -> Cow<BStr> {
/// Normalizes a path to use `/` as a separator everywhere, even on platforms
/// that recognize other characters as separators.
#[cfg(not(unix))]
pub fn normalize_path(mut path: Cow<BStr>) -> Cow<BStr> {
pub fn normalize_path(mut path: Cow<[u8]>) -> Cow<[u8]> {
use std::path::is_separator;
for i in 0..path.len() {
@@ -84,7 +84,7 @@ pub fn normalize_path(mut path: Cow<BStr>) -> Cow<BStr> {
mod tests {
use std::borrow::Cow;
use bstr::{B, BString};
use bstr::{B, ByteVec};
use super::{file_name_ext, normalize_path};
@@ -92,7 +92,7 @@ mod tests {
($name:ident, $file_name:expr, $ext:expr) => {
#[test]
fn $name() {
let bs = BString::from($file_name);
let bs = Vec::from($file_name);
let got = file_name_ext(&Cow::Owned(bs));
assert_eq!($ext.map(|s| Cow::Borrowed(B(s))), got);
}
@@ -109,7 +109,7 @@ mod tests {
($name:ident, $path:expr, $expected:expr) => {
#[test]
fn $name() {
let bs = BString::from_slice($path);
let bs = Vec::from_slice($path);
let got = normalize_path(Cow::Owned(bs));
assert_eq!($expected.to_vec(), got.into_owned());
}

View File

@@ -1,6 +1,6 @@
[package]
name = "grep-cli"
version = "0.1.1" #:version
version = "0.1.3" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
Utilities for search oriented command line applications.
@@ -14,8 +14,8 @@ license = "Unlicense/MIT"
[dependencies]
atty = "0.2.11"
bstr = "0.1.2"
globset = { version = "0.4.2", path = "../globset" }
bstr = "0.2.0"
globset = { version = "0.4.3", path = "../globset" }
lazy_static = "1.1.0"
log = "0.4.5"
regex = "1.1"

View File

@@ -1,7 +1,7 @@
use std::ffi::OsStr;
use std::str;
use bstr::{BStr, BString};
use bstr::{ByteSlice, ByteVec};
/// A single state in the state machine used by `unescape`.
#[derive(Clone, Copy, Eq, PartialEq)]
@@ -38,7 +38,6 @@ enum State {
/// assert_eq!(r"foo\nbar\xFFbaz", escape(b"foo\nbar\xFFbaz"));
/// ```
pub fn escape(bytes: &[u8]) -> String {
let bytes = BStr::new(bytes);
let mut escaped = String::new();
for (s, e, ch) in bytes.char_indices() {
if ch == '\u{FFFD}' {
@@ -56,7 +55,7 @@ pub fn escape(bytes: &[u8]) -> String {
///
/// This is like [`escape`](fn.escape.html), but accepts an OS string.
pub fn escape_os(string: &OsStr) -> String {
escape(BString::from_os_str_lossy(string).as_bytes())
escape(Vec::from_os_str_lossy(string).as_bytes())
}
/// Unescapes a string.
@@ -111,7 +110,7 @@ pub fn unescape(s: &str) -> Vec<u8> {
}
HexFirst => {
match c {
'0'...'9' | 'A'...'F' | 'a'...'f' => {
'0'..='9' | 'A'..='F' | 'a'..='f' => {
state = HexSecond(c);
}
c => {
@@ -122,7 +121,7 @@ pub fn unescape(s: &str) -> Vec<u8> {
}
HexSecond(first) => {
match c {
'0'...'9' | 'A'...'F' | 'a'...'f' => {
'0'..='9' | 'A'..='F' | 'a'..='f' => {
let ordinal = format!("{}{}", first, c);
let byte = u8::from_str_radix(&ordinal, 16).unwrap();
bytes.push(byte);
@@ -174,7 +173,7 @@ fn escape_char(cp: char, into: &mut String) {
/// Adds the given byte to the given string, escaping it if necessary.
fn escape_byte(byte: u8, into: &mut String) {
match byte {
0x21...0x5B | 0x5D...0x7D => into.push(byte as char),
0x21..=0x5B | 0x5D..=0x7D => into.push(byte as char),
b'\n' => into.push_str(r"\n"),
b'\r' => into.push_str(r"\r"),
b'\t' => into.push_str(r"\t"),

View File

@@ -2,10 +2,12 @@ use std::error;
use std::ffi::OsStr;
use std::fmt;
use std::fs::File;
use std::io::{self, BufRead};
use std::io;
use std::path::Path;
use std::str;
use bstr::io::BufReadExt;
use escape::{escape, escape_os};
/// An error that occurs when a pattern could not be converted to valid UTF-8.
@@ -156,28 +158,22 @@ pub fn patterns_from_stdin() -> io::Result<Vec<String>> {
/// ```
pub fn patterns_from_reader<R: io::Read>(rdr: R) -> io::Result<Vec<String>> {
let mut patterns = vec![];
let mut bufrdr = io::BufReader::new(rdr);
let mut line = vec![];
let mut line_number = 0;
while {
line.clear();
io::BufReader::new(rdr).for_byte_line(|line| {
line_number += 1;
bufrdr.read_until(b'\n', &mut line)? > 0
} {
line.pop().unwrap(); // remove trailing '\n'
if line.last() == Some(&b'\r') {
line.pop().unwrap();
}
match pattern_from_bytes(&line) {
Ok(pattern) => patterns.push(pattern.to_string()),
match pattern_from_bytes(line) {
Ok(pattern) => {
patterns.push(pattern.to_string());
Ok(true)
}
Err(err) => {
return Err(io::Error::new(
Err(io::Error::new(
io::ErrorKind::Other,
format!("{}: {}", line_number, err),
));
))
}
}
}
})?;
Ok(patterns)
}

View File

@@ -1,6 +1,6 @@
[package]
name = "grep-matcher"
version = "0.1.1" #:version
version = "0.1.3" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
A trait for regular expressions, with a focus on line oriented search.

View File

@@ -134,7 +134,7 @@ fn find_cap_ref(replacement: &[u8]) -> Option<CaptureRef> {
/// Returns true if and only if the given byte is allowed in a capture name.
fn is_valid_cap_letter(b: &u8) -> bool {
match *b {
b'0' ... b'9' | b'a' ... b'z' | b'A' ... b'Z' | b'_' => true,
b'0' ..= b'9' | b'a' ..= b'z' | b'A' ..= b'Z' | b'_' => true,
_ => false,
}
}

View File

@@ -1,6 +1,6 @@
[package]
name = "grep-pcre2"
version = "0.1.2" #:version
version = "0.1.3" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
Use PCRE2 with the 'grep' crate.
@@ -13,5 +13,5 @@ keywords = ["regex", "grep", "pcre", "backreference", "look"]
license = "Unlicense/MIT"
[dependencies]
grep-matcher = { version = "0.1.1", path = "../grep-matcher" }
pcre2 = "0.1.1"
grep-matcher = { version = "0.1.2", path = "../grep-matcher" }
pcre2 = "0.2.0"

View File

@@ -10,6 +10,7 @@ extern crate pcre2;
pub use error::{Error, ErrorKind};
pub use matcher::{RegexCaptures, RegexMatcher, RegexMatcherBuilder};
pub use pcre2::{is_jit_available, version};
mod error;
mod matcher;

View File

@@ -227,6 +227,27 @@ impl RegexMatcherBuilder {
self.builder.jit_if_available(yes);
self
}
/// Set the maximum size of PCRE2's JIT stack, in bytes. If the JIT is
/// not enabled, then this has no effect.
///
/// When `None` is given, no custom JIT stack will be created, and instead,
/// the default JIT stack is used. When the default is used, its maximum
/// size is 32 KB.
///
/// When this is set, then a new JIT stack will be created with the given
/// maximum size as its limit.
///
/// Increasing the stack size can be useful for larger regular expressions.
///
/// By default, this is set to `None`.
pub fn max_jit_stack_size(
&mut self,
bytes: Option<usize>,
) -> &mut RegexMatcherBuilder {
self.builder.max_jit_stack_size(bytes);
self
}
}
/// An implementation of the `Matcher` trait using PCRE2.

View File

@@ -1,6 +1,6 @@
[package]
name = "grep-printer"
version = "0.1.1" #:version
version = "0.1.3" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
An implementation of the grep crate's Sink trait that provides standard
@@ -19,13 +19,13 @@ serde1 = ["base64", "serde", "serde_derive", "serde_json"]
[dependencies]
base64 = { version = "0.10.0", optional = true }
bstr = "0.1.2"
grep-matcher = { version = "0.1.1", path = "../grep-matcher" }
grep-searcher = { version = "0.1.1", path = "../grep-searcher" }
bstr = "0.2.0"
grep-matcher = { version = "0.1.2", path = "../grep-matcher" }
grep-searcher = { version = "0.1.4", path = "../grep-searcher" }
termcolor = "1.0.4"
serde = { version = "1.0.77", optional = true }
serde_derive = { version = "1.0.77", optional = true }
serde_json = { version = "1.0.27", optional = true }
[dev-dependencies]
grep-regex = { version = "0.1.1", path = "../grep-regex" }
grep-regex = { version = "0.1.3", path = "../grep-regex" }

View File

@@ -5,6 +5,7 @@ use std::path::Path;
use std::sync::Arc;
use std::time::Instant;
use bstr::ByteSlice;
use grep_matcher::{Match, Matcher};
use grep_searcher::{
LineStep, Searcher,
@@ -16,10 +17,7 @@ use termcolor::{ColorSpec, NoColor, WriteColor};
use color::ColorSpecs;
use counter::CounterWriter;
use stats::Stats;
use util::{
PrinterPath, Replacer, Sunk,
trim_ascii_prefix, trim_ascii_prefix_range,
};
use util::{PrinterPath, Replacer, Sunk, trim_ascii_prefix};
/// The configuration for the standard printer.
///
@@ -36,6 +34,7 @@ struct Config {
per_match: bool,
replacement: Arc<Option<Vec<u8>>>,
max_columns: Option<u64>,
max_columns_preview: bool,
max_matches: Option<u64>,
column: bool,
byte_offset: bool,
@@ -59,6 +58,7 @@ impl Default for Config {
per_match: false,
replacement: Arc::new(None),
max_columns: None,
max_columns_preview: false,
max_matches: None,
column: false,
byte_offset: false,
@@ -263,6 +263,21 @@ impl StandardBuilder {
self
}
/// When enabled, if a line is found to be over the configured maximum
/// column limit (measured in terms of bytes), then a preview of the long
/// line will be printed instead.
///
/// The preview will correspond to the first `N` *grapheme clusters* of
/// the line, where `N` is the limit configured by `max_columns`.
///
/// If no limit is set, then enabling this has no effect.
///
/// This is disabled by default.
pub fn max_columns_preview(&mut self, yes: bool) -> &mut StandardBuilder {
self.config.max_columns_preview = yes;
self
}
/// Set the maximum amount of matching lines that are printed.
///
/// If multi line search is enabled and a match spans multiple lines, then
@@ -743,6 +758,11 @@ impl<'p, 's, M: Matcher, W: WriteColor> Sink for StandardSink<'p, 's, M, W> {
stats.add_matches(self.standard.matches.len() as u64);
stats.add_matched_lines(mat.lines().count() as u64);
}
if searcher.binary_detection().convert_byte().is_some() {
if self.binary_byte_offset.is_some() {
return Ok(false);
}
}
StandardImpl::from_match(searcher, self, mat).sink()?;
Ok(!self.should_quit())
@@ -764,6 +784,12 @@ impl<'p, 's, M: Matcher, W: WriteColor> Sink for StandardSink<'p, 's, M, W> {
self.record_matches(ctx.bytes())?;
self.replace(ctx.bytes())?;
}
if searcher.binary_detection().convert_byte().is_some() {
if self.binary_byte_offset.is_some() {
return Ok(false);
}
}
StandardImpl::from_context(searcher, self, ctx).sink()?;
Ok(!self.should_quit())
}
@@ -776,6 +802,15 @@ impl<'p, 's, M: Matcher, W: WriteColor> Sink for StandardSink<'p, 's, M, W> {
Ok(true)
}
fn binary_data(
&mut self,
_searcher: &Searcher,
binary_byte_offset: u64,
) -> Result<bool, io::Error> {
self.binary_byte_offset = Some(binary_byte_offset);
Ok(true)
}
fn begin(
&mut self,
_searcher: &Searcher,
@@ -793,10 +828,12 @@ impl<'p, 's, M: Matcher, W: WriteColor> Sink for StandardSink<'p, 's, M, W> {
fn finish(
&mut self,
_searcher: &Searcher,
searcher: &Searcher,
finish: &SinkFinish,
) -> Result<(), io::Error> {
self.binary_byte_offset = finish.binary_byte_offset();
if let Some(offset) = self.binary_byte_offset {
StandardImpl::new(searcher, self).write_binary_message(offset)?;
}
if let Some(stats) = self.stats.as_mut() {
stats.add_elapsed(self.start_time.elapsed());
stats.add_searches(1);
@@ -992,7 +1029,7 @@ impl<'a, M: Matcher, W: WriteColor> StandardImpl<'a, M, W> {
let mut count = 0;
let mut stepper = LineStep::new(line_term, 0, bytes.len());
while let Some((start, end)) = stepper.next(bytes) {
let mut line = Match::new(start, end);
let line = Match::new(start, end);
self.write_prelude(
self.sunk.absolute_byte_offset() + line.start() as u64,
self.sunk.line_number().map(|n| n + count),
@@ -1000,43 +1037,11 @@ impl<'a, M: Matcher, W: WriteColor> StandardImpl<'a, M, W> {
)?;
count += 1;
if self.exceeds_max_columns(&bytes[line]) {
self.write_exceeded_line()?;
continue;
self.write_exceeded_line(bytes, line, matches, &mut midx)?;
} else {
self.write_colored_matches(bytes, line, matches, &mut midx)?;
self.write_line_term()?;
}
if self.has_line_terminator(&bytes[line]) {
line = line.with_end(line.end() - 1);
}
if self.config().trim_ascii {
line = self.trim_ascii_prefix_range(bytes, line);
}
while !line.is_empty() {
if matches[midx].end() <= line.start() {
if midx + 1 < matches.len() {
midx += 1;
continue;
} else {
self.end_color_match()?;
self.write(&bytes[line])?;
break;
}
}
let m = matches[midx];
if line.start() < m.start() {
let upto = cmp::min(line.end(), m.start());
self.end_color_match()?;
self.write(&bytes[line.with_end(upto)])?;
line = line.with_start(upto);
} else {
let upto = cmp::min(line.end(), m.end());
self.start_color_match()?;
self.write(&bytes[line.with_end(upto)])?;
line = line.with_start(upto);
}
}
self.end_color_match()?;
self.write_line_term()?;
}
Ok(())
}
@@ -1051,12 +1056,8 @@ impl<'a, M: Matcher, W: WriteColor> StandardImpl<'a, M, W> {
let mut stepper = LineStep::new(line_term, 0, bytes.len());
while let Some((start, end)) = stepper.next(bytes) {
let mut line = Match::new(start, end);
if self.has_line_terminator(&bytes[line]) {
line = line.with_end(line.end() - 1);
}
if self.config().trim_ascii {
line = self.trim_ascii_prefix_range(bytes, line);
}
self.trim_line_terminator(bytes, &mut line);
self.trim_ascii_prefix(bytes, &mut line);
while !line.is_empty() {
if matches[midx].end() <= line.start() {
if midx + 1 < matches.len() {
@@ -1079,14 +1080,19 @@ impl<'a, M: Matcher, W: WriteColor> StandardImpl<'a, M, W> {
Some(m.start() as u64 + 1),
)?;
let buf = &bytes[line.with_end(upto)];
let this_line = line.with_end(upto);
line = line.with_start(upto);
if self.exceeds_max_columns(&buf) {
self.write_exceeded_line()?;
continue;
if self.exceeds_max_columns(&bytes[this_line]) {
self.write_exceeded_line(
bytes,
this_line,
matches,
&mut midx,
)?;
} else {
self.write_spec(spec, &bytes[this_line])?;
self.write_line_term()?;
}
self.write_spec(spec, buf)?;
self.write_line_term()?;
}
}
count += 1;
@@ -1099,7 +1105,6 @@ impl<'a, M: Matcher, W: WriteColor> StandardImpl<'a, M, W> {
let spec = self.config().colors.matched();
let bytes = self.sunk.bytes();
for &m in self.sunk.matches() {
let mut m = m;
let mut count = 0;
let mut stepper = LineStep::new(line_term, 0, bytes.len());
while let Some((start, end)) = stepper.next(bytes) {
@@ -1117,15 +1122,11 @@ impl<'a, M: Matcher, W: WriteColor> StandardImpl<'a, M, W> {
)?;
count += 1;
if self.exceeds_max_columns(&bytes[line]) {
self.write_exceeded_line()?;
self.write_exceeded_line(bytes, line, &[m], &mut 0)?;
continue;
}
if self.has_line_terminator(&bytes[line]) {
line = line.with_end(line.end() - 1);
}
if self.config().trim_ascii {
line = self.trim_ascii_prefix_range(bytes, line);
}
self.trim_line_terminator(bytes, &mut line);
self.trim_ascii_prefix(bytes, &mut line);
while !line.is_empty() {
if m.end() <= line.start() {
@@ -1182,7 +1183,10 @@ impl<'a, M: Matcher, W: WriteColor> StandardImpl<'a, M, W> {
line: &[u8],
) -> io::Result<()> {
if self.exceeds_max_columns(line) {
self.write_exceeded_line()?;
let range = Match::new(0, line.len());
self.write_exceeded_line(
line, range, self.sunk.matches(), &mut 0,
)?;
} else {
self.write_trim(line)?;
if !self.has_line_terminator(line) {
@@ -1195,50 +1199,114 @@ impl<'a, M: Matcher, W: WriteColor> StandardImpl<'a, M, W> {
fn write_colored_line(
&self,
matches: &[Match],
line: &[u8],
bytes: &[u8],
) -> io::Result<()> {
// If we know we aren't going to emit color, then we can go faster.
let spec = self.config().colors.matched();
if !self.wtr().borrow().supports_color() || spec.is_none() {
return self.write_line(line);
}
if self.exceeds_max_columns(line) {
return self.write_exceeded_line();
return self.write_line(bytes);
}
let mut last_written =
if !self.config().trim_ascii {
0
} else {
self.trim_ascii_prefix_range(
line,
Match::new(0, line.len()),
).start()
};
for mut m in matches.iter().map(|&m| m) {
if last_written < m.start() {
let line = Match::new(0, bytes.len());
if self.exceeds_max_columns(bytes) {
self.write_exceeded_line(bytes, line, matches, &mut 0)
} else {
self.write_colored_matches(bytes, line, matches, &mut 0)?;
self.write_line_term()?;
Ok(())
}
}
/// Write the `line` portion of `bytes`, with appropriate coloring for
/// each `match`, starting at `match_index`.
///
/// This accounts for trimming any whitespace prefix and will *never* print
/// a line terminator. If a match exceeds the range specified by `line`,
/// then only the part of the match within `line` (if any) is printed.
fn write_colored_matches(
&self,
bytes: &[u8],
mut line: Match,
matches: &[Match],
match_index: &mut usize,
) -> io::Result<()> {
self.trim_line_terminator(bytes, &mut line);
self.trim_ascii_prefix(bytes, &mut line);
if matches.is_empty() {
self.write(&bytes[line])?;
return Ok(());
}
while !line.is_empty() {
if matches[*match_index].end() <= line.start() {
if *match_index + 1 < matches.len() {
*match_index += 1;
continue;
} else {
self.end_color_match()?;
self.write(&bytes[line])?;
break;
}
}
let m = matches[*match_index];
if line.start() < m.start() {
let upto = cmp::min(line.end(), m.start());
self.end_color_match()?;
self.write(&line[last_written..m.start()])?;
} else if last_written < m.end() {
m = m.with_start(last_written);
self.write(&bytes[line.with_end(upto)])?;
line = line.with_start(upto);
} else {
continue;
}
if !m.is_empty() {
let upto = cmp::min(line.end(), m.end());
self.start_color_match()?;
self.write(&line[m])?;
self.write(&bytes[line.with_end(upto)])?;
line = line.with_start(upto);
}
last_written = m.end();
}
self.end_color_match()?;
self.write(&line[last_written..])?;
if !self.has_line_terminator(line) {
self.write_line_term()?;
}
Ok(())
}
fn write_exceeded_line(&self) -> io::Result<()> {
fn write_exceeded_line(
&self,
bytes: &[u8],
mut line: Match,
matches: &[Match],
match_index: &mut usize,
) -> io::Result<()> {
if self.config().max_columns_preview {
let original = line;
let end = bytes[line]
.grapheme_indices()
.map(|(_, end, _)| end)
.take(self.config().max_columns.unwrap_or(0) as usize)
.last()
.unwrap_or(0) + line.start();
line = line.with_end(end);
self.write_colored_matches(bytes, line, matches, match_index)?;
if matches.is_empty() {
self.write(b" [... omitted end of long line]")?;
} else {
let remaining = matches
.iter()
.filter(|m| {
m.start() >= line.end() && m.start() < original.end()
})
.count();
let tense =
if remaining == 1 {
"match"
} else {
"matches"
};
write!(
self.wtr().borrow_mut(),
" [... {} more {}]",
remaining, tense,
)?;
}
self.write_line_term()?;
return Ok(());
}
if self.sunk.original_matches().is_empty() {
if self.is_context() {
self.write(b"[Omitted long context line]")?;
@@ -1314,6 +1382,38 @@ impl<'a, M: Matcher, W: WriteColor> StandardImpl<'a, M, W> {
Ok(())
}
fn write_binary_message(&self, offset: u64) -> io::Result<()> {
if self.sink.match_count == 0 {
return Ok(());
}
let bin = self.searcher.binary_detection();
if let Some(byte) = bin.quit_byte() {
self.write(b"WARNING: stopped searching binary file ")?;
if let Some(path) = self.path() {
self.write_spec(self.config().colors.path(), path.as_bytes())?;
self.write(b" ")?;
}
let remainder = format!(
"after match (found {:?} byte around offset {})\n",
[byte].as_bstr(), offset,
);
self.write(remainder.as_bytes())?;
} else if let Some(byte) = bin.convert_byte() {
self.write(b"Binary file ")?;
if let Some(path) = self.path() {
self.write_spec(self.config().colors.path(), path.as_bytes())?;
self.write(b" ")?;
}
let remainder = format!(
"matches (found {:?} byte around offset {})\n",
[byte].as_bstr(), offset,
);
self.write(remainder.as_bytes())?;
}
Ok(())
}
fn write_context_separator(&self) -> io::Result<()> {
if let Some(ref sep) = *self.config().separator_context {
self.write(sep)?;
@@ -1389,13 +1489,26 @@ impl<'a, M: Matcher, W: WriteColor> StandardImpl<'a, M, W> {
if !self.config().trim_ascii {
return self.write(buf);
}
self.write(self.trim_ascii_prefix(buf))
let mut range = Match::new(0, buf.len());
self.trim_ascii_prefix(buf, &mut range);
self.write(&buf[range])
}
fn write(&self, buf: &[u8]) -> io::Result<()> {
self.wtr().borrow_mut().write_all(buf)
}
fn trim_line_terminator(&self, buf: &[u8], line: &mut Match) {
let lineterm = self.searcher.line_terminator();
if lineterm.is_suffix(&buf[*line]) {
let mut end = line.end() - 1;
if lineterm.is_crlf() && buf[end - 1] == b'\r' {
end -= 1;
}
*line = line.with_end(end);
}
}
fn has_line_terminator(&self, buf: &[u8]) -> bool {
self.searcher.line_terminator().is_suffix(buf)
}
@@ -1451,14 +1564,12 @@ impl<'a, M: Matcher, W: WriteColor> StandardImpl<'a, M, W> {
///
/// This stops trimming a prefix as soon as it sees non-whitespace or a
/// line terminator.
fn trim_ascii_prefix_range(&self, slice: &[u8], range: Match) -> Match {
trim_ascii_prefix_range(self.searcher.line_terminator(), slice, range)
}
/// Trim prefix ASCII spaces from the given slice and return the
/// corresponding sub-slice.
fn trim_ascii_prefix<'s>(&self, slice: &'s [u8]) -> &'s [u8] {
trim_ascii_prefix(self.searcher.line_terminator(), slice)
fn trim_ascii_prefix(&self, slice: &[u8], range: &mut Match) {
if !self.config().trim_ascii {
return;
}
let lineterm = self.searcher.line_terminator();
*range = trim_ascii_prefix(lineterm, slice, *range)
}
}
@@ -2225,6 +2336,31 @@ but Doctor Watson has to have it taken out for him and dusted,
assert_eq_printed!(expected, got);
}
#[test]
fn max_columns_preview() {
let matcher = RegexMatcher::new("exhibited|dusted").unwrap();
let mut printer = StandardBuilder::new()
.max_columns(Some(46))
.max_columns_preview(true)
.build(NoColor::new(vec![]));
SearcherBuilder::new()
.line_number(false)
.build()
.search_reader(
&matcher,
SHERLOCK.as_bytes(),
printer.sink(&matcher),
)
.unwrap();
let got = printer_contents(&mut printer);
let expected = "\
but Doctor Watson has to have it taken out for [... omitted end of long line]
and exhibited clearly, with a label attached.
";
assert_eq_printed!(expected, got);
}
#[test]
fn max_columns_with_count() {
let matcher = RegexMatcher::new("cigar|ash|dusted").unwrap();
@@ -2250,6 +2386,86 @@ but Doctor Watson has to have it taken out for him and dusted,
assert_eq_printed!(expected, got);
}
#[test]
fn max_columns_with_count_preview_no_match() {
let matcher = RegexMatcher::new("exhibited|has to have it").unwrap();
let mut printer = StandardBuilder::new()
.stats(true)
.max_columns(Some(46))
.max_columns_preview(true)
.build(NoColor::new(vec![]));
SearcherBuilder::new()
.line_number(false)
.build()
.search_reader(
&matcher,
SHERLOCK.as_bytes(),
printer.sink(&matcher),
)
.unwrap();
let got = printer_contents(&mut printer);
let expected = "\
but Doctor Watson has to have it taken out for [... 0 more matches]
and exhibited clearly, with a label attached.
";
assert_eq_printed!(expected, got);
}
#[test]
fn max_columns_with_count_preview_one_match() {
let matcher = RegexMatcher::new("exhibited|dusted").unwrap();
let mut printer = StandardBuilder::new()
.stats(true)
.max_columns(Some(46))
.max_columns_preview(true)
.build(NoColor::new(vec![]));
SearcherBuilder::new()
.line_number(false)
.build()
.search_reader(
&matcher,
SHERLOCK.as_bytes(),
printer.sink(&matcher),
)
.unwrap();
let got = printer_contents(&mut printer);
let expected = "\
but Doctor Watson has to have it taken out for [... 1 more match]
and exhibited clearly, with a label attached.
";
assert_eq_printed!(expected, got);
}
#[test]
fn max_columns_with_count_preview_two_matches() {
let matcher = RegexMatcher::new(
"exhibited|dusted|has to have it",
).unwrap();
let mut printer = StandardBuilder::new()
.stats(true)
.max_columns(Some(46))
.max_columns_preview(true)
.build(NoColor::new(vec![]));
SearcherBuilder::new()
.line_number(false)
.build()
.search_reader(
&matcher,
SHERLOCK.as_bytes(),
printer.sink(&matcher),
)
.unwrap();
let got = printer_contents(&mut printer);
let expected = "\
but Doctor Watson has to have it taken out for [... 1 more match]
and exhibited clearly, with a label attached.
";
assert_eq_printed!(expected, got);
}
#[test]
fn max_columns_multi_line() {
let matcher = RegexMatcher::new("(?s)ash.+dusted").unwrap();
@@ -2275,6 +2491,36 @@ but Doctor Watson has to have it taken out for him and dusted,
assert_eq_printed!(expected, got);
}
#[test]
fn max_columns_multi_line_preview() {
let matcher = RegexMatcher::new(
"(?s)clew|cigar ash.+have it|exhibited",
).unwrap();
let mut printer = StandardBuilder::new()
.stats(true)
.max_columns(Some(46))
.max_columns_preview(true)
.build(NoColor::new(vec![]));
SearcherBuilder::new()
.line_number(false)
.multi_line(true)
.build()
.search_reader(
&matcher,
SHERLOCK.as_bytes(),
printer.sink(&matcher),
)
.unwrap();
let got = printer_contents(&mut printer);
let expected = "\
can extract a clew from a wisp of straw or a f [... 1 more match]
but Doctor Watson has to have it taken out for [... 0 more matches]
and exhibited clearly, with a label attached.
";
assert_eq_printed!(expected, got);
}
#[test]
fn max_matches() {
let matcher = RegexMatcher::new("Sherlock").unwrap();
@@ -2564,8 +2810,40 @@ Holmeses, success in the province of detective work must always
assert_eq_printed!(expected, got);
}
#[test]
fn only_matching_max_columns_preview() {
let matcher = RegexMatcher::new("Doctor Watsons|Sherlock").unwrap();
let mut printer = StandardBuilder::new()
.only_matching(true)
.max_columns(Some(10))
.max_columns_preview(true)
.column(true)
.build(NoColor::new(vec![]));
SearcherBuilder::new()
.line_number(true)
.build()
.search_reader(
&matcher,
SHERLOCK.as_bytes(),
printer.sink(&matcher),
)
.unwrap();
let got = printer_contents(&mut printer);
let expected = "\
1:9:Doctor Wat [... 0 more matches]
1:57:Sherlock
3:49:Sherlock
";
assert_eq_printed!(expected, got);
}
#[test]
fn only_matching_max_columns_multi_line1() {
// The `(?s:.{0})` trick fools the matcher into thinking that it
// can match across multiple lines without actually doing so. This is
// so we can test multi-line handling in the case of a match on only
// one line.
let matcher = RegexMatcher::new(
r"(?s:.{0})(Doctor Watsons|Sherlock)"
).unwrap();
@@ -2594,6 +2872,41 @@ Holmeses, success in the province of detective work must always
assert_eq_printed!(expected, got);
}
#[test]
fn only_matching_max_columns_preview_multi_line1() {
// The `(?s:.{0})` trick fools the matcher into thinking that it
// can match across multiple lines without actually doing so. This is
// so we can test multi-line handling in the case of a match on only
// one line.
let matcher = RegexMatcher::new(
r"(?s:.{0})(Doctor Watsons|Sherlock)"
).unwrap();
let mut printer = StandardBuilder::new()
.only_matching(true)
.max_columns(Some(10))
.max_columns_preview(true)
.column(true)
.build(NoColor::new(vec![]));
SearcherBuilder::new()
.multi_line(true)
.line_number(true)
.build()
.search_reader(
&matcher,
SHERLOCK.as_bytes(),
printer.sink(&matcher),
)
.unwrap();
let got = printer_contents(&mut printer);
let expected = "\
1:9:Doctor Wat [... 0 more matches]
1:57:Sherlock
3:49:Sherlock
";
assert_eq_printed!(expected, got);
}
#[test]
fn only_matching_max_columns_multi_line2() {
let matcher = RegexMatcher::new(
@@ -2625,6 +2938,38 @@ Holmeses, success in the province of detective work must always
assert_eq_printed!(expected, got);
}
#[test]
fn only_matching_max_columns_preview_multi_line2() {
let matcher = RegexMatcher::new(
r"(?s)Watson.+?(Holmeses|clearly)"
).unwrap();
let mut printer = StandardBuilder::new()
.only_matching(true)
.max_columns(Some(50))
.max_columns_preview(true)
.column(true)
.build(NoColor::new(vec![]));
SearcherBuilder::new()
.multi_line(true)
.line_number(true)
.build()
.search_reader(
&matcher,
SHERLOCK.as_bytes(),
printer.sink(&matcher),
)
.unwrap();
let got = printer_contents(&mut printer);
let expected = "\
1:16:Watsons of this world, as opposed to the Sherlock
2:16:Holmeses
5:12:Watson has to have it taken out for him and dusted [... 0 more matches]
6:12:and exhibited clearly
";
assert_eq_printed!(expected, got);
}
#[test]
fn per_match() {
let matcher = RegexMatcher::new("Doctor Watsons|Sherlock").unwrap();
@@ -2820,6 +3165,61 @@ Holmeses, success in the province of detective work must always
assert_eq_printed!(expected, got);
}
#[test]
fn replacement_max_columns_preview1() {
let matcher = RegexMatcher::new(r"Sherlock|Doctor (\w+)").unwrap();
let mut printer = StandardBuilder::new()
.max_columns(Some(67))
.max_columns_preview(true)
.replacement(Some(b"doctah $1 MD".to_vec()))
.build(NoColor::new(vec![]));
SearcherBuilder::new()
.line_number(true)
.build()
.search_reader(
&matcher,
SHERLOCK.as_bytes(),
printer.sink(&matcher),
)
.unwrap();
let got = printer_contents(&mut printer);
let expected = "\
1:For the doctah Watsons MD of this world, as opposed to the doctah [... 0 more matches]
3:be, to a very large extent, the result of luck. doctah MD Holmes
5:but doctah Watson MD has to have it taken out for him and dusted,
";
assert_eq_printed!(expected, got);
}
#[test]
fn replacement_max_columns_preview2() {
let matcher = RegexMatcher::new(
"exhibited|dusted|has to have it",
).unwrap();
let mut printer = StandardBuilder::new()
.max_columns(Some(43))
.max_columns_preview(true)
.replacement(Some(b"xxx".to_vec()))
.build(NoColor::new(vec![]));
SearcherBuilder::new()
.line_number(false)
.build()
.search_reader(
&matcher,
SHERLOCK.as_bytes(),
printer.sink(&matcher),
)
.unwrap();
let got = printer_contents(&mut printer);
let expected = "\
but Doctor Watson xxx taken out for him and [... 1 more match]
and xxx clearly, with a label attached.
";
assert_eq_printed!(expected, got);
}
#[test]
fn replacement_only_matching() {
let matcher = RegexMatcher::new(r"Sherlock|Doctor (\w+)").unwrap();

View File

@@ -636,6 +636,34 @@ impl<'p, 's, M: Matcher, W: WriteColor> Sink for SummarySink<'p, 's, M, W> {
stats.add_bytes_searched(finish.byte_count());
stats.add_bytes_printed(self.summary.wtr.borrow().count());
}
// If our binary detection method says to quit after seeing binary
// data, then we shouldn't print any results at all, even if we've
// found a match before detecting binary data. The intent here is to
// keep BinaryDetection::quit as a form of filter. Otherwise, we can
// present a matching file with a smaller number of matches than
// there might be, which can be quite misleading.
//
// If our binary detection method is to convert binary data, then we
// don't quit and therefore search the entire contents of the file.
//
// There is an unfortunate inconsistency here. Namely, when using
// Quiet or PathWithMatch, then the printer can quit after the first
// match seen, which could be long before seeing binary data. This
// means that using PathWithMatch can print a path where as using
// Count might not print it at all because of binary data.
//
// It's not possible to fix this without also potentially significantly
// impacting the performance of Quiet or PathWithMatch, so we accept
// the bug.
if self.binary_byte_offset.is_some()
&& searcher.binary_detection().quit_byte().is_some()
{
// Squash the match count. The statistics reported will still
// contain the match count, but the "official" match count should
// be zero.
self.match_count = 0;
return Ok(());
}
let show_count =
!self.summary.config.exclude_zero

View File

@@ -4,7 +4,7 @@ use std::io;
use std::path::Path;
use std::time;
use bstr::{BStr, BString};
use bstr::{ByteSlice, ByteVec};
use grep_matcher::{Captures, LineTerminator, Match, Matcher};
use grep_searcher::{
LineIter,
@@ -263,12 +263,12 @@ impl<'a> Sunk<'a> {
/// portability with a small cost: on Windows, paths that are not valid UTF-16
/// will not roundtrip correctly.
#[derive(Clone, Debug)]
pub struct PrinterPath<'a>(Cow<'a, BStr>);
pub struct PrinterPath<'a>(Cow<'a, [u8]>);
impl<'a> PrinterPath<'a> {
/// Create a new path suitable for printing.
pub fn new(path: &'a Path) -> PrinterPath<'a> {
PrinterPath(BString::from_path_lossy(path))
PrinterPath(Vec::from_path_lossy(path))
}
/// Create a new printer path from the given path which can be efficiently
@@ -289,7 +289,7 @@ impl<'a> PrinterPath<'a> {
/// path separators that are both replaced by `new_sep`. In all other
/// environments, only `/` is treated as a path separator.
fn replace_separator(&mut self, new_sep: u8) {
let transformed_path: BString = self.0.bytes().map(|b| {
let transformed_path: Vec<u8> = self.0.bytes().map(|b| {
if b == b'/' || (cfg!(windows) && b == b'\\') {
new_sep
} else {
@@ -301,7 +301,7 @@ impl<'a> PrinterPath<'a> {
/// Return the raw bytes for this path.
pub fn as_bytes(&self) -> &[u8] {
self.0.as_bytes()
&self.0
}
}
@@ -346,7 +346,7 @@ impl Serialize for NiceDuration {
///
/// This stops trimming a prefix as soon as it sees non-whitespace or a line
/// terminator.
pub fn trim_ascii_prefix_range(
pub fn trim_ascii_prefix(
line_term: LineTerminator,
slice: &[u8],
range: Match,
@@ -366,14 +366,3 @@ pub fn trim_ascii_prefix_range(
.count();
range.with_start(range.start() + count)
}
/// Trim prefix ASCII spaces from the given slice and return the corresponding
/// sub-slice.
pub fn trim_ascii_prefix(line_term: LineTerminator, slice: &[u8]) -> &[u8] {
let range = trim_ascii_prefix_range(
line_term,
slice,
Match::new(0, slice.len()),
);
&slice[range]
}

View File

@@ -1,6 +1,6 @@
[package]
name = "grep-regex"
version = "0.1.2" #:version
version = "0.1.5" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
Use Rust's regex library with the 'grep' crate.
@@ -13,9 +13,9 @@ keywords = ["regex", "grep", "search", "pattern", "line"]
license = "Unlicense/MIT"
[dependencies]
aho-corasick = "0.7.3"
grep-matcher = { version = "0.1.2", path = "../grep-matcher" }
log = "0.4.5"
grep-matcher = { version = "0.1.1", path = "../grep-matcher" }
regex = "1.1"
regex-syntax = "0.6.5"
thread_local = "0.3.6"
utf8-ranges = "1.0.1"

View File

@@ -1,12 +1,13 @@
use grep_matcher::{ByteSet, LineTerminator};
use regex::bytes::{Regex, RegexBuilder};
use regex_syntax::ast::{self, Ast};
use regex_syntax::hir::Hir;
use regex_syntax::hir::{self, Hir};
use ast::AstAnalysis;
use crlf::crlfify;
use error::Error;
use literal::LiteralSets;
use multi::alternation_literals;
use non_matching::non_matching_bytes;
use strip::strip_from_match;
@@ -67,19 +68,17 @@ impl Config {
/// If there was a problem parsing the given expression then an error
/// is returned.
pub fn hir(&self, pattern: &str) -> Result<ConfiguredHIR, Error> {
let analysis = self.analysis(pattern)?;
let expr = ::regex_syntax::ParserBuilder::new()
.nest_limit(self.nest_limit)
.octal(self.octal)
let ast = self.ast(pattern)?;
let analysis = self.analysis(&ast)?;
let expr = hir::translate::TranslatorBuilder::new()
.allow_invalid_utf8(true)
.ignore_whitespace(self.ignore_whitespace)
.case_insensitive(self.is_case_insensitive(&analysis)?)
.case_insensitive(self.is_case_insensitive(&analysis))
.multi_line(self.multi_line)
.dot_matches_new_line(self.dot_matches_new_line)
.swap_greed(self.swap_greed)
.unicode(self.unicode)
.build()
.parse(pattern)
.translate(pattern, &ast)
.map_err(Error::regex)?;
let expr = match self.line_terminator {
None => expr,
@@ -99,21 +98,34 @@ impl Config {
fn is_case_insensitive(
&self,
analysis: &AstAnalysis,
) -> Result<bool, Error> {
) -> bool {
if self.case_insensitive {
return Ok(true);
return true;
}
if !self.case_smart {
return Ok(false);
return false;
}
Ok(analysis.any_literal() && !analysis.any_uppercase())
analysis.any_literal() && !analysis.any_uppercase()
}
/// Returns true if and only if this config is simple enough such that
/// if the pattern is a simple alternation of literals, then it can be
/// constructed via a plain Aho-Corasick automaton.
///
/// Note that it is OK to return true even when settings like `multi_line`
/// are enabled, since if multi-line can impact the match semantics of a
/// regex, then it is by definition not a simple alternation of literals.
pub fn can_plain_aho_corasick(&self) -> bool {
!self.word
&& !self.case_insensitive
&& !self.case_smart
}
/// Perform analysis on the AST of this pattern.
///
/// This returns an error if the given pattern failed to parse.
fn analysis(&self, pattern: &str) -> Result<AstAnalysis, Error> {
Ok(AstAnalysis::from_ast(&self.ast(pattern)?))
fn analysis(&self, ast: &Ast) -> Result<AstAnalysis, Error> {
Ok(AstAnalysis::from_ast(ast))
}
/// Parse the given pattern into its abstract syntax.
@@ -173,6 +185,15 @@ impl ConfiguredHIR {
self.pattern_to_regex(&self.expr.to_string())
}
/// If this HIR corresponds to an alternation of literals with no
/// capturing groups, then this returns those literals.
pub fn alternation_literals(&self) -> Option<Vec<Vec<u8>>> {
if !self.config.can_plain_aho_corasick() {
return None;
}
alternation_literals(&self.expr)
}
/// Applies the given function to the concrete syntax of this HIR and then
/// generates a new HIR based on the result of the function in a way that
/// preserves the configuration.

View File

@@ -76,7 +76,9 @@ impl Matcher for CRLFMatcher {
caps: &mut RegexCaptures,
) -> Result<bool, NoError> {
caps.strip_crlf(false);
let r = self.regex.captures_read_at(caps.locations(), haystack, at);
let r = self.regex.captures_read_at(
caps.locations_mut(), haystack, at,
);
if !r.is_some() {
return Ok(false);
}

View File

@@ -4,13 +4,13 @@ An implementation of `grep-matcher`'s `Matcher` trait for Rust's regex engine.
#![deny(missing_docs)]
extern crate aho_corasick;
extern crate grep_matcher;
#[macro_use]
extern crate log;
extern crate regex;
extern crate regex_syntax;
extern crate thread_local;
extern crate utf8_ranges;
pub use error::{Error, ErrorKind};
pub use matcher::{RegexCaptures, RegexMatcher, RegexMatcherBuilder};
@@ -21,6 +21,7 @@ mod crlf;
mod error;
mod literal;
mod matcher;
mod multi;
mod non_matching;
mod strip;
mod util;

View File

@@ -8,6 +8,7 @@ use regex::bytes::{CaptureLocations, Regex};
use config::{Config, ConfiguredHIR};
use crlf::CRLFMatcher;
use error::Error;
use multi::MultiLiteralMatcher;
use word::WordMatcher;
/// A builder for constructing a `Matcher` using regular expressions.
@@ -52,7 +53,7 @@ impl RegexMatcherBuilder {
}
let matcher = RegexMatcherImpl::new(&chir)?;
trace!("final regex: {:?}", matcher.regex().to_string());
trace!("final regex: {:?}", matcher.regex());
Ok(RegexMatcher {
config: self.config.clone(),
matcher: matcher,
@@ -61,6 +62,50 @@ impl RegexMatcherBuilder {
})
}
/// Build a new matcher from a plain alternation of literals.
///
/// Depending on the configuration set by the builder, this may be able to
/// build a matcher substantially faster than by joining the patterns with
/// a `|` and calling `build`.
pub fn build_literals<B: AsRef<str>>(
&self,
literals: &[B],
) -> Result<RegexMatcher, Error> {
let mut has_escape = false;
let mut slices = vec![];
for lit in literals {
slices.push(lit.as_ref());
has_escape = has_escape || lit.as_ref().contains('\\');
}
// Even when we have a fixed set of literals, we might still want to
// use the regex engine. Specifically, if any string has an escape
// in it, then we probably can't feed it to Aho-Corasick without
// removing the escape. Additionally, if there are any particular
// special match semantics we need to honor, that Aho-Corasick isn't
// enough. Finally, the regex engine can do really well with a small
// number of literals (at time of writing, this is changing soon), so
// we use it when there's a small set.
//
// Yes, this is one giant hack. Ideally, this entirely separate literal
// matcher that uses Aho-Corasick would be pushed down into the regex
// engine.
if has_escape
|| !self.config.can_plain_aho_corasick()
|| literals.len() < 40
{
return self.build(&slices.join("|"));
}
let matcher = MultiLiteralMatcher::new(&slices)?;
let imp = RegexMatcherImpl::MultiLiteral(matcher);
Ok(RegexMatcher {
config: self.config.clone(),
matcher: imp,
fast_line_regex: None,
non_matching_bytes: ByteSet::empty(),
})
}
/// Set the value for the case insensitive (`i`) flag.
///
/// When enabled, letters in the pattern will match both upper case and
@@ -348,6 +393,8 @@ impl RegexMatcher {
enum RegexMatcherImpl {
/// The standard matcher used for all regular expressions.
Standard(StandardMatcher),
/// A matcher for an alternation of plain literals.
MultiLiteral(MultiLiteralMatcher),
/// A matcher that strips `\r` from the end of matches.
///
/// This is only used when the CRLF hack is enabled and the regex is line
@@ -370,16 +417,23 @@ impl RegexMatcherImpl {
} else if expr.needs_crlf_stripped() {
Ok(RegexMatcherImpl::CRLF(CRLFMatcher::new(expr)?))
} else {
if let Some(lits) = expr.alternation_literals() {
if lits.len() >= 40 {
let matcher = MultiLiteralMatcher::new(&lits)?;
return Ok(RegexMatcherImpl::MultiLiteral(matcher));
}
}
Ok(RegexMatcherImpl::Standard(StandardMatcher::new(expr)?))
}
}
/// Return the underlying regex object used.
fn regex(&self) -> &Regex {
fn regex(&self) -> String {
match *self {
RegexMatcherImpl::Word(ref x) => x.regex(),
RegexMatcherImpl::CRLF(ref x) => x.regex(),
RegexMatcherImpl::Standard(ref x) => &x.regex,
RegexMatcherImpl::Word(ref x) => x.regex().to_string(),
RegexMatcherImpl::CRLF(ref x) => x.regex().to_string(),
RegexMatcherImpl::MultiLiteral(_) => "<N/A>".to_string(),
RegexMatcherImpl::Standard(ref x) => x.regex.to_string(),
}
}
}
@@ -399,6 +453,7 @@ impl Matcher for RegexMatcher {
use self::RegexMatcherImpl::*;
match self.matcher {
Standard(ref m) => m.find_at(haystack, at),
MultiLiteral(ref m) => m.find_at(haystack, at),
CRLF(ref m) => m.find_at(haystack, at),
Word(ref m) => m.find_at(haystack, at),
}
@@ -408,6 +463,7 @@ impl Matcher for RegexMatcher {
use self::RegexMatcherImpl::*;
match self.matcher {
Standard(ref m) => m.new_captures(),
MultiLiteral(ref m) => m.new_captures(),
CRLF(ref m) => m.new_captures(),
Word(ref m) => m.new_captures(),
}
@@ -417,6 +473,7 @@ impl Matcher for RegexMatcher {
use self::RegexMatcherImpl::*;
match self.matcher {
Standard(ref m) => m.capture_count(),
MultiLiteral(ref m) => m.capture_count(),
CRLF(ref m) => m.capture_count(),
Word(ref m) => m.capture_count(),
}
@@ -426,6 +483,7 @@ impl Matcher for RegexMatcher {
use self::RegexMatcherImpl::*;
match self.matcher {
Standard(ref m) => m.capture_index(name),
MultiLiteral(ref m) => m.capture_index(name),
CRLF(ref m) => m.capture_index(name),
Word(ref m) => m.capture_index(name),
}
@@ -435,6 +493,7 @@ impl Matcher for RegexMatcher {
use self::RegexMatcherImpl::*;
match self.matcher {
Standard(ref m) => m.find(haystack),
MultiLiteral(ref m) => m.find(haystack),
CRLF(ref m) => m.find(haystack),
Word(ref m) => m.find(haystack),
}
@@ -450,6 +509,7 @@ impl Matcher for RegexMatcher {
use self::RegexMatcherImpl::*;
match self.matcher {
Standard(ref m) => m.find_iter(haystack, matched),
MultiLiteral(ref m) => m.find_iter(haystack, matched),
CRLF(ref m) => m.find_iter(haystack, matched),
Word(ref m) => m.find_iter(haystack, matched),
}
@@ -465,6 +525,7 @@ impl Matcher for RegexMatcher {
use self::RegexMatcherImpl::*;
match self.matcher {
Standard(ref m) => m.try_find_iter(haystack, matched),
MultiLiteral(ref m) => m.try_find_iter(haystack, matched),
CRLF(ref m) => m.try_find_iter(haystack, matched),
Word(ref m) => m.try_find_iter(haystack, matched),
}
@@ -478,6 +539,7 @@ impl Matcher for RegexMatcher {
use self::RegexMatcherImpl::*;
match self.matcher {
Standard(ref m) => m.captures(haystack, caps),
MultiLiteral(ref m) => m.captures(haystack, caps),
CRLF(ref m) => m.captures(haystack, caps),
Word(ref m) => m.captures(haystack, caps),
}
@@ -494,6 +556,7 @@ impl Matcher for RegexMatcher {
use self::RegexMatcherImpl::*;
match self.matcher {
Standard(ref m) => m.captures_iter(haystack, caps, matched),
MultiLiteral(ref m) => m.captures_iter(haystack, caps, matched),
CRLF(ref m) => m.captures_iter(haystack, caps, matched),
Word(ref m) => m.captures_iter(haystack, caps, matched),
}
@@ -510,6 +573,9 @@ impl Matcher for RegexMatcher {
use self::RegexMatcherImpl::*;
match self.matcher {
Standard(ref m) => m.try_captures_iter(haystack, caps, matched),
MultiLiteral(ref m) => {
m.try_captures_iter(haystack, caps, matched)
}
CRLF(ref m) => m.try_captures_iter(haystack, caps, matched),
Word(ref m) => m.try_captures_iter(haystack, caps, matched),
}
@@ -524,6 +590,7 @@ impl Matcher for RegexMatcher {
use self::RegexMatcherImpl::*;
match self.matcher {
Standard(ref m) => m.captures_at(haystack, at, caps),
MultiLiteral(ref m) => m.captures_at(haystack, at, caps),
CRLF(ref m) => m.captures_at(haystack, at, caps),
Word(ref m) => m.captures_at(haystack, at, caps),
}
@@ -540,6 +607,7 @@ impl Matcher for RegexMatcher {
use self::RegexMatcherImpl::*;
match self.matcher {
Standard(ref m) => m.replace(haystack, dst, append),
MultiLiteral(ref m) => m.replace(haystack, dst, append),
CRLF(ref m) => m.replace(haystack, dst, append),
Word(ref m) => m.replace(haystack, dst, append),
}
@@ -559,6 +627,9 @@ impl Matcher for RegexMatcher {
Standard(ref m) => {
m.replace_with_captures(haystack, caps, dst, append)
}
MultiLiteral(ref m) => {
m.replace_with_captures(haystack, caps, dst, append)
}
CRLF(ref m) => {
m.replace_with_captures(haystack, caps, dst, append)
}
@@ -572,6 +643,7 @@ impl Matcher for RegexMatcher {
use self::RegexMatcherImpl::*;
match self.matcher {
Standard(ref m) => m.is_match(haystack),
MultiLiteral(ref m) => m.is_match(haystack),
CRLF(ref m) => m.is_match(haystack),
Word(ref m) => m.is_match(haystack),
}
@@ -585,6 +657,7 @@ impl Matcher for RegexMatcher {
use self::RegexMatcherImpl::*;
match self.matcher {
Standard(ref m) => m.is_match_at(haystack, at),
MultiLiteral(ref m) => m.is_match_at(haystack, at),
CRLF(ref m) => m.is_match_at(haystack, at),
Word(ref m) => m.is_match_at(haystack, at),
}
@@ -597,6 +670,7 @@ impl Matcher for RegexMatcher {
use self::RegexMatcherImpl::*;
match self.matcher {
Standard(ref m) => m.shortest_match(haystack),
MultiLiteral(ref m) => m.shortest_match(haystack),
CRLF(ref m) => m.shortest_match(haystack),
Word(ref m) => m.shortest_match(haystack),
}
@@ -610,6 +684,7 @@ impl Matcher for RegexMatcher {
use self::RegexMatcherImpl::*;
match self.matcher {
Standard(ref m) => m.shortest_match_at(haystack, at),
MultiLiteral(ref m) => m.shortest_match_at(haystack, at),
CRLF(ref m) => m.shortest_match_at(haystack, at),
Word(ref m) => m.shortest_match_at(haystack, at),
}
@@ -710,7 +785,9 @@ impl Matcher for StandardMatcher {
at: usize,
caps: &mut RegexCaptures,
) -> Result<bool, NoError> {
Ok(self.regex.captures_read_at(&mut caps.locs, haystack, at).is_some())
Ok(self.regex.captures_read_at(
&mut caps.locations_mut(), haystack, at,
).is_some())
}
fn shortest_match_at(
@@ -737,54 +814,84 @@ impl Matcher for StandardMatcher {
/// index of the group using the corresponding matcher's `capture_index`
/// method, and then use that index with `RegexCaptures::get`.
#[derive(Clone, Debug)]
pub struct RegexCaptures {
/// Where the locations are stored.
locs: CaptureLocations,
/// These captures behave as if the capturing groups begin at the given
/// offset. When set to `0`, this has no affect and capture groups are
/// indexed like normal.
///
/// This is useful when building matchers that wrap arbitrary regular
/// expressions. For example, `WordMatcher` takes an existing regex `re`
/// and creates `(?:^|\W)(re)(?:$|\W)`, but hides the fact that the regex
/// has been wrapped from the caller. In order to do this, the matcher
/// and the capturing groups must behave as if `(re)` is the `0`th capture
/// group.
offset: usize,
/// When enable, the end of a match has `\r` stripped from it, if one
/// exists.
strip_crlf: bool,
pub struct RegexCaptures(RegexCapturesImp);
#[derive(Clone, Debug)]
enum RegexCapturesImp {
AhoCorasick {
/// The start and end of the match, corresponding to capture group 0.
mat: Option<Match>,
},
Regex {
/// Where the locations are stored.
locs: CaptureLocations,
/// These captures behave as if the capturing groups begin at the given
/// offset. When set to `0`, this has no affect and capture groups are
/// indexed like normal.
///
/// This is useful when building matchers that wrap arbitrary regular
/// expressions. For example, `WordMatcher` takes an existing regex
/// `re` and creates `(?:^|\W)(re)(?:$|\W)`, but hides the fact that
/// the regex has been wrapped from the caller. In order to do this,
/// the matcher and the capturing groups must behave as if `(re)` is
/// the `0`th capture group.
offset: usize,
/// When enable, the end of a match has `\r` stripped from it, if one
/// exists.
strip_crlf: bool,
},
}
impl Captures for RegexCaptures {
fn len(&self) -> usize {
self.locs.len().checked_sub(self.offset).unwrap()
match self.0 {
RegexCapturesImp::AhoCorasick { .. } => 1,
RegexCapturesImp::Regex { ref locs, offset, .. } => {
locs.len().checked_sub(offset).unwrap()
}
}
}
fn get(&self, i: usize) -> Option<Match> {
if !self.strip_crlf {
let actual = i.checked_add(self.offset).unwrap();
return self.locs.pos(actual).map(|(s, e)| Match::new(s, e));
}
match self.0 {
RegexCapturesImp::AhoCorasick { mat, .. } => {
if i == 0 {
mat
} else {
None
}
}
RegexCapturesImp::Regex { ref locs, offset, strip_crlf } => {
if !strip_crlf {
let actual = i.checked_add(offset).unwrap();
return locs.pos(actual).map(|(s, e)| Match::new(s, e));
}
// currently don't support capture offsetting with CRLF stripping
assert_eq!(self.offset, 0);
let m = match self.locs.pos(i).map(|(s, e)| Match::new(s, e)) {
None => return None,
Some(m) => m,
};
// If the end position of this match corresponds to the end position
// of the overall match, then we apply our CRLF stripping. Otherwise,
// we cannot assume stripping is correct.
if i == 0 || m.end() == self.locs.pos(0).unwrap().1 {
Some(m.with_end(m.end() - 1))
} else {
Some(m)
// currently don't support capture offsetting with CRLF
// stripping
assert_eq!(offset, 0);
let m = match locs.pos(i).map(|(s, e)| Match::new(s, e)) {
None => return None,
Some(m) => m,
};
// If the end position of this match corresponds to the end
// position of the overall match, then we apply our CRLF
// stripping. Otherwise, we cannot assume stripping is correct.
if i == 0 || m.end() == locs.pos(0).unwrap().1 {
Some(m.with_end(m.end() - 1))
} else {
Some(m)
}
}
}
}
}
impl RegexCaptures {
pub(crate) fn simple() -> RegexCaptures {
RegexCaptures(RegexCapturesImp::AhoCorasick { mat: None })
}
pub(crate) fn new(locs: CaptureLocations) -> RegexCaptures {
RegexCaptures::with_offset(locs, 0)
}
@@ -793,15 +900,53 @@ impl RegexCaptures {
locs: CaptureLocations,
offset: usize,
) -> RegexCaptures {
RegexCaptures { locs, offset, strip_crlf: false }
RegexCaptures(RegexCapturesImp::Regex {
locs, offset, strip_crlf: false,
})
}
pub(crate) fn locations(&mut self) -> &mut CaptureLocations {
&mut self.locs
pub(crate) fn locations(&self) -> &CaptureLocations {
match self.0 {
RegexCapturesImp::AhoCorasick { .. } => {
panic!("getting locations for simple captures is invalid")
}
RegexCapturesImp::Regex { ref locs, .. } => {
locs
}
}
}
pub(crate) fn locations_mut(&mut self) -> &mut CaptureLocations {
match self.0 {
RegexCapturesImp::AhoCorasick { .. } => {
panic!("getting locations for simple captures is invalid")
}
RegexCapturesImp::Regex { ref mut locs, .. } => {
locs
}
}
}
pub(crate) fn strip_crlf(&mut self, yes: bool) {
self.strip_crlf = yes;
match self.0 {
RegexCapturesImp::AhoCorasick { .. } => {
panic!("setting strip_crlf for simple captures is invalid")
}
RegexCapturesImp::Regex { ref mut strip_crlf, .. } => {
*strip_crlf = yes;
}
}
}
pub(crate) fn set_simple(&mut self, one: Option<Match>) {
match self.0 {
RegexCapturesImp::AhoCorasick { ref mut mat } => {
*mat = one;
}
RegexCapturesImp::Regex { .. } => {
panic!("setting simple captures for regex is invalid")
}
}
}
}

127
grep-regex/src/multi.rs Normal file
View File

@@ -0,0 +1,127 @@
use aho_corasick::{AhoCorasick, AhoCorasickBuilder, MatchKind};
use grep_matcher::{Matcher, Match, NoError};
use regex_syntax::hir::Hir;
use error::Error;
use matcher::RegexCaptures;
/// A matcher for an alternation of literals.
///
/// Ideally, this optimization would be pushed down into the regex engine, but
/// making this work correctly there would require quite a bit of refactoring.
/// Moreover, doing it one layer above lets us do thing like, "if we
/// specifically only want to search for literals, then don't bother with
/// regex parsing at all."
#[derive(Clone, Debug)]
pub struct MultiLiteralMatcher {
/// The Aho-Corasick automaton.
ac: AhoCorasick,
}
impl MultiLiteralMatcher {
/// Create a new multi-literal matcher from the given literals.
pub fn new<B: AsRef<[u8]>>(
literals: &[B],
) -> Result<MultiLiteralMatcher, Error> {
let ac = AhoCorasickBuilder::new()
.match_kind(MatchKind::LeftmostFirst)
.auto_configure(literals)
.build_with_size::<usize, _, _>(literals)
.map_err(Error::regex)?;
Ok(MultiLiteralMatcher { ac })
}
}
impl Matcher for MultiLiteralMatcher {
type Captures = RegexCaptures;
type Error = NoError;
fn find_at(
&self,
haystack: &[u8],
at: usize,
) -> Result<Option<Match>, NoError> {
match self.ac.find(&haystack[at..]) {
None => Ok(None),
Some(m) => Ok(Some(Match::new(at + m.start(), at + m.end()))),
}
}
fn new_captures(&self) -> Result<RegexCaptures, NoError> {
Ok(RegexCaptures::simple())
}
fn capture_count(&self) -> usize {
1
}
fn capture_index(&self, _: &str) -> Option<usize> {
None
}
fn captures_at(
&self,
haystack: &[u8],
at: usize,
caps: &mut RegexCaptures,
) -> Result<bool, NoError> {
caps.set_simple(None);
let mat = self.find_at(haystack, at)?;
caps.set_simple(mat);
Ok(mat.is_some())
}
// We specifically do not implement other methods like find_iter. Namely,
// the iter methods are guaranteed to be correct by virtue of implementing
// find_at above.
}
/// Alternation literals checks if the given HIR is a simple alternation of
/// literals, and if so, returns them. Otherwise, this returns None.
pub fn alternation_literals(expr: &Hir) -> Option<Vec<Vec<u8>>> {
use regex_syntax::hir::{HirKind, Literal};
// This is pretty hacky, but basically, if `is_alternation_literal` is
// true, then we can make several assumptions about the structure of our
// HIR. This is what justifies the `unreachable!` statements below.
if !expr.is_alternation_literal() {
return None;
}
let alts = match *expr.kind() {
HirKind::Alternation(ref alts) => alts,
_ => return None, // one literal isn't worth it
};
let extendlit = |lit: &Literal, dst: &mut Vec<u8>| {
match *lit {
Literal::Unicode(c) => {
let mut buf = [0; 4];
dst.extend_from_slice(c.encode_utf8(&mut buf).as_bytes());
}
Literal::Byte(b) => {
dst.push(b);
}
}
};
let mut lits = vec![];
for alt in alts {
let mut lit = vec![];
match *alt.kind() {
HirKind::Empty => {}
HirKind::Literal(ref x) => extendlit(x, &mut lit),
HirKind::Concat(ref exprs) => {
for e in exprs {
match *e.kind() {
HirKind::Literal(ref x) => extendlit(x, &mut lit),
_ => unreachable!("expected literal, got {:?}", e),
}
}
}
_ => unreachable!("expected literal or concat, got {:?}", alt),
}
lits.push(lit);
}
Some(lits)
}

View File

@@ -1,6 +1,6 @@
use grep_matcher::ByteSet;
use regex_syntax::hir::{self, Hir, HirKind};
use utf8_ranges::Utf8Sequences;
use regex_syntax::utf8::Utf8Sequences;
/// Return a confirmed set of non-matching bytes from the given expression.
pub fn non_matching_bytes(expr: &Hir) -> ByteSet {

View File

@@ -103,7 +103,9 @@ impl Matcher for WordMatcher {
at: usize,
caps: &mut RegexCaptures,
) -> Result<bool, NoError> {
let r = self.regex.captures_read_at(caps.locations(), haystack, at);
let r = self.regex.captures_read_at(
caps.locations_mut(), haystack, at,
);
Ok(r.is_some())
}

View File

@@ -1,6 +1,6 @@
[package]
name = "grep-searcher"
version = "0.1.3" #:version
version = "0.1.6" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
Fast line oriented regex searching as a library.
@@ -13,16 +13,16 @@ keywords = ["regex", "grep", "egrep", "search", "pattern"]
license = "Unlicense/MIT"
[dependencies]
bstr = { version = "0.1.2", default-features = false, features = ["std"] }
bstr = { version = "0.2.0", default-features = false, features = ["std"] }
bytecount = "0.5"
encoding_rs = "0.8.14"
encoding_rs_io = "0.1.6"
grep-matcher = { version = "0.1.1", path = "../grep-matcher" }
grep-matcher = { version = "0.1.2", path = "../grep-matcher" }
log = "0.4.5"
memmap = "0.7"
[dev-dependencies]
grep-regex = { version = "0.1.1", path = "../grep-regex" }
grep-regex = { version = "0.1.3", path = "../grep-regex" }
regex = "1.1"
[features]

View File

@@ -17,7 +17,7 @@ fn main() {
}
}
fn example() -> Result<(), Box<Error>> {
fn example() -> Result<(), Box<dyn Error>> {
let pattern = match env::args().nth(1) {
Some(pattern) => pattern,
None => return Err(From::from(format!(

View File

@@ -1,7 +1,7 @@
use std::cmp;
use std::io;
use bstr::{BStr, BString};
use bstr::ByteSlice;
/// The default buffer capacity that we use for the line buffer.
pub(crate) const DEFAULT_BUFFER_CAPACITY: usize = 8 * (1<<10); // 8 KB
@@ -122,7 +122,7 @@ impl LineBufferBuilder {
pub fn build(&self) -> LineBuffer {
LineBuffer {
config: self.config,
buf: BString::from(vec![0; self.config.capacity]),
buf: vec![0; self.config.capacity],
pos: 0,
last_lineterm: 0,
end: 0,
@@ -254,13 +254,14 @@ impl<'b, R: io::Read> LineBufferReader<'b, R> {
/// Return the contents of this buffer.
pub fn buffer(&self) -> &[u8] {
self.line_buffer.buffer().as_bytes()
self.line_buffer.buffer()
}
/// Return the underlying buffer as a byte string. Used for tests only.
/// Return the buffer as a BStr, used for convenient equality checking
/// in tests only.
#[cfg(test)]
fn bstr(&self) -> &BStr {
self.line_buffer.buffer()
fn bstr(&self) -> &::bstr::BStr {
self.buffer().as_bstr()
}
/// Consume the number of bytes provided. This must be less than or equal
@@ -289,7 +290,7 @@ pub struct LineBuffer {
/// The configuration of this buffer.
config: Config,
/// The primary buffer with which to hold data.
buf: BString,
buf: Vec<u8>,
/// The current position of this buffer. This is always a valid sliceable
/// index into `buf`, and its maximum value is the length of `buf`.
pos: usize,
@@ -317,6 +318,14 @@ pub struct LineBuffer {
}
impl LineBuffer {
/// Set the binary detection method used on this line buffer.
///
/// This permits dynamically changing the binary detection strategy on
/// an existing line buffer without needing to create a new one.
pub fn set_binary_detection(&mut self, binary: BinaryDetection) {
self.config.binary = binary;
}
/// Reset this buffer, such that it can be used with a new reader.
fn clear(&mut self) {
self.pos = 0;
@@ -344,13 +353,13 @@ impl LineBuffer {
}
/// Return the contents of this buffer.
fn buffer(&self) -> &BStr {
fn buffer(&self) -> &[u8] {
&self.buf[self.pos..self.last_lineterm]
}
/// Return the contents of the free space beyond the end of the buffer as
/// a mutable slice.
fn free_buffer(&mut self) -> &mut BStr {
fn free_buffer(&mut self) -> &mut [u8] {
&mut self.buf[self.end..]
}
@@ -473,7 +482,7 @@ impl LineBuffer {
}
let roll_len = self.end - self.pos;
self.buf.copy_within(self.pos.., 0);
self.buf.copy_within_str(self.pos..self.end, 0);
self.pos = 0;
self.last_lineterm = roll_len;
self.end = roll_len;
@@ -511,7 +520,7 @@ impl LineBuffer {
/// Replaces `src` with `replacement` in bytes, and return the offset of the
/// first replacement, if one exists.
fn replace_bytes(bytes: &mut BStr, src: u8, replacement: u8) -> Option<usize> {
fn replace_bytes(bytes: &mut [u8], src: u8, replacement: u8) -> Option<usize> {
if src == replacement {
return None;
}
@@ -534,7 +543,7 @@ fn replace_bytes(bytes: &mut BStr, src: u8, replacement: u8) -> Option<usize> {
#[cfg(test)]
mod tests {
use std::str;
use bstr::BString;
use bstr::{ByteSlice, ByteVec};
use super::*;
const SHERLOCK: &'static str = "\
@@ -555,7 +564,7 @@ and exhibited clearly, with a label attached.\
src: u8,
replacement: u8,
) -> (String, Option<usize>) {
let mut dst = BString::from(slice);
let mut dst = Vec::from(slice);
let result = replace_bytes(&mut dst, src, replacement);
(dst.into_string().unwrap(), result)
}
@@ -669,12 +678,12 @@ and exhibited clearly, with a label attached.\
let mut linebuf = LineBufferBuilder::new().capacity(1).build();
let mut rdr = LineBufferReader::new(bytes.as_bytes(), &mut linebuf);
let mut got = BString::new();
let mut got = vec![];
while rdr.fill().unwrap() {
got.push(rdr.buffer());
got.push_str(rdr.buffer());
rdr.consume_all();
}
assert_eq!(bytes, got);
assert_eq!(bytes, got.as_bstr());
assert_eq!(rdr.absolute_byte_offset(), bytes.len() as u64);
assert_eq!(rdr.binary_byte_offset(), None);
}

View File

@@ -2,7 +2,7 @@
A collection of routines for performing operations on lines.
*/
use bstr::B;
use bstr::ByteSlice;
use bytecount;
use grep_matcher::{LineTerminator, Match};
@@ -85,7 +85,7 @@ impl LineStep {
#[inline(always)]
fn next_impl(&mut self, mut bytes: &[u8]) -> Option<(usize, usize)> {
bytes = &bytes[..self.end];
match B(&bytes[self.pos..]).find_byte(self.line_term) {
match bytes[self.pos..].find_byte(self.line_term) {
None => {
if self.pos < bytes.len() {
let m = (self.pos, bytes.len());
@@ -135,14 +135,14 @@ pub fn locate(
line_term: u8,
range: Match,
) -> Match {
let line_start = B(&bytes[..range.start()])
let line_start = bytes[..range.start()]
.rfind_byte(line_term)
.map_or(0, |i| i + 1);
let line_end =
if range.end() > line_start && bytes[range.end() - 1] == line_term {
range.end()
} else {
B(&bytes[range.end()..])
bytes[range.end()..]
.find_byte(line_term)
.map_or(bytes.len(), |i| range.end() + i + 1)
};
@@ -182,7 +182,7 @@ fn preceding_by_pos(
pos -= 1;
}
loop {
match B(&bytes[..pos]).rfind_byte(line_term) {
match bytes[..pos].rfind_byte(line_term) {
None => {
return 0;
}

View File

@@ -1,6 +1,6 @@
use std::cmp;
use bstr::B;
use bstr::ByteSlice;
use grep_matcher::{LineMatchKind, Matcher};
use lines::{self, LineStep};
@@ -90,6 +90,13 @@ impl<'s, M: Matcher, S: Sink> Core<'s, M, S> {
self.sink_matched(buf, range)
}
pub fn binary_data(
&mut self,
binary_byte_offset: u64,
) -> Result<bool, S::Error> {
self.sink.binary_data(&self.searcher, binary_byte_offset)
}
pub fn begin(&mut self) -> Result<bool, S::Error> {
self.sink.begin(&self.searcher)
}
@@ -141,19 +148,28 @@ impl<'s, M: Matcher, S: Sink> Core<'s, M, S> {
consumed
}
pub fn detect_binary(&mut self, buf: &[u8], range: &Range) -> bool {
pub fn detect_binary(
&mut self,
buf: &[u8],
range: &Range,
) -> Result<bool, S::Error> {
if self.binary_byte_offset.is_some() {
return true;
return Ok(self.config.binary.quit_byte().is_some());
}
let binary_byte = match self.config.binary.0 {
BinaryDetection::Quit(b) => b,
_ => return false,
BinaryDetection::Convert(b) => b,
_ => return Ok(false),
};
if let Some(i) = B(&buf[*range]).find_byte(binary_byte) {
self.binary_byte_offset = Some(range.start() + i);
true
if let Some(i) = buf[*range].find_byte(binary_byte) {
let offset = range.start() + i;
self.binary_byte_offset = Some(offset);
if !self.binary_data(offset as u64)? {
return Ok(true);
}
Ok(self.config.binary.quit_byte().is_some())
} else {
false
Ok(false)
}
}
@@ -416,7 +432,7 @@ impl<'s, M: Matcher, S: Sink> Core<'s, M, S> {
buf: &[u8],
range: &Range,
) -> Result<bool, S::Error> {
if self.binary && self.detect_binary(buf, range) {
if self.binary && self.detect_binary(buf, range)? {
return Ok(false);
}
if !self.sink_break_context(range.start())? {
@@ -448,7 +464,7 @@ impl<'s, M: Matcher, S: Sink> Core<'s, M, S> {
buf: &[u8],
range: &Range,
) -> Result<bool, S::Error> {
if self.binary && self.detect_binary(buf, range) {
if self.binary && self.detect_binary(buf, range)? {
return Ok(false);
}
self.count_lines(buf, range.start());
@@ -478,7 +494,7 @@ impl<'s, M: Matcher, S: Sink> Core<'s, M, S> {
) -> Result<bool, S::Error> {
assert!(self.after_context_left >= 1);
if self.binary && self.detect_binary(buf, range) {
if self.binary && self.detect_binary(buf, range)? {
return Ok(false);
}
self.count_lines(buf, range.start());
@@ -507,7 +523,7 @@ impl<'s, M: Matcher, S: Sink> Core<'s, M, S> {
buf: &[u8],
range: &Range,
) -> Result<bool, S::Error> {
if self.binary && self.detect_binary(buf, range) {
if self.binary && self.detect_binary(buf, range)? {
return Ok(false);
}
self.count_lines(buf, range.start());

View File

@@ -51,6 +51,7 @@ where M: Matcher,
fn fill(&mut self) -> Result<bool, S::Error> {
assert!(self.rdr.buffer()[self.core.pos()..].is_empty());
let already_binary = self.rdr.binary_byte_offset().is_some();
let old_buf_len = self.rdr.buffer().len();
let consumed = self.core.roll(self.rdr.buffer());
self.rdr.consume(consumed);
@@ -58,7 +59,14 @@ where M: Matcher,
Err(err) => return Err(S::Error::error_io(err)),
Ok(didread) => didread,
};
if !didread || self.rdr.binary_byte_offset().is_some() {
if !already_binary {
if let Some(offset) = self.rdr.binary_byte_offset() {
if !self.core.binary_data(offset)? {
return Ok(false);
}
}
}
if !didread || self.should_binary_quit() {
return Ok(false);
}
// If rolling the buffer didn't result in consuming anything and if
@@ -71,6 +79,11 @@ where M: Matcher,
}
Ok(true)
}
fn should_binary_quit(&self) -> bool {
self.rdr.binary_byte_offset().is_some()
&& self.config.binary.quit_byte().is_some()
}
}
#[derive(Debug)]
@@ -103,7 +116,7 @@ impl<'s, M: Matcher, S: Sink> SliceByLine<'s, M, S> {
DEFAULT_BUFFER_CAPACITY,
);
let binary_range = Range::new(0, binary_upto);
if !self.core.detect_binary(self.slice, &binary_range) {
if !self.core.detect_binary(self.slice, &binary_range)? {
while
!self.slice[self.core.pos()..].is_empty()
&& self.core.match_by_line(self.slice)?
@@ -155,7 +168,7 @@ impl<'s, M: Matcher, S: Sink> MultiLine<'s, M, S> {
DEFAULT_BUFFER_CAPACITY,
);
let binary_range = Range::new(0, binary_upto);
if !self.core.detect_binary(self.slice, &binary_range) {
if !self.core.detect_binary(self.slice, &binary_range)? {
let mut keepgoing = true;
while !self.slice[self.core.pos()..].is_empty() && keepgoing {
keepgoing = self.sink()?;

View File

@@ -75,25 +75,41 @@ impl BinaryDetection {
BinaryDetection(line_buffer::BinaryDetection::Quit(binary_byte))
}
// TODO(burntsushi): Figure out how to make binary conversion work. This
// permits implementing GNU grep's default behavior, which is to zap NUL
// bytes but still execute a search (if a match is detected, then GNU grep
// stops and reports that a match was found but doesn't print the matching
// line itself).
//
// This behavior is pretty simple to implement using the line buffer (and
// in fact, it is already implemented and tested), since there's a fixed
// size buffer that we can easily write to. The issue arises when searching
// a `&[u8]` (whether on the heap or via a memory map), since this isn't
// something we can easily write to.
/// The given byte is searched in all contents read by the line buffer. If
/// it occurs, then it is replaced by the line terminator. The line buffer
/// guarantees that this byte will never be observable by callers.
#[allow(dead_code)]
fn convert(binary_byte: u8) -> BinaryDetection {
/// Binary detection is performed by looking for the given byte, and
/// replacing it with the line terminator configured on the searcher.
/// (If the searcher is configured to use `CRLF` as the line terminator,
/// then this byte is replaced by just `LF`.)
///
/// When searching is performed using a fixed size buffer, then the
/// contents of that buffer are always searched for the presence of this
/// byte and replaced with the line terminator. In effect, the caller is
/// guaranteed to never observe this byte while searching.
///
/// When searching is performed with the entire contents mapped into
/// memory, then this setting has no effect and is ignored.
pub fn convert(binary_byte: u8) -> BinaryDetection {
BinaryDetection(line_buffer::BinaryDetection::Convert(binary_byte))
}
/// If this binary detection uses the "quit" strategy, then this returns
/// the byte that will cause a search to quit. In any other case, this
/// returns `None`.
pub fn quit_byte(&self) -> Option<u8> {
match self.0 {
line_buffer::BinaryDetection::Quit(b) => Some(b),
_ => None,
}
}
/// If this binary detection uses the "convert" strategy, then this returns
/// the byte that will be replaced by the line terminator. In any other
/// case, this returns `None`.
pub fn convert_byte(&self) -> Option<u8> {
match self.0 {
line_buffer::BinaryDetection::Convert(b) => Some(b),
_ => None,
}
}
}
/// An encoding to use when searching.
@@ -739,6 +755,12 @@ impl Searcher {
}
}
/// Set the binary detection method used on this searcher.
pub fn set_binary_detection(&mut self, detection: BinaryDetection) {
self.config.binary = detection.clone();
self.line_buffer.borrow_mut().set_binary_detection(detection.0);
}
/// Check that the searcher's configuration and the matcher are consistent
/// with each other.
fn check_config<M: Matcher>(&self, matcher: M) -> Result<(), ConfigError> {
@@ -778,6 +800,12 @@ impl Searcher {
self.config.line_term
}
/// Returns the type of binary detection configured on this searcher.
#[inline]
pub fn binary_detection(&self) -> &BinaryDetection {
&self.config.binary
}
/// Returns true if and only if this searcher is configured to invert its
/// search results. That is, matching lines are lines that do **not** match
/// the searcher's matcher.

View File

@@ -1,3 +1,4 @@
use std::error;
use std::fmt;
use std::io;
@@ -49,9 +50,9 @@ impl SinkError for io::Error {
/// A `Box<std::error::Error>` can be used as an error for `Sink`
/// implementations out of the box.
impl SinkError for Box<::std::error::Error> {
fn error_message<T: fmt::Display>(message: T) -> Box<::std::error::Error> {
Box::<::std::error::Error>::from(message.to_string())
impl SinkError for Box<dyn error::Error> {
fn error_message<T: fmt::Display>(message: T) -> Box<dyn error::Error> {
Box::<dyn error::Error>::from(message.to_string())
}
}
@@ -167,6 +168,28 @@ pub trait Sink {
Ok(true)
}
/// This method is called whenever binary detection is enabled and binary
/// data is found. If binary data is found, then this is called at least
/// once for the first occurrence with the absolute byte offset at which
/// the binary data begins.
///
/// If this returns `true`, then searching continues. If this returns
/// `false`, then searching is stopped immediately and `finish` is called.
///
/// If this returns an error, then searching is stopped immediately,
/// `finish` is not called and the error is bubbled back up to the caller
/// of the searcher.
///
/// By default, it does nothing and returns `true`.
#[inline]
fn binary_data(
&mut self,
_searcher: &Searcher,
_binary_byte_offset: u64,
) -> Result<bool, Self::Error> {
Ok(true)
}
/// This method is called when a search has begun, before any search is
/// executed. By default, this does nothing.
///
@@ -228,6 +251,15 @@ impl<'a, S: Sink> Sink for &'a mut S {
(**self).context_break(searcher)
}
#[inline]
fn binary_data(
&mut self,
searcher: &Searcher,
binary_byte_offset: u64,
) -> Result<bool, S::Error> {
(**self).binary_data(searcher, binary_byte_offset)
}
#[inline]
fn begin(
&mut self,
@@ -275,6 +307,15 @@ impl<S: Sink + ?Sized> Sink for Box<S> {
(**self).context_break(searcher)
}
#[inline]
fn binary_data(
&mut self,
searcher: &Searcher,
binary_byte_offset: u64,
) -> Result<bool, S::Error> {
(**self).binary_data(searcher, binary_byte_offset)
}
#[inline]
fn begin(
&mut self,

View File

@@ -1,7 +1,7 @@
use std::io::{self, Write};
use std::str;
use bstr::B;
use bstr::ByteSlice;
use grep_matcher::{
LineMatchKind, LineTerminator, Match, Matcher, NoCaptures, NoError,
};
@@ -94,7 +94,7 @@ impl Matcher for RegexMatcher {
}
// Make it interesting and return the last byte in the current
// line.
let i = B(haystack)
let i = haystack
.find_byte(self.line_term.unwrap().as_byte())
.map(|i| i)
.unwrap_or(haystack.len() - 1);

View File

@@ -1,6 +1,6 @@
[package]
name = "grep"
version = "0.2.3" #:version
version = "0.2.4" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
Fast line oriented regex searching as a library.
@@ -13,12 +13,12 @@ keywords = ["regex", "grep", "egrep", "search", "pattern"]
license = "Unlicense/MIT"
[dependencies]
grep-cli = { version = "0.1.1", path = "../grep-cli" }
grep-matcher = { version = "0.1.1", path = "../grep-matcher" }
grep-pcre2 = { version = "0.1.2", path = "../grep-pcre2", optional = true }
grep-printer = { version = "0.1.1", path = "../grep-printer" }
grep-regex = { version = "0.1.1", path = "../grep-regex" }
grep-searcher = { version = "0.1.1", path = "../grep-searcher" }
grep-cli = { version = "0.1.2", path = "../grep-cli" }
grep-matcher = { version = "0.1.2", path = "../grep-matcher" }
grep-pcre2 = { version = "0.1.3", path = "../grep-pcre2", optional = true }
grep-printer = { version = "0.1.2", path = "../grep-printer" }
grep-regex = { version = "0.1.3", path = "../grep-regex" }
grep-searcher = { version = "0.1.4", path = "../grep-searcher" }
[dev-dependencies]
termcolor = "1.0.4"

View File

@@ -21,7 +21,7 @@ fn main() {
}
}
fn try_main() -> Result<(), Box<Error>> {
fn try_main() -> Result<(), Box<dyn Error>> {
let mut args: Vec<OsString> = env::args_os().collect();
if args.len() < 2 {
return Err("Usage: simplegrep <pattern> [<path> ...]".into());
@@ -32,7 +32,7 @@ fn try_main() -> Result<(), Box<Error>> {
search(cli::pattern_from_os(&args[1])?, &args[2..])
}
fn search(pattern: &str, paths: &[OsString]) -> Result<(), Box<Error>> {
fn search(pattern: &str, paths: &[OsString]) -> Result<(), Box<dyn Error>> {
let matcher = RegexMatcher::new_line_matcher(&pattern)?;
let mut searcher = SearcherBuilder::new()
.binary_detection(BinaryDetection::quit(b'\x00'))

View File

@@ -1,6 +1,6 @@
[package]
name = "ignore"
version = "0.4.6" #:version
version = "0.4.10" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
A fast library for efficiently matching ignore files such as `.gitignore`
@@ -19,7 +19,7 @@ bench = false
[dependencies]
crossbeam-channel = "0.3.6"
globset = { version = "0.4.2", path = "../globset" }
globset = { version = "0.4.3", path = "../globset" }
lazy_static = "1.1"
log = "0.4.5"
memchr = "2.1"
@@ -31,8 +31,5 @@ walkdir = "2.2.7"
[target.'cfg(windows)'.dependencies.winapi-util]
version = "0.1.2"
[dev-dependencies]
tempfile = "3.0.5"
[features]
simd-accel = ["globset/simd-accel"]

View File

@@ -14,13 +14,13 @@
// well.
use std::collections::HashMap;
use std::ffi::{OsString, OsStr};
use std::ffi::{OsStr, OsString};
use std::path::{Path, PathBuf};
use std::sync::{Arc, RwLock};
use gitignore::{self, Gitignore, GitignoreBuilder};
use pathutil::{is_hidden, strip_prefix};
use overrides::{self, Override};
use pathutil::{is_hidden, strip_prefix};
use types::{self, Types};
use walk::DirEntry;
use {Error, Match, PartialErrorBuilder};
@@ -152,10 +152,7 @@ impl Ignore {
///
/// Note that this can only be called on an `Ignore` matcher with no
/// parents (i.e., `is_root` returns `true`). This will panic otherwise.
pub fn add_parents<P: AsRef<Path>>(
&self,
path: P,
) -> (Ignore, Option<Error>) {
pub fn add_parents<P: AsRef<Path>>(&self, path: P) -> (Ignore, Option<Error>) {
if !self.0.opts.parents
&& !self.0.opts.git_ignore
&& !self.0.opts.git_exclude
@@ -197,7 +194,11 @@ impl Ignore {
errs.maybe_push(err);
igtmp.is_absolute_parent = true;
igtmp.absolute_base = Some(absolute_base.clone());
igtmp.has_git = parent.join(".git").exists();
igtmp.has_git = if self.0.opts.git_ignore {
parent.join(".git").exists()
} else {
false
};
ig = Ignore(Arc::new(igtmp));
compiled.insert(parent.as_os_str().to_os_string(), ig.clone());
}
@@ -212,10 +213,7 @@ impl Ignore {
/// returned if it exists.
///
/// Note that all I/O errors are completely ignored.
pub fn add_child<P: AsRef<Path>>(
&self,
dir: P,
) -> (Ignore, Option<Error>) {
pub fn add_child<P: AsRef<Path>>(&self, dir: P) -> (Ignore, Option<Error>) {
let (ig, err) = self.add_child_path(dir.as_ref());
(Ignore(Arc::new(ig)), err)
}
@@ -223,58 +221,49 @@ impl Ignore {
/// Like add_child, but takes a full path and returns an IgnoreInner.
fn add_child_path(&self, dir: &Path) -> (IgnoreInner, Option<Error>) {
let mut errs = PartialErrorBuilder::default();
let custom_ig_matcher =
if self.0.custom_ignore_filenames.is_empty() {
Gitignore::empty()
} else {
let (m, err) =
create_gitignore(
&dir,
&self.0.custom_ignore_filenames,
self.0.opts.ignore_case_insensitive,
);
errs.maybe_push(err);
m
};
let ig_matcher =
if !self.0.opts.ignore {
Gitignore::empty()
} else {
let (m, err) =
create_gitignore(
&dir,
&[".ignore"],
self.0.opts.ignore_case_insensitive,
);
errs.maybe_push(err);
m
};
let gi_matcher =
if !self.0.opts.git_ignore {
Gitignore::empty()
} else {
let (m, err) =
create_gitignore(
&dir,
&[".gitignore"],
self.0.opts.ignore_case_insensitive,
);
errs.maybe_push(err);
m
};
let gi_exclude_matcher =
if !self.0.opts.git_exclude {
Gitignore::empty()
} else {
let (m, err) =
create_gitignore(
&dir,
&[".git/info/exclude"],
self.0.opts.ignore_case_insensitive,
);
errs.maybe_push(err);
m
};
let custom_ig_matcher = if self.0.custom_ignore_filenames.is_empty() {
Gitignore::empty()
} else {
let (m, err) = create_gitignore(
&dir,
&self.0.custom_ignore_filenames,
self.0.opts.ignore_case_insensitive,
);
errs.maybe_push(err);
m
};
let ig_matcher = if !self.0.opts.ignore {
Gitignore::empty()
} else {
let (m, err) =
create_gitignore(&dir, &[".ignore"], self.0.opts.ignore_case_insensitive);
errs.maybe_push(err);
m
};
let gi_matcher = if !self.0.opts.git_ignore {
Gitignore::empty()
} else {
let (m, err) =
create_gitignore(&dir, &[".gitignore"], self.0.opts.ignore_case_insensitive);
errs.maybe_push(err);
m
};
let gi_exclude_matcher = if !self.0.opts.git_exclude {
Gitignore::empty()
} else {
let (m, err) = create_gitignore(
&dir,
&[".git/info/exclude"],
self.0.opts.ignore_case_insensitive,
);
errs.maybe_push(err);
m
};
let has_git = if self.0.opts.git_ignore {
dir.join(".git").exists()
} else {
false
};
let ig = IgnoreInner {
compiled: self.0.compiled.clone(),
dir: dir.to_path_buf(),
@@ -290,7 +279,7 @@ impl Ignore {
git_global_matcher: self.0.git_global_matcher.clone(),
git_ignore_matcher: gi_matcher,
git_exclude_matcher: gi_exclude_matcher,
has_git: dir.join(".git").exists(),
has_git: has_git,
opts: self.0.opts,
};
(ig, errs.into_error_option())
@@ -302,16 +291,16 @@ impl Ignore {
let has_custom_ignore_files = !self.0.custom_ignore_filenames.is_empty();
let has_explicit_ignores = !self.0.explicit_ignores.is_empty();
opts.ignore || opts.git_global || opts.git_ignore
|| opts.git_exclude || has_custom_ignore_files
|| has_explicit_ignores
opts.ignore
|| opts.git_global
|| opts.git_ignore
|| opts.git_exclude
|| has_custom_ignore_files
|| has_explicit_ignores
}
/// Like `matched`, but works with a directory entry instead.
pub fn matched_dir_entry<'a>(
&'a self,
dent: &DirEntry,
) -> Match<IgnoreMatch<'a>> {
pub fn matched_dir_entry<'a>(&'a self, dent: &DirEntry) -> Match<IgnoreMatch<'a>> {
let m = self.matched(dent.path(), dent.is_dir());
if m.is_none() && self.0.opts.hidden && is_hidden(dent) {
return Match::Ignore(IgnoreMatch::hidden());
@@ -323,11 +312,7 @@ impl Ignore {
/// ignored or not.
///
/// The match contains information about its origin.
fn matched<'a, P: AsRef<Path>>(
&'a self,
path: P,
is_dir: bool,
) -> Match<IgnoreMatch<'a>> {
fn matched<'a, P: AsRef<Path>>(&'a self, path: P, is_dir: bool) -> Match<IgnoreMatch<'a>> {
// We need to be careful with our path. If it has a leading ./, then
// strip it because it causes nothing but trouble.
let mut path = path.as_ref();
@@ -339,9 +324,11 @@ impl Ignore {
// return that result immediately. Overrides have the highest
// precedence.
if !self.0.overrides.is_empty() {
let mat =
self.0.overrides.matched(path, is_dir)
.map(IgnoreMatch::overrides);
let mat = self
.0
.overrides
.matched(path, is_dir)
.map(IgnoreMatch::overrides);
if !mat.is_none() {
return mat;
}
@@ -356,8 +343,7 @@ impl Ignore {
}
}
if !self.0.types.is_empty() {
let mat =
self.0.types.matched(path, is_dir).map(IgnoreMatch::types);
let mat = self.0.types.matched(path, is_dir).map(IgnoreMatch::types);
if mat.is_ignore() {
return mat;
} else if mat.is_whitelist() {
@@ -369,61 +355,70 @@ impl Ignore {
/// Performs matching only on the ignore files for this directory and
/// all parent directories.
fn matched_ignore<'a>(
&'a self,
path: &Path,
is_dir: bool,
) -> Match<IgnoreMatch<'a>> {
let (mut m_custom_ignore, mut m_ignore, mut m_gi, mut m_gi_exclude, mut m_explicit) =
(Match::None, Match::None, Match::None, Match::None, Match::None);
fn matched_ignore<'a>(&'a self, path: &Path, is_dir: bool) -> Match<IgnoreMatch<'a>> {
let (mut m_custom_ignore, mut m_ignore, mut m_gi, mut m_gi_exclude, mut m_explicit) = (
Match::None,
Match::None,
Match::None,
Match::None,
Match::None,
);
let any_git = self.parents().any(|ig| ig.0.has_git);
let mut saw_git = false;
for ig in self.parents().take_while(|ig| !ig.0.is_absolute_parent) {
if m_custom_ignore.is_none() {
m_custom_ignore =
ig.0.custom_ignore_matcher.matched(path, is_dir)
.map(IgnoreMatch::gitignore);
ig.0.custom_ignore_matcher
.matched(path, is_dir)
.map(IgnoreMatch::gitignore);
}
if m_ignore.is_none() {
m_ignore =
ig.0.ignore_matcher.matched(path, is_dir)
.map(IgnoreMatch::gitignore);
ig.0.ignore_matcher
.matched(path, is_dir)
.map(IgnoreMatch::gitignore);
}
if any_git && !saw_git && m_gi.is_none() {
m_gi =
ig.0.git_ignore_matcher.matched(path, is_dir)
.map(IgnoreMatch::gitignore);
ig.0.git_ignore_matcher
.matched(path, is_dir)
.map(IgnoreMatch::gitignore);
}
if any_git && !saw_git && m_gi_exclude.is_none() {
m_gi_exclude =
ig.0.git_exclude_matcher.matched(path, is_dir)
.map(IgnoreMatch::gitignore);
ig.0.git_exclude_matcher
.matched(path, is_dir)
.map(IgnoreMatch::gitignore);
}
saw_git = saw_git || ig.0.has_git;
}
if self.0.opts.parents {
if let Some(abs_parent_path) = self.absolute_base() {
let path = abs_parent_path.join(path);
for ig in self.parents().skip_while(|ig|!ig.0.is_absolute_parent) {
for ig in self.parents().skip_while(|ig| !ig.0.is_absolute_parent) {
if m_custom_ignore.is_none() {
m_custom_ignore =
ig.0.custom_ignore_matcher.matched(&path, is_dir)
.map(IgnoreMatch::gitignore);
ig.0.custom_ignore_matcher
.matched(&path, is_dir)
.map(IgnoreMatch::gitignore);
}
if m_ignore.is_none() {
m_ignore =
ig.0.ignore_matcher.matched(&path, is_dir)
.map(IgnoreMatch::gitignore);
ig.0.ignore_matcher
.matched(&path, is_dir)
.map(IgnoreMatch::gitignore);
}
if any_git && !saw_git && m_gi.is_none() {
m_gi =
ig.0.git_ignore_matcher.matched(&path, is_dir)
.map(IgnoreMatch::gitignore);
ig.0.git_ignore_matcher
.matched(&path, is_dir)
.map(IgnoreMatch::gitignore);
}
if any_git && !saw_git && m_gi_exclude.is_none() {
m_gi_exclude =
ig.0.git_exclude_matcher.matched(&path, is_dir)
.map(IgnoreMatch::gitignore);
ig.0.git_exclude_matcher
.matched(&path, is_dir)
.map(IgnoreMatch::gitignore);
}
saw_git = saw_git || ig.0.has_git;
}
@@ -435,16 +430,21 @@ impl Ignore {
}
m_explicit = gi.matched(&path, is_dir).map(IgnoreMatch::gitignore);
}
let m_global =
if any_git {
self.0.git_global_matcher
.matched(&path, is_dir)
.map(IgnoreMatch::gitignore)
} else {
Match::None
};
let m_global = if any_git {
self.0
.git_global_matcher
.matched(&path, is_dir)
.map(IgnoreMatch::gitignore)
} else {
Match::None
};
m_custom_ignore.or(m_ignore).or(m_gi).or(m_gi_exclude).or(m_global).or(m_explicit)
m_custom_ignore
.or(m_ignore)
.or(m_gi)
.or(m_gi_exclude)
.or(m_global)
.or(m_explicit)
}
/// Returns an iterator over parent ignore matchers, including this one.
@@ -524,20 +524,19 @@ impl IgnoreBuilder {
/// The matcher returned won't match anything until ignore rules from
/// directories are added to it.
pub fn build(&self) -> Ignore {
let git_global_matcher =
if !self.opts.git_global {
Gitignore::empty()
} else {
let mut builder = GitignoreBuilder::new("");
builder
.case_insensitive(self.opts.ignore_case_insensitive)
.unwrap();
let (gi, err) = builder.build_global();
if let Some(err) = err {
debug!("{}", err);
}
gi
};
let git_global_matcher = if !self.opts.git_global {
Gitignore::empty()
} else {
let mut builder = GitignoreBuilder::new("");
builder
.case_insensitive(self.opts.ignore_case_insensitive)
.unwrap();
let (gi, err) = builder.build_global();
if let Some(err) = err {
debug!("{}", err);
}
gi
};
Ignore(Arc::new(IgnoreInner {
compiled: Arc::new(RwLock::new(HashMap::new())),
@@ -593,9 +592,10 @@ impl IgnoreBuilder {
/// later names.
pub fn add_custom_ignore_filename<S: AsRef<OsStr>>(
&mut self,
file_name: S
file_name: S,
) -> &mut IgnoreBuilder {
self.custom_ignore_filenames.push(file_name.as_ref().to_os_string());
self.custom_ignore_filenames
.push(file_name.as_ref().to_os_string());
self
}
@@ -667,10 +667,7 @@ impl IgnoreBuilder {
/// Process ignore files case insensitively
///
/// This is disabled by default.
pub fn ignore_case_insensitive(
&mut self,
yes: bool,
) -> &mut IgnoreBuilder {
pub fn ignore_case_insensitive(&mut self, yes: bool) -> &mut IgnoreBuilder {
self.opts.ignore_case_insensitive = yes;
self
}
@@ -710,10 +707,9 @@ mod tests {
use std::io::Write;
use std::path::Path;
use tempfile::{self, TempDir};
use dir::IgnoreBuilder;
use gitignore::Gitignore;
use tests::TempDir;
use Error;
fn wfile<P: AsRef<Path>>(path: P, contents: &str) {
@@ -732,19 +728,21 @@ mod tests {
}
}
fn tmpdir(prefix: &str) -> TempDir {
tempfile::Builder::new().prefix(prefix).tempdir().unwrap()
fn tmpdir() -> TempDir {
TempDir::new().unwrap()
}
#[test]
fn explicit_ignore() {
let td = tmpdir("ignore-test-");
let td = tmpdir();
wfile(td.path().join("not-an-ignore"), "foo\n!bar");
let (gi, err) = Gitignore::new(td.path().join("not-an-ignore"));
assert!(err.is_none());
let (ig, err) = IgnoreBuilder::new()
.add_ignore(gi).build().add_child(td.path());
.add_ignore(gi)
.build()
.add_child(td.path());
assert!(err.is_none());
assert!(ig.matched("foo", false).is_ignore());
assert!(ig.matched("bar", false).is_whitelist());
@@ -753,7 +751,7 @@ mod tests {
#[test]
fn git_exclude() {
let td = tmpdir("ignore-test-");
let td = tmpdir();
mkdirp(td.path().join(".git/info"));
wfile(td.path().join(".git/info/exclude"), "foo\n!bar");
@@ -766,7 +764,7 @@ mod tests {
#[test]
fn gitignore() {
let td = tmpdir("ignore-test-");
let td = tmpdir();
mkdirp(td.path().join(".git"));
wfile(td.path().join(".gitignore"), "foo\n!bar");
@@ -779,7 +777,7 @@ mod tests {
#[test]
fn gitignore_no_git() {
let td = tmpdir("ignore-test-");
let td = tmpdir();
wfile(td.path().join(".gitignore"), "foo\n!bar");
let (ig, err) = IgnoreBuilder::new().build().add_child(td.path());
@@ -791,7 +789,7 @@ mod tests {
#[test]
fn ignore() {
let td = tmpdir("ignore-test-");
let td = tmpdir();
wfile(td.path().join(".ignore"), "foo\n!bar");
let (ig, err) = IgnoreBuilder::new().build().add_child(td.path());
@@ -803,13 +801,14 @@ mod tests {
#[test]
fn custom_ignore() {
let td = tmpdir("ignore-test-");
let td = tmpdir();
let custom_ignore = ".customignore";
wfile(td.path().join(custom_ignore), "foo\n!bar");
let (ig, err) = IgnoreBuilder::new()
.add_custom_ignore_filename(custom_ignore)
.build().add_child(td.path());
.build()
.add_child(td.path());
assert!(err.is_none());
assert!(ig.matched("foo", false).is_ignore());
assert!(ig.matched("bar", false).is_whitelist());
@@ -819,14 +818,15 @@ mod tests {
// Tests that a custom ignore file will override an .ignore.
#[test]
fn custom_ignore_over_ignore() {
let td = tmpdir("ignore-test-");
let td = tmpdir();
let custom_ignore = ".customignore";
wfile(td.path().join(".ignore"), "foo");
wfile(td.path().join(custom_ignore), "!foo");
let (ig, err) = IgnoreBuilder::new()
.add_custom_ignore_filename(custom_ignore)
.build().add_child(td.path());
.build()
.add_child(td.path());
assert!(err.is_none());
assert!(ig.matched("foo", false).is_whitelist());
}
@@ -834,7 +834,7 @@ mod tests {
// Tests that earlier custom ignore files have lower precedence than later.
#[test]
fn custom_ignore_precedence() {
let td = tmpdir("ignore-test-");
let td = tmpdir();
let custom_ignore1 = ".customignore1";
let custom_ignore2 = ".customignore2";
wfile(td.path().join(custom_ignore1), "foo");
@@ -843,7 +843,8 @@ mod tests {
let (ig, err) = IgnoreBuilder::new()
.add_custom_ignore_filename(custom_ignore1)
.add_custom_ignore_filename(custom_ignore2)
.build().add_child(td.path());
.build()
.add_child(td.path());
assert!(err.is_none());
assert!(ig.matched("foo", false).is_whitelist());
}
@@ -851,7 +852,7 @@ mod tests {
// Tests that an .ignore will override a .gitignore.
#[test]
fn ignore_over_gitignore() {
let td = tmpdir("ignore-test-");
let td = tmpdir();
wfile(td.path().join(".gitignore"), "foo");
wfile(td.path().join(".ignore"), "!foo");
@@ -863,7 +864,7 @@ mod tests {
// Tests that exclude has lower precedent than both .ignore and .gitignore.
#[test]
fn exclude_lowest() {
let td = tmpdir("ignore-test-");
let td = tmpdir();
wfile(td.path().join(".gitignore"), "!foo");
wfile(td.path().join(".ignore"), "!bar");
mkdirp(td.path().join(".git/info"));
@@ -878,7 +879,7 @@ mod tests {
#[test]
fn errored() {
let td = tmpdir("ignore-test-");
let td = tmpdir();
wfile(td.path().join(".gitignore"), "{foo");
let (_, err) = IgnoreBuilder::new().build().add_child(td.path());
@@ -887,7 +888,7 @@ mod tests {
#[test]
fn errored_both() {
let td = tmpdir("ignore-test-");
let td = tmpdir();
wfile(td.path().join(".gitignore"), "{foo");
wfile(td.path().join(".ignore"), "{bar");
@@ -897,7 +898,7 @@ mod tests {
#[test]
fn errored_partial() {
let td = tmpdir("ignore-test-");
let td = tmpdir();
mkdirp(td.path().join(".git"));
wfile(td.path().join(".gitignore"), "{foo\nbar");
@@ -908,7 +909,7 @@ mod tests {
#[test]
fn errored_partial_and_ignore() {
let td = tmpdir("ignore-test-");
let td = tmpdir();
wfile(td.path().join(".gitignore"), "{foo\nbar");
wfile(td.path().join(".ignore"), "!bar");
@@ -919,7 +920,7 @@ mod tests {
#[test]
fn not_present_empty() {
let td = tmpdir("ignore-test-");
let td = tmpdir();
let (_, err) = IgnoreBuilder::new().build().add_child(td.path());
assert!(err.is_none());
@@ -929,7 +930,7 @@ mod tests {
fn stops_at_git_dir() {
// This tests that .gitignore files beyond a .git barrier aren't
// matched, but .ignore files are.
let td = tmpdir("ignore-test-");
let td = tmpdir();
mkdirp(td.path().join(".git"));
mkdirp(td.path().join("foo/.git"));
wfile(td.path().join(".gitignore"), "foo");
@@ -950,7 +951,7 @@ mod tests {
#[test]
fn absolute_parent() {
let td = tmpdir("ignore-test-");
let td = tmpdir();
mkdirp(td.path().join(".git"));
mkdirp(td.path().join("foo"));
wfile(td.path().join(".gitignore"), "bar");
@@ -973,7 +974,7 @@ mod tests {
#[test]
fn absolute_parent_anchored() {
let td = tmpdir("ignore-test-");
let td = tmpdir();
mkdirp(td.path().join(".git"));
mkdirp(td.path().join("src/llvm"));
wfile(td.path().join(".gitignore"), "/llvm/\nfoo");

View File

@@ -537,7 +537,7 @@ impl GitignoreBuilder {
///
/// Note that the file path returned may not exist.
fn gitconfig_excludes_path() -> Option<PathBuf> {
// git supports $HOME/.gitconfig and $XDG_CONFIG_DIR/git/config. Notably,
// git supports $HOME/.gitconfig and $XDG_CONFIG_HOME/git/config. Notably,
// both can be active at the same time, where $HOME/.gitconfig takes
// precedent. So if $HOME/.gitconfig defines a `core.excludesFile`, then
// we're done.
@@ -568,7 +568,7 @@ fn gitconfig_home_contents() -> Option<Vec<u8>> {
}
/// Returns the file contents of git's global config file, if one exists, in
/// the user's XDG_CONFIG_DIR directory.
/// the user's XDG_CONFIG_HOME directory.
fn gitconfig_xdg_contents() -> Option<Vec<u8>> {
let path = env::var_os("XDG_CONFIG_HOME")
.and_then(|x| if x.is_empty() { None } else { Some(PathBuf::from(x)) })

View File

@@ -55,8 +55,6 @@ extern crate log;
extern crate memchr;
extern crate regex;
extern crate same_file;
#[cfg(test)]
extern crate tempfile;
extern crate thread_local;
extern crate walkdir;
#[cfg(windows)]
@@ -442,3 +440,66 @@ impl<T> Match<T> {
}
}
}
#[cfg(test)]
mod tests {
use std::env;
use std::error;
use std::fs;
use std::path::{Path, PathBuf};
use std::result;
/// A convenient result type alias.
pub type Result<T> =
result::Result<T, Box<dyn error::Error + Send + Sync>>;
macro_rules! err {
($($tt:tt)*) => {
Box::<dyn error::Error + Send + Sync>::from(format!($($tt)*))
}
}
/// A simple wrapper for creating a temporary directory that is
/// automatically deleted when it's dropped.
///
/// We use this in lieu of tempfile because tempfile brings in too many
/// dependencies.
#[derive(Debug)]
pub struct TempDir(PathBuf);
impl Drop for TempDir {
fn drop(&mut self) {
fs::remove_dir_all(&self.0).unwrap();
}
}
impl TempDir {
/// Create a new empty temporary directory under the system's configured
/// temporary directory.
pub fn new() -> Result<TempDir> {
use std::sync::atomic::{AtomicUsize, Ordering};
static TRIES: usize = 100;
static COUNTER: AtomicUsize = AtomicUsize::new(0);
let tmpdir = env::temp_dir();
for _ in 0..TRIES {
let count = COUNTER.fetch_add(1, Ordering::SeqCst);
let path = tmpdir.join("rust-ignore").join(count.to_string());
if path.is_dir() {
continue;
}
fs::create_dir_all(&path).map_err(|e| {
err!("failed to create {}: {}", path.display(), e)
})?;
return Ok(TempDir(path));
}
Err(err!("failed to create temp dir after {} tries", TRIES))
}
/// Return the underlying path to this temporary directory.
pub fn path(&self) -> &Path {
&self.0
}
}
}

View File

@@ -111,7 +111,7 @@ const DEFAULT_TYPES: &'static [(&'static str, &'static [&'static str])] = &[
("brotli", &["*.br"]),
("buildstream", &["*.bst"]),
("bzip2", &["*.bz2", "*.tbz2"]),
("c", &["*.c", "*.h", "*.H", "*.cats"]),
("c", &["*.[chH]", "*.[chH].in", "*.cats"]),
("cabal", &["*.cabal"]),
("cbor", &["*.cbor"]),
("ceylon", &["*.ceylon"]),
@@ -121,8 +121,8 @@ const DEFAULT_TYPES: &'static [(&'static str, &'static [&'static str])] = &[
("creole", &["*.creole"]),
("config", &["*.cfg", "*.conf", "*.config", "*.ini"]),
("cpp", &[
"*.C", "*.cc", "*.cpp", "*.cxx",
"*.h", "*.H", "*.hh", "*.hpp", "*.hxx", "*.inl",
"*.[ChH]", "*.cc", "*.[ch]pp", "*.[ch]xx", "*.hh", "*.inl",
"*.[ChH].in", "*.cc.in", "*.[ch]pp.in", "*.[ch]xx.in", "*.hh.in",
]),
("crystal", &["Projectfile", "*.cr"]),
("cs", &["*.cs"]),
@@ -135,6 +135,7 @@ const DEFAULT_TYPES: &'static [(&'static str, &'static [&'static str])] = &[
("d", &["*.d"]),
("dhall", &["*.dhall"]),
("docker", &["*Dockerfile*"]),
("edn", &["*.edn"]),
("elisp", &["*.el"]),
("elixir", &["*.ex", "*.eex", "*.exs"]),
("elm", &["*.elm"]),
@@ -146,6 +147,7 @@ const DEFAULT_TYPES: &'static [(&'static str, &'static [&'static str])] = &[
"*.f90", "*.F90", "*.f95", "*.F95",
]),
("fsharp", &["*.fs", "*.fsx", "*.fsi"]),
("gap", &["*.g", "*.gap", "*.gi", "*.gd", "*.tst"]),
("gn", &["*.gn", "*.gni"]),
("go", &["*.go"]),
("gzip", &["*.gz", "*.tgz"]),
@@ -156,7 +158,7 @@ const DEFAULT_TYPES: &'static [(&'static str, &'static [&'static str])] = &[
("hs", &["*.hs", "*.lhs"]),
("html", &["*.htm", "*.html", "*.ejs"]),
("idris", &["*.idr", "*.lidr"]),
("java", &["*.java", "*.jsp"]),
("java", &["*.java", "*.jsp", "*.jspx", "*.properties"]),
("jinja", &["*.j2", "*.jinja", "*.jinja2"]),
("js", &[
"*.js", "*.jsx", "*.vue",
@@ -196,14 +198,16 @@ const DEFAULT_TYPES: &'static [(&'static str, &'static [&'static str])] = &[
"OFL-*[0-9]*",
]),
("lisp", &["*.el", "*.jl", "*.lisp", "*.lsp", "*.sc", "*.scm"]),
("lock", &["*.lock", "package-lock.json"]),
("log", &["*.log"]),
("lua", &["*.lua"]),
("lzma", &["*.lzma"]),
("lz4", &["*.lz4"]),
("m4", &["*.ac", "*.m4"]),
("make", &[
"gnumakefile", "Gnumakefile", "GNUmakefile",
"makefile", "Makefile",
"[Gg][Nn][Uu]makefile", "[Mm]akefile",
"[Gg][Nn][Uu]makefile.am", "[Mm]akefile.am",
"[Gg][Nn][Uu]makefile.in", "[Mm]akefile.in",
"*.mk", "*.mak"
]),
("mako", &["*.mako", "*.mao"]),
@@ -216,7 +220,7 @@ const DEFAULT_TYPES: &'static [(&'static str, &'static [&'static str])] = &[
("msbuild", &[
"*.csproj", "*.fsproj", "*.vcxproj", "*.proj", "*.props", "*.targets"
]),
("nim", &["*.nim"]),
("nim", &["*.nim", "*.nimf", "*.nimble", "*.nims"]),
("nix", &["*.nix"]),
("objc", &["*.h", "*.m"]),
("objcpp", &["*.h", "*.mm"]),
@@ -238,6 +242,7 @@ const DEFAULT_TYPES: &'static [(&'static str, &'static [&'static str])] = &[
("readme", &["README*", "*README"]),
("r", &["*.R", "*.r", "*.Rmd", "*.Rnw"]),
("rdoc", &["*.rdoc"]),
("robot", &["*.robot"]),
("rst", &["*.rst"]),
("ruby", &["Gemfile", "*.gemspec", ".irbrc", "Rakefile", "*.rb"]),
("rust", &["*.rs"]),
@@ -299,7 +304,10 @@ const DEFAULT_TYPES: &'static [(&'static str, &'static [&'static str])] = &[
("vimscript", &["*.vim"]),
("wiki", &["*.mediawiki", "*.wiki"]),
("webidl", &["*.idl", "*.webidl", "*.widl"]),
("xml", &["*.xml", "*.xml.dist"]),
("xml", &[
"*.xml", "*.xml.dist", "*.dtd", "*.xsl", "*.xslt", "*.xsd", "*.xjb",
"*.rng", "*.sch",
]),
("xz", &["*.xz", "*.txz"]),
("yacc", &["*.y"]),
("yaml", &["*.yaml", "*.yml"]),

View File

@@ -4,8 +4,8 @@ use std::fmt;
use std::fs::{self, FileType, Metadata};
use std::io;
use std::path::{Path, PathBuf};
use std::sync::Arc;
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;
use std::vec;
@@ -182,14 +182,14 @@ impl DirEntryInner {
match *self {
Stdin => {
let err = Error::Io(io::Error::new(
io::ErrorKind::Other, "<stdin> has no metadata"));
io::ErrorKind::Other,
"<stdin> has no metadata",
));
Err(err.with_path("<stdin>"))
}
Walkdir(ref x) => {
x.metadata().map_err(|err| {
Error::Io(io::Error::from(err)).with_path(x.path())
})
}
Walkdir(ref x) => x
.metadata()
.map_err(|err| Error::Io(io::Error::from(err)).with_path(x.path())),
Raw(ref x) => x.metadata(),
}
}
@@ -223,8 +223,8 @@ impl DirEntryInner {
#[cfg(unix)]
fn ino(&self) -> Option<u64> {
use walkdir::DirEntryExt;
use self::DirEntryInner::*;
use walkdir::DirEntryExt;
match *self {
Stdin => None,
Walkdir(ref x) => Some(x.ino()),
@@ -297,7 +297,8 @@ impl DirEntryRaw {
fs::metadata(&self.path)
} else {
Ok(self.metadata.clone())
}.map_err(|err| Error::Io(io::Error::from(err)).with_path(&self.path))
}
.map_err(|err| Error::Io(io::Error::from(err)).with_path(&self.path))
}
#[cfg(not(windows))]
@@ -306,7 +307,8 @@ impl DirEntryRaw {
fs::metadata(&self.path)
} else {
fs::symlink_metadata(&self.path)
}.map_err(|err| Error::Io(io::Error::from(err)).with_path(&self.path))
}
.map_err(|err| Error::Io(io::Error::from(err)).with_path(&self.path))
}
fn file_type(&self) -> FileType {
@@ -314,7 +316,9 @@ impl DirEntryRaw {
}
fn file_name(&self) -> &OsStr {
self.path.file_name().unwrap_or_else(|| self.path.as_os_str())
self.path
.file_name()
.unwrap_or_else(|| self.path.as_os_str())
}
fn depth(&self) -> usize {
@@ -326,10 +330,7 @@ impl DirEntryRaw {
self.ino
}
fn from_entry(
depth: usize,
ent: &fs::DirEntry,
) -> Result<DirEntryRaw, Error> {
fn from_entry(depth: usize, ent: &fs::DirEntry) -> Result<DirEntryRaw, Error> {
let ty = ent.file_type().map_err(|err| {
let err = Error::Io(io::Error::from(err)).with_path(ent.path());
Error::WithDepth {
@@ -379,15 +380,22 @@ impl DirEntryRaw {
})
}
#[cfg(not(unix))]
fn from_path(
// Placeholder implementation to allow compiling on non-standard platforms (e.g. wasm32).
#[cfg(not(any(windows, unix)))]
fn from_entry_os(
depth: usize,
pb: PathBuf,
link: bool,
ent: &fs::DirEntry,
ty: fs::FileType,
) -> Result<DirEntryRaw, Error> {
let md = fs::metadata(&pb).map_err(|err| {
Error::Io(err).with_path(&pb)
})?;
Err(Error::Io(io::Error::new(
io::ErrorKind::Other,
"unsupported platform",
)))
}
#[cfg(windows)]
fn from_path(depth: usize, pb: PathBuf, link: bool) -> Result<DirEntryRaw, Error> {
let md = fs::metadata(&pb).map_err(|err| Error::Io(err).with_path(&pb))?;
Ok(DirEntryRaw {
path: pb,
ty: md.file_type(),
@@ -398,16 +406,10 @@ impl DirEntryRaw {
}
#[cfg(unix)]
fn from_path(
depth: usize,
pb: PathBuf,
link: bool,
) -> Result<DirEntryRaw, Error> {
fn from_path(depth: usize, pb: PathBuf, link: bool) -> Result<DirEntryRaw, Error> {
use std::os::unix::fs::MetadataExt;
let md = fs::metadata(&pb).map_err(|err| {
Error::Io(err).with_path(&pb)
})?;
let md = fs::metadata(&pb).map_err(|err| Error::Io(err).with_path(&pb))?;
Ok(DirEntryRaw {
path: pb,
ty: md.file_type(),
@@ -416,6 +418,15 @@ impl DirEntryRaw {
ino: md.ino(),
})
}
// Placeholder implementation to allow compiling on non-standard platforms (e.g. wasm32).
#[cfg(not(any(windows, unix)))]
fn from_path(depth: usize, pb: PathBuf, link: bool) -> Result<DirEntryRaw, Error> {
Err(Error::Io(io::Error::new(
io::ErrorKind::Other,
"unsupported platform",
)))
}
}
/// WalkBuilder builds a recursive directory iterator.
@@ -481,8 +492,8 @@ pub struct WalkBuilder {
#[derive(Clone)]
enum Sorter {
ByName(Arc<Fn(&OsStr, &OsStr) -> cmp::Ordering + Send + Sync + 'static>),
ByPath(Arc<Fn(&Path, &Path) -> cmp::Ordering + Send + Sync + 'static>),
ByName(Arc<dyn Fn(&OsStr, &OsStr) -> cmp::Ordering + Send + Sync + 'static>),
ByPath(Arc<dyn Fn(&Path, &Path) -> cmp::Ordering + Send + Sync + 'static>),
}
impl fmt::Debug for WalkBuilder {
@@ -525,33 +536,34 @@ impl WalkBuilder {
let follow_links = self.follow_links;
let max_depth = self.max_depth;
let sorter = self.sorter.clone();
let its = self.paths.iter().map(move |p| {
if p == Path::new("-") {
(p.to_path_buf(), None)
} else {
let mut wd = WalkDir::new(p);
wd = wd.follow_links(follow_links || p.is_file());
wd = wd.same_file_system(self.same_file_system);
if let Some(max_depth) = max_depth {
wd = wd.max_depth(max_depth);
}
if let Some(ref sorter) = sorter {
match sorter.clone() {
Sorter::ByName(cmp) => {
wd = wd.sort_by(move |a, b| {
cmp(a.file_name(), b.file_name())
});
}
Sorter::ByPath(cmp) => {
wd = wd.sort_by(move |a, b| {
cmp(a.path(), b.path())
});
let its = self
.paths
.iter()
.map(move |p| {
if p == Path::new("-") {
(p.to_path_buf(), None)
} else {
let mut wd = WalkDir::new(p);
wd = wd.follow_links(follow_links || p.is_file());
wd = wd.same_file_system(self.same_file_system);
if let Some(max_depth) = max_depth {
wd = wd.max_depth(max_depth);
}
if let Some(ref sorter) = sorter {
match sorter.clone() {
Sorter::ByName(cmp) => {
wd = wd.sort_by(move |a, b| cmp(a.file_name(), b.file_name()));
}
Sorter::ByPath(cmp) => {
wd = wd.sort_by(move |a, b| cmp(a.path(), b.path()));
}
}
}
(p.to_path_buf(), Some(WalkEventIter::from(wd)))
}
(p.to_path_buf(), Some(WalkEventIter::from(wd)))
}
}).collect::<Vec<_>>().into_iter();
})
.collect::<Vec<_>>()
.into_iter();
let ig_root = self.ig_builder.build();
Walk {
its: its,
@@ -635,8 +647,12 @@ impl WalkBuilder {
let mut errs = PartialErrorBuilder::default();
errs.maybe_push(builder.add(path));
match builder.build() {
Ok(gi) => { self.ig_builder.add_ignore(gi); }
Err(err) => { errs.push(err); }
Ok(gi) => {
self.ig_builder.add_ignore(gi);
}
Err(err) => {
errs.push(err);
}
}
errs.into_error_option()
}
@@ -649,7 +665,7 @@ impl WalkBuilder {
/// later names.
pub fn add_custom_ignore_filename<S: AsRef<OsStr>>(
&mut self,
file_name: S
file_name: S,
) -> &mut WalkBuilder {
self.ig_builder.add_custom_ignore_filename(file_name);
self
@@ -786,11 +802,9 @@ impl WalkBuilder {
/// by `sort_by_file_name`.
///
/// Note that this is not used in the parallel iterator.
pub fn sort_by_file_path<F>(
&mut self,
cmp: F,
) -> &mut WalkBuilder
where F: Fn(&Path, &Path) -> cmp::Ordering + Send + Sync + 'static
pub fn sort_by_file_path<F>(&mut self, cmp: F) -> &mut WalkBuilder
where
F: Fn(&Path, &Path) -> cmp::Ordering + Send + Sync + 'static,
{
self.sorter = Some(Sorter::ByPath(Arc::new(cmp)));
self
@@ -808,7 +822,8 @@ impl WalkBuilder {
///
/// Note that this is not used in the parallel iterator.
pub fn sort_by_file_name<F>(&mut self, cmp: F) -> &mut WalkBuilder
where F: Fn(&OsStr, &OsStr) -> cmp::Ordering + Send + Sync + 'static
where
F: Fn(&OsStr, &OsStr) -> cmp::Ordering + Send + Sync + 'static,
{
self.sorter = Some(Sorter::ByName(Arc::new(cmp)));
self
@@ -989,7 +1004,11 @@ enum WalkEvent {
impl From<WalkDir> for WalkEventIter {
fn from(it: WalkDir) -> WalkEventIter {
WalkEventIter { depth: 0, it: it.into_iter(), next: None }
WalkEventIter {
depth: 0,
it: it.into_iter(),
next: None,
}
}
}
@@ -1072,10 +1091,10 @@ impl WalkParallel {
/// Execute the parallel recursive directory iterator. `mkf` is called
/// for each thread used for iteration. The function produced by `mkf`
/// is then in turn called for each visited file path.
pub fn run<F>(
self,
mut mkf: F,
) where F: FnMut() -> Box<FnMut(Result<DirEntry, Error>) -> WalkState + Send + 'static> {
pub fn run<F>(self, mut mkf: F)
where
F: FnMut() -> Box<dyn FnMut(Result<DirEntry, Error>) -> WalkState + Send + 'static>,
{
let mut f = mkf();
let threads = self.threads();
// TODO: Figure out how to use a bounded channel here. With an
@@ -1092,30 +1111,16 @@ impl WalkParallel {
// Note that we only send directories. For files, we send to them the
// callback directly.
for path in self.paths {
let (dent, root_device) =
if path == Path::new("-") {
(DirEntry::new_stdin(), None)
let (dent, root_device) = if path == Path::new("-") {
(DirEntry::new_stdin(), None)
} else {
let root_device = if !self.same_file_system {
None
} else {
let root_device =
if !self.same_file_system {
None
} else {
match device_num(&path) {
Ok(root_device) => Some(root_device),
Err(err) => {
let err = Error::Io(err).with_path(path);
if f(Err(err)).is_quit() {
return;
}
continue;
}
}
};
match DirEntryRaw::from_path(0, path, false) {
Ok(dent) => {
(DirEntry::new_raw(dent, None), root_device)
}
match device_num(&path) {
Ok(root_device) => Some(root_device),
Err(err) => {
let err = Error::Io(err).with_path(path);
if f(Err(err)).is_quit() {
return;
}
@@ -1123,11 +1128,22 @@ impl WalkParallel {
}
}
};
match DirEntryRaw::from_path(0, path, false) {
Ok(dent) => (DirEntry::new_raw(dent, None), root_device),
Err(err) => {
if f(Err(err)).is_quit() {
return;
}
continue;
}
}
};
tx.send(Message::Work(Work {
dent: dent,
ignore: self.ig_root.clone(),
root_device: root_device,
})).unwrap();
}))
.unwrap();
any_work = true;
}
// ... but there's no need to start workers if we don't need them.
@@ -1253,7 +1269,7 @@ impl Work {
/// Note that a worker is *both* a producer and a consumer.
struct Worker {
/// The caller's callback.
f: Box<FnMut(Result<DirEntry, Error>) -> WalkState + Send + 'static>,
f: Box<dyn FnMut(Result<DirEntry, Error>) -> WalkState + Send + 'static>,
/// The push side of our mpmc queue.
tx: channel::Sender<Message>,
/// The receive side of our mpmc queue.
@@ -1319,22 +1335,21 @@ impl Worker {
continue;
}
};
let descend =
if let Some(root_device) = work.root_device {
match is_same_file_system(root_device, work.dent.path()) {
Ok(true) => true,
Ok(false) => false,
Err(err) => {
if (self.f)(Err(err)).is_quit() {
self.quit_now();
return;
}
false
let descend = if let Some(root_device) = work.root_device {
match is_same_file_system(root_device, work.dent.path()) {
Ok(true) => true,
Ok(false) => false,
Err(err) => {
if (self.f)(Err(err)).is_quit() {
self.quit_now();
return;
}
false
}
} else {
true
};
}
} else {
true
};
let depth = work.dent.depth();
match (self.f)(Ok(work.dent)) {
@@ -1352,12 +1367,7 @@ impl Worker {
continue;
}
for result in readdir {
let state = self.run_one(
&work.ignore,
depth + 1,
work.root_device,
result,
);
let state = self.run_one(&work.ignore, depth + 1, work.root_device, result);
if state.is_quit() {
self.quit_now();
return;
@@ -1422,23 +1432,24 @@ impl Worker {
}
}
let should_skip_path = should_skip_entry(ig, &dent);
let should_skip_filesize =
if self.max_filesize.is_some() && !dent.is_dir() {
skip_filesize(
self.max_filesize.unwrap(),
dent.path(),
&dent.metadata().ok(),
)
} else {
false
};
let should_skip_filesize = if self.max_filesize.is_some() && !dent.is_dir() {
skip_filesize(
self.max_filesize.unwrap(),
dent.path(),
&dent.metadata().ok(),
)
} else {
false
};
if !should_skip_path && !should_skip_filesize {
self.tx.send(Message::Work(Work {
dent: dent,
ignore: ig.clone(),
root_device: root_device,
})).unwrap();
self.tx
.send(Message::Work(Work {
dent: dent,
ignore: ig.clone(),
root_device: root_device,
}))
.unwrap();
}
WalkState::Continue
}
@@ -1568,17 +1579,25 @@ fn check_symlink_loop(
child_depth: usize,
) -> Result<(), Error> {
let hchild = Handle::from_path(child_path).map_err(|err| {
Error::from(err).with_path(child_path).with_depth(child_depth)
Error::from(err)
.with_path(child_path)
.with_depth(child_depth)
})?;
for ig in ig_parent.parents().take_while(|ig| !ig.is_absolute_parent()) {
for ig in ig_parent
.parents()
.take_while(|ig| !ig.is_absolute_parent())
{
let h = Handle::from_path(ig.path()).map_err(|err| {
Error::from(err).with_path(child_path).with_depth(child_depth)
Error::from(err)
.with_path(child_path)
.with_depth(child_depth)
})?;
if hchild == h {
return Err(Error::Loop {
ancestor: ig.path().to_path_buf(),
child: child_path.to_path_buf(),
}.with_depth(child_depth));
}
.with_depth(child_depth));
}
}
Ok(())
@@ -1586,14 +1605,10 @@ fn check_symlink_loop(
// Before calling this function, make sure that you ensure that is really
// necessary as the arguments imply a file stat.
fn skip_filesize(
max_filesize: u64,
path: &Path,
ent: &Option<Metadata>
) -> bool {
fn skip_filesize(max_filesize: u64, path: &Path, ent: &Option<Metadata>) -> bool {
let filesize = match *ent {
Some(ref md) => Some(md.len()),
None => None
None => None,
};
if let Some(fs) = filesize {
@@ -1608,10 +1623,7 @@ fn skip_filesize(
}
}
fn should_skip_entry(
ig: &Ignore,
dent: &DirEntry,
) -> bool {
fn should_skip_entry(ig: &Ignore, dent: &DirEntry) -> bool {
let m = ig.matched_dir_entry(dent);
if m.is_ignore() {
debug!("ignoring {}: {:?}", dent.path().display(), m);
@@ -1673,28 +1685,27 @@ fn path_equals(dent: &DirEntry, handle: &Handle) -> Result<bool, Error> {
/// Returns true if and only if the given path is on the same device as the
/// given root device.
fn is_same_file_system(root_device: u64, path: &Path) -> Result<bool, Error> {
let dent_device = device_num(path)
.map_err(|err| Error::Io(err).with_path(path))?;
let dent_device = device_num(path).map_err(|err| Error::Io(err).with_path(path))?;
Ok(root_device == dent_device)
}
#[cfg(unix)]
fn device_num<P: AsRef<Path>>(path: P)-> io::Result<u64> {
fn device_num<P: AsRef<Path>>(path: P) -> io::Result<u64> {
use std::os::unix::fs::MetadataExt;
path.as_ref().metadata().map(|md| md.dev())
}
#[cfg(windows)]
#[cfg(windows)]
fn device_num<P: AsRef<Path>>(path: P) -> io::Result<u64> {
use winapi_util::{Handle, file};
use winapi_util::{file, Handle};
let h = Handle::from_path_any(path)?;
file::information(h).map(|info| info.volume_serial_number())
}
#[cfg(not(any(unix, windows)))]
fn device_num<P: AsRef<Path>>(_: P)-> io::Result<u64> {
fn device_num<P: AsRef<Path>>(_: P) -> io::Result<u64> {
Err(io::Error::new(
io::ErrorKind::Other,
"walkdir: same_file_system option not supported on this platform",
@@ -1708,9 +1719,8 @@ mod tests {
use std::path::Path;
use std::sync::{Arc, Mutex};
use tempfile::{self, TempDir};
use super::{DirEntry, WalkBuilder, WalkState};
use tests::TempDir;
fn wfile<P: AsRef<Path>>(path: P, contents: &str) {
let mut file = File::create(path).unwrap();
@@ -1757,10 +1767,7 @@ mod tests {
paths
}
fn walk_collect_parallel(
prefix: &Path,
builder: &WalkBuilder,
) -> Vec<String> {
fn walk_collect_parallel(prefix: &Path, builder: &WalkBuilder) -> Vec<String> {
let mut paths = vec![];
for dent in walk_collect_entries_parallel(builder) {
let path = dent.path().strip_prefix(prefix).unwrap();
@@ -1795,15 +1802,11 @@ mod tests {
paths
}
fn tmpdir(prefix: &str) -> TempDir {
tempfile::Builder::new().prefix(prefix).tempdir().unwrap()
fn tmpdir() -> TempDir {
TempDir::new().unwrap()
}
fn assert_paths(
prefix: &Path,
builder: &WalkBuilder,
expected: &[&str],
) {
fn assert_paths(prefix: &Path, builder: &WalkBuilder, expected: &[&str]) {
let got = walk_collect(prefix, builder);
assert_eq!(got, mkpaths(expected), "single threaded");
let got = walk_collect_parallel(prefix, builder);
@@ -1812,20 +1815,22 @@ mod tests {
#[test]
fn no_ignores() {
let td = tmpdir("walk-test-");
let td = tmpdir();
mkdirp(td.path().join("a/b/c"));
mkdirp(td.path().join("x/y"));
wfile(td.path().join("a/b/foo"), "");
wfile(td.path().join("x/y/foo"), "");
assert_paths(td.path(), &WalkBuilder::new(td.path()), &[
"x", "x/y", "x/y/foo", "a", "a/b", "a/b/foo", "a/b/c",
]);
assert_paths(
td.path(),
&WalkBuilder::new(td.path()),
&["x", "x/y", "x/y/foo", "a", "a/b", "a/b/foo", "a/b/c"],
);
}
#[test]
fn custom_ignore() {
let td = tmpdir("walk-test-");
let td = tmpdir();
let custom_ignore = ".customignore";
mkdirp(td.path().join("a"));
wfile(td.path().join(custom_ignore), "foo");
@@ -1841,7 +1846,7 @@ mod tests {
#[test]
fn custom_ignore_exclusive_use() {
let td = tmpdir("walk-test-");
let td = tmpdir();
let custom_ignore = ".customignore";
mkdirp(td.path().join("a"));
wfile(td.path().join(custom_ignore), "foo");
@@ -1861,7 +1866,7 @@ mod tests {
#[test]
fn gitignore() {
let td = tmpdir("walk-test-");
let td = tmpdir();
mkdirp(td.path().join(".git"));
mkdirp(td.path().join("a"));
wfile(td.path().join(".gitignore"), "foo");
@@ -1870,14 +1875,16 @@ mod tests {
wfile(td.path().join("bar"), "");
wfile(td.path().join("a/bar"), "");
assert_paths(td.path(), &WalkBuilder::new(td.path()), &[
"bar", "a", "a/bar",
]);
assert_paths(
td.path(),
&WalkBuilder::new(td.path()),
&["bar", "a", "a/bar"],
);
}
#[test]
fn explicit_ignore() {
let td = tmpdir("walk-test-");
let td = tmpdir();
let igpath = td.path().join(".not-an-ignore");
mkdirp(td.path().join("a"));
wfile(&igpath, "foo");
@@ -1893,7 +1900,7 @@ mod tests {
#[test]
fn explicit_ignore_exclusive_use() {
let td = tmpdir("walk-test-");
let td = tmpdir();
let igpath = td.path().join(".not-an-ignore");
mkdirp(td.path().join("a"));
wfile(&igpath, "foo");
@@ -1905,13 +1912,16 @@ mod tests {
let mut builder = WalkBuilder::new(td.path());
builder.standard_filters(false);
assert!(builder.add_ignore(&igpath).is_none());
assert_paths(td.path(), &builder,
&[".not-an-ignore", "bar", "a", "a/bar"]);
assert_paths(
td.path(),
&builder,
&[".not-an-ignore", "bar", "a", "a/bar"],
);
}
#[test]
fn gitignore_parent() {
let td = tmpdir("walk-test-");
let td = tmpdir();
mkdirp(td.path().join(".git"));
mkdirp(td.path().join("a"));
wfile(td.path().join(".gitignore"), "foo");
@@ -1924,7 +1934,7 @@ mod tests {
#[test]
fn max_depth() {
let td = tmpdir("walk-test-");
let td = tmpdir();
mkdirp(td.path().join("a/b/c"));
wfile(td.path().join("foo"), "");
wfile(td.path().join("a/foo"), "");
@@ -1932,19 +1942,23 @@ mod tests {
wfile(td.path().join("a/b/c/foo"), "");
let mut builder = WalkBuilder::new(td.path());
assert_paths(td.path(), &builder, &[
"a", "a/b", "a/b/c", "foo", "a/foo", "a/b/foo", "a/b/c/foo",
]);
assert_paths(
td.path(),
&builder,
&["a", "a/b", "a/b/c", "foo", "a/foo", "a/b/foo", "a/b/c/foo"],
);
assert_paths(td.path(), builder.max_depth(Some(0)), &[]);
assert_paths(td.path(), builder.max_depth(Some(1)), &["a", "foo"]);
assert_paths(td.path(), builder.max_depth(Some(2)), &[
"a", "a/b", "foo", "a/foo",
]);
assert_paths(
td.path(),
builder.max_depth(Some(2)),
&["a", "a/b", "foo", "a/foo"],
);
}
#[test]
fn max_filesize() {
let td = tmpdir("walk-test-");
let td = tmpdir();
mkdirp(td.path().join("a/b"));
wfile_size(td.path().join("foo"), 0);
wfile_size(td.path().join("bar"), 400);
@@ -1954,41 +1968,49 @@ mod tests {
wfile_size(td.path().join("a/baz"), 200);
let mut builder = WalkBuilder::new(td.path());
assert_paths(td.path(), &builder, &[
"a", "a/b", "foo", "bar", "baz", "a/foo", "a/bar", "a/baz",
]);
assert_paths(td.path(), builder.max_filesize(Some(0)), &[
"a", "a/b", "foo"
]);
assert_paths(td.path(), builder.max_filesize(Some(500)), &[
"a", "a/b", "foo", "bar", "a/bar", "a/baz"
]);
assert_paths(td.path(), builder.max_filesize(Some(50000)), &[
"a", "a/b", "foo", "bar", "baz", "a/foo", "a/bar", "a/baz",
]);
assert_paths(
td.path(),
&builder,
&["a", "a/b", "foo", "bar", "baz", "a/foo", "a/bar", "a/baz"],
);
assert_paths(
td.path(),
builder.max_filesize(Some(0)),
&["a", "a/b", "foo"],
);
assert_paths(
td.path(),
builder.max_filesize(Some(500)),
&["a", "a/b", "foo", "bar", "a/bar", "a/baz"],
);
assert_paths(
td.path(),
builder.max_filesize(Some(50000)),
&["a", "a/b", "foo", "bar", "baz", "a/foo", "a/bar", "a/baz"],
);
}
#[cfg(unix)] // because symlinks on windows are weird
#[test]
fn symlinks() {
let td = tmpdir("walk-test-");
let td = tmpdir();
mkdirp(td.path().join("a/b"));
symlink(td.path().join("a/b"), td.path().join("z"));
wfile(td.path().join("a/b/foo"), "");
let mut builder = WalkBuilder::new(td.path());
assert_paths(td.path(), &builder, &[
"a", "a/b", "a/b/foo", "z",
]);
assert_paths(td.path(), &builder.follow_links(true), &[
"a", "a/b", "a/b/foo", "z", "z/foo",
]);
assert_paths(td.path(), &builder, &["a", "a/b", "a/b/foo", "z"]);
assert_paths(
td.path(),
&builder.follow_links(true),
&["a", "a/b", "a/b/foo", "z", "z/foo"],
);
}
#[cfg(unix)] // because symlinks on windows are weird
#[test]
fn first_path_not_symlink() {
let td = tmpdir("walk-test-");
let td = tmpdir();
mkdirp(td.path().join("foo"));
let dents = WalkBuilder::new(td.path().join("foo"))
@@ -1999,9 +2021,7 @@ mod tests {
assert_eq!(1, dents.len());
assert!(!dents[0].path_is_symlink());
let dents = walk_collect_entries_parallel(
&WalkBuilder::new(td.path().join("foo")),
);
let dents = walk_collect_entries_parallel(&WalkBuilder::new(td.path().join("foo")));
assert_eq!(1, dents.len());
assert!(!dents[0].path_is_symlink());
}
@@ -2009,17 +2029,13 @@ mod tests {
#[cfg(unix)] // because symlinks on windows are weird
#[test]
fn symlink_loop() {
let td = tmpdir("walk-test-");
let td = tmpdir();
mkdirp(td.path().join("a/b"));
symlink(td.path().join("a"), td.path().join("a/b/c"));
let mut builder = WalkBuilder::new(td.path());
assert_paths(td.path(), &builder, &[
"a", "a/b", "a/b/c",
]);
assert_paths(td.path(), &builder.follow_links(true), &[
"a", "a/b",
]);
assert_paths(td.path(), &builder, &["a", "a/b", "a/b/c"]);
assert_paths(td.path(), &builder.follow_links(true), &["a", "a/b"]);
}
// It's a little tricky to test the 'same_file_system' option since
@@ -2039,7 +2055,7 @@ mod tests {
// If our test directory actually isn't a different volume from /sys,
// then this test is meaningless and we shouldn't run it.
let td = tmpdir("walk-test-");
let td = tmpdir();
if device_num(td.path()).unwrap() == device_num("/sys").unwrap() {
return;
}
@@ -2053,8 +2069,6 @@ mod tests {
// completely.
let mut builder = WalkBuilder::new(td.path());
builder.follow_links(true).same_file_system(true);
assert_paths(td.path(), &builder, &[
"same_file", "same_file/alink",
]);
assert_paths(td.path(), &builder, &["same_file", "same_file/alink"]);
}
}

View File

@@ -1,14 +1,14 @@
class RipgrepBin < Formula
version '0.10.0'
version '11.0.2'
desc "Recursively search directories for a regex pattern."
homepage "https://github.com/BurntSushi/ripgrep"
if OS.mac?
url "https://github.com/BurntSushi/ripgrep/releases/download/#{version}/ripgrep-#{version}-x86_64-apple-darwin.tar.gz"
sha256 "32754b4173ac87a7bfffd436d601a49362676eb1841ab33440f2f49c002c8967"
sha256 "0ba26423691deedf2649b12b1abe3d2be294ee1cb17c40b68fe85efe194f4f57"
elsif OS.linux?
url "https://github.com/BurntSushi/ripgrep/releases/download/#{version}/ripgrep-#{version}-x86_64-unknown-linux-musl.tar.gz"
sha256 "c76080aa807a339b44139885d77d15ad60ab8cdd2c2fdaf345d0985625bc0f97"
sha256 "2e7978e346553fbc45c0940d9fa11e12f9afbae8213b261aad19b698150e169a"
end
conflicts_with "ripgrep"

View File

@@ -27,6 +27,9 @@ configuration file. The file can specify one shell argument per line. Lines
starting with '#' are ignored. For more details, see the man page or the
README.
Tip: to disable all smart filtering and make ripgrep behave a bit more like
classical grep, use 'rg -uuu'.
Project home page: https://github.com/BurntSushi/ripgrep
Use -h for short descriptions and --help for more details.";
@@ -544,7 +547,9 @@ pub fn all_args_and_flags() -> Vec<RGArg> {
// flags are hidden and merely mentioned in the docs of the corresponding
// "positive" flag.
flag_after_context(&mut args);
flag_auto_hybrid_regex(&mut args);
flag_before_context(&mut args);
flag_binary(&mut args);
flag_block_buffered(&mut args);
flag_byte_offset(&mut args);
flag_case_sensitive(&mut args);
@@ -566,6 +571,7 @@ pub fn all_args_and_flags() -> Vec<RGArg> {
flag_fixed_strings(&mut args);
flag_follow(&mut args);
flag_glob(&mut args);
flag_glob_case_insensitive(&mut args);
flag_heading(&mut args);
flag_hidden(&mut args);
flag_iglob(&mut args);
@@ -578,6 +584,7 @@ pub fn all_args_and_flags() -> Vec<RGArg> {
flag_line_number(&mut args);
flag_line_regexp(&mut args);
flag_max_columns(&mut args);
flag_max_columns_preview(&mut args);
flag_max_count(&mut args);
flag_max_depth(&mut args);
flag_max_filesize(&mut args);
@@ -600,6 +607,7 @@ pub fn all_args_and_flags() -> Vec<RGArg> {
flag_path_separator(&mut args);
flag_passthru(&mut args);
flag_pcre2(&mut args);
flag_pcre2_version(&mut args);
flag_pre(&mut args);
flag_pre_glob(&mut args);
flag_pretty(&mut args);
@@ -646,7 +654,7 @@ will be provided. Namely, the following is equivalent to the above:
let arg = RGArg::positional("pattern", "PATTERN")
.help(SHORT).long_help(LONG)
.required_unless(&[
"file", "files", "regexp", "type-list",
"file", "files", "regexp", "type-list", "pcre2-version",
]);
args.push(arg);
}
@@ -677,6 +685,50 @@ This overrides the --context flag.
args.push(arg);
}
fn flag_auto_hybrid_regex(args: &mut Vec<RGArg>) {
const SHORT: &str = "Dynamically use PCRE2 if necessary.";
const LONG: &str = long!("\
When this flag is used, ripgrep will dynamically choose between supported regex
engines depending on the features used in a pattern. When ripgrep chooses a
regex engine, it applies that choice for every regex provided to ripgrep (e.g.,
via multiple -e/--regexp or -f/--file flags).
As an example of how this flag might behave, ripgrep will attempt to use
its default finite automata based regex engine whenever the pattern can be
successfully compiled with that regex engine. If PCRE2 is enabled and if the
pattern given could not be compiled with the default regex engine, then PCRE2
will be automatically used for searching. If PCRE2 isn't available, then this
flag has no effect because there is only one regex engine to choose from.
In the future, ripgrep may adjust its heuristics for how it decides which
regex engine to use. In general, the heuristics will be limited to a static
analysis of the patterns, and not to any specific runtime behavior observed
while searching files.
The primary downside of using this flag is that it may not always be obvious
which regex engine ripgrep uses, and thus, the match semantics or performance
profile of ripgrep may subtly and unexpectedly change. However, in many cases,
all regex engines will agree on what constitutes a match and it can be nice
to transparently support more advanced regex features like look-around and
backreferences without explicitly needing to enable them.
This flag can be disabled with --no-auto-hybrid-regex.
");
let arg = RGArg::switch("auto-hybrid-regex")
.help(SHORT).long_help(LONG)
.overrides("no-auto-hybrid-regex")
.overrides("pcre2")
.overrides("no-pcre2");
args.push(arg);
let arg = RGArg::switch("no-auto-hybrid-regex")
.hidden()
.overrides("auto-hybrid-regex")
.overrides("pcre2")
.overrides("no-pcre2");
args.push(arg);
}
fn flag_before_context(args: &mut Vec<RGArg>) {
const SHORT: &str = "Show NUM lines before each match.";
const LONG: &str = long!("\
@@ -691,6 +743,55 @@ This overrides the --context flag.
args.push(arg);
}
fn flag_binary(args: &mut Vec<RGArg>) {
const SHORT: &str = "Search binary files.";
const LONG: &str = long!("\
Enabling this flag will cause ripgrep to search binary files. By default,
ripgrep attempts to automatically skip binary files in order to improve the
relevance of results and make the search faster.
Binary files are heuristically detected based on whether they contain a NUL
byte or not. By default (without this flag set), once a NUL byte is seen,
ripgrep will stop searching the file. Usually, NUL bytes occur in the beginning
of most binary files. If a NUL byte occurs after a match, then ripgrep will
still stop searching the rest of the file, but a warning will be printed.
In contrast, when this flag is provided, ripgrep will continue searching a file
even if a NUL byte is found. In particular, if a NUL byte is found then ripgrep
will continue searching until either a match is found or the end of the file is
reached, whichever comes sooner. If a match is found, then ripgrep will stop
and print a warning saying that the search stopped prematurely.
If you want ripgrep to search a file without any special NUL byte handling at
all (and potentially print binary data to stdout), then you should use the
'-a/--text' flag.
The '--binary' flag is a flag for controlling ripgrep's automatic filtering
mechanism. As such, it does not need to be used when searching a file
explicitly or when searching stdin. That is, it is only applicable when
recursively searching a directory.
Note that when the '-u/--unrestricted' flag is provided for a third time, then
this flag is automatically enabled.
This flag can be disabled with '--no-binary'. It overrides the '-a/--text'
flag.
");
let arg = RGArg::switch("binary")
.help(SHORT).long_help(LONG)
.overrides("no-binary")
.overrides("text")
.overrides("no-text");
args.push(arg);
let arg = RGArg::switch("no-binary")
.hidden()
.overrides("binary")
.overrides("text")
.overrides("no-text");
args.push(arg);
}
fn flag_block_buffered(args: &mut Vec<RGArg>) {
const SHORT: &str = "Force block buffering.";
const LONG: &str = long!("\
@@ -1118,6 +1219,25 @@ it.
args.push(arg);
}
fn flag_glob_case_insensitive(args: &mut Vec<RGArg>) {
const SHORT: &str = "Process all glob patterns case insensitively.";
const LONG: &str = long!("\
Process glob patterns given with the -g/--glob flag case insensitively. This
effectively treats --glob as --iglob.
This flag can be disabled with the --no-glob-case-insensitive flag.
");
let arg = RGArg::switch("glob-case-insensitive")
.help(SHORT).long_help(LONG)
.overrides("no-glob-case-insensitive");
args.push(arg);
let arg = RGArg::switch("no-glob-case-insensitive")
.hidden()
.overrides("glob-case-insensitive");
args.push(arg);
}
fn flag_heading(args: &mut Vec<RGArg>) {
const SHORT: &str = "Print matches grouped by each file.";
const LONG: &str = long!("\
@@ -1390,6 +1510,30 @@ When this flag is omitted or is set to 0, then it has no effect.
args.push(arg);
}
fn flag_max_columns_preview(args: &mut Vec<RGArg>) {
const SHORT: &str = "Print a preview for lines exceeding the limit.";
const LONG: &str = long!("\
When the '--max-columns' flag is used, ripgrep will by default completely
replace any line that is too long with a message indicating that a matching
line was removed. When this flag is combined with '--max-columns', a preview
of the line (corresponding to the limit size) is shown instead, where the part
of the line exceeding the limit is not shown.
If the '--max-columns' flag is not set, then this has no effect.
This flag can be disabled with '--no-max-columns-preview'.
");
let arg = RGArg::switch("max-columns-preview")
.help(SHORT).long_help(LONG)
.overrides("no-max-columns-preview");
args.push(arg);
let arg = RGArg::switch("no-max-columns-preview")
.hidden()
.overrides("max-columns-preview");
args.push(arg);
}
fn flag_max_count(args: &mut Vec<RGArg>) {
const SHORT: &str = "Limit the number of matches.";
const LONG: &str = long!("\
@@ -1851,7 +1995,12 @@ or backreferences.
Note that PCRE2 is an optional ripgrep feature. If PCRE2 wasn't included in
your build of ripgrep, then using this flag will result in ripgrep printing
an error message and exiting.
an error message and exiting. PCRE2 may also have worse user experience in
some cases, since it has fewer introspection APIs than ripgrep's default regex
engine. For example, if you use a '\n' in a PCRE2 regex without the
'-U/--multiline' flag, then ripgrep will silently fail to match anything
instead of reporting an error immediately (like it does with the default
regex engine).
Related flags: --no-pcre2-unicode
@@ -1859,12 +2008,28 @@ This flag can be disabled with --no-pcre2.
");
let arg = RGArg::switch("pcre2").short("P")
.help(SHORT).long_help(LONG)
.overrides("no-pcre2");
.overrides("no-pcre2")
.overrides("auto-hybrid-regex")
.overrides("no-auto-hybrid-regex");
args.push(arg);
let arg = RGArg::switch("no-pcre2")
.hidden()
.overrides("pcre2");
.overrides("pcre2")
.overrides("auto-hybrid-regex")
.overrides("no-auto-hybrid-regex");
args.push(arg);
}
fn flag_pcre2_version(args: &mut Vec<RGArg>) {
const SHORT: &str = "Print the version of PCRE2 that ripgrep uses.";
const LONG: &str = long!("\
When this flag is present, ripgrep will print the version of PCRE2 in use,
along with other information, and then exit. If PCRE2 is not available, then
ripgrep will print an error message and exit with an error code.
");
let arg = RGArg::switch("pcre2-version")
.help(SHORT).long_help(LONG);
args.push(arg);
}
@@ -1874,12 +2039,13 @@ fn flag_pre(args: &mut Vec<RGArg>) {
For each input FILE, search the standard output of COMMAND FILE rather than the
contents of FILE. This option expects the COMMAND program to either be an
absolute path or to be available in your PATH. Either an empty string COMMAND
or the `--no-pre` flag will disable this behavior.
or the '--no-pre' flag will disable this behavior.
WARNING: When this flag is set, ripgrep will unconditionally spawn a
process for every file that is searched. Therefore, this can incur an
unnecessarily large performance penalty if you don't otherwise need the
flexibility offered by this flag.
flexibility offered by this flag. One possible mitigation to this is to use
the '--pre-glob' flag to limit which files a preprocessor is run with.
A preprocessor is not run when ripgrep is searching stdin.
@@ -2030,7 +2196,10 @@ Replace every match with the text given when printing results. Neither this
flag nor any other ripgrep flag will modify your files.
Capture group indices (e.g., $5) and names (e.g., $foo) are supported in the
replacement string.
replacement string. In shells such as Bash and zsh, you should wrap the
pattern in single quotes instead of double quotes. Otherwise, capture group
indices will be replaced by expanded shell variables which will most likely
be empty.
Note that the replacement by default replaces each match, and NOT the entire
line. To replace the entire line, you should match the entire line.
@@ -2208,20 +2377,23 @@ escape codes to be printed that alter the behavior of your terminal.
When binary file detection is enabled it is imperfect. In general, it uses
a simple heuristic. If a NUL byte is seen during search, then the file is
considered binary and search stops (unless this flag is present).
Alternatively, if the '--binary' flag is used, then ripgrep will only quit
when it sees a NUL byte after it sees a match (or searches the entire file).
Note that when the `-u/--unrestricted` flag is provided for a third time, then
this flag is automatically enabled.
This flag can be disabled with --no-text.
This flag can be disabled with '--no-text'. It overrides the '--binary' flag.
");
let arg = RGArg::switch("text").short("a")
.help(SHORT).long_help(LONG)
.overrides("no-text");
.overrides("no-text")
.overrides("binary")
.overrides("no-binary");
args.push(arg);
let arg = RGArg::switch("no-text")
.hidden()
.overrides("text");
.overrides("text")
.overrides("binary")
.overrides("no-binary");
args.push(arg);
}
@@ -2350,8 +2522,7 @@ Reduce the level of \"smart\" searching. A single -u won't respect .gitignore
(etc.) files. Two -u flags will additionally search hidden files and
directories. Three -u flags will additionally search binary files.
-uu is roughly equivalent to grep -r and -uuu is roughly equivalent to grep -a
-r.
'rg -uuu' is roughly equivalent to 'grep -r'.
");
let arg = RGArg::switch("unrestricted").short("u")
.help(SHORT).long_help(LONG)
@@ -2393,7 +2564,7 @@ ripgrep is explicitly instructed to search one file or stdin.
This flag overrides --with-filename.
");
let arg = RGArg::switch("no-filename")
let arg = RGArg::switch("no-filename").short("I")
.help(NO_SHORT).long_help(NO_LONG)
.overrides("with-filename");
args.push(arg);

View File

@@ -73,6 +73,8 @@ pub enum Command {
/// List all file type definitions configured, including the default file
/// types and any additional file types added to the command line.
Types,
/// Print the version of PCRE2 in use.
PCRE2Version,
}
impl Command {
@@ -82,7 +84,11 @@ impl Command {
match *self {
Search | SearchParallel => true,
SearchNever | Files | FilesParallel | Types => false,
| SearchNever
| Files
| FilesParallel
| Types
| PCRE2Version => false,
}
}
}
@@ -235,7 +241,9 @@ impl Args {
let threads = self.matches().threads()?;
let one_thread = is_one_search || threads == 1;
Ok(if self.matches().is_present("type-list") {
Ok(if self.matches().is_present("pcre2-version") {
Command::PCRE2Version
} else if self.matches().is_present("type-list") {
Command::Types
} else if self.matches().is_present("files") {
if one_thread {
@@ -286,15 +294,18 @@ impl Args {
&self,
wtr: W,
) -> Result<SearchWorker<W>> {
let matches = self.matches();
let matcher = self.matcher().clone();
let printer = self.printer(wtr)?;
let searcher = self.matches().searcher(self.paths())?;
let searcher = matches.searcher(self.paths())?;
let mut builder = SearchWorkerBuilder::new();
builder
.json_stats(self.matches().is_present("json"))
.preprocessor(self.matches().preprocessor())
.preprocessor_globs(self.matches().preprocessor_globs()?)
.search_zip(self.matches().is_present("search-zip"));
.json_stats(matches.is_present("json"))
.preprocessor(matches.preprocessor())
.preprocessor_globs(matches.preprocessor_globs()?)
.search_zip(matches.is_present("search-zip"))
.binary_detection_implicit(matches.binary_detection_implicit())
.binary_detection_explicit(matches.binary_detection_explicit());
Ok(builder.build(matcher, searcher, printer))
}
@@ -588,6 +599,25 @@ impl ArgMatches {
if self.is_present("pcre2") {
let matcher = self.matcher_pcre2(patterns)?;
Ok(PatternMatcher::PCRE2(matcher))
} else if self.is_present("auto-hybrid-regex") {
let rust_err = match self.matcher_rust(patterns) {
Ok(matcher) => return Ok(PatternMatcher::RustRegex(matcher)),
Err(err) => err,
};
log::debug!(
"error building Rust regex in hybrid mode:\n{}", rust_err,
);
let pcre_err = match self.matcher_pcre2(patterns) {
Ok(matcher) => return Ok(PatternMatcher::PCRE2(matcher)),
Err(err) => err,
};
Err(From::from(format!(
"regex could not be compiled with either the default regex \
engine or with PCRE2.\n\n\
default regex engine error:\n{}\n{}\n{}\n\n\
PCRE2 regex engine error:\n{}",
"~".repeat(79), rust_err, "~".repeat(79), pcre_err,
)))
} else {
let matcher = match self.matcher_rust(patterns) {
Ok(matcher) => matcher,
@@ -656,7 +686,13 @@ impl ArgMatches {
if let Some(limit) = self.dfa_size_limit()? {
builder.dfa_size_limit(limit);
}
match builder.build(&patterns.join("|")) {
let res =
if self.is_present("fixed-strings") {
builder.build_literals(patterns)
} else {
builder.build(&patterns.join("|"))
};
match res {
Ok(m) => Ok(m),
Err(err) => Err(From::from(suggest_multiline(err.to_string()))),
}
@@ -676,8 +712,13 @@ impl ArgMatches {
.word(self.is_present("word-regexp"));
// For whatever reason, the JIT craps out during regex compilation with
// a "no more memory" error on 32 bit systems. So don't use it there.
if !cfg!(target_pointer_width = "32") {
builder.jit_if_available(true);
if cfg!(target_pointer_width = "64") {
builder
.jit_if_available(true)
// The PCRE2 docs say that 32KB is the default, and that 1MB
// should be big enough for anything. But let's crank it to
// 10MB.
.max_jit_stack_size(Some(10 * (1<<20)));
}
if self.pcre2_unicode() {
builder.utf(true).ucp(true);
@@ -737,6 +778,7 @@ impl ArgMatches {
.per_match(self.is_present("vimgrep"))
.replacement(self.replacement())
.max_columns(self.max_columns()?)
.max_columns_preview(self.max_columns_preview())
.max_matches(self.max_count()?)
.column(self.column())
.byte_offset(self.is_present("byte-offset"))
@@ -796,8 +838,7 @@ impl ArgMatches {
.before_context(ctx_before)
.after_context(ctx_after)
.passthru(self.is_present("passthru"))
.memory_map(self.mmap_choice(paths))
.binary_detection(self.binary_detection());
.memory_map(self.mmap_choice(paths));
match self.encoding()? {
EncodingMode::Some(enc) => {
builder.encoding(Some(enc));
@@ -856,16 +897,39 @@ impl ArgMatches {
///
/// Methods are sorted alphabetically.
impl ArgMatches {
/// Returns the form of binary detection to perform.
fn binary_detection(&self) -> BinaryDetection {
/// Returns the form of binary detection to perform on files that are
/// implicitly searched via recursive directory traversal.
fn binary_detection_implicit(&self) -> BinaryDetection {
let none =
self.is_present("text")
|| self.is_present("null-data");
let convert =
self.is_present("binary")
|| self.unrestricted_count() >= 3;
if none {
BinaryDetection::none()
} else if convert {
BinaryDetection::convert(b'\x00')
} else {
BinaryDetection::quit(b'\x00')
}
}
/// Returns the form of binary detection to perform on files that are
/// explicitly searched via the user invoking ripgrep on a particular
/// file or files or stdin.
///
/// In general, this should never be BinaryDetection::quit, since that acts
/// as a filter (but quitting immediately once a NUL byte is seen), and we
/// should never filter out files that the user wants to explicitly search.
fn binary_detection_explicit(&self) -> BinaryDetection {
let none =
self.is_present("text")
|| self.unrestricted_count() >= 3
|| self.is_present("null-data");
if none {
BinaryDetection::none()
} else {
BinaryDetection::quit(b'\x00')
BinaryDetection::convert(b'\x00')
}
}
@@ -1111,6 +1175,12 @@ impl ArgMatches {
Ok(self.usize_of_nonzero("max-columns")?.map(|n| n as u64))
}
/// Returns true if and only if a preview should be shown for lines that
/// exceed the maximum column limit.
fn max_columns_preview(&self) -> bool {
self.is_present("max-columns-preview")
}
/// The maximum number of matches permitted.
fn max_count(&self) -> Result<Option<u64>> {
Ok(self.usize_of("max-count")?.map(|n| n as u64))
@@ -1198,6 +1268,10 @@ impl ArgMatches {
/// Builds the set of glob overrides from the command line flags.
fn overrides(&self) -> Result<Override> {
let mut builder = OverrideBuilder::new(env::current_dir()?);
// Make all globs case insensitive with --glob-case-insensitive.
if self.is_present("glob-case-insensitive") {
builder.case_insensitive(true).unwrap();
}
for glob in self.values_of_lossy_vec("glob") {
builder.add(&glob)?;
}
@@ -1240,7 +1314,8 @@ impl ArgMatches {
!cli::is_readable_stdin()
|| (self.is_present("file") && file_is_stdin)
|| self.is_present("files")
|| self.is_present("type-list");
|| self.is_present("type-list")
|| self.is_present("pcre2-version");
if search_cwd {
Path::new("./").to_path_buf()
} else {
@@ -1519,10 +1594,11 @@ impl ArgMatches {
if self.is_present("no-filename") {
false
} else {
let path_stdin = Path::new("-");
self.is_present("with-filename")
|| self.is_present("vimgrep")
|| paths.len() > 1
|| paths.get(0).map_or(false, |p| p.is_dir())
|| paths.get(0).map_or(false, |p| p != path_stdin && p.is_dir())
}
}
}
@@ -1699,12 +1775,12 @@ where I: IntoIterator<Item=T>,
if err.use_stderr() {
return Err(err.into());
}
// Explicitly ignore any error returned by writeln!. The most likely error
// Explicitly ignore any error returned by write!. The most likely error
// at this point is a broken pipe error, in which case, we want to ignore
// it and exit quietly.
//
// (This is the point of this helper function. clap's functionality for
// doing this will panic on a broken pipe error.)
let _ = writeln!(io::stdout(), "{}", err);
let _ = write!(io::stdout(), "{}", err);
process::exit(0);
}

View File

@@ -9,7 +9,7 @@ use std::io;
use std::ffi::OsString;
use std::path::{Path, PathBuf};
use bstr::io::BufReadExt;
use bstr::{io::BufReadExt, ByteSlice};
use log;
use crate::Result;
@@ -55,7 +55,7 @@ pub fn args() -> Vec<OsString> {
/// for each line in addition to successfully parsed arguments.
fn parse<P: AsRef<Path>>(
path: P,
) -> Result<(Vec<OsString>, Vec<Box<Error>>)> {
) -> Result<(Vec<OsString>, Vec<Box<dyn Error>>)> {
let path = path.as_ref();
match File::open(&path) {
Ok(file) => parse_reader(file),
@@ -76,7 +76,7 @@ fn parse<P: AsRef<Path>>(
/// in addition to successfully parsed arguments.
fn parse_reader<R: io::Read>(
rdr: R,
) -> Result<(Vec<OsString>, Vec<Box<Error>>)> {
) -> Result<(Vec<OsString>, Vec<Box<dyn Error>>)> {
let bufrdr = io::BufReader::new(rdr);
let (mut args, mut errs) = (vec![], vec![]);
let mut line_number = 0;

View File

@@ -1,3 +1,4 @@
use std::error;
use std::io::{self, Write};
use std::process;
use std::sync::{Arc, Mutex};
@@ -19,7 +20,30 @@ mod path_printer;
mod search;
mod subject;
type Result<T> = ::std::result::Result<T, Box<::std::error::Error>>;
// Since Rust no longer uses jemalloc by default, ripgrep will, by default,
// use the system allocator. On Linux, this would normally be glibc's
// allocator, which is pretty good. In particular, ripgrep does not have a
// particularly allocation heavy workload, so there really isn't much
// difference (for ripgrep's purposes) between glibc's allocator and jemalloc.
//
// However, when ripgrep is built with musl, this means ripgrep will use musl's
// allocator, which appears to be substantially worse. (musl's goal is not to
// have the fastest version of everything. Its goal is to be small and amenable
// to static compilation.) Even though ripgrep isn't particularly allocation
// heavy, musl's allocator appears to slow down ripgrep quite a bit. Therefore,
// when building with musl, we use jemalloc.
//
// We don't unconditionally use jemalloc because it can be nice to use the
// system's default allocator by default. Moreover, jemalloc seems to increase
// compilation times by a bit.
//
// Moreover, we only do this on 64-bit systems since jemalloc doesn't support
// i686.
#[cfg(all(target_env = "musl", target_pointer_width = "64"))]
#[global_allocator]
static ALLOC: jemallocator::Jemalloc = jemallocator::Jemalloc;
type Result<T> = ::std::result::Result<T, Box<dyn error::Error>>;
fn main() {
if let Err(err) = Args::parse().and_then(try_main) {
@@ -39,6 +63,7 @@ fn try_main(args: Args) -> Result<()> {
Files => files(&args),
FilesParallel => files_parallel(&args),
Types => types(&args),
PCRE2Version => pcre2_version(&args),
}?;
if matched && (args.quiet() || !messages::errored()) {
process::exit(0)
@@ -275,3 +300,30 @@ fn types(args: &Args) -> Result<bool> {
}
Ok(count > 0)
}
/// The top-level entry point for --pcre2-version.
fn pcre2_version(args: &Args) -> Result<bool> {
#[cfg(feature = "pcre2")]
fn imp(args: &Args) -> Result<bool> {
use grep::pcre2;
let mut stdout = args.stdout();
let (major, minor) = pcre2::version();
writeln!(stdout, "PCRE2 {}.{} is available", major, minor)?;
if cfg!(target_pointer_width = "64") && pcre2::is_jit_available() {
writeln!(stdout, "JIT is available")?;
}
Ok(true)
}
#[cfg(not(feature = "pcre2"))]
fn imp(args: &Args) -> Result<bool> {
let mut stdout = args.stdout();
writeln!(stdout, "PCRE2 is not available in this build of ripgrep.")?;
Ok(false)
}
imp(args)
}

View File

@@ -10,7 +10,7 @@ use grep::matcher::Matcher;
use grep::pcre2::{RegexMatcher as PCRE2RegexMatcher};
use grep::printer::{JSON, Standard, Summary, Stats};
use grep::regex::{RegexMatcher as RustRegexMatcher};
use grep::searcher::Searcher;
use grep::searcher::{BinaryDetection, Searcher};
use ignore::overrides::Override;
use serde_json as json;
use serde_json::json;
@@ -27,6 +27,8 @@ struct Config {
preprocessor: Option<PathBuf>,
preprocessor_globs: Override,
search_zip: bool,
binary_implicit: BinaryDetection,
binary_explicit: BinaryDetection,
}
impl Default for Config {
@@ -36,6 +38,8 @@ impl Default for Config {
preprocessor: None,
preprocessor_globs: Override::empty(),
search_zip: false,
binary_implicit: BinaryDetection::none(),
binary_explicit: BinaryDetection::none(),
}
}
}
@@ -134,6 +138,37 @@ impl SearchWorkerBuilder {
self.config.search_zip = yes;
self
}
/// Set the binary detection that should be used when searching files
/// found via a recursive directory search.
///
/// Generally, this binary detection may be `BinaryDetection::quit` if
/// we want to skip binary files completely.
///
/// By default, no binary detection is performed.
pub fn binary_detection_implicit(
&mut self,
detection: BinaryDetection,
) -> &mut SearchWorkerBuilder {
self.config.binary_implicit = detection;
self
}
/// Set the binary detection that should be used when searching files
/// explicitly supplied by an end user.
///
/// Generally, this binary detection should NOT be `BinaryDetection::quit`,
/// since we never want to automatically filter files supplied by the end
/// user.
///
/// By default, no binary detection is performed.
pub fn binary_detection_explicit(
&mut self,
detection: BinaryDetection,
) -> &mut SearchWorkerBuilder {
self.config.binary_explicit = detection;
self
}
}
/// The result of executing a search.
@@ -280,7 +315,24 @@ pub struct SearchWorker<W> {
impl<W: WriteColor> SearchWorker<W> {
/// Execute a search over the given subject.
pub fn search(&mut self, subject: &Subject) -> io::Result<SearchResult> {
self.search_impl(subject)
let bin =
if subject.is_explicit() {
self.config.binary_explicit.clone()
} else {
self.config.binary_implicit.clone()
};
self.searcher.set_binary_detection(bin);
let path = subject.path();
if subject.is_stdin() {
self.search_reader(path, io::stdin().lock())
} else if self.should_preprocess(path) {
self.search_preprocessor(path)
} else if self.should_decompress(path) {
self.search_decompress(path)
} else {
self.search_path(path)
}
}
/// Return a mutable reference to the underlying printer.
@@ -306,22 +358,6 @@ impl<W: WriteColor> SearchWorker<W> {
}
}
/// Search the given subject using the appropriate strategy.
fn search_impl(&mut self, subject: &Subject) -> io::Result<SearchResult> {
let path = subject.path();
if subject.is_stdin() {
let stdin = io::stdin();
// A `return` here appeases the borrow checker. NLL will fix this.
return self.search_reader(path, stdin.lock());
} else if self.should_preprocess(path) {
self.search_preprocessor(path)
} else if self.should_decompress(path) {
self.search_decompress(path)
} else {
self.search_path(path)
}
}
/// Returns true if and only if the given file path should be
/// decompressed before searching.
fn should_decompress(&self, path: &Path) -> bool {
@@ -349,11 +385,23 @@ impl<W: WriteColor> SearchWorker<W> {
&mut self,
path: &Path,
) -> io::Result<SearchResult> {
let bin = self.config.preprocessor.clone().unwrap();
let mut cmd = Command::new(&bin);
let bin = self.config.preprocessor.as_ref().unwrap();
let mut cmd = Command::new(bin);
cmd.arg(path).stdin(Stdio::from(File::open(path)?));
let rdr = self.command_builder.build(&mut cmd)?;
let rdr = self
.command_builder
.build(&mut cmd)
.map_err(|err| {
io::Error::new(
io::ErrorKind::Other,
format!(
"preprocessor command could not start: '{:?}': {}",
cmd,
err,
),
)
})?;
self.search_reader(path, rdr).map_err(|err| {
io::Error::new(
io::ErrorKind::Other,

View File

@@ -59,17 +59,12 @@ impl SubjectBuilder {
if let Some(ignore_err) = subj.dent.error() {
ignore_message!("{}", ignore_err);
}
// If this entry represents stdin, then we always search it.
if subj.dent.is_stdin() {
// If this entry was explicitly provided by an end user, then we always
// want to search it.
if subj.is_explicit() {
return Some(subj);
}
// If this subject has a depth of 0, then it was provided explicitly
// by an end user (or via a shell glob). In this case, we always want
// to search it if it even smells like a file (e.g., a symlink).
if subj.dent.depth() == 0 && !subj.is_dir() {
return Some(subj);
}
// At this point, we only want to search something it's explicitly a
// At this point, we only want to search something if it's explicitly a
// file. This omits symlinks. (If ripgrep was configured to follow
// symlinks, then they have already been followed by the directory
// traversal.)
@@ -127,6 +122,26 @@ impl Subject {
self.dent.is_stdin()
}
/// Returns true if and only if this entry corresponds to a subject to
/// search that was explicitly supplied by an end user.
///
/// Generally, this corresponds to either stdin or an explicit file path
/// argument. e.g., in `rg foo some-file ./some-dir/`, `some-file` is
/// an explicit subject, but, e.g., `./some-dir/some-other-file` is not.
///
/// However, note that ripgrep does not see through shell globbing. e.g.,
/// in `rg foo ./some-dir/*`, `./some-dir/some-other-file` will be treated
/// as an explicit subject.
pub fn is_explicit(&self) -> bool {
// stdin is obvious. When an entry has a depth of 0, that means it
// was explicitly provided to our directory iterator, which means it
// was in turn explicitly provided by the end user. The !is_dir check
// means that we want to search files even if their symlinks, again,
// because they were explicitly provided. (And we never want to try
// to search a directory.)
self.is_stdin() || (self.dent.depth() == 0 && !self.is_dir())
}
/// Returns true if and only if this subject points to a directory after
/// following symbolic links.
fn is_dir(&self) -> bool {

View File

@@ -1,2 +0,0 @@
termcolor has moved to its own repository:
https://github.com/BurntSushi/termcolor

315
tests/binary.rs Normal file
View File

@@ -0,0 +1,315 @@
use crate::util::{Dir, TestCommand};
// This file contains a smattering of tests specifically for checking ripgrep's
// handling of binary files. There's quite a bit of discussion on this in this
// bug report: https://github.com/BurntSushi/ripgrep/issues/306
// Our haystack is the first 500 lines of Gutenberg's copy of "A Study in
// Scarlet," with a NUL byte at line 237: `abcdef\x00`.
//
// The position and size of the haystack is, unfortunately, significant. In
// particular, the NUL byte is specifically inserted at some point *after* the
// first 8192 bytes, which corresponds to the initial capacity of the buffer
// that ripgrep uses to read files. (grep for DEFAULT_BUFFER_CAPACITY.) The
// position of the NUL byte ensures that we can execute some search on the
// initial buffer contents without ever detecting any binary data. Moreover,
// when using a memory map for searching, only the first 8192 bytes are
// scanned for a NUL byte, so no binary bytes are detected at all when using
// a memory map (unless our query matches line 237).
//
// One last note: in the tests below, we use --no-mmap heavily because binary
// detection with memory maps is a bit different. Namely, NUL bytes are only
// searched for in the first few KB of the file and in a match. Normally, NUL
// bytes are searched for everywhere.
//
// TODO: Add tests for binary file detection when using memory maps.
const HAY: &'static [u8] = include_bytes!("./data/sherlock-nul.txt");
// This tests that ripgrep prints a warning message if it finds and prints a
// match in a binary file before detecting that it is a binary file. The point
// here is to notify that user that the search of the file is only partially
// complete.
//
// This applies to files that are *implicitly* searched via a recursive
// directory traversal. In particular, this results in a WARNING message being
// printed. We make our file "implicit" by doing a recursive search with a glob
// that matches our file.
rgtest!(after_match1_implicit, |dir: Dir, mut cmd: TestCommand| {
dir.create_bytes("hay", HAY);
cmd.args(&[
"--no-mmap", "-n", "Project Gutenberg EBook", "-g", "hay",
]);
let expected = "\
hay:1:The Project Gutenberg EBook of A Study In Scarlet, by Arthur Conan Doyle
WARNING: stopped searching binary file hay after match (found \"\\u{0}\" byte around offset 9741)
";
eqnice!(expected, cmd.stdout());
});
// Like after_match1_implicit, except we provide a file to search
// explicitly. This results in identical behavior, but a different message.
rgtest!(after_match1_explicit, |dir: Dir, mut cmd: TestCommand| {
dir.create_bytes("hay", HAY);
cmd.args(&[
"--no-mmap", "-n", "Project Gutenberg EBook", "hay",
]);
let expected = "\
1:The Project Gutenberg EBook of A Study In Scarlet, by Arthur Conan Doyle
Binary file matches (found \"\\u{0}\" byte around offset 9741)
";
eqnice!(expected, cmd.stdout());
});
// Like after_match1_explicit, except we feed our content on stdin.
rgtest!(after_match1_stdin, |_: Dir, mut cmd: TestCommand| {
cmd.args(&[
"--no-mmap", "-n", "Project Gutenberg EBook",
]);
let expected = "\
1:The Project Gutenberg EBook of A Study In Scarlet, by Arthur Conan Doyle
Binary file matches (found \"\\u{0}\" byte around offset 9741)
";
eqnice!(expected, cmd.pipe(HAY));
});
// Like after_match1_implicit, but provides the --binary flag, which
// disables binary filtering. Thus, this matches the behavior of ripgrep as
// if the file were given explicitly.
rgtest!(after_match1_implicit_binary, |dir: Dir, mut cmd: TestCommand| {
dir.create_bytes("hay", HAY);
cmd.args(&[
"--no-mmap", "-n", "--binary", "Project Gutenberg EBook", "-g", "hay",
]);
let expected = "\
hay:1:The Project Gutenberg EBook of A Study In Scarlet, by Arthur Conan Doyle
Binary file hay matches (found \"\\u{0}\" byte around offset 9741)
";
eqnice!(expected, cmd.stdout());
});
// Like after_match1_implicit, but enables -a/--text, so no binary
// detection should be performed.
rgtest!(after_match1_implicit_text, |dir: Dir, mut cmd: TestCommand| {
dir.create_bytes("hay", HAY);
cmd.args(&[
"--no-mmap", "-n", "--text", "Project Gutenberg EBook", "-g", "hay",
]);
let expected = "\
hay:1:The Project Gutenberg EBook of A Study In Scarlet, by Arthur Conan Doyle
";
eqnice!(expected, cmd.stdout());
});
// Like after_match1_implicit_text, but enables -a/--text, so no binary
// detection should be performed.
rgtest!(after_match1_explicit_text, |dir: Dir, mut cmd: TestCommand| {
dir.create_bytes("hay", HAY);
cmd.args(&[
"--no-mmap", "-n", "--text", "Project Gutenberg EBook", "hay",
]);
let expected = "\
1:The Project Gutenberg EBook of A Study In Scarlet, by Arthur Conan Doyle
";
eqnice!(expected, cmd.stdout());
});
// Like after_match1_implicit, except this asks ripgrep to print all matching
// files.
//
// This is an interesting corner case that one might consider a bug, however,
// it's unlikely to be fixed. Namely, ripgrep probably shouldn't print `hay`
// as a matching file since it is in fact a binary file, and thus should be
// filtered out by default. However, the --files-with-matches flag will print
// out the path of a matching file as soon as a match is seen and then stop
// searching completely. Therefore, the NUL byte is never actually detected.
//
// The only way to fix this would be to kill ripgrep's performance in this case
// and continue searching the entire file for a NUL byte. (Similarly if the
// --quiet flag is set. See the next test.)
rgtest!(after_match1_implicit_path, |dir: Dir, mut cmd: TestCommand| {
dir.create_bytes("hay", HAY);
cmd.args(&[
"--no-mmap", "-l", "Project Gutenberg EBook", "-g", "hay",
]);
eqnice!("hay\n", cmd.stdout());
});
// Like after_match1_implicit_path, except this indicates that a match was
// found with no other output. (This is the same bug described above, but
// manifest as an exit code with no output.)
rgtest!(after_match1_implicit_quiet, |dir: Dir, mut cmd: TestCommand| {
dir.create_bytes("hay", HAY);
cmd.args(&[
"--no-mmap", "-q", "Project Gutenberg EBook", "-g", "hay",
]);
eqnice!("", cmd.stdout());
});
// This sets up the same test as after_match1_implicit_path, but instead of
// just printing the matching files, this includes the full count of matches.
// In this case, we need to search the entire file, so ripgrep correctly
// detects the binary data and suppresses output.
rgtest!(after_match1_implicit_count, |dir: Dir, mut cmd: TestCommand| {
dir.create_bytes("hay", HAY);
cmd.args(&[
"--no-mmap", "-c", "Project Gutenberg EBook", "-g", "hay",
]);
cmd.assert_err();
});
// Like after_match1_implicit_count, except the --binary flag is provided,
// which makes ripgrep disable binary data filtering even for implicit files.
rgtest!(after_match1_implicit_count_binary, |dir: Dir, mut cmd: TestCommand| {
dir.create_bytes("hay", HAY);
cmd.args(&[
"--no-mmap", "-c", "--binary",
"Project Gutenberg EBook",
"-g", "hay",
]);
eqnice!("hay:1\n", cmd.stdout());
});
// Like after_match1_implicit_count, except the file path is provided
// explicitly, so binary filtering is disabled and a count is correctly
// reported.
rgtest!(after_match1_explicit_count, |dir: Dir, mut cmd: TestCommand| {
dir.create_bytes("hay", HAY);
cmd.args(&[
"--no-mmap", "-c", "Project Gutenberg EBook", "hay",
]);
eqnice!("1\n", cmd.stdout());
});
// This tests that a match way before the NUL byte is shown, but a match after
// the NUL byte is not.
rgtest!(after_match2_implicit, |dir: Dir, mut cmd: TestCommand| {
dir.create_bytes("hay", HAY);
cmd.args(&[
"--no-mmap", "-n",
"Project Gutenberg EBook|a medical student",
"-g", "hay",
]);
let expected = "\
hay:1:The Project Gutenberg EBook of A Study In Scarlet, by Arthur Conan Doyle
WARNING: stopped searching binary file hay after match (found \"\\u{0}\" byte around offset 9741)
";
eqnice!(expected, cmd.stdout());
});
// Like after_match2_implicit, but enables -a/--text, so no binary
// detection should be performed.
rgtest!(after_match2_implicit_text, |dir: Dir, mut cmd: TestCommand| {
dir.create_bytes("hay", HAY);
cmd.args(&[
"--no-mmap", "-n", "--text",
"Project Gutenberg EBook|a medical student",
"-g", "hay",
]);
let expected = "\
hay:1:The Project Gutenberg EBook of A Study In Scarlet, by Arthur Conan Doyle
hay:236:\"And yet you say he is not a medical student?\"
";
eqnice!(expected, cmd.stdout());
});
// This tests that ripgrep *silently* quits before finding a match that occurs
// after a NUL byte.
rgtest!(before_match1_implicit, |dir: Dir, mut cmd: TestCommand| {
dir.create_bytes("hay", HAY);
cmd.args(&[
"--no-mmap", "-n", "Heaven", "-g", "hay",
]);
cmd.assert_err();
});
// This tests that ripgrep *does not* silently quit before finding a match that
// occurs after a NUL byte when a file is explicitly searched.
rgtest!(before_match1_explicit, |dir: Dir, mut cmd: TestCommand| {
dir.create_bytes("hay", HAY);
cmd.args(&[
"--no-mmap", "-n", "Heaven", "hay",
]);
let expected = "\
Binary file matches (found \"\\u{0}\" byte around offset 9741)
";
eqnice!(expected, cmd.stdout());
});
// Like before_match1_implicit, but enables the --binary flag, which
// disables binary filtering. Thus, this matches the behavior of ripgrep as if
// the file were given explicitly.
rgtest!(before_match1_implicit_binary, |dir: Dir, mut cmd: TestCommand| {
dir.create_bytes("hay", HAY);
cmd.args(&[
"--no-mmap", "-n", "--binary", "Heaven", "-g", "hay",
]);
let expected = "\
Binary file hay matches (found \"\\u{0}\" byte around offset 9741)
";
eqnice!(expected, cmd.stdout());
});
// Like before_match1_implicit, but enables -a/--text, so no binary
// detection should be performed.
rgtest!(before_match1_implicit_text, |dir: Dir, mut cmd: TestCommand| {
dir.create_bytes("hay", HAY);
cmd.args(&[
"--no-mmap", "-n", "--text", "Heaven", "-g", "hay",
]);
let expected = "\
hay:238:\"No. Heaven knows what the objects of his studies are. But here we
";
eqnice!(expected, cmd.stdout());
});
// This tests that ripgrep *silently* quits before finding a match that occurs
// before a NUL byte, but within the same buffer as the NUL byte.
rgtest!(before_match2_implicit, |dir: Dir, mut cmd: TestCommand| {
dir.create_bytes("hay", HAY);
cmd.args(&[
"--no-mmap", "-n", "a medical student", "-g", "hay",
]);
cmd.assert_err();
});
// This tests that ripgrep *does not* silently quit before finding a match that
// occurs before a NUL byte, but within the same buffer as the NUL byte. Even
// though the match occurs before the NUL byte, ripgrep still doesn't print it
// because it has already scanned ahead to detect the NUL byte. (This matches
// the behavior of GNU grep.)
rgtest!(before_match2_explicit, |dir: Dir, mut cmd: TestCommand| {
dir.create_bytes("hay", HAY);
cmd.args(&[
"--no-mmap", "-n", "a medical student", "hay",
]);
let expected = "\
Binary file matches (found \"\\u{0}\" byte around offset 9741)
";
eqnice!(expected, cmd.stdout());
});
// Like before_match1_implicit, but enables -a/--text, so no binary
// detection should be performed.
rgtest!(before_match2_implicit_text, |dir: Dir, mut cmd: TestCommand| {
dir.create_bytes("hay", HAY);
cmd.args(&[
"--no-mmap", "-n", "--text", "a medical student", "-g", "hay",
]);
let expected = "\
hay:236:\"And yet you say he is not a medical student?\"
";
eqnice!(expected, cmd.stdout());
});

500
tests/data/sherlock-nul.txt Normal file
View File

@@ -0,0 +1,500 @@
The Project Gutenberg EBook of A Study In Scarlet, by Arthur Conan Doyle
This eBook is for the use of anyone anywhere at no cost and with
almost no restrictions whatsoever. You may copy it, give it away or
re-use it under the terms of the Project Gutenberg License included
with this eBook or online at www.gutenberg.org
Title: A Study In Scarlet
Author: Arthur Conan Doyle
Posting Date: July 12, 2008 [EBook #244]
Release Date: April, 1995
[Last updated: February 17, 2013]
Language: English
*** START OF THIS PROJECT GUTENBERG EBOOK A STUDY IN SCARLET ***
Produced by Roger Squires
A STUDY IN SCARLET.
By A. Conan Doyle
[1]
Original Transcriber's Note: This etext is prepared directly
from an 1887 edition, and care has been taken to duplicate the
original exactly, including typographical and punctuation
vagaries.
Additions to the text include adding the underscore character to
indicate italics, and textual end-notes in square braces.
Project Gutenberg Editor's Note: In reproofing and moving old PG
files such as this to the present PG directory system it is the
policy to reformat the text to conform to present PG Standards.
In this case however, in consideration of the note above of the
original transcriber describing his care to try to duplicate the
original 1887 edition as to typography and punctuation vagaries,
no changes have been made in this ascii text file. However, in
the Latin-1 file and this html file, present standards are
followed and the several French and Spanish words have been
given their proper accents.
Part II, The Country of the Saints, deals much with the Mormon Church.
A STUDY IN SCARLET.
PART I.
(_Being a reprint from the reminiscences of_ JOHN H. WATSON, M.D., _late
of the Army Medical Department._) [2]
CHAPTER I. MR. SHERLOCK HOLMES.
IN the year 1878 I took my degree of Doctor of Medicine of the
University of London, and proceeded to Netley to go through the course
prescribed for surgeons in the army. Having completed my studies there,
I was duly attached to the Fifth Northumberland Fusiliers as Assistant
Surgeon. The regiment was stationed in India at the time, and before
I could join it, the second Afghan war had broken out. On landing at
Bombay, I learned that my corps had advanced through the passes, and
was already deep in the enemy's country. I followed, however, with many
other officers who were in the same situation as myself, and succeeded
in reaching Candahar in safety, where I found my regiment, and at once
entered upon my new duties.
The campaign brought honours and promotion to many, but for me it had
nothing but misfortune and disaster. I was removed from my brigade and
attached to the Berkshires, with whom I served at the fatal battle of
Maiwand. There I was struck on the shoulder by a Jezail bullet, which
shattered the bone and grazed the subclavian artery. I should have
fallen into the hands of the murderous Ghazis had it not been for the
devotion and courage shown by Murray, my orderly, who threw me across a
pack-horse, and succeeded in bringing me safely to the British lines.
Worn with pain, and weak from the prolonged hardships which I had
undergone, I was removed, with a great train of wounded sufferers, to
the base hospital at Peshawar. Here I rallied, and had already improved
so far as to be able to walk about the wards, and even to bask a little
upon the verandah, when I was struck down by enteric fever, that curse
of our Indian possessions. For months my life was despaired of, and
when at last I came to myself and became convalescent, I was so weak and
emaciated that a medical board determined that not a day should be lost
in sending me back to England. I was dispatched, accordingly, in the
troopship "Orontes," and landed a month later on Portsmouth jetty, with
my health irretrievably ruined, but with permission from a paternal
government to spend the next nine months in attempting to improve it.
I had neither kith nor kin in England, and was therefore as free as
air--or as free as an income of eleven shillings and sixpence a day will
permit a man to be. Under such circumstances, I naturally gravitated to
London, that great cesspool into which all the loungers and idlers of
the Empire are irresistibly drained. There I stayed for some time at
a private hotel in the Strand, leading a comfortless, meaningless
existence, and spending such money as I had, considerably more freely
than I ought. So alarming did the state of my finances become, that
I soon realized that I must either leave the metropolis and rusticate
somewhere in the country, or that I must make a complete alteration in
my style of living. Choosing the latter alternative, I began by making
up my mind to leave the hotel, and to take up my quarters in some less
pretentious and less expensive domicile.
On the very day that I had come to this conclusion, I was standing at
the Criterion Bar, when some one tapped me on the shoulder, and turning
round I recognized young Stamford, who had been a dresser under me at
Barts. The sight of a friendly face in the great wilderness of London is
a pleasant thing indeed to a lonely man. In old days Stamford had never
been a particular crony of mine, but now I hailed him with enthusiasm,
and he, in his turn, appeared to be delighted to see me. In the
exuberance of my joy, I asked him to lunch with me at the Holborn, and
we started off together in a hansom.
"Whatever have you been doing with yourself, Watson?" he asked in
undisguised wonder, as we rattled through the crowded London streets.
"You are as thin as a lath and as brown as a nut."
I gave him a short sketch of my adventures, and had hardly concluded it
by the time that we reached our destination.
"Poor devil!" he said, commiseratingly, after he had listened to my
misfortunes. "What are you up to now?"
"Looking for lodgings." [3] I answered. "Trying to solve the problem
as to whether it is possible to get comfortable rooms at a reasonable
price."
"That's a strange thing," remarked my companion; "you are the second man
to-day that has used that expression to me."
"And who was the first?" I asked.
"A fellow who is working at the chemical laboratory up at the hospital.
He was bemoaning himself this morning because he could not get someone
to go halves with him in some nice rooms which he had found, and which
were too much for his purse."
"By Jove!" I cried, "if he really wants someone to share the rooms and
the expense, I am the very man for him. I should prefer having a partner
to being alone."
Young Stamford looked rather strangely at me over his wine-glass. "You
don't know Sherlock Holmes yet," he said; "perhaps you would not care
for him as a constant companion."
"Why, what is there against him?"
"Oh, I didn't say there was anything against him. He is a little queer
in his ideas--an enthusiast in some branches of science. As far as I
know he is a decent fellow enough."
"A medical student, I suppose?" said I.
"No--I have no idea what he intends to go in for. I believe he is well
up in anatomy, and he is a first-class chemist; but, as far as I know,
he has never taken out any systematic medical classes. His studies are
very desultory and eccentric, but he has amassed a lot of out-of-the way
knowledge which would astonish his professors."
"Did you never ask him what he was going in for?" I asked.
"No; he is not a man that it is easy to draw out, though he can be
communicative enough when the fancy seizes him."
"I should like to meet him," I said. "If I am to lodge with anyone, I
should prefer a man of studious and quiet habits. I am not strong
enough yet to stand much noise or excitement. I had enough of both in
Afghanistan to last me for the remainder of my natural existence. How
could I meet this friend of yours?"
"He is sure to be at the laboratory," returned my companion. "He either
avoids the place for weeks, or else he works there from morning to
night. If you like, we shall drive round together after luncheon."
"Certainly," I answered, and the conversation drifted away into other
channels.
As we made our way to the hospital after leaving the Holborn, Stamford
gave me a few more particulars about the gentleman whom I proposed to
take as a fellow-lodger.
"You mustn't blame me if you don't get on with him," he said; "I know
nothing more of him than I have learned from meeting him occasionally in
the laboratory. You proposed this arrangement, so you must not hold me
responsible."
"If we don't get on it will be easy to part company," I answered. "It
seems to me, Stamford," I added, looking hard at my companion, "that you
have some reason for washing your hands of the matter. Is this fellow's
temper so formidable, or what is it? Don't be mealy-mouthed about it."
"It is not easy to express the inexpressible," he answered with a laugh.
"Holmes is a little too scientific for my tastes--it approaches to
cold-bloodedness. I could imagine his giving a friend a little pinch of
the latest vegetable alkaloid, not out of malevolence, you understand,
but simply out of a spirit of inquiry in order to have an accurate idea
of the effects. To do him justice, I think that he would take it himself
with the same readiness. He appears to have a passion for definite and
exact knowledge."
"Very right too."
"Yes, but it may be pushed to excess. When it comes to beating the
subjects in the dissecting-rooms with a stick, it is certainly taking
rather a bizarre shape."
"Beating the subjects!"
"Yes, to verify how far bruises may be produced after death. I saw him
at it with my own eyes."
"And yet you say he is not a medical student?"
abcdef
"No. Heaven knows what the objects of his studies are. But here we
are, and you must form your own impressions about him." As he spoke, we
turned down a narrow lane and passed through a small side-door, which
opened into a wing of the great hospital. It was familiar ground to me,
and I needed no guiding as we ascended the bleak stone staircase and
made our way down the long corridor with its vista of whitewashed
wall and dun-coloured doors. Near the further end a low arched passage
branched away from it and led to the chemical laboratory.
This was a lofty chamber, lined and littered with countless bottles.
Broad, low tables were scattered about, which bristled with retorts,
test-tubes, and little Bunsen lamps, with their blue flickering flames.
There was only one student in the room, who was bending over a distant
table absorbed in his work. At the sound of our steps he glanced round
and sprang to his feet with a cry of pleasure. "I've found it! I've
found it," he shouted to my companion, running towards us with a
test-tube in his hand. "I have found a re-agent which is precipitated
by hoemoglobin, [4] and by nothing else." Had he discovered a gold mine,
greater delight could not have shone upon his features.
"Dr. Watson, Mr. Sherlock Holmes," said Stamford, introducing us.
"How are you?" he said cordially, gripping my hand with a strength
for which I should hardly have given him credit. "You have been in
Afghanistan, I perceive."
"How on earth did you know that?" I asked in astonishment.
"Never mind," said he, chuckling to himself. "The question now is about
hoemoglobin. No doubt you see the significance of this discovery of
mine?"
"It is interesting, chemically, no doubt," I answered, "but
practically----"
"Why, man, it is the most practical medico-legal discovery for years.
Don't you see that it gives us an infallible test for blood stains. Come
over here now!" He seized me by the coat-sleeve in his eagerness, and
drew me over to the table at which he had been working. "Let us have
some fresh blood," he said, digging a long bodkin into his finger, and
drawing off the resulting drop of blood in a chemical pipette. "Now, I
add this small quantity of blood to a litre of water. You perceive that
the resulting mixture has the appearance of pure water. The proportion
of blood cannot be more than one in a million. I have no doubt, however,
that we shall be able to obtain the characteristic reaction." As he
spoke, he threw into the vessel a few white crystals, and then added
some drops of a transparent fluid. In an instant the contents assumed a
dull mahogany colour, and a brownish dust was precipitated to the bottom
of the glass jar.
"Ha! ha!" he cried, clapping his hands, and looking as delighted as a
child with a new toy. "What do you think of that?"
"It seems to be a very delicate test," I remarked.
"Beautiful! beautiful! The old Guiacum test was very clumsy and
uncertain. So is the microscopic examination for blood corpuscles. The
latter is valueless if the stains are a few hours old. Now, this appears
to act as well whether the blood is old or new. Had this test been
invented, there are hundreds of men now walking the earth who would long
ago have paid the penalty of their crimes."
"Indeed!" I murmured.
"Criminal cases are continually hinging upon that one point. A man is
suspected of a crime months perhaps after it has been committed. His
linen or clothes are examined, and brownish stains discovered upon them.
Are they blood stains, or mud stains, or rust stains, or fruit stains,
or what are they? That is a question which has puzzled many an expert,
and why? Because there was no reliable test. Now we have the Sherlock
Holmes' test, and there will no longer be any difficulty."
His eyes fairly glittered as he spoke, and he put his hand over his
heart and bowed as if to some applauding crowd conjured up by his
imagination.
"You are to be congratulated," I remarked, considerably surprised at his
enthusiasm.
"There was the case of Von Bischoff at Frankfort last year. He would
certainly have been hung had this test been in existence. Then there was
Mason of Bradford, and the notorious Muller, and Lefevre of Montpellier,
and Samson of New Orleans. I could name a score of cases in which it
would have been decisive."
"You seem to be a walking calendar of crime," said Stamford with a
laugh. "You might start a paper on those lines. Call it the 'Police News
of the Past.'"
"Very interesting reading it might be made, too," remarked Sherlock
Holmes, sticking a small piece of plaster over the prick on his finger.
"I have to be careful," he continued, turning to me with a smile, "for I
dabble with poisons a good deal." He held out his hand as he spoke, and
I noticed that it was all mottled over with similar pieces of plaster,
and discoloured with strong acids.
"We came here on business," said Stamford, sitting down on a high
three-legged stool, and pushing another one in my direction with
his foot. "My friend here wants to take diggings, and as you were
complaining that you could get no one to go halves with you, I thought
that I had better bring you together."
Sherlock Holmes seemed delighted at the idea of sharing his rooms with
me. "I have my eye on a suite in Baker Street," he said, "which would
suit us down to the ground. You don't mind the smell of strong tobacco,
I hope?"
"I always smoke 'ship's' myself," I answered.
"That's good enough. I generally have chemicals about, and occasionally
do experiments. Would that annoy you?"
"By no means."
"Let me see--what are my other shortcomings. I get in the dumps at
times, and don't open my mouth for days on end. You must not think I am
sulky when I do that. Just let me alone, and I'll soon be right. What
have you to confess now? It's just as well for two fellows to know the
worst of one another before they begin to live together."
I laughed at this cross-examination. "I keep a bull pup," I said, "and
I object to rows because my nerves are shaken, and I get up at all sorts
of ungodly hours, and I am extremely lazy. I have another set of vices
when I'm well, but those are the principal ones at present."
"Do you include violin-playing in your category of rows?" he asked,
anxiously.
"It depends on the player," I answered. "A well-played violin is a treat
for the gods--a badly-played one----"
"Oh, that's all right," he cried, with a merry laugh. "I think we may
consider the thing as settled--that is, if the rooms are agreeable to
you."
"When shall we see them?"
"Call for me here at noon to-morrow, and we'll go together and settle
everything," he answered.
"All right--noon exactly," said I, shaking his hand.
We left him working among his chemicals, and we walked together towards
my hotel.
"By the way," I asked suddenly, stopping and turning upon Stamford, "how
the deuce did he know that I had come from Afghanistan?"
My companion smiled an enigmatical smile. "That's just his little
peculiarity," he said. "A good many people have wanted to know how he
finds things out."
"Oh! a mystery is it?" I cried, rubbing my hands. "This is very piquant.
I am much obliged to you for bringing us together. 'The proper study of
mankind is man,' you know."
"You must study him, then," Stamford said, as he bade me good-bye.
"You'll find him a knotty problem, though. I'll wager he learns more
about you than you about him. Good-bye."
"Good-bye," I answered, and strolled on to my hotel, considerably
interested in my new acquaintance.
CHAPTER II. THE SCIENCE OF DEDUCTION.
WE met next day as he had arranged, and inspected the rooms at No. 221B,
[5] Baker Street, of which he had spoken at our meeting. They
consisted of a couple of comfortable bed-rooms and a single large
airy sitting-room, cheerfully furnished, and illuminated by two broad
windows. So desirable in every way were the apartments, and so moderate
did the terms seem when divided between us, that the bargain was
concluded upon the spot, and we at once entered into possession.
That very evening I moved my things round from the hotel, and on the
following morning Sherlock Holmes followed me with several boxes and
portmanteaus. For a day or two we were busily employed in unpacking and
laying out our property to the best advantage. That done, we
gradually began to settle down and to accommodate ourselves to our new
surroundings.
Holmes was certainly not a difficult man to live with. He was quiet
in his ways, and his habits were regular. It was rare for him to be
up after ten at night, and he had invariably breakfasted and gone out
before I rose in the morning. Sometimes he spent his day at the chemical
laboratory, sometimes in the dissecting-rooms, and occasionally in long
walks, which appeared to take him into the lowest portions of the City.
Nothing could exceed his energy when the working fit was upon him; but
now and again a reaction would seize him, and for days on end he would
lie upon the sofa in the sitting-room, hardly uttering a word or moving
a muscle from morning to night. On these occasions I have noticed such
a dreamy, vacant expression in his eyes, that I might have suspected him
of being addicted to the use of some narcotic, had not the temperance
and cleanliness of his whole life forbidden such a notion.
As the weeks went by, my interest in him and my curiosity as to his
aims in life, gradually deepened and increased. His very person and
appearance were such as to strike the attention of the most casual
observer. In height he was rather over six feet, and so excessively
lean that he seemed to be considerably taller. His eyes were sharp and
piercing, save during those intervals of torpor to which I have alluded;
and his thin, hawk-like nose gave his whole expression an air of
alertness and decision. His chin, too, had the prominence and squareness
which mark the man of determination. His hands were invariably
blotted with ink and stained with chemicals, yet he was possessed of
extraordinary delicacy of touch, as I frequently had occasion to observe
when I watched him manipulating his fragile philosophical instruments.
The reader may set me down as a hopeless busybody, when I confess how
much this man stimulated my curiosity, and how often I endeavoured
to break through the reticence which he showed on all that concerned
himself. Before pronouncing judgment, however, be it remembered, how
objectless was my life, and how little there was to engage my attention.
My health forbade me from venturing out unless the weather was
exceptionally genial, and I had no friends who would call upon me and
break the monotony of my daily existence. Under these circumstances, I
eagerly hailed the little mystery which hung around my companion, and
spent much of my time in endeavouring to unravel it.
He was not studying medicine. He had himself, in reply to a question,
confirmed Stamford's opinion upon that point. Neither did he appear to
have pursued any course of reading which might fit him for a degree in
science or any other recognized portal which would give him an entrance
into the learned world. Yet his zeal for certain studies was remarkable,
and within eccentric limits his knowledge was so extraordinarily ample
and minute that his observations have fairly astounded me. Surely no man
would work so hard or attain such precise information unless he had some
definite end in view. Desultory readers are seldom remarkable for the
exactness of their learning. No man burdens his mind with small matters
unless he has some very good reason for doing so.
His ignorance was as remarkable as his knowledge. Of contemporary
literature, philosophy and politics he appeared to know next to nothing.
Upon my quoting Thomas Carlyle, he inquired in the naivest way who he
might be and what he had done. My surprise reached a climax, however,
when I found incidentally that he was ignorant of the Copernican Theory
and of the composition of the Solar System. That any civilized human
being in this nineteenth century should not be aware that the earth
travelled round the sun appeared to be to me such an extraordinary fact
that I could hardly realize it.
"You appear to be astonished," he said, smiling at my expression of
surprise. "Now that I do know it I shall do my best to forget it."
"To forget it!"
"You see," he explained, "I consider that a man's brain originally is
like a little empty attic, and you have to stock it with such furniture
as you choose. A fool takes in all the lumber of every sort that he
comes across, so that the knowledge which might be useful to him gets
crowded out, or at best is jumbled up with a lot of other things so that
he has a difficulty in laying his hands upon it. Now the skilful workman
is very careful indeed as to what he takes into his brain-attic. He will
have nothing but the tools which may help him in doing his work, but of
these he has a large assortment, and all in the most perfect order. It
is a mistake to think that that little room has elastic walls and can
distend to any extent. Depend upon it there comes a time when for every
addition of knowledge you forget something that you knew before. It is
of the highest importance, therefore, not to have useless facts elbowing
out the useful ones."

View File

@@ -72,7 +72,7 @@ rgtest!(f7_stdin, |dir: Dir, mut cmd: TestCommand| {
sherlock:For the Doctor Watsons of this world, as opposed to the Sherlock
sherlock:be, to a very large extent, the result of luck. Sherlock Holmes
";
eqnice!(expected, cmd.arg("-f-").pipe("Sherlock"));
eqnice!(expected, cmd.arg("-f-").pipe(b"Sherlock"));
});
// See: https://github.com/BurntSushi/ripgrep/issues/20
@@ -630,6 +630,41 @@ rgtest!(f993_null_data, |dir: Dir, mut cmd: TestCommand| {
eqnice!(expected, cmd.stdout());
});
// See: https://github.com/BurntSushi/ripgrep/issues/1078
//
// N.B. There are many more tests in the grep-printer crate.
rgtest!(f1078_max_columns_preview1, |dir: Dir, mut cmd: TestCommand| {
dir.create("sherlock", SHERLOCK);
cmd.args(&[
"-M46", "--max-columns-preview",
"exhibited|dusted|has to have it",
]);
let expected = "\
sherlock:but Doctor Watson has to have it taken out for [... omitted end of long line]
sherlock:and exhibited clearly, with a label attached.
";
eqnice!(expected, cmd.stdout());
});
rgtest!(f1078_max_columns_preview2, |dir: Dir, mut cmd: TestCommand| {
dir.create("sherlock", SHERLOCK);
cmd.args(&[
"-M43", "--max-columns-preview",
// Doing a replacement forces ripgrep to show the number of remaining
// matches. Normally, this happens by default when printing a tty with
// colors.
"-rxxx",
"exhibited|dusted|has to have it",
]);
let expected = "\
sherlock:but Doctor Watson xxx taken out for him and [... 1 more match]
sherlock:and xxx clearly, with a label attached.
";
eqnice!(expected, cmd.stdout());
});
// See: https://github.com/BurntSushi/ripgrep/issues/1138
rgtest!(f1138_no_ignore_dot, |dir: Dir, mut cmd: TestCommand| {
dir.create_dir(".git");
@@ -646,6 +681,21 @@ rgtest!(f1138_no_ignore_dot, |dir: Dir, mut cmd: TestCommand| {
eqnice!("bar\n", cmd.arg("--ignore-file").arg(".fzf-ignore").stdout());
});
// See: https://github.com/BurntSushi/ripgrep/issues/1155
rgtest!(f1155_auto_hybrid_regex, |dir: Dir, mut cmd: TestCommand| {
// No sense in testing a hybrid regex engine with only one engine!
if !dir.is_pcre2() {
return;
}
dir.create("sherlock", SHERLOCK);
cmd.arg("--no-pcre2").arg("--auto-hybrid-regex").arg(r"(?<=the )Sherlock");
let expected = "\
sherlock:For the Doctor Watsons of this world, as opposed to the Sherlock
";
eqnice!(expected, cmd.stdout());
});
// See: https://github.com/BurntSushi/ripgrep/issues/1207
//

View File

@@ -341,6 +341,14 @@ rgtest!(glob_case_sensitive, |dir: Dir, mut cmd: TestCommand| {
eqnice!("file2.html:Sherlock\n", cmd.stdout());
});
rgtest!(glob_always_case_insensitive, |dir: Dir, mut cmd: TestCommand| {
dir.create("sherlock", SHERLOCK);
dir.create("file.HTML", "Sherlock");
cmd.args(&["--glob-case-insensitive", "--glob", "*.html", "Sherlock"]);
eqnice!("file.HTML:Sherlock\n", cmd.stdout());
});
rgtest!(byte_offset_only_matching, |dir: Dir, mut cmd: TestCommand| {
dir.create("sherlock", SHERLOCK);
cmd.arg("-b").arg("-o").arg("Sherlock");
@@ -752,12 +760,11 @@ rgtest!(unrestricted2, |dir: Dir, mut cmd: TestCommand| {
rgtest!(unrestricted3, |dir: Dir, mut cmd: TestCommand| {
dir.create("sherlock", SHERLOCK);
dir.create("file", "foo\x00bar\nfoo\x00baz\n");
dir.create("hay", "foo\x00bar\nfoo\x00baz\n");
cmd.arg("-uuu").arg("foo");
let expected = "\
file:foo\x00bar
file:foo\x00baz
Binary file hay matches (found \"\\u{0}\" byte around offset 3)
";
eqnice!(expected, cmd.stdout());
});
@@ -950,10 +957,35 @@ rgtest!(compressed_failing_gzip, |dir: Dir, mut cmd: TestCommand| {
cmd.assert_non_empty_stderr();
});
rgtest!(binary_nosearch, |dir: Dir, mut cmd: TestCommand| {
rgtest!(binary_convert, |dir: Dir, mut cmd: TestCommand| {
dir.create("file", "foo\x00bar\nfoo\x00baz\n");
cmd.arg("foo").arg("file");
cmd.arg("--no-mmap").arg("foo").arg("file");
let expected = "\
Binary file matches (found \"\\u{0}\" byte around offset 3)
";
eqnice!(expected, cmd.stdout());
});
rgtest!(binary_convert_mmap, |dir: Dir, mut cmd: TestCommand| {
dir.create("file", "foo\x00bar\nfoo\x00baz\n");
cmd.arg("--mmap").arg("foo").arg("file");
let expected = "\
Binary file matches (found \"\\u{0}\" byte around offset 3)
";
eqnice!(expected, cmd.stdout());
});
rgtest!(binary_quit, |dir: Dir, mut cmd: TestCommand| {
dir.create("file", "foo\x00bar\nfoo\x00baz\n");
cmd.arg("--no-mmap").arg("foo").arg("-gfile");
cmd.assert_err();
});
rgtest!(binary_quit_mmap, |dir: Dir, mut cmd: TestCommand| {
dir.create("file", "foo\x00bar\nfoo\x00baz\n");
cmd.arg("--mmap").arg("foo").arg("-gfile");
cmd.assert_err();
});

View File

@@ -88,7 +88,7 @@ rgtest!(stdin, |_: Dir, mut cmd: TestCommand| {
1:For the Doctor Watsons of this world, as opposed to the Sherlock
2:Holmeses, success in the province of detective work must always
";
eqnice!(expected, cmd.pipe(SHERLOCK));
eqnice!(expected, cmd.pipe(SHERLOCK.as_bytes()));
});
// Test that multiline search and contextual matches work.

View File

@@ -705,3 +705,36 @@ rgtest!(r1203_reverse_suffix_literal, |dir: Dir, _: TestCommand| {
let mut cmd = dir.command();
eqnice!("153.230000\n", cmd.arg(r"\d\d\d000").arg("test").stdout());
});
// See: https://github.com/BurntSushi/ripgrep/issues/1223
rgtest!(r1223_no_dir_check_for_default_path, |dir: Dir, mut cmd: TestCommand| {
dir.create_dir("-");
dir.create("a.json", "{}");
dir.create("a.txt", "some text");
eqnice!(
"a.json\na.txt\n",
sort_lines(&cmd.arg("a").pipe(b"a.json\na.txt"))
);
});
// See: https://github.com/BurntSushi/ripgrep/issues/1259
rgtest!(r1259_drop_last_byte_nonl, |dir: Dir, mut cmd: TestCommand| {
dir.create("patterns-nonl", "[foo]");
dir.create("patterns-nl", "[foo]\n");
dir.create("test", "fz");
eqnice!("fz\n", cmd.arg("-f").arg("patterns-nonl").arg("test").stdout());
cmd = dir.command();
eqnice!("fz\n", cmd.arg("-f").arg("patterns-nl").arg("test").stdout());
});
// See: https://github.com/BurntSushi/ripgrep/issues/1334
rgtest!(r1334_crazy_literals, |dir: Dir, mut cmd: TestCommand| {
dir.create("patterns", &"1.208.0.0/12\n".repeat(40));
dir.create("corpus", "1.208.0.0/12\n");
eqnice!(
"1.208.0.0/12\n",
cmd.arg("-Ff").arg("patterns").arg("corpus").stdout()
);
});

View File

@@ -7,6 +7,8 @@ mod hay;
// Utilities for making tests nicer to read and easier to write.
mod util;
// Tests for ripgrep's handling of binary files.
mod binary;
// Tests related to most features in ripgrep. If you're adding something new
// to ripgrep, tests should probably go in here.
mod feature;

View File

@@ -297,7 +297,7 @@ impl TestCommand {
}
/// Pipe `input` to a command, and collect the output.
pub fn pipe(&mut self, input: &str) -> String {
pub fn pipe(&mut self, input: &[u8]) -> String {
self.cmd.stdin(process::Stdio::piped());
self.cmd.stdout(process::Stdio::piped());
self.cmd.stderr(process::Stdio::piped());
@@ -309,7 +309,7 @@ impl TestCommand {
let mut stdin = child.stdin.take().expect("expected standard input");
let input = input.to_owned();
let worker = thread::spawn(move || {
write!(stdin, "{}", input)
stdin.write_all(&input)
});
let output = self.expect_success(child.wait_with_output().unwrap());

View File

@@ -1,2 +0,0 @@
wincolor has moved to the termcolor repository:
https://github.com/BurntSushi/termcolor