Compare commits

...

539 Commits

Author SHA1 Message Date
Andrew Gallant
92daa34eb3 ripgrep: release 12.0.0 2020-03-15 21:42:54 -04:00
Andrew Gallant
a8c1fb7c88 changelog: prepare for 12.0.0 release 2020-03-15 21:06:45 -04:00
Andrew Gallant
52ec68799c ci: make script names consistent 2020-03-15 21:06:45 -04:00
Andrew Gallant
c0d78240df ci: remove Travis and appveyor specific stuff 2020-03-15 21:06:45 -04:00
Andrew Gallant
cda9acb876 ci: rebuild release infrastructure on GitHub Actions 2020-03-15 21:06:45 -04:00
Andrew Gallant
1ece50694e readme: update file size 2020-03-15 13:27:31 -04:00
Andrew Gallant
f3a966bcbc readme: add 'Unicode' label to ugrep 2020-03-15 13:26:02 -04:00
Andrew Gallant
a38913b63a readme: update benchmarks
This also updates the corpora used, so previous times (and counts) are
not comparable.

We also remove some tools, likt pt, sift and ucg, since they appear to
be no longer maintained. ag isn't really maintained either, but it still
has significant mind share, so we retain a benchmark for it.

We also upgrade ack to version 3, and remove the clarification on how
`-w` is implemented.

We also add `git grep -P` (uses PCRE2) which appears to be much faster
than `git grep -E`.

Finally, we add ugrep which is a new up and comer in this space.

Fixes #1474
2020-03-15 13:21:18 -04:00
Andrew Gallant
e772a95b58 regex: avoid using literal optimizations when whitespace is detected
If a literal is entirely whitespace, then it's quite likely that it is
very common. So when that case occurs, just don't do (inner) literal
optimizations at all.

The regex engine may still make sub-optimal decisions here, but that's a
problem for another day.

Fixes #1087
2020-03-15 13:19:14 -04:00
Andrew Gallant
9dd4bf8d7f style: fix rust-analyzer lint warnings 2020-03-15 13:19:14 -04:00
Andrew Gallant
c4c43c733e cli: add --no-ignore-files flag
The purpose of this flag is to force ripgrep to ignore all --ignore-file
flags (whether they come before or after --no-ignore-files).

This flag can be overridden with --ignore-files.

Fixes #1466
2020-03-15 13:19:14 -04:00
Andrew Gallant
447506ebe0 doc: clarify globing behavior
Fixes #1442, Fixes #1478
2020-03-15 13:19:14 -04:00
Andrew Gallant
12e4180985 doc: remove CPU features from man pages
It doesn't really belong in the man page since it's an artifact of a
build/runtime configuration. Moreover, it inhibits reproducible builds.

Fixes #1441
2020-03-15 13:19:14 -04:00
Andrew Gallant
daa8319398 doc: note ripgrep's stdin behavior
Fixes #1439
2020-03-15 13:19:14 -04:00
pierrenn
3a6a24a52a cli: add engine flag
This permits switching between the different regex engine modes that
ripgrep supports. The purpose of this flag is to make it easier to
extend ripgrep with additional regex engines.

Closes #1488, Closes #1502
2020-03-15 09:30:58 -04:00
pierrenn
aab3d80374 args: refactor to permit adding other engines
This is in preparation for adding a new --engine flag which is intended
to eventually supplant --auto-hybrid-regex.

While there are no immediate plans to add more regex engines to ripgrep,
this is intended to make it easier to maintain a patch to ripgrep with
an additional regex engine. See #1488 for more details.
2020-03-15 09:24:28 -04:00
Andrew Gallant
1856cda77b style: fix rust-analyzer lints in core 2020-03-15 09:04:54 -04:00
Andrew Gallant
7340d8dbbe deps: update everything
This adds one new dependency, maybe-uninit, which is brought in by
crossbeam-channel[1]. This is to apparently fix some unsound code
without bumping the MSRV. Since ripgrep uses the latest stable release
of Rust, the maybe-uninit crate should compile down to nothing and just
re-export std's `MaybeUninit` type.

[1] - https://github.com/crossbeam-rs/crossbeam/pull/458
2020-03-15 08:32:33 -04:00
chip
50d2047ae2 crates: update URLs in Cargo.toml
This corrects an oversight when the repo was re-organized to
have its crates moved into a 'crates' sub-directory.

PR #1505
2020-02-28 20:31:43 -05:00
Wolf Honore
227436624f ignore/types: add coq type
PR #1504
2020-02-28 19:11:29 -05:00
pierreN
5bfdd3a652 ci: fix ci by removing fetch-depth 1
It's not clear why removing this makes things work. I've submitted
PRs that passed CI with fetch-depth=1. Maybe it only fails when
PRs are submitted from external contributors?

Either way, for now, we remove this and absorb the extra cost in
order to get PRs passing CI again.

PR #1501
2020-02-27 08:53:06 -05:00
Andrew Gallant
ecec6147d1 doc: be more vague in the FAQ
The existing vagueness was not enough to prevent people from lawyering
me over it.
2020-02-22 09:13:31 -05:00
Lucien Greathouse
db7a8cdcb5 globset: Implement serde::{Serialize, Deserialize} for Glob
PR #1492
2020-02-21 07:40:47 -05:00
Andrew Gallant
eef7a7e7ff readme: update CI badge 2020-02-20 18:15:15 -05:00
Andrew Gallant
4176050cdd ignore: another simplification
Again, thanks to @zsugabubus!
2020-02-20 17:26:34 -05:00
Andrew Gallant
109460fce2 ignore: simplify parallel worker initialization
We can just ask the channel whether any work has been loaded. Normally
querying a channel for its length is a strong predictor of bugs, but in
this case, we do it before we ever attempt a `recv`, so it should work.

Kudos to @zsugabubus for suggesting this!
2020-02-20 16:50:41 -05:00
Andrew Gallant
da3431b478 ci: switch build to GitHub Actions 2020-02-20 16:07:51 -05:00
Andrew Gallant
f314b0d55f ignore: fix parallel traversal
It turns out that the previous version wasn't quite correct. Namely, it
was possible for the following sequence to occur:

1. Consider that all workers, except for one, are `waiting`.
2. The last remaining worker finds one more job to do and sends it on
   the channel.
3. One of the previously `waiting` workers wakes up from the job that
   the last running worker sent, but `self.resume()` has not been
   called yet.
4. The last worker, from (2), calls `get_work` and sees that the
   channel has nothing on it, so it executes `self.waiting() ==
   1`. Since the worker in (3) hasn't called `self.resume()` yet,
   `self.waiting() == 1` evaluates to true.
5. This sets off a chain reaction that stops all workers, despite that
   fact that (3) got more work (which could itself spawn more work).

The end result is that the traversal may terminate while their are still
outstanding work items to process. This problem was observed through
spurious failures in CI. I was not actually able to reproduce the bug
locally.

We fix this by changing our strategy to detect termination using a
counter. Namely, we increment the counter just before sending new work
and decrement the counter just after finishing work. In this way, we
guarantee that the counter only ever reaches 0 once there is no more
work to process.

See #1337 for more discussion. Many thanks to @zsugabubus for helping me
work through this.
2020-02-20 16:07:51 -05:00
Andrew Gallant
fab5c812f3 tests: add debugging output
The transient failures appear to be persisting and they are quite
difficult to debug. So include a full directory listing in the output of
every test failure.
2020-02-20 16:07:51 -05:00
Andrew Gallant
c824d095a7 tests: use std::env::consts::EXE_SUFFIX
This avoids a conditional compilation knob and is likely more portable.
2020-02-20 16:07:51 -05:00
Andrew Gallant
ee21897ebd tests: make 'cross test' work
The reason why it wasn't working was the integration tests. Namely, the
integration tests attempted to execute the 'rg' binary directly from
inside cross's docker container. But this obviously doesn't work when
'rg' was compiled for a totally different architecture.

Cross normally does this by hooking into the Rust test infrastructure
and causing tests to run with 'qemu'. But our integration tests didn't
do that. This commit fixes our test setup to check for cross's
environment variable that points to the 'qemu' binary. Once we have
that, we just use 'qemu-foo rg' instead of 'rg'. Piece of cake.
2020-02-20 16:07:51 -05:00
Andrew Gallant
0373f6ddb0 ci: soft-disable Travis and AppVeyor 2020-02-20 16:07:51 -05:00
asymmetric
b44554c803 ignore/types: add K type
Adds support for files used by the K executable semantic framework:
http://www.kframework.org/index.php/Main_Page

PR #1493
2020-02-19 07:07:09 -05:00
Andrew Gallant
0874aa115c repo: make ripgrep build with the new organization 2020-02-17 19:24:53 -05:00
Andrew Gallant
fdd8510fdd repo: move all source code in crates directory
The top-level listing was just getting a bit too long for my taste. So
put all of the code in one directory and shrink the large top-level mess
to a small top-level mess.

NOTE: This commit only contains renames. The subsequent commit will
actually make ripgrep build again. We do it this way with the naive hope
that this will make it easier for git history to track the renames.
Sigh.
2020-02-17 19:24:53 -05:00
Andrew Gallant
0bc4f0447b style: rustfmt everything
This is why I was so intent on clearing the PR queue. This will
effectively invalidate all existing patches, so I wanted to start from a
clean slate.

We do make one little tweak: we put the default type definitions in
their own file and tell rustfmt to keep its grubby mits off of it. We
also sort it lexicographically and hopefully will enforce that from here
on.
2020-02-17 19:24:53 -05:00
Andrew Gallant
c95f29e3ba ci: check rustfmt in Travis 2020-02-17 19:24:53 -05:00
Andrew Gallant
3644208b03 ci: set MSRV to Rust 1.41.0
The next release will be ripgrep 12, so we bump to the latest stable
release of Rust.
2020-02-17 19:24:53 -05:00
Andrew Gallant
66f045e055 changelog: add commit links
... now that we have stable identifiers.
2020-02-17 17:34:19 -05:00
zsugabubus
3d59bd98aa ignore: rework inter-thread messaging
Change the meaning of `Quit` message. Now it means terminate. The final
"dance" is unnecessary, because by the time quitting begins, no thread
will ever spawn a new `Work`. The trick was to replace the heuristic
spin-loop with blocking receive.

Closes #1337
2020-02-17 17:16:28 -05:00
Andrew Gallant
52d7f47420 ignore: treat symbolic links to directories as directories
Due to how walkdir works if symlinks are not followed, symlinks to
directories are seen as simple files by ripgrep. This caused a panic
in some cases due to receiving a WalkEvent::Exit event without a
corresponding WalkEvent::Dir event.

This is fixed by looking at the metadata of the file in the case of a
symlink to determine if it's a directory. We are careful to only do
this stat check when the depth of the entry is 0, as this bug only
impacts us when 1) we aren't following symlinks generally and 2) the
user provides a symlinked directory that we do follow as a top-level
path to search.

Fixes #1389, Closes #1397
2020-02-17 17:16:28 -05:00
Andrew Gallant
75cbe88fa2 cli: add --no-unicode, deprecate --no-pcre2-unicode
This adds a universal --no-unicode flag that is intended to work for all
supported regex engines. There is no point in retaining
--no-pcre2-unicode, so we make them aliases to the new flags and
deprecate them.
2020-02-17 17:16:28 -05:00
Andrew Gallant
711426a632 cli: add --no-require-git flag
This flag prevents ripgrep from requiring one to search a git repository
in order to respect git-related ignore rules (global, .gitignore and
local excludes). This actually corresponds to behavior ripgrep had long
ago, but #934 changed that. It turns out that users were relying on this
buggy behavior. In most cases, fixing it as simple as converting one's
rules to .ignore or .rgignore files. Unfortunately, there are other use
cases---like Perforce automatically respecting .gitignore files---that
make a strong case for ripgrep to at least support this.

The UX of a flag like this is absolutely atrocious. It's so obscure that
it's really not worth explicitly calling it out anywhere. Moreover, the
error cases that occur when this flag isn't used (but its behavior is
desirable) will not be intuitive, do not seem easily detectable and will
not guide users to this flag. Nevertheless, the motivation for this is
just barely strong enough for me to begrudgingly accept this.

Fixes #1414, Closes #1416
2020-02-17 17:16:28 -05:00
Andrew Gallant
01eeec56bb deb: fix fish completion install location
It looks like `completions` is owned by Fish itself. Third party
completions should go in `vendor_completions.d`.

Fixes #1485
2020-02-17 17:16:28 -05:00
Jakub Wieczorek
322fc75a3d ignore: make walker visit untraversable directories
This commit fixes an inconsistency between the serial and the parallel
directory walkers around visiting a directory for which the user holds
insufficient permissions to descend into.

The serial walker does produce a successful entry for a directory that
it cannot descend into due to insufficient permissions. However, before
this change that has not been the case for the parallel walker, which
would produce an `Err` item not only when descending into a directory
that it cannot read from but also for the directory entry itself.

This change brings the behaviour of the parallel variant in line with
that of the serial one.

Fixes #1346, Closes #1365
2020-02-17 17:16:28 -05:00
Jakub Wieczorek
b435eaafc8 grep-regex: fix inner literal extraction bug
This appears to be another transcription bug from copying this code from
the prefix literal detection from inside the regex crate. Namely, when
it comes to inner literals, we only want to treat counted repetition as
two separate cases: the case when the minimum match is 0 and the case
when the minimum match is more than 0. In the former case, we treat
`e{0,n}` as `e*` and in the latter we treat `e{m,n}` where `m >= 1` as
just `e`.

We could definitely do better here. e.g., This means regexes like
`(foo){10}` will only have `foo` extracted as a literal, where searching
for the full literal would likely be faster.

The actual bug here was that we were not implementing this logic
correctly. Namely, we weren't always "cutting" the literals in the
second case to prevent them from being expanded.

Fixes #1319, Closes #1367
2020-02-17 17:16:28 -05:00
Ed Page
f8e70294d5 ignore: allow post-processing at end-of-thread
On top of the parallel-walk's closures, this provides a Visitor API.
This clarifies the role of the two different closures in the `run`
API and allows implementing of `Drop` for post-processing once traversal
is finished.

The closure API is maintained not just for compatibility but also
convinience for simple cases.

Fixes #469, Closes #1430
2020-02-17 17:16:28 -05:00
Ed Page
578e2d47a8 core: simplify parallel walking using borrows
This changes ripgrep to use ignore's new support for borrowing data when
walking in parallel.
2020-02-17 17:16:28 -05:00
Ed Page
9f7c2ebc09 ignore: allow parallel walker to borrow data
This makes it so the caller can more easily refactor from
single-threaded to multi-threaded walking. If they want to support both,
this makes it easier to do so with a single initialization code-path. In
particular, it side-steps the need to put everything into an `Arc`.

This is not a breaking change because it strictly increases the number
of allowed inputs to `WalkParallel::run`.

Closes #1410, Closes #1432
2020-02-17 17:16:28 -05:00
Andrew Gallant
5c1eac41a3 changelog: highlight a bad performance regression 2020-02-17 17:16:28 -05:00
Johannes Altmanninger
6f2b79f584 ignore: use git commondir for sourcing .git/info/exclude
Git looks for this file in GIT_COMMON_DIR, which is usually the same
as GIT_DIR (.git). However, when searching inside a linked worktree,
.git is usually a file that contains the path of the actual git dir,
which in turn contains a file "commondir" which references the directory
where info/exclude may reside, alongside other configuration shared across
all worktrees. This directory is usually the git dir of the main worktree.

Unlike git this does *not* read environment variables GIT_DIR and
GIT_COMMON_DIR, because it is not clear how to interpret them when
searching multiple repositories.

Fixes #1445, Closes #1446
2020-02-17 17:16:28 -05:00
Andrew Gallant
0c3b673e4c cli: make ripgrep work in non-existent directories
It turns out that querying the CWD while in a directory that no longer
exists results in an error. Since the CWD is queried every time ripgrep
starts---whether it needs it or not---for dealing with glob matching,
ripgrep winds up being completely useless inside a non-existent
directory.

We fix this in a few different ways:

* Firstly, if std::env::current_dir() fails, then we fall back to trying
  to read the `PWD` environment variable.
* If that fails, that we return a more sensible error message so that a
  user can at least react to the problem. Previously, the error message
  was inscrutable.
* Finally, we try to avoid the problem altogether by building empty glob
  matchers if not globs were provided, thus side-stepping querying the
  CWD completely.

Fixes #1291, Closes #1400
2020-02-17 17:16:28 -05:00
Naveen Nathan
297b428c8c cli: add --no-ignore-exclude flag
This commit adds a new --no-ignore-exclude flag that permits disabling
the use of .git/info/exclude filtering. Local exclusions are manual
configurations to a repository and are not shared, so it is sometimes
useful to disable to get a consistent view of a repository.

This also adds a new section to the man page that describes automatic
filtering.

Closes #1420
2020-02-17 17:16:28 -05:00
Manfred Endres
804b43ecd8 globset: implement FromStr for Glob
The `globset::Glob` type [`new`] function creates a new value with an
`&str` parameter which returns an `Result<Glob, Error>` object. This is
exactly what [`std::str::FromStr::from_str`][`std::str::FromStr`] defines.
Libraries like [`clap`] use [`std::str::FromStr`] to create objects from
provided commandline arguments. This change makes this library usable
without a newtype wrapper.

[`std::str::FromStr`]: 	https://doc.rust-lang.org/std/str/trait.FromStr.html
[`clap`]:		https://docs.rs/clap/2.33.0/clap/macro.value_t.html
[`new`]:		https://docs.rs/globset/0.4.4/globset/struct.Glob.html#method.new

Closes #1447
2020-02-17 17:16:28 -05:00
Lucien Greathouse
2263b8ac92 globset: add GlobMatcher::glob
This exposes the underlying `Glob` used to compile the matcher. This can
be useful for wrapping up the glob matcher in other types.

Closes #1454
2020-02-17 17:16:28 -05:00
Andrew Gallant
cd8ec38a68 grep-regex: add fast path for -w/--word-regexp
Previously, ripgrep would always defer to the regex engine's capturing
matches in order to implement word matching. Namely, ripgrep would
determine the correct match offsets via a capturing group, since the
word regex is itself generated from the user supplied regex.

Unfortunately, the regex engine's capturing mode is still fairly slow,
so this commit adds a fast path to avoid capturing mode in the vast
majority of cases. See comments in the code for details.
2020-02-17 17:16:28 -05:00
Andrew Gallant
6a0e0147e0 grep-regex: improve literal detection with -w
When the -w/--word-regexp was used, ripgrep would in many cases fail to
apply literal optimizations. This occurs specifically when the regex
given by the user is an alternation of literals with no common prefixes
or suffixes, e.g.,

    rg -w 'foo|bar|baz|quux'

In this case, the inner literal detector fails. Normally, this would
result in literal prefixes being detected by the regex engine. But
because of the -w/--word-regexp flag, the actual regex that we run ends
up looking like this:

    (^|\W)(foo|bar|baz|quux)($|\W)

which of course defeats any prefix or suffix literal optimizations in
the regex crate's somewhat naive extractor. (A better extractor could
still do literal optimizations in the above case.)

So this commit fixes this by falling back to prefix or suffix literals
when they're available instead of prematurely giving up and assuming the
regex engine will do the rest.
2020-02-17 17:16:28 -05:00
Andrew Gallant
ad97e9c93f grep-regex: improve inner literal detection
This fixes an interesting performance bug where the inner literal
extractor would sometimes choose a sub-optimal literal. For example,
consider the regex:

    \x20+Sherlock Holmes\x20+

(The `\x20` is the ASCII code for a space character, which we use here
to just make it clearer. It otherwise does not matter.)

Previously, this would see the initial \x20 and then stop collecting
literals after the `+` repetition operator. This was because the inner
literal detector was adapter from the prefix literal detector, which had
to stop here. Namely, while \x20S would be a valid prefix (for example),
\x20\x20S would also be a valid prefix. As would \x20\x20\x20S and so
on. So the prefix detector would have to stop at the repetition
operator. Otherwise, only searching for \x20S could potentially scan
farther then the starting position of the next match.

However, for inner literals, this calculus no longer makes sense. We can
freely search for, e.g., \x20S without missing matches that start with
\x20\x20S precisely because we know this is an inner literal which may
not correspond to the start of a match.

With this fix, the literal that is now detected is

    \x20Sherlock Holmes\x20

Which is much better. We achieve this by no longer "cutting" literals
after seeing a `+` repetition operator. Instead, we permit literals to
continue to be extended.

The reason why this is important is because using \x20 as the literal to
search for is generally bad juju since it is so common. In fact, we
should probably add more logic here to either avoid such things or give
up entirely on the inner literal optimization if it detected a literal
that we think is very common. But we punt on such things here.
2020-02-17 17:16:28 -05:00
Robert Irelan
24f8a3e5ec doc: document all file type
This adds it to the guide and the docs for the --type flag.

Fixes #1344, Closes #1472
2020-02-17 17:16:28 -05:00
Mikko Vedru
1bdb767851 doc: improve docs for --sort and --sortr flags
I improved the help documentation in the following manner and for the
following reasons:

1. It's only logical to put the default sub-option on the first possible
line, as well as to separately mention that it is indeed the default
sub-option.

2. Additional options for the flags should describe the main points of
their purpose without requiring user to read the whole help entry. In my
opinion, the information sub-options' influence on multi-threading and
speed are important enough to warrant their inclusion in each
sub-option's description line text.

Closes #1434
2020-02-17 17:16:28 -05:00
Andreas Stieger
a4897eca23 readme: simplify openSUSE instructions
Closes #1436
2020-02-17 17:16:28 -05:00
Collin Styles
a070722ff2 cli: add --include-zero flag
This flag, when used in conjunction with --count or --count-matches,
will print a result for each file searched even if there were zero
matches in that file. This is off by default but can be enabled to make
ripgrep behave more like grep.

This also clarifies some of the defaults for the
grep-printer::SummaryBuilder type.

Closes #1370, Closes #1405
2020-02-17 17:16:28 -05:00
Matěj Cepl
4628d77808 ignore/types: add spec file type
This is for RPM package SPEC files.

Fixes #946, Closes #1449
2020-02-17 17:16:28 -05:00
Ximin Luo
f8418c6a52 explicitly declare lazy_static dependency
`benches/bench.rs` uses lazy_static but Cargo.toml does not declare a
dependency on it. This causes rustc to use its own internal private
copy instead. Sometimes this causes unintuitive errors like this Debian
bug: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=942243

The underlying issue is https://github.com/rust-lang/rust#27812 but it
can be avoided by explicitly declaring the dependency, which you are
supposed to do anyways.

Closes #1435
2020-02-17 17:16:28 -05:00
luh2
040ca45ba0 ignore/types: add xhtml to xml file type
Closes #1426
2020-02-17 17:16:28 -05:00
Andrew Gallant
91470572cd changelog: add notes about new file types 2020-02-17 17:16:28 -05:00
Sven-Hendrik Haase
027adbf485 ignore/types: add 'diff' file type
This includes .patch and .diff files.

Fixes #1418, Closes #1419
2020-02-17 17:16:28 -05:00
Mohammad AlSaleh
e71eedf0eb cli: add --no-context-separator flag
--context-separator='' still adds a new line separator, which could
still potentially be useful. So we add a new `--no-context-separator`
flag that completely disables context separators even when the -A/-B/-C
context flags are used.

Closes #1390
2020-02-17 17:16:28 -05:00
Andrew Gallant
88f46d12f1 tests: remove existing test directory
I'm surprised this wasn't caught until now, but if a test directory
already exists, then it was reused. This can result in hard to debug
problems with tests when, e.g., file names are changed and a recursive
search is executed.
2020-02-17 17:16:28 -05:00
sharkdp
a18cf6ec39 ignore: add existence check for ignore files
This commit adds a simple `.exists()` check for `.gitignore`,
`.ignore`, and other similar files before actually calling
`File::open(…)` in `GitIgnoreBuilder::add`.

The reason is that a simple existence check via `stat` can be faster
than actually trying to `open` the file, see
https://stackoverflow.com/a/12774387/704831. As we typically expect(?)
the number of directories *without* ignore files to be much larger
than the number of directories *with* ignore files, this leads to an
overall speedup.

The performance gain is not huge for `rg`, but can be quite significant
if more `.gitignore`-like files are added via
`add_custom_ignore_filename`. The speedup is *larger* for folders with
*low* files-per-directory ratios.

Note though that we do not do this check on Windows until a specific
analysis there suggests this is beneficial. Namely, Windows generally
has slower file system operations, so it's not clear whether this
speculative check is actually a benefit or not.

Benchmark results
-----------------

`rg --files` in my home folder (200k results, 6.5 files per directory):

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./rg-master --files` | 396.4 ± 3.2 | 390.9 | 400.0 | 1.05 |
| `./rg-feature --files` | 376.0 ± 3.6 | 369.3 | 383.5 | 1.00 |

`rg --files --hidden` in my home folder (800k results, 5.4
files per directory)

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `./rg-master --files --hidden` | 1.575 ± 0.012 | 1.560 | 1.597 | 1.06 |
| `./rg-feature --files --hidden` | 1.479 ± 0.011 | 1.464 | 1.496 | 1.00 |

`rg --files` in the chromium-79.0.3915.2 source tree (300k results, 12.7 files per
directory)

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `~/rg-master --files` | 445.2 ± 5.3 | 435.6 | 453.0 | 1.04 |
| `~/rg-feature --files` | 428.9 ± 7.0 | 418.2 | 440.0 | 1.00 |

`rg --files` in the linux-5.3 source tree (65k results, 15.1
files per directory)

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./rg-master --files` | 94.5 ± 1.9 | 89.8 | 98.5 | 1.02 |
| `./rg-feature --files` | 92.6 ± 2.7 | 88.4 | 98.7 | 1.00 |

Closes #1381
2020-02-17 17:16:28 -05:00
Gibson Fahnestock
c78c3236a8 readme: remove outdated SIMD info
Looks like the upstream brew Formula [0][] now has SIMD support, so
remove the extraneous info now that the custom tap is no longer needed
[1][].

[0]: https://github.com/Homebrew/homebrew-core/blob/master/Formula/ripgrep.rb
[1]: f3083e4574

PR #1431
2020-02-15 17:19:22 -05:00
Sorin Sbarnea
7cf21600cd readme: document CentOS 8 support
ripgrep install instructions are valid even for the 7 version. The tool
works without problems on these too.

PR #1428
2020-02-15 17:16:57 -05:00
Jonathan Mast
647b0d3977 ignore/types: add HAML and ERB
These are commonly used templating languages for Ruby, add their
extensions to the filetypes list for convenient filtering.

PR #1407
2020-02-15 09:18:32 -05:00
Jeff S
e572fc1683 ignore/types: add slim, slime, and skim templates
PR #1391
2020-02-15 09:17:46 -05:00
Andrew Gallant
9cb93abd11 ignore: allow use of Error::description
We can remove it in the next semver incompatible release.
2020-02-10 06:44:21 -05:00
Luca Kredel
41695c66fa ignore/types: add typoscript file type
Add the file types for TypoScript - the configuration language of the
TYPO3 CMS.

PR #1477
2020-02-07 08:41:00 -05:00
Andrew Gallant
cb0dfda936 faq: add section about donations
This is asked often enough that it's worth having a canonical answer.
2020-02-05 13:09:11 -05:00
Andrew Gallant
74d1fe59e9 deps: update everything 2020-01-30 18:33:40 -05:00
Andrew Gallant
9fd1e202e0 deps: update regex, regex-syntax and aho-corasick
Notably, this brings in a bug fix reported by @okdana:
https://github.com/rust-lang/regex/issues/640
2020-01-30 18:32:56 -05:00
Robert Irelan
e76807b1b5 ignore/types: add *.org_archive to org file type
.org_archive is the default extension for Org archive files, created when
entries from an Org-mode file are archived (see
<https://orgmode.org/org.html#Moving-subtrees>). These files are still in Org
mode format, so it's worth searching them at the same time as non-archive Org
mode files.

PR #1475
2020-01-29 13:59:34 -05:00
Andrew Gallant
f8fb65f7e3 globset: fix benchmarks
There were apparently a lot of unused things, including lazy_static.
2020-01-27 16:45:12 -05:00
Tristan Waddington
98de8d248a ignore/types: make 'gradle' it's own type
This change maintains the existing behavior of the 'groovy' type, which
includes both .groovy and .gradle files.

PR #1470
2020-01-23 06:51:11 -05:00
Crestwave
c358700dfb readme: add instructions for Haiku x86_64 and x86_gcc2
PR #1465
2020-01-21 07:34:24 -05:00
Alex Touchet
8670a4a969 readme: update outdated links
PR #1463
2020-01-21 07:32:54 -05:00
Oliver Newman
e3b1f86908 doc: add missing "will" to the user guide
PR #1462
2020-01-20 17:26:08 -05:00
Jan Verbeek
46b07bb2ee ignore/types: fix postscript globs
The postscript globs were missing asterisks, so they were treated as
literal filenames.

PR #1461
2020-01-20 07:48:57 -05:00
Andrew Gallant
8bdf84e3a8 deps: update everything 2020-01-16 19:47:23 -05:00
Andrew Gallant
5a6e17fcc1 deps: various updates
Most of these updates (sans thread_local) are from crates I maintain
that have seen updates recently.

Notably, this includes a bump to `termcolor 1.1.0` which includes
support for respecting `NO_COLOR`. This commit therefore means that
ripgrep now supports `NO_COLOR`.

As an added bonus, we drop a dependency on Windows. (Although the total
amount of code compiled remains the same.)

Closes #1186
2020-01-11 10:09:10 -05:00
Andrew Gallant
00bfcd14a6 ignore-0.4.11 2020-01-10 15:08:27 -05:00
Andrew Gallant
bf0ddc4675 ci: fix musl docker build
Looks like the old japaric images are bunk. We update our docker image
to be based on the new rustembedded images and configure cross to use
it.

Turns out that this wasn't due to a stale docker image, but rather, a
bug in cross: https://github.com/rust-embedded/cross/issues/357
We work around that bug by installing the master branch of cross. Sigh.
2020-01-10 15:07:47 -05:00
Andrew Gallant
0fb3f6a159 ci: disable github actions for now
The CI build failures are annoying and distracting. Hopefully soon I'll
be able to invest more time in the switch.
2020-01-10 15:07:47 -05:00
Andrew Gallant
837fb5e21f deps: update to crossbeam-channel 0.4
Closes #1427
2020-01-10 15:07:47 -05:00
Andrew Gallant
2e1815606e deps: update to bytecount 0.6
Looks like there aren't any major changes other than dependency updates.
2020-01-10 15:07:47 -05:00
Andrew Gallant
cb2f6ddc61 deps: update to thread_local 1.0
We also update the pcre2 and regex dependencies, which removes any other
lingering uses of thread_local 0.3.
2020-01-10 15:07:47 -05:00
Andrew Gallant
bd7a42602f deps: bump to base64 0.11 2020-01-10 15:07:47 -05:00
Andrew Gallant
528ce56e1b deps: run cargo update
The only new dependency is an unused target specific dependency hermit
via the atty crate.
2020-01-10 15:07:47 -05:00
Yevgen Antymyrov
8892bf648c doc: fix typo in FAQ 2019-09-25 08:13:27 -04:00
Jonathan Clem
8cb7271b64 ci: get GitHub Actions running again
Basically, matrix.os needs to be defined for every build. We
were commenting out some of the builds in order to debug
CI in the `include` section, but we also need to comment them
out in the `build section.
2019-09-11 09:08:24 -04:00
Andrew Gallant
4858267f3b ci: initial github actions config 2019-08-31 09:24:44 -04:00
Andrew Gallant
5011dba2fd ignore: remove unused parameter 2019-08-28 20:21:34 -04:00
Andrew Gallant
e14f9195e5 deps: update everything 2019-08-28 20:18:47 -04:00
Andrew Gallant
ef0e7af56a deps: update bstr to 0.2.7
The new bstr release contains a small performance bug fix where some
trivial methods weren't being inlined.
2019-08-11 10:41:05 -04:00
Todd Walton
b266818aa5 doc: use XDG_CONFIG_HOME in comments
XDG_CONFIG_DIR does not actually exist.

PR #1347
2019-08-09 13:37:37 -04:00
LawAbidingCactus
81415ae52d doc: update to reflect glob matching behavior change
Specifically, paths contains a `/` are not allowed to match any
other slash in the path, even as a prefix. So `!.git` is the correct
incantation for ignoring a `.git` directory that occurs anywhere 
in the path.
2019-08-07 13:47:18 -04:00
Andrew Gallant
5c4584aa7c grep-regex-0.1.5 2019-08-06 09:51:13 -04:00
Andrew Gallant
0972c6e7c7 grep-searcher-0.1.6 2019-08-06 09:50:52 -04:00
Andrew Gallant
0a372bf2e4 deps: update ignore 2019-08-06 09:50:35 -04:00
Andrew Gallant
345124a7fa ignore-0.4.10 2019-08-06 09:47:45 -04:00
Andrew Gallant
31807f805a deps: drop tempfile
We were only using it to create temporary directories for `ignore`
tests, but it pulls in a bunch of dependencies and we don't really need
randomness. So just use our own simple wrapper instead.
2019-08-06 09:46:05 -04:00
Andrew Gallant
4de227fd9a deps: update everything
Mostly this just updates regex and its assorted dependencies. This does
drop utf8-ranges and ucd-util, in accordance with changes to
regex-syntax and regex.
2019-08-05 13:50:55 -04:00
jimbo1qaz
d7ce274722 readme: Debian Buster is stable now
PR #1338
2019-08-04 08:06:10 -04:00
Andrew Gallant
5b10328f41 changelog: update with bug fix 2019-08-02 07:37:27 -04:00
Andrew Gallant
813c676eca searcher: fix roll buffer bug
This commit fixes a subtle bug in how the line buffer was rolling its
contents. Specifically, when ripgrep searches without memory maps,
it uses a "roll" buffer for incremental line oriented search without
needing to read the entire file into memory at once. The roll buffer
works by reading a chunk of bytes from the file into memory, and then
searching everything in that buffer up to the last `\n` byte. The bytes
*after* the last `\n` byte are preserved, since they likely correspond
to *part* of the next line. Once ripgrep is done searching the buffer,
it "rolls" the buffer such that the start of the next line is at the
beginning of the buffer, and then ripgrep reads more data into the
buffer starting at the (possibly) partial end of that line.

The implication of this strategy, necessarily so, is that a buffer must
be big enough to fit a single line in memory. This is because the regex
engine needs a contiguous block of memory to search, so there is no way
to search anything smaller than a single line. So if a file contains a
single line with 7.5 million bytes, then the buffer will grow to be at
least that size. (Many files have super long lines like this, but they
tend to be *binary* files, which ripgrep will detect and stop searching
unless the user forces it with the `-a/--text` flag. So in practice,
they aren't usually a problem. However, in this case, #1335 found a case
where a plain text file had a line with 7.5 million bytes.)

Now, for performance reasons, ripgrep reuses these buffers across its
search. Typically, it will create `N` of these line buffers when it
starts (where `N` is the number of threads it is using), and then reuse
them without creating any new ones as it searches through files.

This means that if you search a file with a very long line, that buffer
will expand to be big enough to store that line. ripgrep never contracts
these buffers, so once it searches the next file, ripgrep will continue
to use this large buffer. While it might be prudent to contract these
buffers in some circumstances, this isn't otherwise inherently a
problem. The memory has already been allocated, and there isn't much
cost to using it, other than the fact that ripgrep hangs on to it and
never gives it back to the OS.

However, the `roll` implementation described above had a really
important bug in it that was impacted by the size of the buffer.
Specifically, it used the following to "roll" the partial line at the
end of the buffer to the beginning:

    self.buf.copy_within_str(self.pos.., 0);

Which means that if the buffer is very large, ripgrep will copy
*everything* from `self.pos` (which might be very small, e.g., for small
files) to the end of the buffer, and move it to the beginning of the
buffer. This will happen repeatedly each time the buffer is used to
search small files, which winds up being quite a large slow down if the
line was exceptionally large (say, megabytes).

It turns out that copying everything is completely unnecessary. We only
need to copy the remainder of the last read to the beginning of the
buffer. Everything *after* the last read in the buffer is just free
space that can be filled for the next read. So, all we need to do is
copy just those bytes:

    self.buf.copy_within_str(self.pos..self.end, 0);

... which is typically much much smaller than the rest of the buffer.

This was likely also causing small performance losses in other cases as
well. For example, when searching a lot of small files, ripgrep would
likely do a lot more copying than necessary. Although, given that the
default buffer size is 8KB, this extra copying was likely pretty small,
and was thus harder to observe.

Fixes #1335
2019-08-02 07:23:27 -04:00
Andrew Gallant
f625d72b6f pkg: update brew tap to 11.0.2 2019-08-01 19:39:53 -04:00
Andrew Gallant
3de31f7527 ci: fix musl deployment
The docker image that the Linux binary is now built in does not have
ASCII doc installed, so setup Cross to point to my own image with those
tools installed.
2019-08-01 18:41:44 -04:00
Andrew Gallant
e402d6c260 ripgrep: release 11.0.2 2019-08-01 18:02:15 -04:00
Andrew Gallant
48b5bdc441 src: remove old directories
termcolor has had its own repository for a while now. No need for these
redirects any more.
2019-08-01 17:49:28 -04:00
Andrew Gallant
709ca91f50 ignore: release 0.4.9 2019-08-01 17:48:37 -04:00
Andrew Gallant
9c220f9a9b grep-regex: release 0.1.4 2019-08-01 17:47:45 -04:00
Andrew Gallant
9085bed139 grep-matcher: release 0.1.3 2019-08-01 17:46:59 -04:00
Andrew Gallant
931ab35f76 changelog: start work on 11.0.2 release 2019-08-01 17:42:38 -04:00
Andrew Gallant
b5e5979ff1 deps: update everything
This drops `spin` and `autocfg`, yay.
2019-08-01 17:42:38 -04:00
Andrew Gallant
052c857da0 doc: mention .ignore and .rgignore more prominently
Fixes #1284
2019-08-01 17:37:46 -04:00
Andrew Gallant
5e84e784c8 doc: add translations section
We note that they may not be up to date and are unofficial.

Fixes #1246
2019-08-01 17:37:46 -04:00
Andrew Gallant
01e8e11621 doc: improve PCRE2 failure mode documentation
If a user tries to search for an explicit `\n` character in a PCRE2
regex, ripgrep won't report an error and instead will (likely) silently
fail to match.

Fixes #1261
2019-08-01 17:32:44 -04:00
Ninan John
9268ff8e8d ripgrep: fix bug when CWD has directory named -
Specifically, when searching stdin, if the current directory has a
directory named `-`, then the `--with-filename` flag would automatically
be turned on. This is because `--with-filename` is automatically enabled
when ripgrep is given a single path that is a directory. When ripgrep is
given empty arguments, and if it is searching stdin, then its default
path list is just simple `["-"]`. The `is_dir` check passes, and
`--with-filename` gets enabled.

This commit fixes the problem by checking whether the path is `-` first.
If so, then we assume it isn't a directory. This is fine, since if it is
a directory and one asks to search it explicitly, then ripgrep will
interpret `-` as stdin anyway (which is arguably a bug on its own, but
probably not one worth fixing).

Fixes #1223, Closes #1292
2019-08-01 17:27:23 -04:00
dana
c2cb0a4de4 ripgrep: add --glob-case-insensitive
This flag forces -g/--glob patterns to be treated case-insensitively, as with
--iglob patterns.

Fixes #1293
2019-08-01 17:08:58 -04:00
Andrew Gallant
adb9332f52 regex: fix -F aho-corasick optimization
It turns out that when the -F flag was used, if any of the patterns
contained a regex meta character (such as `.`), then we winded up
escaping the pattern first before handing it off to Aho-Corasick, which
treats all patterns literally.

We continue to apply band-aides here and just avoid Aho-Corasick if
there is an escape in any of the literal patterns. This is unfortunate,
but making this work better requires more refactoring, and the right
solution is to get this optimization pushed down into the regex engine.

Fixes #1334
2019-08-01 16:58:12 -04:00
Matthew Davidson
bc37c32717 ignore/types: add edn type from Clojure ecosystem
PR #1330
2019-07-29 16:43:28 -04:00
Andrew Gallant
08ae4da2b7 deps: update them
There are some nice removals. It looks like rand has slimmed down, and
smallvec is gone now as well.
2019-07-25 07:52:33 -04:00
Andrew Gallant
7ac95c1f50 deps: bump ignore 2019-07-24 12:56:47 -04:00
Andrew Gallant
7a6903bd4e ignore-0.4.8 2019-07-24 12:56:01 -04:00
Tiziano Santoro
9801fae29f ignore: support compilation on wasm
Currently the crate assumes that exactly one of `cfg(windows)` or
`cfg(unix)` is true, but this is not actually the case, for instance
when compiling for `wasm32`.

Implement the missing functions so that the crate can compile on other
platforms, even though those functions will always return an error.

PR #1327
2019-07-24 12:55:37 -04:00
Miloš Stojanović
abdf7140d7 readme: fix broken link to Scoop bucket
PR #1324
2019-07-20 12:03:46 -04:00
Conrad Olega
b83e7968ef ignore/types: add Robot Framework
PR #1322
2019-07-14 08:12:34 -04:00
Hugo Locurcio
8ebc113847 doc: improve docs for --replace flag
Specifically, we document shell-specific caveats related to the `--replace`
flag.

PR #1318
2019-07-04 11:42:35 -04:00
Andrew Gallant
785c1f1766 release: globset, grep-cli, grep-printer, grep-searcher 2019-06-26 16:53:30 -04:00
Andrew Gallant
8b734cb490 deps: update everything 2019-06-26 16:51:06 -04:00
Andrew Gallant
b93762ea7a bstr: update everything to bstr 0.2 2019-06-26 16:47:33 -04:00
Andrew Gallant
34677d2622 search: a few small touchups 2019-06-18 20:23:47 -04:00
Andrew Gallant
d1389db2e3 search: better errors for preprocessor commands
If a preprocessor command could not be started, we now show some
additional context with the error message. Previously, it showed
something like this:

  some/file: No such file or directory (os error 2)

Which is itself pretty misleading. Now it shows:

  some/file: preprocessor command could not start: '"nonexist" "some/file"': No such file or directory (os error 2)

Fixes #1302
2019-06-16 19:02:02 -04:00
Andrew Gallant
50bcb7409e deps: update everything 2019-06-16 18:38:45 -04:00
Andrew Gallant
7b9972c308 style: fix deprecations
Use `dyn` for trait objects and use `..=` for inclusive ranges.
2019-06-16 18:37:51 -04:00
Hitesh Jasani
9f000c2910 ignore/types: add more nim types
PR #1297
2019-06-12 14:02:28 -04:00
skierpage
392682d352 doc: point regex doc link to the latest version
The latest doc is different, e.g. adds "symmetric differences" under
https://docs.rs/regex/*/regex/#character-classes

PR #1287
2019-06-01 08:44:55 -04:00
Andrew Gallant
7d3f794588 ignore: remove .git check in some cases
When we know we aren't going to process gitignores, we shouldn't waste
the syscall in every directory to check for a git repo.
2019-05-29 18:06:11 -04:00
bruce-one
290fd2a7b6 readme: mention Zstandard and Brotli
Also alphabetise the list.

PR #1288
2019-05-29 13:37:31 -04:00
Fabian Würfl
d1e4d28f30 readme: remove outdated statement
Issue #10 already states that "ripgrep is now in most or all of the major
package repositories."

PR #1280
2019-05-14 18:44:50 -04:00
Andrew Gallant
5ce2d7351d ci: use cross for musl x86_64 builds
This is necessary because jemalloc + musl + Ubuntu 16.04 is apparently
broken.

Moreover, jemalloc doesn't support i686, so we accept the performance
regression there.

See also: https://github.com/gnzlbg/jemallocator/issues/124
2019-04-25 11:12:14 -04:00
Andrew Gallant
9dcfd9a205 deps: bump pcre2-sys to 0.2.1
This brings in a bug fix that no longer tries to run `git` to update the
submodule if the `git` command doesn't exist.

This is useful is more restricted build contexts where `git` isn't
installed. Such as in the docker image used for running `cross`.
2019-04-25 11:12:14 -04:00
Andrew Gallant
36b276c6d0 printer: remove unnecessary mut 2019-04-24 17:22:27 -04:00
Andrew Gallant
03bf37ff4a alloc: use jemalloc when building with musl
It turns out that musl's allocator is slow enough to cause a fairly
noticeable performance regression when ripgrep is built as a static
binary with musl. We fix this by using jemalloc when building with musl.

We continue to use the default system allocator in all other scenarios.
Namely, glibc's allocator doesn't noticeably regress performance compared
to jemalloc. But we could add more targets to this logic if other
system allocators (macOS, Windows) prove to be slow.

This wasn't necessary before because rustc recently stopped using jemalloc
by default.

Fixes #1268
2019-04-24 17:21:38 -04:00
Andrew Gallant
e7829c05d3 cli: fix bug where last byte was stripped
In an effort to strip line terminators, we assumed their existence. But
a pattern file may not end with a line terminator, so we shouldn't
unconditionally strip them.

We fix this by moving to bstr's line handling, which does this for us
automatically.
2019-04-19 07:11:44 -04:00
Rory O’Kane
a6222939f9 readme: mention --pcre2 as long form of -P
This is for consistency with the short and long flags given in other
bullet points. I originally assumed there was no long flag for `-P`
because none was given here.

PR #1254
2019-04-16 21:22:48 -04:00
Rory O’Kane
6ffd434232 readme: mention --auto-hybrid-regex in advantages
This feature solves a major reason I was skeptical of using ripgrep, so
I think it’s good to mention it in the section about why one should use
it.

I use backreferences a lot, so I had previously thought that ripgrep
would provide no speed advantage over ag, since I would always have
`-P` enabled. But when I saw `--auto-hybrid-regex` in the 11.0.0
changelog, I learned that ripgrep can use it to speed up simple queries
while still allowing me to write backreferences.

PR #1253
2019-04-16 17:21:40 -04:00
Andrew Gallant
1f1cd9b467 pkg: update brew tap to 11.0.1 2019-04-16 13:39:56 -04:00
Andrew Gallant
973de50c9e ripgrep: release 11.0.1, take 2 2019-04-16 13:11:28 -04:00
Andrew Gallant
5f8805a496 ripgrep: release 11.0.1 2019-04-16 13:10:29 -04:00
Andrew Gallant
fdde2bcd38 deps: update regex to 1.1.6
This brings in a fix for a regression introduced in ripgrep 11.

Fixes #1247
2019-04-16 08:34:30 -04:00
Gerard de Melo
7b3fe6b325 doc: fix typo in FAQ
PR #1248
2019-04-16 08:32:30 -04:00
Max Horn
b3dd3ae203 ignore/types: add GAP
Add support for file types used by the GAP language, a research system
computational discrete algebra, see <https://www.gap-system.org>

PR #1249
2019-04-16 08:31:58 -04:00
Andrew Gallant
f3083e4574 readme: remove brew tap instructions
The brew tap isn't really needed any more, since SIMD is now
automatically enabled in all binaries.
2019-04-15 18:32:33 -04:00
Andrew Gallant
d03e30707e pkg: update brew tap to 11.0.0 2019-04-15 18:32:10 -04:00
Andrew Gallant
d7f57d9aab ripgrep: release 11.0.0 2019-04-15 18:09:40 -04:00
Andrew Gallant
1a2a24ea74 grep: release 0.2.4 2019-04-15 18:03:46 -04:00
Andrew Gallant
d66610b295 grep-cli: release 0.1.2 2019-04-15 18:02:44 -04:00
Andrew Gallant
019ae1989b grep-printer: release 0.1.2 2019-04-15 18:00:49 -04:00
Andrew Gallant
36d3f235dc grep-searcher: release 0.1.4 2019-04-15 17:59:22 -04:00
Andrew Gallant
79018eb693 grep-pcre2: release 0.1.3 2019-04-15 17:57:03 -04:00
Andrew Gallant
44cd344438 grep-regex: release 0.1.3 2019-04-15 17:56:04 -04:00
Andrew Gallant
e493e54b9b grep-matcher: release 0.1.2 2019-04-15 17:53:29 -04:00
Andrew Gallant
8e8215aa65 ignore: release 0.4.7 2019-04-15 17:50:37 -04:00
Andrew Gallant
3fe701498e doc: add note about --pre-glob
There was a performance warning in the --pre docs, but didn't mention
--pre-glob as a possible mitigation to it.
2019-04-15 17:47:48 -04:00
Andrew Gallant
e79085e9e4 release: globset 0.4.3 2019-04-15 14:07:03 -04:00
Andrew Gallant
764c197022 complete: fix typo 2019-04-15 07:04:57 -04:00
Andrew Gallant
ef1611b5f5 ripgrep: max-column-preview --> max-columns-preview
Credit to @okdana for catching this. This naming is a bit more
consistent with the existing --max-columns flag.
2019-04-15 06:51:51 -04:00
Andrew Gallant
45d12abbc5 changelog: small fixups 2019-04-14 20:21:55 -04:00
Andrew Gallant
5fde8391f9 changelog: backfill it
I went through every commit since the 0.10.0 release and added anything
that I thought was missing.
2019-04-14 20:04:01 -04:00
Marco Herrn
3edb11c513 ignore/types: add additional java files
- .jspx for XHTML JSP files
- .properties for Java Properties files (resource bundles, etc.)

Closes #1242
2019-04-14 19:38:24 -04:00
Andrew Gallant
ed144be775 ci: bump MSRV to 1.34.0 2019-04-14 19:29:27 -04:00
Andrew Gallant
967e7ad0de ripgrep: add --auto-hybrid-regex flag
This flag, when set, will automatically dispatch to PCRE2 if the given
regex cannot be compiled by Rust's regex engine. If both engines fail to
compile the regex, then both errors are surfaced.

Closes #1155
2019-04-14 19:29:27 -04:00
Andrew Gallant
9952ba2068 deps: update glob dev-dependency 2019-04-14 19:29:27 -04:00
Andrew Gallant
b751758d60 deps: update everything 2019-04-14 19:29:27 -04:00
Andrew Gallant
8f14cb18a5 ripgrep: increase pcre2's default JIT stack size
The default stack size is 32KB, and this increases it to 10MB. 32KB is
pretty paltry in the environments in which ripgrep runs, and 10MB is
easily afforded as a maximum size. (The size limit we set for Rust's
regex engine is considerably larger.)

This was motivated due to the fack that JIT stack limits have been
observed to be hit in the wild:
https://github.com/Microsoft/vscode/issues/64606
2019-04-14 19:29:27 -04:00
Andrew Gallant
da9d720431 ripgrep: add --pcre2-version flag
This flag will output details about the version of PCRE2 that ripgrep
is using (if any).
2019-04-14 19:29:27 -04:00
Andrew Gallant
a9d71a0368 pcre2: add a few re-exports
This adds the top-level is_jit_available and version free functions from
the underlying pcre2 crate, and also forwards the max_jit_stack_size
option.
2019-04-14 19:29:27 -04:00
Andrew Gallant
f3646242cc deps: use pcre2 0.2.0
This comes with PCRE 10.32 and a few new options we'll use in subsequent
commits.
2019-04-14 19:29:27 -04:00
Andrew Gallant
601f212a0b ripgrep: add -I as a short option for --no-filename
This flag is commonly used in pipelines and it can be annoying to write
it out every time you need it.

Ideally, we would use -h for this to match GNU grep, but -h is used to
print help output.

Closes #1185
2019-04-14 19:29:27 -04:00
Andrew Gallant
5a565354f8 versioning: next version will be ripgrep 11
This sets up the release announcement and briefly describes the
versioning change. The actual version change itself won't happen until
the release.

Closes #1172
2019-04-14 19:29:27 -04:00
Andrew Gallant
2a6532ae71 doc: note cases of exorbitant memory usage
Fixes #1189
2019-04-14 19:29:27 -04:00
Andrew Gallant
ece1f50cfe printer: support previews for long lines
This commit adds support for showing a preview of long lines. While the
default still remains as completely suppressing the entire line, this
new functionality will show the first N graphemes of a matching line,
including the number of matches that are suppressed.

This was unfortunately a fairly invasive change to the printer that
required a bit of refactoring. On the bright side, the single line
and multi-line coloring are now more unified than they were before.

Closes #1078
2019-04-14 19:29:27 -04:00
Andrew Gallant
a7d26c8f14 binary: rejigger ripgrep's handling of binary files
This commit attempts to surface binary filtering in a slightly more
user friendly way. Namely, before, ripgrep would silently stop
searching a file if it detected a NUL byte, even if it had previously
printed a match. This can lead to the user quite reasonably assuming
that there are no more matches, since a partial search is fairly
unintuitive. (ripgrep has this behavior by default because it really
wants to NOT search binary files at all, just like it doesn't search
gitignored or hidden files.)

With this commit, if a match has already been printed and ripgrep detects
a NUL byte, then it will print a warning message indicating that the search
stopped prematurely.

Moreover, this commit adds a new flag, --binary, which causes ripgrep to
stop filtering binary files, but in a way that still avoids dumping
binary data into terminals. That is, the --binary flag makes ripgrep
behave more like grep's default behavior.

For files explicitly specified in a search, e.g., `rg foo some-file`,
then no binary filtering is applied (just like no gitignore and no
hidden file filtering is applied). Instead, ripgrep behaves as if you
gave the --binary flag for all explicitly given files.

This was a fairly invasive change, and potentially increases the UX
complexity of ripgrep around binary files. (Before, there were two
binary modes, where as now there are three.) However, ripgrep is now a
bit louder with warning messages when binary file detection might
otherwise be hiding potential matches, so hopefully this is a net
improvement.

Finally, the `-uuu` convenience now maps to `--no-ignore --hidden
--binary`, since this is closer to the actualy intent of the
`--unrestricted` flag, i.e., to reduce ripgrep's smart filtering. As a
consequence, `rg -uuu foo` should now search roughly the same number of
bytes as `grep -r foo`, and `rg -uuua foo` should search roughly the
same number of bytes as `grep -ra foo`. (The "roughly" weasel word is
used because grep's and ripgrep's binary file detection might differ
somewhat---perhaps based on buffer sizes---which can impact exactly what
is and isn't searched.)

See the numerous tests in tests/binary.rs for intended behavior.

Fixes #306, Fixes #855
2019-04-14 19:29:27 -04:00
Andrew Gallant
bd222ae93f regex: fix HIR analysis bug
An alternate can be empty at this point, so we must handle it. We didn't
before because the regex engine actually disallows empty alternates,
however, this code runs before the regex compiler rejects the regex.
2019-04-14 19:29:27 -04:00
hupfdule
4359d8aac0 ignore/types: add more extensions for xml
This includes:

    *.dtd for Document Type Definitions
    *.xsl and *.xslt for XSL Transformation descriptions
    *.xsd for XML Schema definitions
    *.xjb for JAXB bindings
    *.rng for Relax NG files
    *.sch for Schematron files

PR #1243
2019-04-09 15:17:57 -04:00
tonypai
308819fb1f ignore/types: add lock files
Treat anything with a `.lock` extension as a lock file, with
an extra rule or two for special cases, e.g., package-lock.json.
2019-04-09 10:24:48 -04:00
Andrew Gallant
09108b7fda regex: make multi-literal searcher faster
This makes the case of searching for a dictionary of a very large number
of literals much much faster. (~10x or so.) In particular, we achieve this
by short-circuiting the construction of a full regex when we know we have
a simple alternation of literals. Building the regex for a large dictionary
(>100,000 literals) turns out to be quite slow, even if it internally will
dispatch to Aho-Corasick.

Even that isn't quite enough. It turns out that even *parsing* such a regex
is quite slow. So when the -F/--fixed-strings flag is set, we short
circuit regex parsing completely and jump straight to Aho-Corasick.

We aren't quite as fast as GNU grep here, but it's much closer (less than
2x slower).

In general, this is somewhat of a hack. In particular, it seems plausible
that this optimization could be implemented entirely in the regex engine.
Unfortunately, the regex engine's internals are just not amenable to this
at all, so it would require a larger refactoring effort. For now, it's
good enough to add this fairly simple hack at a higher level.

Unfortunately, if you don't pass -F/--fixed-strings, then ripgrep will
be slower, because of the aforementioned missing optimization. Moreover,
passing flags like `-i` or `-S` will cause ripgrep to abandon this
optimization and fall back to something potentially much slower. Again,
this fix really needs to happen inside the regex engine, although we
might be able to special case -i when the input literals are pure ASCII
via Aho-Corasick's `ascii_case_insensitive`.

Fixes #497, Fixes #838
2019-04-07 19:11:03 -04:00
Andrew Gallant
743d64f2e4 deps: update to clap 2.33 2019-04-06 10:35:08 -04:00
lesnyrumcajs
5962abc465 searcher: add option to disable BOM sniffing
This commit adds a new encoding feature where the -E/--encoding flag
will now accept a value of 'none'. When given this value, all encoding
related machinery is disabled and ripgrep will search the raw bytes of
the file, including the BOM if it's present.

Closes #1207, Closes #1208
2019-04-06 10:35:08 -04:00
dana
1604a18db3 ignore/types: add *.am and *.in for C/C++/make
PR #1205
2019-04-06 08:02:04 -04:00
luzpaz
9eeb0b01ce readme: add Repology badge
This adds a badge to the README.md file indicating to users that click
on it if their os/distro carries that latest version of ripgrep.

PR #1213
2019-04-06 08:00:40 -04:00
dana
df4400209a ripgrep: remove extra new-line after Clap output
PR #1222
2019-04-06 07:59:36 -04:00
Andrew Gallant
77439f99a4 deps: add bstr to Cargo.lock 2019-04-05 23:24:08 -04:00
Andrew Gallant
be7d6dd9ce regex: print out final regex in trace mode
This is useful for debugging to see what regex is actually being run.
We put this as a trace since the regex can be quite gnarly. (It is not
pretty printed.)
2019-04-05 23:24:08 -04:00
Andrew Gallant
9f15e3b671 regex: fix a perf bug when using -w flag
When looking for an inner literal to speed up searches, if only a prefix
is found, then we generally give up doing inner literal optimizations since
the regex engine will generally handle it for us. Unfortunately, this
decision was being made *before* we wrap the regex in (^|\W)...($|\W) when
using the -w/--word-regexp flag, which would then defeat the literal
optimizations inside the regex engine.

We fix this with a bit of a hack that says, "if we're doing a word regexp,
then give me back any literal you find, even if it's a prefix."
2019-04-05 23:24:08 -04:00
Andrew Gallant
254b8b67bb globset: small perf improvements
This tweaks the path handling functions slightly to make them a hair
faster. In particular, `file_name` is called on every path that ripgrep
visits, and it was possible to remove a few branches without changing
behavior.
2019-04-05 23:24:08 -04:00
Andrew Gallant
8a7f43b84d globset: use bstr
This simplifies the various path related functions and pushed more platform
dependent code down into bstr. This likely also makes things a bit more
efficient on Windows, since we now only do a single UTF-8 check for each
file path.
2019-04-05 23:24:08 -04:00
Andrew Gallant
d968a27ed5 cli: use bstr
This uses bstr in the unescaping logic. This lets us remove some platform
specific code, and also lets us remove a hacked UTF-8 decoder on raw
bytes.
2019-04-05 23:24:08 -04:00
Andrew Gallant
9b8f5cbaba config: switch to using bstrs
This lets us implement correct Unicode trimming and also simplifies the
parsing logic a bit. This also removes the last platform specific bits of
code in ripgrep core.
2019-04-05 23:24:08 -04:00
Andrew Gallant
c52da74ac3 printer: use bstr
This starts the usage of bstr in the printer. We don't use it too much
yet, but it comes in handy for implementing PrinterPath and lets us push
down some platform specific code into bstr.
2019-04-05 23:24:08 -04:00
Andrew Gallant
7dcbff9a9b searcher: partially migrate to bstr
This commit causes grep-searcher to use byte strings internally for its
line buffer support. We manage to remove a use of `unsafe` by doing this
(by pushing it down into `bstr`).

We stop short of using byte strings everywhere else because we rely
heavily on the `impl ops::Index<[u8]> for grep_matcher::Match` impl,
which isn't available for byte strings. (It is premature to make bstr a
public dep of a core crate like grep-matcher, but maybe some day.)
2019-04-05 23:24:08 -04:00
Andrew Gallant
bef1f0e770 ci: switch to xenial (#1234)
Rust is having problems with trusty, in particular, see this bug I
filed: https://github.com/rust-lang/rust/issues/59411

This was purpotedly fixed in
https://github.com/rust-lang/rust/pull/59468,
but it appears the issue is still occurring.

This commit tries to update to Ubuntu 16.04 in the hope that it will fix
this problem.
2019-04-03 19:52:34 -04:00
Andrew Gallant
cd9815cb37 deps: update to aho-corasick 0.7
We do the simplest possible change to migrate to the new version.

Fixes #1228
2019-04-03 13:51:26 -04:00
Andrew Gallant
3f22c3a658 deps: update everything
This updates all dependencies to their latest versions.

We tolerate a duplicative aho-corasick for now, which we will fix in the
next commit.
2019-04-03 13:07:26 -04:00
Andrew Gallant
0913972104 deps: bump encoding_rs_io
This brings in a new API for disabling BOM sniffing.

This is part of the work toward completing
https://github.com/BurntSushi/ripgrep/issues/1207
2019-03-03 16:36:34 -05:00
Andrew Gallant
f19b84fb23 regex: bump regex dep to fix match bug
See

* 661bf53d5b
* edf45e6f5f

for details on the bug fix, which was in the regex engine.

Fixes #1203
2019-02-27 17:42:14 -05:00
Andrew Gallant
59fc583aeb readme: include details about filtering
Despite the fact that we mention this in several places, people are
still surprised by ripgrep's "smart" filtering.
2019-02-27 08:01:23 -05:00
Andrew Gallant
1c7c4e6640 deps: update tempfile 2019-02-21 16:32:17 -05:00
Andrew Gallant
69c5e3938d deps: bump smallvec
This gets rid of the unmaintained crates `unreachable` and `void`. Yay!
2019-02-21 16:31:48 -05:00
Andrew Gallant
d9cf05ad50 deps: update to aho-corasick 0.6.10
This brings in a fix for this bug:
https://github.com/BurntSushi/aho-corasick/issues/37

Fixes #1079
2019-02-16 11:39:33 -05:00
Andrew Gallant
af8b6caebb deps: update various dependencies 2019-02-16 09:39:42 -05:00
Andrew Gallant
c84cfb6756 grep-regex-0.1.2 2019-02-16 09:30:06 -05:00
Andrew Gallant
895e26a000 ci: don't do releases on all tags
This attempts to make Appveyor more conservative in what tags it thinks
are releases. I don't know for sure, but it looks like the previous
regex could match anywhere, so we anchor it.

Fixes #1195
2019-02-10 12:51:56 -05:00
Andrew Gallant
8c95290ff6 deps: miscellaneous updates 2019-02-10 07:45:08 -05:00
Andrew Gallant
d6feeb7ff2 grep-searcher-0.1.3 2019-02-10 07:42:37 -05:00
Andrew Gallant
626ed00c19 searcher: revert big-endian patch
This undoes the patch to stop using bytecount on big-endian
architectures. In particular, we bump our bytecount dependency to the
latest release, which has a fix.

This reverts commit a4868b8835.

Fixes #1144 (again), Closes #1194
2019-02-10 07:40:32 -05:00
Andrew Gallant
332ad18401 tests: use const constructor for atomics
We did this in 05411b2b for core ripgrep, but didn't carry it over to
tests.
2019-02-09 16:27:25 -05:00
Andrew Gallant
fc3cf41247 grep-searcher-0.1.2 2019-02-09 16:13:07 -05:00
Andrew Gallant
a4868b8835 searcher: use naive line counting on big-endian
This patches out bytecount's "fast" vectorized algorithm on big-endian
machines, where it has been observed to fail. Going forward, bytecount
should probably fix this on their end, but for now, we take a small
performance hit on big-endian machines.

Fixes #1144
2019-02-09 16:13:07 -05:00
John Schmidt
f99b991117 ignore/types: add zig
PR #1191
2019-02-08 08:12:40 -05:00
Andrew Gallant
de0bc78982 deps: bump encoding_rs to 0.8.16
This brings in an updated `encoding_rs` crate that uses `packed_simd`,
which compiles on the latest nightly. Compilation times do appear to be
impacted significantly though.

Fixes #1175 (again)
2019-02-07 17:05:14 -05:00
Steffen Banhardt
147e96914c ignore/types: *.dtx and *.ins added for tex
PR #1182
2019-01-31 09:06:19 -05:00
Andrew Gallant
0abc40c23c readme: bump MSRV
We bumped it a while back in the CI configuration, but didn't update the
README.
2019-01-29 13:10:43 -05:00
Andrew Gallant
f768796e4f deps: update other deps 2019-01-29 13:08:56 -05:00
Andrew Gallant
da0c0c4705 deps: update to crossbeam-channel 0.3.8
This drops dependencies on parking_lot and rand from ripgrep.

(rand is still used for tests.)
2019-01-29 13:07:37 -05:00
Andrew Gallant
05411b2b32 deprecated: remove use of ATOMIC_BOOL_INIT
Our MSRV is high enough that we can use const functions now.
2019-01-29 13:05:16 -05:00
Andrew Gallant
cc93db3b18 cargo: include auto-generated message
This is going to be annoying for a while if one switches between the
latest nightly compiler and older compilers. Sigh.
2019-01-29 13:04:40 -05:00
Alex Macleod
049354b766 readme: remove EOL Fedora install instructions
Fedora 27 and below are past their EOL, so it can now be said that it's
supported regularly on Fedora.

PR #1177
2019-01-28 08:15:36 -05:00
Andrew Gallant
386dd2806d changelog: BUG #916
This was fixed by bumping the MSRV above Rust 1.28.

Fixes #916
2019-01-27 13:15:17 -05:00
Andrew Gallant
5fe9a954e6 changelog: BUG #1154 2019-01-27 13:05:50 -05:00
Andrew Gallant
f158a42a71 ignore: correctly detect hidden files on Windows
This commit fixes a bug where ripgrep only treated files beginning with
a `.` as hidden. On Windows, we continue this tradition, but
additionally check whether a file has the special Windows "hidden"
attribute set. If so, we treat it as a hidden file.

In order to make this work without an additional stat call, we had to
rearrange some of the plumbing from the directory traverser.

Fixes #1154
2019-01-27 12:11:52 -05:00
Andrew Gallant
5724391d39 doc: small updates to the FAQ and GUIDE
Notably, ripgrep can do multiline search now. We also update the
supported compression format list and replace deprecated flags like
`--sort-files` with `--sort path`.
2019-01-26 16:19:09 -05:00
Andrew Gallant
0df71240ff search: fix -F and -f interaction bug
This fixes what appears to be a pretty egregious regression where the
`-F/--fixed-strings` flag wasn't be applied to patterns supplied via
the `-f/--file` flag. The same bug existed for the `-x/--line-regexp`
flag as well, which we fix here.

Fixes #1176
2019-01-26 16:01:52 -05:00
Andrew Gallant
f3164f2615 exit: tweak exit status logic
This changes how ripgrep emit exit status codes. In particular, any error
that occurs while searching will now cause ripgrep to emit a `2` exit
code, where as it previously would emit either a `0` or a `1` code based
on whether it matched or not. That is, ripgrep would only emit a `2` exit
code for a catastrophic error.

This tweak includes additional logic that GNU grep adheres to, which seems
like good sense. Namely, if -q/--quiet is given, and an error occurs and
a match occurs, then ripgrep will emit a `0` exit code.

Closes #1159
2019-01-26 15:44:49 -05:00
Andrew Gallant
31d3e24130 args: prevent panicking in 'rg -h | rg'
Previously, we relied on clap to handle printing either an error
message, or --help/--version output, in addition to setting the exit
status code. Unfortunately, for --help/--version output, clap was
panicking if the write failed, which can happen in fairly common
scenarios via a broken pipe error. e.g., `rg -h | head`.

We fix this by using clap's "safe" API and doing the printing ourselves.
We also set the exit code to `2` when an invalid command has been given.

Fixes #1125 and partially addresses #1159
2019-01-26 14:39:40 -05:00
Andrew Gallant
bf842dbc7f doc: add note about inverted flags
Fixes #1091
2019-01-26 14:13:06 -05:00
Andrew Gallant
6d5dba85bd doc: clarify automatic encoding detection
Fixes #1103
2019-01-26 13:55:47 -05:00
Andrew Gallant
afb89bcdad fmt: shorten --ignore-file-case-insensitive description 2019-01-26 13:45:02 -05:00
Andrew Gallant
332dc56372 changelog: BUG #1095 2019-01-26 13:40:59 -05:00
Andrew Gallant
12a6ca45f9 config: add --no-ignore-dot flag
This flag causes ripgrep to ignore `.ignore` files.

Closes #1138
2019-01-26 13:40:12 -05:00
Andrew Gallant
9d703110cf regex: make CRLF hack more robust
This commit improves the CRLF hack to be more robust. In particular, in
addition to rewriting `$` as `(?:\r??$)`, we now strip `\r` from the end
of a match if and only if the regex has an ending line anchor required for
a match. This doesn't quite make the hack 100% correct, but should fix most
use cases in practice. An example of a regex that will still be incorrect
is `foo|bar$`, since the analysis isn't quite sophisticated enough to
determine that a `\r` can be safely stripped from any match. Even if we
fix that, regexes like `foo\r|bar$` still won't be handled correctly. Alas,
more work on this front should really be focused on enabling this in the
regex engine itself.

The specific cause of this bug was that grep-searcher was sneakily
stripping CRLF from matching lines when it really shouldn't have. We remove
that code now, and instead rely on better match semantics provided at a
lower level.

Fixes #1095
2019-01-26 12:34:28 -05:00
Andrew Gallant
e99b6bda0e deps: bump regex-syntax to 0.6.5
This is necessary for the use of the new is_line_anchored_{start,end}
APIs.
2019-01-26 12:20:02 -05:00
Andrew Gallant
276e2c9b9a searcher: always strip BOM
This fixes a bug where a BOM prefix was included. While this was somewhat
intentional in order to have a faithful "UTF8 passthru" option, in
practice, this causes problems such as breaking patterns like `^` in a
really non-obvious way.

The actual fix was to add a new API to encoding_rs_io, which this commit
brings in.

Fixes #1163
2019-01-25 17:18:57 -05:00
Andrew Gallant
9a9f54d44c readme: encoding_rs's SIMD support is broken
Add a note about it to the README.

Also, remove mention of the avx-accel feature since it no longer exists.
(bytecount now uses runtime detection to enable SIMD support.)

Fixes #1175
2019-01-24 07:00:53 -05:00
Andrew Gallant
47833b9ce7 deps: update removal of grep devdeps 2019-01-23 20:14:37 -05:00
Awad Mackie
44a9e37737 ignore/types: add method for retrieving file type definition
Fixes #1116, Closes #1120
2019-01-23 20:08:48 -05:00
Andrew Gallant
8fd05cacee changelog: BUG #1121 2019-01-23 20:06:01 -05:00
Rob Lourens
4691d11034 ripgrep: don't skip stdout in --files mode
Specifically, this avoids triggering Windows antimalware when in --files mode.

See also #600.

Fixes #1121
2019-01-23 20:04:44 -05:00
Andrew Gallant
519a6b68af grep: remove unused dependencies
We remove these for now, but we'll eventually add them back once the
examples get more fleshed out.

Closes #1043
2019-01-23 20:01:32 -05:00
Andrew Gallant
9c940b45f4 globset: permit ** to appear anywhere
Previously, `man gitignore` specified that `**` was invalid unless it
was used in one of a few specific circumstances, i.e., `**`, `a/**`,
`**/b` or `a/**/b`. That is, `**` always had to be surrounded by either
a path separator or the beginning/end of the pattern.

It turns out that git itself has treated `**` outside the above contexts
as valid for quite a while, so there was an inconsistency between the
spec `man gitignore` and the implementation, and it wasn't clear which
was actually correct.

@okdana filed a bug against git[1] and got this fixed. The spec was wrong,
which has now been fixed [2] and updated[2].

This commit brings ripgrep in line with git and treats `**` outside of
the above contexts as two consecutive `*` patterns. We deprecate the
`InvalidRecursive` error since it is no longer used.

Fixes #373, Fixes #1098

[1] - https://public-inbox.org/git/C16A9F17-0375-42F9-90A9-A92C9F3D8BBA@dana.is
[2] - 627186d020
[3] - https://git-scm.com/docs/gitignore
2019-01-23 19:59:39 -05:00
Andrew Gallant
0a167021c3 changelog: BUG #1174 2019-01-23 19:19:26 -05:00
Andrew Gallant
aeaa5fc1b1 globset: fix repeated use of **
This fixes a bug where repeated use of ** didn't behave as it should. In
particular, each use of `**` added a new requirement directory depth
requirement. For example, something like `**/**/b` would match
`foo/bar/b`, but it wouldn't match `foo/b` even though it should. In
particular, `**` semantics demand "infinite" depth, so repeated uses of
`**` should just coalesce as if only one was given.

We do this coalescing in the parser. It's a little tricky because we
treat `**/a`, `a/**` and `a/**/b` as distinct tokens with their own
regex conversions. We also test the crap out of it.

Fixes #1174
2019-01-23 19:15:02 -05:00
Andrew Gallant
7048a06c31 changelog: BUG #1173 2019-01-23 18:14:16 -05:00
Andrew Gallant
23be3cf850 ignore: fix handling of **
When deciding whether to add the `**/` prefix or not, we should choose
not to add it if the pattern is simply a bare `**`. Previously, we were
only not adding it if it was `**/`, which is correct, but we also need
to do it for `**` since `**` can already match anywhere.

There's likely a more principled solution to this, but this works for
now.

Fixes #1173
2019-01-23 18:12:35 -05:00
Andrew Gallant
b48bbf527d changelog: PR #1093 2019-01-23 17:56:18 -05:00
dana
8eabe47b57 ignore: always use literal_separator for gitignore patterns (#1093)
PR #1093
2019-01-23 17:54:28 -05:00
Michele Bologna
ff712bfd9d readme: add instructions for openSUSE 15.0
PR #1088
2019-01-22 21:46:11 -05:00
Mika Dede
a7f2d48234 printer: fix path handling in summarizer
This commit fixes a bug where both of the following commands always
reported an error:

    rg --files-with-matches foo file
    rg --files-without-match foo file

In particular, the printer was erroneously respecting the `path` option
even the the summary kind was `PathWithMatch` or `PathWithoutMatch`. The
documented behavior is that those summary kinds always require a path,
and thus, the `path` option has no effect. We fix this by correcting the
case analysis.

This also fixes a bug where the exit code for `--files-without-match`
was not set correctly. We update the printer's `has_match` method to
report the correct value.

Fixes #1106, Closes #1130
2019-01-22 21:37:23 -05:00
Andrew Gallant
57500ad013 changelog: brotli/zstd addition 2019-01-22 20:57:28 -05:00
dana
0b04553aff grep-cli: support Brotli/Zstd decompression
Fixes #1099
2019-01-22 20:56:16 -05:00
dana
1ae121122f ignore/types: add/update brotli, bzip2, gzip, xz, zstd 2019-01-22 20:56:16 -05:00
Andrew Gallant
688003e51c ripgrep: ban rustfmt 2019-01-22 20:07:26 -05:00
David Torosyan
718a00f6f2 ripgrep: add --ignore-file-case-insensitive
The --ignore-file-case-insensitive flag causes all
.gitignore/.rgignore/.ignore files to have their globs matched without
regard for case. Because this introduces a potentially significant
performance regression, this is always disabled by default. Users that
need case insensitive matching can enable it on a case by case basis.

Closes #1164, Closes #1170
2019-01-22 20:03:59 -05:00
Andrew Gallant
7cbc535d70 edition: fix build.rs 2019-01-19 10:46:57 -05:00
Andrew Gallant
7a6a40bae1 edition: move core ripgrep to Rust 2018 2019-01-19 10:44:30 -05:00
Andrew Gallant
1e9ee2cc85 deps: update memmap 2019-01-19 10:44:30 -05:00
Andrew Gallant
968491f8e9 deps: update to bytecount 0.5
bytecount now uses runtime dispatch for enabling SIMD, which means we can
no longer need the avx-accel features. We remove it from ripgrep since the
next release will be a minor version bump, but leave them as no-ops for
the crates that previously used it.
2019-01-19 10:44:30 -05:00
Andrew Gallant
63b0f31a22 deps: update various dependencies
We also increase the MSRV to 1.32, the current stable release, which sets
the stage for migrating to Rust 2018.
2019-01-19 10:44:30 -05:00
P M
7ecee299a5 ignore/types: add QML
PR #1165
2019-01-18 06:48:47 -05:00
David Håsäther
dd396ff34e doc: fix typo
PR #1161
2019-01-14 06:50:30 -05:00
Andrew Gallant
fb0a82f3c3 grep-printer: add macro docs, redux 2019-01-11 09:18:09 -05:00
Andrew Gallant
dbc8ca9cc1 grep-searcher: add docs for assert_eq_printed
Looks like the deny(missing_docs) lint got a bit stronger.
2019-01-11 09:03:00 -05:00
Marco Hinz
c3db8db93d doc: fix typo 2019-01-05 11:18:05 -05:00
Andrew Gallant
17ef4c40f3 ignore-0.4.6 2018-12-30 08:46:09 -05:00
Andrew Gallant
a9e0477ea8 ignore: permit use of deprecated trim_right 2018-12-30 08:44:59 -05:00
Andrew Gallant
b3c5773266 deps: bump ignore 2018-12-30 08:43:18 -05:00
Andrew Gallant
118b950085 ignore-0.4.5 2018-12-15 08:44:10 -05:00
Andrew Gallant
b45b2f58ea deps: update most other dependencies
This commit is the result of doing:

  $ cargo update
  $ cargo update -p encoding_rs --precise 0.8.10

where the latter line prevents encoding_rs from updating to 0.8.11 (or
newer). In particular, the 0.8.11 release increased the minimum Rust
version to 1.29, where as ripgrep 0.10.x is still on 1.28. We stay on an
older version for now until ripgrep is ready to move to 0.11.x.
2018-12-15 08:42:14 -05:00
Andrew Gallant
662a9bc73d deps: update to crossbeam-channel 0.3
This also requires corresponding updates to both rand and rand_core. Doing
an update of rand without doing an update of rand_core results in
compilation errors because two distinct versions of rand_core are included
in the build, and the traits they expose are distinct and incompatible.

We also switch over to using tempfile instead of tempdir, which drops the
last remaining thing keeping rand 0.4 in the build.

Fixes #1141, Fixes #1142
2018-12-15 08:40:04 -05:00
Andrew Gallant
401add0a99 deps: update regex and regex-syntax
This brings in some new Unicode properties, such as \p{Emoji}.

It is now also technically possible construct a regex that recognizes
grapheme clusters.
2018-12-09 16:33:37 -05:00
Simon Morgan
f81b72721b ignore/types: add ASP
PR #1134
2018-12-07 16:19:33 -05:00
Antony Lee
1d4fccaadc ignore/types: add postscript
Although postscript/encapsulated postscript is usually thought of as a
binary format, it's actually mostly ASCII, so ripgrep will not ignore
these files.

The situation is basically the same as for pdf, which is also already
present in the list of known filetypes.

PR #1118
2018-11-23 09:46:11 -05:00
Matteo Bertini
09e464e674 ignore/types: add more Cython file types
From the [Cython file types](https://cython.readthedocs.io/en/latest/src/userguide/language_basics.html?highlight=pxi#cython-file-types) paragraph on the official docs:

> There are three file types in Cython:
>    The implementation files, carrying a .py or .pyx suffix.
>    The definition files, carrying a .pxd suffix.
>    The include files, carrying a .pxi suffix.

PR #1113
2018-11-19 07:37:00 -05:00
Jon Parise
31adff6f3c ignore/types: add Apache Thrift
PR #1102
2018-11-07 07:42:13 -05:00
Andrew Gallant
b41e596327 doc: escape braces in AsciiDoc
This commit fixes a bug where AsciiDoc would drop any line containing a
'{foo}' because it interpreted it as an undefined attribute reference:

> Simple attribute references take the form {<name>}. If the attribute name
> is defined its text value is substituted otherwise the line containing the
> reference is dropped from the output.

See: https://www.methods.co.nz/asciidoc/chunked/ch30.html

We fix this by simply replacing all occurrences of '{' and '}' with
their escaped forms: '&#123;' and '&#125;'.

Fixes #1101
2018-11-06 06:57:16 -05:00
Andrew Gallant
fb62266620 deps: update encoding_rs
This commit bumps the version of encoding_rs to use the latest release.
This appears to fix a panic in UTF-16 decoding.

Fixes #1089
2018-10-22 06:50:35 -04:00
Dave Lee
acf226c39d ignore/types: add BUILD.bazel to bazel file type
PR #1074
2018-10-02 18:00:04 -04:00
Mathieu Bridon
8299625e48 ignore/types: add buildstream
BuildStream is a Free Software tool for building/integrating software stacks.: https://buildstream.gitlab.io/buildstream/

It uses recipes written in YAML, in files with the `.bst` extension.

PR #1071
2018-09-28 08:32:24 -04:00
Andrew Gallant
db256c87eb ripgrep: suggest -U/--multiline
When a "\n literal is not allowed" error is reported, ripgrep will now
suggest the use of the -U/--multiline flag, which enables matching
newlines.

Fixes #1055
2018-09-25 16:56:04 -04:00
Andrew Gallant
ba533f390e grep-searcher: update to encoding_rs_io 0.1.3
This update includes a work-around for a presumed bug in encoding_rs
that causes a panic:
https://github.com/hsivonen/encoding_rs/issues/34

Specifically, to reproduce this in ripgrep, one can run the following:

    $ curl -LO https://cache.ruby-lang.org/pub/ruby/2.5/ruby-2.5.1.tar.gz
    $ tar xf ruby-2.5.1.tar.gz
    $ rg ZZZZZ ruby-2.5.1/test/rexml/data/t63-2.svg
    thread 'main' panicked at 'index out of bounds: the len is 1 but the index is 1'

Fixes #1052
2018-09-25 16:56:04 -04:00
Andrew Gallant
ba503eb677 grep-regex: fix inner literal detection
It seems the inner literal detector fails spectacularly in cases of
concatenations that involve groups. The issue here is that if the prefix
of a group inside a concatenation can match the empty string, then any
literals generated to that point in the concatenation need to be cut
such that they are never extended. The detector isn't really built to
handle this case, so we just act conservative cut literals whenever we
see a sub-group. This may make some regexes slower, but the inner
literal detector already misses plenty of cases.

Literal detection (including in the regex engine) is a key component
that needs to be completely rethought at some point.

Fixes #1064
2018-09-25 16:56:04 -04:00
Andrew Gallant
f72c2dfd90 readme: touch up README
Make the wording consistent.
2018-09-14 11:33:56 -04:00
Sylvestre Ledru
c0aa58b4f7 Ripgrep is also available in Ubuntu (from Cosmic) 2018-09-14 08:41:05 +02:00
ykgmfq
184ee4c328 deb: add section info
Put it in the same section as
https://packages.debian.org/stretch/grep

PR #1051
2018-09-13 08:17:24 -04:00
Gabe Berke-Williams
e82fbf2c46 doc: fix typo
"cretion" -> "creation"

PR #1045
2018-09-10 06:49:48 -04:00
Andrew Gallant
eb18da0450 pcre2: use jit_if_available
This will allow PCRE2 to fall back to non-JIT matching when running on
platforms without JIT support.

ref https://github.com/BurntSushi/rust-pcre2/issues/3
2018-09-08 17:12:14 -04:00
Andrew Gallant
0f7494216f readme: update dpkg version 2018-09-08 10:46:40 -04:00
Andrew Chin
442a278635 readme: fancy regexes are not supported by default
PR #1042
2018-09-07 17:43:24 -04:00
Andrew Gallant
7ebed3ace6 pkg: update brew tap to 0.10.0 2018-09-07 14:43:59 -04:00
Andrew Gallant
8a7db1a918 ci: tweak deployment conditions 2018-09-07 14:07:52 -04:00
Andrew Gallant
ce80d794c0 changelog: add release date 2018-09-07 14:00:23 -04:00
Andrew Gallant
c5d467a2ab ci: always force PCRE2 static builds for releases 2018-09-07 14:00:23 -04:00
Andrew Gallant
a62cd553c2 ci: clean up appveyor
Remove some outdated comments and unused config. Also, make the regex for
matching tags a bit more specific.
2018-09-07 14:00:22 -04:00
Andrew Gallant
ce5188335b ci: remove 'branch' condition for deployment
Travis docs[1] say this is ignore when 'tags' is used.

[1] - https://docs.travis-ci.com/user/deployment/#conditional-releases-with-on
2018-09-07 14:00:22 -04:00
Andrew Gallant
b7a456ae83 deb: add completions
This commit adds Bash, zsh and fish completions to the Debian binary
package.

Fixes #1032
2018-09-07 14:00:22 -04:00
Andrew Gallant
d14f0b37d6 deps: update versions for all crates
I don't think every change here is needed, but this ensures we're using
the latest version of every direct dependency.
2018-09-07 14:00:22 -04:00
Andrew Gallant
3ddc3c040f deps: minor updates 2018-09-07 13:03:01 -04:00
Andrew Gallant
eeaa42ecaf scripts: add copy-examples
This is a preliminary script to copy example code from a Markdown file
into a crate's example directory.

This is intended to be used for the upcoming libripgrep guide, but we
don't commit any examples yet.
2018-09-07 12:27:48 -04:00
Andrew Gallant
3797a2a5cb simplegrep: touch up 2018-09-07 12:24:50 -04:00
Andrew Gallant
0e2f8f7b47 grep: add clap and regex dev dependencies to grep
These are (or will be) used in grep's examples.
2018-09-07 12:06:05 -04:00
Andrew Gallant
3dd4b77dfb grep-searcher: add Box<...> impl for Sink
We initially did not have this impl because the first revision of the Sink
trait was much more complicated. In particular, each method was
parameterized over a Matcher. But not every Sink impl actually needs a
Matcher, and it is just as easy to borrow a Matcher explicitly, so the
added parameterization wasn't holding its own.

This does permit Sink implementations to be used as trait objects. One
key use case here is to reduce compile times, since there is quite a bit
of code inside grep-searcher that is parameterized on Sink. Unfortunately,
that code is *also* parameterized on Matcher, and the various printers in
grep-printer are also parameterized on Matcher, which means Sink trait
objects are necessary but no sufficient for a major reduction in compile
times. Unfortunately, the path to making Matcher object safe isn't quite
clear. Extension traits maybe? There's also stuff in the Serde ecosystem
that might help, but the type shenanigans can get pretty gnarly.
2018-09-07 12:06:05 -04:00
Andrew Gallant
3b5cdea862 doc: minor touchups to API docs 2018-09-07 12:06:05 -04:00
Andrew Gallant
54b3e9eb10 grep-printer: delete unused code 2018-09-07 12:06:05 -04:00
Andrew Gallant
56e8864426 grep-matcher: add LineTerminator::is_suffix
This centralizes the logic for checking whether a line has a line
terminator or not.
2018-09-07 12:06:04 -04:00
Andrew Gallant
b8f619d16e readme: a few clarifications 2018-09-07 12:06:04 -04:00
Andrew Gallant
83dff33326 deps: update various deps 2018-09-04 23:29:22 -04:00
Andrew Gallant
003c3695f4 deps: update grep version 2018-09-04 23:29:05 -04:00
Andrew Gallant
10777c150d grep-0.2.1 2018-09-04 23:25:39 -04:00
Andrew Gallant
827179250b changelog: assign feature id 2018-09-04 23:24:22 -04:00
Andrew Gallant
fd22cd520b windows: fix unused warnings on Windows 2018-09-04 23:18:55 -04:00
Andrew Gallant
241bc8f8fc ripgrep: add --pre-glob flag
The --pre-glob flag is like the --glob flag, except it applies to filtering
files through the preprocessor instead of for search. This makes it
possible to apply the preprocessor to only a small subset of files, which
can greatly reduce the process overhead of using a preprocessor when
searching large directories.
2018-09-04 23:18:55 -04:00
Andrew Gallant
b6e30124e0 ripgrep: add --line-buffered and --block-buffered
These flags provide granular control over ripgrep's buffering strategy.
The --line-buffered flag can be genuinely useful in certain types of shell
pipelines. The --block-buffered flag has a murkier use case, but we add it
for completeness.
2018-09-04 23:18:55 -04:00
Andrew Gallant
4846d63539 grep-cli: introduce new grep-cli crate
This commit moves a lot of "utility" code from ripgrep core into
grep-cli. Any one of these things might not be worth creating a new
crate, but combining everything together results in a fair number of a
convenience routines that make up a decent sized crate.

There is potentially more we could move into the crate, but much of what
remains in ripgrep core is almost entirely dealing with the number of
flags we support.

In the course of doing moving things to the grep-cli crate, we clean up
a lot of gunk and improve failure modes in a number of cases. In
particular, we've fixed a bug where other processes could deadlock if
they write too much to stderr.

Fixes #990
2018-09-04 23:18:55 -04:00
helloer
13c47530a6 ignore/types: add pascal type
PR #1036
2018-09-03 07:25:07 -04:00
Jakub Wilk
328f4369e6 doc: fix typos 2018-08-31 11:59:28 -04:00
Andrew Gallant
04518e32e7 deps: update other crates 2018-08-30 23:03:07 -04:00
Andrew Gallant
f2eaf5b977 deps: update termcolor for perf tweaks 2018-08-30 22:57:01 -04:00
Andrew Gallant
3edeeca6e9 changelog: fix typo 2018-08-29 18:46:34 -04:00
Andrew Gallant
c41b353009 changelog: update
This brings the changelog up to date with HEAD and rewords a few things.
2018-08-29 18:25:08 -04:00
Aaron Power
d18839f3dc ignore: add into_path for DirEntry (#1031)
This commit adds ignore::DirEntry::into_path to match
the corresponding method on walkdir::DirEntry.
2018-08-28 18:27:34 -04:00
Andrew Gallant
8f978a3cf7 doc: clarify and fix typo
Clarify that --byte-offset may be wrong if the source isn't being read
directly.

Also tweak the README a bit. And remove a damned Oxford comma.
2018-08-27 21:21:37 -04:00
Andrew Gallant
87b745454d ripgrep: use 'ignore' for skipping stdout
This removes ripgrep-specific code for filtering files that correspond to
stdout and instead uses the 'ignore' crate's functionality for doing the
same.
2018-08-27 21:18:53 -04:00
Andrew Gallant
e5bb750995 ignore: add 'stdout' skipping to the walker
This commit adds a new 'skip_stdout' option to the directory walker. When
enabled, it will skip yielding any directory entries that are believed to
correspond to stdout for the current process. This is useful for filtering
out 'results' in a command like 'grep -r foo > results' in order to avoid
an unbounded feedback mechanism.
2018-08-27 21:18:53 -04:00
dana
d599f0b3c7 complete: don't complete bare pattern after -f 2018-08-27 07:56:40 -04:00
Andrew Gallant
40e310a9f9 ripgrep: add --sort and --sortr flags
These flags each accept one of five choices: none, path, modified,
accessed or created. The value indicates how the results are sorted.
For --sort, results are sorted in ascending order where as for --sortr,
results are sorted in descending order.

Closes #404
2018-08-26 18:42:25 -04:00
Andrew Gallant
510f15f4da ignore: add sort_by_file_path builder method
This permits callers to sort entries by their full file path, which makes
it easy to query for various file statistics.

It would have been better to provide a comparator on DirEntry itself,
similar to how walkdir does it, but this seems to require quite a bit of
work to make the types work out, assuming we want to continue to use
walkdir's sorting support (we do).
2018-08-26 18:42:25 -04:00
Andrew Gallant
f9ce7a84a8 ignore: add 'same_file_system' option
This commit adds a 'same_file_system' option to the walk builder. For
single threaded walking, it defers to the walkdir crate, which has the
same option. The bulk of this commit implements this flag for the parallel
walker. We add one very feeble test for this.

The parallel walker is now officially a complete mess.

Closes #321
2018-08-26 18:42:25 -04:00
Andrew Gallant
1b6089674e deps: more updates 2018-08-26 18:42:25 -04:00
Andrew Gallant
05a0389555 ripgrep: use winapi-util for stdin_is_readable 2018-08-25 00:30:15 -04:00
Andrew Gallant
16353bad6e deps: update various deps
This includes a new crate, winapi-util, that is now used in wincolor,
walkdir and same-file.
2018-08-25 00:19:40 -04:00
Tim Kilbourn
fe442de091 changelog: fix typo
Fuchsia is a pain to spell.

PR #1026
2018-08-23 13:17:27 -04:00
Andrew Gallant
1bb8b7170f doc: clarify use of SIMD features
You need a nightly compiler.

Ref #188
2018-08-23 09:56:37 -04:00
Andrew Gallant
55ed698a98 deps: update walkdir minimum version
We'll want to be using the new `same_file_system` option soon.
2018-08-23 09:54:45 -04:00
Andrew Gallant
f1e025873f deps: update dependencies
This includes an update to walkdir 2.2.2, which includes a
`same_file_system` option.
2018-08-22 20:50:24 -04:00
Andrew Gallant
033ad2b8e4 deps: update clap
Update clap to the latest version.

Also, drop the ansi_term dependency by disabling color output in clap's
error messages.
2018-08-21 23:10:34 -04:00
Andrew Gallant
098a8ee843 deps: various patch upgrades 2018-08-21 23:05:52 -04:00
Andrew Gallant
2f3dbf5fee ignore: fix false positive in path_is_symlink
This commit fixes a bug where the first path always reported itself as
as symlink via `path_is_symlink`.

Part of this fix includes updating walkdir to 2.2.1, which also includes
a corresponding bug fix.

Fixes #984
2018-08-21 23:05:52 -04:00
Andrew Gallant
5c80e4adb6 release: better support for binary Debian package
This commit beefs up the package metadata used by the 'cargo deb' tool to
produce a binary dpkg. In particular, we now include ripgrep's man page.

This commit includes a new script, 'ci/build_deb.sh', which will handle
the build process for a dpkg, which has become a bit more nuanced than
just running 'cargo deb'. We don't (yet) run this script in CI.

Fixes #842
2018-08-21 23:05:52 -04:00
Andrew Gallant
fcd1853031 doc: update ripgrep's description
This now mentions PCRE2 support.
2018-08-21 23:05:52 -04:00
Andrew Gallant
74a89be641 grep-printer: fix bug in printing truncated lines
When emitting color, the printer wasn't checking whether the line
exceeded the maximum allowed length.
2018-08-21 23:05:52 -04:00
Andrew Gallant
5b1ce8bdc2 tests: touch up tests on Windows
This fixes warnings and adds an additional invalid UTF-8 test that will
run on Windows.
2018-08-21 23:05:52 -04:00
Andrew Gallant
1529ce3341 ripgrep: remove workaround for std bug
This commit undoes a work-around for a bug in Rust's standard library
that prevented correct file type detection on Windows in OneDrive
directories. We remove the work-around because we are moving to a
latest-stable Rust version policy, which has included this fix for a while
now.

ref #705, https://github.com/rust-lang/rust/issues/46484
2018-08-21 23:05:52 -04:00
Andrew Gallant
95a4f15916 ignore: clarify docs for DirEntry::error
Fixes #953
2018-08-21 23:05:52 -04:00
Andrew Gallant
0eef05142a ripgrep: move minimum version to Rust stable
This also updates some code to make use of our more liberal versioning
requirement, including the use of crossbeam-channel instead of the MsQueue
from the older an unmaintained crossbeam 0.3. This does regrettably add
a sizable number of dependencies, however, compile times seem mostly
unaffected.

Closes #1019
2018-08-21 23:05:52 -04:00
Andrew Gallant
edd6eb4e06 ripgrep: make --no-pcre2-unicode the canonical flag
Previously, we used --pcre2-unicode as the canonical flag despite the
fact that it is enabled by default, which is inconsistent with how we
handle other similar flags.

The reason why --pcre2-unicode was made the canonical flag was to make
it easier to discover since it would be sorted near the --pcre2 flag. To
solve that problem, we simply start a convention that lists related
flags in the docs.

Fixes #1022
2018-08-21 23:05:52 -04:00
Andrew Gallant
7ac9782970 doc: fix typo 2018-08-20 18:00:14 -04:00
Andrew Gallant
180054d7dc doc: caveats 2018-08-20 17:58:29 -04:00
Andrew Gallant
7eaaa04c69 ripgrep: small cleanups 2018-08-20 17:34:45 -04:00
Andrew Gallant
87a627631c doc: add section on PCRE2 performance 2018-08-20 17:34:45 -04:00
Andrew Gallant
9df60e164e deps: update other dependencies to latest 2018-08-20 17:34:45 -04:00
Andrew Gallant
afa06c518a deps: update libripgrep crate versions
This prepares them for an initial 0.1.0 release.
2018-08-20 17:34:45 -04:00
Andy Freeland
e46aeb34f8 ignore/types: add .mako and .mao for Mako templates
I've personally never seen `.mao`, but GitHub includes it in Linguist: 
4f11062304/lib/linguist/languages.yml (L2702-L2709)
2018-08-20 15:26:49 -04:00
dana
d8f187e990 complete: add completion reference guide 2018-08-20 11:53:19 -04:00
dana
7d93d2ab05 ripgrep: add --no-multiline-dotall 2018-08-20 07:50:00 -04:00
dana
9ca2d68e94 ripgrep: fix typos in option descriptions 2018-08-20 07:50:00 -04:00
dana
60b0e3ff80 complete: update wording, exclusion, &c. 2018-08-20 07:50:00 -04:00
dana
3a1c081c13 test_complete: match certain long options in description bodies 2018-08-20 07:50:00 -04:00
Andrew Gallant
d5c0b03030 changelog: massive update for libripgrep
This commit updates the CHANGELOG to reflect all the work done to make
libripgrep a reality.

* Closes #162 (libripgrep)
* Closes #176 (multiline search)
* Closes #188 (opt-in PCRE2 support)
* Closes #244 (JSON output)
* Closes #416 (Windows CRLF support)
* Closes #917 (trim prefix whitespace)
* Closes #993 (add --null-data flag)
* Closes #997 (--passthru works with --replace)

* Fixes #2 (memory maps and context handling work)
* Fixes #200 (ripgrep stops when pipe is closed)
* Fixes #389 (more intuitive `-w/--word-regexp`)
* Fixes #643 (detection of stdin on Windows is better)
* Fixes #441, Fixes #690, Fixes #980 (empty matching lines are weird)
* Fixes #764 (coalesce color escapes)
* Fixes #922 (memory maps failing is no big deal)
* Fixes #937 (color escapes no longer used for empty matches)
* Fixes #940 (--passthru does not impact exit status)
* Fixes #1013 (show runtime CPU features in --version output)
2018-08-20 07:10:19 -04:00
Andrew Gallant
eb184d7711 tests: re-tool integration tests
This basically rewrites every integration test. We reduce the amount of
magic involved here in terms of which arguments are being passed to
ripgrep processes. To make up for the boiler plate saved by the magic,
we make the Dir (formerly WorkDir) type a bit nicer to use, along with a
new TestCommand that wraps a std::process::Command. In exchange, we get
tests that are easier to read and write.

We also run every test with the `--pcre2` flag to make sure that works,
when PCRE2 is available.
2018-08-20 07:10:19 -04:00
Andrew Gallant
bb110c1ebe ripgrep: migrate to libripgrep
This commit does the work to delete the old `grep` crate and effectively
rewrite most of ripgrep core to use the new libripgrep crates. The new
`grep` crate is now a facade that collects the various crates that make
up libripgrep.

The most complex part of ripgrep core is now arguably the translation
between command line parameters and the library options, which is
ultimately where we want to be.
2018-08-20 07:10:19 -04:00
Andrew Gallant
d9ca529356 libripgrep: initial commit introducing libripgrep
libripgrep is not any one library, but rather, a collection of libraries
that roughly separate the following key distinct phases in a grep
implementation:

  1. Pattern matching (e.g., by a regex engine).
  2. Searching a file using a pattern matcher.
  3. Printing results.

Ultimately, both (1) and (3) are defined by de-coupled interfaces, of
which there may be multiple implementations. Namely, (1) is satisfied by
the `Matcher` trait in the `grep-matcher` crate and (3) is satisfied by
the `Sink` trait in the `grep2` crate. The searcher (2) ties everything
together and finds results using a matcher and reports those results
using a `Sink` implementation.

Closes #162
2018-08-20 07:10:19 -04:00
Sylvestre Ledru
0958837ee1 readme: ripgrep is available in Debian Buster
PR #1016
2018-08-17 06:35:43 -04:00
Andrew Gallant
94be3bd4bb grep: remove senseless test
It was pulling in a sizable data file and doesn't appear to be testing
anything meaningful that isn't covered by a variety of other tests.
2018-08-15 19:52:50 -04:00
woky
deb1de6e1e ignore/types: add *.sbt to scala type
Sbt is currently most used Scala build tool which uses
*.sbt files, which are basically Scala.

PR #1010
2018-08-14 06:29:27 -07:00
Vanessa McHale
6afdf15d85 ignore/types: add Idris, Dhall and ATS
And also improve Haskell detection.

PR #1007
2018-08-07 13:10:19 -04:00
Jonatan Hamberg
6cda7b24e9 readme: update debian link to 0.9.0
PR #1006
2018-08-07 07:50:08 -04:00
llogiq
ad9befbc1d deps: update bytecount to 0.3.2
PR #1003
2018-08-06 06:44:16 -04:00
Andrew Gallant
e86d3d95c2 pkg: update brew tap to 0.9.0 2018-08-03 17:04:36 -04:00
Andrew Gallant
6799dcfc0e release: 0.9.0 2018-08-03 16:13:31 -04:00
Andrew Gallant
0fdab0ec5e grep-0.1.9 2018-08-03 16:12:08 -04:00
Andrew Gallant
74ec5b8932 deps: update termcolor and encoding_rs_io 2018-08-03 16:08:57 -04:00
Andrew Gallant
2913fc4cd0 tests: reduce reliance on state in tests
This commit improves the integration test setup by running tests inside
the system's temporary directory instead of within ripgrep's `target`
directory. The motivation here is to attempt to reduce the effect of
unanticipated state on ripgrep's integration tests, such as the presence
of `.gitignore` files in ripgrep's checkout directory hierarchy
(including parent directories).

This doesn't remove all possible state. For example, there's no
guarantee that the system's temporary directory isn't itself within a
git repository. Moreover, there may still be other ignore rules in the
directory tree that might impact test behavior. Fixing this seems
somewhat difficult. Conceptually, it seems like ripgrep should run each
test in its own `chroot`-like environment, but doing this in a
non-annoying and portable way (including Windows) doesn't appear to be
possible.

Another approach to take here might be to teach ripgrep itself that a
particular directory should be treated as root, and therefore, never
look at anything outside that directory. This also seems complex to
implement, but tractable. Let's see how this approach works for now.

Fixes #448, #996
2018-07-29 10:41:03 -04:00
Andrew Gallant
7c412bb2fa tests/style: 80 columns, dammit 2018-07-29 10:41:03 -04:00
Andrew Gallant
651a0f1ddf ignore: fix typo 2018-07-29 08:37:24 -04:00
Andrew Gallant
45473ba48f ignore/style: 80 columns, dammit 2018-07-29 08:31:04 -04:00
Andrew Gallant
0863c75a5a ignore: fix bug in matched_path_or_any_parents
This method was supposed to panic whenever the given path wasn't under
the root of the gitignore patcher. Instead of using assert!, it was using
debug_assert!. This actually caused tests to fail when running under
release mode, because the debug_assert! wouldn't trip.

Fixes #671
2018-07-29 08:30:53 -04:00
Andrew Gallant
d94d99f657 ignore-0.4.3 2018-07-28 11:05:27 -04:00
Andrew Gallant
84585908ac globset-0.4.1 2018-07-28 10:59:54 -04:00
Andrew Gallant
1611c04e6f ignore: respect XDG_CONFIG_DIR/git/config
This commit updates the logic for finding the value of git's
`core.excludesFile` configuration parameter. Namely, we now check
`$XDG_CONFIG_DIR/git/config` in addition to `$HOME/.gitconfig` (where
the latter overrules the former on a knob-by-knob basis).

Fixes #995
2018-07-28 10:04:16 -04:00
dana
d857ad6ed3 complete: add --no-pre, improve --pre/--search-zip exclusivity 2018-07-24 07:21:18 -04:00
Andrew Gallant
4dd2f8e40e deps: update atty and winapi
This updates atty and winapi to their latest versions, including the bug
fix in atty that allows it to work with winapi 0.3.5.
2018-07-22 13:07:48 -04:00
Andrew Gallant
dca8110da2 ripgrep: when given no patterns, don't match
Generally speaking, ripgrep prevents the case of not having any patterns
via its arg parsing. However, it is possible for users to provide a file
of patterns via the `-f` flag. If that file is empty, then ripgrep has
nothing to search for and therefore should not ever produce any match.

One way of fixing this might be to replace the absence of patterns with
a pattern that can never match, but this still requires opening and
searching through every file, which is quite a waste. Instead, we detect
this case explicitly and quit early.

Fixes #900
2018-07-22 12:07:18 -04:00
Andrew Gallant
0d11497d21 tests: be looser with gzip failure
Don't expect the exact error message. Instead, just ask that the error
message exist and be non-empty.

Fixes #903
2018-07-22 11:08:16 -04:00
Andrew Gallant
22ac2e056e ripgrep: stop early when --files --quiet is used
This commit tweaks the implementation of the --files flag to stop early
when --quiet is provided.

Fixes #907
2018-07-22 11:05:24 -04:00
Andrew Gallant
03af61fc7b ripgrep: don't skip tar archives
This removes logic from the decompressor for skipping tar archives. This
logic was originally added under the assumption that we probably want to
avoid the cost of reading them. However, this is generally inconsistent
with how ripgrep treats files like tar archives: it should search them
and do binary detection like normal.

Fixes #918
2018-07-22 10:59:09 -04:00
Andrew Gallant
560dffd247 ripgrep: add --no-ignore-global flag
This commit adds a new --no-ignore-global flag that permits disabling
the use of global gitignore filtering. Global gitignores are generally
found in `$HOME/.config/git/ignore`, but its location can be configured
via git's `core.excludesFile` option.

Closes #934
2018-07-22 10:42:32 -04:00
Andrew Gallant
e65ca21a6c ignore: only respect .gitignore in git repos
This commit fixes an interesting bug in the `ignore` crate where it
would basically respect any `.gitignore` file anywhere (including global
gitignores in `~/.config/git/ignore`), regardless of whether we were
searching in a git repository or not. This commit rectifies that
behavior to only respect gitignore rules when in a git repo.

The key change here is to move the logic of whether to traverse parents
into the directory matcher rather than putting the onus on the directory
traverser. In particular, we now need to traverse parent directories in
more cases than we previously did, since we need to determine whether
we're in a git repository or not.

Fixes #934
2018-07-22 10:33:23 -04:00
Andrew Gallant
6771626553 ripgrep: improve usage documentation
This shows an example for reading stdin.

Fixes #951
2018-07-22 09:38:49 -04:00
Andrew Gallant
b9c922be53 ripgrep: better --path-separator error message
This commit improves the error message when --path-separator fails. Namely,
it prints the separator it got and also prints a notice for Windows users
for common failure modes.

Fixes #957
2018-07-22 09:33:03 -04:00
Andrew Gallant
7a44cad599 deps: pin winapi to 0.3.4
winapi 0.3.5 changed how it represents some of its structs, which caused
a bug to surface in atty that prevents tty detection on Windows. atty
has an open PR to fix this: https://github.com/softprops/atty/pull/28

Until a new release of atty, we pin winapi to a version that works.
2018-07-22 09:31:22 -04:00
Andrew Gallant
6dc09c5b1b ripgrep: add --no-pre switch
This switch explicitly disables the --pre behavior. An empty string will
also disable --pre behavior, but actually utterring an empty flag value
is quite awkward, so we provide an explicit switch to do the same thing.

Thanks to @c-blake for pointing this out.
2018-07-21 20:57:27 -04:00
Andrew Gallant
209a125ea2 ripgrep: replace decoder with encoding_rs_io
This commit mostly moves the transcoder implementation to its own
crate: https://github.com/BurntSushi/encoding_rs_io

The new crate adds clear documentation and cleans up the implementation
to fully implement the contract of io::Read.
2018-07-21 20:36:32 -04:00
Andrew Gallant
090216cf00 ripgrep: reformat --pre warning 2018-07-21 17:53:55 -04:00
Andrew Gallant
c96a358593 ripgrep: add warning to --pre flag
The --pre flag can result in a pretty large performance penalty, so put
a warning in the flag documentation. This warning is important because a
flag like this could easily wind up in a user's configuration file.
2018-07-21 17:50:54 -04:00
Charles Blake
231456c409 ripgrep: add --pre flag
The preprocessor flag accepts a command program and executes this
program for every input file that is searched. Instead of searching the
file directly, ripgrep will instead search the stdout contents of the
program.

Closes #978, Closes #981
2018-07-21 17:25:12 -04:00
Kalle Samuels
1d09d4d31b ripgrep: add support for lz4 decompression
This uses the lz4 binary for decompression.

Closes #898
2018-07-21 16:26:39 -04:00
Andrew Gallant
02f08f3800 changelog: updates
We continue our quest to update the CHANGELOG incrementally.
2018-07-21 14:23:38 -04:00
Mateusz Mikuła
470afa1bd7 snap: build without wrappers
Wrappers usually define things required for snaps to work like library path.
Ripgrep being static binary doesn't need it at all.

This will give minor speed up in scenarios when ripgrep is ran multiple
times and can be considered as good practice for static binaries.

PR #979
2018-07-21 13:28:27 -04:00
phiresky
aa2ce39d14 ignore: fix has_any_ignore_rules for explicit ignores
When building a ignore::WalkBuilder by disabling all standard filters
and adding a custom global ignore file, the ignore file is not used. Example:

    let mut walker = ignore::WalkBuilder::new(dir);
    walker.standard_filters(false);
    walker.add_ignore(myfile);

This makes it impossible to use the ignore crate to walk a directory
with only custom ignore files. Very similar to issue #800 (fixed in
b71a110).

PR #988
2018-07-21 13:26:54 -04:00
Andrew Gallant
d11a3b3377 crates.io: use 'OR' instead of '/'
Fixes #987
2018-07-21 13:25:39 -04:00
Andrew Gallant
d09e2f6af1 globset: clarify documentation on regex method
This makes it clear that the `bytes` API of the regex crate should be
used instead of the Unicode API.

Fixes #985
2018-07-21 13:23:46 -04:00
Andrew Gallant
7b6af5a177 deps: update regex to 1.0.2
And also update to regex-syntax 0.6.2.
2018-07-18 09:29:04 -04:00
Andrew Gallant
9bd1aa1c04 readme: update rogue 1.20 reference 2018-07-17 20:41:34 -04:00
Andrew Gallant
1393ce4b6b deps: update all transitive dependencies
This updates all remaining transitive dependencies. Most changes appear
minor and there appear to be no minimum Rust version conflicts. Yay!
2018-07-17 20:34:03 -04:00
Andrew Gallant
8e358ee056 deps: bump various dependencies
Nothing major here. All patch releases. This should bring us completely
up to date with all direct dependencies.
2018-07-17 20:33:13 -04:00
Andrew Gallant
5b5f4e74d9 deps: bump encoding_rs to 0.8
This brings in performance improvements.
2018-07-17 20:29:41 -04:00
Andrew Gallant
7829850bf0 deps: bump minimum Rust to 1.23.0 from 1.20.0
1.23.0 is the first Rust release of 2018 and is around half a year old,
which seems old enough to move to. This also lets us bring in encoding_rs
0.8, which includes performance optimizations.
2018-07-17 20:29:20 -04:00
Andrew Gallant
06b66efd59 deps: get rid of unstable feature
This was introduced as a temporary measure for dealing with the regex
crate's unstable feature, but it was never included in a release of
ripgrep. Thus, we remove it. The regex crate will now automatically enable
SIMD optimizations when available.
2018-07-17 20:27:04 -04:00
Andrew Gallant
7e5a590276 grep: small literal detection fix
This commit tweaks the inner literal detection heuristic such that if it
comes up with any literal that is all whitespace, then it's likely a bad
literal to look for since it's so common. Therefore, we simply reject the
inner literal optimization in this case and let the regex engine do its
thang.
2018-07-17 20:27:04 -04:00
Andrew Gallant
d17ca45063 deps: update termcolor to 1.0.0 2018-07-17 18:37:02 -04:00
Andrew Gallant
df469fe1b4 termcolor: moved to its own repository
We also move wincolor with it.

Fixes #924
2018-07-17 18:03:38 -04:00
Andrew Gallant
b64475aeac ignore: permit use of env::home_dir
Upstream deprecated env::home_dir because of minor bugs in some corner
cases. We should probably eventually migrate to a correct implementation
in the `dirs` crate, but using the buggy version is just fine for now.
2018-07-16 17:40:02 -04:00
dana
fca9709d94 complete: improve zsh completion
- Use groups with _arguments
- Conditionally complete 'negation' options (--messages, --no-column, &c.)
- Improve option exclusivity
- Improve option descriptions
- Improve completion of colour 'whens'
- Improve completion of colour specs
- Remove some unnecessary work-arounds
- Use more idiomatic conventions
2018-07-06 10:23:20 -04:00
dana
62b4813b8a ci: minor improvements to test_complete.sh 2018-07-06 10:23:20 -04:00
dana
b38b101c77 ripgrep: rename --maxdepth to --max-depth
We keep the old `--maxdepth` spelling to preserve backward
compatibility.

PR #967
2018-06-25 20:22:09 -04:00
dana
ac90316e35 ripgrep: add --fixed-strings flag
Fixes #964

PR #965
2018-06-25 17:02:02 -04:00
Andrew Gallant
a6467f880a ripgrep: disable autotests
This keeps us working in Rust 2018.

Fixes #923
2018-06-23 20:49:05 -04:00
Andrew Gallant
004bb35694 ripgrep/printer: fix small performance regression
This commit removes an unconditional extra regex search that is in fact
not always necessary. This can result in a 2x performance improvement in
cases where ripgrep reports many matches.

The fix itself isn't ideal, but we continue to punt on cleaning up the
printer until it is rewritten for libripgrep, which is happening Real
Soon Now.

Fixes #955
2018-06-23 20:49:05 -04:00
Andrew Gallant
cd6c190967 ripgrep: use new BufferedStandardStream from termcolor
Specifically, this will use a buffered writer when not printing to a tty.
This fixes a long standing performance regression where ripgrep would
slow down dramatically if it needed to report a lot of matches.

Fixes #955
2018-06-23 20:49:05 -04:00
Andrew Gallant
d5139228e5 termcolor: add BufferedStandardStream
This commit adds a new type, BufferedStandardStream, which emulates the
StandardStream API (sans `lock`), but will internally use a buffered
writer.

To achieve this, we add a new default method to the WriteColor trait that
indicates whether the underlying writer must synchronously communicate
with an API to control coloring (e.g., the Windows console API). The new
BufferedStandardStream then uses this default method to determine how
eager it should be to flush its buffer before employing color settings.
This should have basically zero overhead when using ANSI color escape
sequences.
2018-06-23 20:49:05 -04:00
Stepan Koltsov
bb16ba6311 ignore/types: add Bazel
`BUILD`, `WORKSPACE` are mentioned here:
https://docs.bazel.build/versions/master/build-ref.html

`*.bzl` is mentioned here:
https://docs.bazel.build/versions/master/skylark/concepts.html

PR #960
2018-06-23 15:25:43 -04:00
Andrew Gallant
5e85f2577b deps: update to regex 1.0.1
This causes SIMD to kick in automatically when compiling with stable
Rust 1.27+.

We also update the README to describe the current state of things.

Thanks to @hartley for pointing this out:
https://twitter.com/hartley/status/1009950392862453760
2018-06-21 20:14:23 -04:00
Jon Surrell
ca23a170f7 ripgrep: use exit code 2 to indicate error
Exit code 1 was shared to indicate both "no results" and "error." Use
status code 2 to indicate errors, similar to grep's behavior.

Fixes #948 

PR #954
2018-06-19 07:41:44 -04:00
Mårten Kongstad
223d7d9846 ignore/types: add Android makefile types
The Android platform is gradually moving from Makefiles (*.mk) to
Blueprint files (*.bp). Add support for both. See [1] for more
information.

1. https://android.googlesource.com/platform/build/blueprint/+/master-soong
2018-06-15 08:29:53 -04:00
Mårten Kongstad
e4bce86111 ignore/types: add AIDL file type
Add support for Android Interface Definition Language files (*.aidl).
See [1] for more information.

1. https://developer.android.com/guide/components/aidl
2018-06-15 08:29:53 -04:00
Tim Kilbourn
15fa77cdb3 ignore/types: add FIDL type
FIDL is the Fuchsia Interface Definition Language
https://fuchsia.googlesource.com/zircon/+/HEAD/docs/fidl/index.md
2018-06-06 07:42:08 -04:00
Ahmed El Gabri
c3f97513d6 doc: sync config file examples
This brings the examples for configuration files in sync between
the man page and the guide.
2018-05-25 06:42:05 -04:00
George Plymale II
c21b9b20cf doc: add MacPorts installation instructions 2018-05-24 13:01:28 -04:00
Khalid Jebbari
ce145c6a2a doc: update crates.io badge
This is the official markdown snippet from shields.io
2018-05-24 06:46:08 -04:00
Wesley Moore
a383d5c4e9 doc: add BSD packages to README 2018-05-14 06:45:39 -04:00
Michael Hay
64317bda9f doc: fix broken link to RegexSet docs 2018-05-08 12:03:47 -04:00
Stephen E. Baker
7f3a0f0828 ignore/types: add jsp extension to java type 2018-05-08 12:03:19 -04:00
Bastien Orivel
49f36c7dcd deps: update regex to 1.0
We retain the `simd-accel` feature on globset for backwards
compatibility, but will remove it in the next semver release.
2018-05-07 13:07:30 -04:00
Garrett Squire
83b4fdb8d6 ignore/doc: improve docs for case_insensitive
This commit updates the OverrideBuilder and GitignoreBuilder docs
for the case_insensitive method, denoting that it must be called before
adding any patterns.
2018-05-06 19:03:11 -04:00
Elliott Slaughter
8b57d78b96 doc: update suggested version for ignore 2018-05-06 19:02:08 -04:00
Bram Geron
a2d8c49d6f ignore/types: add shorthand 'hs' for Haskell 2018-05-03 09:05:00 -04:00
Andrew Gallant
6ffb4b7466 doc: go away snap
Snap has caused a number of nonsensical bug reports, and not even the
`--classic` flag seems capable of fixing them. Therefore, remove snap
from the README and put in a special line in the ISSUE_TEMPLATE about
snap.

FIxes #902
2018-04-30 15:25:51 -04:00
Andrew Gallant
198d1fede9 ignore/types: fix typo in puppet glob
Thanks @CYBAI for noticing this!

See #899
2018-04-29 09:30:44 -04:00
Zach Crownover
667b9a7d62 ignore/types: add puppet
Puppet is primarily written in it's own format of .pp files, but
custom facts and functions are often written in Ruby. The templating
language is ERB and so this will allow scanning of any of the three
most commonly used formats for Puppet specific things.
2018-04-29 09:21:48 -04:00
Andrew Gallant
1f528f1641 doc: update crates.io description
This brings it in line with the README.
2018-04-26 17:04:35 -04:00
Andrew Gallant
ab64da73ab ignore: speed up Gitignore::empty
This commit makes Gitignore::empty a bit faster by avoiding allocation
and manually specializing the implementation instead of routing it through
the GitignoreBuilder.

This helps improve uses of ripgrep that traverse *many* directories, and
in particular, when the use of ignores is disabled via command line
switches.

Fixes #835, Closes #836
2018-04-24 11:19:03 -04:00
Jonathan Klimt
1266de3d4c ignore/types: add verilog 2018-04-24 09:00:03 -04:00
Andrew Gallant
bf51058eb2 tests: fix tests on Windows
A bug in the atty crate was previously masking a problem with the
integration tests on Windows. Namely, the bug in atty resulted in
atty::is(Stdin) returning true if we couldn't get the file name for the
stdin stream. This in turn caused tests like `rg foo` to search the CWD,
which was the intended behavior. However, once the atty bug was fixed,
atty::is(Stdin) no longer returned true, causing `rg foo` searches to
fail.

On Unix-like systems, the atty behavior has always been correct.
However, on Unix-like systems we have a decent way of detecting whether
stdin is readable or not. If it isn't---which is the case in the
integration tests---then we fall back to searching the CWD. On Windows
however, we haven't yet implemented anything to detect whether stdin is
readable or not, so we must always assume that it is. Therefore, we
never get the "go ahead" to search the CWD and the tests fail.

Most of the tests are written to search the CWD explicitly, but there
were a few stragglers that don't.

This isn't great, and we should try to figure out how to do better stdin
detection on Windows.
2018-04-23 20:37:59 -04:00
Andrew Gallant
3dc6fe6f05 output: remove unnecessary mut binding 2018-04-23 20:06:03 -04:00
Andrew Gallant
06438d5360 changelog: update 2018-04-23 20:01:57 -04:00
Andrew Gallant
ae6f871491 output: remove --line-number-width flag
This commit does what no software project has ever done before: we've
outright removed a flag with no possible way to recapture its
functionality.

This flag presents numerous problems in that it never really worked well
in the first place, and completely falls over when ripgrep uses the
--no-heading output format. Well meaning users want ripgrep to fix this
by getting into the alignment business by buffering all output, but that
is a line that I refuse to cross.

Fixes #795
2018-04-23 19:57:22 -04:00
Andrew Gallant
ed059559cd deps: update to atty 0.2.9
https://github.com/softprops/atty/pull/25 was merged, so we can upgrade.
2018-04-23 19:32:39 -04:00
Andrew Gallant
b75526bd7f output: add --no-column flag
This disables columns in the output if they were otherwise enabled.

Fixes #880
2018-04-23 19:26:58 -04:00
Andrew Gallant
507801c1f2 ignore: support .git directory OR file
This improves support for submodules, which seem to use a '.git' file
instead of a '.git' directory to indicate a worktree.

Fixes #893
2018-04-23 18:33:25 -04:00
Andrew Gallant
2a9d007261 complete: add --no-ignore-messages 2018-04-23 18:29:02 -04:00
Andrew Gallant
0ee0b160b5 logging: add new --no-ignore-messages flag
The new --no-ignore-messages flag permits suppressing errors related to
parsing .gitignore or .ignore files. These error messages can be somewhat
annoying since they can surface from repositories that one has no control
over.

Fixes #646
2018-04-23 18:18:44 -04:00
Andrew Gallant
b4781e2f91 doc: more specific docs for --no-messages
This makes it clear that the --no-messages flag doesn't actually
suppress all error messages, and is therefore not equivalent to
redirecting stderr to /dev/null.

See also: #860
2018-04-23 18:07:57 -04:00
Jeremy Day
8cb03941e6 doc: add search and replace faq
Closes #870
2018-04-23 17:45:26 -04:00
Andrew Gallant
6b15ce2342 deps: update remove_dir_all 2018-04-21 12:13:16 -04:00
Andrew Gallant
4c0b0c6c9d ignore: release 0.4.2 2018-04-21 12:10:16 -04:00
Andrew Gallant
6c8b1e93d5 globset: release 0.4.0 2018-04-21 12:09:15 -04:00
Andrew Gallant
ebdb7c1d4c ignore: impl Clone for DirEntry
There is a small hiccup here in that a `DirEntry` can embed errors
associated with reading an ignore file, which can be accessed and logged
by consumers if desired. That error type can contain an io::Error, which
isn't cloneable. We therefore implement Clone on our library's error
type in a way that re-creates the I/O error as best as possible.

Fixes #891
2018-04-21 12:01:11 -04:00
Andrew Gallant
58bd0c67da deps: pin to atty 0.2.6
atty 0.2.7 (and 0.2.8) contain a regression in cygwin terminals that
prevents basic use of ripgrep, and is also the cause of the Windows CI
test failures. For now, we pin to 0.2.6, but a patch has been submitted
upstream: https://github.com/softprops/atty/pull/25
2018-04-21 12:01:11 -04:00
Andrew Gallant
1503b3175f readme: add --classic flag to snap install
It would be nicer to switch to the `ripgrep` snap package, but
apparently it is configured to install with a binary name `ripgrep.rg`
instead of just `rg`. *sigh*
2018-04-17 06:43:43 -04:00
Andrew Gallant
0345e089aa deps: update regex-syntax 2018-04-15 08:45:05 -04:00
Avindra Goolcharan
0911ab1546 readme: add openSUSE Tumbleweed package
Link to software.opensuse.org package and add `zypper` instructions.
2018-04-09 07:22:04 -04:00
FlorentBecker
c4dd927a13 ignore: add Clone/Debug for builders 2018-04-05 08:06:26 -04:00
Andrew Gallant
34abed597f deps: update all dependencies
In particular, we can now drop rand 0.3.
2018-04-01 10:59:44 -04:00
Andrew Gallant
835600794f termcolor: release 0.3.6 2018-03-26 17:28:21 -04:00
ehuss
07713fb5c5 termcolor: fix bold + intense colors in Win 10
There is an issue with the Windows 10 console where if you issue the bold
escape sequence after one of the extended foreground colors, it overrides the
color.  This happens in termcolor if you have bold, intense, and color set.
The workaround is to issue the bold sequence before the color.

Fixes rust-lang/rust#49322
2018-03-26 16:42:48 -04:00
Dezhi “Andy” Fang
d7c9323a68 deps: update regex
This fixes build failures on latest nightly with SIMD features.
2018-03-17 19:33:34 -04:00
Andrew Gallant
b7d29d126f deps: update clap, atty, libc
Nothing to see here.

Note that we continue to refrain to update tempdir, which means we are
still bringing in rand 0.4 and rand 0.3. Updating tempdir brings in an
old version of remove_dir_all, which in turn brings in winapi 0.2. No
thanks.
2018-03-13 22:55:39 -04:00
Andrew Gallant
42b8132d0a grep: add "perfect" smart case detection
This commit removes the previous smart case detection logic and replaces
it with detection based on the regex AST. This particular AST is a faithful
representation of the concrete syntax, which lets us be very precise in
how we handle it.

Closes #851
2018-03-13 22:55:39 -04:00
Andrew Gallant
cd08707c7c grep: upgrade to regex-syntax 0.5
This update brings with it many bug fixes:

  * Better error messages are printed overall. We also include
    explicit call out for unsupported features like backreferences
    and look-around.
  * Regexes like `\s*{` no longer emit incomprehensible errors.
  * Unicode escape sequences, such as `\u{..}` are now supported.

For the most part, this upgrade was done in a straight-forward way. We
resist the urge to refactor the `grep` crate, in anticipation of it
being rewritten anyway.

Note that we removed the `--fixed-strings` suggestion whenever a regex
syntax error occurs. In practice, I've found that it results in a lot of
false positives, and I believe that its use is not as paramount now that
regex parse errors are much more readable.

Closes #268, Closes #395, Closes #702, Closes #853
2018-03-13 22:55:39 -04:00
Andrew Gallant
c2e97cd858 changelog: update for 0.9.0 2018-03-12 23:21:42 -04:00
Andrew Gallant
1f70e9187c deps: update regex crate
This update brings with it a new feature of the regex crate which will
now use SIMD optimizations automatically at runtime with no necessary
compile time flags. All that's needed is to enable the `unstable` feature.

Other crates, such as bytecount and encoding_rs, are still using the
old-style SIMD support, so we leave the simd-accel and avx-accel features.
However, the binaries we distribute on Github no longer have those
features enabled, which makes them truly portable.

Fixes #135
2018-03-12 23:21:42 -04:00
Markus Staab
7120f32258 globset/doc: update README for 0.3 release 2018-03-12 07:19:55 -04:00
Balaji Sivaraman
00520b30f5 output: add --stats flag
This commit provides basic support for a --stats flag, which will print
various aggregate statistics about a search after all of the results
have been printed. This is mostly intended to support a similar feature
found in the Silver Searcher. Note though that we don't emit the total
bytes searched; this is a first pass at an implementation and we can
improve upon it later.

Closes #411, Closes #799
2018-03-10 10:59:00 -05:00
Andrew Gallant
11a8f0eaf0 args: treat --count --only-matching as --count-matches
Namely, when ripgrep is asked to count things and is also asked to print
every match on its own line, then we should just automatically count the
matches and not the lines. This is a departure from how GNU grep behaves,
but there is a compelling argument to be made that GNU grep's behavior
doesn't make a lot of sense.

Note that since this changes the behavior of combining two existing
flags, this is a breaking change.
2018-03-10 10:38:34 -05:00
Balaji Sivaraman
27fc9f2fd3 search: add a --count-matches flag
This commit introduces a new flag, --count-matches, which will cause
ripgrep to report a total count of all matches instead of a count of
total lines matched.

Closes #566, Closes #814
2018-03-10 10:38:25 -05:00
Balaji Sivaraman
96f73293c0 cleanup: rename match_count to match_line_count 2018-03-10 10:23:38 -05:00
Balaji Sivaraman
b006943c01 search: add -b/--byte-offset flag
This commit adds support for printing 0-based byte offset before each
line. We handle corner cases such as `-o/--only-matching` and
`-C/--context` as well.

Closes #812
2018-03-10 10:15:19 -05:00
Brian Malehorn
91d0756f62 ignore: support backslash escaping
Use the new `Globset::backslash_escape` knob to conform to git behavior:
`\` will escape the following character. For example, the pattern `\*`
will match a file literally named `*`.

Also tweak a test in ripgrep that was relying on this incorrect
behavior.

Closes #526, Closes #811
2018-03-10 09:30:55 -05:00
Andrew Gallant
54256515b4 globset: make ErrorKind enum extensible
This commit makes the ErrorKind enum extensible by adding a
__Nonexhaustive variant. Callers should use this as a hint that
exhaustive case analysis isn't possible in a stable way since new
variants may be added in the future without a semver bump.
2018-03-10 09:30:55 -05:00
Brian Malehorn
e2516ed095 globset: support backslash escaping
From `man 7 glob`:

    One can remove the special meaning of '?', '*' and '[' by preceding
    them by a backslash, or, in case this is part of a shell command
    line, enclosing them in quotes.

Conform to glob / fnmatch / git implementations by making `\` escape the
following character - for example `\?` will match a literal `?`.

However, only enable this by default on Unix platforms. Windows builds
will continue to use `\` as a path separator, but can still get the new
behavior by calling `globset.backslash_escape(true)`.

Adding tests for the `Globset::backslash_escape` option was a bit
involved, since the default value of this option is platform-dependent.

Extend the options framework to hold an `Option<T>` for each
knob, where `None` means "default" and `Some(v)` means "override with
`v`". This way we only have to specify the default values once in
`GlobOptions::default()` rather than replicated in both code and tests.

Finally write a few behavioral tests, and some tests to confirm it
varies by platform.
2018-03-10 09:30:55 -05:00
Alejandro Barreto
c0c80e0209 doc: add Windows Scoop install instructions 2018-03-10 08:15:22 -05:00
Andrew Gallant
dbf6f15625 mmap: handle ENOMEM error
This commit causes a memory map strategy to fall back to a file backed
strategy if the mmap call fails with an `ENOMEM` error.

Fixes #852
2018-03-10 08:13:27 -05:00
Ryan Hayle
9163aaac27 doc: add additional instructions for snap
In the snap store, ripgrep 0.8.1 is only available as a candidate
release, while the default (stable) release is 0.7.1.
2018-03-09 07:09:02 -05:00
Patrick Artounian
9d7448bfc0 doc: shorten Rust nightly brew install command
The burntsushi/ripgrep/ prefix is not needed.
2018-03-07 12:44:06 -05:00
Richard Dodd (dodj)
b98585b429 termcolor/doc: fix typo 2018-03-03 09:20:28 -05:00
Andrew Gallant
f5411b992c doc: use PATTERNFILE for -f/--file flag
Fixes #832
2018-02-23 12:18:15 -05:00
Anthony Wong
492effc7be pkg: update snapcraft.yaml
Update version number and don't declare alias as it has been deprecated.
2018-02-23 06:51:39 -05:00
Andrew Gallant
4889d2d37c doc: clarify licensing story 2018-02-22 19:01:43 -05:00
Andrew Gallant
354996a16f doc: clarify snap installation instructions
Fixes #782
2018-02-22 16:56:22 -05:00
Andrew Gallant
cbebb010a7 doc: fix typos 2018-02-21 15:59:12 -05:00
Andrew Gallant
7098daf6a8 doc: "to rip" means "fast"
Answer the origin story of ripgrep's name.
2018-02-21 15:53:50 -05:00
Andrew Gallant
17d09c0882 pkg: update brew tap to 0.8.1 2018-02-20 21:51:41 -05:00
Andrew Gallant
c8e9f25b85 ci: fix macOS asciidoc installation 2018-02-20 21:05:40 -05:00
Andrew Gallant
9305f89f39 ci: fix man page generation on macOS 2018-02-20 20:55:12 -05:00
Andrew Gallant
9c216ad9a4 release: 0.8.1 2018-02-20 20:19:03 -05:00
Andrew Gallant
6862e07870 changelog: 0.8.1 2018-02-20 20:18:01 -05:00
Andrew Gallant
a6d09b2d42 deps: update to clap 2.30.0 2018-02-20 20:16:57 -05:00
Andrew Gallant
ab1b877c20 termcolor: release 0.3.5 2018-02-20 20:15:08 -05:00
Andrew Gallant
2b5c488814 ignore: release 0.4.1 2018-02-20 20:13:56 -05:00
Andrew Gallant
cb47be938e ci: build man page on ARM cross-compile
This fixes a bug where ripgrep's man page wasn't generated in the ARM
cross-compile build. Mostly, this should just require installing
asciidoc and making sure we test that it actually works.

Fixes #791
2018-02-20 20:05:55 -05:00
Andrew Gallant
fe9be658f4 doc: omit revision when it isn't available
If the revision is empty, then we shouldn't show the `(rev )` text in
the output of `rg --version`.

Fixes #789
2018-02-20 20:05:55 -05:00
Andrew Gallant
8c800adab7 doc: clarify failure mode
This adds a hint for end users that run into a common failure mode where
ripgrep won't search any files because they have a `*` rule in their
`$HOME/.gitignore`.

Fixes #815
2018-02-20 20:05:55 -05:00
Andrew Gallant
d65966efbc ignore: fix performance regression on Windows
This commit fixes a performance regression in Windows that resulted from
fallout from fixing #705. In particular, we introduced an additional
stat call for every single directory entry, which can be quite
disastrous for performance.

There is a corresponding companion PR that fixes the same bug in
walkdir: https://github.com/BurntSushi/walkdir/pull/96

Fixes #820
2018-02-20 19:50:52 -05:00
Markus Westerlind
597bf04a56 termcolor: add ?Sized bound for &mut T impl 2018-02-20 07:13:37 -05:00
Balaji Sivaraman
c78ab9e669 termcolor: improve "intense" docs
Fixes #797
2018-02-20 07:11:25 -05:00
Balaji Sivaraman
d57fc58081 termcolor: add underline support
This commit adds underline support to the termcolor crate, and
exposes it through ripgrep.

Fixes #798
2018-02-20 07:10:03 -05:00
Andrew Gallant
d09538c974 doc: clarify Debian/Ubuntu install instructions
Thanks @x4121 for the tip!
2018-02-20 07:02:08 -05:00
David Peter
94768881e1 doc: fix typo in Debian install instructions 2018-02-20 06:59:21 -05:00
Andrey Kolomoets
f3a9ced82c types: add csv 2018-02-19 20:59:15 -05:00
Andrew Gallant
18f549d289 ignore: fix symlink following on Windows
This commit fixes a bug where symlinks were always being followed on
Windows, even if the user did not request it. This only impacts the
parallel iterator.

This is a regression from the fallout of fixing #705.

Fixes #824
2018-02-19 20:52:37 -05:00
Cosmin Lehene
c749b604dc doc: clarify --ignore-file flag
Fixes #684
2018-02-18 17:53:10 -05:00
Andrew Gallant
d6748a3445 doc: add .deb installation instructions
Generating a Debian binary package was pretty easy using `cargo deb`, so
it is now part of the release. This commit updates the README's
installation methods to reference it.

I did look into setting up a PPA for Ubuntu, but my eyes glazed over while
reading the documentation. Providing a binary Debian package is likely a
faux pas, but it is extraordinarily convenient.
2018-02-18 10:32:53 -05:00
Uwe Dauernheim
9b7f420faa doc: fix files-with-matches typo 2018-02-17 08:24:31 -05:00
Andrew Gallant
361698b90a ignore: fix improper hidden filtering
This commit fixes a bug where `rg --hidden .` would behave differently
with respect to ignore filtering than `rg --hidden ./`. In particular,
this was due to a bug where the directory name `.` caused the leading
`.` in a hidden directory to get stripped, which in turn caused the
ignore rules to fail.

Fixes #807
2018-02-14 18:16:38 -05:00
David Peter
b71a110ccf ignore: fix custom ignore name bug
This commit fixes a bug in the handling of custom gitignore file names.
Previously, the directory walker would check for whether there were any
ignore rules present, but this check didn't incorporate the custom gitignore
rules. At a high level, this permits custom gitignore names to be used
even if no other source of gitignore rules is used.

Fixes #800
2018-02-14 06:53:26 -05:00
unsignedint
5c1af3c25d types: add VHDL 2018-02-13 07:32:16 -05:00
Keith Devens
ad3f55b0e5 doc: the year is 2018 2018-02-12 18:30:02 -05:00
Andrew Gallant
b8e6d50bbe doc: add "grep replacement" question to FAQ
I am tired of being throwing "but ripgrep is marketed as a grep
replacement" in my face. Let's answer it once and for all.
2018-02-12 17:57:14 -05:00
KARASZI István
81afe8c5a0 doc: fix typo 2018-02-12 07:16:10 -05:00
Andrew Gallant
c4e0d4bd7b pkg: update brew tap to 0.8.0, take 2 2018-02-11 20:39:12 -05:00
196 changed files with 37168 additions and 27791 deletions

195
.github/workflows/ci.yml vendored Normal file
View File

@@ -0,0 +1,195 @@
name: ci
on:
pull_request:
push:
branches:
- master
schedule:
- cron: '00 01 * * *'
jobs:
test:
name: test
env:
# For some builds, we use cross to test on 32-bit and big-endian
# systems.
CARGO: cargo
# When CARGO is set to CROSS, this is set to `--target matrix.target`.
TARGET_FLAGS:
# When CARGO is set to CROSS, TARGET_DIR includes matrix.target.
TARGET_DIR: ./target
# Emit backtraces on panics.
RUST_BACKTRACE: 1
runs-on: ${{ matrix.os }}
strategy:
matrix:
build:
# We test ripgrep on a pinned version of Rust, along with the moving
# targets of 'stable' and 'beta' for good measure.
- pinned
- stable
- beta
# Our release builds are generated by a nightly compiler to take
# advantage of the latest optimizations/compile time improvements. So
# we test all of them here. (We don't do mips releases, but test on
# mips for big-endian coverage.)
- nightly
- nightly-musl
- nightly-32
- nightly-mips
- nightly-arm
- macos
- win-msvc
- win-gnu
include:
- build: pinned
os: ubuntu-18.04
rust: 1.41.0
- build: stable
os: ubuntu-18.04
rust: stable
- build: beta
os: ubuntu-18.04
rust: beta
- build: nightly
os: ubuntu-18.04
rust: nightly
- build: nightly-musl
os: ubuntu-18.04
rust: nightly
target: x86_64-unknown-linux-musl
- build: nightly-32
os: ubuntu-18.04
rust: nightly
target: i686-unknown-linux-gnu
- build: nightly-mips
os: ubuntu-18.04
rust: nightly
target: mips64-unknown-linux-gnuabi64
- build: nightly-arm
os: ubuntu-18.04
rust: nightly
# For stripping release binaries:
# docker run --rm -v $PWD/target:/target:Z \
# rustembedded/cross:arm-unknown-linux-gnueabihf \
# arm-linux-gnueabihf-strip \
# /target/arm-unknown-linux-gnueabihf/debug/rg
target: arm-unknown-linux-gnueabihf
- build: macos
os: macos-latest
rust: nightly
- build: win-msvc
os: windows-2019
rust: nightly
- build: win-gnu
os: windows-2019
rust: nightly-x86_64-gnu
steps:
- name: Checkout repository
uses: actions/checkout@v1
- name: Install packages (Ubuntu)
if: matrix.os == 'ubuntu-18.04'
run: |
ci/ubuntu-install-packages
- name: Install packages (macOS)
if: matrix.os == 'macos-latest'
run: |
ci/macos-install-packages
- name: Install Rust
uses: actions-rs/toolchain@v1
with:
toolchain: ${{ matrix.rust }}
profile: minimal
override: true
- name: Use Cross
if: matrix.target != ''
run: |
# FIXME: to work around bugs in latest cross release, install master.
# See: https://github.com/rust-embedded/cross/issues/357
cargo install --git https://github.com/rust-embedded/cross
echo "::set-env name=CARGO::cross"
echo "::set-env name=TARGET_FLAGS::--target ${{ matrix.target }}"
echo "::set-env name=TARGET_DIR::./target/${{ matrix.target }}"
- name: Show command used for Cargo
run: |
echo "cargo command is: ${{ env.CARGO }}"
echo "target flag is: ${{ env.TARGET_FLAGS }}"
- name: Build ripgrep and all crates
run: ${{ env.CARGO }} build --verbose --all ${{ env.TARGET_FLAGS }}
- name: Build ripgrep with PCRE2
run: ${{ env.CARGO }} build --verbose --all --features pcre2 ${{ env.TARGET_FLAGS }}
# This is useful for debugging problems when the expected build artifacts
# (like shell completions and man pages) aren't generated.
- name: Show build.rs stderr
shell: bash
run: |
set +x
stderr="$(find "${{ env.TARGET_DIR }}/debug" -name stderr -print0 | xargs -0 ls -t | head -n1)"
if [ -s "$stderr" ]; then
echo "===== $stderr ===== "
cat "$stderr"
echo "====="
fi
set -x
- name: Run tests with PCRE2 (sans cross)
if: matrix.target == ''
run: ${{ env.CARGO }} test --verbose --all --features pcre2 ${{ env.TARGET_FLAGS }}
- name: Run tests without PCRE2 (with cross)
# These tests should actually work, but they almost double the runtime.
# Every integration test spins up qemu to run 'rg', and when PCRE2 is
# enabled, every integration test is run twice: one with the default
# regex engine and once with PCRE2.
if: matrix.target != ''
run: ${{ env.CARGO }} test --verbose --all ${{ env.TARGET_FLAGS }}
- name: Test for existence of build artifacts (Windows)
if: matrix.os == 'windows-2019'
shell: bash
run: |
outdir="$(ci/cargo-out-dir "${{ env.TARGET_DIR }}")"
ls "$outdir/_rg.ps1" && file "$outdir/_rg.ps1"
- name: Test for existence of build artifacts (Unix)
if: matrix.os != 'windows-2019'
shell: bash
run: |
outdir="$(ci/cargo-out-dir "${{ env.TARGET_DIR }}")"
for f in rg.bash rg.fish rg.1; do
# We could use file -E here, but it isn't supported on macOS.
ls "$outdir/$f" && file "$outdir/$f"
done
- name: Test zsh shell completions (Unix, sans cross)
# We could test this when using Cross, but we'd have to execute the
# 'rg' binary (done in test-complete) with qemu, which is a pain and
# doesn't really gain us much. If shell completion works in one place,
# it probably works everywhere.
if: matrix.target == '' && matrix.os != 'windows-2019'
shell: bash
run: ci/test-complete
rustfmt:
name: rustfmt
runs-on: ubuntu-18.04
steps:
- name: Checkout repository
uses: actions/checkout@v1
- name: Install Rust
uses: actions-rs/toolchain@v1
with:
toolchain: stable
override: true
profile: minimal
components: rustfmt
- name: Check formatting
run: |
cargo fmt --all -- --check

211
.github/workflows/release.yml vendored Normal file
View File

@@ -0,0 +1,211 @@
# The way this works is a little weird. But basically, the create-release job
# runs purely to initialize the GitHub release itself. Once done, the upload
# URL of the release is saved as an artifact.
#
# The build-release job runs only once create-release is finished. It gets
# the release upload URL by downloading the corresponding artifact (which was
# uploaded by create-release). It then builds the release executables for each
# supported platform and attaches them as release assets to the previously
# created release.
#
# The key here is that we create the release only once.
name: release
on:
push:
# Enable when testing release infrastructure on a branch.
# branches:
# - ag/release
tags:
- '[0-9]+.[0-9]+.[0-9]+'
jobs:
create-release:
name: create-release
runs-on: ubuntu-latest
env:
# Set to force version number, e.g., when no tag exists.
# RG_VERSION: TEST-0.0.0
steps:
- name: Create artifacts directory
run: mkdir artifacts
- name: Get the release version from the tag
if: env.RG_VERSION == ''
run: |
# Apparently, this is the right way to get a tag name. Really?
#
# See: https://github.community/t5/GitHub-Actions/How-to-get-just-the-tag-name/m-p/32167/highlight/true#M1027
echo "::set-env name=RG_VERSION::${GITHUB_REF#refs/tags/}"
echo "version is: ${{ env.RG_VERSION }}"
- name: Create GitHub release
id: release
uses: actions/create-release@v1
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
tag_name: ${{ env.RG_VERSION }}
release_name: ripgrep ${{ env.RG_VERSION }}
- name: Save release upload URL to artifact
run: echo "${{ steps.release.outputs.upload_url }}" > artifacts/release-upload-url
- name: Save version number to artifact
run: echo "${{ env.RG_VERSION }}" > artifacts/release-version
- name: Upload artifacts
uses: actions/upload-artifact@v1
with:
name: artifacts
path: artifacts
build-release:
name: build-release
needs: ['create-release']
runs-on: ${{ matrix.os }}
env:
# For some builds, we use cross to test on 32-bit and big-endian
# systems.
CARGO: cargo
# When CARGO is set to CROSS, this is set to `--target matrix.target`.
TARGET_FLAGS:
# When CARGO is set to CROSS, TARGET_DIR includes matrix.target.
TARGET_DIR: ./target
# Emit backtraces on panics.
RUST_BACKTRACE: 1
# Build static releases with PCRE2.
PCRE2_SYS_STATIC: 1
strategy:
matrix:
build: [linux, linux-arm, macos, win-msvc, win-gnu, win32-msvc]
include:
- build: linux
os: ubuntu-18.04
rust: nightly
target: x86_64-unknown-linux-musl
- build: linux-arm
os: ubuntu-18.04
rust: nightly
target: arm-unknown-linux-gnueabihf
- build: macos
os: macos-latest
rust: nightly
target: x86_64-apple-darwin
- build: win-msvc
os: windows-2019
rust: nightly
target: x86_64-pc-windows-msvc
- build: win-gnu
os: windows-2019
rust: nightly-x86_64-gnu
target: x86_64-pc-windows-gnu
- build: win32-msvc
os: windows-2019
rust: nightly
target: i686-pc-windows-msvc
steps:
- name: Checkout repository
uses: actions/checkout@v1
with:
fetch-depth: 1
- name: Install packages (Ubuntu)
if: matrix.os == 'ubuntu-18.04'
run: |
ci/ubuntu-install-packages
- name: Install packages (macOS)
if: matrix.os == 'macos-latest'
run: |
ci/macos-install-packages
- name: Install Rust
uses: actions-rs/toolchain@v1
with:
toolchain: ${{ matrix.rust }}
profile: minimal
override: true
target: ${{ matrix.target }}
- name: Use Cross
# if: matrix.os != 'windows-2019'
run: |
# FIXME: to work around bugs in latest cross release, install master.
# See: https://github.com/rust-embedded/cross/issues/357
cargo install --git https://github.com/rust-embedded/cross
echo "::set-env name=CARGO::cross"
echo "::set-env name=TARGET_FLAGS::--target ${{ matrix.target }}"
echo "::set-env name=TARGET_DIR::./target/${{ matrix.target }}"
- name: Show command used for Cargo
run: |
echo "cargo command is: ${{ env.CARGO }}"
echo "target flag is: ${{ env.TARGET_FLAGS }}"
echo "target dir is: ${{ env.TARGET_DIR }}"
- name: Get release download URL
uses: actions/download-artifact@v1
with:
name: artifacts
path: artifacts
- name: Set release upload URL and release version
shell: bash
run: |
release_upload_url="$(cat artifacts/release-upload-url)"
echo "::set-env name=RELEASE_UPLOAD_URL::$release_upload_url"
echo "release upload url: $RELEASE_UPLOAD_URL"
release_version="$(cat artifacts/release-version)"
echo "::set-env name=RELEASE_VERSION::$release_version"
echo "release version: $RELEASE_VERSION"
- name: Build release binary
run: ${{ env.CARGO }} build --verbose --release --features pcre2 ${{ env.TARGET_FLAGS }}
- name: Strip release binary (linux and macos)
if: matrix.build == 'linux' || matrix.build == 'macos'
run: strip "target/${{ matrix.target }}/release/rg"
- name: Strip release binary (arm)
if: matrix.build == 'linux-arm'
run: |
docker run --rm -v \
"$PWD/target:/target:Z" \
rustembedded/cross:arm-unknown-linux-gnueabihf \
arm-linux-gnueabihf-strip \
/target/arm-unknown-linux-gnueabihf/release/rg
- name: Build archive
shell: bash
run: |
outdir="$(ci/cargo-out-dir "${{ env.TARGET_DIR }}")"
staging="ripgrep-${{ env.RELEASE_VERSION }}-${{ matrix.target }}"
mkdir -p "$staging"/{complete,doc}
cp {README.md,COPYING,UNLICENSE,LICENSE-MIT} "$staging/"
cp {CHANGELOG.md,FAQ.md,GUIDE.md} "$staging/doc/"
cp "$outdir"/{rg.bash,rg.fish,_rg.ps1} "$staging/complete/"
cp complete/_rg "$staging/complete/"
if [ "${{ matrix.os }}" = "windows-2019" ]; then
cp "target/${{ matrix.target }}/release/rg.exe" "$staging/"
7z a "$staging.zip" "$staging"
echo "::set-env name=ASSET::$staging.zip"
else
# The man page is only generated on Unix systems. ¯\_(ツ)_/¯
cp "$outdir"/rg.1 "$staging/doc/"
cp "target/${{ matrix.target }}/release/rg" "$staging/"
tar czf "$staging.tar.gz" "$staging"
echo "::set-env name=ASSET::$staging.tar.gz"
fi
- name: Upload release archive
uses: actions/upload-release-asset@v1.0.1
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
upload_url: ${{ env.RELEASE_UPLOAD_URL }}
asset_path: ${{ env.ASSET }}
asset_name: ${{ env.ASSET }}
asset_content_type: application/octet-stream

View File

@@ -1,93 +0,0 @@
language: rust
env:
global:
- PROJECT_NAME: ripgrep
- RUST_BACKTRACE: full
addons:
apt:
packages:
# For generating man page.
- libxslt1-dev
- asciidoc
- docbook-xsl
- xsltproc
- libxml2-utils
# Needed for completion-function test.
- zsh
# Needed for testing decompression search.
- xz-utils
matrix:
fast_finish: true
include:
# Nightly channel.
# All *nix releases are done on the nightly channel to take advantage
# of the regex library's multiple pattern SIMD search.
- os: linux
rust: nightly
env: TARGET=i686-unknown-linux-musl
- os: linux
rust: nightly
env: TARGET=x86_64-unknown-linux-musl
- os: osx
rust: nightly
env: TARGET=x86_64-apple-darwin
- os: linux
rust: nightly
env: TARGET=arm-unknown-linux-gnueabihf GCC_VERSION=4.8
addons:
apt:
packages:
- gcc-4.8-arm-linux-gnueabihf
- binutils-arm-linux-gnueabihf
- libc6-armhf-cross
- libc6-dev-armhf-cross
# Beta channel. We enable these to make sure there are no regressions in
# Rust beta releases.
- os: linux
rust: beta
env: TARGET=x86_64-unknown-linux-musl
- os: linux
rust: beta
env: TARGET=x86_64-unknown-linux-gnu
# Minimum Rust supported channel. We enable these to make sure ripgrep
# continues to work on the advertised minimum Rust version.
- os: linux
rust: 1.20.0
env: TARGET=x86_64-unknown-linux-gnu
- os: linux
rust: 1.20.0
env: TARGET=x86_64-unknown-linux-musl
- os: linux
rust: 1.20.0
env: TARGET=arm-unknown-linux-gnueabihf GCC_VERSION=4.8
addons:
apt:
packages:
- gcc-4.8-arm-linux-gnueabihf
- binutils-arm-linux-gnueabihf
- libc6-armhf-cross
- libc6-dev-armhf-cross
install: ci/install.sh
script: ci/script.sh
before_deploy: ci/before_deploy.sh
deploy:
provider: releases
file_glob: true
file: deployment/${PROJECT_NAME}-${TRAVIS_TAG}-${TARGET}.tar.gz
skip_cleanup: true
on:
condition: $TRAVIS_RUST_VERSION = nightly
branch: master
tags: true
api_key:
secure: "IbSnsbGkxSydR/sozOf1/SRvHplzwRUHzcTjM7BKnr7GccL86gRPUrsrvD103KjQUGWIc1TnK1YTq5M0Onswg/ORDjqa1JEJPkPdPnVh9ipbF7M2De/7IlB4X4qXLKoApn8+bx2x/mfYXu4G+G1/2QdbaKK2yfXZKyjz0YFx+6CNrVCT2Nk8q7aHvOOzAL58vsG8iPDpupuhxlMDDn/UhyOWVInmPPQ0iJR1ZUJN8xJwXvKvBbfp3AhaBiAzkhXHNLgBR8QC5noWWMXnuVDMY3k4f3ic0V+p/qGUCN/nhptuceLxKFicMCYObSZeUzE5RAI0/OBW7l3z2iCoc+TbAnn+JrX/ObJCfzgAOXAU3tLaBFMiqQPGFKjKg1ltSYXomOFP/F7zALjpvFp4lYTBajRR+O3dqaxA9UQuRjw27vOeUpMcga4ZzL4VXFHzrxZKBHN//XIGjYAVhJ1NSSeGpeJV5/+jYzzWKfwSagRxQyVCzMooYFFXzn8Yxdm3PJlmp3GaAogNkdB9qKcrEvRINCelalzALPi0hD/HUDi8DD2PNTCLLMo6VSYtvc685Zbe+KgNzDV1YyTrRCUW6JotrS0r2ULLwnsh40hSB//nNv3XmwNmC/CmW5QAnIGj8cBMF4S2t6ohADIndojdAfNiptmaZOIT6owK7bWMgPMyopo="
branches:
only:
# Pushes and PR to the master branch
- master
# Ruby regex to match tags. Required, or travis won't trigger deploys when
# a new tag is pushed.
- /^\d+\.\d+\.\d+.*$/
notifications:
email:
on_success: never

View File

@@ -1,4 +1,499 @@
0.8.0 (2017-02-11)
12.0.0 (2020-03-15)
===================
ripgrep 12 is a new major version release of ripgrep that contains many bug
fixes, several important performance improvements and a few minor new features.
In a near future release, I am hoping to add an
[indexing feature](https://github.com/BurntSushi/ripgrep/issues/1497)
to ripgrep, which will dramatically speed up searching by building an index.
Feedback would very much be appreciated, especially on the user experience
which will be difficult to get right.
This release has no known breaking changes.
Deprecations:
* The `--no-pcre2-unicode` flag is deprecated. Instead, use the `--no-unicode`
flag, which applies to both the default regex engine and PCRE2. For now,
`--no-pcre2-unicode` and `--pcre2-unicode` are aliases to `--no-unicode`
and `--unicode`, respectively. The `--[no-]pcre2-unicode` flags may be
removed in a future release.
* The `--auto-hybrid-regex` flag is deprecated. Instead, use the new `--engine`
flag with the `auto` value.
Performance improvements:
* [PERF #1087](https://github.com/BurntSushi/ripgrep/pull/1087):
ripgrep is smarter when detected literals are whitespace.
* [PERF #1381](https://github.com/BurntSushi/ripgrep/pull/1381):
Directory traversal is sped up with speculative ignore-file existence checks.
* [PERF cd8ec38a](https://github.com/BurntSushi/ripgrep/commit/cd8ec38a):
Improve inner literal detection to cover more cases more effectively.
e.g., ` +Sherlock Holmes +` now has ` Sherlock Holmes ` extracted instead
of ` `.
* [PERF 6a0e0147](https://github.com/BurntSushi/ripgrep/commit/6a0e0147):
Improve literal detection when the `-w/--word-regexp` flag is used.
* [PERF ad97e9c9](https://github.com/BurntSushi/ripgrep/commit/ad97e9c9):
Improve overall performance of the `-w/--word-regexp` flag.
Feature enhancements:
* Added or improved file type filtering for erb, diff, Gradle, HAML, Org,
Postscript, Skim, Slim, Slime, RPM Spec files, Typoscript, xml.
* [FEATURE #1370](https://github.com/BurntSushi/ripgrep/pull/1370):
Add `--include-zero` flag that shows files searched without matches.
* [FEATURE #1390](https://github.com/BurntSushi/ripgrep/pull/1390):
Add `--no-context-separator` flag that always hides context separators.
* [FEATURE #1414](https://github.com/BurntSushi/ripgrep/pull/1414):
Add `--no-require-git` flag to allow ripgrep to respect gitignores anywhere.
* [FEATURE #1420](https://github.com/BurntSushi/ripgrep/pull/1420):
Add `--no-ignore-exclude` to disregard rules in `.git/info/exclude` files.
* [FEATURE #1466](https://github.com/BurntSushi/ripgrep/pull/1466):
Add `--no-ignore-files` flag to disable all `--ignore-file` flags.
* [FEATURE #1488](https://github.com/BurntSushi/ripgrep/pull/1488):
Add '--engine' flag for easier switching between regex engines.
* [FEATURE 75cbe88f](https://github.com/BurntSushi/ripgrep/commit/75cbe88f):
Add `--no-unicode` flag. This works on all supported regex engines.
Bug fixes:
* [BUG #1291](https://github.com/BurntSushi/ripgrep/issues/1291):
ripgrep now works in non-existent directories.
* [BUG #1319](https://github.com/BurntSushi/ripgrep/issues/1319):
Fix match bug due to errant literal detection.
* [**BUG #1335**](https://github.com/BurntSushi/ripgrep/issues/1335):
Fixes a performance bug when searching plain text files with very long lines.
This was a serious performance regression in some cases.
* [BUG #1344](https://github.com/BurntSushi/ripgrep/issues/1344):
Document usage of `--type all`.
* [BUG #1389](https://github.com/BurntSushi/ripgrep/issues/1389):
Fixes a bug where ripgrep would panic when searching a symlinked directory.
* [BUG #1439](https://github.com/BurntSushi/ripgrep/issues/1439):
Improve documentation for ripgrep's automatic stdin detection.
* [BUG #1441](https://github.com/BurntSushi/ripgrep/issues/1441):
Remove CPU features from man page.
* [BUG #1442](https://github.com/BurntSushi/ripgrep/issues/1442),
[BUG #1478](https://github.com/BurntSushi/ripgrep/issues/1478):
Improve documentation of the `-g/--glob` flag.
* [BUG #1445](https://github.com/BurntSushi/ripgrep/issues/1445):
ripgrep now respects ignore rules from .git/info/exclude in worktrees.
* [BUG #1485](https://github.com/BurntSushi/ripgrep/issues/1485):
Fish shell completions from the release Debian package are now installed to
`/usr/share/fish/vendor_completions.d/rg.fish`.
11.0.2 (2019-08-01)
===================
ripgrep 11.0.2 is a new patch release that fixes a few bugs, including a
performance regression and a matching bug when using the `-F/--fixed-strings`
flag.
Feature enhancements:
* [FEATURE #1293](https://github.com/BurntSushi/ripgrep/issues/1293):
Added `--glob-case-insensitive` flag that makes `--glob` behave as `--iglob`.
Bug fixes:
* [BUG #1246](https://github.com/BurntSushi/ripgrep/issues/1246):
Add translations to README, starting with an unofficial Chinese translation.
* [BUG #1259](https://github.com/BurntSushi/ripgrep/issues/1259):
Fix bug where the last byte of a `-f file` was stripped if it wasn't a `\n`.
* [BUG #1261](https://github.com/BurntSushi/ripgrep/issues/1261):
Document that no error is reported when searching for `\n` with `-P/--pcre2`.
* [BUG #1284](https://github.com/BurntSushi/ripgrep/issues/1284):
Mention `.ignore` and `.rgignore` more prominently in the README.
* [BUG #1292](https://github.com/BurntSushi/ripgrep/issues/1292):
Fix bug where `--with-filename` was sometimes enabled incorrectly.
* [BUG #1268](https://github.com/BurntSushi/ripgrep/issues/1268):
Fix major performance regression in GitHub `x86_64-linux` binary release.
* [BUG #1302](https://github.com/BurntSushi/ripgrep/issues/1302):
Show better error messages when a non-existent preprocessor command is given.
* [BUG #1334](https://github.com/BurntSushi/ripgrep/issues/1334):
Fix match regression with `-F` flag when patterns contain meta characters.
11.0.1 (2019-04-16)
===================
ripgrep 11.0.1 is a new patch release that fixes a search regression introduced
in the previous 11.0.0 release. In particular, ripgrep can enter an infinite
loop for some search patterns when searching invalid UTF-8.
Bug fixes:
* [BUG #1247](https://github.com/BurntSushi/ripgrep/issues/1247):
Fix search bug that can cause ripgrep to enter an infinite loop.
11.0.0 (2019-04-15)
===================
ripgrep 11 is a new major version release of ripgrep that contains many bug
fixes, some performance improvements and a few feature enhancements. Notably,
ripgrep's user experience for binary file filtering has been improved. See the
[guide's new section on binary data](GUIDE.md#binary-data) for more details.
This release also marks a change in ripgrep's versioning. Where as the previous
version was `0.10.0`, this version is `11.0.0`. Moving forward, ripgrep's
major version will be increased a few times per year. ripgrep will continue to
be conservative with respect to backwards compatibility, but may occasionally
introduce breaking changes, which will always be documented in this CHANGELOG.
See [issue 1172](https://github.com/BurntSushi/ripgrep/issues/1172) for a bit
more detail on why this versioning change was made.
This release increases the **minimum supported Rust version** from 1.28.0 to
1.34.0.
**BREAKING CHANGES**:
* ripgrep has tweaked its exit status codes to be more like GNU grep's. Namely,
if a non-fatal error occurs during a search, then ripgrep will now always
emit a `2` exit status code, regardless of whether a match is found or not.
Previously, ripgrep would only emit a `2` exit status code for a catastrophic
error (e.g., regex syntax error). One exception to this is if ripgrep is run
with `-q/--quiet`. In that case, if an error occurs and a match is found,
then ripgrep will exit with a `0` exit status code.
* Supplying the `-u/--unrestricted` flag three times is now equivalent to
supplying `--no-ignore --hidden --binary`. Previously, `-uuu` was equivalent
to `--no-ignore --hidden --text`. The difference is that `--binary` disables
binary file filtering without potentially dumping binary data into your
terminal. That is, `rg -uuu foo` should now be equivalent to `grep -r foo`.
* The `avx-accel` feature of ripgrep has been removed since it is no longer
necessary. All uses of AVX in ripgrep are now enabled automatically via
runtime CPU feature detection. The `simd-accel` feature does remain available
(only for enabling SIMD for transcoding), however, it does increase
compilation times substantially at the moment.
Performance improvements:
* [PERF #497](https://github.com/BurntSushi/ripgrep/issues/497),
[PERF #838](https://github.com/BurntSushi/ripgrep/issues/838):
Make `rg -F -f dictionary-of-literals` much faster.
Feature enhancements:
* Added or improved file type filtering for Apache Thrift, ASP, Bazel, Brotli,
BuildStream, bzip2, C, C++, Cython, gzip, Java, Make, Postscript, QML, Tex,
XML, xz, zig and zstd.
* [FEATURE #855](https://github.com/BurntSushi/ripgrep/issues/855):
Add `--binary` flag for disabling binary file filtering.
* [FEATURE #1078](https://github.com/BurntSushi/ripgrep/pull/1078):
Add `--max-columns-preview` flag for showing a preview of long lines.
* [FEATURE #1099](https://github.com/BurntSushi/ripgrep/pull/1099):
Add support for Brotli and Zstd to the `-z/--search-zip` flag.
* [FEATURE #1138](https://github.com/BurntSushi/ripgrep/pull/1138):
Add `--no-ignore-dot` flag for ignoring `.ignore` files.
* [FEATURE #1155](https://github.com/BurntSushi/ripgrep/pull/1155):
Add `--auto-hybrid-regex` flag for automatically falling back to PCRE2.
* [FEATURE #1159](https://github.com/BurntSushi/ripgrep/pull/1159):
ripgrep's exit status logic should now match GNU grep. See updated man page.
* [FEATURE #1164](https://github.com/BurntSushi/ripgrep/pull/1164):
Add `--ignore-file-case-insensitive` for case insensitive ignore globs.
* [FEATURE #1185](https://github.com/BurntSushi/ripgrep/pull/1185):
Add `-I` flag as a short option for the `--no-filename` flag.
* [FEATURE #1207](https://github.com/BurntSushi/ripgrep/pull/1207):
Add `none` value to `-E/--encoding` to forcefully disable all transcoding.
* [FEATURE da9d7204](https://github.com/BurntSushi/ripgrep/commit/da9d7204):
Add `--pcre2-version` for querying showing PCRE2 version information.
Bug fixes:
* [BUG #306](https://github.com/BurntSushi/ripgrep/issues/306),
[BUG #855](https://github.com/BurntSushi/ripgrep/issues/855):
Improve the user experience for ripgrep's binary file filtering.
* [BUG #373](https://github.com/BurntSushi/ripgrep/issues/373),
[BUG #1098](https://github.com/BurntSushi/ripgrep/issues/1098):
`**` is now accepted as valid syntax anywhere in a glob.
* [BUG #916](https://github.com/BurntSushi/ripgrep/issues/916):
ripgrep no longer hangs when searching `/proc` with a zombie process present.
* [BUG #1052](https://github.com/BurntSushi/ripgrep/issues/1052):
Fix bug where ripgrep could panic when transcoding UTF-16 files.
* [BUG #1055](https://github.com/BurntSushi/ripgrep/issues/1055):
Suggest `-U/--multiline` when a pattern contains a `\n`.
* [BUG #1063](https://github.com/BurntSushi/ripgrep/issues/1063):
Always strip a BOM if it's present, even for UTF-8.
* [BUG #1064](https://github.com/BurntSushi/ripgrep/issues/1064):
Fix inner literal detection that could lead to incorrect matches.
* [BUG #1079](https://github.com/BurntSushi/ripgrep/issues/1079):
Fixes a bug where the order of globs could result in missing a match.
* [BUG #1089](https://github.com/BurntSushi/ripgrep/issues/1089):
Fix another bug where ripgrep could panic when transcoding UTF-16 files.
* [BUG #1091](https://github.com/BurntSushi/ripgrep/issues/1091):
Add note about inverted flags to the man page.
* [BUG #1093](https://github.com/BurntSushi/ripgrep/pull/1093):
Fix handling of literal slashes in gitignore patterns.
* [BUG #1095](https://github.com/BurntSushi/ripgrep/issues/1095):
Fix corner cases involving the `--crlf` flag.
* [BUG #1101](https://github.com/BurntSushi/ripgrep/issues/1101):
Fix AsciiDoc escaping for man page output.
* [BUG #1103](https://github.com/BurntSushi/ripgrep/issues/1103):
Clarify what `--encoding auto` does.
* [BUG #1106](https://github.com/BurntSushi/ripgrep/issues/1106):
`--files-with-matches` and `--files-without-match` work with one file.
* [BUG #1121](https://github.com/BurntSushi/ripgrep/issues/1121):
Fix bug that was triggering Windows antimalware when using the `--files`
flag.
* [BUG #1125](https://github.com/BurntSushi/ripgrep/issues/1125),
[BUG #1159](https://github.com/BurntSushi/ripgrep/issues/1159):
ripgrep shouldn't panic for `rg -h | rg` and should emit correct exit status.
* [BUG #1144](https://github.com/BurntSushi/ripgrep/issues/1144):
Fixes a bug where line numbers could be wrong on big-endian machines.
* [BUG #1154](https://github.com/BurntSushi/ripgrep/issues/1154):
Windows files with "hidden" attribute are now treated as hidden.
* [BUG #1173](https://github.com/BurntSushi/ripgrep/issues/1173):
Fix handling of `**` patterns in gitignore files.
* [BUG #1174](https://github.com/BurntSushi/ripgrep/issues/1174):
Fix handling of repeated `**` patterns in gitignore files.
* [BUG #1176](https://github.com/BurntSushi/ripgrep/issues/1176):
Fix bug where `-F`/`-x` weren't applied to patterns given via `-f`.
* [BUG #1189](https://github.com/BurntSushi/ripgrep/issues/1189):
Document cases where ripgrep may use a lot of memory.
* [BUG #1203](https://github.com/BurntSushi/ripgrep/issues/1203):
Fix a matching bug related to the suffix literal optimization.
* [BUG 8f14cb18](https://github.com/BurntSushi/ripgrep/commit/8f14cb18):
Increase the default stack size for PCRE2's JIT.
0.10.0 (2018-09-07)
===================
This is a new minor version release of ripgrep that contains some major new
features, a huge number of bug fixes, and is the first release based on
libripgrep. The entirety of ripgrep's core search and printing code has been
rewritten and generalized so that anyone can make use of it.
Major new features include PCRE2 support, multi-line search and a JSON output
format.
**BREAKING CHANGES**:
* The minimum version required to compile Rust has now changed to track the
latest stable version of Rust. Patch releases will continue to compile with
the same version of Rust as the previous patch release, but new minor
versions will use the current stable version of the Rust compile as its
minimum supported version.
* The match semantics of `-w/--word-regexp` have changed slightly. They used
to be `\b(?:<your pattern>)\b`, but now it's
`(?:^|\W)(?:<your pattern>)(?:$|\W)`. This matches the behavior of GNU grep
and is believed to be closer to the intended semantics of the flag. See
[#389](https://github.com/BurntSushi/ripgrep/issues/389) for more details.
Feature enhancements:
* [FEATURE #162](https://github.com/BurntSushi/ripgrep/issues/162):
libripgrep is now a thing. The primary crate is
[`grep`](https://docs.rs/grep).
* [FEATURE #176](https://github.com/BurntSushi/ripgrep/issues/176):
Add `-U/--multiline` flag that permits matching over multiple lines.
* [FEATURE #188](https://github.com/BurntSushi/ripgrep/issues/188):
Add `-P/--pcre2` flag that gives support for look-around and backreferences.
* [FEATURE #244](https://github.com/BurntSushi/ripgrep/issues/244):
Add `--json` flag that prints results in a JSON Lines format.
* [FEATURE #321](https://github.com/BurntSushi/ripgrep/issues/321):
Add `--one-file-system` flag to skip directories on different file systems.
* [FEATURE #404](https://github.com/BurntSushi/ripgrep/issues/404):
Add `--sort` and `--sortr` flag for more sorting. Deprecate `--sort-files`.
* [FEATURE #416](https://github.com/BurntSushi/ripgrep/issues/416):
Add `--crlf` flag to permit `$` to work with carriage returns on Windows.
* [FEATURE #917](https://github.com/BurntSushi/ripgrep/issues/917):
The `--trim` flag strips prefix whitespace from all lines printed.
* [FEATURE #993](https://github.com/BurntSushi/ripgrep/issues/993):
Add `--null-data` flag, which makes ripgrep use NUL as a line terminator.
* [FEATURE #997](https://github.com/BurntSushi/ripgrep/issues/997):
The `--passthru` flag now works with the `--replace` flag.
* [FEATURE #1038-1](https://github.com/BurntSushi/ripgrep/issues/1038):
Add `--line-buffered` and `--block-buffered` for forcing a buffer strategy.
* [FEATURE #1038-2](https://github.com/BurntSushi/ripgrep/issues/1038):
Add `--pre-glob` for filtering files through the `--pre` flag.
Bug fixes:
* [BUG #2](https://github.com/BurntSushi/ripgrep/issues/2):
Searching with non-zero context can now use memory maps if appropriate.
* [BUG #200](https://github.com/BurntSushi/ripgrep/issues/200):
ripgrep will now stop correctly when its output pipe is closed.
* [BUG #389](https://github.com/BurntSushi/ripgrep/issues/389):
The `-w/--word-regexp` flag now works more intuitively.
* [BUG #643](https://github.com/BurntSushi/ripgrep/issues/643):
Detection of readable stdin has improved on Windows.
* [BUG #441](https://github.com/BurntSushi/ripgrep/issues/441),
[BUG #690](https://github.com/BurntSushi/ripgrep/issues/690),
[BUG #980](https://github.com/BurntSushi/ripgrep/issues/980):
Matching empty lines now works correctly in several corner cases.
* [BUG #764](https://github.com/BurntSushi/ripgrep/issues/764):
Color escape sequences now coalesce, which reduces output size.
* [BUG #842](https://github.com/BurntSushi/ripgrep/issues/842):
Add man page to binary Debian package.
* [BUG #922](https://github.com/BurntSushi/ripgrep/issues/922):
ripgrep is now more robust with respect to memory maps failing.
* [BUG #937](https://github.com/BurntSushi/ripgrep/issues/937):
Color escape sequences are no longer emitted for empty matches.
* [BUG #940](https://github.com/BurntSushi/ripgrep/issues/940):
Context from the `--passthru` flag should not impact process exit status.
* [BUG #984](https://github.com/BurntSushi/ripgrep/issues/984):
Fixes bug in `ignore` crate where first path was always treated as a symlink.
* [BUG #990](https://github.com/BurntSushi/ripgrep/issues/990):
Read stderr asynchronously when running a process.
* [BUG #1013](https://github.com/BurntSushi/ripgrep/issues/1013):
Add compile time and runtime CPU features to `--version` output.
* [BUG #1028](https://github.com/BurntSushi/ripgrep/pull/1028):
Don't complete bare pattern after `-f` in zsh.
0.9.0 (2018-08-03)
==================
This is a new minor version release of ripgrep that contains some minor new
features and a panoply of bug fixes.
Releases provided on Github for `x86_64` will now work on all target CPUs, and
will also automatically take advantage of features found on modern CPUs (such
as AVX2) for additional optimizations.
This release increases the **minimum supported Rust version** from 1.20.0 to
1.23.0.
It is anticipated that the next release of ripgrep (0.10.0) will provide
multi-line search support and a JSON output format.
**BREAKING CHANGES**:
* When `--count` and `--only-matching` are provided simultaneously, the
behavior of ripgrep is as if the `--count-matches` flag was given. That is,
the total number of matches is reported, where there may be multiple matches
per line. Previously, the behavior of ripgrep was to report the total number
of matching lines. (Note that this behavior diverges from the behavior of
GNU grep.)
* Octal syntax is no longer supported. ripgrep previously accepted expressions
like `\1` as syntax for matching `U+0001`, but ripgrep will now report an
error instead.
* The `--line-number-width` flag has been removed. Its functionality was not
carefully considered with all ripgrep output formats.
See [#795](https://github.com/BurntSushi/ripgrep/issues/795) for more
details.
Feature enhancements:
* Added or improved file type filtering for Android, Bazel, Fuchsia, Haskell,
Java and Puppet.
* [FEATURE #411](https://github.com/BurntSushi/ripgrep/issues/411):
Add a `--stats` flag, which emits aggregate statistics after search results.
* [FEATURE #646](https://github.com/BurntSushi/ripgrep/issues/646):
Add a `--no-ignore-messages` flag, which suppresses parse errors from reading
`.ignore` and `.gitignore` files.
* [FEATURE #702](https://github.com/BurntSushi/ripgrep/issues/702):
Support `\u{..}` Unicode escape sequences.
* [FEATURE #812](https://github.com/BurntSushi/ripgrep/issues/812):
Add `-b/--byte-offset` flag that shows the byte offset of each matching line.
* [FEATURE #814](https://github.com/BurntSushi/ripgrep/issues/814):
Add `--count-matches` flag, which is like `--count`, but for each match.
* [FEATURE #880](https://github.com/BurntSushi/ripgrep/issues/880):
Add a `--no-column` flag, which disables column numbers in the output.
* [FEATURE #898](https://github.com/BurntSushi/ripgrep/issues/898):
Add support for `lz4` when using the `-z/--search-zip` flag.
* [FEATURE #924](https://github.com/BurntSushi/ripgrep/issues/924):
`termcolor` has moved to its own repository:
https://github.com/BurntSushi/termcolor
* [FEATURE #934](https://github.com/BurntSushi/ripgrep/issues/934):
Add a new flag, `--no-ignore-global`, that permits disabling global
gitignores.
* [FEATURE #967](https://github.com/BurntSushi/ripgrep/issues/967):
Rename `--maxdepth` to `--max-depth` for consistency. Keep `--maxdepth` for
backwards compatibility.
* [FEATURE #978](https://github.com/BurntSushi/ripgrep/issues/978):
Add a `--pre` option to filter inputs with an arbitrary program.
* [FEATURE fca9709d](https://github.com/BurntSushi/ripgrep/commit/fca9709d):
Improve zsh completion.
Bug fixes:
* [BUG #135](https://github.com/BurntSushi/ripgrep/issues/135):
Release portable binaries that conditionally use SSSE3, AVX2, etc., at
runtime.
* [BUG #268](https://github.com/BurntSushi/ripgrep/issues/268):
Print descriptive error message when trying to use look-around or
backreferences.
* [BUG #395](https://github.com/BurntSushi/ripgrep/issues/395):
Show comprehensible error messages for regexes like `\s*{`.
* [BUG #526](https://github.com/BurntSushi/ripgrep/issues/526):
Support backslash escapes in globs.
* [BUG #795](https://github.com/BurntSushi/ripgrep/issues/795):
Fix problems with `--line-number-width` by removing it.
* [BUG #832](https://github.com/BurntSushi/ripgrep/issues/832):
Clarify usage instructions for `-f/--file` flag.
* [BUG #835](https://github.com/BurntSushi/ripgrep/issues/835):
Fix small performance regression while crawling very large directory trees.
* [BUG #851](https://github.com/BurntSushi/ripgrep/issues/851):
Fix `-S/--smart-case` detection once and for all.
* [BUG #852](https://github.com/BurntSushi/ripgrep/issues/852):
Be robust with respect to `ENOMEM` errors returned by `mmap`.
* [BUG #853](https://github.com/BurntSushi/ripgrep/issues/853):
Upgrade `grep` crate to `regex-syntax 0.6.0`.
* [BUG #893](https://github.com/BurntSushi/ripgrep/issues/893):
Improve support for git submodules.
* [BUG #900](https://github.com/BurntSushi/ripgrep/issues/900):
When no patterns are given, ripgrep should never match anything.
* [BUG #907](https://github.com/BurntSushi/ripgrep/issues/907):
ripgrep will now stop traversing after the first file when `--quiet --files`
is used.
* [BUG #918](https://github.com/BurntSushi/ripgrep/issues/918):
Don't skip tar archives when `-z/--search-zip` is used.
* [BUG #934](https://github.com/BurntSushi/ripgrep/issues/934):
Don't respect gitignore files when searching outside git repositories.
* [BUG #948](https://github.com/BurntSushi/ripgrep/issues/948):
Use exit code 2 to indicate error, and use exit code 1 to indicate no
matches.
* [BUG #951](https://github.com/BurntSushi/ripgrep/issues/951):
Add stdin example to ripgrep usage documentation.
* [BUG #955](https://github.com/BurntSushi/ripgrep/issues/955):
Use buffered writing when not printing to a tty, which fixes a performance
regression.
* [BUG #957](https://github.com/BurntSushi/ripgrep/issues/957):
Improve the error message shown for `--path separator /` in some Windows
shells.
* [BUG #964](https://github.com/BurntSushi/ripgrep/issues/964):
Add a `--no-fixed-strings` flag to disable `-F/--fixed-strings`.
* [BUG #988](https://github.com/BurntSushi/ripgrep/issues/988):
Fix a bug in the `ignore` crate that prevented the use of explicit ignore
files after disabling all other ignore rules.
* [BUG #995](https://github.com/BurntSushi/ripgrep/issues/995):
Respect `$XDG_CONFIG_DIR/git/config` for detecting `core.excludesFile`.
0.8.1 (2018-02-20)
==================
This is a patch release of ripgrep that primarily fixes regressions introduced
in 0.8.0 (#820 and #824) in directory traversal on Windows. These regressions
do not impact non-Windows users.
Feature enhancements:
* Added or improved file type filtering for csv and VHDL.
* [FEATURE #798](https://github.com/BurntSushi/ripgrep/issues/798):
Add `underline` support to `termcolor` and ripgrep. See documentation on the
`--colors` flag for details.
Bug fixes:
* [BUG #684](https://github.com/BurntSushi/ripgrep/issues/684):
Improve documentation for the `--ignore-file` flag.
* [BUG #789](https://github.com/BurntSushi/ripgrep/issues/789):
Don't show `(rev )` if the revision wasn't available during the build.
* [BUG #791](https://github.com/BurntSushi/ripgrep/issues/791):
Add man page to ARM release.
* [BUG #797](https://github.com/BurntSushi/ripgrep/issues/797):
Improve documentation for "intense" setting in `termcolor`.
* [BUG #800](https://github.com/BurntSushi/ripgrep/issues/800):
Fix a bug in the `ignore` crate for custom ignore files. This had no impact
on ripgrep.
* [BUG #807](https://github.com/BurntSushi/ripgrep/issues/807):
Fix a bug where `rg --hidden .` behaved differently from `rg --hidden ./`.
* [BUG #815](https://github.com/BurntSushi/ripgrep/issues/815):
Clarify a common failure mode in user guide.
* [BUG #820](https://github.com/BurntSushi/ripgrep/issues/820):
Fixes a bug on Windows where symlinks were followed even if not requested.
* [BUG #824](https://github.com/BurntSushi/ripgrep/issues/824):
Fix a performance regression in directory traversal on Windows.
0.8.0 (2018-02-11)
==================
This is a new minor version releae of ripgrep that satisfies several popular
feature requests (config files, search compressed files, true colors), fixes
@@ -47,7 +542,7 @@ Feature enhancements:
Add extended or "true" color support. Works in Windows 10!
[See the FAQ for details.](FAQ.md#colors)
* [FEATURE #539](https://github.com/BurntSushi/ripgrep/issues/539):
Search gzip, bsip2, lzma or xz files when given `-z/--search-zip` flag.
Search gzip, bzip2, lzma or xz files when given `-z/--search-zip` flag.
* [FEATURE #544](https://github.com/BurntSushi/ripgrep/issues/544):
Add support for line number alignment via a new `--line-number-width` flag.
* [FEATURE #654](https://github.com/BurntSushi/ripgrep/pull/654):

602
Cargo.lock generated
View File

@@ -1,69 +1,113 @@
# This file is automatically @generated by Cargo.
# It is not intended for manual editing.
[[package]]
name = "aho-corasick"
version = "0.6.4"
version = "0.7.10"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"memchr 2.0.1 (registry+https://github.com/rust-lang/crates.io-index)",
"memchr 2.3.3 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "ansi_term"
version = "0.10.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "atty"
version = "0.2.6"
version = "0.2.14"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"libc 0.2.36 (registry+https://github.com/rust-lang/crates.io-index)",
"termion 1.5.1 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi 0.3.4 (registry+https://github.com/rust-lang/crates.io-index)",
"hermit-abi 0.1.8 (registry+https://github.com/rust-lang/crates.io-index)",
"libc 0.2.67 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi 0.3.8 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "autocfg"
version = "1.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "base64"
version = "0.11.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "bitflags"
version = "1.0.1"
version = "1.2.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "bytecount"
version = "0.3.1"
name = "bstr"
version = "0.2.12"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"simd 0.2.1 (registry+https://github.com/rust-lang/crates.io-index)",
"lazy_static 1.4.0 (registry+https://github.com/rust-lang/crates.io-index)",
"memchr 2.3.3 (registry+https://github.com/rust-lang/crates.io-index)",
"regex-automata 0.1.9 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "bytecount"
version = "0.6.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "byteorder"
version = "1.3.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "cc"
version = "1.0.50"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "cfg-if"
version = "0.1.2"
version = "0.1.10"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "clap"
version = "2.29.4"
version = "2.33.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"ansi_term 0.10.2 (registry+https://github.com/rust-lang/crates.io-index)",
"atty 0.2.6 (registry+https://github.com/rust-lang/crates.io-index)",
"bitflags 1.0.1 (registry+https://github.com/rust-lang/crates.io-index)",
"strsim 0.7.0 (registry+https://github.com/rust-lang/crates.io-index)",
"textwrap 0.9.0 (registry+https://github.com/rust-lang/crates.io-index)",
"unicode-width 0.1.4 (registry+https://github.com/rust-lang/crates.io-index)",
"bitflags 1.2.1 (registry+https://github.com/rust-lang/crates.io-index)",
"strsim 0.8.0 (registry+https://github.com/rust-lang/crates.io-index)",
"textwrap 0.11.0 (registry+https://github.com/rust-lang/crates.io-index)",
"unicode-width 0.1.7 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "crossbeam"
version = "0.3.2"
name = "crossbeam-channel"
version = "0.4.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"crossbeam-utils 0.7.2 (registry+https://github.com/rust-lang/crates.io-index)",
"maybe-uninit 2.0.0 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "encoding_rs"
name = "crossbeam-utils"
version = "0.7.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"cfg-if 0.1.2 (registry+https://github.com/rust-lang/crates.io-index)",
"simd 0.2.1 (registry+https://github.com/rust-lang/crates.io-index)",
"autocfg 1.0.0 (registry+https://github.com/rust-lang/crates.io-index)",
"cfg-if 0.1.10 (registry+https://github.com/rust-lang/crates.io-index)",
"lazy_static 1.4.0 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "encoding_rs"
version = "0.8.22"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"cfg-if 0.1.10 (registry+https://github.com/rust-lang/crates.io-index)",
"packed_simd 0.3.3 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "encoding_rs_io"
version = "0.1.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"encoding_rs 0.8.22 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
@@ -72,275 +116,407 @@ version = "1.0.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "fuchsia-zircon"
version = "0.3.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"bitflags 1.0.1 (registry+https://github.com/rust-lang/crates.io-index)",
"fuchsia-zircon-sys 0.3.3 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "fuchsia-zircon-sys"
version = "0.3.3"
name = "fs_extra"
version = "1.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "glob"
version = "0.2.11"
version = "0.3.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "globset"
version = "0.3.0"
version = "0.4.5"
dependencies = [
"aho-corasick 0.6.4 (registry+https://github.com/rust-lang/crates.io-index)",
"aho-corasick 0.7.10 (registry+https://github.com/rust-lang/crates.io-index)",
"bstr 0.2.12 (registry+https://github.com/rust-lang/crates.io-index)",
"fnv 1.0.6 (registry+https://github.com/rust-lang/crates.io-index)",
"glob 0.2.11 (registry+https://github.com/rust-lang/crates.io-index)",
"log 0.4.1 (registry+https://github.com/rust-lang/crates.io-index)",
"memchr 2.0.1 (registry+https://github.com/rust-lang/crates.io-index)",
"regex 0.2.6 (registry+https://github.com/rust-lang/crates.io-index)",
"glob 0.3.0 (registry+https://github.com/rust-lang/crates.io-index)",
"lazy_static 1.4.0 (registry+https://github.com/rust-lang/crates.io-index)",
"log 0.4.8 (registry+https://github.com/rust-lang/crates.io-index)",
"regex 1.3.5 (registry+https://github.com/rust-lang/crates.io-index)",
"serde 1.0.104 (registry+https://github.com/rust-lang/crates.io-index)",
"serde_json 1.0.48 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "grep"
version = "0.1.8"
version = "0.2.5"
dependencies = [
"log 0.4.1 (registry+https://github.com/rust-lang/crates.io-index)",
"memchr 2.0.1 (registry+https://github.com/rust-lang/crates.io-index)",
"regex 0.2.6 (registry+https://github.com/rust-lang/crates.io-index)",
"regex-syntax 0.4.2 (registry+https://github.com/rust-lang/crates.io-index)",
"grep-cli 0.1.4",
"grep-matcher 0.1.4",
"grep-pcre2 0.1.4",
"grep-printer 0.1.4",
"grep-regex 0.1.6",
"grep-searcher 0.1.7",
"termcolor 1.1.0 (registry+https://github.com/rust-lang/crates.io-index)",
"walkdir 2.3.1 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "grep-cli"
version = "0.1.4"
dependencies = [
"atty 0.2.14 (registry+https://github.com/rust-lang/crates.io-index)",
"bstr 0.2.12 (registry+https://github.com/rust-lang/crates.io-index)",
"globset 0.4.5",
"lazy_static 1.4.0 (registry+https://github.com/rust-lang/crates.io-index)",
"log 0.4.8 (registry+https://github.com/rust-lang/crates.io-index)",
"regex 1.3.5 (registry+https://github.com/rust-lang/crates.io-index)",
"same-file 1.0.6 (registry+https://github.com/rust-lang/crates.io-index)",
"termcolor 1.1.0 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi-util 0.1.3 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "grep-matcher"
version = "0.1.4"
dependencies = [
"memchr 2.3.3 (registry+https://github.com/rust-lang/crates.io-index)",
"regex 1.3.5 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "grep-pcre2"
version = "0.1.4"
dependencies = [
"grep-matcher 0.1.4",
"pcre2 0.2.3 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "grep-printer"
version = "0.1.4"
dependencies = [
"base64 0.11.0 (registry+https://github.com/rust-lang/crates.io-index)",
"bstr 0.2.12 (registry+https://github.com/rust-lang/crates.io-index)",
"grep-matcher 0.1.4",
"grep-regex 0.1.6",
"grep-searcher 0.1.7",
"serde 1.0.104 (registry+https://github.com/rust-lang/crates.io-index)",
"serde_derive 1.0.104 (registry+https://github.com/rust-lang/crates.io-index)",
"serde_json 1.0.48 (registry+https://github.com/rust-lang/crates.io-index)",
"termcolor 1.1.0 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "grep-regex"
version = "0.1.6"
dependencies = [
"aho-corasick 0.7.10 (registry+https://github.com/rust-lang/crates.io-index)",
"bstr 0.2.12 (registry+https://github.com/rust-lang/crates.io-index)",
"grep-matcher 0.1.4",
"log 0.4.8 (registry+https://github.com/rust-lang/crates.io-index)",
"regex 1.3.5 (registry+https://github.com/rust-lang/crates.io-index)",
"regex-syntax 0.6.17 (registry+https://github.com/rust-lang/crates.io-index)",
"thread_local 1.0.1 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "grep-searcher"
version = "0.1.7"
dependencies = [
"bstr 0.2.12 (registry+https://github.com/rust-lang/crates.io-index)",
"bytecount 0.6.0 (registry+https://github.com/rust-lang/crates.io-index)",
"encoding_rs 0.8.22 (registry+https://github.com/rust-lang/crates.io-index)",
"encoding_rs_io 0.1.7 (registry+https://github.com/rust-lang/crates.io-index)",
"grep-matcher 0.1.4",
"grep-regex 0.1.6",
"log 0.4.8 (registry+https://github.com/rust-lang/crates.io-index)",
"memmap 0.7.0 (registry+https://github.com/rust-lang/crates.io-index)",
"regex 1.3.5 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "hermit-abi"
version = "0.1.8"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"libc 0.2.67 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "ignore"
version = "0.4.0"
version = "0.4.12"
dependencies = [
"crossbeam 0.3.2 (registry+https://github.com/rust-lang/crates.io-index)",
"globset 0.3.0",
"lazy_static 1.0.0 (registry+https://github.com/rust-lang/crates.io-index)",
"log 0.4.1 (registry+https://github.com/rust-lang/crates.io-index)",
"memchr 2.0.1 (registry+https://github.com/rust-lang/crates.io-index)",
"regex 0.2.6 (registry+https://github.com/rust-lang/crates.io-index)",
"same-file 1.0.2 (registry+https://github.com/rust-lang/crates.io-index)",
"tempdir 0.3.5 (registry+https://github.com/rust-lang/crates.io-index)",
"thread_local 0.3.5 (registry+https://github.com/rust-lang/crates.io-index)",
"walkdir 2.1.3 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi 0.3.4 (registry+https://github.com/rust-lang/crates.io-index)",
"crossbeam-channel 0.4.2 (registry+https://github.com/rust-lang/crates.io-index)",
"crossbeam-utils 0.7.2 (registry+https://github.com/rust-lang/crates.io-index)",
"globset 0.4.5",
"lazy_static 1.4.0 (registry+https://github.com/rust-lang/crates.io-index)",
"log 0.4.8 (registry+https://github.com/rust-lang/crates.io-index)",
"memchr 2.3.3 (registry+https://github.com/rust-lang/crates.io-index)",
"regex 1.3.5 (registry+https://github.com/rust-lang/crates.io-index)",
"same-file 1.0.6 (registry+https://github.com/rust-lang/crates.io-index)",
"thread_local 1.0.1 (registry+https://github.com/rust-lang/crates.io-index)",
"walkdir 2.3.1 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi-util 0.1.3 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "itoa"
version = "0.4.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "jemalloc-sys"
version = "0.3.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"cc 1.0.50 (registry+https://github.com/rust-lang/crates.io-index)",
"fs_extra 1.1.0 (registry+https://github.com/rust-lang/crates.io-index)",
"libc 0.2.67 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "jemallocator"
version = "0.3.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"jemalloc-sys 0.3.2 (registry+https://github.com/rust-lang/crates.io-index)",
"libc 0.2.67 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "lazy_static"
version = "1.0.0"
version = "1.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "libc"
version = "0.2.36"
version = "0.2.67"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "log"
version = "0.4.1"
version = "0.4.8"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"cfg-if 0.1.2 (registry+https://github.com/rust-lang/crates.io-index)",
"cfg-if 0.1.10 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "maybe-uninit"
version = "2.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "memchr"
version = "2.0.1"
version = "2.3.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"libc 0.2.36 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "memmap"
version = "0.6.2"
version = "0.7.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"libc 0.2.36 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi 0.3.4 (registry+https://github.com/rust-lang/crates.io-index)",
"libc 0.2.67 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi 0.3.8 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "num_cpus"
version = "1.8.0"
version = "1.12.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"libc 0.2.36 (registry+https://github.com/rust-lang/crates.io-index)",
"hermit-abi 0.1.8 (registry+https://github.com/rust-lang/crates.io-index)",
"libc 0.2.67 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "rand"
version = "0.3.22"
name = "packed_simd"
version = "0.3.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"fuchsia-zircon 0.3.3 (registry+https://github.com/rust-lang/crates.io-index)",
"libc 0.2.36 (registry+https://github.com/rust-lang/crates.io-index)",
"rand 0.4.2 (registry+https://github.com/rust-lang/crates.io-index)",
"cfg-if 0.1.10 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "rand"
version = "0.4.2"
name = "pcre2"
version = "0.2.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"fuchsia-zircon 0.3.3 (registry+https://github.com/rust-lang/crates.io-index)",
"libc 0.2.36 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi 0.3.4 (registry+https://github.com/rust-lang/crates.io-index)",
"libc 0.2.67 (registry+https://github.com/rust-lang/crates.io-index)",
"log 0.4.8 (registry+https://github.com/rust-lang/crates.io-index)",
"pcre2-sys 0.2.2 (registry+https://github.com/rust-lang/crates.io-index)",
"thread_local 1.0.1 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "redox_syscall"
version = "0.1.37"
name = "pcre2-sys"
version = "0.2.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"cc 1.0.50 (registry+https://github.com/rust-lang/crates.io-index)",
"libc 0.2.67 (registry+https://github.com/rust-lang/crates.io-index)",
"pkg-config 0.3.17 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "pkg-config"
version = "0.3.17"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "redox_termios"
version = "0.1.1"
name = "proc-macro2"
version = "1.0.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"redox_syscall 0.1.37 (registry+https://github.com/rust-lang/crates.io-index)",
"unicode-xid 0.2.0 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "quote"
version = "1.0.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"proc-macro2 1.0.9 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "regex"
version = "0.2.6"
version = "1.3.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"aho-corasick 0.6.4 (registry+https://github.com/rust-lang/crates.io-index)",
"memchr 2.0.1 (registry+https://github.com/rust-lang/crates.io-index)",
"regex-syntax 0.4.2 (registry+https://github.com/rust-lang/crates.io-index)",
"simd 0.2.1 (registry+https://github.com/rust-lang/crates.io-index)",
"thread_local 0.3.5 (registry+https://github.com/rust-lang/crates.io-index)",
"utf8-ranges 1.0.0 (registry+https://github.com/rust-lang/crates.io-index)",
"aho-corasick 0.7.10 (registry+https://github.com/rust-lang/crates.io-index)",
"memchr 2.3.3 (registry+https://github.com/rust-lang/crates.io-index)",
"regex-syntax 0.6.17 (registry+https://github.com/rust-lang/crates.io-index)",
"thread_local 1.0.1 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "regex-automata"
version = "0.1.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"byteorder 1.3.4 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "regex-syntax"
version = "0.4.2"
version = "0.6.17"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "ripgrep"
version = "0.8.0"
version = "12.0.0"
dependencies = [
"atty 0.2.6 (registry+https://github.com/rust-lang/crates.io-index)",
"bytecount 0.3.1 (registry+https://github.com/rust-lang/crates.io-index)",
"clap 2.29.4 (registry+https://github.com/rust-lang/crates.io-index)",
"encoding_rs 0.7.2 (registry+https://github.com/rust-lang/crates.io-index)",
"globset 0.3.0",
"grep 0.1.8",
"ignore 0.4.0",
"lazy_static 1.0.0 (registry+https://github.com/rust-lang/crates.io-index)",
"libc 0.2.36 (registry+https://github.com/rust-lang/crates.io-index)",
"log 0.4.1 (registry+https://github.com/rust-lang/crates.io-index)",
"memchr 2.0.1 (registry+https://github.com/rust-lang/crates.io-index)",
"memmap 0.6.2 (registry+https://github.com/rust-lang/crates.io-index)",
"num_cpus 1.8.0 (registry+https://github.com/rust-lang/crates.io-index)",
"regex 0.2.6 (registry+https://github.com/rust-lang/crates.io-index)",
"same-file 1.0.2 (registry+https://github.com/rust-lang/crates.io-index)",
"termcolor 0.3.4",
"winapi 0.3.4 (registry+https://github.com/rust-lang/crates.io-index)",
"bstr 0.2.12 (registry+https://github.com/rust-lang/crates.io-index)",
"clap 2.33.0 (registry+https://github.com/rust-lang/crates.io-index)",
"grep 0.2.5",
"ignore 0.4.12",
"jemallocator 0.3.2 (registry+https://github.com/rust-lang/crates.io-index)",
"lazy_static 1.4.0 (registry+https://github.com/rust-lang/crates.io-index)",
"log 0.4.8 (registry+https://github.com/rust-lang/crates.io-index)",
"num_cpus 1.12.0 (registry+https://github.com/rust-lang/crates.io-index)",
"regex 1.3.5 (registry+https://github.com/rust-lang/crates.io-index)",
"serde 1.0.104 (registry+https://github.com/rust-lang/crates.io-index)",
"serde_derive 1.0.104 (registry+https://github.com/rust-lang/crates.io-index)",
"serde_json 1.0.48 (registry+https://github.com/rust-lang/crates.io-index)",
"termcolor 1.1.0 (registry+https://github.com/rust-lang/crates.io-index)",
"walkdir 2.3.1 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "ryu"
version = "1.0.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "same-file"
version = "1.0.2"
version = "1.0.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"winapi 0.3.4 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi-util 0.1.3 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "simd"
version = "0.2.1"
name = "serde"
version = "1.0.104"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "serde_derive"
version = "1.0.104"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"proc-macro2 1.0.9 (registry+https://github.com/rust-lang/crates.io-index)",
"quote 1.0.3 (registry+https://github.com/rust-lang/crates.io-index)",
"syn 1.0.16 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "serde_json"
version = "1.0.48"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"itoa 0.4.5 (registry+https://github.com/rust-lang/crates.io-index)",
"ryu 1.0.3 (registry+https://github.com/rust-lang/crates.io-index)",
"serde 1.0.104 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "strsim"
version = "0.7.0"
version = "0.8.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "tempdir"
version = "0.3.5"
name = "syn"
version = "1.0.16"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"rand 0.3.22 (registry+https://github.com/rust-lang/crates.io-index)",
"proc-macro2 1.0.9 (registry+https://github.com/rust-lang/crates.io-index)",
"quote 1.0.3 (registry+https://github.com/rust-lang/crates.io-index)",
"unicode-xid 0.2.0 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "termcolor"
version = "0.3.4"
dependencies = [
"wincolor 0.1.6",
]
[[package]]
name = "termion"
version = "1.5.1"
version = "1.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"libc 0.2.36 (registry+https://github.com/rust-lang/crates.io-index)",
"redox_syscall 0.1.37 (registry+https://github.com/rust-lang/crates.io-index)",
"redox_termios 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi-util 0.1.3 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "textwrap"
version = "0.9.0"
version = "0.11.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"unicode-width 0.1.4 (registry+https://github.com/rust-lang/crates.io-index)",
"unicode-width 0.1.7 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "thread_local"
version = "0.3.5"
version = "1.0.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"lazy_static 1.0.0 (registry+https://github.com/rust-lang/crates.io-index)",
"unreachable 1.0.0 (registry+https://github.com/rust-lang/crates.io-index)",
"lazy_static 1.4.0 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "unicode-width"
version = "0.1.4"
version = "0.1.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "unreachable"
version = "1.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"void 1.0.2 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "utf8-ranges"
version = "1.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "void"
version = "1.0.2"
name = "unicode-xid"
version = "0.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "walkdir"
version = "2.1.3"
version = "2.3.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"same-file 1.0.2 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi 0.3.4 (registry+https://github.com/rust-lang/crates.io-index)",
"same-file 1.0.6 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi 0.3.8 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi-util 0.1.3 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "winapi"
version = "0.3.4"
version = "0.3.8"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"winapi-i686-pc-windows-gnu 0.4.0 (registry+https://github.com/rust-lang/crates.io-index)",
@@ -352,56 +528,72 @@ name = "winapi-i686-pc-windows-gnu"
version = "0.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "winapi-util"
version = "0.1.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"winapi 0.3.8 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "winapi-x86_64-pc-windows-gnu"
version = "0.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "wincolor"
version = "0.1.6"
dependencies = [
"winapi 0.3.4 (registry+https://github.com/rust-lang/crates.io-index)",
]
[metadata]
"checksum aho-corasick 0.6.4 (registry+https://github.com/rust-lang/crates.io-index)" = "d6531d44de723825aa81398a6415283229725a00fa30713812ab9323faa82fc4"
"checksum ansi_term 0.10.2 (registry+https://github.com/rust-lang/crates.io-index)" = "6b3568b48b7cefa6b8ce125f9bb4989e52fbcc29ebea88df04cc7c5f12f70455"
"checksum atty 0.2.6 (registry+https://github.com/rust-lang/crates.io-index)" = "8352656fd42c30a0c3c89d26dea01e3b77c0ab2af18230835c15e2e13cd51859"
"checksum bitflags 1.0.1 (registry+https://github.com/rust-lang/crates.io-index)" = "b3c30d3802dfb7281680d6285f2ccdaa8c2d8fee41f93805dba5c4cf50dc23cf"
"checksum bytecount 0.3.1 (registry+https://github.com/rust-lang/crates.io-index)" = "882585cd7ec84e902472df34a5e01891202db3bf62614e1f0afe459c1afcf744"
"checksum cfg-if 0.1.2 (registry+https://github.com/rust-lang/crates.io-index)" = "d4c819a1287eb618df47cc647173c5c4c66ba19d888a6e50d605672aed3140de"
"checksum clap 2.29.4 (registry+https://github.com/rust-lang/crates.io-index)" = "7b8f59bcebcfe4269b09f71dab0da15b355c75916a8f975d3876ce81561893ee"
"checksum crossbeam 0.3.2 (registry+https://github.com/rust-lang/crates.io-index)" = "24ce9782d4d5c53674646a6a4c1863a21a8fc0cb649b3c94dfc16e45071dea19"
"checksum encoding_rs 0.7.2 (registry+https://github.com/rust-lang/crates.io-index)" = "98fd0f24d1fb71a4a6b9330c8ca04cbd4e7cc5d846b54ca74ff376bc7c9f798d"
"checksum aho-corasick 0.7.10 (registry+https://github.com/rust-lang/crates.io-index)" = "8716408b8bc624ed7f65d223ddb9ac2d044c0547b6fa4b0d554f3a9540496ada"
"checksum atty 0.2.14 (registry+https://github.com/rust-lang/crates.io-index)" = "d9b39be18770d11421cdb1b9947a45dd3f37e93092cbf377614828a319d5fee8"
"checksum autocfg 1.0.0 (registry+https://github.com/rust-lang/crates.io-index)" = "f8aac770f1885fd7e387acedd76065302551364496e46b3dd00860b2f8359b9d"
"checksum base64 0.11.0 (registry+https://github.com/rust-lang/crates.io-index)" = "b41b7ea54a0c9d92199de89e20e58d49f02f8e699814ef3fdf266f6f748d15c7"
"checksum bitflags 1.2.1 (registry+https://github.com/rust-lang/crates.io-index)" = "cf1de2fe8c75bc145a2f577add951f8134889b4795d47466a54a5c846d691693"
"checksum bstr 0.2.12 (registry+https://github.com/rust-lang/crates.io-index)" = "2889e6d50f394968c8bf4240dc3f2a7eb4680844d27308f798229ac9d4725f41"
"checksum bytecount 0.6.0 (registry+https://github.com/rust-lang/crates.io-index)" = "b0017894339f586ccb943b01b9555de56770c11cda818e7e3d8bd93f4ed7f46e"
"checksum byteorder 1.3.4 (registry+https://github.com/rust-lang/crates.io-index)" = "08c48aae112d48ed9f069b33538ea9e3e90aa263cfa3d1c24309612b1f7472de"
"checksum cc 1.0.50 (registry+https://github.com/rust-lang/crates.io-index)" = "95e28fa049fda1c330bcf9d723be7663a899c4679724b34c81e9f5a326aab8cd"
"checksum cfg-if 0.1.10 (registry+https://github.com/rust-lang/crates.io-index)" = "4785bdd1c96b2a846b2bd7cc02e86b6b3dbf14e7e53446c4f54c92a361040822"
"checksum clap 2.33.0 (registry+https://github.com/rust-lang/crates.io-index)" = "5067f5bb2d80ef5d68b4c87db81601f0b75bca627bc2ef76b141d7b846a3c6d9"
"checksum crossbeam-channel 0.4.2 (registry+https://github.com/rust-lang/crates.io-index)" = "cced8691919c02aac3cb0a1bc2e9b73d89e832bf9a06fc579d4e71b68a2da061"
"checksum crossbeam-utils 0.7.2 (registry+https://github.com/rust-lang/crates.io-index)" = "c3c7c73a2d1e9fc0886a08b93e98eb643461230d5f1925e4036204d5f2e261a8"
"checksum encoding_rs 0.8.22 (registry+https://github.com/rust-lang/crates.io-index)" = "cd8d03faa7fe0c1431609dfad7bbe827af30f82e1e2ae6f7ee4fca6bd764bc28"
"checksum encoding_rs_io 0.1.7 (registry+https://github.com/rust-lang/crates.io-index)" = "1cc3c5651fb62ab8aa3103998dade57efdd028544bd300516baa31840c252a83"
"checksum fnv 1.0.6 (registry+https://github.com/rust-lang/crates.io-index)" = "2fad85553e09a6f881f739c29f0b00b0f01357c743266d478b68951ce23285f3"
"checksum fuchsia-zircon 0.3.3 (registry+https://github.com/rust-lang/crates.io-index)" = "2e9763c69ebaae630ba35f74888db465e49e259ba1bc0eda7d06f4a067615d82"
"checksum fuchsia-zircon-sys 0.3.3 (registry+https://github.com/rust-lang/crates.io-index)" = "3dcaa9ae7725d12cdb85b3ad99a434db70b468c09ded17e012d86b5c1010f7a7"
"checksum glob 0.2.11 (registry+https://github.com/rust-lang/crates.io-index)" = "8be18de09a56b60ed0edf84bc9df007e30040691af7acd1c41874faac5895bfb"
"checksum lazy_static 1.0.0 (registry+https://github.com/rust-lang/crates.io-index)" = "c8f31047daa365f19be14b47c29df4f7c3b581832407daabe6ae77397619237d"
"checksum libc 0.2.36 (registry+https://github.com/rust-lang/crates.io-index)" = "1e5d97d6708edaa407429faa671b942dc0f2727222fb6b6539bf1db936e4b121"
"checksum log 0.4.1 (registry+https://github.com/rust-lang/crates.io-index)" = "89f010e843f2b1a31dbd316b3b8d443758bc634bed37aabade59c686d644e0a2"
"checksum memchr 2.0.1 (registry+https://github.com/rust-lang/crates.io-index)" = "796fba70e76612589ed2ce7f45282f5af869e0fdd7cc6199fa1aa1f1d591ba9d"
"checksum memmap 0.6.2 (registry+https://github.com/rust-lang/crates.io-index)" = "e2ffa2c986de11a9df78620c01eeaaf27d94d3ff02bf81bfcca953102dd0c6ff"
"checksum num_cpus 1.8.0 (registry+https://github.com/rust-lang/crates.io-index)" = "c51a3322e4bca9d212ad9a158a02abc6934d005490c054a2778df73a70aa0a30"
"checksum rand 0.3.22 (registry+https://github.com/rust-lang/crates.io-index)" = "15a732abf9d20f0ad8eeb6f909bf6868722d9a06e1e50802b6a70351f40b4eb1"
"checksum rand 0.4.2 (registry+https://github.com/rust-lang/crates.io-index)" = "eba5f8cb59cc50ed56be8880a5c7b496bfd9bd26394e176bc67884094145c2c5"
"checksum redox_syscall 0.1.37 (registry+https://github.com/rust-lang/crates.io-index)" = "0d92eecebad22b767915e4d529f89f28ee96dbbf5a4810d2b844373f136417fd"
"checksum redox_termios 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)" = "7e891cfe48e9100a70a3b6eb652fef28920c117d366339687bd5576160db0f76"
"checksum regex 0.2.6 (registry+https://github.com/rust-lang/crates.io-index)" = "5be5347bde0c48cfd8c3fdc0766cdfe9d8a755ef84d620d6794c778c91de8b2b"
"checksum regex-syntax 0.4.2 (registry+https://github.com/rust-lang/crates.io-index)" = "8e931c58b93d86f080c734bfd2bce7dd0079ae2331235818133c8be7f422e20e"
"checksum same-file 1.0.2 (registry+https://github.com/rust-lang/crates.io-index)" = "cfb6eded0b06a0b512c8ddbcf04089138c9b4362c2f696f3c3d76039d68f3637"
"checksum simd 0.2.1 (registry+https://github.com/rust-lang/crates.io-index)" = "3dd0805c7363ab51a829a1511ad24b6ed0349feaa756c4bc2f977f9f496e6673"
"checksum strsim 0.7.0 (registry+https://github.com/rust-lang/crates.io-index)" = "bb4f380125926a99e52bc279241539c018323fab05ad6368b56f93d9369ff550"
"checksum tempdir 0.3.5 (registry+https://github.com/rust-lang/crates.io-index)" = "87974a6f5c1dfb344d733055601650059a3363de2a6104819293baff662132d6"
"checksum termion 1.5.1 (registry+https://github.com/rust-lang/crates.io-index)" = "689a3bdfaab439fd92bc87df5c4c78417d3cbe537487274e9b0b2dce76e92096"
"checksum textwrap 0.9.0 (registry+https://github.com/rust-lang/crates.io-index)" = "c0b59b6b4b44d867f1370ef1bd91bfb262bf07bf0ae65c202ea2fbc16153b693"
"checksum thread_local 0.3.5 (registry+https://github.com/rust-lang/crates.io-index)" = "279ef31c19ededf577bfd12dfae728040a21f635b06a24cd670ff510edd38963"
"checksum unicode-width 0.1.4 (registry+https://github.com/rust-lang/crates.io-index)" = "bf3a113775714a22dcb774d8ea3655c53a32debae63a063acc00a91cc586245f"
"checksum unreachable 1.0.0 (registry+https://github.com/rust-lang/crates.io-index)" = "382810877fe448991dfc7f0dd6e3ae5d58088fd0ea5e35189655f84e6814fa56"
"checksum utf8-ranges 1.0.0 (registry+https://github.com/rust-lang/crates.io-index)" = "662fab6525a98beff2921d7f61a39e7d59e0b425ebc7d0d9e66d316e55124122"
"checksum void 1.0.2 (registry+https://github.com/rust-lang/crates.io-index)" = "6a02e4885ed3bc0f2de90ea6dd45ebcbb66dacffe03547fadbb0eeae2770887d"
"checksum walkdir 2.1.3 (registry+https://github.com/rust-lang/crates.io-index)" = "b167e9a4420d8dddb260e70c90a4a375a1e5691f21f70e715553da87b6c2503a"
"checksum winapi 0.3.4 (registry+https://github.com/rust-lang/crates.io-index)" = "04e3bd221fcbe8a271359c04f21a76db7d0c6028862d1bb5512d85e1e2eb5bb3"
"checksum fs_extra 1.1.0 (registry+https://github.com/rust-lang/crates.io-index)" = "5f2a4a2034423744d2cc7ca2068453168dcdb82c438419e639a26bd87839c674"
"checksum glob 0.3.0 (registry+https://github.com/rust-lang/crates.io-index)" = "9b919933a397b79c37e33b77bb2aa3dc8eb6e165ad809e58ff75bc7db2e34574"
"checksum hermit-abi 0.1.8 (registry+https://github.com/rust-lang/crates.io-index)" = "1010591b26bbfe835e9faeabeb11866061cc7dcebffd56ad7d0942d0e61aefd8"
"checksum itoa 0.4.5 (registry+https://github.com/rust-lang/crates.io-index)" = "b8b7a7c0c47db5545ed3fef7468ee7bb5b74691498139e4b3f6a20685dc6dd8e"
"checksum jemalloc-sys 0.3.2 (registry+https://github.com/rust-lang/crates.io-index)" = "0d3b9f3f5c9b31aa0f5ed3260385ac205db665baa41d49bb8338008ae94ede45"
"checksum jemallocator 0.3.2 (registry+https://github.com/rust-lang/crates.io-index)" = "43ae63fcfc45e99ab3d1b29a46782ad679e98436c3169d15a167a1108a724b69"
"checksum lazy_static 1.4.0 (registry+https://github.com/rust-lang/crates.io-index)" = "e2abad23fbc42b3700f2f279844dc832adb2b2eb069b2df918f455c4e18cc646"
"checksum libc 0.2.67 (registry+https://github.com/rust-lang/crates.io-index)" = "eb147597cdf94ed43ab7a9038716637d2d1bf2bc571da995d0028dec06bd3018"
"checksum log 0.4.8 (registry+https://github.com/rust-lang/crates.io-index)" = "14b6052be84e6b71ab17edffc2eeabf5c2c3ae1fdb464aae35ac50c67a44e1f7"
"checksum maybe-uninit 2.0.0 (registry+https://github.com/rust-lang/crates.io-index)" = "60302e4db3a61da70c0cb7991976248362f30319e88850c487b9b95bbf059e00"
"checksum memchr 2.3.3 (registry+https://github.com/rust-lang/crates.io-index)" = "3728d817d99e5ac407411fa471ff9800a778d88a24685968b36824eaf4bee400"
"checksum memmap 0.7.0 (registry+https://github.com/rust-lang/crates.io-index)" = "6585fd95e7bb50d6cc31e20d4cf9afb4e2ba16c5846fc76793f11218da9c475b"
"checksum num_cpus 1.12.0 (registry+https://github.com/rust-lang/crates.io-index)" = "46203554f085ff89c235cd12f7075f3233af9b11ed7c9e16dfe2560d03313ce6"
"checksum packed_simd 0.3.3 (registry+https://github.com/rust-lang/crates.io-index)" = "a85ea9fc0d4ac0deb6fe7911d38786b32fc11119afd9e9d38b84ff691ce64220"
"checksum pcre2 0.2.3 (registry+https://github.com/rust-lang/crates.io-index)" = "85b30f2f69903b439dd9dc9e824119b82a55bf113b29af8d70948a03c1b11ab1"
"checksum pcre2-sys 0.2.2 (registry+https://github.com/rust-lang/crates.io-index)" = "876c72d05059d23a84bd9fcdc3b1d31c50ea7fe00fe1522b4e68cd3608db8d5b"
"checksum pkg-config 0.3.17 (registry+https://github.com/rust-lang/crates.io-index)" = "05da548ad6865900e60eaba7f589cc0783590a92e940c26953ff81ddbab2d677"
"checksum proc-macro2 1.0.9 (registry+https://github.com/rust-lang/crates.io-index)" = "6c09721c6781493a2a492a96b5a5bf19b65917fe6728884e7c44dd0c60ca3435"
"checksum quote 1.0.3 (registry+https://github.com/rust-lang/crates.io-index)" = "2bdc6c187c65bca4260c9011c9e3132efe4909da44726bad24cf7572ae338d7f"
"checksum regex 1.3.5 (registry+https://github.com/rust-lang/crates.io-index)" = "8900ebc1363efa7ea1c399ccc32daed870b4002651e0bed86e72d501ebbe0048"
"checksum regex-automata 0.1.9 (registry+https://github.com/rust-lang/crates.io-index)" = "ae1ded71d66a4a97f5e961fd0cb25a5f366a42a41570d16a763a69c092c26ae4"
"checksum regex-syntax 0.6.17 (registry+https://github.com/rust-lang/crates.io-index)" = "7fe5bd57d1d7414c6b5ed48563a2c855d995ff777729dcd91c369ec7fea395ae"
"checksum ryu 1.0.3 (registry+https://github.com/rust-lang/crates.io-index)" = "535622e6be132bccd223f4bb2b8ac8d53cda3c7a6394944d3b2b33fb974f9d76"
"checksum same-file 1.0.6 (registry+https://github.com/rust-lang/crates.io-index)" = "93fc1dc3aaa9bfed95e02e6eadabb4baf7e3078b0bd1b4d7b6b0b68378900502"
"checksum serde 1.0.104 (registry+https://github.com/rust-lang/crates.io-index)" = "414115f25f818d7dfccec8ee535d76949ae78584fc4f79a6f45a904bf8ab4449"
"checksum serde_derive 1.0.104 (registry+https://github.com/rust-lang/crates.io-index)" = "128f9e303a5a29922045a830221b8f78ec74a5f544944f3d5984f8ec3895ef64"
"checksum serde_json 1.0.48 (registry+https://github.com/rust-lang/crates.io-index)" = "9371ade75d4c2d6cb154141b9752cf3781ec9c05e0e5cf35060e1e70ee7b9c25"
"checksum strsim 0.8.0 (registry+https://github.com/rust-lang/crates.io-index)" = "8ea5119cdb4c55b55d432abb513a0429384878c15dde60cc77b1c99de1a95a6a"
"checksum syn 1.0.16 (registry+https://github.com/rust-lang/crates.io-index)" = "123bd9499cfb380418d509322d7a6d52e5315f064fe4b3ad18a53d6b92c07859"
"checksum termcolor 1.1.0 (registry+https://github.com/rust-lang/crates.io-index)" = "bb6bfa289a4d7c5766392812c0a1f4c1ba45afa1ad47803c11e1f407d846d75f"
"checksum textwrap 0.11.0 (registry+https://github.com/rust-lang/crates.io-index)" = "d326610f408c7a4eb6f51c37c330e496b08506c9457c9d34287ecc38809fb060"
"checksum thread_local 1.0.1 (registry+https://github.com/rust-lang/crates.io-index)" = "d40c6d1b69745a6ec6fb1ca717914848da4b44ae29d9b3080cbee91d72a69b14"
"checksum unicode-width 0.1.7 (registry+https://github.com/rust-lang/crates.io-index)" = "caaa9d531767d1ff2150b9332433f32a24622147e5ebb1f26409d5da67afd479"
"checksum unicode-xid 0.2.0 (registry+https://github.com/rust-lang/crates.io-index)" = "826e7639553986605ec5979c7dd957c7895e93eabed50ab2ffa7f6128a75097c"
"checksum walkdir 2.3.1 (registry+https://github.com/rust-lang/crates.io-index)" = "777182bc735b6424e1a57516d35ed72cb8019d85c8c9bf536dccb3445c1a2f7d"
"checksum winapi 0.3.8 (registry+https://github.com/rust-lang/crates.io-index)" = "8093091eeb260906a183e6ae1abdba2ef5ef2257a21801128899c3fc699229c6"
"checksum winapi-i686-pc-windows-gnu 0.4.0 (registry+https://github.com/rust-lang/crates.io-index)" = "ac3b87c63620426dd9b991e5ce0329eff545bccbbb34f3be09ff6fb6ab51b7b6"
"checksum winapi-util 0.1.3 (registry+https://github.com/rust-lang/crates.io-index)" = "4ccfbf554c6ad11084fb7517daca16cfdcaccbdadba4fc336f032a8b12c2ad80"
"checksum winapi-x86_64-pc-windows-gnu 0.4.0 (registry+https://github.com/rust-lang/crates.io-index)" = "712e227841d057c1ee1cd2fb22fa7e5a5461ae8e48fa2ca79ec42cfc1931183f"

View File

@@ -1,10 +1,11 @@
[package]
name = "ripgrep"
version = "0.8.0" #:version
version = "12.0.0" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
Line oriented search tool using Rust's regex library. Combines the raw
performance of grep with the usability of the silver searcher.
ripgrep is a line-oriented search tool that recursively searches your current
directory for a regex pattern while respecting your gitignore rules. ripgrep
has first class support on Windows, macOS and Linux.
"""
documentation = "https://github.com/BurntSushi/ripgrep"
homepage = "https://github.com/BurntSushi/ripgrep"
@@ -12,9 +13,11 @@ repository = "https://github.com/BurntSushi/ripgrep"
readme = "README.md"
keywords = ["regex", "grep", "egrep", "search", "pattern"]
categories = ["command-line-utilities", "text-processing"]
license = "Unlicense/MIT"
license = "Unlicense OR MIT"
exclude = ["HomebrewFormula"]
build = "build.rs"
autotests = false
edition = "2018"
[badges]
travis-ci = { repository = "BurntSushi/ripgrep" }
@@ -22,7 +25,7 @@ appveyor = { repository = "BurntSushi/ripgrep" }
[[bin]]
bench = false
path = "src/main.rs"
path = "crates/core/main.rs"
name = "rg"
[[test]]
@@ -30,49 +33,80 @@ name = "integration"
path = "tests/tests.rs"
[workspace]
members = ["grep", "globset", "ignore", "termcolor", "wincolor"]
[dependencies]
atty = "0.2.2"
bytecount = "0.3.1"
encoding_rs = "0.7"
globset = { version = "0.3.0", path = "globset" }
grep = { version = "0.1.8", path = "grep" }
ignore = { version = "0.4.0", path = "ignore" }
lazy_static = "1"
libc = "0.2"
log = "0.4"
memchr = "2"
memmap = "0.6"
num_cpus = "1"
regex = "0.2.4"
same-file = "1"
termcolor = { version = "0.3.4", path = "termcolor" }
[dependencies.clap]
version = "2.29.4"
default-features = false
features = ["suggestions", "color"]
[target.'cfg(windows)'.dependencies.winapi]
version = "0.3"
features = ["std", "winnt"]
[build-dependencies]
lazy_static = "1"
[build-dependencies.clap]
version = "2.29.4"
default-features = false
features = ["suggestions", "color"]
[features]
avx-accel = ["bytecount/avx-accel"]
simd-accel = [
"bytecount/simd-accel",
"regex/simd-accel",
"encoding_rs/simd-accel",
members = [
"crates/globset",
"crates/grep",
"crates/cli",
"crates/matcher",
"crates/pcre2",
"crates/printer",
"crates/regex",
"crates/searcher",
"crates/ignore",
]
[dependencies]
bstr = "0.2.12"
grep = { version = "0.2.5", path = "crates/grep" }
ignore = { version = "0.4.12", path = "crates/ignore" }
lazy_static = "1.1.0"
log = "0.4.5"
num_cpus = "1.8.0"
regex = "1.3.5"
serde_json = "1.0.23"
termcolor = "1.1.0"
[dependencies.clap]
version = "2.33.0"
default-features = false
features = ["suggestions"]
[target.'cfg(all(target_env = "musl", target_pointer_width = "64"))'.dependencies.jemallocator]
version = "0.3.0"
[build-dependencies]
lazy_static = "1.1.0"
[build-dependencies.clap]
version = "2.33.0"
default-features = false
features = ["suggestions"]
[dev-dependencies]
serde = "1.0.77"
serde_derive = "1.0.77"
walkdir = "2"
[features]
simd-accel = ["grep/simd-accel"]
pcre2 = ["grep/pcre2"]
[profile.release]
debug = true
debug = 1
[package.metadata.deb]
features = ["pcre2"]
section = "utils"
assets = [
["target/release/rg", "usr/bin/", "755"],
["COPYING", "usr/share/doc/ripgrep/", "644"],
["LICENSE-MIT", "usr/share/doc/ripgrep/", "644"],
["UNLICENSE", "usr/share/doc/ripgrep/", "644"],
["CHANGELOG.md", "usr/share/doc/ripgrep/CHANGELOG", "644"],
["README.md", "usr/share/doc/ripgrep/README", "644"],
["FAQ.md", "usr/share/doc/ripgrep/FAQ", "644"],
# The man page is automatically generated by ripgrep's build process, so
# this file isn't actually commited. Instead, to create a dpkg, either
# create a deployment/deb directory and copy the man page to it, or use the
# 'ci/build-deb' script.
["deployment/deb/rg.1", "usr/share/man/man1/rg.1", "644"],
# Similarly for shell completions.
["deployment/deb/rg.bash", "usr/share/bash-completion/completions/rg", "644"],
["deployment/deb/rg.fish", "usr/share/fish/vendor_completions.d/rg.fish", "644"],
["deployment/deb/_rg", "usr/share/zsh/vendor-completions/", "644"],
]
extended-description = """\
ripgrep (rg) recursively searches your current directory for a regex pattern.
By default, ripgrep will respect your .gitignore and automatically skip hidden
files/directories and binary files.
"""

11
Cross.toml Normal file
View File

@@ -0,0 +1,11 @@
[target.x86_64-unknown-linux-musl]
image = "burntsushi/cross:x86_64-unknown-linux-musl"
[target.i686-unknown-linux-gnu]
image = "burntsushi/cross:i686-unknown-linux-gnu"
[target.mips64-unknown-linux-gnuabi64]
image = "burntsushi/cross:mips64-unknown-linux-gnuabi64"
[target.arm-unknown-linux-gnueabihf]
image = "burntsushi/cross:arm-unknown-linux-gnueabihf"

576
FAQ.md
View File

@@ -16,10 +16,16 @@
* [How do I get around the regex size limit?](#size-limit)
* [How do I make the `-f/--file` flag faster?](#dfa-size)
* [How do I make the output look like The Silver Searcher's output?](#silver-searcher-output)
* [Why does ripgrep get slower when I enabled PCRE2 regexes?](#pcre2-slow)
* [When I run `rg`, why does it execute some other command?](#rg-other-cmd)
* [How do I create an alias for ripgrep on Windows?](#rg-alias-windows)
* [How do I create a PowerShell profile?](#powershell-profile)
* [How do I pipe non-ASCII content to ripgrep on Windows?](#pipe-non-ascii-windows)
* [How can I search and replace with ripgrep?](#search-and-replace)
* [How is ripgrep licensed?](#license)
* [Can ripgrep replace grep?](#posix4ever)
* [What does the "rip" in ripgrep mean?](#intentcountsforsomething)
* [How can I donate to ripgrep or its maintainers?](#donations)
<h3 name="config">
@@ -45,9 +51,9 @@ ripgrep is a project whose contributors are volunteers. A release schedule
adds undue stress to said volunteers. Therefore, releases are made on a best
effort basis and no dates **will ever be given**.
One exception to this is high impact bugs. If a ripgrep release contains a
significant regression, then there will generally be a strong push to get a
patch release out with a fix.
An exception to this _can be_ high impact bugs. If a ripgrep release contains
a significant regression, then there will generally be a strong push to get a
patch release out with a fix. However, no promises are made.
<h3 name="manpage">
@@ -113,7 +119,7 @@ from run to run of ripgrep.
The only way to make the order of results consistent is to ask ripgrep to
sort the output. Currently, this will disable all parallelism. (On smaller
repositories, you might not notice much of a performance difference!) You
can achieve this with the `--sort-files` flag.
can achieve this with the `--sort path` flag.
There is more discussion on this topic here:
https://github.com/BurntSushi/ripgrep/issues/152
@@ -131,10 +137,10 @@ How do I search compressed files?
</h3>
ripgrep's `-z/--search-zip` flag will cause it to search compressed files
automatically. Currently, this supports gzip, bzip2, lzma and xz only and
requires the corresponding `gzip`, `bzip2` and `xz` binaries to be installed on
your system. (That is, ripgrep does decompression by shelling out to another
process.)
automatically. Currently, this supports gzip, bzip2, xz, lzma, lz4, Brotli and
Zstd. Each of these requires requires the corresponding `gzip`, `bzip2`, `xz`,
`lz4`, `brotli` and `zstd` binaries to be installed on your system. (That is,
ripgrep does decompression by shelling out to another process.)
ripgrep currently does not search archive formats, so `*.tar.gz` files, for
example, are skipped.
@@ -144,22 +150,45 @@ example, are skipped.
How do I search over multiple lines?
</h3>
This isn't currently possible. ripgrep is fundamentally a line-oriented search
tool. With that said,
[multiline search is a planned opt-in feature](https://github.com/BurntSushi/ripgrep/issues/176).
The `-U/--multiline` flag enables ripgrep to report results that span over
multiple lines.
<h3 name="fancy">
How do I use lookaround and/or backreferences?
</h3>
This isn't currently possible. ripgrep uses finite automata to implement
regular expression search, and in turn, guarantees linear time searching on all
inputs. It is difficult to efficiently support lookaround and backreferences in
finite automata engines, so ripgrep does not provide these features.
ripgrep's default regex engine does not support lookaround or backreferences.
This is primarily because the default regex engine is implemented using finite
state machines in order to guarantee a linear worst case time complexity on all
inputs. Backreferences are not possible to implement in this paradigm, and
lookaround appears difficult to do efficiently.
If a production quality regular expression engine with these features is ever
written in Rust, then it is possible ripgrep will provide it as an opt-in
However, ripgrep optionally supports using PCRE2 as the regex engine instead of
the default one based on finite state machines. You can enable PCRE2 with the
`-P/--pcre2` flag. For example, in the root of the ripgrep repo, you can easily
find all palindromes:
```
$ rg -P '(\w{10})\1'
tests/misc.rs
483: cmd.arg("--max-filesize").arg("44444444444444444444");
globset/src/glob.rs
1206: matches!(match7, "a*a*a*a*a*a*a*a*a", "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa");
```
If your version of ripgrep doesn't support PCRE2, then you'll get an error
message when you try to use the `-P/--pcre2` flag:
```
$ rg -P '(\w{10})\1'
PCRE2 is not available in this build of ripgrep
```
Most of the releases distributed by the ripgrep project here on GitHub will
come bundled with PCRE2 enabled. If you installed ripgrep through a different
means (like your system's package manager), then please reach out to the
maintainer of that package to see whether it's possible to enable the PCRE2
feature.
@@ -189,10 +218,10 @@ The --colors` flag is a bit more complicated. The general format is:
* `{attribute}` should be one of `fg`, `bg` or `style`, corresponding to
foreground color, background color, or miscellaneous styling (such as whether
to bold the output or not).
* `{value}` is determined by the value of `{attribute}`. If `{attribute}` is
`style`, then `{value}` should be one of `nobold`, `bold`, `nointense` or
`intense`. If `{attribute}` is `fg` or `bg`, then `{value}` should be a
color.
* `{value}` is determined by the value of `{attribute}`. If
`{attribute}` is `style`, then `{value}` should be one of `nobold`,
`bold`, `nointense`, `intense`, `nounderline` or `underline`. If
`{attribute}` is `fg` or `bg`, then `{value}` should be a color.
A color is specified by either one of eight of English names, a single 256-bit
number or an RGB triple (with over 16 million possible values, or "true
@@ -364,6 +393,301 @@ $ RIPGREP_CONFIG_PATH=$HOME/.config/ripgrep/rc rg foo
```
<h3 name="pcre2-slow">
Why does ripgrep get slower when I enable PCRE2 regexes?
</h3>
When you use the `--pcre2` (`-P` for short) flag, ripgrep will use the PCRE2
regex engine instead of the default. Both regex engines are quite fast,
but PCRE2 provides a number of additional features such as look-around and
backreferences that many enjoy using. This is largely because PCRE2 uses
a backtracking implementation where as the default regex engine uses a finite
automaton based implementation. The former provides the ability to add lots of
bells and whistles over the latter, but the latter executes with worst case
linear time complexity.
With that out of the way, if you've used `-P` with ripgrep, you may have
noticed that it can be slower. The reasons for why this is are quite complex,
and they are complex because the optimizations that ripgrep uses to implement
fast search are complex.
The task ripgrep has before it is somewhat simple; all it needs to do is search
a file for occurrences of some pattern and then print the lines containing
those occurrences. The problem lies in what is considered a valid match and how
exactly we read the bytes from a file.
In terms of what is considered a valid match, remember that ripgrep will only
report matches spanning a single line by default. The problem here is that
some patterns can match across multiple lines, and ripgrep needs to prevent
that from happening. For example, `foo\sbar` will match `foo\nbar`. The most
obvious way to achieve this is to read the data from a file, and then apply
the pattern search to that data for each line. The problem with this approach
is that it can be quite slow; it would be much faster to let the pattern
search across as much data as possible. It's faster because it gets rid of the
overhead of finding the boundaries of every line, and also because it gets rid
of the overhead of starting and stopping the pattern search for every single
line. (This is operating under the general assumption that matching lines are
much rarer than non-matching lines.)
It turns out that we can use the faster approach by applying a very simple
restriction to the pattern: *statically prevent* the pattern from matching
through a `\n` character. Namely, when given a pattern like `foo\sbar`,
ripgrep will remove `\n` from the `\s` character class automatically. In some
cases, a simple removal is not so easy. For example, ripgrep will return an
error when your pattern includes a `\n` literal:
```
$ rg '\n'
the literal '"\n"' is not allowed in a regex
```
So what does this have to do with PCRE2? Well, ripgrep's default regex engine
exposes APIs for doing syntactic analysis on the pattern in a way that makes
it quite easy to strip `\n` from the pattern (or otherwise detect it and report
an error if stripping isn't possible). PCRE2 seemingly does not provide a
similar API, so ripgrep does not do any stripping when PCRE2 is enabled. This
forces ripgrep to use the "slow" search strategy of searching each line
individually.
OK, so if enabling PCRE2 slows down the default method of searching because it
forces matches to be limited to a single line, then why is PCRE2 also sometimes
slower when performing multiline searches? Well, that's because there are
*multiple* reasons why using PCRE2 in ripgrep can be slower than the default
regex engine. This time, blame PCRE2's Unicode support, which ripgrep enables
by default. In particular, PCRE2 cannot simultaneously enable Unicode support
and search arbitrary data. That is, when PCRE2's Unicode support is enabled,
the data **must** be valid UTF-8 (to do otherwise is to invoke undefined
behavior). This is in contrast to ripgrep's default regex engine, which can
enable Unicode support and still search arbitrary data. ripgrep's default
regex engine simply won't match invalid UTF-8 for a pattern that can otherwise
only match valid UTF-8. Why doesn't PCRE2 do the same? This author isn't
familiar with its internals, so we can't comment on it here.
The bottom line here is that we can't enable PCRE2's Unicode support without
simultaneously incurring a performance penalty for ensuring that we are
searching valid UTF-8. In particular, ripgrep will transcode the contents
of each file to UTF-8 while replacing invalid UTF-8 data with the Unicode
replacement codepoint. ripgrep then disables PCRE2's own internal UTF-8
checking, since we've guaranteed the data we hand it will be valid UTF-8. The
reason why ripgrep takes this approach is because if we do hand PCRE2 invalid
UTF-8, then it will report a match error if it comes across an invalid UTF-8
sequence. This is not good news for ripgrep, since it will stop it from
searching the rest of the file, and will also print potentially undesirable
error messages to users.
All right, the above is a lot of information to swallow if you aren't already
familiar with ripgrep internals. Let's make this concrete with some examples.
First, let's get some data big enough to magnify the performance differences:
```
$ curl -O 'https://burntsushi.net/stuff/subtitles2016-sample.gz'
$ gzip -d subtitles2016-sample
$ md5sum subtitles2016-sample
e3cb796a20bbc602fbfd6bb43bda45f5 subtitles2016-sample
```
To search this data, we will use the pattern `^\w{42}$`, which contains exactly
one hit in the file and has no literals. Having no literals is important,
because it ensures that the regex engine won't use literal optimizations to
speed up the search. In other words, it lets us reason coherently about the
actual task that the regex engine is performing.
Let's now walk through a few examples in light of the information above. First,
let's consider the default search using ripgrep's default regex engine and
then the same search with PCRE2:
```
$ time rg '^\w{42}$' subtitles2016-sample
21225780:EverymajordevelopmentinthehistoryofAmerica
real 0m1.783s
user 0m1.731s
sys 0m0.051s
$ time rg -P '^\w{42}$' subtitles2016-sample
21225780:EverymajordevelopmentinthehistoryofAmerica
real 0m2.458s
user 0m2.419s
sys 0m0.038s
```
In this particular example, both pattern searches are using a Unicode aware
`\w` character class and both are counting lines in order to report line
numbers. The key difference here is that the first search will not search
line by line, but the second one will. We can observe which strategy ripgrep
uses by passing the `--trace` flag:
```
$ rg '^\w{42}$' subtitles2016-sample --trace
[... snip ...]
TRACE|grep_searcher::searcher|grep-searcher/src/searcher/mod.rs:622: Some("subtitles2016-sample"): searching via memory map
TRACE|grep_searcher::searcher|grep-searcher/src/searcher/mod.rs:712: slice reader: searching via slice-by-line strategy
TRACE|grep_searcher::searcher::core|grep-searcher/src/searcher/core.rs:61: searcher core: will use fast line searcher
[... snip ...]
$ rg -P '^\w{42}$' subtitles2016-sample --trace
[... snip ...]
TRACE|grep_searcher::searcher|grep-searcher/src/searcher/mod.rs:622: Some("subtitles2016-sample"): searching via memory map
TRACE|grep_searcher::searcher|grep-searcher/src/searcher/mod.rs:705: slice reader: needs transcoding, using generic reader
TRACE|grep_searcher::searcher|grep-searcher/src/searcher/mod.rs:685: generic reader: searching via roll buffer strategy
TRACE|grep_searcher::searcher::core|grep-searcher/src/searcher/core.rs:63: searcher core: will use slow line searcher
[... snip ...]
```
The first says it is using the "fast line searcher" where as the latter says
it is using the "slow line searcher." The latter also shows that we are
decoding the contents of the file, which also impacts performance.
Interestingly, in this case, the pattern does not match a `\n` and the file
we're searching is valid UTF-8, so neither the slow line-by-line search
strategy nor the decoding are necessary. We could fix the former issue with
better PCRE2 introspection APIs. We can actually fix the latter issue with
ripgrep's `--no-encoding` flag, which prevents the automatic UTF-8 decoding,
but will enable PCRE2's own UTF-8 validity checking. Unfortunately, it's slower
in my build of ripgrep:
```
$ time rg -P '^\w{42}$' subtitles2016-sample --no-encoding
21225780:EverymajordevelopmentinthehistoryofAmerica
real 0m3.074s
user 0m3.021s
sys 0m0.051s
```
(Tip: use the `--trace` flag to verify that no decoding in ripgrep is
happening.)
A possible reason why PCRE2's UTF-8 checking is slower is because it might
not be better than the highly optimized UTF-8 checking routines found in the
[`encoding_rs`](https://github.com/hsivonen/encoding_rs) library, which is what
ripgrep uses for UTF-8 decoding. Moreover, my build of ripgrep enables
`encoding_rs`'s SIMD optimizations, which may be in play here.
Also, note that using the `--no-encoding` flag can cause PCRE2 to report
invalid UTF-8 errors, which causes ripgrep to stop searching the file:
```
$ cat invalid-utf8
foobar
$ xxd invalid-utf8
00000000: 666f 6fff 6261 720a foo.bar.
$ rg foo invalid-utf8
1:foobar
$ rg -P foo invalid-utf8
1:foo<6F>bar
$ rg -P foo invalid-utf8 --no-encoding
invalid-utf8: PCRE2: error matching: UTF-8 error: illegal byte (0xfe or 0xff)
```
All right, so at this point, you might think that we could remove the penalty
for line-by-line searching by enabling multiline search. After all, our
particular pattern can't match across multiple lines anyway, so we'll still get
the results we want. Let's try it:
```
$ time rg -U '^\w{42}$' subtitles2016-sample
21225780:EverymajordevelopmentinthehistoryofAmerica
real 0m1.803s
user 0m1.748s
sys 0m0.054s
$ time rg -P -U '^\w{42}$' subtitles2016-sample
21225780:EverymajordevelopmentinthehistoryofAmerica
real 0m2.962s
user 0m2.246s
sys 0m0.713s
```
Search times remain the same with the default regex engine, but the PCRE2
search gets _slower_. What happened? The secrets can be revealed with the
`--trace` flag once again. In the former case, ripgrep actually detects that
the pattern can't match across multiple lines, and so will fall back to the
"fast line search" strategy as with our search without `-U`.
However, for PCRE2, things are much worse. Namely, since Unicode mode is still
enabled, ripgrep is still going to decode UTF-8 to ensure that it hands only
valid UTF-8 to PCRE2. Unfortunately, one key downside of multiline search is
that ripgrep cannot do it incrementally. Since matches can be arbitrarily long,
ripgrep actually needs the entire file in memory at once. Normally, we can use
a memory map for this, but because we need to UTF-8 decode the file before
searching it, ripgrep winds up reading the entire contents of the file on to
the heap before executing a search. Owch.
OK, so Unicode is killing us here. The file we're searching is _mostly_ ASCII,
so maybe we're OK with missing some data. (Try `rg '[\w--\p{ascii}]'` to see
non-ASCII word characters that an ASCII-only `\w` character class would miss.)
We can disable Unicode in both searches, but this is done differently depending
on the regex engine we use:
```
$ time rg '(?-u)^\w{42}$' subtitles2016-sample
21225780:EverymajordevelopmentinthehistoryofAmerica
real 0m1.714s
user 0m1.669s
sys 0m0.044s
$ time rg -P '^\w{42}$' subtitles2016-sample --no-pcre2-unicode
21225780:EverymajordevelopmentinthehistoryofAmerica
real 0m1.997s
user 0m1.958s
sys 0m0.037s
```
For the most part, ripgrep's default regex engine performs about the same.
PCRE2 does improve a little bit, and is now almost as fast as the default
regex engine. If you look at the output of `--trace`, you'll see that ripgrep
will no longer perform UTF-8 decoding, but it does still use the slow
line-by-line searcher.
At this point, we can combine all of our insights above: let's try to get off
of the slow line-by-line searcher by enabling multiline mode, and let's stop
UTF-8 decoding by disabling Unicode support:
```
$ time rg -U '(?-u)^\w{42}$' subtitles2016-sample
21225780:EverymajordevelopmentinthehistoryofAmerica
real 0m1.714s
user 0m1.655s
sys 0m0.058s
$ time rg -P -U '^\w{42}$' subtitles2016-sample --no-pcre2-unicode
21225780:EverymajordevelopmentinthehistoryofAmerica
real 0m1.121s
user 0m1.071s
sys 0m0.048s
```
Ah, there's PCRE2's JIT shining! ripgrep's default regex engine once again
remains about the same, but PCRE2 no longer needs to search line-by-line and it
no longer needs to do any kind of UTF-8 checks. This allows the file to get
memory mapped and passed right through PCRE2's JIT at impressive speeds. (As
a brief and interesting historical note, the configuration of "memory map +
multiline + no-Unicode" is exactly the configuration used by The Silver
Searcher. This analysis perhaps sheds some reasoning as to why that
configuration is useful!)
In summary, if you want PCRE2 to go as fast as possible and you don't care
about Unicode and you don't care about matches possibly spanning across
multiple lines, then enable multiline mode with `-U` and disable PCRE2's
Unicode support with the `--no-pcre2-unicode` flag.
Caveat emptor: This author is not a PCRE2 expert, so there may be APIs that can
improve performance that the author missed. Similarly, there may be alternative
designs for a searching tool that are more amenable to how PCRE2 works.
<h3 name="rg-other-cmd">
When I run <code>rg</code>, why does it execute some other command?
</h3>
@@ -466,3 +790,213 @@ that the console will use for printing to UTF-8 with
`[System.Console]::OutputEncoding = [System.Text.Encoding]::UTF8`. This
will also reset when PowerShell is restarted, so you can add that line
to your profile as well if you want to make the setting permanent.
<h3 name="search-and-replace">
How can I search and replace with ripgrep?
</h3>
Using ripgrep alone, you can't. ripgrep is a search tool that will never
touch your files. However, the output of ripgrep can be piped to other tools
that do modify files on disk. See
[this issue](https://github.com/BurntSushi/ripgrep/issues/74) for more
information.
sed is one such tool that can modify files on disk. sed can take a filename
and a substitution command to search and replace in the specified file.
Files containing matching patterns can be provided to sed using
```
rg foo --files-with-matches
```
The output of this command is a list of filenames that contain a match for
the `foo` pattern.
This list can be piped into `xargs`, which will split the filenames from
standard input into arguments for the command following xargs. You can use this
combination to pipe a list of filenames into sed for replacement. For example:
```
rg foo --files-with-matches | xargs sed -i 's/foo/bar/g'
```
will replace all instances of 'foo' with 'bar' in the files in which
ripgrep finds the foo pattern. The `-i` flag to sed indicates that you are
editing files in place, and `s/foo/bar/g` says that you are performing a
**s**ubstitution of the pattren `foo` for `bar`, and that you are doing this
substitution **g**lobally (all occurrences of the pattern in each file).
Note: the above command assumes that you are using GNU sed. If you are using
BSD sed (the default on macOS and FreeBSD) then you must modify the above
command to be the following:
```
rg foo --files-with-matches | xargs sed -i '' 's/foo/bar/g'
```
The `-i` flag in BSD sed requires a file extension to be given to make backups
for all modified files. Specifying the empty string prevents file backups from
being made.
Finally, if any of your file paths contain whitespace in them, then you might
need to delimit your file paths with a NUL terminator. This requires telling
ripgrep to output NUL bytes between each path, and telling xargs to read paths
delimited by NUL bytes:
```
rg foo --files-with-matches -0 | xargs -0 sed -i 's/foo/bar/g'
```
To learn more about sed, see the sed manual
[here](https://www.gnu.org/software/sed/manual/sed.html).
Additionally, Facebook has a tool called
[fastmod](https://github.com/facebookincubator/fastmod)
that uses some of the same libraries as ripgrep and might provide a more
ergonomic search-and-replace experience.
<h3 name="license">
How is ripgrep licensed?
</h3>
ripgrep is dual licensed under the
[Unlicense](https://unlicense.org/)
and MIT licenses. Specifically, you may use ripgrep under the terms of either
license.
The reason why ripgrep is dual licensed this way is two-fold:
1. I, as ripgrep's author, would like to participate in a small bit of
ideological activism by promoting the Unlicense's goal: to disclaim
copyright monopoly interest.
2. I, as ripgrep's author, would like as many people to use rigprep as
possible. Since the Unlicense is not a proven or well known license, ripgrep
is also offered under the MIT license, which is ubiquitous and accepted by
almost everyone.
More specifically, ripgrep and all its dependencies are compatible with this
licensing choice. In particular, ripgrep's dependencies (direct and transitive)
will always be limited to permissive licenses. That is, ripgrep will never
depend on code that is not permissively licensed. This means rejecting any
dependency that uses a copyleft license such as the GPL, LGPL, MPL or any of
the Creative Commons ShareAlike licenses. Whether the license is "weak"
copyleft or not does not matter; ripgrep will **not** depend on it.
<h3 name="posix4ever">
Can ripgrep replace grep?
</h3>
Yes and no.
If, upon hearing that "ripgrep can replace grep," you *actually* hear, "ripgrep
can be used in every instance grep can be used, in exactly the same way, for
the same use cases, with exactly the same bug-for-bug behavior," then no,
ripgrep trivially *cannot* replace grep. Moreover, ripgrep will *never* replace
grep.
If, upon hearing that "ripgrep can replace grep," you *actually* hear, "ripgrep
can replace grep in some cases and not in other use cases," then yes, that is
indeed true!
Let's go over some of those use cases in favor of ripgrep. Some of these may
not apply to you. That's OK. There may be other use cases not listed here that
do apply to you. That's OK too.
(For all claims related to performance in the following words, see my
[blog post](https://blog.burntsushi.net/ripgrep/)
introducing ripgrep.)
* Are you frequently searching a repository of code? If so, ripgrep might be a
good choice since there's likely a good chunk of your repository that you
don't want to search. grep, can, of course, be made to filter files using
recursive search, and if you don't mind writing out the requisite `--exclude`
rules or writing wrapper scripts, then grep might be sufficient. (I'm not
kidding, I myself did this with grep for almost a decade before writing
ripgrep.) But if you instead enjoy having a search tool respect your
`.gitignore`, then ripgrep might be perfect for you!
* Are you frequently searching non-ASCII text that is UTF-8 encoded? One of
ripgrep's key features is that it can handle Unicode features in your
patterns in a way that tends to be faster than GNU grep. Unicode features
in ripgrep are enabled by default; there is no need to configure your locale
settings to use ripgrep properly because ripgrep doesn't respect your locale
settings.
* Do you need to search UTF-16 files and you don't want to bother explicitly
transcoding them? Great. ripgrep does this for you automatically. No need
to enable it.
* Do you need to search a large directory of large files? ripgrep uses
parallelism by default, which tends to make it faster than a standard
`grep -r` search. However, if you're OK writing the occasional
`find ./ -print0 | xargs -P8 -0 grep` command, then maybe grep is good
enough.
Here are some cases where you might *not* want to use ripgrep. The same caveats
for the previous section apply.
* Are you writing portable shell scripts intended to work in a variety of
environments? Great, probably not a good idea to use ripgrep! ripgrep has
nowhere near the ubiquity of grep, so if you do use ripgrep, you might need
to futz with the installation process more than you would with grep.
* Do you care about POSIX compatibility? If so, then you can't use ripgrep
because it never was, isn't and never will be POSIX compatible.
* Do you hate tools that try to do something smart? If so, ripgrep is all about
being smart, so you might prefer to just stick with grep.
* Is there a particular feature of grep you rely on that ripgrep either doesn't
have or never will have? If the former, file a bug report, maybe ripgrep can
do it! If the latter, well, then, just use grep.
<h3 name="intentcountsforsomething">
What does the "rip" in ripgrep mean?
</h3>
When I first started writing ripgrep, I called it `rep`, intending it to be a
shorter variant of `grep`. Soon after, I renamed it to `xrep` since `rep`
wasn't obvious enough of a name for my taste. And also because adding `x` to
anything always makes it better, right?
Before ripgrep's first public release, I decided that I didn't like `xrep`. I
thought it was slightly awkward to type, and despite my previous praise of the
letter `x`, I kind of thought it was pretty lame. Being someone who really
likes Rust, I wanted to call it "rustgrep" or maybe "rgrep" for short. But I
thought that was just as lame, and maybe a little too in-your-face. But I
wanted to continue using `r` so I could at least pretend Rust had something to
do with it.
I spent a couple of days trying to think of very short words that began with
the letter `r` that were even somewhat related to the task of searching. I
don't remember how it popped into my head, but "rip" came up as something that
meant "fast," as in, "to rip through your text." The fact that RIP is also
an initialism for "Rest in Peace" (as in, "ripgrep kills grep") never really
dawned on me. Perhaps the coincidence is too striking to believe that, but
I didn't realize it until someone explicitly pointed it out to me after the
initial public release. I admit that I found it mildly amusing, but if I had
realized it myself before the public release, I probably would have pressed on
and chose a different name. Alas, renaming things after a release is hard, so I
decided to mush on.
Given the fact that
[ripgrep never was, is or will be a 100% drop-in replacement for
grep](#posix4ever),
ripgrep is neither actually a "grep killer" nor was it ever intended to be. It
certainly does eat into some of its use cases, but that's nothing that other
tools like ack or The Silver Searcher weren't already doing.
<h3 name="donations">
How can I donate to ripgrep or its maintainers?
</h3>
As of now, you can't. While I believe the various efforts that are being
undertaken to help fund FOSS are extremely important, they aren't a good fit
for me. ripgrep is and I hope will remain a project of love that I develop in
my free time. As such, involving money---even in the form of donations given
without expectations---would severely change that dynamic for me personally.
Instead, I'd recommend donating to something else that is doing work that you
find meaningful. If you would like suggestions, then my favorites are:
* [The Internet Archive](https://archive.org/donate/)
* [Rails Girls](https://railsgirlssummerofcode.org/campaign/)
* [Wikipedia](https://wikimediafoundation.org/support/)

158
GUIDE.md
View File

@@ -18,6 +18,7 @@ translatable to any command line shell environment.
* [Replacements](#replacements)
* [Configuration file](#configuration-file)
* [File encoding](#file-encoding)
* [Binary data](#binary-data)
* [Common options](#common-options)
@@ -58,6 +59,10 @@ $ rg fast README.md
129: optimizations to make searching very fast.
```
(**Note:** If you see an error message from ripgrep saying that it didn't
search any files, then re-run ripgrep with the `--debug` flag. One likely cause
of this is that you have a `*` rule in a `$HOME/.gitignore` file.)
So what happened here? ripgrep read the contents of `README.md`, and for each
line that contained `fast`, ripgrep printed it to your terminal. ripgrep also
included the line number for each line by default. If your terminal supports
@@ -105,7 +110,7 @@ colors, you'll notice that `faster` will be highlighted instead of just the
It is beyond the scope of this guide to provide a full tutorial on regular
expressions, but ripgrep's specific syntax is documented here:
https://docs.rs/regex/0.2.5/regex/#syntax
https://docs.rs/regex/*/regex/#syntax
### Recursive search
@@ -223,7 +228,7 @@ with the following contents:
```
ripgrep treats `.ignore` files with higher precedence than `.gitignore` files
(and treats `.rgignore` files with higher precdence than `.ignore` files).
(and treats `.rgignore` files with higher precedence than `.ignore` files).
This means ripgrep will see the `!log/` whitelist rule first and search that
directory.
@@ -231,6 +236,11 @@ Like `.gitignore`, a `.ignore` file can be placed in any directory. Its rules
will be processed with respect to the directory it resides in, just like
`.gitignore`.
To process `.gitignore` and `.ignore` files case insensitively, use the flag
`--ignore-file-case-insensitive`. This is especially useful on case insensitive
file systems like those on Windows and macOS. Note though that this can come
with a significant performance penalty, and is therefore disabled by default.
For a more in depth description of how glob patterns in a `.gitignore` file
are interpreted, please see `man gitignore`.
@@ -401,6 +411,21 @@ alias rg="rg --type-add 'web:*.{html,css,js}'"
or add `--type-add=web:*.{html,css,js}` to your ripgrep configuration file.
([Configuration files](#configuration-file) are covered in more detail later.)
#### The special `all` file type
A special option supported by the `--type` flag is `all`. `--type all` looks
for a match in any of the supported file types listed by `--type-list`,
including those added on the command line using `--type-add`. It's equivalent
to the command `rg --type agda --type asciidoc --type asm ...`, where `...`
stands for a list of `--type` flags for the rest of the types in `--type-list`.
As an example, let's suppose you have a shell script in your current directory,
`my-shell-script`, which includes a shell library, `my-shell-library.bash`.
Both `rg --type sh` and `rg --type all` would only search for matches in
`my-shell-library.bash`, not `my-shell-script`, because the globs matched
by the `sh` file type don't include files without an extension. On the
other hand, `rg --type-not all` would search `my-shell-script` but not
`my-shell-library.bash`.
### Replacements
@@ -516,9 +541,9 @@ config file. Once the environment variable is set, open the file and just type
in the flags you want set automatically. There are only two rules for
describing the format of the config file:
1. Every line is a shell argument, after trimming ASCII whitespace.
2. Lines starting with `#` (optionally preceded by any amount of
ASCII whitespace) are ignored.
1. Every line is a shell argument, after trimming whitespace.
2. Lines starting with `#` (optionally preceded by any amount of whitespace)
are ignored.
In particular, there is no escaping. Each line is given to ripgrep as a single
command line argument verbatim.
@@ -528,13 +553,21 @@ formatting peculiarities:
```
$ cat $HOME/.ripgreprc
# Don't let ripgrep vomit really long lines to my terminal.
# Don't let ripgrep vomit really long lines to my terminal, and show a preview.
--max-columns=150
--max-columns-preview
# Add my 'web' type.
--type-add
web:*.{html,css,js}*
# Using glob patterns to include/exclude files or folders
--glob=!git/*
# or
--glob
!git/*
# Set the colors.
--colors=line:none
--colors=line:style:bold
@@ -569,7 +602,7 @@ override it.
If you're confused about what configuration file ripgrep is reading arguments
from, then running ripgrep with the `--debug` flag should help clarify things.
The debug output should note what config file is being loaded and the arugments
The debug output should note what config file is being loaded and the arguments
that have been read from the configuration.
Finally, if you want to make absolutely sure that ripgrep *isn't* reading a
@@ -587,13 +620,14 @@ topic, but we can try to summarize its relevancy to ripgrep:
* Files are generally just a bundle of bytes. There is no reliable way to know
their encoding.
* Either the encoding of the pattern must match the encoding of the files being
searched, or a form of transcoding must be performed converts either the
searched, or a form of transcoding must be performed that converts either the
pattern or the file to the same encoding as the other.
* ripgrep tends to work best on plain text files, and among plain text files,
the most popular encodings likely consist of ASCII, latin1 or UTF-8. As
a special exception, UTF-16 is prevalent in Windows environments
In light of the above, here is how ripgrep behaves:
In light of the above, here is how ripgrep behaves when `--encoding auto` is
given, which is the default:
* All input is assumed to be ASCII compatible (which means every byte that
corresponds to an ASCII codepoint actually is an ASCII codepoint). This
@@ -609,12 +643,15 @@ In light of the above, here is how ripgrep behaves:
they correspond to a UTF-16 BOM, then ripgrep will transcode the contents of
the file from UTF-16 to UTF-8, and then execute the search on the transcoded
version of the file. (This incurs a performance penalty since transcoding
is slower than regex searching.)
is slower than regex searching.) If the file contains invalid UTF-16, then
the Unicode replacement codepoint is substituted in place of invalid code
units.
* To handle other cases, ripgrep provides a `-E/--encoding` flag, which permits
you to specify an encoding from the
[Encoding Standard](https://encoding.spec.whatwg.org/#concept-encoding-get).
ripgrep will assume *all* files searched are the encoding specified and
will perform a transcoding step just like in the UTF-16 case described above.
ripgrep will assume *all* files searched are the encoding specified (unless
the file has a BOM) and will perform a transcoding step just like in the
UTF-16 case described above.
By default, ripgrep will not require its input be valid UTF-8. That is, ripgrep
can and will search arbitrary bytes. The key here is that if you're searching
@@ -624,9 +661,26 @@ pattern won't find anything. With all that said, this mode of operation is
important, because it lets you find ASCII or UTF-8 *within* files that are
otherwise arbitrary bytes.
As a special case, the `-E/--encoding` flag supports the value `none`, which
will completely disable all encoding related logic, including BOM sniffing.
When `-E/--encoding` is set to `none`, ripgrep will search the raw bytes of
the underlying file with no transcoding step. For example, here's how you might
search the raw UTF-16 encoding of the string `Шерлок`:
```
$ rg '(?-u)\(\x045\x04@\x04;\x04>\x04:\x04' -E none -a some-utf16-file
```
Of course, that's just an example meant to show how one can drop down into
raw bytes. Namely, the simpler command works as you might expect automatically:
```
$ rg 'Шерлок' some-utf16-file
```
Finally, it is possible to disable ripgrep's Unicode support from within the
pattern regular expression. For example, let's say you wanted `.` to match any
byte rather than any Unicode codepoint. (You might want this while searching a
regular expression. For example, let's say you wanted `.` to match any byte
rather than any Unicode codepoint. (You might want this while searching a
binary file, since `.` by default will not match invalid UTF-8.) You could do
this by disabling Unicode via a regular expression flag:
@@ -643,6 +697,76 @@ $ rg '\w(?-u:\w)\w'
```
### Binary data
In addition to skipping hidden files and files in your `.gitignore` by default,
ripgrep also attempts to skip binary files. ripgrep does this by default
because binary files (like PDFs or images) are typically not things you want to
search when searching for regex matches. Moreover, if content in a binary file
did match, then it's possible for undesirable binary data to be printed to your
terminal and wreak havoc.
Unfortunately, unlike skipping hidden files and respecting your `.gitignore`
rules, a file cannot as easily be classified as binary. In order to figure out
whether a file is binary, the most effective heuristic that balances
correctness with performance is to simply look for `NUL` bytes. At that point,
the determination is simple: a file is considered "binary" if and only if it
contains a `NUL` byte somewhere in its contents.
The issue is that while most binary files will have a `NUL` byte toward the
beginning of its contents, this is not necessarily true. The `NUL` byte might
be the very last byte in a large file, but that file is still considered
binary. While this leads to a fair amount of complexity inside ripgrep's
implementation, it also results in some unintuitive user experiences.
At a high level, ripgrep operates in three different modes with respect to
binary files:
1. The default mode is to attempt to remove binary files from a search
completely. This is meant to mirror how ripgrep removes hidden files and
files in your `.gitignore` automatically. That is, as soon as a file is
detected as binary, searching stops. If a match was already printed (because
it was detected long before a `NUL` byte), then ripgrep will print a warning
message indicating that the search stopped prematurely. This default mode
**only applies to files searched by ripgrep as a result of recursive
directory traversal**, which is consistent with ripgrep's other automatic
filtering. For example, `rg foo .file` will search `.file` even though it
is hidden. Similarly, `rg foo binary-file` will search `binary-file` in
"binary" mode automatically.
2. Binary mode is similar to the default mode, except it will not always
stop searching after it sees a `NUL` byte. Namely, in this mode, ripgrep
will continue searching a file that is known to be binary until the first
of two conditions is met: 1) the end of the file has been reached or 2) a
match is or has been seen. This means that in binary mode, if ripgrep
reports no matches, then there are no matches in the file. When a match does
occur, ripgrep prints a message similar to one it prints when in its default
mode indicating that the search has stopped prematurely. This mode can be
forcefully enabled for all files with the `--binary` flag. The purpose of
binary mode is to provide a way to discover matches in all files, but to
avoid having binary data dumped into your terminal.
3. Text mode completely disables all binary detection and searches all files
as if they were text. This is useful when searching a file that is
predominantly text but contains a `NUL` byte, or if you are specifically
trying to search binary data. This mode can be enabled with the `-a/--text`
flag. Note that when using this mode on very large binary files, it is
possible for ripgrep to use a lot of memory.
Unfortunately, there is one additional complexity in ripgrep that can make it
difficult to reason about binary files. That is, the way binary detection works
depends on the way that ripgrep searches your files. Specifically:
* When ripgrep uses memory maps, then binary detection is only performed on the
first few kilobytes of the file in addition to every matching line.
* When ripgrep doesn't use memory maps, then binary detection is performed on
all bytes searched.
This means that whether a file is detected as binary or not can change based
on the internal search strategy used by ripgrep. If you prefer to keep
ripgrep's binary file detection consistent, then you can disable memory maps
via the `--no-mmap` flag. (The cost will be a small performance regression when
searching very large files on some platforms.)
### Common options
ripgrep has a lot of flags. Too many to keep in your head at once. This section
@@ -664,10 +788,10 @@ used options that will likely impact how you use ripgrep on a regular basis.
* `--files`: Print the files that ripgrep *would* search, but don't actually
search them.
* `-a/--text`: Search binary files as if they were plain text.
* `-z/--search-zip`: Search compressed files (gzip, bzip2, lzma, xz). This is
disabled by default.
* `-z/--search-zip`: Search compressed files (gzip, bzip2, lzma, xz, lz4,
brotli, zstd). This is disabled by default.
* `-C/--context`: Show the lines surrounding a match.
* `--sort-files`: Force ripgrep to sort its output by file name. (This disables
* `--sort path`: Force ripgrep to sort its output by file name. (This disables
parallelism, so it might be slower.)
* `-L/--follow`: Follow symbolic links while recursively searching.
* `-M/--max-columns`: Limit the length of lines printed by ripgrep.

View File

@@ -2,6 +2,12 @@
Replace this text with the output of `rg --version`.
#### How did you install ripgrep?
If you installed ripgrep with snap and are getting strange file permission or
file not found errors, then please do not file a bug. Instead, use one of the
Github binary releases.
#### What operating system are you using ripgrep on?
Replace this text with your operating system and version.

322
README.md
View File

@@ -1,17 +1,18 @@
ripgrep (rg)
------------
ripgrep is a line-oriented search tool that recursively searches your current
directory for a regex pattern while respecting your gitignore rules. ripgrep
directory for a regex pattern. By default, ripgrep will respect your .gitignore
and automatically skip hidden files/directories and binary files. ripgrep
has first class support on Windows, macOS and Linux, with binary downloads
available for [every release](https://github.com/BurntSushi/ripgrep/releases).
ripgrep is similar to other popular search tools like The Silver Searcher,
ack and grep.
ripgrep is similar to other popular search tools like The Silver Searcher, ack
and grep.
[![Linux build status](https://travis-ci.org/BurntSushi/ripgrep.svg?branch=master)](https://travis-ci.org/BurntSushi/ripgrep)
[![Windows build status](https://ci.appveyor.com/api/projects/status/github/BurntSushi/ripgrep?svg=true)](https://ci.appveyor.com/project/BurntSushi/ripgrep)
[![](https://img.shields.io/crates/v/ripgrep.svg)](https://crates.io/crates/ripgrep)
[![Build status](https://github.com/BurntSushi/ripgrep/workflows/ci/badge.svg)](https://github.com/BurntSushi/ripgrep/actions)
[![Crates.io](https://img.shields.io/crates/v/ripgrep.svg)](https://crates.io/crates/ripgrep)
[![Packaging status](https://repology.org/badge/tiny-repos/ripgrep.svg)](https://repology.org/project/ripgrep/badges)
Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org).
Dual-licensed under MIT or the [UNLICENSE](https://unlicense.org).
### CHANGELOG
@@ -23,76 +24,78 @@ Please see the [CHANGELOG](CHANGELOG.md) for a release history.
* [Installation](#installation)
* [User Guide](GUIDE.md)
* [Frequently Asked Questions](FAQ.md)
* [Regex syntax](https://docs.rs/regex/0.2.5/regex/#syntax)
* [Regex syntax](https://docs.rs/regex/1/regex/#syntax)
* [Configuration files](GUIDE.md#configuration-file)
* [Shell completions](FAQ.md#complete)
* [Building](#building)
* [Translations](#translations)
### Screenshot of search results
[![A screenshot of a sample search with ripgrep](http://burntsushi.net/stuff/ripgrep1.png)](http://burntsushi.net/stuff/ripgrep1.png)
[![A screenshot of a sample search with ripgrep](https://burntsushi.net/stuff/ripgrep1.png)](https://burntsushi.net/stuff/ripgrep1.png)
### Quick examples comparing tools
This example searches the entire Linux kernel source tree (after running
`make defconfig && make -j8`) for `[A-Z]+_SUSPEND`, where all matches must be
words. Timings were collected on a system with an Intel i7-6900K 3.2 GHz, and
ripgrep was compiled with SIMD enabled.
This example searches the entire
[Linux kernel source tree](https://github.com/BurntSushi/linux)
(after running `make defconfig && make -j8`) for `[A-Z]+_SUSPEND`, where
all matches must be words. Timings were collected on a system with an Intel
i7-6900K 3.2 GHz.
Please remember that a single benchmark is never enough! See my
[blog post on ripgrep](http://blog.burntsushi.net/ripgrep/)
[blog post on ripgrep](https://blog.burntsushi.net/ripgrep/)
for a very detailed comparison with more benchmarks and analysis.
| Tool | Command | Line count | Time |
| ---- | ------- | ---------- | ---- |
| ripgrep (Unicode) | `rg -n -w '[A-Z]+_SUSPEND'` | 450 | **0.106s** |
| [git grep](https://www.kernel.org/pub/software/scm/git/docs/git-grep.html) | `LC_ALL=C git grep -E -n -w '[A-Z]+_SUSPEND'` | 450 | 0.553s |
| [The Silver Searcher](https://github.com/ggreer/the_silver_searcher) | `ag -w '[A-Z]+_SUSPEND'` | 450 | 0.589s |
| [git grep (Unicode)](https://www.kernel.org/pub/software/scm/git/docs/git-grep.html) | `LC_ALL=en_US.UTF-8 git grep -E -n -w '[A-Z]+_SUSPEND'` | 450 | 2.266s |
| [sift](https://github.com/svent/sift) | `sift --git -n -w '[A-Z]+_SUSPEND'` | 450 | 3.505s |
| [ack](https://github.com/petdance/ack2) | `ack -w '[A-Z]+_SUSPEND'` | 1878 | 6.823s |
| [The Platinum Searcher](https://github.com/monochromegane/the_platinum_searcher) | `pt -w -e '[A-Z]+_SUSPEND'` | 450 | 14.208s |
| ripgrep (Unicode) | `rg -n -w '[A-Z]+_SUSPEND'` | 452 | **0.136s** |
| [git grep](https://www.kernel.org/pub/software/scm/git/docs/git-grep.html) | `git grep -P -n -w '[A-Z]+_SUSPEND'` | 452 | 0.348s |
| [ugrep (Unicode)](https://github.com/Genivia/ugrep) | `ugrep -r --ignore-files --no-hidden -I -w '[A-Z]+_SUSPEND'` | 452 | 0.506s |
| [git grep](https://www.kernel.org/pub/software/scm/git/docs/git-grep.html) | `LC_ALL=C git grep -E -n -w '[A-Z]+_SUSPEND'` | 452 | 1.150s |
| [The Silver Searcher](https://github.com/ggreer/the_silver_searcher) | `ag -w '[A-Z]+_SUSPEND'` | 452 | 0.654s |
| [ack](https://github.com/beyondgrep/ack3) | `ack -w '[A-Z]+_SUSPEND'` | 452 | 4.054s |
| [git grep (Unicode)](https://www.kernel.org/pub/software/scm/git/docs/git-grep.html) | `LC_ALL=en_US.UTF-8 git grep -E -n -w '[A-Z]+_SUSPEND'` | 452 | 4.205s |
(Yes, `ack` [has](https://github.com/petdance/ack2/issues/445) a
[bug](https://github.com/petdance/ack2/issues/14).)
Here's another benchmark that disregards gitignore files and searches with a
whitelist instead. The corpus is the same as in the previous benchmark, and the
flags passed to each command ensure that they are doing equivalent work:
Here's another benchmark on the same corpus as above that disregards gitignore
files and searches with a whitelist instead. The corpus is the same as in the
previous benchmark, and the flags passed to each command ensure that they are
doing equivalent work:
| Tool | Command | Line count | Time |
| ---- | ------- | ---------- | ---- |
| ripgrep | `rg -L -u -tc -n -w '[A-Z]+_SUSPEND'` | 404 | **0.079s** |
| [ucg](https://github.com/gvansickle/ucg) | `ucg --type=cc -w '[A-Z]+_SUSPEND'` | 390 | 0.163s |
| [GNU grep](https://www.gnu.org/software/grep/) | `egrep -R -n --include='*.c' --include='*.h' -w '[A-Z]+_SUSPEND'` | 404 | 0.611s |
| ripgrep | `rg -uuu -tc -n -w '[A-Z]+_SUSPEND'` | 388 | **0.096s** |
| [ugrep](https://github.com/Genivia/ugrep) | `ugrep -r -n --include='*.c' --include='*.h' -w '[A-Z]+_SUSPEND'` | 388 | 0.493s |
| [GNU grep](https://www.gnu.org/software/grep/) | `egrep -r -n --include='*.c' --include='*.h' -w '[A-Z]+_SUSPEND'` | 388 | 0.806s |
(`ucg` [has slightly different behavior in the presence of symbolic links](https://github.com/gvansickle/ucg/issues/106).)
And finally, a straight-up comparison between ripgrep and GNU grep on a single
large file (~9.3GB,
[`OpenSubtitles2016.raw.en.gz`](http://opus.lingfil.uu.se/OpenSubtitles2016/mono/OpenSubtitles2016.raw.en.gz)):
And finally, a straight-up comparison between ripgrep, ugrep and GNU grep on a
single large file cached in memory
(~13GB, [`OpenSubtitles.raw.en.gz`](http://opus.nlpl.eu/download.php?f=OpenSubtitles/v2018/mono/OpenSubtitles.raw.en.gz)):
| Tool | Command | Line count | Time |
| ---- | ------- | ---------- | ---- |
| ripgrep | `rg -w 'Sherlock [A-Z]\w+'` | 5268 | **2.108s** |
| [GNU grep](https://www.gnu.org/software/grep/) | `LC_ALL=C egrep -w 'Sherlock [A-Z]\w+'` | 5268 | 7.014s |
| ripgrep | `rg -w 'Sherlock [A-Z]\w+'` | 7882 | **2.769s** |
| [ugrep](https://github.com/Genivia/ugrep) | `ugrep -w 'Sherlock [A-Z]\w+'` | 7882 | 6.802s |
| [GNU grep](https://www.gnu.org/software/grep/) | `LC_ALL=en_US.UTF-8 egrep -w 'Sherlock [A-Z]\w+'` | 7882 | 9.027s |
In the above benchmark, passing the `-n` flag (for showing line numbers)
increases the times to `2.640s` for ripgrep and `10.277s` for GNU grep.
increases the times to `3.423s` for ripgrep and `13.031s` for GNU grep. ugrep
times are unaffected by the presence or absence of `-n`.
### Why should I use ripgrep?
* It can replace both The Silver Searcher and GNU grep because it is generally
faster than both. (N.B. It is not, strictly speaking, a "drop-in" replacement
for both, but the feature sets are far more similar than different.)
* Like The Silver Searcher, ripgrep defaults to recursive directory search
and won't search files ignored by your `.gitignore` files. It also ignores
hidden and binary files by default. ripgrep also implements full support
for `.gitignore`, whereas there are many bugs related to that functionality
in The Silver Searcher.
* It can replace many use cases served by other search tools
because it contains most of their features and is generally faster. (See
[the FAQ](FAQ.md#posix4ever) for more details on whether ripgrep can truly
replace grep.)
* Like other tools specialized to code search, ripgrep defaults to recursive
directory search and won't search files ignored by your
`.gitignore`/`.ignore`/`.rgignore` files. It also ignores hidden and binary
files by default. ripgrep also implements full support for `.gitignore`,
whereas there are many bugs related to that functionality in other code
search tools claiming to provide the same functionality.
* ripgrep can search specific types of files. For example, `rg -tpy foo`
limits your search to Python files and `rg -Tjs foo` excludes Javascript
files from your search. ripgrep can be taught about new file types with
@@ -101,48 +104,60 @@ increases the times to `2.640s` for ripgrep and `10.277s` for GNU grep.
of search results, searching multiple patterns, highlighting matches with
color and full Unicode support. Unlike GNU grep, ripgrep stays fast while
supporting Unicode (which is always on).
* ripgrep has optional support for switching its regex engine to use PCRE2.
Among other things, this makes it possible to use look-around and
backreferences in your patterns, which are not supported in ripgrep's default
regex engine. PCRE2 support can be enabled with `-P/--pcre2` (use PCRE2
always) or `--auto-hybrid-regex` (use PCRE2 only if needed). An alternative
syntax is provided via the `--engine (default|pcre2|auto-hybrid)` option.
* ripgrep supports searching files in text encodings other than UTF-8, such
as UTF-16, latin-1, GBK, EUC-JP, Shift_JIS and more. (Some support for
automatically detecting UTF-16 is provided. Other text encodings must be
specifically specified with the `-E/--encoding` flag.)
* ripgrep supports searching files compressed in a common format (gzip, xz,
lzma or bzip2 current) with the `-z/--search-zip` flag.
* ripgrep supports searching files compressed in a common format (brotli,
bzip2, gzip, lz4, lzma, xz, or zstandard) with the `-z/--search-zip` flag.
* ripgrep supports arbitrary input preprocessing filters which could be PDF
text extraction, less supported decompression, decrypting, automatic encoding
detection and so on.
In other words, use ripgrep if you like speed, filtering by default, fewer
bugs, and Unicode support.
bugs and Unicode support.
### Why shouldn't I use ripgrep?
I'd like to try to convince you why you *shouldn't* use ripgrep. This should
give you a glimpse at some important downsides or missing features of
ripgrep.
Despite initially not wanting to add every feature under the sun to ripgrep,
over time, ripgrep has grown support for most features found in other file
searching tools. This includes searching for results spanning across multiple
lines, and opt-in support for PCRE2, which provides look-around and
backreference support.
* ripgrep uses a regex engine based on finite automata, so if you want fancy
regex features such as backreferences or lookaround, ripgrep won't provide
them to you. ripgrep does support lots of things though, including, but not
limited to: lazy quantification (e.g., `a+?`), repetitions (e.g., `a{2,5}`),
begin/end assertions (e.g., `^\w+$`), word boundaries (e.g., `\bfoo\b`), and
support for Unicode categories (e.g., `\p{Sc}` to match currency symbols or
`\p{Lu}` to match any uppercase letter). (Fancier regexes will never be
supported.)
* ripgrep doesn't have multiline search. (Will happen as an opt-in feature.)
At this point, the primary reasons not to use ripgrep probably consist of one
or more of the following:
In other words, if you like fancy regexes or multiline search, then ripgrep
may not quite meet your needs (yet).
* You need a portable and ubiquitous tool. While ripgrep works on Windows,
macOS and Linux, it is not ubiquitous and it does not conform to any
standard such as POSIX. The best tool for this job is good old grep.
* There still exists some other feature (or bug) not listed in this README that
you rely on that's in another tool that isn't in ripgrep.
* There is a performance edge case where ripgrep doesn't do well where another
tool does do well. (Please file a bug report!)
* ripgrep isn't possible to install on your machine or isn't available for your
platform. (Please file a bug report!)
### Is it really faster than everything else?
Generally, yes. A large number of benchmarks with detailed analysis for each is
[available on my blog](http://blog.burntsushi.net/ripgrep/).
[available on my blog](https://blog.burntsushi.net/ripgrep/).
Summarizing, ripgrep is fast because:
* It is built on top of
[Rust's regex engine](https://github.com/rust-lang-nursery/regex).
[Rust's regex engine](https://github.com/rust-lang/regex).
Rust's regex engine uses finite automata, SIMD and aggressive literal
optimizations to make searching very fast.
optimizations to make searching very fast. (PCRE2 support can be opted into
with the `-P/--pcre2` flag.)
* Rust's regex library maintains performance with full Unicode support by
building UTF-8 decoding directly into its deterministic finite automaton
engine.
@@ -151,7 +166,7 @@ Summarizing, ripgrep is fast because:
latter is better for large directories. ripgrep chooses the best searching
strategy for you automatically.
* Applies your ignore patterns in `.gitignore` files using a
[`RegexSet`](https://doc.rust-lang.org/regex/regex/struct.RegexSet.html).
[`RegexSet`](https://docs.rs/regex/1/regex/struct.RegexSet.html).
That means a single file path can be matched against multiple glob patterns
simultaneously.
* It uses a lock-free parallel recursive directory iterator, courtesy of
@@ -165,6 +180,11 @@ Andy Lester, author of [ack](https://beyondgrep.com/), has published an
excellent table comparing the features of ack, ag, git-grep, GNU grep and
ripgrep: https://beyondgrep.com/feature-comparison/
Note that ripgrep has grown a few significant new features recently that
are not yet present in Andy's table. This includes, but is not limited to,
configuration files, passthru, support for searching compressed files,
multiline search and opt-in fancy regex support via PCRE2.
### Installation
@@ -172,8 +192,8 @@ The binary name for ripgrep is `rg`.
**[Archives of precompiled binaries for ripgrep are available for Windows,
macOS and Linux.](https://github.com/BurntSushi/ripgrep/releases)** Users of
platforms not explicitly mentioned below (such as Debian) are advised
to download one of these archives.
platforms not explicitly mentioned below are advised to download one of these
archives.
Linux binaries are static executables. Windows binaries are available either as
built with MinGW (GNU) or with Microsoft Visual C++ (MSVC). When possible,
@@ -181,54 +201,63 @@ prefer MSVC over GNU, but you'll need to have the [Microsoft VC++ 2015
redistributable](https://www.microsoft.com/en-us/download/details.aspx?id=48145)
installed.
If you're a **macOS Homebrew** or a **Linuxbrew** user,
then you can install ripgrep either
from homebrew-core, (compiled with rust stable, no SIMD):
If you're a **macOS Homebrew** or a **Linuxbrew** user, then you can install
ripgrep from homebrew-core:
```
$ brew install ripgrep
```
or you can install a binary compiled with rust nightly (including SIMD and all
optimizations) by utilizing a custom tap:
If you're a **MacPorts** user, then you can install ripgrep from the
[official ports](https://www.macports.org/ports.php?by=name&substr=ripgrep):
```
$ brew tap burntsushi/ripgrep https://github.com/BurntSushi/ripgrep.git
$ brew install burntsushi/ripgrep/ripgrep-bin
$ sudo port install ripgrep
```
If you're a **Windows Chocolatey** user, then you can install ripgrep from the [official repo](https://chocolatey.org/packages/ripgrep):
If you're a **Windows Chocolatey** user, then you can install ripgrep from the
[official repo](https://chocolatey.org/packages/ripgrep):
```
$ choco install ripgrep
```
If you're a **Windows Scoop** user, then you can install ripgrep from the
[official bucket](https://github.com/ScoopInstaller/Main/blob/master/bucket/ripgrep.json):
```
$ scoop install ripgrep
```
If you're an **Arch Linux** user, then you can install ripgrep from the official repos:
```
$ pacman -S ripgrep
```
If you're a **Gentoo** user, you can install ripgrep from the [official repo](https://packages.gentoo.org/packages/sys-apps/ripgrep):
If you're a **Gentoo** user, you can install ripgrep from the
[official repo](https://packages.gentoo.org/packages/sys-apps/ripgrep):
```
$ emerge sys-apps/ripgrep
```
If you're a **Fedora 27+** user, you can install ripgrep from official repositories.
If you're a **Fedora** user, you can install ripgrep from official
repositories.
```
$ sudo dnf install ripgrep
```
If you're a **Fedora 24+** user, you can install ripgrep from [copr](https://copr.fedorainfracloud.org/coprs/carlwgeorge/ripgrep/):
If you're an **openSUSE** user, ripgrep is included in **openSUSE Tumbleweed**
and **openSUSE Leap** since 15.1.
```
$ sudo dnf copr enable carlwgeorge/ripgrep
$ sudo dnf install ripgrep
$ sudo zypper install ripgrep
```
If you're a **RHEL/CentOS 7** user, you can install ripgrep from [copr](https://copr.fedorainfracloud.org/coprs/carlwgeorge/ripgrep/):
If you're a **RHEL/CentOS 7/8** user, you can install ripgrep from
[copr](https://copr.fedorainfracloud.org/coprs/carlwgeorge/ripgrep/):
```
$ sudo yum-config-manager --add-repo=https://copr.fedorainfracloud.org/coprs/carlwgeorge/ripgrep/repo/epel-7/carlwgeorge-ripgrep-epel-7.repo
@@ -243,17 +272,72 @@ $ nix-env --install ripgrep
$ # (Or using the attribute name, which is also ripgrep.)
```
If you're an **Ubuntu** user, ripgrep can be installed from the `snap` store.
* Note that if you are using `16.04 LTS` or later, snap is already installed.
* For older versions you can install snap using
[this guide](https://docs.snapcraft.io/core/install-ubuntu).
If you're a **Debian** user (or a user of a Debian derivative like **Ubuntu**),
then ripgrep can be installed using a binary `.deb` file provided in each
[ripgrep release](https://github.com/BurntSushi/ripgrep/releases).
```
sudo snap install rg
$ curl -LO https://github.com/BurntSushi/ripgrep/releases/download/11.0.2/ripgrep_11.0.2_amd64.deb
$ sudo dpkg -i ripgrep_11.0.2_amd64.deb
```
If you run Debian Buster (currently Debian stable) or Debian sid, ripgrep is
[officially maintained by Debian](https://tracker.debian.org/pkg/rust-ripgrep).
```
$ sudo apt-get install ripgrep
```
If you're an **Ubuntu Cosmic (18.10)** (or newer) user, ripgrep is
[available](https://launchpad.net/ubuntu/+source/rust-ripgrep) using the same
packaging as Debian:
```
$ sudo apt-get install ripgrep
```
(N.B. Various snaps for ripgrep on Ubuntu are also available, but none of them
seem to work right and generate a number of very strange bug reports that I
don't know how to fix and don't have the time to fix. Therefore, it is no
longer a recommended installation option.)
If you're a **FreeBSD** user, then you can install ripgrep from the
[official ports](https://www.freshports.org/textproc/ripgrep/):
```
# pkg install ripgrep
```
If you're an **OpenBSD** user, then you can install ripgrep from the
[official ports](http://openports.se/textproc/ripgrep):
```
$ doas pkg_add ripgrep
```
If you're a **NetBSD** user, then you can install ripgrep from
[pkgsrc](http://pkgsrc.se/textproc/ripgrep):
```
# pkgin install ripgrep
```
If you're a **Haiku x86_64** user, then you can install ripgrep from the
[official ports](https://github.com/haikuports/haikuports/tree/master/sys-apps/ripgrep):
```
$ pkgman install ripgrep
```
If you're a **Haiku x86_gcc2** user, then you can install ripgrep from the
same port as Haiku x86_64 using the x86 secondary architecture build:
```
$ pkgman install ripgrep_x86
```
If you're a **Rust programmer**, ripgrep can be installed with `cargo`.
* Note that the minimum supported version of Rust for ripgrep is **1.20**,
* Note that the minimum supported version of Rust for ripgrep is **1.34.0**,
although ripgrep may work with older versions.
* Note that the binary may be bigger than expected because it contains debug
symbols. This is intentional. To remove debug symbols and therefore reduce
@@ -263,15 +347,15 @@ If you're a **Rust programmer**, ripgrep can be installed with `cargo`.
$ cargo install ripgrep
```
ripgrep isn't currently in any other package repositories.
[I'd like to change that](https://github.com/BurntSushi/ripgrep/issues/10).
### Building
ripgrep is written in Rust, so you'll need to grab a
[Rust installation](https://www.rust-lang.org/) in order to compile it.
ripgrep compiles with Rust 1.20 (stable) or newer. Building is easy:
ripgrep compiles with Rust 1.34.0 (stable) or newer. In general, ripgrep tracks
the latest stable release of the Rust compiler.
To build ripgrep:
```
$ git clone https://github.com/BurntSushi/ripgrep
@@ -282,14 +366,50 @@ $ ./target/release/rg --version
```
If you have a Rust nightly compiler and a recent Intel CPU, then you can enable
optional SIMD acceleration like so:
additional optional SIMD acceleration like so:
```
RUSTFLAGS="-C target-cpu=native" cargo build --release --features 'simd-accel avx-accel'
RUSTFLAGS="-C target-cpu=native" cargo build --release --features 'simd-accel'
```
If your machine doesn't support AVX instructions, then simply remove
`avx-accel` from the features list. Similarly for SIMD.
The `simd-accel` feature enables SIMD support in certain ripgrep dependencies
(responsible for transcoding). They are not necessary to get SIMD optimizations
for search; those are enabled automatically. Hopefully, some day, the
`simd-accel` feature will similarly become unnecessary. **WARNING:** Currently,
enabling this option can increase compilation times dramatically.
Finally, optional PCRE2 support can be built with ripgrep by enabling the
`pcre2` feature:
```
$ cargo build --release --features 'pcre2'
```
(Tip: use `--features 'pcre2 simd-accel'` to also include compile time SIMD
optimizations, which will only work with a nightly compiler.)
Enabling the PCRE2 feature works with a stable Rust compiler and will
attempt to automatically find and link with your system's PCRE2 library via
`pkg-config`. If one doesn't exist, then ripgrep will build PCRE2 from source
using your system's C compiler and then statically link it into the final
executable. Static linking can be forced even when there is an available PCRE2
system library by either building ripgrep with the MUSL target or by setting
`PCRE2_SYS_STATIC=1`.
ripgrep can be built with the MUSL target on Linux by first installing the MUSL
library on your system (consult your friendly neighborhood package manager).
Then you just need to add MUSL support to your Rust toolchain and rebuild
ripgrep, which yields a fully static executable:
```
$ rustup target add x86_64-unknown-linux-musl
$ cargo build --release --target x86_64-unknown-linux-musl
```
Applying the `--features` flag from above works as expected. If you want to
build a static executable with MUSL and with PCRE2, then you will need to have
`musl-gcc` installed, which might be in a separate package from the actual
MUSL library, depending on your Linux distribution.
### Running tests
@@ -302,3 +422,11 @@ $ cargo test --all
```
from the repository root.
### Translations
The following is a list of known translations of ripgrep's documentation. These
are unofficially maintained and may not be up to date.
* [Chinese](https://github.com/chinanf-boy/ripgrep-zh#%E6%9B%B4%E6%96%B0-)

View File

@@ -1,85 +0,0 @@
# Inspired from https://github.com/habitat-sh/habitat/blob/master/appveyor.yml
cache:
- c:\cargo\registry
- c:\cargo\git
- c:\projects\ripgrep\target
init:
- mkdir c:\cargo
- mkdir c:\rustup
- SET PATH=c:\cargo\bin;%PATH%
clone_folder: c:\projects\ripgrep
environment:
CARGO_HOME: "c:\\cargo"
RUSTUP_HOME: "c:\\rustup"
CARGO_TARGET_DIR: "c:\\projects\\ripgrep\\target"
global:
PROJECT_NAME: ripgrep
RUST_BACKTRACE: full
matrix:
- TARGET: i686-pc-windows-gnu
CHANNEL: stable
- TARGET: i686-pc-windows-msvc
CHANNEL: stable
- TARGET: x86_64-pc-windows-gnu
CHANNEL: stable
- TARGET: x86_64-pc-windows-msvc
CHANNEL: stable
matrix:
fast_finish: true
# Install Rust and Cargo
# (Based on from https://github.com/rust-lang/libc/blob/master/appveyor.yml)
install:
- curl -sSf -o rustup-init.exe https://win.rustup.rs/
- rustup-init.exe -y --default-host %TARGET% --no-modify-path
- if defined MSYS2_BITS set PATH=%PATH%;C:\msys64\mingw%MSYS2_BITS%\bin
- rustc -V
- cargo -V
# ???
build: false
# Equivalent to Travis' `script` phase
# TODO modify this phase as you see fit
test_script:
- cargo test --verbose --all
before_deploy:
# Generate artifacts for release
# TODO(burntsushi): How can we enable SSSE3 on Windows?
- cargo build --release
- mkdir staging
- copy target\release\rg.exe staging
- ps: copy target\release\build\ripgrep-*\out\_rg.ps1 staging
- cd staging
# release zipfile will look like 'rust-everywhere-v1.2.3-x86_64-pc-windows-msvc'
- 7z a ../%PROJECT_NAME%-%APPVEYOR_REPO_TAG_NAME%-%TARGET%.zip *
- appveyor PushArtifact ../%PROJECT_NAME%-%APPVEYOR_REPO_TAG_NAME%-%TARGET%.zip
deploy:
description: 'Automatically deployed release'
# All the zipped artifacts will be deployed
artifact: /.*\.zip/
auth_token:
secure: vv4vBCEosGlyQjaEC1+kraP2P6O4CQSa+Tw50oHWFTGcmuXxaWS0/yEXbxsIRLpw
provider: GitHub
# deploy when a new tag is pushed and only on the stable channel
on:
# channel to use to produce the release artifacts
# NOTE make sure you only release *once* per target
# TODO you may want to pick a different channel
CHANNEL: stable
appveyor_repo_tag: true
branches:
only:
- /\d+\.\d+\.\d+/
- master
# - appveyor
# - /\d+\.\d+\.\d+/
# except:
# - master

View File

@@ -1,8 +1,3 @@
#[macro_use]
extern crate clap;
#[macro_use]
extern crate lazy_static;
use std::env;
use std::fs::{self, File};
use std::io::{self, Read, Write};
@@ -14,7 +9,7 @@ use clap::Shell;
use app::{RGArg, RGArgKind};
#[allow(dead_code)]
#[path = "src/app.rs"]
#[path = "crates/core/app.rs"]
mod app;
fn main() {
@@ -26,7 +21,8 @@ fn main() {
eprintln!(
"OUT_DIR environment variable not defined. \
Please file a bug: \
https://github.com/BurntSushi/ripgrep/issues/new");
https://github.com/BurntSushi/ripgrep/issues/new"
);
process::exit(1);
}
};
@@ -58,8 +54,13 @@ fn git_revision_hash() -> Option<String> {
let result = process::Command::new("git")
.args(&["rev-parse", "--short=10", "HEAD"])
.output();
result.ok().map(|output| {
String::from_utf8_lossy(&output.stdout).trim().to_string()
result.ok().and_then(|output| {
let v = String::from_utf8_lossy(&output.stdout).trim().to_string();
if v.is_empty() {
None
} else {
Some(v)
}
})
}
@@ -85,13 +86,15 @@ fn generate_man_page<P: AsRef<Path>>(outdir: P) -> io::Result<()> {
let githash = git_revision_hash();
let githash = githash.as_ref().map(|x| &**x);
tpl = tpl.replace("{VERSION}", &app::long_version(githash));
tpl = tpl.replace("{VERSION}", &app::long_version(githash, false));
File::create(&txt_path)?.write_all(tpl.as_bytes())?;
let result = process::Command::new("a2x")
.arg("--no-xmllint")
.arg("--doctype").arg("manpage")
.arg("--format").arg("manpage")
.arg("--doctype")
.arg("manpage")
.arg("--format")
.arg("manpage")
.arg(&txt_path)
.spawn()?
.wait()?;
@@ -114,7 +117,7 @@ fn formatted_options() -> io::Result<String> {
// ripgrep only has two positional arguments, and probably will only
// ever have two positional arguments, so we just hardcode them into
// the template.
if let app::RGArgKind::Positional{..} = arg.kind {
if let app::RGArgKind::Positional { .. } = arg.kind {
continue;
}
formatted.push(formatted_arg(&arg)?);
@@ -124,7 +127,9 @@ fn formatted_options() -> io::Result<String> {
fn formatted_arg(arg: &RGArg) -> io::Result<String> {
match arg.kind {
RGArgKind::Positional{..} => panic!("unexpected positional argument"),
RGArgKind::Positional { .. } => {
panic!("unexpected positional argument")
}
RGArgKind::Switch { long, short, multiple } => {
let mut out = vec![];
@@ -163,7 +168,13 @@ fn formatted_arg(arg: &RGArg) -> io::Result<String> {
}
fn formatted_doc_txt(arg: &RGArg) -> io::Result<String> {
let paragraphs: Vec<&str> = arg.doc_long.split("\n\n").collect();
let paragraphs: Vec<String> = arg
.doc_long
.replace("{", "&#123;")
.replace("}", r"&#125;")
.split("\n\n")
.map(|s| s.to_string())
.collect();
if paragraphs.is_empty() {
return Err(ioerr(format!("missing docs for --{}", arg.name)));
}

View File

@@ -1,60 +0,0 @@
#!/bin/bash
# package the build artifacts
set -ex
. "$(dirname $0)/utils.sh"
# Generate artifacts for release
mk_artifacts() {
if is_ssse3_target; then
RUSTFLAGS="-C target-feature=+ssse3" \
cargo build --target "$TARGET" --release --features simd-accel
else
cargo build --target "$TARGET" --release
fi
}
mk_tarball() {
# When cross-compiling, use the right `strip` tool on the binary.
local gcc_prefix="$(gcc_prefix)"
# Create a temporary dir that contains our staging area.
# $tmpdir/$name is what eventually ends up as the deployed archive.
local tmpdir="$(mktemp -d)"
local name="${PROJECT_NAME}-${TRAVIS_TAG}-${TARGET}"
local staging="$tmpdir/$name"
mkdir -p "$staging"/{complete,doc}
# The deployment directory is where the final archive will reside.
# This path is known by the .travis.yml configuration.
local out_dir="$(pwd)/deployment"
mkdir -p "$out_dir"
# Find the correct (most recent) Cargo "out" directory. The out directory
# contains shell completion files and the man page.
local cargo_out_dir="$(cargo_out_dir "target/$TARGET")"
# Copy the ripgrep binary and strip it.
cp "target/$TARGET/release/rg" "$staging/rg"
"${gcc_prefix}strip" "$staging/rg"
# Copy the licenses and README.
cp {README.md,UNLICENSE,COPYING,LICENSE-MIT} "$staging/"
# Copy documentation and man page.
cp {CHANGELOG.md,FAQ.md,GUIDE.md} "$staging/doc/"
if command -V a2x 2>&1 > /dev/null; then
# The man page should only exist if we have asciidoc installed.
cp "$cargo_out_dir/rg.1" "$staging/doc/"
fi
# Copy shell completion files.
cp "$cargo_out_dir"/{rg.bash,rg.fish,_rg.ps1} "$staging/complete/"
cp complete/_rg "$staging/complete/"
(cd "$tmpdir" && tar czf "$out_dir/$name.tar.gz" "$name")
rm -rf "$tmpdir"
}
main() {
mk_artifacts
mk_tarball
}
main

36
ci/build-deb Executable file
View File

@@ -0,0 +1,36 @@
#!/bin/bash
set -e
D="$(cd "$(dirname "${BASH_SOURCE[0]}")" >/dev/null 2>&1 && pwd)"
# This script builds a binary dpkg for Debian based distros. It does not
# currently run in CI, and is instead run manually and the resulting dpkg is
# uploaded to GitHub via the web UI.
#
# Note that this requires 'cargo deb', which can be installed with
# 'cargo install cargo-deb'.
#
# This should be run from the root of the ripgrep repo.
if ! command -V cargo-deb > /dev/null 2>&1; then
echo "cargo-deb command missing" >&2
exit 1
fi
# 'cargo deb' does not seem to provide a way to specify an asset that is
# created at build time, such as ripgrep's man page. To work around this,
# we force a debug build, copy out the man page (and shell completions)
# produced from that build, put it into a predictable location and then build
# the deb, which knows where to look.
DEPLOY_DIR=deployment/deb
OUT_DIR="$("$D"/cargo-out-dir target/debug/)"
mkdir -p "$DEPLOY_DIR"
cargo build
# Copy man page and shell completions.
cp "$OUT_DIR"/{rg.1,rg.bash,rg.fish,_rg} "$DEPLOY_DIR/"
# Since we're distributing the dpkg, we don't know whether the user will have
# PCRE2 installed, so just do a static build.
PCRE2_SYS_STATIC=1 cargo deb

19
ci/cargo-out-dir Executable file
View File

@@ -0,0 +1,19 @@
#!/bin/bash
# Finds Cargo's `OUT_DIR` directory from the most recent build.
#
# This requires one parameter corresponding to the target directory
# to search for the build output.
if [ $# != 1 ]; then
echo "Usage: $(basename "$0") <target-dir>" >&2
exit 2
fi
# This works by finding the most recent stamp file, which is produced by
# every ripgrep build.
target_dir="$1"
find "$target_dir" -name ripgrep-stamp -print0 \
| xargs -0 ls -t \
| head -n1 \
| xargs dirname

24
ci/docker/README.md Normal file
View File

@@ -0,0 +1,24 @@
These are Docker images used for cross compilation in CI builds (or locally)
via the [Cross](https://github.com/rust-embedded/cross) tool.
The Cross tool actually provides its own Docker images, and all Docker images
in this directory are derived from one of them. We provide our own in order
to customize the environment. For example, we need to install some things like
`asciidoc` in order to generate man pages. We also install compression tools
like `xz` so that tests for the `-z/--search-zip` flag are run.
If you make a change to a Docker image, then you can re-build it. `cd` into the
directory containing the `Dockerfile` and run:
$ cd x86_64-unknown-linux-musl
$ ./build
At this point, subsequent uses of `cross` will now use your built image since
Docker prefers local images over remote images. In order to make these changes
stick, they need to be pushed to Docker Hub:
$ docker push burntsushi/cross:x86_64-unknown-linux-musl
Of course, only I (BurntSushi) can push to that location. To make `cross` use
a different location, then edit `Cross.toml` in the root of this repo to use
a different image name for the desired target.

View File

@@ -0,0 +1,4 @@
FROM rustembedded/cross:arm-unknown-linux-gnueabihf
COPY stage/ubuntu-install-packages /
RUN /ubuntu-install-packages

View File

@@ -0,0 +1,5 @@
#!/bin/sh
mkdir -p stage
cp ../../ubuntu-install-packages ./stage/
docker build -t burntsushi/cross:arm-unknown-linux-gnueabihf .

View File

@@ -0,0 +1,4 @@
FROM rustembedded/cross:i686-unknown-linux-gnu
COPY stage/ubuntu-install-packages /
RUN /ubuntu-install-packages

View File

@@ -0,0 +1,5 @@
#!/bin/sh
mkdir -p stage
cp ../../ubuntu-install-packages ./stage/
docker build -t burntsushi/cross:i686-unknown-linux-gnu .

View File

@@ -0,0 +1,4 @@
FROM rustembedded/cross:mips64-unknown-linux-gnuabi64
COPY stage/ubuntu-install-packages /
RUN /ubuntu-install-packages

View File

@@ -0,0 +1,5 @@
#!/bin/sh
mkdir -p stage
cp ../../ubuntu-install-packages ./stage/
docker build -t burntsushi/cross:mips64-unknown-linux-gnuabi64 .

View File

@@ -0,0 +1,4 @@
FROM rustembedded/cross:x86_64-unknown-linux-musl
COPY stage/ubuntu-install-packages /
RUN /ubuntu-install-packages

View File

@@ -0,0 +1,5 @@
#!/bin/sh
mkdir -p stage
cp ../../ubuntu-install-packages ./stage/
docker build -t burntsushi/cross:x86_64-unknown-linux-musl .

View File

@@ -1,61 +0,0 @@
#!/bin/bash
# install stuff needed for the `script` phase
# Where rustup gets installed.
export PATH="$PATH:$HOME/.cargo/bin"
set -ex
. "$(dirname $0)/utils.sh"
install_rustup() {
curl https://sh.rustup.rs -sSf \
| sh -s -- -y --default-toolchain="$TRAVIS_RUST_VERSION"
rustc -V
cargo -V
}
install_targets() {
if [ $(host) != "$TARGET" ]; then
rustup target add $TARGET
fi
}
install_osx_dependencies() {
if ! is_osx; then
return
fi
brew install asciidoc
}
configure_cargo() {
local prefix=$(gcc_prefix)
if [ -n "${prefix}" ]; then
local gcc_suffix=
if [ -n "$GCC_VERSION" ]; then
gcc_suffix="-$GCC_VERSION"
fi
local gcc="${prefix}gcc${gcc_suffix}"
# information about the cross compiler
"${gcc}" -v
# tell cargo which linker to use for cross compilation
mkdir -p .cargo
cat >>.cargo/config <<EOF
[target.$TARGET]
linker = "${gcc}"
EOF
fi
}
main() {
install_osx_dependencies
install_rustup
install_targets
configure_cargo
}
main

3
ci/macos-install-packages Executable file
View File

@@ -0,0 +1,3 @@
#!/bin/sh
brew install asciidoc docbook-xsl

View File

@@ -1,48 +0,0 @@
#!/bin/bash
# build, test and generate docs in this phase
set -ex
. "$(dirname $0)/utils.sh"
main() {
# Test a normal debug build.
cargo build --target "$TARGET" --verbose --all
# Show the output of the most recent build.rs stderr.
set +x
stderr="$(find "target/$TARGET/debug" -name stderr -print0 | xargs -0 ls -t | head -n1)"
if [ -s "$stderr" ]; then
echo "===== $stderr ====="
cat "$stderr"
echo "====="
fi
set -x
# sanity check the file type
file target/"$TARGET"/debug/rg
# Apparently tests don't work on arm, so just bail now. I guess we provide
# ARM releases on a best effort basis?
if is_arm; then
return 0
fi
# Test that zsh completions are in sync with ripgrep's actual args.
"$(dirname "${0}")/test_complete.sh"
# Check that we've generated man page and other shell completions.
outdir="$(cargo_out_dir "target/$TARGET/debug")"
file "$outdir/rg.bash"
file "$outdir/rg.fish"
file "$outdir/_rg.ps1"
# N.B. man page isn't generated on ARM cross-compile, but we gave up
# long before this anyway.
file "$outdir/rg.1"
# Run tests for ripgrep and all sub-crates.
cargo test --target "$TARGET" --verbose --all
}
main

View File

@@ -1,70 +1,73 @@
#!/usr/bin/env zsh
emulate zsh -o extended_glob -o no_function_argzero -o no_unset
##
# Compares options in `rg --help` output to options in zsh completion function
emulate -R zsh
setopt extended_glob
setopt no_function_argzero
setopt no_unset
get_comp_args() {
# Technically there are many options that the completion system sets that
# our function may rely on, but we'll trust that we've got it mostly right
setopt local_options unset
# Our completion function recognises a special variable which tells it to
# dump the _arguments specs and then just return. But do this in a sub-shell
# anyway to avoid any weirdness
( _RG_COMPLETE_LIST_ARGS=1 source $1 )
return $?
}
main() {
local diff
local rg="${${0:a}:h}/../target/${TARGET:-}/release/rg"
local _rg="${${0:a}:h}/../complete/_rg"
local rg="${0:a:h}/../${TARGET_DIR:-target}/release/rg"
local _rg="${0:a:h}/../complete/_rg"
local -a help_args comp_args
[[ -e $rg ]] || rg=${rg/%\/release\/rg/\/debug\/rg}
rg=${rg:a}
_rg=${_rg:a}
[[ -e $rg ]] || {
printf >&2 'File not found: %s\n' $rg
print -r >&2 "File not found: $rg"
return 1
}
[[ -e $_rg ]] || {
printf >&2 'File not found: %s\n' $_rg
print -r >&2 "File not found: $_rg"
return 1
}
printf 'Comparing options:\n-%s\n+%s\n' $rg $_rg
print -rl - 'Comparing options:' "-$rg" "+$_rg"
# 'Parse' options out of the `--help` output. To prevent false positives we
# only look at lines where the first non-white-space character is `-`
# only look at lines where the first non-white-space character is `-`, or
# where a long option starting with certain letters (see `_rg`) is found.
# Occasionally we may have to handle some manually, however
help_args=( ${(f)"$(
$rg --help |
$rg -- '^\s*-' |
$rg -io -- '[\t ,](-[a-z0-9]|--[a-z0-9-]+)\b' |
tr -d '\t ,' |
$rg -i -- '^\s+--?[a-z0-9]|--[a-z]' |
$rg -ior '$1' -- $'[\t /\"\'`.,](-[a-z0-9]|--[a-z0-9-]+)\\b' |
$rg -v -- --print0 | # False positives
sort -u
)"} )
# 'Parse' options out of the completion function
comp_args=( ${(f)"$( get_comp_args $_rg )"} )
# Note that we currently exclude hidden (!...) options; matching these
# properly against the `--help` output could be irritating
comp_args=( ${comp_args#\(*\)} ) # Strip excluded options
comp_args=( ${comp_args#\*} ) # Strip repetition indicator
comp_args=( ${comp_args%%-[:[]*} ) # Strip everything after -optname-
comp_args=( ${comp_args%%[:+=[]*} ) # Strip everything after other optspecs
comp_args=( ${comp_args##[^-]*} ) # Remove non-options
# This probably isn't necessary, but we should ensure the same order
comp_args=( ${(f)"$( printf '%s\n' $comp_args | sort -u )"} )
comp_args=( ${(f)"$( print -rl - $comp_args | sort -u )"} )
(( $#help_args )) || {
printf >&2 'Failed to get help_args\n'
print -r >&2 'Failed to get help_args'
return 1
}
(( $#comp_args )) || {
printf >&2 'Failed to get comp_args\n'
print -r >&2 'Failed to get comp_args'
return 1
}
@@ -73,12 +76,12 @@ main() {
diff -U2 \
--label '`rg --help`' \
--label '`_rg`' \
=( printf '%s\n' $help_args ) =( printf '%s\n' $comp_args )
=( print -rl - $help_args ) =( print -rl - $comp_args )
else
diff -U2 \
-L '`rg --help`' \
-L '`_rg`' \
=( printf '%s\n' $help_args ) =( printf '%s\n' $comp_args )
=( print -rl - $help_args ) =( print -rl - $comp_args )
fi
)"
@@ -91,4 +94,4 @@ main() {
return 0
}
main "${@}"
main "$@"

6
ci/ubuntu-install-packages Executable file
View File

@@ -0,0 +1,6 @@
#!/bin/sh
sudo apt-get update
sudo apt-get install -y --no-install-recommends \
libxslt1-dev asciidoc docbook-xsl xsltproc libxml2-utils \
zsh xz-utils liblz4-tool musl-tools

View File

@@ -55,10 +55,10 @@ gcc_prefix() {
esac
}
is_ssse3_target() {
case "$(architecture)" in
amd64) return 0 ;;
*) return 1 ;;
is_musl() {
case "$TARGET" in
*-musl) return 0 ;;
*) return 1 ;;
esac
}
@@ -69,6 +69,13 @@ is_x86() {
esac
}
is_x86_64() {
case "$(architecture)" in
amd64) return 0 ;;
*) return 1 ;;
esac
}
is_arm() {
case "$(architecture)" in
armhf) return 0 ;;
@@ -89,3 +96,14 @@ is_osx() {
*) return 1 ;;
esac
}
builder() {
if is_musl && is_x86_64; then
# cargo install cross
# To work around https://github.com/rust-embedded/cross/issues/357
cargo install --git https://github.com/rust-embedded/cross --force
echo "cross"
else
echo "cargo"
fi
}

View File

@@ -3,163 +3,390 @@
##
# zsh completion function for ripgrep
#
# Run ci/test_complete.sh after building to ensure that the options supported by
# Run ci/test-complete after building to ensure that the options supported by
# this function stay in synch with the `rg` binary.
#
# @see https://github.com/zsh-users/zsh/blob/master/Etc/completion-style-guide
# For convenience, a completion reference guide is included at the bottom of
# this file.
#
# Based on code from the zsh-users project — see copyright notice below.
# Originally based on code from the zsh-users project — see copyright notice
# below.
_rg() {
local state_descr ret curcontext="${curcontext:-}"
local -a context line state
local -A opt_args val_args
local -a rg_args
local curcontext=$curcontext no='!' descr ret=1
local -a context line state state_descr args tmp suf
local -A opt_args
# Sort by long option name to match `rg --help`
rg_args=(
'(-A -C --after-context --context)'{-A+,--after-context=}'[specify lines to show after each match]:number of lines'
'(-B -C --before-context --context)'{-B+,--before-context=}'[specify lines to show before each match]:number of lines'
'(-i -s -S --ignore-case --case-sensitive --smart-case)'{-s,--case-sensitive}'[search case-sensitively]'
'--color=[specify when to use colors in output]:when:( never auto always ansi )'
'*--colors=[specify color settings and styles]: :->colorspec'
'--column[show column numbers]'
'(-A -B -C --after-context --before-context --context)'{-C+,--context=}'[specify lines to show before and after each match]:number of lines'
'--context-separator=[specify string used to separate non-continuous context lines in output]:separator'
'(-c --count --passthrough --passthru)'{-c,--count}'[only show count of matches for each file]'
'--debug[show debug messages]'
'--dfa-size-limit=[specify upper size limit of generated DFA]:DFA size'
'(-E --encoding)'{-E+,--encoding=}'[specify text encoding of files to search]: :_rg_encodings'
'*'{-f+,--file=}'[specify file containing patterns to search for]:file:_files'
"(1)--files[show each file that would be searched (but don't search)]"
'(-l --files-with-matches --files-without-match)'{-l,--files-with-matches}'[only show names of files with matches]'
'(-l --files-with-matches --files-without-match)--files-without-match[only show names of files without matches]'
'(-F --fixed-strings)'{-F,--fixed-strings}'[treat pattern as literal string instead of regular expression]'
'(-L --follow)'{-L,--follow}'[follow symlinks]'
'*'{-g+,--glob=}'[include or exclude files for searching that match the specified glob]:glob'
'(: -)'{-h,--help}'[display help information]'
'(-p --no-heading --pretty --vimgrep)--heading[show matches grouped by file name]'
# ripgrep has many options which negate the effect of a more common one — for
# example, `--no-column` to negate `--column`, and `--messages` to negate
# `--no-messages`. There are so many of these, and they're so infrequently
# used, that some users will probably find it irritating if they're completed
# indiscriminately, so let's not do that unless either the current prefix
# matches one of those negation options or the user has the `complete-all`
# style set. Note that this prefix check has to be updated manually to account
# for all of the potential negation options listed below!
if
# We also want to list all of these options during testing
[[ $_RG_COMPLETE_LIST_ARGS == (1|t*|y*) ]] ||
# (--[imnp]* => --ignore*, --messages, --no-*, --pcre2-unicode)
[[ $PREFIX$SUFFIX == --[imnp]* ]] ||
zstyle -t ":complete:$curcontext:*" complete-all
then
no=
fi
# We make heavy use of argument groups here to prevent the option specs from
# growing unwieldy. These aren't supported in zsh <5.4, though, so we'll strip
# them out below if necessary. This makes the exclusions inaccurate on those
# older versions, but oh well — it's not that big a deal
args=(
+ '(exclusive)' # Misc. fully exclusive options
'(: * -)'{-h,--help}'[display help information]'
'(: * -)'{-V,--version}'[display version information]'
'(: * -)'--pcre2-version'[print the version of PCRE2 used by ripgrep, if available]'
+ '(buffered)' # buffering options
'--line-buffered[force line buffering]'
$no"--no-line-buffered[don't force line buffering]"
'--block-buffered[force block buffering]'
$no"--no-block-buffered[don't force block buffering]"
+ '(case)' # Case-sensitivity options
{-i,--ignore-case}'[search case-insensitively]'
{-s,--case-sensitive}'[search case-sensitively]'
{-S,--smart-case}'[search case-insensitively if pattern is all lowercase]'
+ '(context-a)' # Context (after) options
'(context-c)'{-A+,--after-context=}'[specify lines to show after each match]:number of lines'
+ '(context-b)' # Context (before) options
'(context-c)'{-B+,--before-context=}'[specify lines to show before each match]:number of lines'
+ '(context-c)' # Context (combined) options
'(context-a context-b)'{-C+,--context=}'[specify lines to show before and after each match]:number of lines'
+ '(column)' # Column options
'--column[show column numbers for matches]'
$no"--no-column[don't show column numbers for matches]"
+ '(count)' # Counting options
{-c,--count}'[only show count of matching lines for each file]'
'--count-matches[only show count of individual matches for each file]'
'--include-zero[include files with zero matches in summary]'
+ '(encoding)' # Encoding options
{-E+,--encoding=}'[specify text encoding of files to search]: :_rg_encodings'
$no'--no-encoding[use default text encoding]'
+ '(engine)' # Engine choice options
'--engine=[select which regex engine to use]:when:((
default\:"use default engine"
pcre2\:"identical to --pcre2"
auto\:"identical to --auto-hybrid-regex"
))'
+ file # File-input options
'(1)*'{-f+,--file=}'[specify file containing patterns to search for]: :_files'
+ '(file-match)' # Files with/without match options
'(stats)'{-l,--files-with-matches}'[only show names of files with matches]'
'(stats)--files-without-match[only show names of files without matches]'
+ '(file-name)' # File-name options
{-H,--with-filename}'[show file name for matches]'
{-I,--no-filename}"[don't show file name for matches]"
+ '(file-system)' # File system options
"--one-file-system[don't descend into directories on other file systems]"
$no'--no-one-file-system[descend into directories on other file systems]'
+ '(fixed)' # Fixed-string options
{-F,--fixed-strings}'[treat pattern as literal string instead of regular expression]'
$no"--no-fixed-strings[don't treat pattern as literal string]"
+ '(follow)' # Symlink-following options
{-L,--follow}'[follow symlinks]'
$no"--no-follow[don't follow symlinks]"
+ glob # File-glob options
'*'{-g+,--glob=}'[include/exclude files matching specified glob]:glob'
'*--iglob=[include/exclude files matching specified case-insensitive glob]:glob'
+ '(glob-case-insensitive)' # File-glob case sensitivity options
'--glob-case-insensitive[treat -g/--glob patterns case insensitively]'
$no'--no-glob-case-insensitive[treat -g/--glob patterns case sensitively]'
+ '(heading)' # Heading options
'(pretty-vimgrep)--heading[show matches grouped by file name]'
"(pretty-vimgrep)--no-heading[don't show matches grouped by file name]"
+ '(hidden)' # Hidden-file options
'--hidden[search hidden files and directories]'
'*--iglob=[include or exclude files for searching that match the specified case-insensitive glob]:glob'
'(-i -s -S --case-sensitive --ignore-case --smart-case)'{-i,--ignore-case}'[search case-insensitively]'
'--ignore-file=[specify additional ignore file]:file:_files'
'(-v --invert-match)'{-v,--invert-match}'[invert matching]'
'(-n -N --line-number --no-line-number)'{-n,--line-number}'[show line numbers]'
'(-N --no-line-number)--line-number-width=[specify width of displayed line number]:number of columns'
'(-w -x --line-regexp --word-regexp)'{-x,--line-regexp}'[only show matches surrounded by line boundaries]'
'(-M --max-columns)'{-M+,--max-columns=}'[specify max length of lines to print]:number of bytes'
'(-m --max-count)'{-m+,--max-count=}'[specify max number of matches per file]:number of matches'
'--max-filesize=[specify size above which files should be ignored]:file size'
'--maxdepth=[specify max number of directories to descend]:number of directories'
'(--mmap --no-mmap)--mmap[search using memory maps when possible]'
'(-H --with-filename --no-filename)--no-filename[suppress all file names]'
"(-p --heading --pretty --vimgrep)--no-heading[don't group matches by file name]"
"--no-config[don't load configuration files]"
"(--no-ignore-parent)--no-ignore[don't respect ignore files]"
$no"--no-hidden[don't search hidden files and directories]"
+ '(hybrid)' # hybrid regex options
'--auto-hybrid-regex[dynamically use PCRE2 if necessary]'
$no"--no-auto-hybrid-regex[don't dynamically use PCRE2 if necessary]"
+ '(ignore)' # Ignore-file options
"(--no-ignore-global --no-ignore-parent --no-ignore-vcs --no-ignore-dot)--no-ignore[don't respect ignore files]"
$no'(--ignore-global --ignore-parent --ignore-vcs --ignore-dot)--ignore[respect ignore files]'
+ '(ignore-file-case-insensitive)' # Ignore-file case sensitivity options
'--ignore-file-case-insensitive[process ignore files case insensitively]'
$no'--no-ignore-file-case-insensitive[process ignore files case sensitively]'
+ '(ignore-exclude)' # Local exclude (ignore)-file options
"--no-ignore-exclude[don't respect local exclude (ignore) files]"
$no'--ignore-exclude[respect local exclude (ignore) files]'
+ '(ignore-global)' # Global ignore-file options
"--no-ignore-global[don't respect global ignore files]"
$no'--ignore-global[respect global ignore files]'
+ '(ignore-parent)' # Parent ignore-file options
"--no-ignore-parent[don't respect ignore files in parent directories]"
$no'--ignore-parent[respect ignore files in parent directories]'
+ '(ignore-vcs)' # VCS ignore-file options
"--no-ignore-vcs[don't respect version control ignore files]"
'(-n -N --line-number --no-line-number)'{-N,--no-line-number}'[suppress line numbers]'
'--no-messages[suppress all error messages]'
"(--mmap --no-mmap)--no-mmap[don't search using memory maps]"
'(-0 --null)'{-0,--null}'[print NUL byte after file names]'
'(-o -r --only-matching --passthrough --passthru --replace)'{-o,--only-matching}'[show only matching part of each line]'
'(-c -o -r --count --only-matching --passthrough --replace)--passthru[show both matching and non-matching lines]'
'!(-c -o -r --count --only-matching --passthru --replace)--passthrough'
'--path-separator=[specify path separator to use when printing file names]:separator'
'(-p --heading --no-heading --pretty --vimgrep)'{-p,--pretty}'[alias for --color=always --heading -n]'
'(-q --quiet)'{-q,--quiet}'[suppress normal output]'
'--regex-size-limit=[specify upper size limit of compiled regex]:regex size'
'(1 -f --file)*'{-e+,--regexp=}'[specify pattern]:pattern'
'(-c -o -r --count --only-matching --passthrough --passthru --replace)'{-r+,--replace=}'[specify string used to replace matches]:replace string'
'(-i -s -S --ignore-case --case-sensitive --smart-case)'{-S,--smart-case}'[search case-insensitively if the pattern is all lowercase]'
'(-j --threads)--sort-files[sort results by file path (disables parallelism)]'
'(-a --text)'{-a,--text}'[search binary files as if they were text]'
'(-j --sort-files --threads)'{-j+,--threads=}'[specify approximate number of threads to use]:number of threads'
$no'--ignore-vcs[respect version control ignore files]'
+ '(require-git)' # git specific settings
"--no-require-git[don't require git repository to respect gitignore rules]"
$no'--require-git[require git repository to respect gitignore rules]'
+ '(ignore-dot)' # .ignore options
"--no-ignore-dot[don't respect .ignore files]"
$no'--ignore-dot[respect .ignore files]'
+ '(ignore-files)' # custom global ignore file options
"--no-ignore-files[don't respect --ignore-file flags]"
$no'--ignore-files[respect --ignore-file files]'
+ '(json)' # JSON options
'--json[output results in JSON Lines format]'
$no"--no-json[don't output results in JSON Lines format]"
+ '(line-number)' # Line-number options
{-n,--line-number}'[show line numbers for matches]'
{-N,--no-line-number}"[don't show line numbers for matches]"
+ '(line-terminator)' # Line-terminator options
'--crlf[use CRLF as line terminator]'
$no"--no-crlf[don't use CRLF as line terminator]"
'(text)--null-data[use NUL as line terminator]'
+ '(max-columns-preview)' # max column preview options
'--max-columns-preview[show preview for long lines (with -M)]'
$no"--no-max-columns-preview[don't show preview for long lines (with -M)]"
+ '(max-depth)' # Directory-depth options
'--max-depth=[specify max number of directories to descend]:number of directories'
'!--maxdepth=:number of directories'
+ '(messages)' # Error-message options
'(--no-ignore-messages)--no-messages[suppress some error messages]'
$no"--messages[don't suppress error messages affected by --no-messages]"
+ '(messages-ignore)' # Ignore-error message options
"--no-ignore-messages[don't show ignore-file parse error messages]"
$no'--ignore-messages[show ignore-file parse error messages]'
+ '(mmap)' # mmap options
'--mmap[search using memory maps when possible]'
"--no-mmap[don't search using memory maps]"
+ '(multiline)' # Multiline options
{-U,--multiline}'[permit matching across multiple lines]'
$no'(multiline-dotall)--no-multiline[restrict matches to at most one line each]'
+ '(multiline-dotall)' # Multiline DOTALL options
'(--no-multiline)--multiline-dotall[allow "." to match newline (with -U)]'
$no"(--no-multiline)--no-multiline-dotall[don't allow \".\" to match newline (with -U)]"
+ '(only)' # Only-match options
{-o,--only-matching}'[show only matching part of each line]'
+ '(passthru)' # Pass-through options
'(--vimgrep)--passthru[show both matching and non-matching lines]'
'!(--vimgrep)--passthrough'
+ '(pcre2)' # PCRE2 options
{-P,--pcre2}'[enable matching with PCRE2]'
$no'(pcre2-unicode)--no-pcre2[disable matching with PCRE2]'
+ '(pcre2-unicode)' # PCRE2 Unicode options
$no'(--no-pcre2 --no-pcre2-unicode)--pcre2-unicode[enable PCRE2 Unicode mode (with -P)]'
'(--no-pcre2 --pcre2-unicode)--no-pcre2-unicode[disable PCRE2 Unicode mode (with -P)]'
+ '(pre)' # Preprocessing options
'(-z --search-zip)--pre=[specify preprocessor utility]:preprocessor utility:_command_names -e'
$no'--no-pre[disable preprocessor utility]'
+ pre-glob # Preprocessing glob options
'*--pre-glob[include/exclude files for preprocessing with --pre]'
+ '(pretty-vimgrep)' # Pretty/vimgrep display options
'(heading)'{-p,--pretty}'[alias for --color=always --heading -n]'
'(heading passthru)--vimgrep[show results in vim-compatible format]'
+ regexp # Explicit pattern options
'(1 file)*'{-e+,--regexp=}'[specify pattern]:pattern'
+ '(replace)' # Replacement options
{-r+,--replace=}'[specify string used to replace matches]:replace string'
+ '(sort)' # File-sorting options
'(threads)--sort=[sort results in ascending order (disables parallelism)]:sort method:((
none\:"no sorting"
path\:"sort by file path"
modified\:"sort by last modified time"
accessed\:"sort by last accessed time"
created\:"sort by creation time"
))'
'(threads)--sortr=[sort results in descending order (disables parallelism)]:sort method:((
none\:"no sorting"
path\:"sort by file path"
modified\:"sort by last modified time"
accessed\:"sort by last accessed time"
created\:"sort by creation time"
))'
'!(threads)--sort-files[sort results by file path (disables parallelism)]'
+ '(stats)' # Statistics options
'(--files file-match)--stats[show search statistics]'
$no"--no-stats[don't show search statistics]"
+ '(text)' # Binary-search options
{-a,--text}'[search binary files as if they were text]'
"--binary[search binary files, don't print binary data]"
$no"--no-binary[don't search binary files]"
$no"(--null-data)--no-text[don't search binary files as if they were text]"
+ '(threads)' # Thread-count options
'(sort)'{-j+,--threads=}'[specify approximate number of threads to use]:number of threads'
+ '(trim)' # Trim options
'--trim[trim any ASCII whitespace prefix from each line]'
$no"--no-trim[don't trim ASCII whitespace prefix from each line]"
+ type # Type options
'*'{-t+,--type=}'[only search files matching specified type]: :_rg_types'
'*--type-add=[add new glob for file type]: :->typespec'
'*--type-add=[add new glob for specified file type]: :->typespec'
'*--type-clear=[clear globs previously defined for specified file type]: :_rg_types'
# This should actually be exclusive with everything but other type options
'(:)--type-list[show all supported file types and their associated globs]'
'*'{-T+,--type-not=}"[don't search files matching specified type]: :_rg_types"
'(: *)--type-list[show all supported file types and their associated globs]'
'*'{-T+,--type-not=}"[don't search files matching specified file type]: :_rg_types"
+ '(word-line)' # Whole-word/line match options
{-w,--word-regexp}'[only show matches surrounded by word boundaries]'
{-x,--line-regexp}'[only show matches surrounded by line boundaries]'
+ '(unicode)' # Unicode options
$no'--unicode[enable Unicode mode]'
'--no-unicode[disable Unicode mode]'
+ '(zip)' # Compression options
'(--pre)'{-z,--search-zip}'[search in compressed files]'
$no"--no-search-zip[don't search in compressed files]"
+ misc # Other options — no need to separate these at the moment
'(-b --byte-offset)'{-b,--byte-offset}'[show 0-based byte offset for each matching line]'
'--color=[specify when to use colors in output]:when:((
never\:"never use colors"
auto\:"use colors or not based on stdout, TERM, etc."
always\:"always use colors"
ansi\:"always use ANSI colors (even on Windows)"
))'
'*--colors=[specify color and style settings]: :->colorspec'
'--context-separator=[specify string used to separate non-continuous context lines in output]:separator'
$no"--no-context-separator[don't print context separators]"
'--debug[show debug messages]'
'--trace[show more verbose debug messages]'
'--dfa-size-limit=[specify upper size limit of generated DFA]:DFA size (bytes)'
"(1 stats)--files[show each file that would be searched (but don't search)]"
'*--ignore-file=[specify additional ignore file]:ignore file:_files'
'(-v --invert-match)'{-v,--invert-match}'[invert matching]'
'(-M --max-columns)'{-M+,--max-columns=}'[specify max length of lines to print]:number of bytes'
'(-m --max-count)'{-m+,--max-count=}'[specify max number of matches per file]:number of matches'
'--max-filesize=[specify size above which files should be ignored]:file size (bytes)'
"--no-config[don't load configuration files]"
'(-0 --null)'{-0,--null}'[print NUL byte after file names]'
'--path-separator=[specify path separator to use when printing file names]:separator'
'(-q --quiet)'{-q,--quiet}'[suppress normal output]'
'--regex-size-limit=[specify upper size limit of compiled regex]:regex size (bytes)'
'*'{-u,--unrestricted}'[reduce level of "smart" searching]'
'(: -)'{-V,--version}'[display version information]'
'(-p --heading --no-heading --pretty)--vimgrep[show results in vim-compatible format]'
'(-H --no-filename --with-filename)'{-H,--with-filename}'[display the file name for matches]'
'(-w -x --line-regexp --word-regexp)'{-w,--word-regexp}'[only show matches surrounded by word boundaries]'
'(-e -f --file --files --regexp --type-list)1: :_rg_pattern'
'(--type-list)*:file:_files'
'(-z --search-zip)'{-z,--search-zip}'[search in compressed files]'
+ operand # Operands
'(--files --type-list file regexp)1: :_guard "^-*" pattern'
'(--type-list)*: :_files'
)
[[ ${_RG_COMPLETE_LIST_ARGS:-} == (1|t*|y*) ]] && {
printf '%s\n' "${rg_args[@]}"
# This is used with test-complete to verify that there are no options
# listed in the help output that aren't also defined here
[[ $_RG_COMPLETE_LIST_ARGS == (1|t*|y*) ]] && {
print -rl - $args
return 0
}
_arguments -s -S : "${rg_args[@]}" && return 0
# Strip out argument groups where unsupported (see above)
[[ $ZSH_VERSION == (4|5.<0-3>)(.*)# ]] &&
args=( ${(@)args:#(#i)(+|[a-z0-9][a-z0-9_-]#|\([a-z0-9][a-z0-9_-]#\))} )
while (( $#state )); do
case "${state[1]}" in
colorspec)
# @todo I don't like this because it allows you to do weird things like
# `line:line:bg:`. Also, i would like the `compadd -q` behaviour
[[ -prefix *:none: ]] && return 1
[[ -prefix *:*:*:* ]] && return 1
_arguments -C -s -S : $args && ret=0
_values -S ':' 'color/style type' \
'column[specify coloring for column numbers]: :->attribute' \
'line[specify coloring for line numbers]: :->attribute' \
'match[specify coloring for match text]: :->attribute' \
'path[specify color for file names]: :->attribute' && return 0
case $state in
colorspec)
if [[ ${IPREFIX#--*=}$PREFIX == [^:]# ]]; then
suf=( -qS: )
tmp=(
'column:specify coloring for column numbers'
'line:specify coloring for line numbers'
'match:specify coloring for match text'
'path:specify coloring for file names'
)
descr='color/style type'
elif [[ ${IPREFIX#--*=}$PREFIX == (column|line|match|path):[^:]# ]]; then
suf=( -qS: )
tmp=(
'none:clear color/style for type'
'bg:specify background color'
'fg:specify foreground color'
'style:specify text style'
)
descr='color/style attribute'
elif [[ ${IPREFIX#--*=}$PREFIX == [^:]##:(bg|fg):[^:]# ]]; then
tmp=( black blue green red cyan magenta yellow white )
descr='color name or r,g,b'
elif [[ ${IPREFIX#--*=}$PREFIX == [^:]##:style:[^:]# ]]; then
tmp=( {,no}bold {,no}intense {,no}underline )
descr='style name'
else
_message -e colorspec 'no more arguments'
fi
[[ "${state}" == 'attribute' ]] &&
_values -S ':' 'color/style attribute' \
'none[clear color/style for type]' \
'bg[specify background color]: :->color' \
'fg[specify foreground color]: :->color' \
'style[specify text style]: :->style' && return 0
(( $#tmp )) && {
compset -P '*:'
_describe -t colorspec $descr tmp $suf && ret=0
}
;;
[[ "${state}" == 'color' ]] &&
_values -S ':' 'color value' \
black blue green red cyan magenta yellow white && return 0
typespec)
if compset -P '[^:]##:include:'; then
_sequence -s , _rg_types && ret=0
# @todo This bit in particular could be better, but it's a little
# complex, and attempting to solve it seems to run us up against a crash
# bug — zsh # 40362
elif compset -P '[^:]##:'; then
_message 'glob or include directive' && ret=1
elif [[ ! -prefix *:* ]]; then
_rg_types -qS : && ret=0
fi
;;
esac
[[ "${state}" == 'style' ]] &&
_values -S ':' 'style value' \
bold nobold intense nointense && return 0
;;
typespec)
if compset -P '[^:]##:include:'; then
_sequence -s ',' _rg_types && return 0
# @todo This bit in particular could be better, but it's a little
# complex, and attempting to solve it seems to run us up against a crash
# bug — zsh # 40362
elif compset -P '[^:]##:'; then
_message 'glob or include directive' && return 1
elif [[ ! -prefix *:* ]]; then
_rg_types -qS ':' && return 0
fi
;;
esac
shift state
done
return 1
}
# zsh 5.1 refuses to complete options if a 'match-less' operand like our pattern
# could be 'completed' instead. We can use _guard() to avoid this problem, but
# it introduces another one: zsh won't print the message if we try to complete
# the pattern after having passed `--`. To work around *that* problem, we can
# use this function to bypass the _guard() when `--` is on the command line.
# This is inaccurate (it'd get confused by e.g. `rg -e --`), but zsh's handling
# of `--` isn't accurate anyway
_rg_pattern() {
if (( ${words[(I)--]} )); then
_message 'pattern'
else
_guard '^-*' 'pattern'
fi
return ret
}
# Complete encodings
@@ -192,10 +419,10 @@ _rg_encodings() {
shift{-,_}jis csshiftjis {,x-}sjis ms_kanji ms932
utf{,-}8 utf-16{,be,le} unicode-1-1-utf-8
windows-{31j,874,949,125{0..8}} dos-874 tis-620 ansi_x3.4-1968
x-user-defined auto
x-user-defined auto none
)
_wanted rg-encodings expl 'encoding' compadd -a "${@}" - _encodings
_wanted encodings expl encoding compadd -a "$@" - _encodings
}
# Complete file types
@@ -203,12 +430,163 @@ _rg_types() {
local -a expl
local -aU _types
_types=( ${${(f)"$( _call_program rg-types rg --type-list )"}%%:*} )
_types=( ${(@)${(f)"$( _call_program types rg --type-list )"}%%:*} )
_wanted rg-types expl 'file type' compadd -a "${@}" - _types
_wanted types expl 'file type' compadd -a "$@" - _types
}
_rg "${@}"
_rg "$@"
################################################################################
# ZSH COMPLETION REFERENCE
#
# For the convenience of developers who aren't especially familiar with zsh
# completion functions, a brief reference guide follows. This is in no way
# comprehensive; it covers just enough of the basic structure, syntax, and
# conventions to help someone make simple changes like adding new options. For
# more complete documentation regarding zsh completion functions, please see the
# following:
#
# * http://zsh.sourceforge.net/Doc/Release/Completion-System.html
# * https://github.com/zsh-users/zsh/blob/master/Etc/completion-style-guide
#
# OVERVIEW
#
# Most zsh completion functions are defined in terms of `_arguments`, which is a
# shell function that takes a series of argument specifications. The specs for
# `rg` are stored in an array, which is common for more complex functions; the
# elements of the array are passed to `_arguments` on invocation.
#
# ARGUMENT-SPECIFICATION SYNTAX
#
# The following is a contrived example of the argument specs for a simple tool:
#
# '(: * -)'{-h,--help}'[display help information]'
# '(-q -v --quiet --verbose)'{-q,--quiet}'[decrease output verbosity]'
# '!(-q -v --quiet --verbose)--silent'
# '(-q -v --quiet --verbose)'{-v,--verbose}'[increase output verbosity]'
# '--color=[specify when to use colors]:when:(always never auto)'
# '*:example file:_files'
#
# Although there may appear to be six specs here, there are actually nine; we
# use brace expansion to combine specs for options that go by multiple names,
# like `-q` and `--quiet`. This is customary, and ties in with the fact that zsh
# merges completion possibilities together when they have the same description.
#
# The first line defines the option `-h`/`--help`. With most tools, it isn't
# useful to complete anything after `--help` because it effectively overrides
# all others; the `(: * -)` at the beginning of the spec tells zsh not to
# complete any other operands (`:` and `*`) or options (`-`) after this one has
# been used. The `[...]` at the end associates a description with `-h`/`--help`;
# as mentioned, zsh will see the identical descriptions and merge these options
# together when offering completion possibilities.
#
# The next line defines `-q`/`--quiet`. Here we don't want to suppress further
# completions entirely, but we don't want to offer `-q` if `--quiet` has been
# given (since they do the same thing), nor do we want to offer `-v` (since it
# doesn't make sense to be quiet and verbose at the same time). We don't need to
# tell zsh not to offer `--quiet` a second time, since that's the default
# behaviour, but since this line expands to two specs describing `-q` *and*
# `--quiet` we do need to explicitly list all of them here.
#
# The next line defines a hidden option `--silent` — maybe it's a deprecated
# synonym for `--quiet`. The leading `!` indicates that zsh shouldn't offer this
# option during completion. The benefit of providing a spec for an option that
# shouldn't be completed is that, if someone *does* use it, we can correctly
# suppress completion of other options afterwards.
#
# The next line defines `-v`/`--verbose`; this works just like `-q`/`--quiet`.
#
# The next line defines `--color`. In this example, `--color` doesn't have a
# corresponding short option, so we don't need to use brace expansion. Further,
# there are no other options it's exclusive with (just itself), so we don't need
# to define those at the beginning. However, it does take a mandatory argument.
# The `=` at the end of `--color=` indicates that the argument may appear either
# like `--color always` or like `--color=always`; this is how most GNU-style
# command-line tools work. The corresponding short option would normally use `+`
# — for example, `-c+` would allow either `-c always` or `-calways`. For this
# option, the arguments are known ahead of time, so we can simply list them in
# parentheses at the end (`when` is used as the description for the argument).
#
# The last line defines an operand (a non-option argument). In this example, the
# operand can be used any number of times (the leading `*`), and it should be a
# file path, so we tell zsh to call the `_files` function to complete it. The
# `example file` in the middle is the description to use for this operand; we
# could use a space instead to accept the default provided by `_files`.
#
# GROUPING ARGUMENT SPECIFICATIONS
#
# Newer versions of zsh support grouping argument specs together. All specs
# following a `+` and then a group name are considered to be members of the
# named group. Grouping is useful mostly for organisational purposes; it makes
# the relationship between different options more obvious, and makes it easier
# to specify exclusions.
#
# We could rewrite our example above using grouping as follows:
#
# '(: * -)'{-h,--help}'[display help information]'
# '--color=[specify when to use colors]:when:(always never auto)'
# '*:example file:_files'
# + '(verbosity)'
# {-q,--quiet}'[decrease output verbosity]'
# '!--silent'
# {-v,--verbose}'[increase output verbosity]'
#
# Here we take advantage of a useful feature of spec grouping — when the group
# name is surrounded by parentheses, as in `(verbosity)`, it tells zsh that all
# of the options in that group are exclusive with each other. As a result, we
# don't need to manually list out the exclusions at the beginning of each
# option.
#
# Groups can also be referred to by name in other argument specs; for example:
#
# '(xyz)--aaa' '*: :_files'
# + xyz --xxx --yyy --zzz
#
# Here we use the group name `xyz` to tell zsh that `--xxx`, `--yyy`, and
# `--zzz` are not to be completed after `--aaa`. This makes the exclusion list
# much more compact and reusable.
#
# CONVENTIONS
#
# zsh completion functions generally adhere to the following conventions:
#
# * Use two spaces for indentation
# * Combine specs for options with different names using brace expansion
# * In combined specs, list the short option first (as in `{-a,--text}`)
# * Use `+` or `=` as described above for options that take arguments
# * Provide a description for all options, option-arguments, and operands
# * Capitalise/punctuate argument descriptions as phrases, not complete
# sentences — 'display help information', never 'Display help information.'
# (but still capitalise acronyms and proper names)
# * Write argument descriptions as verb phrases — 'display x', 'enable y',
# 'use z'
# * Word descriptions to make it clear when an option expects an argument;
# usually this is done with the word 'specify', as in 'specify x' or
# 'use specified x')
# * Write argument descriptions as tersely as possible — for example, articles
# like 'a' and 'the' should be omitted unless it would be confusing
#
# Other conventions currently used by this function:
#
# * Order argument specs alphabetically by group name, then option name
# * Group options that are directly related, mutually exclusive, or frequently
# referenced by other argument specs
# * Use only characters in the set [a-z0-9_-] in group names
# * Order exclusion lists as follows: short options, long options, groups
# * Use American English in descriptions
# * Use 'don't' in descriptions instead of 'do not'
# * Word descriptions for related options as similarly as possible. For example,
# `--foo[enable foo]` and `--no-foo[disable foo]`, or `--foo[use foo]` and
# `--no-foo[don't use foo]`
# * Word descriptions to make it clear when an option only makes sense with
# another option, usually by adding '(with -x)' to the end
# * Don't quote strings or variables unnecessarily. When quotes are required,
# prefer single-quotes to double-quotes
# * Prefix option specs with `$no` when the option serves only to negate the
# behaviour of another option that must be provided explicitly by the user.
# This prevents rarely used options from cluttering up the completion menu
################################################################################
# ------------------------------------------------------------------------------
# Copyright (c) 2011 Github zsh-users - http://github.com/zsh-users

26
crates/cli/Cargo.toml Normal file
View File

@@ -0,0 +1,26 @@
[package]
name = "grep-cli"
version = "0.1.4" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
Utilities for search oriented command line applications.
"""
documentation = "https://docs.rs/grep-cli"
homepage = "https://github.com/BurntSushi/ripgrep/tree/master/crates/cli"
repository = "https://github.com/BurntSushi/ripgrep/tree/master/crates/cli"
readme = "README.md"
keywords = ["regex", "grep", "cli", "utility", "util"]
license = "Unlicense/MIT"
[dependencies]
atty = "0.2.11"
bstr = "0.2.0"
globset = { version = "0.4.3", path = "../globset" }
lazy_static = "1.1.0"
log = "0.4.5"
regex = "1.1"
same-file = "1.0.4"
termcolor = "1.0.4"
[target.'cfg(windows)'.dependencies.winapi-util]
version = "0.1.1"

38
crates/cli/README.md Normal file
View File

@@ -0,0 +1,38 @@
grep-cli
--------
A utility library that provides common routines desired in search oriented
command line applications. This includes, but is not limited to, parsing hex
escapes, detecting whether stdin is readable and more. To the extent possible,
this crate strives for compatibility across Windows, macOS and Linux.
[![Linux build status](https://api.travis-ci.org/BurntSushi/ripgrep.svg)](https://travis-ci.org/BurntSushi/ripgrep)
[![Windows build status](https://ci.appveyor.com/api/projects/status/github/BurntSushi/ripgrep?svg=true)](https://ci.appveyor.com/project/BurntSushi/ripgrep)
[![](https://img.shields.io/crates/v/grep-cli.svg)](https://crates.io/crates/grep-cli)
Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org).
### Documentation
[https://docs.rs/grep-cli](https://docs.rs/grep-cli)
**NOTE:** You probably don't want to use this crate directly. Instead, you
should prefer the facade defined in the
[`grep`](https://docs.rs/grep)
crate.
### Usage
Add this to your `Cargo.toml`:
```toml
[dependencies]
grep-cli = "0.1"
```
and this to your crate root:
```rust
extern crate grep_cli;
```

View File

@@ -0,0 +1,376 @@
use std::ffi::{OsStr, OsString};
use std::fs::File;
use std::io;
use std::path::Path;
use std::process::Command;
use globset::{Glob, GlobSet, GlobSetBuilder};
use process::{CommandError, CommandReader, CommandReaderBuilder};
/// A builder for a matcher that determines which files get decompressed.
#[derive(Clone, Debug)]
pub struct DecompressionMatcherBuilder {
/// The commands for each matching glob.
commands: Vec<DecompressionCommand>,
/// Whether to include the default matching rules.
defaults: bool,
}
/// A representation of a single command for decompressing data
/// out-of-proccess.
#[derive(Clone, Debug)]
struct DecompressionCommand {
/// The glob that matches this command.
glob: String,
/// The command or binary name.
bin: OsString,
/// The arguments to invoke with the command.
args: Vec<OsString>,
}
impl Default for DecompressionMatcherBuilder {
fn default() -> DecompressionMatcherBuilder {
DecompressionMatcherBuilder::new()
}
}
impl DecompressionMatcherBuilder {
/// Create a new builder for configuring a decompression matcher.
pub fn new() -> DecompressionMatcherBuilder {
DecompressionMatcherBuilder { commands: vec![], defaults: true }
}
/// Build a matcher for determining how to decompress files.
///
/// If there was a problem compiling the matcher, then an error is
/// returned.
pub fn build(&self) -> Result<DecompressionMatcher, CommandError> {
let defaults = if !self.defaults {
vec![]
} else {
default_decompression_commands()
};
let mut glob_builder = GlobSetBuilder::new();
let mut commands = vec![];
for decomp_cmd in defaults.iter().chain(&self.commands) {
let glob = Glob::new(&decomp_cmd.glob).map_err(|err| {
CommandError::io(io::Error::new(io::ErrorKind::Other, err))
})?;
glob_builder.add(glob);
commands.push(decomp_cmd.clone());
}
let globs = glob_builder.build().map_err(|err| {
CommandError::io(io::Error::new(io::ErrorKind::Other, err))
})?;
Ok(DecompressionMatcher { globs, commands })
}
/// When enabled, the default matching rules will be compiled into this
/// matcher before any other associations. When disabled, only the
/// rules explicitly given to this builder will be used.
///
/// This is enabled by default.
pub fn defaults(&mut self, yes: bool) -> &mut DecompressionMatcherBuilder {
self.defaults = yes;
self
}
/// Associates a glob with a command to decompress files matching the glob.
///
/// If multiple globs match the same file, then the most recently added
/// glob takes precedence.
///
/// The syntax for the glob is documented in the
/// [`globset` crate](https://docs.rs/globset/#syntax).
pub fn associate<P, I, A>(
&mut self,
glob: &str,
program: P,
args: I,
) -> &mut DecompressionMatcherBuilder
where
P: AsRef<OsStr>,
I: IntoIterator<Item = A>,
A: AsRef<OsStr>,
{
let glob = glob.to_string();
let bin = program.as_ref().to_os_string();
let args =
args.into_iter().map(|a| a.as_ref().to_os_string()).collect();
self.commands.push(DecompressionCommand { glob, bin, args });
self
}
}
/// A matcher for determining how to decompress files.
#[derive(Clone, Debug)]
pub struct DecompressionMatcher {
/// The set of globs to match. Each glob has a corresponding entry in
/// `commands`. When a glob matches, the corresponding command should be
/// used to perform out-of-process decompression.
globs: GlobSet,
/// The commands for each matching glob.
commands: Vec<DecompressionCommand>,
}
impl Default for DecompressionMatcher {
fn default() -> DecompressionMatcher {
DecompressionMatcher::new()
}
}
impl DecompressionMatcher {
/// Create a new matcher with default rules.
///
/// To add more matching rules, build a matcher with
/// [`DecompressionMatcherBuilder`](struct.DecompressionMatcherBuilder.html).
pub fn new() -> DecompressionMatcher {
DecompressionMatcherBuilder::new()
.build()
.expect("built-in matching rules should always compile")
}
/// Return a pre-built command based on the given file path that can
/// decompress its contents. If no such decompressor is known, then this
/// returns `None`.
///
/// If there are multiple possible commands matching the given path, then
/// the command added last takes precedence.
pub fn command<P: AsRef<Path>>(&self, path: P) -> Option<Command> {
for i in self.globs.matches(path).into_iter().rev() {
let decomp_cmd = &self.commands[i];
let mut cmd = Command::new(&decomp_cmd.bin);
cmd.args(&decomp_cmd.args);
return Some(cmd);
}
None
}
/// Returns true if and only if the given file path has at least one
/// matching command to perform decompression on.
pub fn has_command<P: AsRef<Path>>(&self, path: P) -> bool {
self.globs.is_match(path)
}
}
/// Configures and builds a streaming reader for decompressing data.
#[derive(Clone, Debug, Default)]
pub struct DecompressionReaderBuilder {
matcher: DecompressionMatcher,
command_builder: CommandReaderBuilder,
}
impl DecompressionReaderBuilder {
/// Create a new builder with the default configuration.
pub fn new() -> DecompressionReaderBuilder {
DecompressionReaderBuilder::default()
}
/// Build a new streaming reader for decompressing data.
///
/// If decompression is done out-of-process and if there was a problem
/// spawning the process, then its error is logged at the debug level and a
/// passthru reader is returned that does no decompression. This behavior
/// typically occurs when the given file path matches a decompression
/// command, but is executing in an environment where the decompression
/// command is not available.
///
/// If the given file path could not be matched with a decompression
/// strategy, then a passthru reader is returned that does no
/// decompression.
pub fn build<P: AsRef<Path>>(
&self,
path: P,
) -> Result<DecompressionReader, CommandError> {
let path = path.as_ref();
let mut cmd = match self.matcher.command(path) {
None => return DecompressionReader::new_passthru(path),
Some(cmd) => cmd,
};
cmd.arg(path);
match self.command_builder.build(&mut cmd) {
Ok(cmd_reader) => Ok(DecompressionReader { rdr: Ok(cmd_reader) }),
Err(err) => {
debug!(
"{}: error spawning command '{:?}': {} \
(falling back to uncompressed reader)",
path.display(),
cmd,
err,
);
DecompressionReader::new_passthru(path)
}
}
}
/// Set the matcher to use to look up the decompression command for each
/// file path.
///
/// A set of sensible rules is enabled by default. Setting this will
/// completely replace the current rules.
pub fn matcher(
&mut self,
matcher: DecompressionMatcher,
) -> &mut DecompressionReaderBuilder {
self.matcher = matcher;
self
}
/// Get the underlying matcher currently used by this builder.
pub fn get_matcher(&self) -> &DecompressionMatcher {
&self.matcher
}
/// When enabled, the reader will asynchronously read the contents of the
/// command's stderr output. When disabled, stderr is only read after the
/// stdout stream has been exhausted (or if the process quits with an error
/// code).
///
/// Note that when enabled, this may require launching an additional
/// thread in order to read stderr. This is done so that the process being
/// executed is never blocked from writing to stdout or stderr. If this is
/// disabled, then it is possible for the process to fill up the stderr
/// buffer and deadlock.
///
/// This is enabled by default.
pub fn async_stderr(
&mut self,
yes: bool,
) -> &mut DecompressionReaderBuilder {
self.command_builder.async_stderr(yes);
self
}
}
/// A streaming reader for decompressing the contents of a file.
///
/// The purpose of this reader is to provide a seamless way to decompress the
/// contents of file using existing tools in the current environment. This is
/// meant to be an alternative to using decompression libraries in favor of the
/// simplicity and portability of using external commands such as `gzip` and
/// `xz`. This does impose the overhead of spawning a process, so other means
/// for performing decompression should be sought if this overhead isn't
/// acceptable.
///
/// A decompression reader comes with a default set of matching rules that are
/// meant to associate file paths with the corresponding command to use to
/// decompress them. For example, a glob like `*.gz` matches gzip compressed
/// files with the command `gzip -d -c`. If a file path does not match any
/// existing rules, or if it matches a rule whose command does not exist in the
/// current environment, then the decompression reader passes through the
/// contents of the underlying file without doing any decompression.
///
/// The default matching rules are probably good enough for most cases, and if
/// they require revision, pull requests are welcome. In cases where they must
/// be changed or extended, they can be customized through the use of
/// [`DecompressionMatcherBuilder`](struct.DecompressionMatcherBuilder.html)
/// and
/// [`DecompressionReaderBuilder`](struct.DecompressionReaderBuilder.html).
///
/// By default, this reader will asynchronously read the processes' stderr.
/// This prevents subtle deadlocking bugs for noisy processes that write a lot
/// to stderr. Currently, the entire contents of stderr is read on to the heap.
///
/// # Example
///
/// This example shows how to read the decompressed contents of a file without
/// needing to explicitly choose the decompression command to run.
///
/// Note that if you need to decompress multiple files, it is better to use
/// `DecompressionReaderBuilder`, which will amortize the cost of compiling the
/// matcher.
///
/// ```no_run
/// use std::io::Read;
/// use std::process::Command;
/// use grep_cli::DecompressionReader;
///
/// # fn example() -> Result<(), Box<::std::error::Error>> {
/// let mut rdr = DecompressionReader::new("/usr/share/man/man1/ls.1.gz")?;
/// let mut contents = vec![];
/// rdr.read_to_end(&mut contents)?;
/// # Ok(()) }
/// ```
#[derive(Debug)]
pub struct DecompressionReader {
rdr: Result<CommandReader, File>,
}
impl DecompressionReader {
/// Build a new streaming reader for decompressing data.
///
/// If decompression is done out-of-process and if there was a problem
/// spawning the process, then its error is returned.
///
/// If the given file path could not be matched with a decompression
/// strategy, then a passthru reader is returned that does no
/// decompression.
///
/// This uses the default matching rules for determining how to decompress
/// the given file. To change those matching rules, use
/// [`DecompressionReaderBuilder`](struct.DecompressionReaderBuilder.html)
/// and
/// [`DecompressionMatcherBuilder`](struct.DecompressionMatcherBuilder.html).
///
/// When creating readers for many paths. it is better to use the builder
/// since it will amortize the cost of constructing the matcher.
pub fn new<P: AsRef<Path>>(
path: P,
) -> Result<DecompressionReader, CommandError> {
DecompressionReaderBuilder::new().build(path)
}
/// Creates a new "passthru" decompression reader that reads from the file
/// corresponding to the given path without doing decompression and without
/// executing another process.
fn new_passthru(path: &Path) -> Result<DecompressionReader, CommandError> {
let file = File::open(path)?;
Ok(DecompressionReader { rdr: Err(file) })
}
}
impl io::Read for DecompressionReader {
fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
match self.rdr {
Ok(ref mut rdr) => rdr.read(buf),
Err(ref mut rdr) => rdr.read(buf),
}
}
}
fn default_decompression_commands() -> Vec<DecompressionCommand> {
const ARGS_GZIP: &[&str] = &["gzip", "-d", "-c"];
const ARGS_BZIP: &[&str] = &["bzip2", "-d", "-c"];
const ARGS_XZ: &[&str] = &["xz", "-d", "-c"];
const ARGS_LZ4: &[&str] = &["lz4", "-d", "-c"];
const ARGS_LZMA: &[&str] = &["xz", "--format=lzma", "-d", "-c"];
const ARGS_BROTLI: &[&str] = &["brotli", "-d", "-c"];
const ARGS_ZSTD: &[&str] = &["zstd", "-q", "-d", "-c"];
fn cmd(glob: &str, args: &[&str]) -> DecompressionCommand {
DecompressionCommand {
glob: glob.to_string(),
bin: OsStr::new(&args[0]).to_os_string(),
args: args
.iter()
.skip(1)
.map(|s| OsStr::new(s).to_os_string())
.collect(),
}
}
vec![
cmd("*.gz", ARGS_GZIP),
cmd("*.tgz", ARGS_GZIP),
cmd("*.bz2", ARGS_BZIP),
cmd("*.tbz2", ARGS_BZIP),
cmd("*.xz", ARGS_XZ),
cmd("*.txz", ARGS_XZ),
cmd("*.lz4", ARGS_LZ4),
cmd("*.lzma", ARGS_LZMA),
cmd("*.br", ARGS_BROTLI),
cmd("*.zst", ARGS_ZSTD),
cmd("*.zstd", ARGS_ZSTD),
]
}

272
crates/cli/src/escape.rs Normal file
View File

@@ -0,0 +1,272 @@
use std::ffi::OsStr;
use std::str;
use bstr::{ByteSlice, ByteVec};
/// A single state in the state machine used by `unescape`.
#[derive(Clone, Copy, Eq, PartialEq)]
enum State {
/// The state after seeing a `\`.
Escape,
/// The state after seeing a `\x`.
HexFirst,
/// The state after seeing a `\x[0-9A-Fa-f]`.
HexSecond(char),
/// Default state.
Literal,
}
/// Escapes arbitrary bytes into a human readable string.
///
/// This converts `\t`, `\r` and `\n` into their escaped forms. It also
/// converts the non-printable subset of ASCII in addition to invalid UTF-8
/// bytes to hexadecimal escape sequences. Everything else is left as is.
///
/// The dual of this routine is [`unescape`](fn.unescape.html).
///
/// # Example
///
/// This example shows how to convert a byte string that contains a `\n` and
/// invalid UTF-8 bytes into a `String`.
///
/// Pay special attention to the use of raw strings. That is, `r"\n"` is
/// equivalent to `"\\n"`.
///
/// ```
/// use grep_cli::escape;
///
/// assert_eq!(r"foo\nbar\xFFbaz", escape(b"foo\nbar\xFFbaz"));
/// ```
pub fn escape(bytes: &[u8]) -> String {
let mut escaped = String::new();
for (s, e, ch) in bytes.char_indices() {
if ch == '\u{FFFD}' {
for b in bytes[s..e].bytes() {
escape_byte(b, &mut escaped);
}
} else {
escape_char(ch, &mut escaped);
}
}
escaped
}
/// Escapes an OS string into a human readable string.
///
/// This is like [`escape`](fn.escape.html), but accepts an OS string.
pub fn escape_os(string: &OsStr) -> String {
escape(Vec::from_os_str_lossy(string).as_bytes())
}
/// Unescapes a string.
///
/// It supports a limited set of escape sequences:
///
/// * `\t`, `\r` and `\n` are mapped to their corresponding ASCII bytes.
/// * `\xZZ` hexadecimal escapes are mapped to their byte.
///
/// Everything else is left as is, including non-hexadecimal escapes like
/// `\xGG`.
///
/// This is useful when it is desirable for a command line argument to be
/// capable of specifying arbitrary bytes or otherwise make it easier to
/// specify non-printable characters.
///
/// The dual of this routine is [`escape`](fn.escape.html).
///
/// # Example
///
/// This example shows how to convert an escaped string (which is valid UTF-8)
/// into a corresponding sequence of bytes. Each escape sequence is mapped to
/// its bytes, which may include invalid UTF-8.
///
/// Pay special attention to the use of raw strings. That is, `r"\n"` is
/// equivalent to `"\\n"`.
///
/// ```
/// use grep_cli::unescape;
///
/// assert_eq!(&b"foo\nbar\xFFbaz"[..], &*unescape(r"foo\nbar\xFFbaz"));
/// ```
pub fn unescape(s: &str) -> Vec<u8> {
use self::State::*;
let mut bytes = vec![];
let mut state = Literal;
for c in s.chars() {
match state {
Escape => match c {
'\\' => {
bytes.push(b'\\');
state = Literal;
}
'n' => {
bytes.push(b'\n');
state = Literal;
}
'r' => {
bytes.push(b'\r');
state = Literal;
}
't' => {
bytes.push(b'\t');
state = Literal;
}
'x' => {
state = HexFirst;
}
c => {
bytes.extend(format!(r"\{}", c).into_bytes());
state = Literal;
}
},
HexFirst => match c {
'0'..='9' | 'A'..='F' | 'a'..='f' => {
state = HexSecond(c);
}
c => {
bytes.extend(format!(r"\x{}", c).into_bytes());
state = Literal;
}
},
HexSecond(first) => match c {
'0'..='9' | 'A'..='F' | 'a'..='f' => {
let ordinal = format!("{}{}", first, c);
let byte = u8::from_str_radix(&ordinal, 16).unwrap();
bytes.push(byte);
state = Literal;
}
c => {
let original = format!(r"\x{}{}", first, c);
bytes.extend(original.into_bytes());
state = Literal;
}
},
Literal => match c {
'\\' => {
state = Escape;
}
c => {
bytes.extend(c.to_string().as_bytes());
}
},
}
}
match state {
Escape => bytes.push(b'\\'),
HexFirst => bytes.extend(b"\\x"),
HexSecond(c) => bytes.extend(format!("\\x{}", c).into_bytes()),
Literal => {}
}
bytes
}
/// Unescapes an OS string.
///
/// This is like [`unescape`](fn.unescape.html), but accepts an OS string.
///
/// Note that this first lossily decodes the given OS string as UTF-8. That
/// is, an escaped string (the thing given) should be valid UTF-8.
pub fn unescape_os(string: &OsStr) -> Vec<u8> {
unescape(&string.to_string_lossy())
}
/// Adds the given codepoint to the given string, escaping it if necessary.
fn escape_char(cp: char, into: &mut String) {
if cp.is_ascii() {
escape_byte(cp as u8, into);
} else {
into.push(cp);
}
}
/// Adds the given byte to the given string, escaping it if necessary.
fn escape_byte(byte: u8, into: &mut String) {
match byte {
0x21..=0x5B | 0x5D..=0x7D => into.push(byte as char),
b'\n' => into.push_str(r"\n"),
b'\r' => into.push_str(r"\r"),
b'\t' => into.push_str(r"\t"),
b'\\' => into.push_str(r"\\"),
_ => into.push_str(&format!(r"\x{:02X}", byte)),
}
}
#[cfg(test)]
mod tests {
use super::{escape, unescape};
fn b(bytes: &'static [u8]) -> Vec<u8> {
bytes.to_vec()
}
#[test]
fn empty() {
assert_eq!(b(b""), unescape(r""));
assert_eq!(r"", escape(b""));
}
#[test]
fn backslash() {
assert_eq!(b(b"\\"), unescape(r"\\"));
assert_eq!(r"\\", escape(b"\\"));
}
#[test]
fn nul() {
assert_eq!(b(b"\x00"), unescape(r"\x00"));
assert_eq!(r"\x00", escape(b"\x00"));
}
#[test]
fn nl() {
assert_eq!(b(b"\n"), unescape(r"\n"));
assert_eq!(r"\n", escape(b"\n"));
}
#[test]
fn tab() {
assert_eq!(b(b"\t"), unescape(r"\t"));
assert_eq!(r"\t", escape(b"\t"));
}
#[test]
fn carriage() {
assert_eq!(b(b"\r"), unescape(r"\r"));
assert_eq!(r"\r", escape(b"\r"));
}
#[test]
fn nothing_simple() {
assert_eq!(b(b"\\a"), unescape(r"\a"));
assert_eq!(b(b"\\a"), unescape(r"\\a"));
assert_eq!(r"\\a", escape(b"\\a"));
}
#[test]
fn nothing_hex0() {
assert_eq!(b(b"\\x"), unescape(r"\x"));
assert_eq!(b(b"\\x"), unescape(r"\\x"));
assert_eq!(r"\\x", escape(b"\\x"));
}
#[test]
fn nothing_hex1() {
assert_eq!(b(b"\\xz"), unescape(r"\xz"));
assert_eq!(b(b"\\xz"), unescape(r"\\xz"));
assert_eq!(r"\\xz", escape(b"\\xz"));
}
#[test]
fn nothing_hex2() {
assert_eq!(b(b"\\xzz"), unescape(r"\xzz"));
assert_eq!(b(b"\\xzz"), unescape(r"\\xzz"));
assert_eq!(r"\\xzz", escape(b"\\xzz"));
}
#[test]
fn invalid_utf8() {
assert_eq!(r"\xFF", escape(b"\xFF"));
assert_eq!(r"a\xFFb", escape(b"a\xFFb"));
}
}

165
crates/cli/src/human.rs Normal file
View File

@@ -0,0 +1,165 @@
use std::error;
use std::fmt;
use std::io;
use std::num::ParseIntError;
use regex::Regex;
/// An error that occurs when parsing a human readable size description.
///
/// This error provides a end user friendly message describing why the
/// description coudln't be parsed and what the expected format is.
#[derive(Clone, Debug, Eq, PartialEq)]
pub struct ParseSizeError {
original: String,
kind: ParseSizeErrorKind,
}
#[derive(Clone, Debug, Eq, PartialEq)]
enum ParseSizeErrorKind {
InvalidFormat,
InvalidInt(ParseIntError),
Overflow,
}
impl ParseSizeError {
fn format(original: &str) -> ParseSizeError {
ParseSizeError {
original: original.to_string(),
kind: ParseSizeErrorKind::InvalidFormat,
}
}
fn int(original: &str, err: ParseIntError) -> ParseSizeError {
ParseSizeError {
original: original.to_string(),
kind: ParseSizeErrorKind::InvalidInt(err),
}
}
fn overflow(original: &str) -> ParseSizeError {
ParseSizeError {
original: original.to_string(),
kind: ParseSizeErrorKind::Overflow,
}
}
}
impl error::Error for ParseSizeError {
fn description(&self) -> &str {
"invalid size"
}
}
impl fmt::Display for ParseSizeError {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
use self::ParseSizeErrorKind::*;
match self.kind {
InvalidFormat => write!(
f,
"invalid format for size '{}', which should be a sequence \
of digits followed by an optional 'K', 'M' or 'G' \
suffix",
self.original
),
InvalidInt(ref err) => write!(
f,
"invalid integer found in size '{}': {}",
self.original, err
),
Overflow => write!(f, "size too big in '{}'", self.original),
}
}
}
impl From<ParseSizeError> for io::Error {
fn from(size_err: ParseSizeError) -> io::Error {
io::Error::new(io::ErrorKind::Other, size_err)
}
}
/// Parse a human readable size like `2M` into a corresponding number of bytes.
///
/// Supported size suffixes are `K` (for kilobyte), `M` (for megabyte) and `G`
/// (for gigabyte). If a size suffix is missing, then the size is interpreted
/// as bytes. If the size is too big to fit into a `u64`, then this returns an
/// error.
///
/// Additional suffixes may be added over time.
pub fn parse_human_readable_size(size: &str) -> Result<u64, ParseSizeError> {
lazy_static! {
// Normally I'd just parse something this simple by hand to avoid the
// regex dep, but we bring regex in any way for glob matching, so might
// as well use it.
static ref RE: Regex = Regex::new(r"^([0-9]+)([KMG])?$").unwrap();
}
let caps = match RE.captures(size) {
Some(caps) => caps,
None => return Err(ParseSizeError::format(size)),
};
let value: u64 =
caps[1].parse().map_err(|err| ParseSizeError::int(size, err))?;
let suffix = match caps.get(2) {
None => return Ok(value),
Some(cap) => cap.as_str(),
};
let bytes = match suffix {
"K" => value.checked_mul(1 << 10),
"M" => value.checked_mul(1 << 20),
"G" => value.checked_mul(1 << 30),
// Because if the regex matches this group, it must be [KMG].
_ => unreachable!(),
};
bytes.ok_or_else(|| ParseSizeError::overflow(size))
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn suffix_none() {
let x = parse_human_readable_size("123").unwrap();
assert_eq!(123, x);
}
#[test]
fn suffix_k() {
let x = parse_human_readable_size("123K").unwrap();
assert_eq!(123 * (1 << 10), x);
}
#[test]
fn suffix_m() {
let x = parse_human_readable_size("123M").unwrap();
assert_eq!(123 * (1 << 20), x);
}
#[test]
fn suffix_g() {
let x = parse_human_readable_size("123G").unwrap();
assert_eq!(123 * (1 << 30), x);
}
#[test]
fn invalid_empty() {
assert!(parse_human_readable_size("").is_err());
}
#[test]
fn invalid_non_digit() {
assert!(parse_human_readable_size("a").is_err());
}
#[test]
fn invalid_overflow() {
assert!(parse_human_readable_size("9999999999999999G").is_err());
}
#[test]
fn invalid_suffix() {
assert!(parse_human_readable_size("123T").is_err());
}
}

250
crates/cli/src/lib.rs Normal file
View File

@@ -0,0 +1,250 @@
/*!
This crate provides common routines used in command line applications, with a
focus on routines useful for search oriented applications. As a utility
library, there is no central type or function. However, a key focus of this
crate is to improve failure modes and provide user friendly error messages
when things go wrong.
To the best extent possible, everything in this crate works on Windows, macOS
and Linux.
# Standard I/O
The
[`is_readable_stdin`](fn.is_readable_stdin.html),
[`is_tty_stderr`](fn.is_tty_stderr.html),
[`is_tty_stdin`](fn.is_tty_stdin.html)
and
[`is_tty_stdout`](fn.is_tty_stdout.html)
routines query aspects of standard I/O. `is_readable_stdin` determines whether
stdin can be usefully read from, while the `tty` methods determine whether a
tty is attached to stdin/stdout/stderr.
`is_readable_stdin` is useful when writing an application that changes behavior
based on whether the application was invoked with data on stdin. For example,
`rg foo` might recursively search the current working directory for
occurrences of `foo`, but `rg foo < file` might only search the contents of
`file`.
The `tty` methods are useful for similar reasons. Namely, commands like `ls`
will change their output depending on whether they are printing to a terminal
or not. For example, `ls` shows a file on each line when stdout is redirected
to a file or a pipe, but condenses the output to show possibly many files on
each line when stdout is connected to a tty.
# Coloring and buffering
The
[`stdout`](fn.stdout.html),
[`stdout_buffered_block`](fn.stdout_buffered_block.html)
and
[`stdout_buffered_line`](fn.stdout_buffered_line.html)
routines are alternative constructors for
[`StandardStream`](struct.StandardStream.html).
A `StandardStream` implements `termcolor::WriteColor`, which provides a way
to emit colors to terminals. Its key use is the encapsulation of buffering
style. Namely, `stdout` will return a line buffered `StandardStream` if and
only if stdout is connected to a tty, and will otherwise return a block
buffered `StandardStream`. Line buffering is important for use with a tty
because it typically decreases the latency at which the end user sees output.
Block buffering is used otherwise because it is faster, and redirecting stdout
to a file typically doesn't benefit from the decreased latency that line
buffering provides.
The `stdout_buffered_block` and `stdout_buffered_line` can be used to
explicitly set the buffering strategy regardless of whether stdout is connected
to a tty or not.
# Escaping
The
[`escape`](fn.escape.html),
[`escape_os`](fn.escape_os.html),
[`unescape`](fn.unescape.html)
and
[`unescape_os`](fn.unescape_os.html)
routines provide a user friendly way of dealing with UTF-8 encoded strings that
can express arbitrary bytes. For example, you might want to accept a string
containing arbitrary bytes as a command line argument, but most interactive
shells make such strings difficult to type. Instead, we can ask users to use
escape sequences.
For example, `a\xFFz` is itself a valid UTF-8 string corresponding to the
following bytes:
```ignore
[b'a', b'\\', b'x', b'F', b'F', b'z']
```
However, we can
interpret `\xFF` as an escape sequence with the `unescape`/`unescape_os`
routines, which will yield
```ignore
[b'a', b'\xFF', b'z']
```
instead. For example:
```
use grep_cli::unescape;
// Note the use of a raw string!
assert_eq!(vec![b'a', b'\xFF', b'z'], unescape(r"a\xFFz"));
```
The `escape`/`escape_os` routines provide the reverse transformation, which
makes it easy to show user friendly error messages involving arbitrary bytes.
# Building patterns
Typically, regular expression patterns must be valid UTF-8. However, command
line arguments aren't guaranteed to be valid UTF-8. Unfortunately, the
standard library's UTF-8 conversion functions from `OsStr`s do not provide
good error messages. However, the
[`pattern_from_bytes`](fn.pattern_from_bytes.html)
and
[`pattern_from_os`](fn.pattern_from_os.html)
do, including reporting exactly where the first invalid UTF-8 byte is seen.
Additionally, it can be useful to read patterns from a file while reporting
good error messages that include line numbers. The
[`patterns_from_path`](fn.patterns_from_path.html),
[`patterns_from_reader`](fn.patterns_from_reader.html)
and
[`patterns_from_stdin`](fn.patterns_from_stdin.html)
routines do just that. If any pattern is found that is invalid UTF-8, then the
error includes the file path (if available) along with the line number and the
byte offset at which the first invalid UTF-8 byte was observed.
# Read process output
Sometimes a command line application needs to execute other processes and read
its stdout in a streaming fashion. The
[`CommandReader`](struct.CommandReader.html)
provides this functionality with an explicit goal of improving failure modes.
In particular, if the process exits with an error code, then stderr is read
and converted into a normal Rust error to show to end users. This makes the
underlying failure modes explicit and gives more information to end users for
debugging the problem.
As a special case,
[`DecompressionReader`](struct.DecompressionReader.html)
provides a way to decompress arbitrary files by matching their file extensions
up with corresponding decompression programs (such as `gzip` and `xz`). This
is useful as a means of performing simplistic decompression in a portable
manner without binding to specific compression libraries. This does come with
some overhead though, so if you need to decompress lots of small files, this
may not be an appropriate convenience to use.
Each reader has a corresponding builder for additional configuration, such as
whether to read stderr asynchronously in order to avoid deadlock (which is
enabled by default).
# Miscellaneous parsing
The
[`parse_human_readable_size`](fn.parse_human_readable_size.html)
routine parses strings like `2M` and converts them to the corresponding number
of bytes (`2 * 1<<20` in this case). If an invalid size is found, then a good
error message is crafted that typically tells the user how to fix the problem.
*/
#![deny(missing_docs)]
extern crate atty;
extern crate bstr;
extern crate globset;
#[macro_use]
extern crate lazy_static;
#[macro_use]
extern crate log;
extern crate regex;
extern crate same_file;
extern crate termcolor;
#[cfg(windows)]
extern crate winapi_util;
mod decompress;
mod escape;
mod human;
mod pattern;
mod process;
mod wtr;
pub use decompress::{
DecompressionMatcher, DecompressionMatcherBuilder, DecompressionReader,
DecompressionReaderBuilder,
};
pub use escape::{escape, escape_os, unescape, unescape_os};
pub use human::{parse_human_readable_size, ParseSizeError};
pub use pattern::{
pattern_from_bytes, pattern_from_os, patterns_from_path,
patterns_from_reader, patterns_from_stdin, InvalidPatternError,
};
pub use process::{CommandError, CommandReader, CommandReaderBuilder};
pub use wtr::{
stdout, stdout_buffered_block, stdout_buffered_line, StandardStream,
};
/// Returns true if and only if stdin is believed to be readable.
///
/// When stdin is readable, command line programs may choose to behave
/// differently than when stdin is not readable. For example, `command foo`
/// might search the current directory for occurrences of `foo` where as
/// `command foo < some-file` or `cat some-file | command foo` might instead
/// only search stdin for occurrences of `foo`.
pub fn is_readable_stdin() -> bool {
#[cfg(unix)]
fn imp() -> bool {
use same_file::Handle;
use std::os::unix::fs::FileTypeExt;
let ft = match Handle::stdin().and_then(|h| h.as_file().metadata()) {
Err(_) => return false,
Ok(md) => md.file_type(),
};
ft.is_file() || ft.is_fifo()
}
#[cfg(windows)]
fn imp() -> bool {
use winapi_util as winutil;
winutil::file::typ(winutil::HandleRef::stdin())
.map(|t| t.is_disk() || t.is_pipe())
.unwrap_or(false)
}
!is_tty_stdin() && imp()
}
/// Returns true if and only if stdin is believed to be connectted to a tty
/// or a console.
pub fn is_tty_stdin() -> bool {
atty::is(atty::Stream::Stdin)
}
/// Returns true if and only if stdout is believed to be connectted to a tty
/// or a console.
///
/// This is useful for when you want your command line program to produce
/// different output depending on whether it's printing directly to a user's
/// terminal or whether it's being redirected somewhere else. For example,
/// implementations of `ls` will often show one item per line when stdout is
/// redirected, but will condensed output when printing to a tty.
pub fn is_tty_stdout() -> bool {
atty::is(atty::Stream::Stdout)
}
/// Returns true if and only if stderr is believed to be connectted to a tty
/// or a console.
pub fn is_tty_stderr() -> bool {
atty::is(atty::Stream::Stderr)
}

195
crates/cli/src/pattern.rs Normal file
View File

@@ -0,0 +1,195 @@
use std::error;
use std::ffi::OsStr;
use std::fmt;
use std::fs::File;
use std::io;
use std::path::Path;
use std::str;
use bstr::io::BufReadExt;
use escape::{escape, escape_os};
/// An error that occurs when a pattern could not be converted to valid UTF-8.
///
/// The purpose of this error is to give a more targeted failure mode for
/// patterns written by end users that are not valid UTF-8.
#[derive(Clone, Debug, Eq, PartialEq)]
pub struct InvalidPatternError {
original: String,
valid_up_to: usize,
}
impl InvalidPatternError {
/// Returns the index in the given string up to which valid UTF-8 was
/// verified.
pub fn valid_up_to(&self) -> usize {
self.valid_up_to
}
}
impl error::Error for InvalidPatternError {
fn description(&self) -> &str {
"invalid pattern"
}
}
impl fmt::Display for InvalidPatternError {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
write!(
f,
"found invalid UTF-8 in pattern at byte offset {} \
(use hex escape sequences to match arbitrary bytes \
in a pattern, e.g., \\xFF): '{}'",
self.valid_up_to, self.original,
)
}
}
impl From<InvalidPatternError> for io::Error {
fn from(paterr: InvalidPatternError) -> io::Error {
io::Error::new(io::ErrorKind::Other, paterr)
}
}
/// Convert an OS string into a regular expression pattern.
///
/// This conversion fails if the given pattern is not valid UTF-8, in which
/// case, a targeted error with more information about where the invalid UTF-8
/// occurs is given. The error also suggests the use of hex escape sequences,
/// which are supported by many regex engines.
pub fn pattern_from_os(pattern: &OsStr) -> Result<&str, InvalidPatternError> {
pattern.to_str().ok_or_else(|| {
let valid_up_to = pattern
.to_string_lossy()
.find('\u{FFFD}')
.expect("a Unicode replacement codepoint for invalid UTF-8");
InvalidPatternError {
original: escape_os(pattern),
valid_up_to: valid_up_to,
}
})
}
/// Convert arbitrary bytes into a regular expression pattern.
///
/// This conversion fails if the given pattern is not valid UTF-8, in which
/// case, a targeted error with more information about where the invalid UTF-8
/// occurs is given. The error also suggests the use of hex escape sequences,
/// which are supported by many regex engines.
pub fn pattern_from_bytes(
pattern: &[u8],
) -> Result<&str, InvalidPatternError> {
str::from_utf8(pattern).map_err(|err| InvalidPatternError {
original: escape(pattern),
valid_up_to: err.valid_up_to(),
})
}
/// Read patterns from a file path, one per line.
///
/// If there was a problem reading or if any of the patterns contain invalid
/// UTF-8, then an error is returned. If there was a problem with a specific
/// pattern, then the error message will include the line number and the file
/// path.
pub fn patterns_from_path<P: AsRef<Path>>(path: P) -> io::Result<Vec<String>> {
let path = path.as_ref();
let file = File::open(path).map_err(|err| {
io::Error::new(
io::ErrorKind::Other,
format!("{}: {}", path.display(), err),
)
})?;
patterns_from_reader(file).map_err(|err| {
io::Error::new(
io::ErrorKind::Other,
format!("{}:{}", path.display(), err),
)
})
}
/// Read patterns from stdin, one per line.
///
/// If there was a problem reading or if any of the patterns contain invalid
/// UTF-8, then an error is returned. If there was a problem with a specific
/// pattern, then the error message will include the line number and the fact
/// that it came from stdin.
pub fn patterns_from_stdin() -> io::Result<Vec<String>> {
let stdin = io::stdin();
let locked = stdin.lock();
patterns_from_reader(locked).map_err(|err| {
io::Error::new(io::ErrorKind::Other, format!("<stdin>:{}", err))
})
}
/// Read patterns from any reader, one per line.
///
/// If there was a problem reading or if any of the patterns contain invalid
/// UTF-8, then an error is returned. If there was a problem with a specific
/// pattern, then the error message will include the line number.
///
/// Note that this routine uses its own internal buffer, so the caller should
/// not provide their own buffered reader if possible.
///
/// # Example
///
/// This shows how to parse patterns, one per line.
///
/// ```
/// use grep_cli::patterns_from_reader;
///
/// # fn example() -> Result<(), Box<::std::error::Error>> {
/// let patterns = "\
/// foo
/// bar\\s+foo
/// [a-z]{3}
/// ";
///
/// assert_eq!(patterns_from_reader(patterns.as_bytes())?, vec![
/// r"foo",
/// r"bar\s+foo",
/// r"[a-z]{3}",
/// ]);
/// # Ok(()) }
/// ```
pub fn patterns_from_reader<R: io::Read>(rdr: R) -> io::Result<Vec<String>> {
let mut patterns = vec![];
let mut line_number = 0;
io::BufReader::new(rdr).for_byte_line(|line| {
line_number += 1;
match pattern_from_bytes(line) {
Ok(pattern) => {
patterns.push(pattern.to_string());
Ok(true)
}
Err(err) => Err(io::Error::new(
io::ErrorKind::Other,
format!("{}: {}", line_number, err),
)),
}
})?;
Ok(patterns)
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn bytes() {
let pat = b"abc\xFFxyz";
let err = pattern_from_bytes(pat).unwrap_err();
assert_eq!(3, err.valid_up_to());
}
#[test]
#[cfg(unix)]
fn os() {
use std::ffi::OsStr;
use std::os::unix::ffi::OsStrExt;
let pat = OsStr::from_bytes(b"abc\xFFxyz");
let err = pattern_from_os(pat).unwrap_err();
assert_eq!(3, err.valid_up_to());
}
}

270
crates/cli/src/process.rs Normal file
View File

@@ -0,0 +1,270 @@
use std::error;
use std::fmt;
use std::io::{self, Read};
use std::iter;
use std::process;
use std::thread::{self, JoinHandle};
/// An error that can occur while running a command and reading its output.
///
/// This error can be seamlessly converted to an `io::Error` via a `From`
/// implementation.
#[derive(Debug)]
pub struct CommandError {
kind: CommandErrorKind,
}
#[derive(Debug)]
enum CommandErrorKind {
Io(io::Error),
Stderr(Vec<u8>),
}
impl CommandError {
/// Create an error from an I/O error.
pub(crate) fn io(ioerr: io::Error) -> CommandError {
CommandError { kind: CommandErrorKind::Io(ioerr) }
}
/// Create an error from the contents of stderr (which may be empty).
pub(crate) fn stderr(bytes: Vec<u8>) -> CommandError {
CommandError { kind: CommandErrorKind::Stderr(bytes) }
}
}
impl error::Error for CommandError {
fn description(&self) -> &str {
"command error"
}
}
impl fmt::Display for CommandError {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
match self.kind {
CommandErrorKind::Io(ref e) => e.fmt(f),
CommandErrorKind::Stderr(ref bytes) => {
let msg = String::from_utf8_lossy(bytes);
if msg.trim().is_empty() {
write!(f, "<stderr is empty>")
} else {
let div = iter::repeat('-').take(79).collect::<String>();
write!(
f,
"\n{div}\n{msg}\n{div}",
div = div,
msg = msg.trim()
)
}
}
}
}
}
impl From<io::Error> for CommandError {
fn from(ioerr: io::Error) -> CommandError {
CommandError { kind: CommandErrorKind::Io(ioerr) }
}
}
impl From<CommandError> for io::Error {
fn from(cmderr: CommandError) -> io::Error {
match cmderr.kind {
CommandErrorKind::Io(ioerr) => ioerr,
CommandErrorKind::Stderr(_) => {
io::Error::new(io::ErrorKind::Other, cmderr)
}
}
}
}
/// Configures and builds a streaming reader for process output.
#[derive(Clone, Debug, Default)]
pub struct CommandReaderBuilder {
async_stderr: bool,
}
impl CommandReaderBuilder {
/// Create a new builder with the default configuration.
pub fn new() -> CommandReaderBuilder {
CommandReaderBuilder::default()
}
/// Build a new streaming reader for the given command's output.
///
/// The caller should set everything that's required on the given command
/// before building a reader, such as its arguments, environment and
/// current working directory. Settings such as the stdout and stderr (but
/// not stdin) pipes will be overridden so that they can be controlled by
/// the reader.
///
/// If there was a problem spawning the given command, then its error is
/// returned.
pub fn build(
&self,
command: &mut process::Command,
) -> Result<CommandReader, CommandError> {
let mut child = command
.stdout(process::Stdio::piped())
.stderr(process::Stdio::piped())
.spawn()?;
let stdout = child.stdout.take().unwrap();
let stderr = if self.async_stderr {
StderrReader::async(child.stderr.take().unwrap())
} else {
StderrReader::sync(child.stderr.take().unwrap())
};
Ok(CommandReader {
child: child,
stdout: stdout,
stderr: stderr,
done: false,
})
}
/// When enabled, the reader will asynchronously read the contents of the
/// command's stderr output. When disabled, stderr is only read after the
/// stdout stream has been exhausted (or if the process quits with an error
/// code).
///
/// Note that when enabled, this may require launching an additional
/// thread in order to read stderr. This is done so that the process being
/// executed is never blocked from writing to stdout or stderr. If this is
/// disabled, then it is possible for the process to fill up the stderr
/// buffer and deadlock.
///
/// This is enabled by default.
pub fn async_stderr(&mut self, yes: bool) -> &mut CommandReaderBuilder {
self.async_stderr = yes;
self
}
}
/// A streaming reader for a command's output.
///
/// The purpose of this reader is to provide an easy way to execute processes
/// whose stdout is read in a streaming way while also making the processes'
/// stderr available when the process fails with an exit code. This makes it
/// possible to execute processes while surfacing the underlying failure mode
/// in the case of an error.
///
/// Moreover, by default, this reader will asynchronously read the processes'
/// stderr. This prevents subtle deadlocking bugs for noisy processes that
/// write a lot to stderr. Currently, the entire contents of stderr is read
/// on to the heap.
///
/// # Example
///
/// This example shows how to invoke `gzip` to decompress the contents of a
/// file. If the `gzip` command reports a failing exit status, then its stderr
/// is returned as an error.
///
/// ```no_run
/// use std::io::Read;
/// use std::process::Command;
/// use grep_cli::CommandReader;
///
/// # fn example() -> Result<(), Box<::std::error::Error>> {
/// let mut cmd = Command::new("gzip");
/// cmd.arg("-d").arg("-c").arg("/usr/share/man/man1/ls.1.gz");
///
/// let mut rdr = CommandReader::new(&mut cmd)?;
/// let mut contents = vec![];
/// rdr.read_to_end(&mut contents)?;
/// # Ok(()) }
/// ```
#[derive(Debug)]
pub struct CommandReader {
child: process::Child,
stdout: process::ChildStdout,
stderr: StderrReader,
done: bool,
}
impl CommandReader {
/// Create a new streaming reader for the given command using the default
/// configuration.
///
/// The caller should set everything that's required on the given command
/// before building a reader, such as its arguments, environment and
/// current working directory. Settings such as the stdout and stderr (but
/// not stdin) pipes will be overridden so that they can be controlled by
/// the reader.
///
/// If there was a problem spawning the given command, then its error is
/// returned.
///
/// If the caller requires additional configuration for the reader
/// returned, then use
/// [`CommandReaderBuilder`](struct.CommandReaderBuilder.html).
pub fn new(
cmd: &mut process::Command,
) -> Result<CommandReader, CommandError> {
CommandReaderBuilder::new().build(cmd)
}
}
impl io::Read for CommandReader {
fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
if self.done {
return Ok(0);
}
let nread = self.stdout.read(buf)?;
if nread == 0 {
self.done = true;
// Reap the child now that we're done reading. If the command
// failed, report stderr as an error.
if !self.child.wait()?.success() {
return Err(io::Error::from(self.stderr.read_to_end()));
}
}
Ok(nread)
}
}
/// A reader that encapsulates the asynchronous or synchronous reading of
/// stderr.
#[derive(Debug)]
enum StderrReader {
Async(Option<JoinHandle<CommandError>>),
Sync(process::ChildStderr),
}
impl StderrReader {
/// Create a reader for stderr that reads contents asynchronously.
fn async(mut stderr: process::ChildStderr) -> StderrReader {
let handle =
thread::spawn(move || stderr_to_command_error(&mut stderr));
StderrReader::Async(Some(handle))
}
/// Create a reader for stderr that reads contents synchronously.
fn sync(stderr: process::ChildStderr) -> StderrReader {
StderrReader::Sync(stderr)
}
/// Consumes all of stderr on to the heap and returns it as an error.
///
/// If there was a problem reading stderr itself, then this returns an I/O
/// command error.
fn read_to_end(&mut self) -> CommandError {
match *self {
StderrReader::Async(ref mut handle) => {
let handle = handle
.take()
.expect("read_to_end cannot be called more than once");
handle.join().expect("stderr reading thread does not panic")
}
StderrReader::Sync(ref mut stderr) => {
stderr_to_command_error(stderr)
}
}
}
}
fn stderr_to_command_error(stderr: &mut process::ChildStderr) -> CommandError {
let mut bytes = vec![];
match stderr.read_to_end(&mut bytes) {
Ok(_) => CommandError::stderr(bytes),
Err(err) => CommandError::io(err),
}
}

133
crates/cli/src/wtr.rs Normal file
View File

@@ -0,0 +1,133 @@
use std::io;
use termcolor;
use is_tty_stdout;
/// A writer that supports coloring with either line or block buffering.
pub struct StandardStream(StandardStreamKind);
/// Returns a possibly buffered writer to stdout for the given color choice.
///
/// The writer returned is either line buffered or block buffered. The decision
/// between these two is made automatically based on whether a tty is attached
/// to stdout or not. If a tty is attached, then line buffering is used.
/// Otherwise, block buffering is used. In general, block buffering is more
/// efficient, but may increase the time it takes for the end user to see the
/// first bits of output.
///
/// If you need more fine grained control over the buffering mode, then use one
/// of `stdout_buffered_line` or `stdout_buffered_block`.
///
/// The color choice given is passed along to the underlying writer. To
/// completely disable colors in all cases, use `ColorChoice::Never`.
pub fn stdout(color_choice: termcolor::ColorChoice) -> StandardStream {
if is_tty_stdout() {
stdout_buffered_line(color_choice)
} else {
stdout_buffered_block(color_choice)
}
}
/// Returns a line buffered writer to stdout for the given color choice.
///
/// This writer is useful when printing results directly to a tty such that
/// users see output as soon as it's written. The downside of this approach
/// is that it can be slower, especially when there is a lot of output.
///
/// You might consider using
/// [`stdout`](fn.stdout.html)
/// instead, which chooses the buffering strategy automatically based on
/// whether stdout is connected to a tty.
pub fn stdout_buffered_line(
color_choice: termcolor::ColorChoice,
) -> StandardStream {
let out = termcolor::StandardStream::stdout(color_choice);
StandardStream(StandardStreamKind::LineBuffered(out))
}
/// Returns a block buffered writer to stdout for the given color choice.
///
/// This writer is useful when printing results to a file since it amortizes
/// the cost of writing data. The downside of this approach is that it can
/// increase the latency of display output when writing to a tty.
///
/// You might consider using
/// [`stdout`](fn.stdout.html)
/// instead, which chooses the buffering strategy automatically based on
/// whether stdout is connected to a tty.
pub fn stdout_buffered_block(
color_choice: termcolor::ColorChoice,
) -> StandardStream {
let out = termcolor::BufferedStandardStream::stdout(color_choice);
StandardStream(StandardStreamKind::BlockBuffered(out))
}
enum StandardStreamKind {
LineBuffered(termcolor::StandardStream),
BlockBuffered(termcolor::BufferedStandardStream),
}
impl io::Write for StandardStream {
#[inline]
fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
use self::StandardStreamKind::*;
match self.0 {
LineBuffered(ref mut w) => w.write(buf),
BlockBuffered(ref mut w) => w.write(buf),
}
}
#[inline]
fn flush(&mut self) -> io::Result<()> {
use self::StandardStreamKind::*;
match self.0 {
LineBuffered(ref mut w) => w.flush(),
BlockBuffered(ref mut w) => w.flush(),
}
}
}
impl termcolor::WriteColor for StandardStream {
#[inline]
fn supports_color(&self) -> bool {
use self::StandardStreamKind::*;
match self.0 {
LineBuffered(ref w) => w.supports_color(),
BlockBuffered(ref w) => w.supports_color(),
}
}
#[inline]
fn set_color(&mut self, spec: &termcolor::ColorSpec) -> io::Result<()> {
use self::StandardStreamKind::*;
match self.0 {
LineBuffered(ref mut w) => w.set_color(spec),
BlockBuffered(ref mut w) => w.set_color(spec),
}
}
#[inline]
fn reset(&mut self) -> io::Result<()> {
use self::StandardStreamKind::*;
match self.0 {
LineBuffered(ref mut w) => w.reset(),
BlockBuffered(ref mut w) => w.reset(),
}
}
#[inline]
fn is_synchronous(&self) -> bool {
use self::StandardStreamKind::*;
match self.0 {
LineBuffered(ref w) => w.is_synchronous(),
BlockBuffered(ref w) => w.is_synchronous(),
}
}
}

15
crates/core/README.md Normal file
View File

@@ -0,0 +1,15 @@
ripgrep core
------------
This is the core ripgrep crate. In particular, `main.rs` is where the `main`
function lives.
Most of ripgrep core consists of two things:
* The definition of the CLI interface, including docs for every flag.
* Glue code that brings the `grep-matcher`, `grep-regex`, `grep-searcher` and
`grep-printer` crates together to actually execute the search.
Currently, there are no plans to make ripgrep core available as an independent
library. However, much of the heavy lifting of ripgrep is done via its
constituent crates, which can be reused independent of ripgrep. Unfortunately,
there is no guide or tutorial to teach folks how to do this yet.

3044
crates/core/app.rs Normal file

File diff suppressed because it is too large Load Diff

1854
crates/core/args.rs Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -4,18 +4,18 @@
use std::env;
use std::error::Error;
use std::fs::File;
use std::io::{self, BufRead};
use std::ffi::OsString;
use std::fs::File;
use std::io;
use std::path::{Path, PathBuf};
use Result;
use bstr::{io::BufReadExt, ByteSlice};
use log;
use crate::Result;
/// Return a sequence of arguments derived from ripgrep rc configuration files.
///
/// If no_messages is false and there was a problem reading a config file,
/// then errors are printed to stderr.
pub fn args(no_messages: bool) -> Vec<OsString> {
pub fn args() -> Vec<OsString> {
let config_path = match env::var_os("RIPGREP_CONFIG_PATH") {
None => return vec![],
Some(config_path) => {
@@ -28,20 +28,20 @@ pub fn args(no_messages: bool) -> Vec<OsString> {
let (args, errs) = match parse(&config_path) {
Ok((args, errs)) => (args, errs),
Err(err) => {
if !no_messages {
eprintln!("{}", err);
}
message!("{}", err);
return vec![];
}
};
if !no_messages && !errs.is_empty() {
if !errs.is_empty() {
for err in errs {
eprintln!("{}:{}", config_path.display(), err);
message!("{}:{}", config_path.display(), err);
}
}
debug!(
log::debug!(
"{}: arguments loaded from config file: {:?}",
config_path.display(), args);
config_path.display(),
args
);
args
}
@@ -55,11 +55,11 @@ pub fn args(no_messages: bool) -> Vec<OsString> {
/// for each line in addition to successfully parsed arguments.
fn parse<P: AsRef<Path>>(
path: P,
) -> Result<(Vec<OsString>, Vec<Box<Error>>)> {
) -> Result<(Vec<OsString>, Vec<Box<dyn Error>>)> {
let path = path.as_ref();
match File::open(&path) {
Ok(file) => parse_reader(file),
Err(err) => errored!("{}: {}", path.display(), err),
Err(err) => Err(From::from(format!("{}: {}", path.display(), err))),
}
}
@@ -76,71 +76,39 @@ fn parse<P: AsRef<Path>>(
/// in addition to successfully parsed arguments.
fn parse_reader<R: io::Read>(
rdr: R,
) -> Result<(Vec<OsString>, Vec<Box<Error>>)> {
let mut bufrdr = io::BufReader::new(rdr);
) -> Result<(Vec<OsString>, Vec<Box<dyn Error>>)> {
let bufrdr = io::BufReader::new(rdr);
let (mut args, mut errs) = (vec![], vec![]);
let mut line = vec![];
let mut line_number = 0;
while {
line.clear();
bufrdr.for_byte_line_with_terminator(|line| {
line_number += 1;
bufrdr.read_until(b'\n', &mut line)? > 0
} {
trim(&mut line);
let line = line.trim();
if line.is_empty() || line[0] == b'#' {
continue;
return Ok(true);
}
match bytes_to_os_string(&line) {
match line.to_os_str() {
Ok(osstr) => {
args.push(osstr);
args.push(osstr.to_os_string());
}
Err(err) => {
errs.push(format!("{}: {}", line_number, err).into());
}
}
}
Ok(true)
})?;
Ok((args, errs))
}
/// Trim the given bytes of whitespace according to the ASCII definition.
fn trim(x: &mut Vec<u8>) {
let upto = x.iter().take_while(|b| is_space(**b)).count();
x.drain(..upto);
let revto = x.len() - x.iter().rev().take_while(|b| is_space(**b)).count();
x.drain(revto..);
}
/// Returns true if and only if the given byte is an ASCII space character.
fn is_space(b: u8) -> bool {
b == b'\t'
|| b == b'\n'
|| b == b'\x0B'
|| b == b'\x0C'
|| b == b'\r'
|| b == b' '
}
/// On Unix, get an OsString from raw bytes.
#[cfg(unix)]
fn bytes_to_os_string(bytes: &[u8]) -> Result<OsString> {
use std::os::unix::ffi::OsStringExt;
Ok(OsString::from_vec(bytes.to_vec()))
}
/// On non-Unix (like Windows), require UTF-8.
#[cfg(not(unix))]
fn bytes_to_os_string(bytes: &[u8]) -> Result<OsString> {
String::from_utf8(bytes.to_vec()).map(OsString::from).map_err(From::from)
}
#[cfg(test)]
mod tests {
use std::ffi::OsString;
use super::parse_reader;
use std::ffi::OsString;
#[test]
fn basic() {
let (args, errs) = parse_reader(&b"\
let (args, errs) = parse_reader(
&b"\
# Test
--context=0
--smart-case
@@ -149,13 +117,13 @@ mod tests {
# --bar
--foo
"[..]).unwrap();
"[..],
)
.unwrap();
assert!(errs.is_empty());
let args: Vec<String> =
args.into_iter().map(|s| s.into_string().unwrap()).collect();
assert_eq!(args, vec![
"--context=0", "--smart-case", "-u", "--foo",
]);
assert_eq!(args, vec!["--context=0", "--smart-case", "-u", "--foo",]);
}
// We test that we can handle invalid UTF-8 on Unix-like systems.
@@ -164,32 +132,38 @@ mod tests {
fn error() {
use std::os::unix::ffi::OsStringExt;
let (args, errs) = parse_reader(&b"\
let (args, errs) = parse_reader(
&b"\
quux
foo\xFFbar
baz
"[..]).unwrap();
"[..],
)
.unwrap();
assert!(errs.is_empty());
assert_eq!(args, vec![
OsString::from("quux"),
OsString::from_vec(b"foo\xFFbar".to_vec()),
OsString::from("baz"),
]);
assert_eq!(
args,
vec![
OsString::from("quux"),
OsString::from_vec(b"foo\xFFbar".to_vec()),
OsString::from("baz"),
]
);
}
// ... but test that invalid UTF-8 fails on Windows.
#[test]
#[cfg(not(unix))]
fn error() {
let (args, errs) = parse_reader(&b"\
let (args, errs) = parse_reader(
&b"\
quux
foo\xFFbar
baz
"[..]).unwrap();
"[..],
)
.unwrap();
assert_eq!(errs.len(), 1);
assert_eq!(args, vec![
OsString::from("quux"),
OsString::from("baz"),
]);
assert_eq!(args, vec![OsString::from("quux"), OsString::from("baz"),]);
}
}

View File

@@ -34,19 +34,30 @@ impl Log for Logger {
match (record.file(), record.line()) {
(Some(file), Some(line)) => {
eprintln!(
"{}/{}/{}:{}: {}",
record.level(), record.target(),
file, line, record.args());
"{}|{}|{}:{}: {}",
record.level(),
record.target(),
file,
line,
record.args()
);
}
(Some(file), None) => {
eprintln!(
"{}/{}/{}: {}",
record.level(), record.target(), file, record.args());
"{}|{}|{}: {}",
record.level(),
record.target(),
file,
record.args()
);
}
_ => {
eprintln!(
"{}/{}: {}",
record.level(), record.target(), record.args());
"{}|{}: {}",
record.level(),
record.target(),
record.args()
);
}
}
}

325
crates/core/main.rs Normal file
View File

@@ -0,0 +1,325 @@
use std::error;
use std::io::{self, Write};
use std::process;
use std::sync::Mutex;
use std::time::Instant;
use ignore::WalkState;
use args::Args;
use subject::Subject;
#[macro_use]
mod messages;
mod app;
mod args;
mod config;
mod logger;
mod path_printer;
mod search;
mod subject;
// Since Rust no longer uses jemalloc by default, ripgrep will, by default,
// use the system allocator. On Linux, this would normally be glibc's
// allocator, which is pretty good. In particular, ripgrep does not have a
// particularly allocation heavy workload, so there really isn't much
// difference (for ripgrep's purposes) between glibc's allocator and jemalloc.
//
// However, when ripgrep is built with musl, this means ripgrep will use musl's
// allocator, which appears to be substantially worse. (musl's goal is not to
// have the fastest version of everything. Its goal is to be small and amenable
// to static compilation.) Even though ripgrep isn't particularly allocation
// heavy, musl's allocator appears to slow down ripgrep quite a bit. Therefore,
// when building with musl, we use jemalloc.
//
// We don't unconditionally use jemalloc because it can be nice to use the
// system's default allocator by default. Moreover, jemalloc seems to increase
// compilation times by a bit.
//
// Moreover, we only do this on 64-bit systems since jemalloc doesn't support
// i686.
#[cfg(all(target_env = "musl", target_pointer_width = "64"))]
#[global_allocator]
static ALLOC: jemallocator::Jemalloc = jemallocator::Jemalloc;
type Result<T> = ::std::result::Result<T, Box<dyn error::Error>>;
fn main() {
if let Err(err) = Args::parse().and_then(try_main) {
eprintln!("{}", err);
process::exit(2);
}
}
fn try_main(args: Args) -> Result<()> {
use args::Command::*;
let matched = match args.command()? {
Search => search(&args),
SearchParallel => search_parallel(&args),
SearchNever => Ok(false),
Files => files(&args),
FilesParallel => files_parallel(&args),
Types => types(&args),
PCRE2Version => pcre2_version(&args),
}?;
if matched && (args.quiet() || !messages::errored()) {
process::exit(0)
} else if messages::errored() {
process::exit(2)
} else {
process::exit(1)
}
}
/// The top-level entry point for single-threaded search. This recursively
/// steps through the file list (current directory by default) and searches
/// each file sequentially.
fn search(args: &Args) -> Result<bool> {
let started_at = Instant::now();
let quit_after_match = args.quit_after_match()?;
let subject_builder = args.subject_builder();
let mut stats = args.stats()?;
let mut searcher = args.search_worker(args.stdout())?;
let mut matched = false;
for result in args.walker()? {
let subject = match subject_builder.build_from_result(result) {
Some(subject) => subject,
None => continue,
};
let search_result = match searcher.search(&subject) {
Ok(search_result) => search_result,
Err(err) => {
// A broken pipe means graceful termination.
if err.kind() == io::ErrorKind::BrokenPipe {
break;
}
err_message!("{}: {}", subject.path().display(), err);
continue;
}
};
matched = matched || search_result.has_match();
if let Some(ref mut stats) = stats {
*stats += search_result.stats().unwrap();
}
if matched && quit_after_match {
break;
}
}
if let Some(ref stats) = stats {
let elapsed = Instant::now().duration_since(started_at);
// We don't care if we couldn't print this successfully.
let _ = searcher.print_stats(elapsed, stats);
}
Ok(matched)
}
/// The top-level entry point for multi-threaded search. The parallelism is
/// itself achieved by the recursive directory traversal. All we need to do is
/// feed it a worker for performing a search on each file.
fn search_parallel(args: &Args) -> Result<bool> {
use std::sync::atomic::AtomicBool;
use std::sync::atomic::Ordering::SeqCst;
let quit_after_match = args.quit_after_match()?;
let started_at = Instant::now();
let subject_builder = args.subject_builder();
let bufwtr = args.buffer_writer()?;
let stats = args.stats()?.map(Mutex::new);
let matched = AtomicBool::new(false);
let mut searcher_err = None;
args.walker_parallel()?.run(|| {
let bufwtr = &bufwtr;
let stats = &stats;
let matched = &matched;
let subject_builder = &subject_builder;
let mut searcher = match args.search_worker(bufwtr.buffer()) {
Ok(searcher) => searcher,
Err(err) => {
searcher_err = Some(err);
return Box::new(move |_| WalkState::Quit);
}
};
Box::new(move |result| {
let subject = match subject_builder.build_from_result(result) {
Some(subject) => subject,
None => return WalkState::Continue,
};
searcher.printer().get_mut().clear();
let search_result = match searcher.search(&subject) {
Ok(search_result) => search_result,
Err(err) => {
err_message!("{}: {}", subject.path().display(), err);
return WalkState::Continue;
}
};
if search_result.has_match() {
matched.store(true, SeqCst);
}
if let Some(ref locked_stats) = *stats {
let mut stats = locked_stats.lock().unwrap();
*stats += search_result.stats().unwrap();
}
if let Err(err) = bufwtr.print(searcher.printer().get_mut()) {
// A broken pipe means graceful termination.
if err.kind() == io::ErrorKind::BrokenPipe {
return WalkState::Quit;
}
// Otherwise, we continue on our merry way.
err_message!("{}: {}", subject.path().display(), err);
}
if matched.load(SeqCst) && quit_after_match {
WalkState::Quit
} else {
WalkState::Continue
}
})
});
if let Some(err) = searcher_err.take() {
return Err(err);
}
if let Some(ref locked_stats) = stats {
let elapsed = Instant::now().duration_since(started_at);
let stats = locked_stats.lock().unwrap();
let mut searcher = args.search_worker(args.stdout())?;
// We don't care if we couldn't print this successfully.
let _ = searcher.print_stats(elapsed, &stats);
}
Ok(matched.load(SeqCst))
}
/// The top-level entry point for listing files without searching them. This
/// recursively steps through the file list (current directory by default) and
/// prints each path sequentially using a single thread.
fn files(args: &Args) -> Result<bool> {
let quit_after_match = args.quit_after_match()?;
let subject_builder = args.subject_builder();
let mut matched = false;
let mut path_printer = args.path_printer(args.stdout())?;
for result in args.walker()? {
let subject = match subject_builder.build_from_result(result) {
Some(subject) => subject,
None => continue,
};
matched = true;
if quit_after_match {
break;
}
if let Err(err) = path_printer.write_path(subject.path()) {
// A broken pipe means graceful termination.
if err.kind() == io::ErrorKind::BrokenPipe {
break;
}
// Otherwise, we have some other error that's preventing us from
// writing to stdout, so we should bubble it up.
return Err(err.into());
}
}
Ok(matched)
}
/// The top-level entry point for listing files without searching them. This
/// recursively steps through the file list (current directory by default) and
/// prints each path sequentially using multiple threads.
fn files_parallel(args: &Args) -> Result<bool> {
use std::sync::atomic::AtomicBool;
use std::sync::atomic::Ordering::SeqCst;
use std::sync::mpsc;
use std::thread;
let quit_after_match = args.quit_after_match()?;
let subject_builder = args.subject_builder();
let mut path_printer = args.path_printer(args.stdout())?;
let matched = AtomicBool::new(false);
let (tx, rx) = mpsc::channel::<Subject>();
let print_thread = thread::spawn(move || -> io::Result<()> {
for subject in rx.iter() {
path_printer.write_path(subject.path())?;
}
Ok(())
});
args.walker_parallel()?.run(|| {
let subject_builder = &subject_builder;
let matched = &matched;
let tx = tx.clone();
Box::new(move |result| {
let subject = match subject_builder.build_from_result(result) {
Some(subject) => subject,
None => return WalkState::Continue,
};
matched.store(true, SeqCst);
if quit_after_match {
WalkState::Quit
} else {
match tx.send(subject) {
Ok(_) => WalkState::Continue,
Err(_) => WalkState::Quit,
}
}
})
});
drop(tx);
if let Err(err) = print_thread.join().unwrap() {
// A broken pipe means graceful termination, so fall through.
// Otherwise, something bad happened while writing to stdout, so bubble
// it up.
if err.kind() != io::ErrorKind::BrokenPipe {
return Err(err.into());
}
}
Ok(matched.load(SeqCst))
}
/// The top-level entry point for --type-list.
fn types(args: &Args) -> Result<bool> {
let mut count = 0;
let mut stdout = args.stdout();
for def in args.type_defs()? {
count += 1;
stdout.write_all(def.name().as_bytes())?;
stdout.write_all(b": ")?;
let mut first = true;
for glob in def.globs() {
if !first {
stdout.write_all(b", ")?;
}
stdout.write_all(glob.as_bytes())?;
first = false;
}
stdout.write_all(b"\n")?;
}
Ok(count > 0)
}
/// The top-level entry point for --pcre2-version.
fn pcre2_version(args: &Args) -> Result<bool> {
#[cfg(feature = "pcre2")]
fn imp(args: &Args) -> Result<bool> {
use grep::pcre2;
let mut stdout = args.stdout();
let (major, minor) = pcre2::version();
writeln!(stdout, "PCRE2 {}.{} is available", major, minor)?;
if cfg!(target_pointer_width = "64") && pcre2::is_jit_available() {
writeln!(stdout, "JIT is available")?;
}
Ok(true)
}
#[cfg(not(feature = "pcre2"))]
fn imp(args: &Args) -> Result<bool> {
let mut stdout = args.stdout();
writeln!(stdout, "PCRE2 is not available in this build of ripgrep.")?;
Ok(false)
}
imp(args)
}

74
crates/core/messages.rs Normal file
View File

@@ -0,0 +1,74 @@
use std::sync::atomic::{AtomicBool, Ordering};
static MESSAGES: AtomicBool = AtomicBool::new(false);
static IGNORE_MESSAGES: AtomicBool = AtomicBool::new(false);
static ERRORED: AtomicBool = AtomicBool::new(false);
/// Emit a non-fatal error message, unless messages were disabled.
#[macro_export]
macro_rules! message {
($($tt:tt)*) => {
if crate::messages::messages() {
eprintln!($($tt)*);
}
}
}
/// Like message, but sets ripgrep's "errored" flag, which controls the exit
/// status.
#[macro_export]
macro_rules! err_message {
($($tt:tt)*) => {
crate::messages::set_errored();
message!($($tt)*);
}
}
/// Emit a non-fatal ignore-related error message (like a parse error), unless
/// ignore-messages were disabled.
#[macro_export]
macro_rules! ignore_message {
($($tt:tt)*) => {
if crate::messages::messages() && crate::messages::ignore_messages() {
eprintln!($($tt)*);
}
}
}
/// Returns true if and only if messages should be shown.
pub fn messages() -> bool {
MESSAGES.load(Ordering::SeqCst)
}
/// Set whether messages should be shown or not.
///
/// By default, they are not shown.
pub fn set_messages(yes: bool) {
MESSAGES.store(yes, Ordering::SeqCst)
}
/// Returns true if and only if "ignore" related messages should be shown.
pub fn ignore_messages() -> bool {
IGNORE_MESSAGES.load(Ordering::SeqCst)
}
/// Set whether "ignore" related messages should be shown or not.
///
/// By default, they are not shown.
///
/// Note that this is overridden if `messages` is disabled. Namely, if
/// `messages` is disabled, then "ignore" messages are never shown, regardless
/// of this setting.
pub fn set_ignore_messages(yes: bool) {
IGNORE_MESSAGES.store(yes, Ordering::SeqCst)
}
/// Returns true if and only if ripgrep came across a non-fatal error.
pub fn errored() -> bool {
ERRORED.load(Ordering::SeqCst)
}
/// Indicate that ripgrep has come across a non-fatal error.
pub fn set_errored() {
ERRORED.store(true, Ordering::SeqCst);
}

View File

@@ -0,0 +1,98 @@
use std::io;
use std::path::Path;
use grep::printer::{ColorSpecs, PrinterPath};
use termcolor::WriteColor;
/// A configuration for describing how paths should be written.
#[derive(Clone, Debug)]
struct Config {
colors: ColorSpecs,
separator: Option<u8>,
terminator: u8,
}
impl Default for Config {
fn default() -> Config {
Config {
colors: ColorSpecs::default(),
separator: None,
terminator: b'\n',
}
}
}
/// A builder for constructing things to search over.
#[derive(Clone, Debug)]
pub struct PathPrinterBuilder {
config: Config,
}
impl PathPrinterBuilder {
/// Return a new subject builder with a default configuration.
pub fn new() -> PathPrinterBuilder {
PathPrinterBuilder { config: Config::default() }
}
/// Create a new path printer with the current configuration that writes
/// paths to the given writer.
pub fn build<W: WriteColor>(&self, wtr: W) -> PathPrinter<W> {
PathPrinter { config: self.config.clone(), wtr }
}
/// Set the color specification for this printer.
///
/// Currently, only the `path` component of the given specification is
/// used.
pub fn color_specs(
&mut self,
specs: ColorSpecs,
) -> &mut PathPrinterBuilder {
self.config.colors = specs;
self
}
/// A path separator.
///
/// When provided, the path's default separator will be replaced with
/// the given separator.
///
/// This is not set by default, and the system's default path separator
/// will be used.
pub fn separator(&mut self, sep: Option<u8>) -> &mut PathPrinterBuilder {
self.config.separator = sep;
self
}
/// A path terminator.
///
/// When printing a path, it will be by terminated by the given byte.
///
/// This is set to `\n` by default.
pub fn terminator(&mut self, terminator: u8) -> &mut PathPrinterBuilder {
self.config.terminator = terminator;
self
}
}
/// A printer for emitting paths to a writer, with optional color support.
#[derive(Debug)]
pub struct PathPrinter<W> {
config: Config,
wtr: W,
}
impl<W: WriteColor> PathPrinter<W> {
/// Write the given path to the underlying writer.
pub fn write_path(&mut self, path: &Path) -> io::Result<()> {
let ppath = PrinterPath::with_separator(path, self.config.separator);
if !self.wtr.supports_color() {
self.wtr.write_all(ppath.as_bytes())?;
} else {
self.wtr.set_color(self.config.colors.path())?;
self.wtr.write_all(ppath.as_bytes())?;
self.wtr.reset()?;
}
self.wtr.write_all(&[self.config.terminator])
}
}

534
crates/core/search.rs Normal file
View File

@@ -0,0 +1,534 @@
use std::fs::File;
use std::io;
use std::path::{Path, PathBuf};
use std::process::{Command, Stdio};
use std::time::Duration;
use grep::cli;
use grep::matcher::Matcher;
#[cfg(feature = "pcre2")]
use grep::pcre2::RegexMatcher as PCRE2RegexMatcher;
use grep::printer::{Standard, Stats, Summary, JSON};
use grep::regex::RegexMatcher as RustRegexMatcher;
use grep::searcher::{BinaryDetection, Searcher};
use ignore::overrides::Override;
use serde_json as json;
use serde_json::json;
use termcolor::WriteColor;
use crate::subject::Subject;
/// The configuration for the search worker. Among a few other things, the
/// configuration primarily controls the way we show search results to users
/// at a very high level.
#[derive(Clone, Debug)]
struct Config {
json_stats: bool,
preprocessor: Option<PathBuf>,
preprocessor_globs: Override,
search_zip: bool,
binary_implicit: BinaryDetection,
binary_explicit: BinaryDetection,
}
impl Default for Config {
fn default() -> Config {
Config {
json_stats: false,
preprocessor: None,
preprocessor_globs: Override::empty(),
search_zip: false,
binary_implicit: BinaryDetection::none(),
binary_explicit: BinaryDetection::none(),
}
}
}
/// A builder for configuring and constructing a search worker.
#[derive(Clone, Debug)]
pub struct SearchWorkerBuilder {
config: Config,
command_builder: cli::CommandReaderBuilder,
decomp_builder: cli::DecompressionReaderBuilder,
}
impl Default for SearchWorkerBuilder {
fn default() -> SearchWorkerBuilder {
SearchWorkerBuilder::new()
}
}
impl SearchWorkerBuilder {
/// Create a new builder for configuring and constructing a search worker.
pub fn new() -> SearchWorkerBuilder {
let mut cmd_builder = cli::CommandReaderBuilder::new();
cmd_builder.async_stderr(true);
let mut decomp_builder = cli::DecompressionReaderBuilder::new();
decomp_builder.async_stderr(true);
SearchWorkerBuilder {
config: Config::default(),
command_builder: cmd_builder,
decomp_builder,
}
}
/// Create a new search worker using the given searcher, matcher and
/// printer.
pub fn build<W: WriteColor>(
&self,
matcher: PatternMatcher,
searcher: Searcher,
printer: Printer<W>,
) -> SearchWorker<W> {
let config = self.config.clone();
let command_builder = self.command_builder.clone();
let decomp_builder = self.decomp_builder.clone();
SearchWorker {
config,
command_builder,
decomp_builder,
matcher,
searcher,
printer,
}
}
/// Forcefully use JSON to emit statistics, even if the underlying printer
/// is not the JSON printer.
///
/// This is useful for implementing flag combinations like
/// `--json --quiet`, which uses the summary printer for implementing
/// `--quiet` but still wants to emit summary statistics, which should
/// be JSON formatted because of the `--json` flag.
pub fn json_stats(&mut self, yes: bool) -> &mut SearchWorkerBuilder {
self.config.json_stats = yes;
self
}
/// Set the path to a preprocessor command.
///
/// When this is set, instead of searching files directly, the given
/// command will be run with the file path as the first argument, and the
/// output of that command will be searched instead.
pub fn preprocessor(
&mut self,
cmd: Option<PathBuf>,
) -> &mut SearchWorkerBuilder {
self.config.preprocessor = cmd;
self
}
/// Set the globs for determining which files should be run through the
/// preprocessor. By default, with no globs and a preprocessor specified,
/// every file is run through the preprocessor.
pub fn preprocessor_globs(
&mut self,
globs: Override,
) -> &mut SearchWorkerBuilder {
self.config.preprocessor_globs = globs;
self
}
/// Enable the decompression and searching of common compressed files.
///
/// When enabled, if a particular file path is recognized as a compressed
/// file, then it is decompressed before searching.
///
/// Note that if a preprocessor command is set, then it overrides this
/// setting.
pub fn search_zip(&mut self, yes: bool) -> &mut SearchWorkerBuilder {
self.config.search_zip = yes;
self
}
/// Set the binary detection that should be used when searching files
/// found via a recursive directory search.
///
/// Generally, this binary detection may be `BinaryDetection::quit` if
/// we want to skip binary files completely.
///
/// By default, no binary detection is performed.
pub fn binary_detection_implicit(
&mut self,
detection: BinaryDetection,
) -> &mut SearchWorkerBuilder {
self.config.binary_implicit = detection;
self
}
/// Set the binary detection that should be used when searching files
/// explicitly supplied by an end user.
///
/// Generally, this binary detection should NOT be `BinaryDetection::quit`,
/// since we never want to automatically filter files supplied by the end
/// user.
///
/// By default, no binary detection is performed.
pub fn binary_detection_explicit(
&mut self,
detection: BinaryDetection,
) -> &mut SearchWorkerBuilder {
self.config.binary_explicit = detection;
self
}
}
/// The result of executing a search.
///
/// Generally speaking, the "result" of a search is sent to a printer, which
/// writes results to an underlying writer such as stdout or a file. However,
/// every search also has some aggregate statistics or meta data that may be
/// useful to higher level routines.
#[derive(Clone, Debug, Default)]
pub struct SearchResult {
has_match: bool,
stats: Option<Stats>,
}
impl SearchResult {
/// Whether the search found a match or not.
pub fn has_match(&self) -> bool {
self.has_match
}
/// Return aggregate search statistics for a single search, if available.
///
/// It can be expensive to compute statistics, so these are only present
/// if explicitly enabled in the printer provided by the caller.
pub fn stats(&self) -> Option<&Stats> {
self.stats.as_ref()
}
}
/// The pattern matcher used by a search worker.
#[derive(Clone, Debug)]
pub enum PatternMatcher {
RustRegex(RustRegexMatcher),
#[cfg(feature = "pcre2")]
PCRE2(PCRE2RegexMatcher),
}
/// The printer used by a search worker.
///
/// The `W` type parameter refers to the type of the underlying writer.
#[derive(Debug)]
pub enum Printer<W> {
/// Use the standard printer, which supports the classic grep-like format.
Standard(Standard<W>),
/// Use the summary printer, which supports aggregate displays of search
/// results.
Summary(Summary<W>),
/// A JSON printer, which emits results in the JSON Lines format.
JSON(JSON<W>),
}
impl<W: WriteColor> Printer<W> {
fn print_stats(
&mut self,
total_duration: Duration,
stats: &Stats,
) -> io::Result<()> {
match *self {
Printer::JSON(_) => self.print_stats_json(total_duration, stats),
Printer::Standard(_) | Printer::Summary(_) => {
self.print_stats_human(total_duration, stats)
}
}
}
fn print_stats_human(
&mut self,
total_duration: Duration,
stats: &Stats,
) -> io::Result<()> {
write!(
self.get_mut(),
"
{matches} matches
{lines} matched lines
{searches_with_match} files contained matches
{searches} files searched
{bytes_printed} bytes printed
{bytes_searched} bytes searched
{search_time:0.6} seconds spent searching
{process_time:0.6} seconds
",
matches = stats.matches(),
lines = stats.matched_lines(),
searches_with_match = stats.searches_with_match(),
searches = stats.searches(),
bytes_printed = stats.bytes_printed(),
bytes_searched = stats.bytes_searched(),
search_time = fractional_seconds(stats.elapsed()),
process_time = fractional_seconds(total_duration)
)
}
fn print_stats_json(
&mut self,
total_duration: Duration,
stats: &Stats,
) -> io::Result<()> {
// We specifically match the format laid out by the JSON printer in
// the grep-printer crate. We simply "extend" it with the 'summary'
// message type.
let fractional = fractional_seconds(total_duration);
json::to_writer(
self.get_mut(),
&json!({
"type": "summary",
"data": {
"stats": stats,
"elapsed_total": {
"secs": total_duration.as_secs(),
"nanos": total_duration.subsec_nanos(),
"human": format!("{:0.6}s", fractional),
},
}
}),
)?;
write!(self.get_mut(), "\n")
}
/// Return a mutable reference to the underlying printer's writer.
pub fn get_mut(&mut self) -> &mut W {
match *self {
Printer::Standard(ref mut p) => p.get_mut(),
Printer::Summary(ref mut p) => p.get_mut(),
Printer::JSON(ref mut p) => p.get_mut(),
}
}
}
/// A worker for executing searches.
///
/// It is intended for a single worker to execute many searches, and is
/// generally intended to be used from a single thread. When searching using
/// multiple threads, it is better to create a new worker for each thread.
#[derive(Debug)]
pub struct SearchWorker<W> {
config: Config,
command_builder: cli::CommandReaderBuilder,
decomp_builder: cli::DecompressionReaderBuilder,
matcher: PatternMatcher,
searcher: Searcher,
printer: Printer<W>,
}
impl<W: WriteColor> SearchWorker<W> {
/// Execute a search over the given subject.
pub fn search(&mut self, subject: &Subject) -> io::Result<SearchResult> {
let bin = if subject.is_explicit() {
self.config.binary_explicit.clone()
} else {
self.config.binary_implicit.clone()
};
self.searcher.set_binary_detection(bin);
let path = subject.path();
if subject.is_stdin() {
self.search_reader(path, io::stdin().lock())
} else if self.should_preprocess(path) {
self.search_preprocessor(path)
} else if self.should_decompress(path) {
self.search_decompress(path)
} else {
self.search_path(path)
}
}
/// Return a mutable reference to the underlying printer.
pub fn printer(&mut self) -> &mut Printer<W> {
&mut self.printer
}
/// Print the given statistics to the underlying writer in a way that is
/// consistent with this searcher's printer's format.
///
/// While `Stats` contains a duration itself, this only corresponds to the
/// time spent searching, where as `total_duration` should roughly
/// approximate the lifespan of the ripgrep process itself.
pub fn print_stats(
&mut self,
total_duration: Duration,
stats: &Stats,
) -> io::Result<()> {
if self.config.json_stats {
self.printer().print_stats_json(total_duration, stats)
} else {
self.printer().print_stats(total_duration, stats)
}
}
/// Returns true if and only if the given file path should be
/// decompressed before searching.
fn should_decompress(&self, path: &Path) -> bool {
if !self.config.search_zip {
return false;
}
self.decomp_builder.get_matcher().has_command(path)
}
/// Returns true if and only if the given file path should be run through
/// the preprocessor.
fn should_preprocess(&self, path: &Path) -> bool {
if !self.config.preprocessor.is_some() {
return false;
}
if self.config.preprocessor_globs.is_empty() {
return true;
}
!self.config.preprocessor_globs.matched(path, false).is_ignore()
}
/// Search the given file path by first asking the preprocessor for the
/// data to search instead of opening the path directly.
fn search_preprocessor(
&mut self,
path: &Path,
) -> io::Result<SearchResult> {
let bin = self.config.preprocessor.as_ref().unwrap();
let mut cmd = Command::new(bin);
cmd.arg(path).stdin(Stdio::from(File::open(path)?));
let rdr = self.command_builder.build(&mut cmd).map_err(|err| {
io::Error::new(
io::ErrorKind::Other,
format!(
"preprocessor command could not start: '{:?}': {}",
cmd, err,
),
)
})?;
self.search_reader(path, rdr).map_err(|err| {
io::Error::new(
io::ErrorKind::Other,
format!("preprocessor command failed: '{:?}': {}", cmd, err),
)
})
}
/// Attempt to decompress the data at the given file path and search the
/// result. If the given file path isn't recognized as a compressed file,
/// then search it without doing any decompression.
fn search_decompress(&mut self, path: &Path) -> io::Result<SearchResult> {
let rdr = self.decomp_builder.build(path)?;
self.search_reader(path, rdr)
}
/// Search the contents of the given file path.
fn search_path(&mut self, path: &Path) -> io::Result<SearchResult> {
use self::PatternMatcher::*;
let (searcher, printer) = (&mut self.searcher, &mut self.printer);
match self.matcher {
RustRegex(ref m) => search_path(m, searcher, printer, path),
#[cfg(feature = "pcre2")]
PCRE2(ref m) => search_path(m, searcher, printer, path),
}
}
/// Executes a search on the given reader, which may or may not correspond
/// directly to the contents of the given file path. Instead, the reader
/// may actually cause something else to be searched (for example, when
/// a preprocessor is set or when decompression is enabled). In those
/// cases, the file path is used for visual purposes only.
///
/// Generally speaking, this method should only be used when there is no
/// other choice. Searching via `search_path` provides more opportunities
/// for optimizations (such as memory maps).
fn search_reader<R: io::Read>(
&mut self,
path: &Path,
rdr: R,
) -> io::Result<SearchResult> {
use self::PatternMatcher::*;
let (searcher, printer) = (&mut self.searcher, &mut self.printer);
match self.matcher {
RustRegex(ref m) => search_reader(m, searcher, printer, path, rdr),
#[cfg(feature = "pcre2")]
PCRE2(ref m) => search_reader(m, searcher, printer, path, rdr),
}
}
}
/// Search the contents of the given file path using the given matcher,
/// searcher and printer.
fn search_path<M: Matcher, W: WriteColor>(
matcher: M,
searcher: &mut Searcher,
printer: &mut Printer<W>,
path: &Path,
) -> io::Result<SearchResult> {
match *printer {
Printer::Standard(ref mut p) => {
let mut sink = p.sink_with_path(&matcher, path);
searcher.search_path(&matcher, path, &mut sink)?;
Ok(SearchResult {
has_match: sink.has_match(),
stats: sink.stats().map(|s| s.clone()),
})
}
Printer::Summary(ref mut p) => {
let mut sink = p.sink_with_path(&matcher, path);
searcher.search_path(&matcher, path, &mut sink)?;
Ok(SearchResult {
has_match: sink.has_match(),
stats: sink.stats().map(|s| s.clone()),
})
}
Printer::JSON(ref mut p) => {
let mut sink = p.sink_with_path(&matcher, path);
searcher.search_path(&matcher, path, &mut sink)?;
Ok(SearchResult {
has_match: sink.has_match(),
stats: Some(sink.stats().clone()),
})
}
}
}
/// Search the contents of the given reader using the given matcher, searcher
/// and printer.
fn search_reader<M: Matcher, R: io::Read, W: WriteColor>(
matcher: M,
searcher: &mut Searcher,
printer: &mut Printer<W>,
path: &Path,
rdr: R,
) -> io::Result<SearchResult> {
match *printer {
Printer::Standard(ref mut p) => {
let mut sink = p.sink_with_path(&matcher, path);
searcher.search_reader(&matcher, rdr, &mut sink)?;
Ok(SearchResult {
has_match: sink.has_match(),
stats: sink.stats().map(|s| s.clone()),
})
}
Printer::Summary(ref mut p) => {
let mut sink = p.sink_with_path(&matcher, path);
searcher.search_reader(&matcher, rdr, &mut sink)?;
Ok(SearchResult {
has_match: sink.has_match(),
stats: sink.stats().map(|s| s.clone()),
})
}
Printer::JSON(ref mut p) => {
let mut sink = p.sink_with_path(&matcher, path);
searcher.search_reader(&matcher, rdr, &mut sink)?;
Ok(SearchResult {
has_match: sink.has_match(),
stats: Some(sink.stats().clone()),
})
}
}
}
/// Return the given duration as fractional seconds.
fn fractional_seconds(duration: Duration) -> f64 {
(duration.as_secs() as f64) + (duration.subsec_nanos() as f64 * 1e-9)
}

160
crates/core/subject.rs Normal file
View File

@@ -0,0 +1,160 @@
use std::path::Path;
use ignore::{self, DirEntry};
use log;
/// A configuration for describing how subjects should be built.
#[derive(Clone, Debug)]
struct Config {
strip_dot_prefix: bool,
}
impl Default for Config {
fn default() -> Config {
Config { strip_dot_prefix: false }
}
}
/// A builder for constructing things to search over.
#[derive(Clone, Debug)]
pub struct SubjectBuilder {
config: Config,
}
impl SubjectBuilder {
/// Return a new subject builder with a default configuration.
pub fn new() -> SubjectBuilder {
SubjectBuilder { config: Config::default() }
}
/// Create a new subject from a possibly missing directory entry.
///
/// If the directory entry isn't present, then the corresponding error is
/// logged if messages have been configured. Otherwise, if the subject is
/// deemed searchable, then it is returned.
pub fn build_from_result(
&self,
result: Result<DirEntry, ignore::Error>,
) -> Option<Subject> {
match result {
Ok(dent) => self.build(dent),
Err(err) => {
err_message!("{}", err);
None
}
}
}
/// Create a new subject using this builder's configuration.
///
/// If a subject could not be created or should otherwise not be searched,
/// then this returns `None` after emitting any relevant log messages.
pub fn build(&self, dent: DirEntry) -> Option<Subject> {
let subj =
Subject { dent, strip_dot_prefix: self.config.strip_dot_prefix };
if let Some(ignore_err) = subj.dent.error() {
ignore_message!("{}", ignore_err);
}
// If this entry was explicitly provided by an end user, then we always
// want to search it.
if subj.is_explicit() {
return Some(subj);
}
// At this point, we only want to search something if it's explicitly a
// file. This omits symlinks. (If ripgrep was configured to follow
// symlinks, then they have already been followed by the directory
// traversal.)
if subj.is_file() {
return Some(subj);
}
// We got nothin. Emit a debug message, but only if this isn't a
// directory. Otherwise, emitting messages for directories is just
// noisy.
if !subj.is_dir() {
log::debug!(
"ignoring {}: failed to pass subject filter: \
file type: {:?}, metadata: {:?}",
subj.dent.path().display(),
subj.dent.file_type(),
subj.dent.metadata()
);
}
None
}
/// When enabled, if the subject's file path starts with `./` then it is
/// stripped.
///
/// This is useful when implicitly searching the current working directory.
pub fn strip_dot_prefix(&mut self, yes: bool) -> &mut SubjectBuilder {
self.config.strip_dot_prefix = yes;
self
}
}
/// A subject is a thing we want to search. Generally, a subject is either a
/// file or stdin.
#[derive(Clone, Debug)]
pub struct Subject {
dent: DirEntry,
strip_dot_prefix: bool,
}
impl Subject {
/// Return the file path corresponding to this subject.
///
/// If this subject corresponds to stdin, then a special `<stdin>` path
/// is returned instead.
pub fn path(&self) -> &Path {
if self.strip_dot_prefix && self.dent.path().starts_with("./") {
self.dent.path().strip_prefix("./").unwrap()
} else {
self.dent.path()
}
}
/// Returns true if and only if this entry corresponds to stdin.
pub fn is_stdin(&self) -> bool {
self.dent.is_stdin()
}
/// Returns true if and only if this entry corresponds to a subject to
/// search that was explicitly supplied by an end user.
///
/// Generally, this corresponds to either stdin or an explicit file path
/// argument. e.g., in `rg foo some-file ./some-dir/`, `some-file` is
/// an explicit subject, but, e.g., `./some-dir/some-other-file` is not.
///
/// However, note that ripgrep does not see through shell globbing. e.g.,
/// in `rg foo ./some-dir/*`, `./some-dir/some-other-file` will be treated
/// as an explicit subject.
pub fn is_explicit(&self) -> bool {
// stdin is obvious. When an entry has a depth of 0, that means it
// was explicitly provided to our directory iterator, which means it
// was in turn explicitly provided by the end user. The !is_dir check
// means that we want to search files even if their symlinks, again,
// because they were explicitly provided. (And we never want to try
// to search a directory.)
self.is_stdin() || (self.dent.depth() == 0 && !self.is_dir())
}
/// Returns true if and only if this subject points to a directory after
/// following symbolic links.
fn is_dir(&self) -> bool {
let ft = match self.dent.file_type() {
None => return false,
Some(ft) => ft,
};
if ft.is_dir() {
return true;
}
// If this is a symlink, then we want to follow it to determine
// whether it's a directory or not.
self.dent.path_is_symlink() && self.dent.path().is_dir()
}
/// Returns true if and only if this subject points to a file.
fn is_file(&self) -> bool {
self.dent.file_type().map_or(false, |ft| ft.is_file())
}
}

View File

@@ -1,6 +1,6 @@
[package]
name = "globset"
version = "0.3.0" #:version
version = "0.4.5" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
Cross platform single glob and glob set matching. Glob set matching is the
@@ -8,8 +8,8 @@ process of matching one or more glob patterns against a single candidate path
simultaneously, and returning all of the globs that matched.
"""
documentation = "https://docs.rs/globset"
homepage = "https://github.com/BurntSushi/ripgrep/tree/master/globset"
repository = "https://github.com/BurntSushi/ripgrep/tree/master/globset"
homepage = "https://github.com/BurntSushi/ripgrep/tree/master/crates/globset"
repository = "https://github.com/BurntSushi/ripgrep/tree/master/crates/globset"
readme = "README.md"
keywords = ["regex", "glob", "multiple", "set", "pattern"]
license = "Unlicense/MIT"
@@ -19,14 +19,18 @@ name = "globset"
bench = false
[dependencies]
aho-corasick = "0.6.0"
fnv = "1.0"
log = "0.4"
memchr = "2"
regex = "0.2.1"
aho-corasick = "0.7.3"
bstr = { version = "0.2.0", default-features = false, features = ["std"] }
fnv = "1.0.6"
log = "0.4.5"
regex = "1.1.5"
serde = { version = "1.0.104", optional = true }
[dev-dependencies]
glob = "0.2"
glob = "0.3.0"
lazy_static = "1"
serde_json = "1.0.45"
[features]
simd-accel = ["regex/simd-accel"]
simd-accel = []
serde1 = ["serde"]

View File

@@ -4,7 +4,7 @@ Cross platform single glob and glob set matching. Glob set matching is the
process of matching one or more glob patterns against a single candidate path
simultaneously, and returning all of the globs that matched.
[![Linux build status](https://api.travis-ci.org/BurntSushi/ripgrep.png)](https://travis-ci.org/BurntSushi/ripgrep)
[![Linux build status](https://api.travis-ci.org/BurntSushi/ripgrep.svg)](https://travis-ci.org/BurntSushi/ripgrep)
[![Windows build status](https://ci.appveyor.com/api/projects/status/github/BurntSushi/ripgrep?svg=true)](https://ci.appveyor.com/project/BurntSushi/ripgrep)
[![](https://img.shields.io/crates/v/globset.svg)](https://crates.io/crates/globset)
@@ -20,7 +20,7 @@ Add this to your `Cargo.toml`:
```toml
[dependencies]
globset = "0.2"
globset = "0.3"
```
and this to your crate root:
@@ -29,6 +29,10 @@ and this to your crate root:
extern crate globset;
```
### Features
* `serde1`: Enables implementing Serde traits on the `Glob` type.
### Example: one glob
This example shows how to match a single glob against a single file path.

View File

@@ -6,14 +6,9 @@ tool itself, see the benchsuite directory.
extern crate glob;
extern crate globset;
#[macro_use]
extern crate lazy_static;
extern crate regex;
extern crate test;
use std::ffi::OsStr;
use std::path::Path;
use globset::{Candidate, Glob, GlobMatcher, GlobSet, GlobSetBuilder};
const EXT: &'static str = "some/a/bigger/path/to/the/crazy/needle.txt";

View File

@@ -2,13 +2,13 @@ use std::fmt;
use std::hash;
use std::iter;
use std::ops::{Deref, DerefMut};
use std::path::{Path, is_separator};
use std::path::{is_separator, Path};
use std::str;
use regex;
use regex::bytes::Regex;
use {Candidate, Error, ErrorKind, new_regex};
use {new_regex, Candidate, Error, ErrorKind};
/// Describes a matching strategy for a particular pattern.
///
@@ -85,16 +85,16 @@ pub struct Glob {
}
impl PartialEq for Glob {
fn eq(&self, other: &Glob) -> bool {
self.glob == other.glob && self.opts == other.opts
}
fn eq(&self, other: &Glob) -> bool {
self.glob == other.glob && self.opts == other.opts
}
}
impl hash::Hash for Glob {
fn hash<H: hash::Hasher>(&self, state: &mut H) {
self.glob.hash(state);
self.opts.hash(state);
}
fn hash<H: hash::Hasher>(&self, state: &mut H) {
self.glob.hash(state);
self.opts.hash(state);
}
}
impl fmt::Display for Glob {
@@ -103,6 +103,14 @@ impl fmt::Display for Glob {
}
}
impl str::FromStr for Glob {
type Err = Error;
fn from_str(glob: &str) -> Result<Self, Self::Err> {
Self::new(glob)
}
}
/// A matcher for a single pattern.
#[derive(Clone, Debug)]
pub struct GlobMatcher {
@@ -122,6 +130,11 @@ impl GlobMatcher {
pub fn is_match_candidate(&self, path: &Candidate) -> bool {
self.re.is_match(&path.path)
}
/// Returns the `Glob` used to compile this matcher.
pub fn glob(&self) -> &Glob {
&self.pat
}
}
/// A strategic matcher for a single pattern.
@@ -187,13 +200,26 @@ pub struct GlobBuilder<'a> {
opts: GlobOptions,
}
#[derive(Clone, Copy, Debug, Default, Eq, Hash, PartialEq)]
#[derive(Clone, Copy, Debug, Eq, Hash, PartialEq)]
struct GlobOptions {
/// Whether to match case insensitively.
case_insensitive: bool,
/// Whether to require a literal separator to match a separator in a file
/// path. e.g., when enabled, `*` won't match `/`.
literal_separator: bool,
/// Whether or not to use `\` to escape special characters.
/// e.g., when enabled, `\*` will match a literal `*`.
backslash_escape: bool,
}
impl GlobOptions {
fn default() -> GlobOptions {
GlobOptions {
case_insensitive: false,
literal_separator: false,
backslash_escape: !is_separator('\\'),
}
}
}
#[derive(Clone, Debug, Default, Eq, PartialEq)]
@@ -201,11 +227,15 @@ struct Tokens(Vec<Token>);
impl Deref for Tokens {
type Target = Vec<Token>;
fn deref(&self) -> &Vec<Token> { &self.0 }
fn deref(&self) -> &Vec<Token> {
&self.0
}
}
impl DerefMut for Tokens {
fn deref_mut(&mut self) -> &mut Vec<Token> { &mut self.0 }
fn deref_mut(&mut self) -> &mut Vec<Token> {
&mut self.0
}
}
#[derive(Clone, Debug, Eq, PartialEq)]
@@ -216,10 +246,7 @@ enum Token {
RecursivePrefix,
RecursiveSuffix,
RecursiveZeroOrMore,
Class {
negated: bool,
ranges: Vec<(char, char)>,
},
Class { negated: bool, ranges: Vec<(char, char)> },
Alternates(Vec<Tokens>),
}
@@ -231,12 +258,9 @@ impl Glob {
/// Returns a matcher for this pattern.
pub fn compile_matcher(&self) -> GlobMatcher {
let re = new_regex(&self.re)
.expect("regex compilation shouldn't fail");
GlobMatcher {
pat: self.clone(),
re: re,
}
let re =
new_regex(&self.re).expect("regex compilation shouldn't fail");
GlobMatcher { pat: self.clone(), re: re }
}
/// Returns a strategic matcher.
@@ -247,13 +271,9 @@ impl Glob {
#[cfg(test)]
fn compile_strategic_matcher(&self) -> GlobStrategic {
let strategy = MatchStrategy::new(self);
let re = new_regex(&self.re)
.expect("regex compilation shouldn't fail");
GlobStrategic {
strategy: strategy,
pat: self.clone(),
re: re,
}
let re =
new_regex(&self.re).expect("regex compilation shouldn't fail");
GlobStrategic { strategy: strategy, pat: self.clone(), re: re }
}
/// Returns the original glob pattern used to build this pattern.
@@ -262,6 +282,19 @@ impl Glob {
}
/// Returns the regular expression string for this glob.
///
/// Note that regular expressions for globs are intended to be matched on
/// arbitrary bytes (`&[u8]`) instead of Unicode strings (`&str`). In
/// particular, globs are frequently used on file paths, where there is no
/// general guarantee that file paths are themselves valid UTF-8. As a
/// result, callers will need to ensure that they are using a regex API
/// that can match on arbitrary bytes. For example, the
/// [`regex`](https://crates.io/regex)
/// crate's
/// [`Regex`](https://docs.rs/regex/*/regex/struct.Regex.html)
/// API is not suitable for this since it matches on `&str`, but its
/// [`bytes::Regex`](https://docs.rs/regex/*/regex/bytes/struct.Regex.html)
/// API is suitable for this.
pub fn regex(&self) -> &str {
&self.re
}
@@ -498,7 +531,7 @@ impl Glob {
| Token::RecursiveZeroOrMore => {
return None;
}
Token::Class{..} | Token::Alternates(..) => {
Token::Class { .. } | Token::Alternates(..) => {
// We *could* be a little smarter here, but either one
// of these is going to prevent our literal optimizations
// anyway, so give up.
@@ -535,10 +568,7 @@ impl<'a> GlobBuilder<'a> {
///
/// The pattern is not compiled until `build` is called.
pub fn new(glob: &'a str) -> GlobBuilder<'a> {
GlobBuilder {
glob: glob,
opts: GlobOptions::default(),
}
GlobBuilder { glob: glob, opts: GlobOptions::default() }
}
/// Parses and builds the pattern.
@@ -549,6 +579,7 @@ impl<'a> GlobBuilder<'a> {
chars: self.glob.chars().peekable(),
prev: None,
cur: None,
opts: &self.opts,
};
p.parse()?;
if p.stack.is_empty() {
@@ -585,6 +616,19 @@ impl<'a> GlobBuilder<'a> {
self.opts.literal_separator = yes;
self
}
/// When enabled, a back slash (`\`) may be used to escape
/// special characters in a glob pattern. Additionally, this will
/// prevent `\` from being interpreted as a path separator on all
/// platforms.
///
/// This is enabled by default on platforms where `\` is not a
/// path separator and disabled by default on platforms where `\`
/// is a path separator.
pub fn backslash_escape(&mut self, yes: bool) -> &mut GlobBuilder<'a> {
self.opts.backslash_escape = yes;
self
}
}
impl Tokens {
@@ -710,6 +754,7 @@ struct Parser<'a> {
chars: iter::Peekable<str::Chars<'a>>,
prev: Option<char>,
cur: Option<char>,
opts: &'a GlobOptions,
}
impl<'a> Parser<'a> {
@@ -726,14 +771,8 @@ impl<'a> Parser<'a> {
'{' => self.push_alternate()?,
'}' => self.pop_alternate()?,
',' => self.parse_comma()?,
c => {
if is_separator(c) {
// Normalize all patterns to use / as a separator.
self.push_token(Token::Literal('/'))?
} else {
self.push_token(Token::Literal(c))?
}
}
'\\' => self.parse_backslash()?,
c => self.push_token(Token::Literal(c))?,
}
}
Ok(())
@@ -786,42 +825,79 @@ impl<'a> Parser<'a> {
}
}
fn parse_backslash(&mut self) -> Result<(), Error> {
if self.opts.backslash_escape {
match self.bump() {
None => Err(self.error(ErrorKind::DanglingEscape)),
Some(c) => self.push_token(Token::Literal(c)),
}
} else if is_separator('\\') {
// Normalize all patterns to use / as a separator.
self.push_token(Token::Literal('/'))
} else {
self.push_token(Token::Literal('\\'))
}
}
fn parse_star(&mut self) -> Result<(), Error> {
let prev = self.prev;
if self.chars.peek() != Some(&'*') {
if self.peek() != Some('*') {
self.push_token(Token::ZeroOrMore)?;
return Ok(());
}
assert!(self.bump() == Some('*'));
if !self.have_tokens()? {
self.push_token(Token::RecursivePrefix)?;
let next = self.bump();
if !next.map(is_separator).unwrap_or(true) {
return Err(self.error(ErrorKind::InvalidRecursive));
if !self.peek().map_or(true, is_separator) {
self.push_token(Token::ZeroOrMore)?;
self.push_token(Token::ZeroOrMore)?;
} else {
self.push_token(Token::RecursivePrefix)?;
assert!(self.bump().map_or(true, is_separator));
}
return Ok(());
}
self.pop_token()?;
if !prev.map(is_separator).unwrap_or(false) {
if self.stack.len() <= 1
|| (prev != Some(',') && prev != Some('{')) {
return Err(self.error(ErrorKind::InvalidRecursive));
|| (prev != Some(',') && prev != Some('{'))
{
self.push_token(Token::ZeroOrMore)?;
self.push_token(Token::ZeroOrMore)?;
return Ok(());
}
}
match self.chars.peek() {
let is_suffix = match self.peek() {
None => {
assert!(self.bump().is_none());
self.push_token(Token::RecursiveSuffix)
true
}
Some(&',') | Some(&'}') if self.stack.len() >= 2 => {
self.push_token(Token::RecursiveSuffix)
}
Some(&c) if is_separator(c) => {
Some(',') | Some('}') if self.stack.len() >= 2 => true,
Some(c) if is_separator(c) => {
assert!(self.bump().map(is_separator).unwrap_or(false));
self.push_token(Token::RecursiveZeroOrMore)
false
}
_ => {
self.push_token(Token::ZeroOrMore)?;
self.push_token(Token::ZeroOrMore)?;
return Ok(());
}
};
match self.pop_token()? {
Token::RecursivePrefix => {
self.push_token(Token::RecursivePrefix)?;
}
Token::RecursiveSuffix => {
self.push_token(Token::RecursiveSuffix)?;
}
_ => {
if is_suffix {
self.push_token(Token::RecursiveSuffix)?;
} else {
self.push_token(Token::RecursiveZeroOrMore)?;
}
}
_ => Err(self.error(ErrorKind::InvalidRecursive)),
}
Ok(())
}
fn parse_class(&mut self) -> Result<(), Error> {
@@ -885,7 +961,10 @@ impl<'a> Parser<'a> {
// invariant: in_range is only set when there is
// already at least one character seen.
add_to_last_range(
&self.glob, ranges.last_mut().unwrap(), c)?;
&self.glob,
ranges.last_mut().unwrap(),
c,
)?;
} else {
ranges.push((c, c));
}
@@ -899,10 +978,7 @@ impl<'a> Parser<'a> {
// it as a literal.
ranges.push(('-', '-'));
}
self.push_token(Token::Class {
negated: negated,
ranges: ranges,
})
self.push_token(Token::Class { negated: negated, ranges: ranges })
}
fn bump(&mut self) -> Option<char> {
@@ -910,6 +986,10 @@ impl<'a> Parser<'a> {
self.cur = self.chars.next();
self.cur
}
fn peek(&mut self) -> Option<char> {
self.chars.peek().map(|&ch| ch)
}
}
#[cfg(test)]
@@ -927,14 +1007,15 @@ fn ends_with(needle: &[u8], haystack: &[u8]) -> bool {
#[cfg(test)]
mod tests {
use {GlobSetBuilder, ErrorKind};
use super::{Glob, GlobBuilder, Token};
use super::Token::*;
use super::{Glob, GlobBuilder, Token};
use {ErrorKind, GlobSetBuilder};
#[derive(Clone, Copy, Debug, Default)]
struct Options {
casei: bool,
litsep: bool,
casei: Option<bool>,
litsep: Option<bool>,
bsesc: Option<bool>,
}
macro_rules! syntax {
@@ -944,7 +1025,7 @@ mod tests {
let pat = Glob::new($pat).unwrap();
assert_eq!($tokens, pat.tokens.0);
}
}
};
}
macro_rules! syntaxerr {
@@ -954,7 +1035,7 @@ mod tests {
let err = Glob::new($pat).unwrap_err();
assert_eq!(&$err, err.kind());
}
}
};
}
macro_rules! toregex {
@@ -964,11 +1045,17 @@ mod tests {
($name:ident, $pat:expr, $re:expr, $options:expr) => {
#[test]
fn $name() {
let pat = GlobBuilder::new($pat)
.case_insensitive($options.casei)
.literal_separator($options.litsep)
.build()
.unwrap();
let mut builder = GlobBuilder::new($pat);
if let Some(casei) = $options.casei {
builder.case_insensitive(casei);
}
if let Some(litsep) = $options.litsep {
builder.literal_separator(litsep);
}
if let Some(bsesc) = $options.bsesc {
builder.backslash_escape(bsesc);
}
let pat = builder.build().unwrap();
assert_eq!(format!("(?-u){}", $re), pat.regex());
}
};
@@ -981,11 +1068,17 @@ mod tests {
($name:ident, $pat:expr, $path:expr, $options:expr) => {
#[test]
fn $name() {
let pat = GlobBuilder::new($pat)
.case_insensitive($options.casei)
.literal_separator($options.litsep)
.build()
.unwrap();
let mut builder = GlobBuilder::new($pat);
if let Some(casei) = $options.casei {
builder.case_insensitive(casei);
}
if let Some(litsep) = $options.litsep {
builder.literal_separator(litsep);
}
if let Some(bsesc) = $options.bsesc {
builder.backslash_escape(bsesc);
}
let pat = builder.build().unwrap();
let matcher = pat.compile_matcher();
let strategic = pat.compile_strategic_matcher();
let set = GlobSetBuilder::new().add(pat).build().unwrap();
@@ -1003,11 +1096,17 @@ mod tests {
($name:ident, $pat:expr, $path:expr, $options:expr) => {
#[test]
fn $name() {
let pat = GlobBuilder::new($pat)
.case_insensitive($options.casei)
.literal_separator($options.litsep)
.build()
.unwrap();
let mut builder = GlobBuilder::new($pat);
if let Some(casei) = $options.casei {
builder.case_insensitive(casei);
}
if let Some(litsep) = $options.litsep {
builder.literal_separator(litsep);
}
if let Some(bsesc) = $options.bsesc {
builder.backslash_escape(bsesc);
}
let pat = builder.build().unwrap();
let matcher = pat.compile_matcher();
let strategic = pat.compile_strategic_matcher();
let set = GlobSetBuilder::new().add(pat).build().unwrap();
@@ -1018,7 +1117,9 @@ mod tests {
};
}
fn s(string: &str) -> String { string.to_string() }
fn s(string: &str) -> String {
string.to_string()
}
fn class(s: char, e: char) -> Token {
Class { negated: false, ranges: vec![(s, e)] }
@@ -1042,16 +1143,20 @@ mod tests {
syntax!(any2, "a?b", vec![Literal('a'), Any, Literal('b')]);
syntax!(seq1, "*", vec![ZeroOrMore]);
syntax!(seq2, "a*b", vec![Literal('a'), ZeroOrMore, Literal('b')]);
syntax!(seq3, "*a*b*", vec![
ZeroOrMore, Literal('a'), ZeroOrMore, Literal('b'), ZeroOrMore,
]);
syntax!(
seq3,
"*a*b*",
vec![ZeroOrMore, Literal('a'), ZeroOrMore, Literal('b'), ZeroOrMore,]
);
syntax!(rseq1, "**", vec![RecursivePrefix]);
syntax!(rseq2, "**/", vec![RecursivePrefix]);
syntax!(rseq3, "/**", vec![RecursiveSuffix]);
syntax!(rseq4, "/**/", vec![RecursiveZeroOrMore]);
syntax!(rseq5, "a/**/b", vec![
Literal('a'), RecursiveZeroOrMore, Literal('b'),
]);
syntax!(
rseq5,
"a/**/b",
vec![Literal('a'), RecursiveZeroOrMore, Literal('b'),]
);
syntax!(cls1, "[a]", vec![class('a', 'a')]);
syntax!(cls2, "[!a]", vec![classn('a', 'a')]);
syntax!(cls3, "[a-z]", vec![class('a', 'z')]);
@@ -1063,9 +1168,11 @@ mod tests {
syntax!(cls9, "[a-]", vec![rclass(&[('a', 'a'), ('-', '-')])]);
syntax!(cls10, "[-a-z]", vec![rclass(&[('-', '-'), ('a', 'z')])]);
syntax!(cls11, "[a-z-]", vec![rclass(&[('a', 'z'), ('-', '-')])]);
syntax!(cls12, "[-a-z-]", vec![
rclass(&[('-', '-'), ('a', 'z'), ('-', '-')]),
]);
syntax!(
cls12,
"[-a-z-]",
vec![rclass(&[('-', '-'), ('a', 'z'), ('-', '-')]),]
);
syntax!(cls13, "[]-z]", vec![class(']', 'z')]);
syntax!(cls14, "[--z]", vec![class('-', 'z')]);
syntax!(cls15, "[ --]", vec![class(' ', '-')]);
@@ -1076,13 +1183,6 @@ mod tests {
syntax!(cls20, "[^a]", vec![classn('a', 'a')]);
syntax!(cls21, "[^a-z]", vec![classn('a', 'z')]);
syntaxerr!(err_rseq1, "a**", ErrorKind::InvalidRecursive);
syntaxerr!(err_rseq2, "**a", ErrorKind::InvalidRecursive);
syntaxerr!(err_rseq3, "a**b", ErrorKind::InvalidRecursive);
syntaxerr!(err_rseq4, "***", ErrorKind::InvalidRecursive);
syntaxerr!(err_rseq5, "/a**", ErrorKind::InvalidRecursive);
syntaxerr!(err_rseq6, "/**a", ErrorKind::InvalidRecursive);
syntaxerr!(err_rseq7, "/a**b", ErrorKind::InvalidRecursive);
syntaxerr!(err_unclosed1, "[", ErrorKind::UnclosedClass);
syntaxerr!(err_unclosed2, "[]", ErrorKind::UnclosedClass);
syntaxerr!(err_unclosed3, "[!", ErrorKind::UnclosedClass);
@@ -1090,14 +1190,14 @@ mod tests {
syntaxerr!(err_range1, "[z-a]", ErrorKind::InvalidRange('z', 'a'));
syntaxerr!(err_range2, "[z--]", ErrorKind::InvalidRange('z', '-'));
const CASEI: Options = Options {
casei: true,
litsep: false,
};
const SLASHLIT: Options = Options {
casei: false,
litsep: true,
};
const CASEI: Options =
Options { casei: Some(true), litsep: None, bsesc: None };
const SLASHLIT: Options =
Options { casei: None, litsep: Some(true), bsesc: None };
const NOBSESC: Options =
Options { casei: None, litsep: None, bsesc: Some(false) };
const BSESC: Options =
Options { casei: None, litsep: None, bsesc: Some(true) };
toregex!(re_casei, "a", "(?i)^a$", &CASEI);
@@ -1114,8 +1214,30 @@ mod tests {
toregex!(re8, "[*]", r"^[\*]$");
toregex!(re9, "[+]", r"^[\+]$");
toregex!(re10, "+", r"^\+$");
toregex!(re11, "**", r"^.*$");
toregex!(re12, "", r"^\xe2\x98\x83$");
toregex!(re11, "", r"^\xe2\x98\x83$");
toregex!(re12, "**", r"^.*$");
toregex!(re13, "**/", r"^.*$");
toregex!(re14, "**/*", r"^(?:/?|.*/).*$");
toregex!(re15, "**/**", r"^.*$");
toregex!(re16, "**/**/*", r"^(?:/?|.*/).*$");
toregex!(re17, "**/**/**", r"^.*$");
toregex!(re18, "**/**/**/*", r"^(?:/?|.*/).*$");
toregex!(re19, "a/**", r"^a(?:/?|/.*)$");
toregex!(re20, "a/**/**", r"^a(?:/?|/.*)$");
toregex!(re21, "a/**/**/**", r"^a(?:/?|/.*)$");
toregex!(re22, "a/**/b", r"^a(?:/|/.*/)b$");
toregex!(re23, "a/**/**/b", r"^a(?:/|/.*/)b$");
toregex!(re24, "a/**/**/**/b", r"^a(?:/|/.*/)b$");
toregex!(re25, "**/b", r"^(?:/?|.*/)b$");
toregex!(re26, "**/**/b", r"^(?:/?|.*/)b$");
toregex!(re27, "**/**/**/b", r"^(?:/?|.*/)b$");
toregex!(re28, "a**", r"^a.*.*$");
toregex!(re29, "**a", r"^.*.*a$");
toregex!(re30, "a**b", r"^a.*.*b$");
toregex!(re31, "***", r"^.*.*.*$");
toregex!(re32, "/a**", r"^/a.*.*$");
toregex!(re33, "/**a", r"^/.*.*a$");
toregex!(re34, "/a**b", r"^/a.*.*b$");
matches!(match1, "a", "a");
matches!(match2, "a*b", "a_b");
@@ -1173,8 +1295,11 @@ mod tests {
matches!(matchpat4, "*hello.txt", "some\\path\\to\\hello.txt");
matches!(matchpat5, "*hello.txt", "/an/absolute/path/to/hello.txt");
matches!(matchpat6, "*some/path/to/hello.txt", "some/path/to/hello.txt");
matches!(matchpat7, "*some/path/to/hello.txt",
"a/bigger/some/path/to/hello.txt");
matches!(
matchpat7,
"*some/path/to/hello.txt",
"a/bigger/some/path/to/hello.txt"
);
matches!(matchescape, "_[[]_[]]_[?]_[*]_!_", "_[_]_?_*_!_");
@@ -1209,6 +1334,17 @@ mod tests {
#[cfg(not(unix))]
matches!(matchslash5, "abc\\def", "abc/def", SLASHLIT);
matches!(matchbackslash1, "\\[", "[", BSESC);
matches!(matchbackslash2, "\\?", "?", BSESC);
matches!(matchbackslash3, "\\*", "*", BSESC);
matches!(matchbackslash4, "\\[a-z]", "\\a", NOBSESC);
matches!(matchbackslash5, "\\?", "\\a", NOBSESC);
matches!(matchbackslash6, "\\*", "\\\\", NOBSESC);
#[cfg(unix)]
matches!(matchbackslash7, "\\a", "a");
#[cfg(not(unix))]
matches!(matchbackslash8, "\\a", "/a");
nmatches!(matchnot1, "a*b*c", "abcd");
nmatches!(matchnot2, "abc*abc*abc", "abcabcabcabcabcabcabca");
nmatches!(matchnot3, "some/**/needle.txt", "some/other/notthis.txt");
@@ -1226,40 +1362,63 @@ mod tests {
nmatches!(matchnot15, "[!-]", "-");
nmatches!(matchnot16, "*hello.txt", "hello.txt-and-then-some");
nmatches!(matchnot17, "*hello.txt", "goodbye.txt");
nmatches!(matchnot18, "*some/path/to/hello.txt",
"some/path/to/hello.txt-and-then-some");
nmatches!(matchnot19, "*some/path/to/hello.txt",
"some/other/path/to/hello.txt");
nmatches!(
matchnot18,
"*some/path/to/hello.txt",
"some/path/to/hello.txt-and-then-some"
);
nmatches!(
matchnot19,
"*some/path/to/hello.txt",
"some/other/path/to/hello.txt"
);
nmatches!(matchnot20, "a", "foo/a");
nmatches!(matchnot21, "./foo", "foo");
nmatches!(matchnot22, "**/foo", "foofoo");
nmatches!(matchnot23, "**/foo/bar", "foofoo/bar");
nmatches!(matchnot24, "/*.c", "mozilla-sha1/sha1.c");
nmatches!(matchnot25, "*.c", "mozilla-sha1/sha1.c", SLASHLIT);
nmatches!(matchnot26, "**/m4/ltoptions.m4",
"csharp/src/packages/repositories.config", SLASHLIT);
nmatches!(
matchnot26,
"**/m4/ltoptions.m4",
"csharp/src/packages/repositories.config",
SLASHLIT
);
nmatches!(matchnot27, "a[^0-9]b", "a0b");
nmatches!(matchnot28, "a[^0-9]b", "a9b");
nmatches!(matchnot29, "[^-]", "-");
nmatches!(matchnot30, "some/*/needle.txt", "some/needle.txt");
nmatches!(
matchrec31,
"some/*/needle.txt", "some/one/two/needle.txt", SLASHLIT);
"some/*/needle.txt",
"some/one/two/needle.txt",
SLASHLIT
);
nmatches!(
matchrec32,
"some/*/needle.txt", "some/one/two/three/needle.txt", SLASHLIT);
"some/*/needle.txt",
"some/one/two/three/needle.txt",
SLASHLIT
);
macro_rules! extract {
($which:ident, $name:ident, $pat:expr, $expect:expr) => {
extract!($which, $name, $pat, $expect, Options::default());
};
($which:ident, $name:ident, $pat:expr, $expect:expr, $opts:expr) => {
($which:ident, $name:ident, $pat:expr, $expect:expr, $options:expr) => {
#[test]
fn $name() {
let pat = GlobBuilder::new($pat)
.case_insensitive($opts.casei)
.literal_separator($opts.litsep)
.build().unwrap();
let mut builder = GlobBuilder::new($pat);
if let Some(casei) = $options.casei {
builder.case_insensitive(casei);
}
if let Some(litsep) = $options.litsep {
builder.literal_separator(litsep);
}
if let Some(bsesc) = $options.bsesc {
builder.backslash_escape(bsesc);
}
let pat = builder.build().unwrap();
assert_eq!($expect, pat.$which());
}
};
@@ -1302,19 +1461,27 @@ mod tests {
literal!(extract_lit7, "foo/bar", Some(s("foo/bar")));
literal!(extract_lit8, "**/foo/bar", None);
basetokens!(extract_basetoks1, "**/foo", Some(&*vec![
Literal('f'), Literal('o'), Literal('o'),
]));
basetokens!(
extract_basetoks1,
"**/foo",
Some(&*vec![Literal('f'), Literal('o'), Literal('o'),])
);
basetokens!(extract_basetoks2, "**/foo", None, CASEI);
basetokens!(extract_basetoks3, "**/foo", Some(&*vec![
Literal('f'), Literal('o'), Literal('o'),
]), SLASHLIT);
basetokens!(
extract_basetoks3,
"**/foo",
Some(&*vec![Literal('f'), Literal('o'), Literal('o'),]),
SLASHLIT
);
basetokens!(extract_basetoks4, "*foo", None, SLASHLIT);
basetokens!(extract_basetoks5, "*foo", None);
basetokens!(extract_basetoks6, "**/fo*o", None);
basetokens!(extract_basetoks7, "**/fo*o", Some(&*vec![
Literal('f'), Literal('o'), ZeroOrMore, Literal('o'),
]), SLASHLIT);
basetokens!(
extract_basetoks7,
"**/fo*o",
Some(&*vec![Literal('f'), Literal('o'), ZeroOrMore, Literal('o'),]),
SLASHLIT
);
ext!(extract_ext1, "**/*.rs", Some(s(".rs")));
ext!(extract_ext2, "**/*.rs.bak", None);

View File

@@ -91,6 +91,11 @@ Standard Unix-style glob syntax is supported:
`[!ab]` to match any character except for `a` and `b`.
* Metacharacters such as `*` and `?` can be escaped with character class
notation. e.g., `[*]` matches `*`.
* When backslash escapes are enabled, a backslash (`\`) will escape all meta
characters in a glob. If it precedes a non-meta character, then the slash is
ignored. A `\\` will match a literal `\\`. Note that this mode is only
enabled on Unix platforms by default, but can be enabled on any platform
via the `backslash_escape` setting on `Glob`.
A `GlobBuilder` can be used to prevent wildcards from matching path separators,
or to enable case insensitive matching.
@@ -99,33 +104,37 @@ or to enable case insensitive matching.
#![deny(missing_docs)]
extern crate aho_corasick;
extern crate bstr;
extern crate fnv;
#[macro_use]
extern crate log;
extern crate memchr;
extern crate regex;
#[cfg(feature = "serde1")]
extern crate serde;
use std::borrow::Cow;
use std::collections::{BTreeMap, HashMap};
use std::error::Error as StdError;
use std::ffi::OsStr;
use std::fmt;
use std::hash;
use std::path::Path;
use std::str;
use aho_corasick::{Automaton, AcAutomaton, FullAcAutomaton};
use aho_corasick::AhoCorasick;
use bstr::{ByteSlice, ByteVec, B};
use regex::bytes::{Regex, RegexBuilder, RegexSet};
use pathutil::{
file_name, file_name_ext, normalize_path, os_str_bytes, path_bytes,
};
use glob::MatchStrategy;
pub use glob::{Glob, GlobBuilder, GlobMatcher};
use pathutil::{file_name, file_name_ext, normalize_path};
mod glob;
mod pathutil;
#[cfg(feature = "serde1")]
mod serde_impl;
/// Represents an error that can occur when parsing a glob pattern.
#[derive(Clone, Debug, Eq, PartialEq)]
pub struct Error {
@@ -138,8 +147,13 @@ pub struct Error {
/// The kind of error that can occur when parsing a glob pattern.
#[derive(Clone, Debug, Eq, PartialEq)]
pub enum ErrorKind {
/// Occurs when a use of `**` is invalid. Namely, `**` can only appear
/// adjacent to a path separator, or the beginning/end of a glob.
/// **DEPRECATED**.
///
/// This error used to occur for consistency with git's glob specification,
/// but the specification now accepts all uses of `**`. When `**` does not
/// appear adjacent to a path separator or at the beginning/end of a glob,
/// it is now treated as two consecutive `*` patterns. As such, this error
/// is no longer used.
InvalidRecursive,
/// Occurs when a character class (e.g., `[abc]`) is not closed.
UnclosedClass,
@@ -154,8 +168,17 @@ pub enum ErrorKind {
/// Occurs when an alternating group is nested inside another alternating
/// group, e.g., `{{a,b},{c,d}}`.
NestedAlternates,
/// Occurs when an unescaped '\' is found at the end of a glob.
DanglingEscape,
/// An error associated with parsing or compiling a regex.
Regex(String),
/// Hints that destructuring should not be exhaustive.
///
/// This enum may grow additional variants, so this makes sure clients
/// don't count on exhaustive matching. (Otherwise, adding a new variant
/// could break existing code.)
#[doc(hidden)]
__Nonexhaustive,
}
impl StdError for Error {
@@ -185,9 +208,7 @@ impl ErrorKind {
ErrorKind::UnclosedClass => {
"unclosed character class; missing ']'"
}
ErrorKind::InvalidRange(_, _) => {
"invalid character range"
}
ErrorKind::InvalidRange(_, _) => "invalid character range",
ErrorKind::UnopenedAlternates => {
"unopened alternate group; missing '{' \
(maybe escape '}' with '[}]'?)"
@@ -199,7 +220,9 @@ impl ErrorKind {
ErrorKind::NestedAlternates => {
"nested alternate groups are not allowed"
}
ErrorKind::DanglingEscape => "dangling '\\'",
ErrorKind::Regex(ref err) => err,
ErrorKind::__Nonexhaustive => unreachable!(),
}
}
}
@@ -223,12 +246,12 @@ impl fmt::Display for ErrorKind {
| ErrorKind::UnopenedAlternates
| ErrorKind::UnclosedAlternates
| ErrorKind::NestedAlternates
| ErrorKind::Regex(_) => {
write!(f, "{}", self.description())
}
| ErrorKind::DanglingEscape
| ErrorKind::Regex(_) => write!(f, "{}", self.description()),
ErrorKind::InvalidRange(s, e) => {
write!(f, "invalid range; '{}' > '{}'", s, e)
}
ErrorKind::__Nonexhaustive => unreachable!(),
}
}
}
@@ -239,21 +262,20 @@ fn new_regex(pat: &str) -> Result<Regex, Error> {
.size_limit(10 * (1 << 20))
.dfa_size_limit(10 * (1 << 20))
.build()
.map_err(|err| {
Error {
glob: Some(pat.to_string()),
kind: ErrorKind::Regex(err.to_string()),
}
.map_err(|err| Error {
glob: Some(pat.to_string()),
kind: ErrorKind::Regex(err.to_string()),
})
}
fn new_regex_set<I, S>(pats: I) -> Result<RegexSet, Error>
where S: AsRef<str>, I: IntoIterator<Item=S> {
RegexSet::new(pats).map_err(|err| {
Error {
glob: None,
kind: ErrorKind::Regex(err.to_string()),
}
where
S: AsRef<str>,
I: IntoIterator<Item = S>,
{
RegexSet::new(pats).map_err(|err| Error {
glob: None,
kind: ErrorKind::Regex(err.to_string()),
})
}
@@ -268,12 +290,20 @@ pub struct GlobSet {
}
impl GlobSet {
/// Create an empty `GlobSet`. An empty set matches nothing.
#[inline]
pub fn empty() -> GlobSet {
GlobSet { len: 0, strats: vec![] }
}
/// Returns true if this set is empty, and therefore matches nothing.
#[inline]
pub fn is_empty(&self) -> bool {
self.len == 0
}
/// Returns the number of globs in this set.
#[inline]
pub fn len(&self) -> usize {
self.len
}
@@ -398,11 +428,17 @@ impl GlobSet {
}
}
}
debug!("built glob set; {} literals, {} basenames, {} extensions, \
debug!(
"built glob set; {} literals, {} basenames, {} extensions, \
{} prefixes, {} suffixes, {} required extensions, {} regexes",
lits.0.len(), base_lits.0.len(), exts.0.len(),
prefixes.literals.len(), suffixes.literals.len(),
required_exts.0.len(), regexes.literals.len());
lits.0.len(),
base_lits.0.len(),
exts.0.len(),
prefixes.literals.len(),
suffixes.literals.len(),
required_exts.0.len(),
regexes.literals.len()
);
Ok(GlobSet {
len: pats.len(),
strats: vec![
@@ -412,7 +448,8 @@ impl GlobSet {
GlobSetMatchStrategy::Suffix(suffixes.suffix()),
GlobSetMatchStrategy::Prefix(prefixes.prefix()),
GlobSetMatchStrategy::RequiredExtension(
required_exts.build()?),
required_exts.build()?,
),
GlobSetMatchStrategy::Regex(regexes.regex_set()?),
],
})
@@ -421,6 +458,7 @@ impl GlobSet {
/// GlobSetBuilder builds a group of patterns that can be used to
/// simultaneously match a file path.
#[derive(Clone, Debug)]
pub struct GlobSetBuilder {
pats: Vec<Glob>,
}
@@ -441,7 +479,6 @@ impl GlobSetBuilder {
}
/// Add a new pattern to this set.
#[allow(dead_code)]
pub fn add(&mut self, pat: Glob) -> &mut GlobSetBuilder {
self.pats.push(pat);
self
@@ -464,13 +501,10 @@ pub struct Candidate<'a> {
impl<'a> Candidate<'a> {
/// Create a new candidate for matching from the given path.
pub fn new<P: AsRef<Path> + ?Sized>(path: &'a P) -> Candidate<'a> {
let path = path.as_ref();
let basename = file_name(path).unwrap_or(OsStr::new(""));
Candidate {
path: normalize_path(path_bytes(path)),
basename: os_str_bytes(basename),
ext: file_name_ext(basename).unwrap_or(Cow::Borrowed(b"")),
}
let path = normalize_path(Vec::from_path_lossy(path.as_ref()));
let basename = file_name(&path).unwrap_or(Cow::Borrowed(B("")));
let ext = file_name_ext(&basename).unwrap_or(Cow::Borrowed(B("")));
Candidate { path: path, basename: basename, ext: ext }
}
fn path_prefix(&self, max: usize) -> &[u8] {
@@ -542,12 +576,12 @@ impl LiteralStrategy {
}
fn is_match(&self, candidate: &Candidate) -> bool {
self.0.contains_key(&*candidate.path)
self.0.contains_key(candidate.path.as_bytes())
}
#[inline(never)]
fn matches_into(&self, candidate: &Candidate, matches: &mut Vec<usize>) {
if let Some(hits) = self.0.get(&*candidate.path) {
if let Some(hits) = self.0.get(candidate.path.as_bytes()) {
matches.extend(hits);
}
}
@@ -569,7 +603,7 @@ impl BasenameLiteralStrategy {
if candidate.basename.is_empty() {
return false;
}
self.0.contains_key(&*candidate.basename)
self.0.contains_key(candidate.basename.as_bytes())
}
#[inline(never)]
@@ -577,7 +611,7 @@ impl BasenameLiteralStrategy {
if candidate.basename.is_empty() {
return;
}
if let Some(hits) = self.0.get(&*candidate.basename) {
if let Some(hits) = self.0.get(candidate.basename.as_bytes()) {
matches.extend(hits);
}
}
@@ -599,7 +633,7 @@ impl ExtensionStrategy {
if candidate.ext.is_empty() {
return false;
}
self.0.contains_key(&*candidate.ext)
self.0.contains_key(candidate.ext.as_bytes())
}
#[inline(never)]
@@ -607,7 +641,7 @@ impl ExtensionStrategy {
if candidate.ext.is_empty() {
return;
}
if let Some(hits) = self.0.get(&*candidate.ext) {
if let Some(hits) = self.0.get(candidate.ext.as_bytes()) {
matches.extend(hits);
}
}
@@ -615,7 +649,7 @@ impl ExtensionStrategy {
#[derive(Clone, Debug)]
struct PrefixStrategy {
matcher: FullAcAutomaton<Vec<u8>>,
matcher: AhoCorasick,
map: Vec<usize>,
longest: usize,
}
@@ -623,8 +657,8 @@ struct PrefixStrategy {
impl PrefixStrategy {
fn is_match(&self, candidate: &Candidate) -> bool {
let path = candidate.path_prefix(self.longest);
for m in self.matcher.find_overlapping(path) {
if m.start == 0 {
for m in self.matcher.find_overlapping_iter(path) {
if m.start() == 0 {
return true;
}
}
@@ -633,9 +667,9 @@ impl PrefixStrategy {
fn matches_into(&self, candidate: &Candidate, matches: &mut Vec<usize>) {
let path = candidate.path_prefix(self.longest);
for m in self.matcher.find_overlapping(path) {
if m.start == 0 {
matches.push(self.map[m.pati]);
for m in self.matcher.find_overlapping_iter(path) {
if m.start() == 0 {
matches.push(self.map[m.pattern()]);
}
}
}
@@ -643,7 +677,7 @@ impl PrefixStrategy {
#[derive(Clone, Debug)]
struct SuffixStrategy {
matcher: FullAcAutomaton<Vec<u8>>,
matcher: AhoCorasick,
map: Vec<usize>,
longest: usize,
}
@@ -651,8 +685,8 @@ struct SuffixStrategy {
impl SuffixStrategy {
fn is_match(&self, candidate: &Candidate) -> bool {
let path = candidate.path_suffix(self.longest);
for m in self.matcher.find_overlapping(path) {
if m.end == path.len() {
for m in self.matcher.find_overlapping_iter(path) {
if m.end() == path.len() {
return true;
}
}
@@ -661,9 +695,9 @@ impl SuffixStrategy {
fn matches_into(&self, candidate: &Candidate, matches: &mut Vec<usize>) {
let path = candidate.path_suffix(self.longest);
for m in self.matcher.find_overlapping(path) {
if m.end == path.len() {
matches.push(self.map[m.pati]);
for m in self.matcher.find_overlapping_iter(path) {
if m.end() == path.len() {
matches.push(self.map[m.pattern()]);
}
}
}
@@ -677,11 +711,11 @@ impl RequiredExtensionStrategy {
if candidate.ext.is_empty() {
return false;
}
match self.0.get(&*candidate.ext) {
match self.0.get(candidate.ext.as_bytes()) {
None => false,
Some(regexes) => {
for &(_, ref re) in regexes {
if re.is_match(&*candidate.path) {
if re.is_match(candidate.path.as_bytes()) {
return true;
}
}
@@ -695,9 +729,9 @@ impl RequiredExtensionStrategy {
if candidate.ext.is_empty() {
return;
}
if let Some(regexes) = self.0.get(&*candidate.ext) {
if let Some(regexes) = self.0.get(candidate.ext.as_bytes()) {
for &(global_index, ref re) in regexes {
if re.is_match(&*candidate.path) {
if re.is_match(candidate.path.as_bytes()) {
matches.push(global_index);
}
}
@@ -713,11 +747,11 @@ struct RegexSetStrategy {
impl RegexSetStrategy {
fn is_match(&self, candidate: &Candidate) -> bool {
self.matcher.is_match(&*candidate.path)
self.matcher.is_match(candidate.path.as_bytes())
}
fn matches_into(&self, candidate: &Candidate, matches: &mut Vec<usize>) {
for i in self.matcher.matches(&*candidate.path) {
for i in self.matcher.matches(candidate.path.as_bytes()) {
matches.push(self.map[i]);
}
}
@@ -732,11 +766,7 @@ struct MultiStrategyBuilder {
impl MultiStrategyBuilder {
fn new() -> MultiStrategyBuilder {
MultiStrategyBuilder {
literals: vec![],
map: vec![],
longest: 0,
}
MultiStrategyBuilder { literals: vec![], map: vec![], longest: 0 }
}
fn add(&mut self, global_index: usize, literal: String) {
@@ -748,18 +778,16 @@ impl MultiStrategyBuilder {
}
fn prefix(self) -> PrefixStrategy {
let it = self.literals.into_iter().map(|s| s.into_bytes());
PrefixStrategy {
matcher: AcAutomaton::new(it).into_full(),
matcher: AhoCorasick::new_auto_configured(&self.literals),
map: self.map,
longest: self.longest,
}
}
fn suffix(self) -> SuffixStrategy {
let it = self.literals.into_iter().map(|s| s.into_bytes());
SuffixStrategy {
matcher: AcAutomaton::new(it).into_full(),
matcher: AhoCorasick::new_auto_configured(&self.literals),
map: self.map,
longest: self.longest,
}

View File

@@ -1,41 +1,26 @@
use std::borrow::Cow;
use std::ffi::OsStr;
use std::path::Path;
use bstr::{ByteSlice, ByteVec};
/// The final component of the path, if it is a normal file.
///
/// If the path terminates in ., .., or consists solely of a root of prefix,
/// file_name will return None.
#[cfg(unix)]
pub fn file_name<'a, P: AsRef<Path> + ?Sized>(
path: &'a P,
) -> Option<&'a OsStr> {
use std::os::unix::ffi::OsStrExt;
use memchr::memrchr;
let path = path.as_ref().as_os_str().as_bytes();
pub fn file_name<'a>(path: &Cow<'a, [u8]>) -> Option<Cow<'a, [u8]>> {
if path.is_empty() {
return None;
} else if path.len() == 1 && path[0] == b'.' {
return None;
} else if path.last() == Some(&b'.') {
return None;
} else if path.len() >= 2 && &path[path.len() - 2..] == &b".."[..] {
} else if path.last_byte() == Some(b'.') {
return None;
}
let last_slash = memrchr(b'/', path).map(|i| i + 1).unwrap_or(0);
Some(OsStr::from_bytes(&path[last_slash..]))
}
/// The final component of the path, if it is a normal file.
///
/// If the path terminates in ., .., or consists solely of a root of prefix,
/// file_name will return None.
#[cfg(not(unix))]
pub fn file_name<'a, P: AsRef<Path> + ?Sized>(
path: &'a P,
) -> Option<&'a OsStr> {
path.as_ref().file_name()
let last_slash = path.rfind_byte(b'/').map(|i| i + 1).unwrap_or(0);
Some(match *path {
Cow::Borrowed(path) => Cow::Borrowed(&path[last_slash..]),
Cow::Owned(ref path) => {
let mut path = path.clone();
path.drain_bytes(..last_slash);
Cow::Owned(path)
}
})
}
/// Return a file extension given a path's file name.
@@ -54,55 +39,24 @@ pub fn file_name<'a, P: AsRef<Path> + ?Sized>(
/// a pattern like `*.rs` is obviously trying to match files with a `rs`
/// extension, but it also matches files like `.rs`, which doesn't have an
/// extension according to std::path::Path::extension.
pub fn file_name_ext(name: &OsStr) -> Option<Cow<[u8]>> {
pub fn file_name_ext<'a>(name: &Cow<'a, [u8]>) -> Option<Cow<'a, [u8]>> {
if name.is_empty() {
return None;
}
let name = os_str_bytes(name);
let last_dot_at = {
let result = name
.iter().enumerate().rev()
.find(|&(_, &b)| b == b'.')
.map(|(i, _)| i);
match result {
None => return None,
Some(i) => i,
}
let last_dot_at = match name.rfind_byte(b'.') {
None => return None,
Some(i) => i,
};
Some(match name {
Some(match *name {
Cow::Borrowed(name) => Cow::Borrowed(&name[last_dot_at..]),
Cow::Owned(mut name) => {
name.drain(..last_dot_at);
Cow::Owned(ref name) => {
let mut name = name.clone();
name.drain_bytes(..last_dot_at);
Cow::Owned(name)
}
})
}
/// Return raw bytes of a path, transcoded to UTF-8 if necessary.
pub fn path_bytes(path: &Path) -> Cow<[u8]> {
os_str_bytes(path.as_os_str())
}
/// Return the raw bytes of the given OS string, possibly transcoded to UTF-8.
#[cfg(unix)]
pub fn os_str_bytes(s: &OsStr) -> Cow<[u8]> {
use std::os::unix::ffi::OsStrExt;
Cow::Borrowed(s.as_bytes())
}
/// Return the raw bytes of the given OS string, possibly transcoded to UTF-8.
#[cfg(not(unix))]
pub fn os_str_bytes(s: &OsStr) -> Cow<[u8]> {
// TODO(burntsushi): On Windows, OS strings are WTF-8, which is a superset
// of UTF-8, so even if we could get at the raw bytes, they wouldn't
// be useful. We *must* convert to UTF-8 before doing path matching.
// Unfortunate, but necessary.
match s.to_string_lossy() {
Cow::Owned(s) => Cow::Owned(s.into_bytes()),
Cow::Borrowed(s) => Cow::Borrowed(s.as_bytes()),
}
}
/// Normalizes a path to use `/` as a separator everywhere, even on platforms
/// that recognize other characters as separators.
#[cfg(unix)]
@@ -129,7 +83,8 @@ pub fn normalize_path(mut path: Cow<[u8]>) -> Cow<[u8]> {
#[cfg(test)]
mod tests {
use std::borrow::Cow;
use std::ffi::OsStr;
use bstr::{ByteVec, B};
use super::{file_name_ext, normalize_path};
@@ -137,8 +92,9 @@ mod tests {
($name:ident, $file_name:expr, $ext:expr) => {
#[test]
fn $name() {
let got = file_name_ext(OsStr::new($file_name));
assert_eq!($ext.map(|s| Cow::Borrowed(s.as_bytes())), got);
let bs = Vec::from($file_name);
let got = file_name_ext(&Cow::Owned(bs));
assert_eq!($ext.map(|s| Cow::Borrowed(B(s))), got);
}
};
}
@@ -153,7 +109,8 @@ mod tests {
($name:ident, $path:expr, $expected:expr) => {
#[test]
fn $name() {
let got = normalize_path(Cow::Owned($path.to_vec()));
let bs = Vec::from_slice($path);
let got = normalize_path(Cow::Owned(bs));
assert_eq!($expected.to_vec(), got.into_owned());
}
};

View File

@@ -0,0 +1,38 @@
use serde::de::Error;
use serde::{Deserialize, Deserializer, Serialize, Serializer};
use Glob;
impl Serialize for Glob {
fn serialize<S: Serializer>(
&self,
serializer: S,
) -> Result<S::Ok, S::Error> {
serializer.serialize_str(self.glob())
}
}
impl<'de> Deserialize<'de> for Glob {
fn deserialize<D: Deserializer<'de>>(
deserializer: D,
) -> Result<Self, D::Error> {
let glob = <&str as Deserialize>::deserialize(deserializer)?;
Glob::new(glob).map_err(D::Error::custom)
}
}
#[cfg(test)]
mod tests {
use Glob;
#[test]
fn glob_json_works() {
let test_glob = Glob::new("src/**/*.rs").unwrap();
let ser = serde_json::to_string(&test_glob).unwrap();
assert_eq!(ser, "\"src/**/*.rs\"");
let de: Glob = serde_json::from_str(&ser).unwrap();
assert_eq!(test_glob, de);
}
}

32
crates/grep/Cargo.toml Normal file
View File

@@ -0,0 +1,32 @@
[package]
name = "grep"
version = "0.2.5" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
Fast line oriented regex searching as a library.
"""
documentation = "http://burntsushi.net/rustdoc/grep/"
homepage = "https://github.com/BurntSushi/ripgrep/tree/master/crates/grep"
repository = "https://github.com/BurntSushi/ripgrep/tree/master/crates/grep"
readme = "README.md"
keywords = ["regex", "grep", "egrep", "search", "pattern"]
license = "Unlicense/MIT"
[dependencies]
grep-cli = { version = "0.1.4", path = "../cli" }
grep-matcher = { version = "0.1.4", path = "../matcher" }
grep-pcre2 = { version = "0.1.4", path = "../pcre2", optional = true }
grep-printer = { version = "0.1.4", path = "../printer" }
grep-regex = { version = "0.1.6", path = "../regex" }
grep-searcher = { version = "0.1.7", path = "../searcher" }
[dev-dependencies]
termcolor = "1.0.4"
walkdir = "2.2.7"
[features]
simd-accel = ["grep-searcher/simd-accel"]
pcre2 = ["grep-pcre2"]
# This feature is DEPRECATED. Runtime dispatch is used for SIMD now.
avx-accel = []

41
crates/grep/README.md Normal file
View File

@@ -0,0 +1,41 @@
grep
----
ripgrep, as a library.
[![Linux build status](https://api.travis-ci.org/BurntSushi/ripgrep.svg)](https://travis-ci.org/BurntSushi/ripgrep)
[![Windows build status](https://ci.appveyor.com/api/projects/status/github/BurntSushi/ripgrep?svg=true)](https://ci.appveyor.com/project/BurntSushi/ripgrep)
[![](https://img.shields.io/crates/v/grep.svg)](https://crates.io/crates/grep)
Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org).
### Documentation
[https://docs.rs/grep](https://docs.rs/grep)
NOTE: This crate isn't ready for wide use yet. Ambitious individuals can
probably piece together the parts, but there is no high level documentation
describing how all of the pieces fit together.
### Usage
Add this to your `Cargo.toml`:
```toml
[dependencies]
grep = "0.2"
```
and this to your crate root:
```rust
extern crate grep;
```
### Features
This crate provides a `pcre2` feature (disabled by default) which, when
enabled, re-exports the `grep-pcre2` crate as an alternative `Matcher`
implementation to the standard `grep-regex` implementation.

View File

@@ -0,0 +1,72 @@
extern crate grep;
extern crate termcolor;
extern crate walkdir;
use std::env;
use std::error::Error;
use std::ffi::OsString;
use std::process;
use grep::cli;
use grep::printer::{ColorSpecs, StandardBuilder};
use grep::regex::RegexMatcher;
use grep::searcher::{BinaryDetection, SearcherBuilder};
use termcolor::ColorChoice;
use walkdir::WalkDir;
fn main() {
if let Err(err) = try_main() {
eprintln!("{}", err);
process::exit(1);
}
}
fn try_main() -> Result<(), Box<dyn Error>> {
let mut args: Vec<OsString> = env::args_os().collect();
if args.len() < 2 {
return Err("Usage: simplegrep <pattern> [<path> ...]".into());
}
if args.len() == 2 {
args.push(OsString::from("./"));
}
search(cli::pattern_from_os(&args[1])?, &args[2..])
}
fn search(pattern: &str, paths: &[OsString]) -> Result<(), Box<dyn Error>> {
let matcher = RegexMatcher::new_line_matcher(&pattern)?;
let mut searcher = SearcherBuilder::new()
.binary_detection(BinaryDetection::quit(b'\x00'))
.line_number(false)
.build();
let mut printer = StandardBuilder::new()
.color_specs(ColorSpecs::default_with_color())
.build(cli::stdout(if cli::is_tty_stdout() {
ColorChoice::Auto
} else {
ColorChoice::Never
}));
for path in paths {
for result in WalkDir::new(path) {
let dent = match result {
Ok(dent) => dent,
Err(err) => {
eprintln!("{}", err);
continue;
}
};
if !dent.file_type().is_file() {
continue;
}
let result = searcher.search_path(
&matcher,
dent.path(),
printer.sink_with_path(&matcher, dent.path()),
);
if let Err(err) = result {
eprintln!("{}: {}", dent.path().display(), err);
}
}
}
Ok(())
}

23
crates/grep/src/lib.rs Normal file
View File

@@ -0,0 +1,23 @@
/*!
ripgrep, as a library.
This library is intended to provide a high level facade to the crates that
make up ripgrep's core searching routines. However, there is no high level
documentation available yet guiding users on how to fit all of the pieces
together.
Every public API item in the constituent crates is documented, but examples
are sparse.
A cookbook and a guide are planned.
*/
#![deny(missing_docs)]
pub extern crate grep_cli as cli;
pub extern crate grep_matcher as matcher;
#[cfg(feature = "pcre2")]
pub extern crate grep_pcre2 as pcre2;
pub extern crate grep_printer as printer;
pub extern crate grep_regex as regex;
pub extern crate grep_searcher as searcher;

View File

@@ -1,14 +1,14 @@
[package]
name = "ignore"
version = "0.4.0" #:version
version = "0.4.12" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
A fast library for efficiently matching ignore files such as `.gitignore`
against file paths.
"""
documentation = "https://docs.rs/ignore"
homepage = "https://github.com/BurntSushi/ripgrep/tree/master/ignore"
repository = "https://github.com/BurntSushi/ripgrep/tree/master/ignore"
homepage = "https://github.com/BurntSushi/ripgrep/tree/master/crates/ignore"
repository = "https://github.com/BurntSushi/ripgrep/tree/master/crates/ignore"
readme = "README.md"
keywords = ["glob", "ignore", "gitignore", "pattern", "file"]
license = "Unlicense/MIT"
@@ -18,22 +18,19 @@ name = "ignore"
bench = false
[dependencies]
crossbeam = "0.3"
globset = { version = "0.3.0", path = "../globset" }
lazy_static = "1"
log = "0.4"
memchr = "2"
regex = "0.2.1"
same-file = "1"
thread_local = "0.3.2"
walkdir = "2"
crossbeam-channel = "0.4.0"
crossbeam-utils = "0.7.0"
globset = { version = "0.4.3", path = "../globset" }
lazy_static = "1.1"
log = "0.4.5"
memchr = "2.1"
regex = "1.1"
same-file = "1.0.4"
thread_local = "1"
walkdir = "2.2.7"
[target.'cfg(windows)'.dependencies.winapi]
version = "0.3"
features = ["std", "winnt"]
[dev-dependencies]
tempdir = "0.3.5"
[target.'cfg(windows)'.dependencies.winapi-util]
version = "0.1.2"
[features]
simd-accel = ["globset/simd-accel"]

View File

@@ -4,7 +4,7 @@ The ignore crate provides a fast recursive directory iterator that respects
various filters such as globs, file types and `.gitignore` files. This crate
also provides lower level direct access to gitignore and file type matchers.
[![Linux build status](https://api.travis-ci.org/BurntSushi/ripgrep.png)](https://travis-ci.org/BurntSushi/ripgrep)
[![Linux build status](https://api.travis-ci.org/BurntSushi/ripgrep.svg)](https://travis-ci.org/BurntSushi/ripgrep)
[![Windows build status](https://ci.appveyor.com/api/projects/status/github/BurntSushi/ripgrep?svg=true)](https://ci.appveyor.com/project/BurntSushi/ripgrep)
[![](https://img.shields.io/crates/v/ignore.svg)](https://crates.io/crates/ignore)
@@ -20,7 +20,7 @@ Add this to your `Cargo.toml`:
```toml
[dependencies]
ignore = "0.3"
ignore = "0.4"
```
and this to your crate root:

View File

@@ -1,17 +1,12 @@
#![allow(dead_code, unused_imports, unused_mut, unused_variables)]
extern crate crossbeam;
extern crate crossbeam_channel as channel;
extern crate ignore;
extern crate walkdir;
use std::env;
use std::io::{self, Write};
use std::path::Path;
use std::sync::Arc;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;
use crossbeam::sync::MsQueue;
use ignore::WalkBuilder;
use walkdir::WalkDir;
@@ -19,7 +14,7 @@ fn main() {
let mut path = env::args().nth(1).unwrap();
let mut parallel = false;
let mut simple = false;
let queue: Arc<MsQueue<Option<DirEntry>>> = Arc::new(MsQueue::new());
let (tx, rx) = channel::bounded::<DirEntry>(100);
if path == "parallel" {
path = env::args().nth(2).unwrap();
parallel = true;
@@ -28,10 +23,9 @@ fn main() {
simple = true;
}
let stdout_queue = queue.clone();
let stdout_thread = thread::spawn(move || {
let mut stdout = io::BufWriter::new(io::stdout());
while let Some(dent) = stdout_queue.pop() {
for dent in rx {
write_path(&mut stdout, dent.path());
}
});
@@ -39,28 +33,26 @@ fn main() {
if parallel {
let walker = WalkBuilder::new(path).threads(6).build_parallel();
walker.run(|| {
let queue = queue.clone();
let tx = tx.clone();
Box::new(move |result| {
use ignore::WalkState::*;
queue.push(Some(DirEntry::Y(result.unwrap())));
tx.send(DirEntry::Y(result.unwrap())).unwrap();
Continue
})
});
} else if simple {
let mut stdout = io::BufWriter::new(io::stdout());
let walker = WalkDir::new(path);
for result in walker {
queue.push(Some(DirEntry::X(result.unwrap())));
tx.send(DirEntry::X(result.unwrap())).unwrap();
}
} else {
let mut stdout = io::BufWriter::new(io::stdout());
let walker = WalkBuilder::new(path).build();
for result in walker {
queue.push(Some(DirEntry::Y(result.unwrap())));
tx.send(DirEntry::Y(result.unwrap())).unwrap();
}
}
queue.push(None);
drop(tx);
stdout_thread.join().unwrap();
}

View File

@@ -0,0 +1,246 @@
/// This list represents the default file types that ripgrep ships with. In
/// general, any file format is fair game, although it should generally be
/// limited to reasonably popular open formats. For other cases, you can add
/// types to each invocation of ripgrep with the '--type-add' flag.
///
/// If you would like to add or improve this list, please file a PR:
/// https://github.com/BurntSushi/ripgrep
///
/// Please try to keep this list sorted lexicographically and wrapped to 79
/// columns (inclusive).
#[rustfmt::skip]
pub const DEFAULT_TYPES: &[(&str, &[&str])] = &[
("agda", &["*.agda", "*.lagda"]),
("aidl", &["*.aidl"]),
("amake", &["*.mk", "*.bp"]),
("asciidoc", &["*.adoc", "*.asc", "*.asciidoc"]),
("asm", &["*.asm", "*.s", "*.S"]),
("asp", &[
"*.aspx", "*.aspx.cs", "*.aspx.cs", "*.ascx", "*.ascx.cs", "*.ascx.vb",
]),
("ats", &["*.ats", "*.dats", "*.sats", "*.hats"]),
("avro", &["*.avdl", "*.avpr", "*.avsc"]),
("awk", &["*.awk"]),
("bazel", &["*.bzl", "WORKSPACE", "BUILD", "BUILD.bazel"]),
("bitbake", &["*.bb", "*.bbappend", "*.bbclass", "*.conf", "*.inc"]),
("brotli", &["*.br"]),
("buildstream", &["*.bst"]),
("bzip2", &["*.bz2", "*.tbz2"]),
("c", &["*.[chH]", "*.[chH].in", "*.cats"]),
("cabal", &["*.cabal"]),
("cbor", &["*.cbor"]),
("ceylon", &["*.ceylon"]),
("clojure", &["*.clj", "*.cljc", "*.cljs", "*.cljx"]),
("cmake", &["*.cmake", "CMakeLists.txt"]),
("coffeescript", &["*.coffee"]),
("config", &["*.cfg", "*.conf", "*.config", "*.ini"]),
("coq", &["*.v"]),
("cpp", &[
"*.[ChH]", "*.cc", "*.[ch]pp", "*.[ch]xx", "*.hh", "*.inl",
"*.[ChH].in", "*.cc.in", "*.[ch]pp.in", "*.[ch]xx.in", "*.hh.in",
]),
("creole", &["*.creole"]),
("crystal", &["Projectfile", "*.cr"]),
("cs", &["*.cs"]),
("csharp", &["*.cs"]),
("cshtml", &["*.cshtml"]),
("css", &["*.css", "*.scss"]),
("csv", &["*.csv"]),
("cython", &["*.pyx", "*.pxi", "*.pxd"]),
("d", &["*.d"]),
("dart", &["*.dart"]),
("dhall", &["*.dhall"]),
("diff", &["*.patch", "*.diff"]),
("docker", &["*Dockerfile*"]),
("edn", &["*.edn"]),
("elisp", &["*.el"]),
("elixir", &["*.ex", "*.eex", "*.exs"]),
("elm", &["*.elm"]),
("erb", &["*.erb"]),
("erlang", &["*.erl", "*.hrl"]),
("fidl", &["*.fidl"]),
("fish", &["*.fish"]),
("fortran", &[
"*.f", "*.F", "*.f77", "*.F77", "*.pfo",
"*.f90", "*.F90", "*.f95", "*.F95",
]),
("fsharp", &["*.fs", "*.fsx", "*.fsi"]),
("gap", &["*.g", "*.gap", "*.gi", "*.gd", "*.tst"]),
("gn", &["*.gn", "*.gni"]),
("go", &["*.go"]),
("gradle", &["*.gradle"]),
("groovy", &["*.groovy", "*.gradle"]),
("gzip", &["*.gz", "*.tgz"]),
("h", &["*.h", "*.hpp"]),
("haml", &["*.haml"]),
("haskell", &["*.hs", "*.lhs", "*.cpphs", "*.c2hs", "*.hsc"]),
("hbs", &["*.hbs"]),
("hs", &["*.hs", "*.lhs"]),
("html", &["*.htm", "*.html", "*.ejs"]),
("idris", &["*.idr", "*.lidr"]),
("java", &["*.java", "*.jsp", "*.jspx", "*.properties"]),
("jinja", &["*.j2", "*.jinja", "*.jinja2"]),
("jl", &["*.jl"]),
("js", &["*.js", "*.jsx", "*.vue"]),
("json", &["*.json", "composer.lock"]),
("jsonl", &["*.jsonl"]),
("julia", &["*.jl"]),
("jupyter", &["*.ipynb", "*.jpynb"]),
("k", &["*.k"]),
("kotlin", &["*.kt", "*.kts"]),
("less", &["*.less"]),
("license", &[
// General
"COPYING", "COPYING[.-]*",
"COPYRIGHT", "COPYRIGHT[.-]*",
"EULA", "EULA[.-]*",
"licen[cs]e", "licen[cs]e.*",
"LICEN[CS]E", "LICEN[CS]E[.-]*", "*[.-]LICEN[CS]E*",
"NOTICE", "NOTICE[.-]*",
"PATENTS", "PATENTS[.-]*",
"UNLICEN[CS]E", "UNLICEN[CS]E[.-]*",
// GPL (gpl.txt, etc.)
"agpl[.-]*",
"gpl[.-]*",
"lgpl[.-]*",
// Other license-specific (APACHE-2.0.txt, etc.)
"AGPL-*[0-9]*",
"APACHE-*[0-9]*",
"BSD-*[0-9]*",
"CC-BY-*",
"GFDL-*[0-9]*",
"GNU-*[0-9]*",
"GPL-*[0-9]*",
"LGPL-*[0-9]*",
"MIT-*[0-9]*",
"MPL-*[0-9]*",
"OFL-*[0-9]*",
]),
("lisp", &["*.el", "*.jl", "*.lisp", "*.lsp", "*.sc", "*.scm"]),
("lock", &["*.lock", "package-lock.json"]),
("log", &["*.log"]),
("lua", &["*.lua"]),
("lz4", &["*.lz4"]),
("lzma", &["*.lzma"]),
("m4", &["*.ac", "*.m4"]),
("make", &[
"[Gg][Nn][Uu]makefile", "[Mm]akefile",
"[Gg][Nn][Uu]makefile.am", "[Mm]akefile.am",
"[Gg][Nn][Uu]makefile.in", "[Mm]akefile.in",
"*.mk", "*.mak"
]),
("mako", &["*.mako", "*.mao"]),
("man", &["*.[0-9lnpx]", "*.[0-9][cEFMmpSx]"]),
("markdown", &["*.markdown", "*.md", "*.mdown", "*.mkdn"]),
("matlab", &["*.m"]),
("md", &["*.markdown", "*.md", "*.mdown", "*.mkdn"]),
("mk", &["mkfile"]),
("ml", &["*.ml"]),
("msbuild", &[
"*.csproj", "*.fsproj", "*.vcxproj", "*.proj", "*.props", "*.targets",
]),
("nim", &["*.nim", "*.nimf", "*.nimble", "*.nims"]),
("nix", &["*.nix"]),
("objc", &["*.h", "*.m"]),
("objcpp", &["*.h", "*.mm"]),
("ocaml", &["*.ml", "*.mli", "*.mll", "*.mly"]),
("org", &["*.org", "*.org_archive"]),
("pascal", &["*.pas", "*.dpr", "*.lpr", "*.pp", "*.inc"]),
("pdf", &["*.pdf"]),
("perl", &["*.perl", "*.pl", "*.PL", "*.plh", "*.plx", "*.pm", "*.t"]),
("php", &["*.php", "*.php3", "*.php4", "*.php5", "*.phtml"]),
("pod", &["*.pod"]),
("postscript", &["*.eps", "*.ps"]),
("protobuf", &["*.proto"]),
("ps", &["*.cdxml", "*.ps1", "*.ps1xml", "*.psd1", "*.psm1"]),
("puppet", &["*.erb", "*.pp", "*.rb"]),
("purs", &["*.purs"]),
("py", &["*.py"]),
("qmake", &["*.pro", "*.pri", "*.prf"]),
("qml", &["*.qml"]),
("r", &["*.R", "*.r", "*.Rmd", "*.Rnw"]),
("rdoc", &["*.rdoc"]),
("readme", &["README*", "*README"]),
("robot", &["*.robot"]),
("rst", &["*.rst"]),
("ruby", &["Gemfile", "*.gemspec", ".irbrc", "Rakefile", "*.rb"]),
("rust", &["*.rs"]),
("sass", &["*.sass", "*.scss"]),
("scala", &["*.scala", "*.sbt"]),
("sh", &[
// Portable/misc. init files
".login", ".logout", ".profile", "profile",
// bash-specific init files
".bash_login", "bash_login",
".bash_logout", "bash_logout",
".bash_profile", "bash_profile",
".bashrc", "bashrc", "*.bashrc",
// csh-specific init files
".cshrc", "*.cshrc",
// ksh-specific init files
".kshrc", "*.kshrc",
// tcsh-specific init files
".tcshrc",
// zsh-specific init files
".zshenv", "zshenv",
".zlogin", "zlogin",
".zlogout", "zlogout",
".zprofile", "zprofile",
".zshrc", "zshrc",
// Extensions
"*.bash", "*.csh", "*.ksh", "*.sh", "*.tcsh", "*.zsh",
]),
("slim", &["*.skim", "*.slim", "*.slime"]),
("smarty", &["*.tpl"]),
("sml", &["*.sml", "*.sig"]),
("soy", &["*.soy"]),
("spark", &["*.spark"]),
("spec", &["*.spec"]),
("sql", &["*.sql", "*.psql"]),
("stylus", &["*.styl"]),
("sv", &["*.v", "*.vg", "*.sv", "*.svh", "*.h"]),
("svg", &["*.svg"]),
("swift", &["*.swift"]),
("swig", &["*.def", "*.i"]),
("systemd", &[
"*.automount", "*.conf", "*.device", "*.link", "*.mount", "*.path",
"*.scope", "*.service", "*.slice", "*.socket", "*.swap", "*.target",
"*.timer",
]),
("taskpaper", &["*.taskpaper"]),
("tcl", &["*.tcl"]),
("tex", &["*.tex", "*.ltx", "*.cls", "*.sty", "*.bib", "*.dtx", "*.ins"]),
("textile", &["*.textile"]),
("tf", &["*.tf"]),
("thrift", &["*.thrift"]),
("toml", &["*.toml", "Cargo.lock"]),
("ts", &["*.ts", "*.tsx"]),
("twig", &["*.twig"]),
("txt", &["*.txt"]),
("typoscript", &["*.typoscript", "*.ts"]),
("vala", &["*.vala"]),
("vb", &["*.vb"]),
("verilog", &["*.v", "*.vh", "*.sv", "*.svh"]),
("vhdl", &["*.vhd", "*.vhdl"]),
("vim", &["*.vim"]),
("vimscript", &["*.vim"]),
("webidl", &["*.idl", "*.webidl", "*.widl"]),
("wiki", &["*.mediawiki", "*.wiki"]),
("xml", &[
"*.xml", "*.xml.dist", "*.dtd", "*.xsl", "*.xslt", "*.xsd", "*.xjb",
"*.rng", "*.sch", "*.xhtml",
]),
("xz", &["*.xz", "*.txz"]),
("yacc", &["*.y"]),
("yaml", &["*.yaml", "*.yml"]),
("zig", &["*.zig"]),
("zsh", &[
".zshenv", "zshenv",
".zlogin", "zlogin",
".zlogout", "zlogout",
".zprofile", "zprofile",
".zshrc", "zshrc",
"*.zsh",
]),
("zstd", &["*.zst", "*.zstd"]),
];

View File

@@ -14,14 +14,17 @@
// well.
use std::collections::HashMap;
use std::ffi::{OsString, OsStr};
use std::ffi::{OsStr, OsString};
use std::fs::{File, FileType};
use std::io::{self, BufRead};
use std::path::{Path, PathBuf};
use std::sync::{Arc, RwLock};
use gitignore::{self, Gitignore, GitignoreBuilder};
use pathutil::{is_hidden, strip_prefix};
use overrides::{self, Override};
use pathutil::{is_hidden, strip_prefix};
use types::{self, Types};
use walk::DirEntry;
use {Error, Match, PartialErrorBuilder};
/// IgnoreMatch represents information about where a match came from when using
@@ -65,19 +68,19 @@ struct IgnoreOptions {
hidden: bool,
/// Whether to read .ignore files.
ignore: bool,
/// Whether to respect any ignore files in parent directories.
parents: bool,
/// Whether to read git's global gitignore file.
git_global: bool,
/// Whether to read .gitignore files.
git_ignore: bool,
/// Whether to read .git/info/exclude files.
git_exclude: bool,
}
impl IgnoreOptions {
/// Returns true if at least one type of ignore rules should be matched.
fn has_any_ignore_options(&self) -> bool {
self.ignore || self.git_global || self.git_ignore || self.git_exclude
}
/// Whether to ignore files case insensitively
ignore_case_insensitive: bool,
/// Whether a git repository must be present in order to apply any
/// git-related ignore rules.
require_git: bool,
}
/// Ignore is a matcher useful for recursively walking one or more directories.
@@ -158,6 +161,15 @@ impl Ignore {
&self,
path: P,
) -> (Ignore, Option<Error>) {
if !self.0.opts.parents
&& !self.0.opts.git_ignore
&& !self.0.opts.git_exclude
&& !self.0.opts.git_global
{
// If we never need info from parent directories, then don't do
// anything.
return (self.clone(), None);
}
if !self.is_root() {
panic!("Ignore::add_parents called on non-root matcher");
}
@@ -190,6 +202,11 @@ impl Ignore {
errs.maybe_push(err);
igtmp.is_absolute_parent = true;
igtmp.absolute_base = Some(absolute_base.clone());
igtmp.has_git = if self.0.opts.git_ignore {
parent.join(".git").exists()
} else {
false
};
ig = Ignore(Arc::new(igtmp));
compiled.insert(parent.as_os_str().to_os_string(), ig.clone());
}
@@ -214,38 +231,70 @@ impl Ignore {
/// Like add_child, but takes a full path and returns an IgnoreInner.
fn add_child_path(&self, dir: &Path) -> (IgnoreInner, Option<Error>) {
let git_type = if self.0.opts.git_ignore || self.0.opts.git_exclude {
dir.join(".git").metadata().ok().map(|md| md.file_type())
} else {
None
};
let has_git = git_type.map(|_| true).unwrap_or(false);
let mut errs = PartialErrorBuilder::default();
let custom_ig_matcher =
{
let (m, err) =
create_gitignore(&dir, &self.0.custom_ignore_filenames);
errs.maybe_push(err);
m
};
let ig_matcher =
if !self.0.opts.ignore {
Gitignore::empty()
} else {
let (m, err) = create_gitignore(&dir, &[".ignore"]);
errs.maybe_push(err);
m
};
let gi_matcher =
if !self.0.opts.git_ignore {
Gitignore::empty()
} else {
let (m, err) = create_gitignore(&dir, &[".gitignore"]);
errs.maybe_push(err);
m
};
let gi_exclude_matcher =
if !self.0.opts.git_exclude {
Gitignore::empty()
} else {
let (m, err) = create_gitignore(&dir, &[".git/info/exclude"]);
errs.maybe_push(err);
m
};
let custom_ig_matcher = if self.0.custom_ignore_filenames.is_empty() {
Gitignore::empty()
} else {
let (m, err) = create_gitignore(
&dir,
&dir,
&self.0.custom_ignore_filenames,
self.0.opts.ignore_case_insensitive,
);
errs.maybe_push(err);
m
};
let ig_matcher = if !self.0.opts.ignore {
Gitignore::empty()
} else {
let (m, err) = create_gitignore(
&dir,
&dir,
&[".ignore"],
self.0.opts.ignore_case_insensitive,
);
errs.maybe_push(err);
m
};
let gi_matcher = if !self.0.opts.git_ignore {
Gitignore::empty()
} else {
let (m, err) = create_gitignore(
&dir,
&dir,
&[".gitignore"],
self.0.opts.ignore_case_insensitive,
);
errs.maybe_push(err);
m
};
let gi_exclude_matcher = if !self.0.opts.git_exclude {
Gitignore::empty()
} else {
match resolve_git_commondir(dir, git_type) {
Ok(git_dir) => {
let (m, err) = create_gitignore(
&dir,
&git_dir,
&["info/exclude"],
self.0.opts.ignore_case_insensitive,
);
errs.maybe_push(err);
m
}
Err(err) => {
errs.maybe_push(err);
Gitignore::empty()
}
}
};
let ig = IgnoreInner {
compiled: self.0.compiled.clone(),
dir: dir.to_path_buf(),
@@ -261,17 +310,44 @@ impl Ignore {
git_global_matcher: self.0.git_global_matcher.clone(),
git_ignore_matcher: gi_matcher,
git_exclude_matcher: gi_exclude_matcher,
has_git: dir.join(".git").is_dir(),
has_git: has_git,
opts: self.0.opts,
};
(ig, errs.into_error_option())
}
/// Returns true if at least one type of ignore rule should be matched.
fn has_any_ignore_rules(&self) -> bool {
let opts = self.0.opts;
let has_custom_ignore_files =
!self.0.custom_ignore_filenames.is_empty();
let has_explicit_ignores = !self.0.explicit_ignores.is_empty();
opts.ignore
|| opts.git_global
|| opts.git_ignore
|| opts.git_exclude
|| has_custom_ignore_files
|| has_explicit_ignores
}
/// Like `matched`, but works with a directory entry instead.
pub fn matched_dir_entry<'a>(
&'a self,
dent: &DirEntry,
) -> Match<IgnoreMatch<'a>> {
let m = self.matched(dent.path(), dent.is_dir());
if m.is_none() && self.0.opts.hidden && is_hidden(dent) {
return Match::Ignore(IgnoreMatch::hidden());
}
m
}
/// Returns a match indicating whether the given file path should be
/// ignored or not.
///
/// The match contains information about its origin.
pub fn matched<'a, P: AsRef<Path>>(
fn matched<'a, P: AsRef<Path>>(
&'a self,
path: P,
is_dir: bool,
@@ -287,15 +363,17 @@ impl Ignore {
// return that result immediately. Overrides have the highest
// precedence.
if !self.0.overrides.is_empty() {
let mat =
self.0.overrides.matched(path, is_dir)
.map(IgnoreMatch::overrides);
let mat = self
.0
.overrides
.matched(path, is_dir)
.map(IgnoreMatch::overrides);
if !mat.is_none() {
return mat;
}
}
let mut whitelisted = Match::None;
if self.0.opts.has_any_ignore_options() {
if self.has_any_ignore_rules() {
let mat = self.matched_ignore(path, is_dir);
if mat.is_ignore() {
return mat;
@@ -312,9 +390,6 @@ impl Ignore {
whitelisted = mat;
}
}
if whitelisted.is_none() && self.0.opts.hidden && is_hidden(path) {
return Match::Ignore(IgnoreMatch::hidden());
}
whitelisted
}
@@ -325,56 +400,75 @@ impl Ignore {
path: &Path,
is_dir: bool,
) -> Match<IgnoreMatch<'a>> {
let (mut m_custom_ignore, mut m_ignore, mut m_gi, mut m_gi_exclude, mut m_explicit) =
(Match::None, Match::None, Match::None, Match::None, Match::None);
let (
mut m_custom_ignore,
mut m_ignore,
mut m_gi,
mut m_gi_exclude,
mut m_explicit,
) = (Match::None, Match::None, Match::None, Match::None, Match::None);
let any_git =
!self.0.opts.require_git || self.parents().any(|ig| ig.0.has_git);
let mut saw_git = false;
for ig in self.parents().take_while(|ig| !ig.0.is_absolute_parent) {
if m_custom_ignore.is_none() {
m_custom_ignore =
ig.0.custom_ignore_matcher.matched(path, is_dir)
.map(IgnoreMatch::gitignore);
ig.0.custom_ignore_matcher
.matched(path, is_dir)
.map(IgnoreMatch::gitignore);
}
if m_ignore.is_none() {
m_ignore =
ig.0.ignore_matcher.matched(path, is_dir)
.map(IgnoreMatch::gitignore);
ig.0.ignore_matcher
.matched(path, is_dir)
.map(IgnoreMatch::gitignore);
}
if !saw_git && m_gi.is_none() {
if any_git && !saw_git && m_gi.is_none() {
m_gi =
ig.0.git_ignore_matcher.matched(path, is_dir)
.map(IgnoreMatch::gitignore);
ig.0.git_ignore_matcher
.matched(path, is_dir)
.map(IgnoreMatch::gitignore);
}
if !saw_git && m_gi_exclude.is_none() {
if any_git && !saw_git && m_gi_exclude.is_none() {
m_gi_exclude =
ig.0.git_exclude_matcher.matched(path, is_dir)
.map(IgnoreMatch::gitignore);
ig.0.git_exclude_matcher
.matched(path, is_dir)
.map(IgnoreMatch::gitignore);
}
saw_git = saw_git || ig.0.has_git;
}
if let Some(abs_parent_path) = self.absolute_base() {
let path = abs_parent_path.join(path);
for ig in self.parents().skip_while(|ig|!ig.0.is_absolute_parent) {
if m_custom_ignore.is_none() {
m_custom_ignore =
ig.0.custom_ignore_matcher.matched(&path, is_dir)
.map(IgnoreMatch::gitignore);
if self.0.opts.parents {
if let Some(abs_parent_path) = self.absolute_base() {
let path = abs_parent_path.join(path);
for ig in
self.parents().skip_while(|ig| !ig.0.is_absolute_parent)
{
if m_custom_ignore.is_none() {
m_custom_ignore =
ig.0.custom_ignore_matcher
.matched(&path, is_dir)
.map(IgnoreMatch::gitignore);
}
if m_ignore.is_none() {
m_ignore =
ig.0.ignore_matcher
.matched(&path, is_dir)
.map(IgnoreMatch::gitignore);
}
if any_git && !saw_git && m_gi.is_none() {
m_gi =
ig.0.git_ignore_matcher
.matched(&path, is_dir)
.map(IgnoreMatch::gitignore);
}
if any_git && !saw_git && m_gi_exclude.is_none() {
m_gi_exclude =
ig.0.git_exclude_matcher
.matched(&path, is_dir)
.map(IgnoreMatch::gitignore);
}
saw_git = saw_git || ig.0.has_git;
}
if m_ignore.is_none() {
m_ignore =
ig.0.ignore_matcher.matched(&path, is_dir)
.map(IgnoreMatch::gitignore);
}
if !saw_git && m_gi.is_none() {
m_gi =
ig.0.git_ignore_matcher.matched(&path, is_dir)
.map(IgnoreMatch::gitignore);
}
if !saw_git && m_gi_exclude.is_none() {
m_gi_exclude =
ig.0.git_exclude_matcher.matched(&path, is_dir)
.map(IgnoreMatch::gitignore);
}
saw_git = saw_git || ig.0.has_git;
}
}
for gi in self.0.explicit_ignores.iter().rev() {
@@ -383,10 +477,21 @@ impl Ignore {
}
m_explicit = gi.matched(&path, is_dir).map(IgnoreMatch::gitignore);
}
let m_global = self.0.git_global_matcher.matched(&path, is_dir)
.map(IgnoreMatch::gitignore);
let m_global = if any_git {
self.0
.git_global_matcher
.matched(&path, is_dir)
.map(IgnoreMatch::gitignore)
} else {
Match::None
};
m_custom_ignore.or(m_ignore).or(m_gi).or(m_gi_exclude).or(m_global).or(m_explicit)
m_custom_ignore
.or(m_ignore)
.or(m_gi)
.or(m_gi_exclude)
.or(m_global)
.or(m_explicit)
}
/// Returns an iterator over parent ignore matchers, including this one.
@@ -452,9 +557,12 @@ impl IgnoreBuilder {
opts: IgnoreOptions {
hidden: true,
ignore: true,
parents: true,
git_global: true,
git_ignore: true,
git_exclude: true,
ignore_case_insensitive: false,
require_git: true,
},
}
}
@@ -464,16 +572,19 @@ impl IgnoreBuilder {
/// The matcher returned won't match anything until ignore rules from
/// directories are added to it.
pub fn build(&self) -> Ignore {
let git_global_matcher =
if !self.opts.git_global {
Gitignore::empty()
} else {
let (gi, err) = Gitignore::global();
if let Some(err) = err {
debug!("{}", err);
}
gi
};
let git_global_matcher = if !self.opts.git_global {
Gitignore::empty()
} else {
let mut builder = GitignoreBuilder::new("");
builder
.case_insensitive(self.opts.ignore_case_insensitive)
.unwrap();
let (gi, err) = builder.build_global();
if let Some(err) = err {
debug!("{}", err);
}
gi
};
Ignore(Arc::new(IgnoreInner {
compiled: Arc::new(RwLock::new(HashMap::new())),
@@ -484,7 +595,9 @@ impl IgnoreBuilder {
is_absolute_parent: true,
absolute_base: None,
explicit_ignores: Arc::new(self.explicit_ignores.clone()),
custom_ignore_filenames: Arc::new(self.custom_ignore_filenames.clone()),
custom_ignore_filenames: Arc::new(
self.custom_ignore_filenames.clone(),
),
custom_ignore_matcher: Gitignore::empty(),
ignore_matcher: Gitignore::empty(),
git_global_matcher: Arc::new(git_global_matcher),
@@ -529,7 +642,7 @@ impl IgnoreBuilder {
/// later names.
pub fn add_custom_ignore_filename<S: AsRef<OsStr>>(
&mut self,
file_name: S
file_name: S,
) -> &mut IgnoreBuilder {
self.custom_ignore_filenames.push(file_name.as_ref().to_os_string());
self
@@ -554,6 +667,17 @@ impl IgnoreBuilder {
self
}
/// Enables reading ignore files from parent directories.
///
/// If this is enabled, then .gitignore files in parent directories of each
/// file path given are respected. Otherwise, they are ignored.
///
/// This is enabled by default.
pub fn parents(&mut self, yes: bool) -> &mut IgnoreBuilder {
self.opts.parents = yes;
self
}
/// Add a global gitignore matcher.
///
/// Its precedence is lower than both normal `.gitignore` files and
@@ -588,23 +712,63 @@ impl IgnoreBuilder {
self.opts.git_exclude = yes;
self
}
/// Whether a git repository is required to apply git-related ignore
/// rules (global rules, .gitignore and local exclude rules).
///
/// When disabled, git-related ignore rules are applied even when searching
/// outside a git repository.
pub fn require_git(&mut self, yes: bool) -> &mut IgnoreBuilder {
self.opts.require_git = yes;
self
}
/// Process ignore files case insensitively
///
/// This is disabled by default.
pub fn ignore_case_insensitive(
&mut self,
yes: bool,
) -> &mut IgnoreBuilder {
self.opts.ignore_case_insensitive = yes;
self
}
}
/// Creates a new gitignore matcher for the directory given.
///
/// Ignore globs are extracted from each of the file names in `dir` in the
/// order given (earlier names have lower precedence than later names).
/// The matcher is meant to match files below `dir`.
/// Ignore globs are extracted from each of the file names relative to
/// `dir_for_ignorefile` in the order given (earlier names have lower
/// precedence than later names).
///
/// I/O errors are ignored.
pub fn create_gitignore<T: AsRef<OsStr>>(
dir: &Path,
dir_for_ignorefile: &Path,
names: &[T],
case_insensitive: bool,
) -> (Gitignore, Option<Error>) {
let mut builder = GitignoreBuilder::new(dir);
let mut errs = PartialErrorBuilder::default();
builder.case_insensitive(case_insensitive).unwrap();
for name in names {
let gipath = dir.join(name.as_ref());
errs.maybe_push_ignore_io(builder.add(gipath));
let gipath = dir_for_ignorefile.join(name.as_ref());
// This check is not necessary, but is added for performance. Namely,
// a simple stat call checking for existence can often be just a bit
// quicker than actually trying to open a file. Since the number of
// directories without ignore files likely greatly exceeds the number
// with ignore files, this check generally makes sense.
//
// However, until demonstrated otherwise, we speculatively do not do
// this on Windows since Windows is notorious for having slow file
// system operations. Namely, it's not clear whether this analysis
// makes sense on Windows.
//
// For more details: https://github.com/BurntSushi/ripgrep/pull/1381
if cfg!(windows) || gipath.exists() {
errs.maybe_push_ignore_io(builder.add(gipath));
}
}
let gi = match builder.build() {
Ok(gi) => gi,
@@ -616,16 +780,71 @@ pub fn create_gitignore<T: AsRef<OsStr>>(
(gi, errs.into_error_option())
}
/// Find the GIT_COMMON_DIR for the given git worktree.
///
/// This is the directory that may contain a private ignore file
/// "info/exclude". Unlike git, this function does *not* read environment
/// variables GIT_DIR and GIT_COMMON_DIR, because it is not clear how to use
/// them when multiple repositories are searched.
///
/// Some I/O errors are ignored.
fn resolve_git_commondir(
dir: &Path,
git_type: Option<FileType>,
) -> Result<PathBuf, Option<Error>> {
let git_dir_path = || dir.join(".git");
let git_dir = git_dir_path();
if !git_type.map_or(false, |ft| ft.is_file()) {
return Ok(git_dir);
}
let file = match File::open(git_dir) {
Ok(file) => io::BufReader::new(file),
Err(err) => {
return Err(Some(Error::Io(err).with_path(git_dir_path())));
}
};
let dot_git_line = match file.lines().next() {
Some(Ok(line)) => line,
Some(Err(err)) => {
return Err(Some(Error::Io(err).with_path(git_dir_path())));
}
None => return Err(None),
};
if !dot_git_line.starts_with("gitdir: ") {
return Err(None);
}
let real_git_dir = PathBuf::from(&dot_git_line["gitdir: ".len()..]);
let git_commondir_file = || real_git_dir.join("commondir");
let file = match File::open(git_commondir_file()) {
Ok(file) => io::BufReader::new(file),
Err(err) => {
return Err(Some(Error::Io(err).with_path(git_commondir_file())));
}
};
let commondir_line = match file.lines().next() {
Some(Ok(line)) => line,
Some(Err(err)) => {
return Err(Some(Error::Io(err).with_path(git_commondir_file())));
}
None => return Err(None),
};
let commondir_abs = if commondir_line.starts_with(".") {
real_git_dir.join(commondir_line) // relative commondir
} else {
PathBuf::from(commondir_line)
};
Ok(commondir_abs)
}
#[cfg(test)]
mod tests {
use std::fs::{self, File};
use std::io::Write;
use std::io::{self, Write};
use std::path::Path;
use tempdir::TempDir;
use dir::IgnoreBuilder;
use gitignore::Gitignore;
use tests::TempDir;
use Error;
fn wfile<P: AsRef<Path>>(path: P, contents: &str) {
@@ -644,15 +863,19 @@ mod tests {
}
}
fn tmpdir() -> TempDir {
TempDir::new().unwrap()
}
#[test]
fn explicit_ignore() {
let td = TempDir::new("ignore-test-").unwrap();
let td = tmpdir();
wfile(td.path().join("not-an-ignore"), "foo\n!bar");
let (gi, err) = Gitignore::new(td.path().join("not-an-ignore"));
assert!(err.is_none());
let (ig, err) = IgnoreBuilder::new()
.add_ignore(gi).build().add_child(td.path());
let (ig, err) =
IgnoreBuilder::new().add_ignore(gi).build().add_child(td.path());
assert!(err.is_none());
assert!(ig.matched("foo", false).is_ignore());
assert!(ig.matched("bar", false).is_whitelist());
@@ -661,7 +884,7 @@ mod tests {
#[test]
fn git_exclude() {
let td = TempDir::new("ignore-test-").unwrap();
let td = tmpdir();
mkdirp(td.path().join(".git/info"));
wfile(td.path().join(".git/info/exclude"), "foo\n!bar");
@@ -674,7 +897,8 @@ mod tests {
#[test]
fn gitignore() {
let td = TempDir::new("ignore-test-").unwrap();
let td = tmpdir();
mkdirp(td.path().join(".git"));
wfile(td.path().join(".gitignore"), "foo\n!bar");
let (ig, err) = IgnoreBuilder::new().build().add_child(td.path());
@@ -684,9 +908,36 @@ mod tests {
assert!(ig.matched("baz", false).is_none());
}
#[test]
fn gitignore_no_git() {
let td = tmpdir();
wfile(td.path().join(".gitignore"), "foo\n!bar");
let (ig, err) = IgnoreBuilder::new().build().add_child(td.path());
assert!(err.is_none());
assert!(ig.matched("foo", false).is_none());
assert!(ig.matched("bar", false).is_none());
assert!(ig.matched("baz", false).is_none());
}
#[test]
fn gitignore_allowed_no_git() {
let td = tmpdir();
wfile(td.path().join(".gitignore"), "foo\n!bar");
let (ig, err) = IgnoreBuilder::new()
.require_git(false)
.build()
.add_child(td.path());
assert!(err.is_none());
assert!(ig.matched("foo", false).is_ignore());
assert!(ig.matched("bar", false).is_whitelist());
assert!(ig.matched("baz", false).is_none());
}
#[test]
fn ignore() {
let td = TempDir::new("ignore-test-").unwrap();
let td = tmpdir();
wfile(td.path().join(".ignore"), "foo\n!bar");
let (ig, err) = IgnoreBuilder::new().build().add_child(td.path());
@@ -698,13 +949,14 @@ mod tests {
#[test]
fn custom_ignore() {
let td = TempDir::new("ignore-test-").unwrap();
let td = tmpdir();
let custom_ignore = ".customignore";
wfile(td.path().join(custom_ignore), "foo\n!bar");
let (ig, err) = IgnoreBuilder::new()
.add_custom_ignore_filename(custom_ignore)
.build().add_child(td.path());
.build()
.add_child(td.path());
assert!(err.is_none());
assert!(ig.matched("foo", false).is_ignore());
assert!(ig.matched("bar", false).is_whitelist());
@@ -714,14 +966,15 @@ mod tests {
// Tests that a custom ignore file will override an .ignore.
#[test]
fn custom_ignore_over_ignore() {
let td = TempDir::new("ignore-test-").unwrap();
let td = tmpdir();
let custom_ignore = ".customignore";
wfile(td.path().join(".ignore"), "foo");
wfile(td.path().join(custom_ignore), "!foo");
let (ig, err) = IgnoreBuilder::new()
.add_custom_ignore_filename(custom_ignore)
.build().add_child(td.path());
.build()
.add_child(td.path());
assert!(err.is_none());
assert!(ig.matched("foo", false).is_whitelist());
}
@@ -729,7 +982,7 @@ mod tests {
// Tests that earlier custom ignore files have lower precedence than later.
#[test]
fn custom_ignore_precedence() {
let td = TempDir::new("ignore-test-").unwrap();
let td = tmpdir();
let custom_ignore1 = ".customignore1";
let custom_ignore2 = ".customignore2";
wfile(td.path().join(custom_ignore1), "foo");
@@ -738,7 +991,8 @@ mod tests {
let (ig, err) = IgnoreBuilder::new()
.add_custom_ignore_filename(custom_ignore1)
.add_custom_ignore_filename(custom_ignore2)
.build().add_child(td.path());
.build()
.add_child(td.path());
assert!(err.is_none());
assert!(ig.matched("foo", false).is_whitelist());
}
@@ -746,7 +1000,7 @@ mod tests {
// Tests that an .ignore will override a .gitignore.
#[test]
fn ignore_over_gitignore() {
let td = TempDir::new("ignore-test-").unwrap();
let td = tmpdir();
wfile(td.path().join(".gitignore"), "foo");
wfile(td.path().join(".ignore"), "!foo");
@@ -758,7 +1012,7 @@ mod tests {
// Tests that exclude has lower precedent than both .ignore and .gitignore.
#[test]
fn exclude_lowest() {
let td = TempDir::new("ignore-test-").unwrap();
let td = tmpdir();
wfile(td.path().join(".gitignore"), "!foo");
wfile(td.path().join(".ignore"), "!bar");
mkdirp(td.path().join(".git/info"));
@@ -773,8 +1027,8 @@ mod tests {
#[test]
fn errored() {
let td = TempDir::new("ignore-test-").unwrap();
wfile(td.path().join(".gitignore"), "f**oo");
let td = tmpdir();
wfile(td.path().join(".gitignore"), "{foo");
let (_, err) = IgnoreBuilder::new().build().add_child(td.path());
assert!(err.is_some());
@@ -782,9 +1036,9 @@ mod tests {
#[test]
fn errored_both() {
let td = TempDir::new("ignore-test-").unwrap();
wfile(td.path().join(".gitignore"), "f**oo");
wfile(td.path().join(".ignore"), "fo**o");
let td = tmpdir();
wfile(td.path().join(".gitignore"), "{foo");
wfile(td.path().join(".ignore"), "{bar");
let (_, err) = IgnoreBuilder::new().build().add_child(td.path());
assert_eq!(2, partial(err.expect("an error")).len());
@@ -792,8 +1046,9 @@ mod tests {
#[test]
fn errored_partial() {
let td = TempDir::new("ignore-test-").unwrap();
wfile(td.path().join(".gitignore"), "f**oo\nbar");
let td = tmpdir();
mkdirp(td.path().join(".git"));
wfile(td.path().join(".gitignore"), "{foo\nbar");
let (ig, err) = IgnoreBuilder::new().build().add_child(td.path());
assert!(err.is_some());
@@ -802,8 +1057,8 @@ mod tests {
#[test]
fn errored_partial_and_ignore() {
let td = TempDir::new("ignore-test-").unwrap();
wfile(td.path().join(".gitignore"), "f**oo\nbar");
let td = tmpdir();
wfile(td.path().join(".gitignore"), "{foo\nbar");
wfile(td.path().join(".ignore"), "!bar");
let (ig, err) = IgnoreBuilder::new().build().add_child(td.path());
@@ -813,7 +1068,7 @@ mod tests {
#[test]
fn not_present_empty() {
let td = TempDir::new("ignore-test-").unwrap();
let td = tmpdir();
let (_, err) = IgnoreBuilder::new().build().add_child(td.path());
assert!(err.is_none());
@@ -823,7 +1078,7 @@ mod tests {
fn stops_at_git_dir() {
// This tests that .gitignore files beyond a .git barrier aren't
// matched, but .ignore files are.
let td = TempDir::new("ignore-test-").unwrap();
let td = tmpdir();
mkdirp(td.path().join(".git"));
mkdirp(td.path().join("foo/.git"));
wfile(td.path().join(".gitignore"), "foo");
@@ -844,7 +1099,7 @@ mod tests {
#[test]
fn absolute_parent() {
let td = TempDir::new("ignore-test-").unwrap();
let td = tmpdir();
mkdirp(td.path().join(".git"));
mkdirp(td.path().join("foo"));
wfile(td.path().join(".gitignore"), "bar");
@@ -867,7 +1122,7 @@ mod tests {
#[test]
fn absolute_parent_anchored() {
let td = TempDir::new("ignore-test-").unwrap();
let td = tmpdir();
mkdirp(td.path().join(".git"));
mkdirp(td.path().join("src/llvm"));
wfile(td.path().join(".gitignore"), "/llvm/\nfoo");
@@ -884,4 +1139,62 @@ mod tests {
assert!(ig2.matched("foo", false).is_ignore());
assert!(ig2.matched("src/foo", false).is_ignore());
}
#[test]
fn git_info_exclude_in_linked_worktree() {
let td = tmpdir();
let git_dir = td.path().join(".git");
mkdirp(git_dir.join("info"));
wfile(git_dir.join("info/exclude"), "ignore_me");
mkdirp(git_dir.join("worktrees/linked-worktree"));
let commondir_path =
|| git_dir.join("worktrees/linked-worktree/commondir");
mkdirp(td.path().join("linked-worktree"));
let worktree_git_dir_abs = format!(
"gitdir: {}",
git_dir.join("worktrees/linked-worktree").to_str().unwrap(),
);
wfile(td.path().join("linked-worktree/.git"), &worktree_git_dir_abs);
// relative commondir
wfile(commondir_path(), "../..");
let ib = IgnoreBuilder::new().build();
let (ignore, err) = ib.add_child(td.path().join("linked-worktree"));
assert!(err.is_none());
assert!(ignore.matched("ignore_me", false).is_ignore());
// absolute commondir
wfile(commondir_path(), git_dir.to_str().unwrap());
let (ignore, err) = ib.add_child(td.path().join("linked-worktree"));
assert!(err.is_none());
assert!(ignore.matched("ignore_me", false).is_ignore());
// missing commondir file
assert!(fs::remove_file(commondir_path()).is_ok());
let (_, err) = ib.add_child(td.path().join("linked-worktree"));
assert!(err.is_some());
assert!(match err {
Some(Error::WithPath { path, err }) => {
if path != commondir_path() {
false
} else {
match *err {
Error::Io(ioerr) => {
ioerr.kind() == io::ErrorKind::NotFound
}
_ => false,
}
}
}
_ => false,
});
wfile(td.path().join("linked-worktree/.git"), "garbage");
let (_, err) = ib.add_child(td.path().join("linked-worktree"));
assert!(err.is_none());
wfile(td.path().join("linked-worktree/.git"), "gitdir: garbage");
let (_, err) = ib.add_child(td.path().join("linked-worktree"));
assert!(err.is_some());
}
}

View File

@@ -66,6 +66,11 @@ impl Glob {
pub fn is_only_dir(&self) -> bool {
self.is_only_dir
}
/// Returns true if and only if this glob has a `**/` prefix.
fn has_doublestar_prefix(&self) -> bool {
self.actual.starts_with("**/") || self.actual == "**"
}
}
/// Gitignore is a matcher for the globs in one or more gitignore files
@@ -77,7 +82,7 @@ pub struct Gitignore {
globs: Vec<Glob>,
num_ignores: u64,
num_whitelists: u64,
matches: Arc<ThreadLocal<RefCell<Vec<usize>>>>,
matches: Option<Arc<ThreadLocal<RefCell<Vec<usize>>>>>,
}
impl Gitignore {
@@ -121,23 +126,21 @@ impl Gitignore {
/// `$XDG_CONFIG_HOME/git/ignore` is read. If `$XDG_CONFIG_HOME` is not
/// set or is empty, then `$HOME/.config/git/ignore` is used instead.
pub fn global() -> (Gitignore, Option<Error>) {
match gitconfig_excludes_path() {
None => (Gitignore::empty(), None),
Some(path) => {
if !path.is_file() {
(Gitignore::empty(), None)
} else {
Gitignore::new(path)
}
}
}
GitignoreBuilder::new("").build_global()
}
/// Creates a new empty gitignore matcher that never matches anything.
///
/// Its path is empty.
pub fn empty() -> Gitignore {
GitignoreBuilder::new("").build().unwrap()
Gitignore {
set: GlobSet::empty(),
root: PathBuf::from(""),
globs: vec![],
num_ignores: 0,
num_whitelists: 0,
matches: None,
}
}
/// Returns the directory containing this gitignore matcher.
@@ -207,6 +210,11 @@ impl Gitignore {
/// determined by a common suffix of the directory containing this
/// gitignore) is stripped. If there is no common suffix/prefix overlap,
/// then `path` is assumed to be relative to this matcher.
///
/// # Panics
///
/// This method panics if the given file path is not under the root path
/// of this matcher.
pub fn matched_path_or_any_parents<P: AsRef<Path>>(
&self,
path: P,
@@ -216,10 +224,8 @@ impl Gitignore {
return Match::None;
}
let mut path = self.strip(path.as_ref());
debug_assert!(
!path.has_root(),
"path is expect to be under the root"
);
assert!(!path.has_root(), "path is expected to be under the root");
match self.matched_stripped(path, is_dir) {
Match::None => (), // walk up
a_match => return a_match,
@@ -243,7 +249,7 @@ impl Gitignore {
return Match::None;
}
let path = path.as_ref();
let _matches = self.matches.get_default();
let _matches = self.matches.as_ref().unwrap().get_or_default();
let mut matches = _matches.borrow_mut();
let candidate = Candidate::new(path);
self.set.matches_candidate_into(&candidate, &mut *matches);
@@ -278,7 +284,10 @@ impl Gitignore {
// BUT, a file name might not have any directory components to it,
// in which case, we don't want to accidentally strip any part of the
// file name.
if !is_file_name(path) {
//
// As an additional special case, if the root is just `.`, then we
// shouldn't try to strip anything, e.g., when path begins with a `.`.
if self.root != Path::new(".") && !is_file_name(path) {
if let Some(p) = strip_prefix(&self.root, path) {
path = p;
// If we're left with a leading slash, get rid of it.
@@ -292,6 +301,7 @@ impl Gitignore {
}
/// Builds a matcher for a single set of globs from a .gitignore file.
#[derive(Clone, Debug)]
pub struct GitignoreBuilder {
builder: GlobSetBuilder,
root: PathBuf,
@@ -322,23 +332,50 @@ impl GitignoreBuilder {
pub fn build(&self) -> Result<Gitignore, Error> {
let nignore = self.globs.iter().filter(|g| !g.is_whitelist()).count();
let nwhite = self.globs.iter().filter(|g| g.is_whitelist()).count();
let set =
self.builder.build().map_err(|err| {
Error::Glob {
glob: None,
err: err.to_string(),
}
})?;
let set = self
.builder
.build()
.map_err(|err| Error::Glob { glob: None, err: err.to_string() })?;
Ok(Gitignore {
set: set,
root: self.root.clone(),
globs: self.globs.clone(),
num_ignores: nignore as u64,
num_whitelists: nwhite as u64,
matches: Arc::new(ThreadLocal::default()),
matches: Some(Arc::new(ThreadLocal::default())),
})
}
/// Build a global gitignore matcher using the configuration in this
/// builder.
///
/// This consumes ownership of the builder unlike `build` because it
/// must mutate the builder to add the global gitignore globs.
///
/// Note that this ignores the path given to this builder's constructor
/// and instead derives the path automatically from git's global
/// configuration.
pub fn build_global(mut self) -> (Gitignore, Option<Error>) {
match gitconfig_excludes_path() {
None => (Gitignore::empty(), None),
Some(path) => {
if !path.is_file() {
(Gitignore::empty(), None)
} else {
let mut errs = PartialErrorBuilder::default();
errs.maybe_push_ignore_io(self.add(path));
match self.build() {
Ok(gi) => (gi, errs.into_error_option()),
Err(err) => {
errs.push(err);
(Gitignore::empty(), errs.into_error_option())
}
}
}
}
}
}
/// Add each glob from the file path given.
///
/// The file given should be formatted as a `gitignore` file.
@@ -399,6 +436,8 @@ impl GitignoreBuilder {
from: Option<PathBuf>,
mut line: &str,
) -> Result<&mut GitignoreBuilder, Error> {
#![allow(deprecated)]
if line.starts_with("#") {
return Ok(self);
}
@@ -415,7 +454,6 @@ impl GitignoreBuilder {
is_whitelist: false,
is_only_dir: false,
};
let mut literal_separator = false;
let mut is_absolute = false;
if line.starts_with("\\!") || line.starts_with("\\#") {
line = &line[1..];
@@ -430,7 +468,6 @@ impl GitignoreBuilder {
// then the glob can only match the beginning of a path
// (relative to the location of gitignore). We achieve this by
// simply banning wildcards from matching /.
literal_separator = true;
line = &line[1..];
is_absolute = true;
}
@@ -443,18 +480,13 @@ impl GitignoreBuilder {
line = &line[..i];
}
}
// If there is a literal slash, then we note that so that globbing
// doesn't let wildcards match slashes.
glob.actual = line.to_string();
if is_absolute || line.chars().any(|c| c == '/') {
literal_separator = true;
}
// If there was a slash, then this is a glob that must match the entire
// path name. Otherwise, we should let it match anywhere, so use a **/
// prefix.
if !literal_separator {
// If there is a literal slash, then this is a glob that must match the
// entire path name. Otherwise, we should let it match anywhere, so use
// a **/ prefix.
if !is_absolute && !line.chars().any(|c| c == '/') {
// ... but only if we don't already have a **/ prefix.
if !(glob.actual.starts_with("**/") || (glob.actual == "**" && glob.is_only_dir)) {
if !glob.has_doublestar_prefix() {
glob.actual = format!("**/{}", glob.actual);
}
}
@@ -464,17 +496,15 @@ impl GitignoreBuilder {
if glob.actual.ends_with("/**") {
glob.actual = format!("{}/*", glob.actual);
}
let parsed =
GlobBuilder::new(&glob.actual)
.literal_separator(literal_separator)
.case_insensitive(self.case_insensitive)
.build()
.map_err(|err| {
Error::Glob {
glob: Some(glob.original.clone()),
err: err.kind().to_string(),
}
})?;
let parsed = GlobBuilder::new(&glob.actual)
.literal_separator(true)
.case_insensitive(self.case_insensitive)
.backslash_escape(true)
.build()
.map_err(|err| Error::Glob {
glob: Some(glob.original.clone()),
err: err.kind().to_string(),
})?;
self.builder.add(parsed);
self.globs.push(glob);
Ok(self)
@@ -482,10 +512,16 @@ impl GitignoreBuilder {
/// Toggle whether the globs should be matched case insensitively or not.
///
/// When this option is changed, only globs added after the change will be
/// affected.
///
/// This is disabled by default.
pub fn case_insensitive(
&mut self, yes: bool
&mut self,
yes: bool,
) -> Result<&mut GitignoreBuilder, Error> {
// TODO: This should not return a `Result`. Fix this in the next semver
// release.
self.case_insensitive = yes;
Ok(self)
}
@@ -495,16 +531,27 @@ impl GitignoreBuilder {
///
/// Note that the file path returned may not exist.
fn gitconfig_excludes_path() -> Option<PathBuf> {
gitconfig_contents()
.and_then(|data| parse_excludes_file(&data))
.or_else(excludes_file_default)
// git supports $HOME/.gitconfig and $XDG_CONFIG_HOME/git/config. Notably,
// both can be active at the same time, where $HOME/.gitconfig takes
// precedent. So if $HOME/.gitconfig defines a `core.excludesFile`, then
// we're done.
match gitconfig_home_contents().and_then(|x| parse_excludes_file(&x)) {
Some(path) => return Some(path),
None => {}
}
match gitconfig_xdg_contents().and_then(|x| parse_excludes_file(&x)) {
Some(path) => return Some(path),
None => {}
}
excludes_file_default()
}
/// Returns the file contents of git's global config file, if one exists.
fn gitconfig_contents() -> Option<Vec<u8>> {
let home = match env::var_os("HOME") {
/// Returns the file contents of git's global config file, if one exists, in
/// the user's home directory.
fn gitconfig_home_contents() -> Option<Vec<u8>> {
let home = match home_dir() {
None => return None,
Some(home) => PathBuf::from(home),
Some(home) => home,
};
let mut file = match File::open(home.join(".gitconfig")) {
Err(_) => return None,
@@ -514,13 +561,28 @@ fn gitconfig_contents() -> Option<Vec<u8>> {
file.read_to_end(&mut contents).ok().map(|_| contents)
}
/// Returns the file contents of git's global config file, if one exists, in
/// the user's XDG_CONFIG_HOME directory.
fn gitconfig_xdg_contents() -> Option<Vec<u8>> {
let path = env::var_os("XDG_CONFIG_HOME")
.and_then(|x| if x.is_empty() { None } else { Some(PathBuf::from(x)) })
.or_else(|| home_dir().map(|p| p.join(".config")))
.map(|x| x.join("git/config"));
let mut file = match path.and_then(|p| File::open(p).ok()) {
None => return None,
Some(file) => io::BufReader::new(file),
};
let mut contents = vec![];
file.read_to_end(&mut contents).ok().map(|_| contents)
}
/// Returns the default file path for a global .gitignore file.
///
/// Specifically, this respects XDG_CONFIG_HOME.
fn excludes_file_default() -> Option<PathBuf> {
env::var_os("XDG_CONFIG_HOME")
.and_then(|x| if x.is_empty() { None } else { Some(PathBuf::from(x)) })
.or_else(|| env::home_dir().map(|p| p.join(".config")))
.or_else(|| home_dir().map(|p| p.join(".config")))
.map(|x| x.join("git/ignore"))
}
@@ -531,8 +593,8 @@ fn parse_excludes_file(data: &[u8]) -> Option<PathBuf> {
// probably works in more circumstances. I guess we would ideally have
// a full INI parser. Yuck.
lazy_static! {
static ref RE: Regex = Regex::new(
r"(?ium)^\s*excludesfile\s*=\s*(.+)\s*$").unwrap();
static ref RE: Regex =
Regex::new(r"(?im)^\s*excludesfile\s*=\s*(.+)\s*$").unwrap();
};
let caps = match RE.captures(data) {
None => return None,
@@ -543,17 +605,26 @@ fn parse_excludes_file(data: &[u8]) -> Option<PathBuf> {
/// Expands ~ in file paths to the value of $HOME.
fn expand_tilde(path: &str) -> String {
let home = match env::var("HOME") {
Err(_) => return path.to_string(),
Ok(home) => home,
let home = match home_dir() {
None => return path.to_string(),
Some(home) => home.to_string_lossy().into_owned(),
};
path.replace("~", &home)
}
/// Returns the location of the user's home directory.
fn home_dir() -> Option<PathBuf> {
// We're fine with using env::home_dir for now. Its bugs are, IMO, pretty
// minor corner cases. We should still probably eventually migrate to
// the `dirs` crate to get a proper implementation.
#![allow(deprecated)]
env::home_dir()
}
#[cfg(test)]
mod tests {
use std::path::Path;
use super::{Gitignore, GitignoreBuilder};
use std::path::Path;
fn gi_from_str<P: AsRef<Path>>(root: P, s: &str) -> Gitignore {
let mut builder = GitignoreBuilder::new(root);
@@ -620,6 +691,19 @@ mod tests {
ignored!(ig29, ROOT, "node_modules/ ", "node_modules", true);
ignored!(ig30, ROOT, "**/", "foo/bar", true);
ignored!(ig31, ROOT, "path1/*", "path1/foo");
ignored!(ig32, ROOT, ".a/b", ".a/b");
ignored!(ig33, "./", ".a/b", ".a/b");
ignored!(ig34, ".", ".a/b", ".a/b");
ignored!(ig35, "./.", ".a/b", ".a/b");
ignored!(ig36, "././", ".a/b", ".a/b");
ignored!(ig37, "././.", ".a/b", ".a/b");
ignored!(ig38, ROOT, "\\[", "[");
ignored!(ig39, ROOT, "\\?", "?");
ignored!(ig40, ROOT, "\\*", "*");
ignored!(ig41, ROOT, "\\a", "a");
ignored!(ig42, ROOT, "s*.rs", "sfoo.rs");
ignored!(ig43, ROOT, "**", "foo.rs");
ignored!(ig44, ROOT, "**/**/*", "a/foo.rs");
not_ignored!(ignot1, ROOT, "amonths", "months");
not_ignored!(ignot2, ROOT, "monthsa", "months");
@@ -635,12 +719,16 @@ mod tests {
not_ignored!(ignot12, ROOT, "\n\n\n", "foo");
not_ignored!(ignot13, ROOT, "foo/**", "foo", true);
not_ignored!(
ignot14, "./third_party/protobuf", "m4/ltoptions.m4",
"./third_party/protobuf/csharp/src/packages/repositories.config");
ignot14,
"./third_party/protobuf",
"m4/ltoptions.m4",
"./third_party/protobuf/csharp/src/packages/repositories.config"
);
not_ignored!(ignot15, ROOT, "!/bar", "foo/bar");
not_ignored!(ignot16, ROOT, "*\n!**/", "foo", true);
not_ignored!(ignot17, ROOT, "src/*.rs", "src/grep/src/main.rs");
not_ignored!(ignot18, ROOT, "path1/*", "path2/path1/foo");
not_ignored!(ignot19, ROOT, "s*.rs", "src/foo.rs");
fn bytes(s: &str) -> Vec<u8> {
s.to_string().into_bytes()
@@ -679,9 +767,12 @@ mod tests {
#[test]
fn case_insensitive() {
let gi = GitignoreBuilder::new(ROOT)
.case_insensitive(true).unwrap()
.add_str(None, "*.html").unwrap()
.build().unwrap();
.case_insensitive(true)
.unwrap()
.add_str(None, "*.html")
.unwrap()
.build()
.unwrap();
assert!(gi.matched("foo.html", false).is_ignore());
assert!(gi.matched("foo.HTML", false).is_ignore());
assert!(!gi.matched("foo.htm", false).is_ignore());

View File

@@ -46,7 +46,7 @@ See the documentation for `WalkBuilder` for many other options.
#![deny(missing_docs)]
extern crate crossbeam;
extern crate crossbeam_channel as channel;
extern crate globset;
#[macro_use]
extern crate lazy_static;
@@ -55,24 +55,26 @@ extern crate log;
extern crate memchr;
extern crate regex;
extern crate same_file;
#[cfg(test)]
extern crate tempdir;
extern crate thread_local;
extern crate walkdir;
#[cfg(windows)]
extern crate winapi;
extern crate winapi_util;
use std::error;
use std::fmt;
use std::io;
use std::path::{Path, PathBuf};
pub use walk::{DirEntry, Walk, WalkBuilder, WalkParallel, WalkState};
pub use walk::{
DirEntry, ParallelVisitor, ParallelVisitorBuilder, Walk, WalkBuilder,
WalkParallel, WalkState,
};
mod default_types;
mod dir;
pub mod gitignore;
mod pathutil;
pub mod overrides;
mod pathutil;
pub mod types;
mod walk;
@@ -132,6 +134,38 @@ pub enum Error {
InvalidDefinition,
}
impl Clone for Error {
fn clone(&self) -> Error {
match *self {
Error::Partial(ref errs) => Error::Partial(errs.clone()),
Error::WithLineNumber { line, ref err } => {
Error::WithLineNumber { line: line, err: err.clone() }
}
Error::WithPath { ref path, ref err } => {
Error::WithPath { path: path.clone(), err: err.clone() }
}
Error::WithDepth { depth, ref err } => {
Error::WithDepth { depth: depth, err: err.clone() }
}
Error::Loop { ref ancestor, ref child } => Error::Loop {
ancestor: ancestor.clone(),
child: child.clone(),
},
Error::Io(ref err) => match err.raw_os_error() {
Some(e) => Error::Io(io::Error::from_raw_os_error(e)),
None => Error::Io(io::Error::new(err.kind(), err.to_string())),
},
Error::Glob { ref glob, ref err } => {
Error::Glob { glob: glob.clone(), err: err.clone() }
}
Error::UnrecognizedFileType(ref err) => {
Error::UnrecognizedFileType(err.clone())
}
Error::InvalidDefinition => Error::InvalidDefinition,
}
}
}
impl Error {
/// Returns true if this is a partial error.
///
@@ -183,19 +217,14 @@ impl Error {
/// Turn an error into a tagged error with the given depth.
fn with_depth(self, depth: usize) -> Error {
Error::WithDepth {
depth: depth,
err: Box::new(self),
}
Error::WithDepth { depth: depth, err: Box::new(self) }
}
/// Turn an error into a tagged error with the given file path and line
/// number. If path is empty, then it is omitted from the error.
fn tagged<P: AsRef<Path>>(self, path: P, lineno: u64) -> Error {
let errline = Error::WithLineNumber {
line: lineno,
err: Box::new(self),
};
let errline =
Error::WithLineNumber { line: lineno, err: Box::new(self) };
if path.as_ref().as_os_str().is_empty() {
return errline;
}
@@ -217,16 +246,14 @@ impl Error {
let path = err.path().map(|p| p.to_path_buf());
let mut ig_err = Error::Io(io::Error::from(err));
if let Some(path) = path {
ig_err = Error::WithPath {
path: path,
err: Box::new(ig_err),
};
ig_err = Error::WithPath { path: path, err: Box::new(ig_err) };
}
ig_err
}
}
impl error::Error for Error {
#[allow(deprecated)]
fn description(&self) -> &str {
match *self {
Error::Partial(_) => "partial error",
@@ -257,11 +284,13 @@ impl fmt::Display for Error {
write!(f, "{}: {}", path.display(), err)
}
Error::WithDepth { ref err, .. } => err.fmt(f),
Error::Loop { ref ancestor, ref child } => {
write!(f, "File system loop found: \
Error::Loop { ref ancestor, ref child } => write!(
f,
"File system loop found: \
{} points to an ancestor {}",
child.display(), ancestor.display())
}
child.display(),
ancestor.display()
),
Error::Io(ref err) => err.fmt(f),
Error::Glob { glob: None, ref err } => write!(f, "{}", err),
Error::Glob { glob: Some(ref glob), ref err } => {
@@ -270,10 +299,11 @@ impl fmt::Display for Error {
Error::UnrecognizedFileType(ref ty) => {
write!(f, "unrecognized file type: {}", ty)
}
Error::InvalidDefinition => {
write!(f, "invalid definition (format is type:glob, e.g., \
html:*.html)")
}
Error::InvalidDefinition => write!(
f,
"invalid definition (format is type:glob, e.g., \
html:*.html)"
),
}
}
}
@@ -404,3 +434,66 @@ impl<T> Match<T> {
}
}
}
#[cfg(test)]
mod tests {
use std::env;
use std::error;
use std::fs;
use std::path::{Path, PathBuf};
use std::result;
/// A convenient result type alias.
pub type Result<T> =
result::Result<T, Box<dyn error::Error + Send + Sync>>;
macro_rules! err {
($($tt:tt)*) => {
Box::<dyn error::Error + Send + Sync>::from(format!($($tt)*))
}
}
/// A simple wrapper for creating a temporary directory that is
/// automatically deleted when it's dropped.
///
/// We use this in lieu of tempfile because tempfile brings in too many
/// dependencies.
#[derive(Debug)]
pub struct TempDir(PathBuf);
impl Drop for TempDir {
fn drop(&mut self) {
fs::remove_dir_all(&self.0).unwrap();
}
}
impl TempDir {
/// Create a new empty temporary directory under the system's configured
/// temporary directory.
pub fn new() -> Result<TempDir> {
use std::sync::atomic::{AtomicUsize, Ordering};
static TRIES: usize = 100;
static COUNTER: AtomicUsize = AtomicUsize::new(0);
let tmpdir = env::temp_dir();
for _ in 0..TRIES {
let count = COUNTER.fetch_add(1, Ordering::SeqCst);
let path = tmpdir.join("rust-ignore").join(count.to_string());
if path.is_dir() {
continue;
}
fs::create_dir_all(&path).map_err(|e| {
err!("failed to create {}: {}", path.display(), e)
})?;
return Ok(TempDir(path));
}
Err(err!("failed to create temp dir after {} tries", TRIES))
}
/// Return the underlying path to this temporary directory.
pub fn path(&self) -> &Path {
&self.0
}
}
}

View File

@@ -115,9 +115,7 @@ impl OverrideBuilder {
///
/// Matching is done relative to the directory path provided.
pub fn new<P: AsRef<Path>>(path: P) -> OverrideBuilder {
OverrideBuilder {
builder: GitignoreBuilder::new(path),
}
OverrideBuilder { builder: GitignoreBuilder::new(path) }
}
/// Builds a new override matcher from the globs added so far.
@@ -139,11 +137,16 @@ impl OverrideBuilder {
}
/// Toggle whether the globs should be matched case insensitively or not.
///
///
/// When this option is changed, only globs added after the change will be affected.
///
/// This is disabled by default.
pub fn case_insensitive(
&mut self, yes: bool
&mut self,
yes: bool,
) -> Result<&mut OverrideBuilder, Error> {
// TODO: This should not return a `Result`. Fix this in the next semver
// release.
self.builder.case_insensitive(yes)?;
Ok(self)
}
@@ -235,9 +238,12 @@ mod tests {
#[test]
fn case_insensitive() {
let ov = OverrideBuilder::new(ROOT)
.case_insensitive(true).unwrap()
.add("*.html").unwrap()
.build().unwrap();
.case_insensitive(true)
.unwrap()
.add("*.html")
.unwrap()
.build()
.unwrap();
assert!(ov.matched("foo.html", false).is_whitelist());
assert!(ov.matched("foo.HTML", false).is_whitelist());
assert!(ov.matched("foo.htm", false).is_ignore());
@@ -246,9 +252,8 @@ mod tests {
#[test]
fn default_case_sensitive() {
let ov = OverrideBuilder::new(ROOT)
.add("*.html").unwrap()
.build().unwrap();
let ov =
OverrideBuilder::new(ROOT).add("*.html").unwrap().build().unwrap();
assert!(ov.matched("foo.html", false).is_whitelist());
assert!(ov.matched("foo.HTML", false).is_ignore());
assert!(ov.matched("foo.htm", false).is_ignore());

View File

@@ -1,22 +1,56 @@
use std::ffi::OsStr;
use std::path::Path;
/// Returns true if and only if this file path is considered to be hidden.
use walk::DirEntry;
/// Returns true if and only if this entry is considered to be hidden.
///
/// This only returns true if the base name of the path starts with a `.`.
///
/// On Unix, this implements a more optimized check.
#[cfg(unix)]
pub fn is_hidden<P: AsRef<Path>>(path: P) -> bool {
pub fn is_hidden(dent: &DirEntry) -> bool {
use std::os::unix::ffi::OsStrExt;
if let Some(name) = file_name(path.as_ref()) {
if let Some(name) = file_name(dent.path()) {
name.as_bytes().get(0) == Some(&b'.')
} else {
false
}
}
/// Returns true if and only if this file path is considered to be hidden.
#[cfg(not(unix))]
pub fn is_hidden<P: AsRef<Path>>(path: P) -> bool {
if let Some(name) = file_name(path.as_ref()) {
/// Returns true if and only if this entry is considered to be hidden.
///
/// On Windows, this returns true if one of the following is true:
///
/// * The base name of the path starts with a `.`.
/// * The file attributes have the `HIDDEN` property set.
#[cfg(windows)]
pub fn is_hidden(dent: &DirEntry) -> bool {
use std::os::windows::fs::MetadataExt;
use winapi_util::file;
// This looks like we're doing an extra stat call, but on Windows, the
// directory traverser reuses the metadata retrieved from each directory
// entry and stores it on the DirEntry itself. So this is "free."
if let Ok(md) = dent.metadata() {
if file::is_hidden(md.file_attributes() as u64) {
return true;
}
}
if let Some(name) = file_name(dent.path()) {
name.to_str().map(|s| s.starts_with(".")).unwrap_or(false)
} else {
false
}
}
/// Returns true if and only if this entry is considered to be hidden.
///
/// This only returns true if the base name of the path starts with a `.`.
#[cfg(not(any(unix, windows)))]
pub fn is_hidden(dent: &DirEntry) -> bool {
if let Some(name) = file_name(dent.path()) {
name.to_str().map(|s| s.starts_with(".")).unwrap_or(false)
} else {
false
@@ -57,8 +91,8 @@ pub fn strip_prefix<'a, P: AsRef<Path> + ?Sized>(
/// the empty string.
#[cfg(unix)]
pub fn is_file_name<P: AsRef<Path>>(path: P) -> bool {
use std::os::unix::ffi::OsStrExt;
use memchr::memchr;
use std::os::unix::ffi::OsStrExt;
let path = path.as_ref().as_os_str().as_bytes();
memchr(b'/', path).is_none()
@@ -79,8 +113,8 @@ pub fn is_file_name<P: AsRef<Path>>(path: P) -> bool {
pub fn file_name<'a, P: AsRef<Path> + ?Sized>(
path: &'a P,
) -> Option<&'a OsStr> {
use std::os::unix::ffi::OsStrExt;
use memchr::memrchr;
use std::os::unix::ffi::OsStrExt;
let path = path.as_ref().as_os_str().as_bytes();
if path.is_empty() {

View File

@@ -93,205 +93,10 @@ use globset::{GlobBuilder, GlobSet, GlobSetBuilder};
use regex::Regex;
use thread_local::ThreadLocal;
use default_types::DEFAULT_TYPES;
use pathutil::file_name;
use {Error, Match};
const DEFAULT_TYPES: &'static [(&'static str, &'static [&'static str])] = &[
("agda", &["*.agda", "*.lagda"]),
("asciidoc", &["*.adoc", "*.asc", "*.asciidoc"]),
("asm", &["*.asm", "*.s", "*.S"]),
("avro", &["*.avdl", "*.avpr", "*.avsc"]),
("awk", &["*.awk"]),
("bitbake", &["*.bb", "*.bbappend", "*.bbclass", "*.conf", "*.inc"]),
("bzip2", &["*.bz2"]),
("c", &["*.c", "*.h", "*.H"]),
("cabal", &["*.cabal"]),
("cbor", &["*.cbor"]),
("ceylon", &["*.ceylon"]),
("clojure", &["*.clj", "*.cljc", "*.cljs", "*.cljx"]),
("cmake", &["*.cmake", "CMakeLists.txt"]),
("coffeescript", &["*.coffee"]),
("creole", &["*.creole"]),
("config", &["*.cfg", "*.conf", "*.config", "*.ini"]),
("cpp", &[
"*.C", "*.cc", "*.cpp", "*.cxx",
"*.h", "*.H", "*.hh", "*.hpp", "*.hxx", "*.inl",
]),
("crystal", &["Projectfile", "*.cr"]),
("cs", &["*.cs"]),
("csharp", &["*.cs"]),
("cshtml", &["*.cshtml"]),
("css", &["*.css", "*.scss"]),
("cython", &["*.pyx"]),
("dart", &["*.dart"]),
("d", &["*.d"]),
("docker", &["*Dockerfile*"]),
("elisp", &["*.el"]),
("elixir", &["*.ex", "*.eex", "*.exs"]),
("elm", &["*.elm"]),
("erlang", &["*.erl", "*.hrl"]),
("fish", &["*.fish"]),
("fortran", &[
"*.f", "*.F", "*.f77", "*.F77", "*.pfo",
"*.f90", "*.F90", "*.f95", "*.F95",
]),
("fsharp", &["*.fs", "*.fsx", "*.fsi"]),
("gn", &["*.gn", "*.gni"]),
("go", &["*.go"]),
("gzip", &["*.gz"]),
("groovy", &["*.groovy", "*.gradle"]),
("h", &["*.h", "*.hpp"]),
("hbs", &["*.hbs"]),
("haskell", &["*.hs", "*.lhs"]),
("html", &["*.htm", "*.html", "*.ejs"]),
("java", &["*.java"]),
("jinja", &["*.j2", "*.jinja", "*.jinja2"]),
("js", &[
"*.js", "*.jsx", "*.vue",
]),
("json", &["*.json", "composer.lock"]),
("jsonl", &["*.jsonl"]),
("julia", &["*.jl"]),
("jupyter", &["*.ipynb", "*.jpynb"]),
("jl", &["*.jl"]),
("kotlin", &["*.kt", "*.kts"]),
("less", &["*.less"]),
("license", &[
// General
"COPYING", "COPYING[.-]*",
"COPYRIGHT", "COPYRIGHT[.-]*",
"EULA", "EULA[.-]*",
"licen[cs]e", "licen[cs]e.*",
"LICEN[CS]E", "LICEN[CS]E[.-]*", "*[.-]LICEN[CS]E*",
"NOTICE", "NOTICE[.-]*",
"PATENTS", "PATENTS[.-]*",
"UNLICEN[CS]E", "UNLICEN[CS]E[.-]*",
// GPL (gpl.txt, etc.)
"agpl[.-]*",
"gpl[.-]*",
"lgpl[.-]*",
// Other license-specific (APACHE-2.0.txt, etc.)
"AGPL-*[0-9]*",
"APACHE-*[0-9]*",
"BSD-*[0-9]*",
"CC-BY-*",
"GFDL-*[0-9]*",
"GNU-*[0-9]*",
"GPL-*[0-9]*",
"LGPL-*[0-9]*",
"MIT-*[0-9]*",
"MPL-*[0-9]*",
"OFL-*[0-9]*",
]),
("lisp", &["*.el", "*.jl", "*.lisp", "*.lsp", "*.sc", "*.scm"]),
("log", &["*.log"]),
("lua", &["*.lua"]),
("lzma", &["*.lzma"]),
("m4", &["*.ac", "*.m4"]),
("make", &[
"gnumakefile", "Gnumakefile", "GNUmakefile",
"makefile", "Makefile",
"*.mk", "*.mak"
]),
("markdown", &["*.markdown", "*.md", "*.mdown", "*.mkdn"]),
("md", &["*.markdown", "*.md", "*.mdown", "*.mkdn"]),
("man", &["*.[0-9lnpx]", "*.[0-9][cEFMmpSx]"]),
("matlab", &["*.m"]),
("mk", &["mkfile"]),
("ml", &["*.ml"]),
("msbuild", &[
"*.csproj", "*.fsproj", "*.vcxproj", "*.proj", "*.props", "*.targets"
]),
("nim", &["*.nim"]),
("nix", &["*.nix"]),
("objc", &["*.h", "*.m"]),
("objcpp", &["*.h", "*.mm"]),
("ocaml", &["*.ml", "*.mli", "*.mll", "*.mly"]),
("org", &["*.org"]),
("perl", &["*.perl", "*.pl", "*.PL", "*.plh", "*.plx", "*.pm", "*.t"]),
("pdf", &["*.pdf"]),
("php", &["*.php", "*.php3", "*.php4", "*.php5", "*.phtml"]),
("pod", &["*.pod"]),
("protobuf", &["*.proto"]),
("ps", &["*.cdxml", "*.ps1", "*.ps1xml", "*.psd1", "*.psm1"]),
("purs", &["*.purs"]),
("py", &["*.py"]),
("qmake", &["*.pro", "*.pri", "*.prf"]),
("readme", &["README*", "*README"]),
("r", &["*.R", "*.r", "*.Rmd", "*.Rnw"]),
("rdoc", &["*.rdoc"]),
("rst", &["*.rst"]),
("ruby", &["Gemfile", "*.gemspec", ".irbrc", "Rakefile", "*.rb"]),
("rust", &["*.rs"]),
("sass", &["*.sass", "*.scss"]),
("scala", &["*.scala"]),
("sh", &[
// Portable/misc. init files
".login", ".logout", ".profile", "profile",
// bash-specific init files
".bash_login", "bash_login",
".bash_logout", "bash_logout",
".bash_profile", "bash_profile",
".bashrc", "bashrc", "*.bashrc",
// csh-specific init files
".cshrc", "*.cshrc",
// ksh-specific init files
".kshrc", "*.kshrc",
// tcsh-specific init files
".tcshrc",
// zsh-specific init files
".zshenv", "zshenv",
".zlogin", "zlogin",
".zlogout", "zlogout",
".zprofile", "zprofile",
".zshrc", "zshrc",
// Extensions
"*.bash", "*.csh", "*.ksh", "*.sh", "*.tcsh", "*.zsh",
]),
("smarty", &["*.tpl"]),
("sml", &["*.sml", "*.sig"]),
("soy", &["*.soy"]),
("spark", &["*.spark"]),
("sql", &["*.sql", "*.psql"]),
("stylus", &["*.styl"]),
("sv", &["*.v", "*.vg", "*.sv", "*.svh", "*.h"]),
("svg", &["*.svg"]),
("swift", &["*.swift"]),
("swig", &["*.def", "*.i"]),
("systemd", &[
"*.automount", "*.conf", "*.device", "*.link", "*.mount", "*.path",
"*.scope", "*.service", "*.slice", "*.socket", "*.swap", "*.target",
"*.timer",
]),
("taskpaper", &["*.taskpaper"]),
("tcl", &["*.tcl"]),
("tex", &["*.tex", "*.ltx", "*.cls", "*.sty", "*.bib"]),
("textile", &["*.textile"]),
("tf", &["*.tf"]),
("ts", &["*.ts", "*.tsx"]),
("txt", &["*.txt"]),
("toml", &["*.toml", "Cargo.lock"]),
("twig", &["*.twig"]),
("vala", &["*.vala"]),
("vb", &["*.vb"]),
("vim", &["*.vim"]),
("vimscript", &["*.vim"]),
("wiki", &["*.mediawiki", "*.wiki"]),
("webidl", &["*.idl", "*.webidl", "*.widl"]),
("xml", &["*.xml", "*.xml.dist"]),
("xz", &["*.xz"]),
("yacc", &["*.y"]),
("yaml", &["*.yaml", "*.yml"]),
("zsh", &[
".zshenv", "zshenv",
".zlogin", "zlogin",
".zlogout", "zlogout",
".zprofile", "zprofile",
".zshrc", "zshrc",
"*.zsh",
]),
];
/// Glob represents a single glob in a set of file type definitions.
///
/// There may be more than one glob for a particular file type.
@@ -321,13 +126,23 @@ enum GlobInner<'a> {
which: usize,
/// Whether the selection was negated or not.
negated: bool,
}
},
}
impl<'a> Glob<'a> {
fn unmatched() -> Glob<'a> {
Glob(GlobInner::UnmatchedIgnore)
}
/// Return the file type defintion that matched, if one exists. A file type
/// definition always exists when a specific definition matches a file
/// path.
pub fn file_type_def(&self) -> Option<&FileTypeDef> {
match self {
Glob(GlobInner::UnmatchedIgnore) => None,
Glob(GlobInner::Matched { def, .. }) => Some(def),
}
}
}
/// A single file type definition.
@@ -472,7 +287,7 @@ impl Types {
return Match::None;
}
};
let mut matches = self.matches.get_default().borrow_mut();
let mut matches = self.matches.get_or_default().borrow_mut();
self.set.matches_into(name, &mut *matches);
// The highest precedent match is the last one.
if let Some(&i) = matches.last() {
@@ -511,10 +326,7 @@ impl TypesBuilder {
/// of default type definitions can be added with `add_defaults`, and
/// additional type definitions can be added with `select` and `negate`.
pub fn new() -> TypesBuilder {
TypesBuilder {
types: HashMap::new(),
selections: vec![],
}
TypesBuilder { types: HashMap::new(), selections: vec![] }
}
/// Build the current set of file type definitions *and* selections into
@@ -539,19 +351,18 @@ impl TypesBuilder {
GlobBuilder::new(glob)
.literal_separator(true)
.build()
.map_err(|err| {
Error::Glob {
glob: Some(glob.to_string()),
err: err.kind().to_string(),
}
})?);
.map_err(|err| Error::Glob {
glob: Some(glob.to_string()),
err: err.kind().to_string(),
})?,
);
glob_to_selection.push((isel, iglob));
}
selections.push(selection.clone().map(move |_| def));
}
let set = build_set.build().map_err(|err| {
Error::Glob { glob: None, err: err.to_string() }
})?;
let set = build_set
.build()
.map_err(|err| Error::Glob { glob: None, err: err.to_string() })?;
Ok(Types {
defs: defs,
selections: selections,
@@ -623,9 +434,14 @@ impl TypesBuilder {
return Err(Error::InvalidDefinition);
}
let (key, glob) = (name.to_string(), glob.to_string());
self.types.entry(key).or_insert_with(|| {
FileTypeDef { name: name.to_string(), globs: vec![] }
}).globs.push(glob);
self.types
.entry(key)
.or_insert_with(|| FileTypeDef {
name: name.to_string(),
globs: vec![],
})
.globs
.push(glob);
Ok(())
}
@@ -652,7 +468,10 @@ impl TypesBuilder {
3 => {
let name = parts[0];
let types_string = parts[2];
if name.is_empty() || parts[1] != "include" || types_string.is_empty() {
if name.is_empty()
|| parts[1] != "include"
|| types_string.is_empty()
{
return Err(Error::InvalidDefinition);
}
let types = types_string.split(',');
@@ -662,14 +481,15 @@ impl TypesBuilder {
return Err(Error::InvalidDefinition);
}
for type_name in types {
let globs = self.types.get(type_name).unwrap().globs.clone();
let globs =
self.types.get(type_name).unwrap().globs.clone();
for glob in globs {
self.add(name, &glob)?;
}
}
Ok(())
}
_ => Err(Error::InvalidDefinition)
_ => Err(Error::InvalidDefinition),
}
}
@@ -726,7 +546,7 @@ mod tests {
"rust:*.rs",
"js:*.js",
"foo:*.{rs,foo}",
"combo:include:html,rust"
"combo:include:html,rust",
]
}
@@ -760,7 +580,7 @@ mod tests {
"combo:include:html,python",
// Bad format
"combo:foobar:html,rust",
""
"",
];
for def in bad_defs {
assert!(btypes.add_def(def).is_err());

File diff suppressed because it is too large Load Diff

View File

@@ -1,13 +1,11 @@
extern crate ignore;
use std::path::Path;
use ignore::gitignore::{Gitignore, GitignoreBuilder};
const IGNORE_FILE: &'static str = "tests/gitignore_matched_path_or_any_parents_tests.gitignore";
const IGNORE_FILE: &'static str =
"tests/gitignore_matched_path_or_any_parents_tests.gitignore";
fn get_gitignore() -> Gitignore {
let mut builder = GitignoreBuilder::new("ROOT");
@@ -16,9 +14,8 @@ fn get_gitignore() -> Gitignore {
builder.build().unwrap()
}
#[test]
#[should_panic(expected = "path is expect to be under the root")]
#[should_panic(expected = "path is expected to be under the root")]
fn test_path_should_be_under_root() {
let gitignore = get_gitignore();
let path = "/tmp/some_file";
@@ -26,11 +23,12 @@ fn test_path_should_be_under_root() {
assert!(false);
}
#[test]
fn test_files_in_root() {
let gitignore = get_gitignore();
let m = |path: &str| gitignore.matched_path_or_any_parents(Path::new(path), false);
let m = |path: &str| {
gitignore.matched_path_or_any_parents(Path::new(path), false)
};
// 0x
assert!(m("ROOT/file_root_00").is_ignore());
@@ -57,11 +55,12 @@ fn test_files_in_root() {
assert!(m("ROOT/file_root_33").is_none());
}
#[test]
fn test_files_in_deep() {
let gitignore = get_gitignore();
let m = |path: &str| gitignore.matched_path_or_any_parents(Path::new(path), false);
let m = |path: &str| {
gitignore.matched_path_or_any_parents(Path::new(path), false)
};
// 0x
assert!(m("ROOT/parent_dir/file_deep_00").is_ignore());
@@ -88,12 +87,12 @@ fn test_files_in_deep() {
assert!(m("ROOT/parent_dir/file_deep_33").is_none());
}
#[test]
fn test_dirs_in_root() {
let gitignore = get_gitignore();
let m =
|path: &str, is_dir: bool| gitignore.matched_path_or_any_parents(Path::new(path), is_dir);
let m = |path: &str, is_dir: bool| {
gitignore.matched_path_or_any_parents(Path::new(path), is_dir)
};
// 00
assert!(m("ROOT/dir_root_00", true).is_ignore());
@@ -192,12 +191,12 @@ fn test_dirs_in_root() {
assert!(m("ROOT/dir_root_33/child_dir/file", false).is_ignore());
}
#[test]
fn test_dirs_in_deep() {
let gitignore = get_gitignore();
let m =
|path: &str, is_dir: bool| gitignore.matched_path_or_any_parents(Path::new(path), is_dir);
let m = |path: &str, is_dir: bool| {
gitignore.matched_path_or_any_parents(Path::new(path), is_dir)
};
// 00
assert!(m("ROOT/parent_dir/dir_deep_00", true).is_ignore());
@@ -260,13 +259,15 @@ fn test_dirs_in_deep() {
assert!(m("ROOT/parent_dir/dir_deep_21/child_dir/file", false).is_ignore());
// 22
assert!(m("ROOT/parent_dir/dir_deep_22", true).is_none()); // dir itself doesn't match
// dir itself doesn't match
assert!(m("ROOT/parent_dir/dir_deep_22", true).is_none());
assert!(m("ROOT/parent_dir/dir_deep_22/file", false).is_ignore());
assert!(m("ROOT/parent_dir/dir_deep_22/child_dir", true).is_ignore());
assert!(m("ROOT/parent_dir/dir_deep_22/child_dir/file", false).is_ignore());
// 23
assert!(m("ROOT/parent_dir/dir_deep_23", true).is_none()); // dir itself doesn't match
// dir itself doesn't match
assert!(m("ROOT/parent_dir/dir_deep_23", true).is_none());
assert!(m("ROOT/parent_dir/dir_deep_23/file", false).is_ignore());
assert!(m("ROOT/parent_dir/dir_deep_23/child_dir", true).is_ignore());
assert!(m("ROOT/parent_dir/dir_deep_23/child_dir/file", false).is_ignore());
@@ -284,13 +285,15 @@ fn test_dirs_in_deep() {
assert!(m("ROOT/parent_dir/dir_deep_31/child_dir/file", false).is_ignore());
// 32
assert!(m("ROOT/parent_dir/dir_deep_32", true).is_none()); // dir itself doesn't match
// dir itself doesn't match
assert!(m("ROOT/parent_dir/dir_deep_32", true).is_none());
assert!(m("ROOT/parent_dir/dir_deep_32/file", false).is_ignore());
assert!(m("ROOT/parent_dir/dir_deep_32/child_dir", true).is_ignore());
assert!(m("ROOT/parent_dir/dir_deep_32/child_dir/file", false).is_ignore());
// 33
assert!(m("ROOT/parent_dir/dir_deep_33", true).is_none()); // dir itself doesn't match
// dir itself doesn't match
assert!(m("ROOT/parent_dir/dir_deep_33", true).is_none());
assert!(m("ROOT/parent_dir/dir_deep_33/file", false).is_ignore());
assert!(m("ROOT/parent_dir/dir_deep_33/child_dir", true).is_ignore());
assert!(m("ROOT/parent_dir/dir_deep_33/child_dir/file", false).is_ignore());

24
crates/matcher/Cargo.toml Normal file
View File

@@ -0,0 +1,24 @@
[package]
name = "grep-matcher"
version = "0.1.4" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
A trait for regular expressions, with a focus on line oriented search.
"""
documentation = "https://docs.rs/grep-matcher"
homepage = "https://github.com/BurntSushi/ripgrep/tree/master/crates/matcher"
repository = "https://github.com/BurntSushi/ripgrep/tree/master/crates/matcher"
readme = "README.md"
keywords = ["regex", "pattern", "trait"]
license = "Unlicense/MIT"
autotests = false
[dependencies]
memchr = "2.1"
[dev-dependencies]
regex = "1.1"
[[test]]
name = "integration"
path = "tests/tests.rs"

36
crates/matcher/README.md Normal file
View File

@@ -0,0 +1,36 @@
grep-matcher
------------
This crate provides a low level interface for describing regular expression
matchers. The `grep` crate uses this interface in order to make the regex
engine it uses pluggable.
[![Linux build status](https://api.travis-ci.org/BurntSushi/ripgrep.svg)](https://travis-ci.org/BurntSushi/ripgrep)
[![Windows build status](https://ci.appveyor.com/api/projects/status/github/BurntSushi/ripgrep?svg=true)](https://ci.appveyor.com/project/BurntSushi/ripgrep)
[![](https://img.shields.io/crates/v/grep-matcher.svg)](https://crates.io/crates/grep-matcher)
Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org).
### Documentation
[https://docs.rs/grep-matcher](https://docs.rs/grep-matcher)
**NOTE:** You probably don't want to use this crate directly. Instead, you
should prefer the facade defined in the
[`grep`](https://docs.rs/grep)
crate.
### Usage
Add this to your `Cargo.toml`:
```toml
[dependencies]
grep-matcher = "0.1"
```
and this to your crate root:
```rust
extern crate grep_matcher;
```

View File

@@ -0,0 +1,328 @@
use std::str;
use memchr::memchr;
/// Interpolate capture references in `replacement` and write the interpolation
/// result to `dst`. References in `replacement` take the form of $N or $name,
/// where `N` is a capture group index and `name` is a capture group name. The
/// function provided, `name_to_index`, maps capture group names to indices.
///
/// The `append` function given is responsible for writing the replacement
/// to the `dst` buffer. That is, it is called with the capture group index
/// of a capture group reference and is expected to resolve the index to its
/// corresponding matched text. If no such match exists, then `append` should
/// not write anything to its given buffer.
pub fn interpolate<A, N>(
mut replacement: &[u8],
mut append: A,
mut name_to_index: N,
dst: &mut Vec<u8>,
) where
A: FnMut(usize, &mut Vec<u8>),
N: FnMut(&str) -> Option<usize>,
{
while !replacement.is_empty() {
match memchr(b'$', replacement) {
None => break,
Some(i) => {
dst.extend(&replacement[..i]);
replacement = &replacement[i..];
}
}
if replacement.get(1).map_or(false, |&b| b == b'$') {
dst.push(b'$');
replacement = &replacement[2..];
continue;
}
debug_assert!(!replacement.is_empty());
let cap_ref = match find_cap_ref(replacement) {
Some(cap_ref) => cap_ref,
None => {
dst.push(b'$');
replacement = &replacement[1..];
continue;
}
};
replacement = &replacement[cap_ref.end..];
match cap_ref.cap {
Ref::Number(i) => append(i, dst),
Ref::Named(name) => {
if let Some(i) = name_to_index(name) {
append(i, dst);
}
}
}
}
dst.extend(replacement);
}
/// `CaptureRef` represents a reference to a capture group inside some text.
/// The reference is either a capture group name or a number.
///
/// It is also tagged with the position in the text immediately proceding the
/// capture reference.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
struct CaptureRef<'a> {
cap: Ref<'a>,
end: usize,
}
/// A reference to a capture group in some text.
///
/// e.g., `$2`, `$foo`, `${foo}`.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
enum Ref<'a> {
Named(&'a str),
Number(usize),
}
impl<'a> From<&'a str> for Ref<'a> {
fn from(x: &'a str) -> Ref<'a> {
Ref::Named(x)
}
}
impl From<usize> for Ref<'static> {
fn from(x: usize) -> Ref<'static> {
Ref::Number(x)
}
}
/// Parses a possible reference to a capture group name in the given text,
/// starting at the beginning of `replacement`.
///
/// If no such valid reference could be found, None is returned.
fn find_cap_ref(replacement: &[u8]) -> Option<CaptureRef> {
let mut i = 0;
if replacement.len() <= 1 || replacement[0] != b'$' {
return None;
}
let mut brace = false;
i += 1;
if replacement[i] == b'{' {
brace = true;
i += 1;
}
let mut cap_end = i;
while replacement.get(cap_end).map_or(false, is_valid_cap_letter) {
cap_end += 1;
}
if cap_end == i {
return None;
}
// We just verified that the range 0..cap_end is valid ASCII, so it must
// therefore be valid UTF-8. If we really cared, we could avoid this UTF-8
// check with an unchecked conversion or by parsing the number straight
// from &[u8].
let cap = str::from_utf8(&replacement[i..cap_end])
.expect("valid UTF-8 capture name");
if brace {
if !replacement.get(cap_end).map_or(false, |&b| b == b'}') {
return None;
}
cap_end += 1;
}
Some(CaptureRef {
cap: match cap.parse::<u32>() {
Ok(i) => Ref::Number(i as usize),
Err(_) => Ref::Named(cap),
},
end: cap_end,
})
}
/// Returns true if and only if the given byte is allowed in a capture name.
fn is_valid_cap_letter(b: &u8) -> bool {
match *b {
b'0'..=b'9' | b'a'..=b'z' | b'A'..=b'Z' | b'_' => true,
_ => false,
}
}
#[cfg(test)]
mod tests {
use super::{find_cap_ref, interpolate, CaptureRef};
macro_rules! find {
($name:ident, $text:expr) => {
#[test]
fn $name() {
assert_eq!(None, find_cap_ref($text.as_bytes()));
}
};
($name:ident, $text:expr, $capref:expr) => {
#[test]
fn $name() {
assert_eq!(Some($capref), find_cap_ref($text.as_bytes()));
}
};
}
macro_rules! c {
($name_or_number:expr, $pos:expr) => {
CaptureRef { cap: $name_or_number.into(), end: $pos }
};
}
find!(find_cap_ref1, "$foo", c!("foo", 4));
find!(find_cap_ref2, "${foo}", c!("foo", 6));
find!(find_cap_ref3, "$0", c!(0, 2));
find!(find_cap_ref4, "$5", c!(5, 2));
find!(find_cap_ref5, "$10", c!(10, 3));
find!(find_cap_ref6, "$42a", c!("42a", 4));
find!(find_cap_ref7, "${42}a", c!(42, 5));
find!(find_cap_ref8, "${42");
find!(find_cap_ref9, "${42 ");
find!(find_cap_ref10, " $0 ");
find!(find_cap_ref11, "$");
find!(find_cap_ref12, " ");
find!(find_cap_ref13, "");
// A convenience routine for using interpolate's unwieldy but flexible API.
fn interpolate_string(
mut name_to_index: Vec<(&'static str, usize)>,
caps: Vec<&'static str>,
replacement: &str,
) -> String {
name_to_index.sort_by_key(|x| x.0);
let mut dst = vec![];
interpolate(
replacement.as_bytes(),
|i, dst| {
if let Some(&s) = caps.get(i) {
dst.extend(s.as_bytes());
}
},
|name| -> Option<usize> {
name_to_index
.binary_search_by_key(&name, |x| x.0)
.ok()
.map(|i| name_to_index[i].1)
},
&mut dst,
);
String::from_utf8(dst).unwrap()
}
macro_rules! interp {
($name:ident, $map:expr, $caps:expr, $hay:expr, $expected:expr $(,)*) => {
#[test]
fn $name() {
assert_eq!($expected, interpolate_string($map, $caps, $hay));
}
};
}
interp!(
interp1,
vec![("foo", 2)],
vec!["", "", "xxx"],
"test $foo test",
"test xxx test",
);
interp!(
interp2,
vec![("foo", 2)],
vec!["", "", "xxx"],
"test$footest",
"test",
);
interp!(
interp3,
vec![("foo", 2)],
vec!["", "", "xxx"],
"test${foo}test",
"testxxxtest",
);
interp!(
interp4,
vec![("foo", 2)],
vec!["", "", "xxx"],
"test$2test",
"test",
);
interp!(
interp5,
vec![("foo", 2)],
vec!["", "", "xxx"],
"test${2}test",
"testxxxtest",
);
interp!(
interp6,
vec![("foo", 2)],
vec!["", "", "xxx"],
"test $$foo test",
"test $foo test",
);
interp!(
interp7,
vec![("foo", 2)],
vec!["", "", "xxx"],
"test $foo",
"test xxx",
);
interp!(
interp8,
vec![("foo", 2)],
vec!["", "", "xxx"],
"$foo test",
"xxx test",
);
interp!(
interp9,
vec![("bar", 1), ("foo", 2)],
vec!["", "yyy", "xxx"],
"test $bar$foo",
"test yyyxxx",
);
interp!(
interp10,
vec![("bar", 1), ("foo", 2)],
vec!["", "yyy", "xxx"],
"test $ test",
"test $ test",
);
interp!(
interp11,
vec![("bar", 1), ("foo", 2)],
vec!["", "yyy", "xxx"],
"test ${} test",
"test ${} test",
);
interp!(
interp12,
vec![("bar", 1), ("foo", 2)],
vec!["", "yyy", "xxx"],
"test ${ } test",
"test ${ } test",
);
interp!(
interp13,
vec![("bar", 1), ("foo", 2)],
vec!["", "yyy", "xxx"],
"test ${a b} test",
"test ${a b} test",
);
interp!(
interp14,
vec![("bar", 1), ("foo", 2)],
vec!["", "yyy", "xxx"],
"test ${a} test",
"test test",
);
}

1151
crates/matcher/src/lib.rs Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,230 @@
use grep_matcher::{Captures, Match, Matcher};
use regex::bytes::Regex;
use util::{RegexMatcher, RegexMatcherNoCaps};
fn matcher(pattern: &str) -> RegexMatcher {
RegexMatcher::new(Regex::new(pattern).unwrap())
}
fn matcher_no_caps(pattern: &str) -> RegexMatcherNoCaps {
RegexMatcherNoCaps(Regex::new(pattern).unwrap())
}
fn m(start: usize, end: usize) -> Match {
Match::new(start, end)
}
#[test]
fn find() {
let matcher = matcher(r"(\w+)\s+(\w+)");
assert_eq!(matcher.find(b" homer simpson ").unwrap(), Some(m(1, 14)));
}
#[test]
fn find_iter() {
let matcher = matcher(r"(\w+)\s+(\w+)");
let mut matches = vec![];
matcher
.find_iter(b"aa bb cc dd", |m| {
matches.push(m);
true
})
.unwrap();
assert_eq!(matches, vec![m(0, 5), m(6, 11)]);
// Test that find_iter respects short circuiting.
matches.clear();
matcher
.find_iter(b"aa bb cc dd", |m| {
matches.push(m);
false
})
.unwrap();
assert_eq!(matches, vec![m(0, 5)]);
}
#[test]
fn try_find_iter() {
#[derive(Clone, Debug, Eq, PartialEq)]
struct MyError;
let matcher = matcher(r"(\w+)\s+(\w+)");
let mut matches = vec![];
let err = matcher
.try_find_iter(b"aa bb cc dd", |m| {
if matches.is_empty() {
matches.push(m);
Ok(true)
} else {
Err(MyError)
}
})
.unwrap()
.unwrap_err();
assert_eq!(matches, vec![m(0, 5)]);
assert_eq!(err, MyError);
}
#[test]
fn shortest_match() {
let matcher = matcher(r"a+");
// This tests that the default impl isn't doing anything smart, and simply
// defers to `find`.
assert_eq!(matcher.shortest_match(b"aaa").unwrap(), Some(3));
// The actual underlying regex is smarter.
assert_eq!(matcher.re.shortest_match(b"aaa"), Some(1));
}
#[test]
fn captures() {
let matcher = matcher(r"(?P<a>\w+)\s+(?P<b>\w+)");
assert_eq!(matcher.capture_count(), 3);
assert_eq!(matcher.capture_index("a"), Some(1));
assert_eq!(matcher.capture_index("b"), Some(2));
assert_eq!(matcher.capture_index("nada"), None);
let mut caps = matcher.new_captures().unwrap();
assert!(matcher.captures(b" homer simpson ", &mut caps).unwrap());
assert_eq!(caps.get(0), Some(m(1, 14)));
assert_eq!(caps.get(1), Some(m(1, 6)));
assert_eq!(caps.get(2), Some(m(7, 14)));
}
#[test]
fn captures_iter() {
let matcher = matcher(r"(?P<a>\w+)\s+(?P<b>\w+)");
let mut caps = matcher.new_captures().unwrap();
let mut matches = vec![];
matcher
.captures_iter(b"aa bb cc dd", &mut caps, |caps| {
matches.push(caps.get(0).unwrap());
matches.push(caps.get(1).unwrap());
matches.push(caps.get(2).unwrap());
true
})
.unwrap();
assert_eq!(
matches,
vec![m(0, 5), m(0, 2), m(3, 5), m(6, 11), m(6, 8), m(9, 11),]
);
// Test that captures_iter respects short circuiting.
matches.clear();
matcher
.captures_iter(b"aa bb cc dd", &mut caps, |caps| {
matches.push(caps.get(0).unwrap());
matches.push(caps.get(1).unwrap());
matches.push(caps.get(2).unwrap());
false
})
.unwrap();
assert_eq!(matches, vec![m(0, 5), m(0, 2), m(3, 5),]);
}
#[test]
fn try_captures_iter() {
#[derive(Clone, Debug, Eq, PartialEq)]
struct MyError;
let matcher = matcher(r"(?P<a>\w+)\s+(?P<b>\w+)");
let mut caps = matcher.new_captures().unwrap();
let mut matches = vec![];
let err = matcher
.try_captures_iter(b"aa bb cc dd", &mut caps, |caps| {
if matches.is_empty() {
matches.push(caps.get(0).unwrap());
matches.push(caps.get(1).unwrap());
matches.push(caps.get(2).unwrap());
Ok(true)
} else {
Err(MyError)
}
})
.unwrap()
.unwrap_err();
assert_eq!(matches, vec![m(0, 5), m(0, 2), m(3, 5)]);
assert_eq!(err, MyError);
}
// Test that our default impls for capturing are correct. Namely, when
// capturing isn't supported by the underlying matcher, then all of the
// various capturing related APIs fail fast.
#[test]
fn no_captures() {
let matcher = matcher_no_caps(r"(?P<a>\w+)\s+(?P<b>\w+)");
assert_eq!(matcher.capture_count(), 0);
assert_eq!(matcher.capture_index("a"), None);
assert_eq!(matcher.capture_index("b"), None);
assert_eq!(matcher.capture_index("nada"), None);
let mut caps = matcher.new_captures().unwrap();
assert!(!matcher.captures(b"homer simpson", &mut caps).unwrap());
let mut called = false;
matcher
.captures_iter(b"homer simpson", &mut caps, |_| {
called = true;
true
})
.unwrap();
assert!(!called);
}
#[test]
fn replace() {
let matcher = matcher(r"(\w+)\s+(\w+)");
let mut dst = vec![];
matcher
.replace(b"aa bb cc dd", &mut dst, |_, dst| {
dst.push(b'z');
true
})
.unwrap();
assert_eq!(dst, b"z z");
// Test that replacements respect short circuiting.
dst.clear();
matcher
.replace(b"aa bb cc dd", &mut dst, |_, dst| {
dst.push(b'z');
false
})
.unwrap();
assert_eq!(dst, b"z cc dd");
}
#[test]
fn replace_with_captures() {
let matcher = matcher(r"(\w+)\s+(\w+)");
let haystack = b"aa bb cc dd";
let mut caps = matcher.new_captures().unwrap();
let mut dst = vec![];
matcher
.replace_with_captures(haystack, &mut caps, &mut dst, |caps, dst| {
caps.interpolate(
|name| matcher.capture_index(name),
haystack,
b"$2 $1",
dst,
);
true
})
.unwrap();
assert_eq!(dst, b"bb aa dd cc");
// Test that replacements respect short circuiting.
dst.clear();
matcher
.replace_with_captures(haystack, &mut caps, &mut dst, |caps, dst| {
caps.interpolate(
|name| matcher.capture_index(name),
haystack,
b"$2 $1",
dst,
);
false
})
.unwrap();
assert_eq!(dst, b"bb aa cc dd");
}

View File

@@ -0,0 +1,6 @@
extern crate grep_matcher;
extern crate regex;
mod util;
mod test_matcher;

View File

@@ -0,0 +1,95 @@
use std::collections::HashMap;
use std::result;
use grep_matcher::{Captures, Match, Matcher, NoCaptures, NoError};
use regex::bytes::{CaptureLocations, Regex};
#[derive(Debug)]
pub struct RegexMatcher {
pub re: Regex,
pub names: HashMap<String, usize>,
}
impl RegexMatcher {
pub fn new(re: Regex) -> RegexMatcher {
let mut names = HashMap::new();
for (i, optional_name) in re.capture_names().enumerate() {
if let Some(name) = optional_name {
names.insert(name.to_string(), i);
}
}
RegexMatcher { re: re, names: names }
}
}
type Result<T> = result::Result<T, NoError>;
impl Matcher for RegexMatcher {
type Captures = RegexCaptures;
type Error = NoError;
fn find_at(&self, haystack: &[u8], at: usize) -> Result<Option<Match>> {
Ok(self
.re
.find_at(haystack, at)
.map(|m| Match::new(m.start(), m.end())))
}
fn new_captures(&self) -> Result<RegexCaptures> {
Ok(RegexCaptures(self.re.capture_locations()))
}
fn captures_at(
&self,
haystack: &[u8],
at: usize,
caps: &mut RegexCaptures,
) -> Result<bool> {
Ok(self.re.captures_read_at(&mut caps.0, haystack, at).is_some())
}
fn capture_count(&self) -> usize {
self.re.captures_len()
}
fn capture_index(&self, name: &str) -> Option<usize> {
self.names.get(name).map(|i| *i)
}
// We purposely don't implement any other methods, so that we test the
// default impls. The "real" Regex impl for Matcher provides a few more
// impls. e.g., Its `find_iter` impl is faster than what we can do here,
// since the regex crate avoids synchronization overhead.
}
#[derive(Debug)]
pub struct RegexMatcherNoCaps(pub Regex);
impl Matcher for RegexMatcherNoCaps {
type Captures = NoCaptures;
type Error = NoError;
fn find_at(&self, haystack: &[u8], at: usize) -> Result<Option<Match>> {
Ok(self
.0
.find_at(haystack, at)
.map(|m| Match::new(m.start(), m.end())))
}
fn new_captures(&self) -> Result<NoCaptures> {
Ok(NoCaptures::new())
}
}
#[derive(Clone, Debug)]
pub struct RegexCaptures(CaptureLocations);
impl Captures for RegexCaptures {
fn len(&self) -> usize {
self.0.len()
}
fn get(&self, i: usize) -> Option<Match> {
self.0.pos(i).map(|(s, e)| Match::new(s, e))
}
}

17
crates/pcre2/Cargo.toml Normal file
View File

@@ -0,0 +1,17 @@
[package]
name = "grep-pcre2"
version = "0.1.4" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
Use PCRE2 with the 'grep' crate.
"""
documentation = "https://docs.rs/grep-pcre2"
homepage = "https://github.com/BurntSushi/ripgrep/tree/master/crates/pcre2"
repository = "https://github.com/BurntSushi/ripgrep/tree/master/crates/pcre2"
readme = "README.md"
keywords = ["regex", "grep", "pcre", "backreference", "look"]
license = "Unlicense/MIT"
[dependencies]
grep-matcher = { version = "0.1.2", path = "../matcher" }
pcre2 = "0.2.0"

21
crates/pcre2/LICENSE-MIT Normal file
View File

@@ -0,0 +1,21 @@
The MIT License (MIT)
Copyright (c) 2015 Andrew Gallant
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

39
crates/pcre2/README.md Normal file
View File

@@ -0,0 +1,39 @@
grep-pcre2
----------
The `grep-pcre2` crate provides an implementation of the `Matcher` trait from
the `grep-matcher` crate. This implementation permits PCRE2 to be used in the
`grep` crate for fast line oriented searching.
[![Linux build status](https://api.travis-ci.org/BurntSushi/ripgrep.svg)](https://travis-ci.org/BurntSushi/ripgrep)
[![Windows build status](https://ci.appveyor.com/api/projects/status/github/BurntSushi/ripgrep?svg=true)](https://ci.appveyor.com/project/BurntSushi/ripgrep)
[![](https://img.shields.io/crates/v/grep-pcre2.svg)](https://crates.io/crates/grep-pcre2)
Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org).
### Documentation
[https://docs.rs/grep-pcre2](https://docs.rs/grep-pcre2)
**NOTE:** You probably don't want to use this crate directly. Instead, you
should prefer the facade defined in the
[`grep`](https://docs.rs/grep)
crate.
If you're looking to just use PCRE2 from Rust, then you probably want the
[`pcre2`](https://docs.rs/pcre2)
crate, which provide high level safe bindings to PCRE2.
### Usage
Add this to your `Cargo.toml`:
```toml
[dependencies]
grep-pcre2 = "0.1"
```
and this to your crate root:
```rust
extern crate grep_pcre2;
```

24
crates/pcre2/UNLICENSE Normal file
View File

@@ -0,0 +1,24 @@
This is free and unencumbered software released into the public domain.
Anyone is free to copy, modify, publish, use, compile, sell, or
distribute this software, either in source code form or as a compiled
binary, for any purpose, commercial or non-commercial, and by any
means.
In jurisdictions that recognize copyright laws, the author or authors
of this software dedicate any and all copyright interest in the
software to the public domain. We make this dedication for the benefit
of the public at large and to the detriment of our heirs and
successors. We intend this dedication to be an overt act of
relinquishment in perpetuity of all present and future rights to this
software under copyright law.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR
OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
OTHER DEALINGS IN THE SOFTWARE.
For more information, please refer to <http://unlicense.org/>

Some files were not shown because too many files have changed in this diff Show More