Compare commits

...

203 Commits
0.2.1 ... 0.3.0

Author SHA1 Message Date
Andrew Gallant
7f3e7d2faa bsd doesn't have --recursive 2016-11-20 17:57:10 -05:00
Andrew Gallant
8d5906d7fc another attempt to fix deploy 2016-11-20 16:57:20 -05:00
Andrew Gallant
feda38852e fix deploy 2016-11-20 16:40:42 -05:00
Andrew Gallant
59187902d0 Fix completion script deployment bundle. 2016-11-20 16:26:23 -05:00
Andrew Gallant
aef46beaf2 0.3.0 2016-11-20 16:07:25 -05:00
Andrew Gallant
f0e192943f changelog 0.3.0 2016-11-20 16:06:49 -05:00
Andrew Gallant
df72d8d1e0 Make wincolor crate compilable on non-Windows platforms. 2016-11-20 15:44:19 -05:00
Andrew Gallant
d06f84ced3 Get rid of special mmap decision on Windows.
I spent some quality time on my Windows 10 laptop and it appears to
suffer from a similar trade-off as on Linux: mmaps are bad for large
directory traversals but good for single large files.

Darwin continues to reject memory maps in all cases (unless explicitly
requested), but more testing should be done there.
2016-11-20 15:32:50 -05:00
Andrew Gallant
9598331fa8 Propagate no_messages option to worker.
Fixes #241
2016-11-20 15:01:37 -05:00
Andrew Gallant
883d8fc72f Merge pull request #240 from BurntSushi/color
Completely re-work colored output and tty handling.
2016-11-20 15:01:11 -05:00
Andrew Gallant
e8a30cb893 Completely re-work colored output and tty handling.
This commit completely guts all of the color handling code and replaces
most of it with two new crates: wincolor and termcolor. wincolor
provides a simple API to coloring using the Windows console and
termcolor provides a platform independent coloring API tuned for
multithreaded command line programs. This required a lot more
flexibility than what the `term` crate provided, so it was dropped.
We instead switch to writing ANSI escape sequences directly and ignore
the TERMINFO database.

In addition to fixing several bugs, this commit also permits end users
to customize colors to a certain extent. For example, this command will
set the match color to magenta and the line number background to yellow:

    rg --colors 'match:fg:magenta' --colors 'line:bg:yellow' foo

For tty handling, we've adopted a hack from `git` to do tty detection in
MSYS/mintty terminals. As a result, ripgrep should get both color
detection and piping correct on Windows regardless of which terminal you
use.

Finally, switch to line buffering. Performance doesn't seem to be
impacted and it's an otherwise more user friendly option.

Fixes #37, Fixes #51, Fixes #94, Fixes #117, Fixes #182, Fixes #231
2016-11-20 11:14:52 -05:00
Andrew Gallant
03f7605322 Rename --files-without-matches to --files-without-match.
This is to be consistent with grep.
2016-11-19 20:15:41 -05:00
Andrew Gallant
61663e2307 Merge pull request #239 from mernen/files-without-matches
Add --files-without-matches flag.
2016-11-19 20:05:00 -05:00
Daniel Luz
bd3e7eedb1 Add --files-without-matches flag.
Performs the opposite of --files-with-matches: only shows paths of
files that contain zero matches.

Closes #138
2016-11-19 21:48:59 -02:00
Andrew Gallant
1e6c2ac8e3 Merge pull request #238 from jacwah/refactor
`Ignore` refactorings
2016-11-18 23:11:17 -05:00
Andrew Gallant
0302d58eb8 Fix stdin bug with --file.
When `rg -f-` is used, the default search path should be `./` and not
`-`.
2016-11-17 20:48:11 -05:00
Andrew Gallant
e37f783fc0 Fix issue number mixup.
Thanks @bluss!
2016-11-17 20:30:18 -05:00
Andrew Gallant
495e13cc61 Merge pull request #233 from BurntSushi/clap
Switch from Docopt to Clap.
2016-11-17 20:22:16 -05:00
Andrew Gallant
92dc402f7f Switch from Docopt to Clap.
There were two important reasons for the switch:

1. Performance. Docopt does poorly when the argv becomes large, which is
   a reasonable common use case for search tools. (e.g., use with xargs)
2. Better failure modes. Clap knows a lot more about how a particular
   argv might be invalid, and can therefore provide much clearer error
   messages.

While both were important, (1) made it urgent.

Note that since Clap requires at least Rust 1.11, this will in turn
increase the minimum Rust version supported by ripgrep from Rust 1.9 to
Rust 1.11. It is therefore a breaking change, so the soonest release of
ripgrep with Clap will have to be 0.3.

There is also at least one subtle breaking change in real usage.
Previous to this commit, this used to work:

    rg -e -foo

Where this would cause ripgrep to search for the string `-foo`. Clap
currently has problems supporting this use case
(see: https://github.com/kbknapp/clap-rs/issues/742),
but it can be worked around by using this instead:

    rg -e [-]foo

or even

    rg [-]foo

and this still works:

    rg -- -foo

This commit also adds Bash, Fish and PowerShell completion files to the
release, fixes a bug that prevented ripgrep from working on file
paths containing invalid UTF-8 and shows short descriptions in the
output of `-h` but longer descriptions in the output of `--help`.

Fixes #136, Fixes #189, Fixes #210, Fixes #230
2016-11-17 19:53:41 -05:00
Andrew Gallant
a3f5e0c3d5 Use env::home_dir() instead of env::var_os(HOME).
Thanks @steveklabnik!
2016-11-17 16:54:39 -05:00
Andrew Gallant
39e1a0d694 Merge pull request #227 from emk/issue-7
Allow specifying patterns with `-f FILE` and `-f-`
2016-11-15 18:15:05 -05:00
Eric Kidd
e9cd0a1cc3 Allow specifying patterns with -f FILE and -f-
This is a somewhat basic implementation of `-f-` (#7), with unit tests.
Changes include:

1. The internals of the `pattern` function have been refactored to avoid
   code duplication, but there's a lot more we could do.  Right now we
   read the entire pattern list into a `Vec`.
2. There's now a `WorkDir::pipe` command that allows sending standard
   input to `rg` when testing.

Not implemented: aho-corasick.
2016-11-15 13:00:16 -05:00
Andrew Gallant
cc35ae0748 Merge pull request #235 from jinyeow/master
Added elixir to types
2016-11-15 06:36:19 -05:00
Justin Puah
5ee175beaf Added elixir to types 2016-11-15 20:57:38 +11:00
Andrew Gallant
4b18f82899 Disable symlink tests on Windows.
For some reason, these work on AppVeyor but not in other build systems.
Let's just disable them.

See: https://github.com/rust-lang/rust/pull/37149
2016-11-11 06:44:23 -05:00
Andrew Gallant
5462af4434 Pin rustc-serialize to 0.3.19.
See: https://github.com/rust-lang-nursery/rustc-serialize/pull/159
2016-11-09 20:28:58 -05:00
Andrew Gallant
d2e70da040 0.2.9 2016-11-09 19:07:25 -05:00
Andrew Gallant
64dc9b6709 update deps 2016-11-09 18:54:22 -05:00
Andrew Gallant
9ffd4c421f ignore-0.1.5 2016-11-09 18:52:52 -05:00
Andrew Gallant
d862b80afb changelog 0.2.9 2016-11-09 18:52:08 -05:00
Andrew Gallant
5b73dcc8ab Rework parallelism in directory iterator.
Previously, ignore::WalkParallel would invoke the callback for all
*explicitly* given file paths in a single thread, which effectively
meant that `rg pattern foo bar baz ...` didn't actually search foo, bar
and baz in parallel.

The code was structured that way to avoid spinning up workers if no
directory paths were given. The original intention was probably to have
a separate pool of threads responsible for searching, but ripgrep ended
up just reusing the ignore::WalkParallel workers themselves for searching,
and thereby subjected to its sub-par performance in this case.

The code has been restructured so that file paths are sent to the workers,
which brings back parallelism.

Fixes #226
2016-11-09 17:19:40 -05:00
Andrew Gallant
2dce0dc0df Fix a bug with handling --ignore-file.
Namely, passing a directory to --ignore-file caused ripgrep to allocate
memory without bound.

The issue was that I got a bit overzealous with partial error
reporting. Namely, when processing a gitignore file, we should try
to use every pattern even if some patterns are invalid globs (e.g.,
a**b). In the process, I applied the same logic to I/O errors. In this
case, it manifest by attempting to read lines from a directory, which
appears to yield Results forever, where each Result is an error of the
form "you can't read from a directory silly." Since I treated it as a
partial error, ripgrep was just spinning and accruing each error in
memory, which caused the OOM killer to kick in.

Fixes #228
2016-11-09 16:45:23 -05:00
Andrew Gallant
2e5c3c05e8 reword 2016-11-06 19:48:49 -05:00
Andrew Gallant
6884eea2f5 reword 2016-11-06 19:48:17 -05:00
Andrew Gallant
a3a2f0be6a ucg author says it's not a bug per se 2016-11-06 19:45:18 -05:00
Andrew Gallant
f24873c70b Don't ever search directories. 2016-11-06 19:02:14 -05:00
Andrew Gallant
58126ffe15 touchups 2016-11-06 18:51:00 -05:00
Andrew Gallant
17644a76c0 typo 2016-11-06 18:49:07 -05:00
Andrew Gallant
9fc9f368f5 Always search paths given by user.
This permits doing `rg -a test /dev/sda1` for example, where as before
/dev/sda1 was skipped because it wasn't a regular file.
2016-11-06 18:23:50 -05:00
Andrew Gallant
9cab076a72 touchups 2016-11-06 18:04:55 -05:00
Andrew Gallant
7aa9652f3c touchups 2016-11-06 18:02:45 -05:00
Andrew Gallant
7187f61ca8 touchups 2016-11-06 18:01:55 -05:00
Andrew Gallant
f869c58a5a touchups 2016-11-06 17:59:57 -05:00
Andrew Gallant
3538ba3577 Update README with more/updated benchmarks 2016-11-06 17:55:38 -05:00
Andrew Gallant
a454fa75b9 Update brew tap. 2016-11-06 16:51:14 -05:00
Andrew Gallant
18943b9317 0.2.8 2016-11-06 16:16:48 -05:00
Andrew Gallant
68427b5b79 changelog 0.2.8 2016-11-06 16:16:19 -05:00
Andrew Gallant
4ca15a8a51 simd-accel should not invoke avx-accel.
This was a silly transcription error.
2016-11-06 16:15:23 -05:00
Andrew Gallant
2daef51fe5 0.2.7 2016-11-06 15:49:25 -05:00
Andrew Gallant
43ed91dc5c changelog 0.2.7 2016-11-06 15:48:52 -05:00
Andrew Gallant
dada75d2a7 Update sub-crate dependency versions. 2016-11-06 15:48:40 -05:00
Andrew Gallant
76b9f01ad2 ignore-0.1.4 2016-11-06 15:35:21 -05:00
Andrew Gallant
8baa0e56b7 grep-0.1.4 2016-11-06 15:35:17 -05:00
Andrew Gallant
301ee6d3f5 globset-0.1.2 2016-11-06 15:35:05 -05:00
Andrew Gallant
77ad7588ae Add --no-messages flag.
This flag is similar to what's found in grep: it will suppress all error
messages, such as those shown when a particular file couldn't be read.

Closes #149
2016-11-06 14:36:08 -05:00
Andrew Gallant
58aca2efb2 Add -m/--max-count flag.
This flag limits the number of matches printed *per file*.

Closes #159
2016-11-06 13:09:53 -05:00
Andrew Gallant
351eddc17e Add new 'h' file type.
This is intended to correspond to C/C++ header files.

Fixes #186
2016-11-06 12:23:42 -05:00
Andrew Gallant
277dda544c Include the name "ripgrep" in more places.
Fixes #203
2016-11-06 12:21:36 -05:00
Andrew Gallant
8c869cbd87 update man page 2016-11-06 12:10:55 -05:00
Andrew Gallant
598b162fea Note -e/--regexp's additional usefulness.
Specifically, it can be used when searching for patterns that start
with a dash.

Fixes #215
2016-11-06 12:10:27 -05:00
Andrew Gallant
0222e024fe Fixes a bug with --smart-case.
This was a subtle bug, but the big picture was that the smart case
information wasn't being carried through to the literal extraction in
some cases. When this happened, it was possible to get back an incomplete
set of literals, which would therefore miss some valid matches.

The fix to this is to actually parse the regex and determine whether
smart case applies before doing anything else. It's a little extra work,
but parsing is pretty fast.

Fixes #199
2016-11-06 12:07:47 -05:00
Andrew Gallant
5bd0edbbe1 Actually use simd/avx optimizations in bytecount crate.
Also update compile script.
2016-11-05 22:44:33 -04:00
Andrew Gallant
4368913d8f Merge branch 'fast_linecount' 2016-11-05 22:29:42 -04:00
Andre Bogus
02de97b8ce Use the bytecount crate for fast line counting.
Fixes #128
2016-11-05 22:29:26 -04:00
Andrew Gallant
32db773d51 Merge pull request #223 from BurntSushi/ignore-parallel
Add parallel recursive directory iterator.
2016-11-05 22:19:47 -04:00
Andrew Gallant
b272be25fa Add parallel recursive directory iterator.
This adds a new walk type in the `ignore` crate, `WalkParallel`, which
provides a way for recursively iterating over a set of paths in parallel
while respecting various ignore rules.

The API is a bit strange, as a closure producing a closure isn't
something one often sees, but it does seem to work well.

This also allowed us to simplify much of the worker logic in ripgrep
proper, where MultiWorker is now gone.
2016-11-05 21:45:55 -04:00
Jacob Wahlgren
f63c168563 Rename IgnoreOptions::has_ignores
The name has_ignores is not descriptive in my opinion. I think
has_any_ignore_options more clearly states this method's purpose. I also
considered simply IgnoreOptions::any though I went with the more verbose
option.
2016-11-06 01:17:41 +01:00
Jacob Wahlgren
a05671c8d7 Use new Match::or to simplify return 2016-11-06 01:07:11 +01:00
Andrew Gallant
1aeae3e22d update ripgrep 2016-11-04 21:12:08 -04:00
Andrew Gallant
60d537c43d Merge pull request #220 from theamazingfedex/adding-mak-type
adding .mak extension for makefile filetype.
2016-11-03 20:46:29 -04:00
Andrew Gallant
ef5c07476b Merge pull request #219 from theamazingfedex/adding-pdf-filetype
adding .pdf filetype to available types.
2016-11-03 20:46:07 -04:00
Andrew Gallant
4f6f34307c Merge pull request #218 from theamazingfedex/adding-cs-filetype
adding cs filetype for ease of use.
2016-11-03 20:46:02 -04:00
Daniel Wood
7cf560d27c adding .mak extension for makefile filetype.
Fixes #217
2016-11-03 15:52:10 -06:00
Daniel Wood
15b263ff55 adding cs filetype for ease of use. 2016-11-03 15:33:00 -06:00
Daniel Wood
53121e0733 adding .pdf filetype to available types. 2016-11-03 15:30:58 -06:00
Andrew Gallant
404785f950 Merge pull request #214 from lyuha/master
Add textile, org, creole, rdoc, wiki filetype
2016-11-02 11:53:22 -04:00
Lyuha
103c4c953c Add pod filetype 2016-11-03 00:06:14 +09:00
Lyuha
82abf883c5 Add wiki filetype 2016-11-03 00:06:14 +09:00
Lyuha
a2315d5ee5 Add creole filetype 2016-11-03 00:06:14 +09:00
Lyuha
201d0cb8c1 Add org filetype 2016-11-03 00:06:14 +09:00
Lyuha
6f45478a7d Add rdoc filetype 2016-11-03 00:06:14 +09:00
Lyuha
9c2c569624 Add textile filetype 2016-11-03 00:06:14 +09:00
Andrew Gallant
a1e4e0f85c Merge pull request #213 from tjdgus3537/master
add asciidoc filetype and update markdown filetype
2016-11-02 10:44:31 -04:00
tjdgus3537
caf31a769b Add .markdown, .mdown, .mkdn extension to md and markdown 2016-11-02 22:50:59 +09:00
tjdgus3537
920112e640 Add .adoc, .asc, asciidoc extension to asciidoc 2016-11-02 22:49:31 +09:00
Andrew Gallant
a84ffe603b Merge pull request #212 from radhermit/gentoo
Add Gentoo info to the README
2016-11-02 07:05:47 -04:00
Tim Harder
e4f83f3161 Add Gentoo info to the README 2016-11-01 22:03:00 -04:00
Andrew Gallant
fbca4a0332 Merge pull request #209 from dueyfinster/patch-1
Added taskpaper as a file type
2016-11-01 10:29:15 -04:00
Neil Grogan
65c7df1c25 Added taskpaper as a file type 2016-11-01 13:59:59 +00:00
Alexander Altman
18237da9b2 Add Agda and improve TeX ignore support (#207)
Add Agda and improve TeX ignore support
2016-11-01 06:56:53 -04:00
Andrew Gallant
f147f3aa39 0.2.6 2016-10-31 20:01:37 -04:00
Andrew Gallant
599c4fc3f3 changelog 0.2.6 2016-10-31 20:01:31 -04:00
Andrew Gallant
d85a6dd5c8 update ignore dependency 2016-10-31 20:01:31 -04:00
Andrew Gallant
40abade8ee ignore-0.1.3 2016-10-31 19:54:47 -04:00
Andrew Gallant
fca4fdf6ea ignore-0.1.2 2016-10-31 19:54:38 -04:00
Andrew Gallant
16975797fe Fixes a matching bug in the glob override matcher.
This was probably a transcription error when moving the ignore matcher
code out of ripgrep core. Specifically, the override glob matcher should
not ignore directories if they don't match.

Fixes #206
2016-10-31 19:54:38 -04:00
Andrew Gallant
6507a48f97 Ignore ignore/Cargo.lock 2016-10-31 19:54:38 -04:00
Andrew Gallant
c8e2fa1869 update Cargo.lock 2016-10-31 19:54:38 -04:00
Andrew Gallant
f728708ce9 Merge pull request #204 from lyuha/master
Add .fish extension to fish shell
2016-10-30 17:31:51 -04:00
Lyuha
c302995d05 Add .fish extension to fish shell 2016-10-31 00:52:45 +09:00
Andrew Gallant
4a77cc8100 update brew tap to 0.2.5 2016-10-29 23:23:45 -04:00
Andrew Gallant
dc86666044 changelog 0.2.5 2016-10-29 22:44:07 -04:00
Andrew Gallant
6b038511c7 0.2.5 2016-10-29 22:42:28 -04:00
Andrew Gallant
c767bccade fix appveyor, sigh 2016-10-29 22:42:22 -04:00
Andrew Gallant
a075a462fa 0.2.4 2016-10-29 22:40:02 -04:00
Andrew Gallant
24f753c306 changelog 0.2.4 2016-10-29 22:27:39 -04:00
Andrew Gallant
1aae2759ad update deps 2016-10-29 22:27:29 -04:00
Andrew Gallant
91646f6cca bump ignore to 0.1.1 2016-10-29 22:19:00 -04:00
Andrew Gallant
031ace209d ignore-0.1.1 2016-10-29 22:17:39 -04:00
Andrew Gallant
942e9c4743 don't commit ignore/Cargo.lock 2016-10-29 22:17:27 -04:00
Andrew Gallant
12c9656b18 bump thread_local 2016-10-29 22:15:24 -04:00
Andrew Gallant
8bf3760cdb ignore-0.1.0 2016-10-29 22:13:41 -04:00
Andrew Gallant
c96623e66a bump globset dep in ignore 2016-10-29 22:13:32 -04:00
Andrew Gallant
36f949633b globset-0.1.1 2016-10-29 22:12:44 -04:00
Andrew Gallant
811fcc1fe8 Merge branch 'ctrlc-reset-terminal' 2016-10-29 21:54:18 -04:00
Brian Campbell
79a8d0ab3f Reset the terminal when Ctrl-C is pressed
If a user hits Ctrl-C to exit out of a search in the middle of printing
a line, we don't want to leave the terminal colors screwed up for them.
Catch Ctrl-C using the ctrlc crate, obtain a stdout lock to ensure that
other threads don't continue writing after we do so, reset the terminal,
and exit the program.

Closes #119
2016-10-29 21:23:05 -04:00
Andrew Gallant
fbf8265cde Merge pull request #202 from BurntSushi/ignore
Move all gitignore matching to separate crate.
2016-10-29 21:19:43 -04:00
Andrew Gallant
d79add341b Move all gitignore matching to separate crate.
This PR introduces a new sub-crate, `ignore`, which primarily provides a
fast recursive directory iterator that respects ignore files like
gitignore and other configurable filtering rules based on globs or even
file types.

This results in a substantial source of complexity moved out of ripgrep's
core and into a reusable component that others can now (hopefully)
benefit from.

While much of the ignore code carried over from ripgrep's core, a
substantial portion of it was rewritten with the following goals in
mind:

1. Reuse matchers built from gitignore files across directory iteration.
2. Design the matcher data structure to be amenable for parallelizing
   directory iteration. (Indeed, writing the parallel iterator is the
   next step.)

Fixes #9, #44, #45
2016-10-29 20:48:59 -04:00
Andrew Gallant
12b2b1f624 Merge pull request #198 from sanga/patch-1
update python types to include Cython files
2016-10-28 10:22:22 -04:00
Tim Sampson
3aaf550ca5 update python types to include Cython files 2016-10-28 09:24:21 +03:00
Andrew Gallant
d4876cd064 Merge pull request #197 from SimenB/zsh-type
Add zsh type
2016-10-27 11:56:32 -04:00
simbekkh
867a57e176 Add zsh type 2016-10-27 17:36:26 +02:00
Andrew Gallant
ec4904df33 Merge pull request #195 from 8573/8573/readme/note-Nix-pkgs/1
Mention Nix package in README
2016-10-26 18:52:32 -04:00
c74d
c4ea157cb7 Mention Nix package in README
In the `README.md` document, where said document documents the
availability of pre-built packages of ripgrep, document the
availability of such a package from the package management system Nix.
2016-10-26 03:01:18 +00:00
Andrew Gallant
0156967f4c Merge pull request #192 from SimenB/patch-1
Use svg for travis badge
2016-10-22 18:00:23 -04:00
Simen Bekkhus
3238707b0b Use svg for travis badge 2016-10-22 23:44:38 +02:00
Andrew Gallant
31fbae597f Merge pull request #183 from carlwgeorge/copr-instructions
Add instructions for installation on Fedora 24+ and RHEL/CentOS 7.
2016-10-16 10:17:17 -04:00
Andrew Gallant
f2e1711781 Fix bug when processing parent gitignore files.
This particular bug was triggered whenever a search was run in a directory
with a parent directory that contains a relevant .gitignore file. In
particular, before matching against a parent directory's gitignore rules,
a path's leading `./` was not stripped, which results in errant matching.

We now make sure `./` is stripped.

Fixes #184.
2016-10-16 10:15:11 -04:00
Andrew Gallant
94d600e6e1 Update deps. 2016-10-16 10:12:49 -04:00
Carl George
b904c5d9dc Add instructions for installation on Fedora 24+ and RHEL/CentOS 7.
Please note that the referenced copr repository should just be a temporary home while the Fedora/EPEL package review is pending.

https://bugzilla.redhat.com/show_bug.cgi?id=1380442
2016-10-16 04:13:54 -05:00
Andrew Gallant
5a29417796 update archlinux PKGBUILD 2016-10-13 21:01:49 -04:00
Andrew Gallant
f694800768 Update brew tap. 2016-10-13 21:00:18 -04:00
Andrew Gallant
bd1c9e9499 Merge pull request #175 from little-dude/master
add .tcl file extension
2016-10-13 16:00:40 -04:00
Corentin Henry
f04b0dd95c add .tcl file extension 2016-10-13 11:28:34 -07:00
Andrew Gallant
5487dffefa Merge pull request #172 from alexlafroscia/patch-1
Add `handlebars` type
2016-10-13 06:38:27 -04:00
Alex LaFroscia
1fc6787648 Shorten name of Handlebars type 2016-10-12 17:08:24 -07:00
Alex LaFroscia
11e164aec9 Add handlebars type 2016-10-12 17:07:04 -07:00
Andrew Gallant
1c1331d926 Merge pull request #170 from sfackler/patch-1
Add .gradle as a groovy extension pattern
2016-10-12 15:59:57 -04:00
Steven Fackler
cbe94823d2 Add .gradle as a groovy extension pattern
Gradle is a build system for JVM-based languages that is configured via a Groovy DSL.
2016-10-12 11:07:27 -07:00
Andrew Gallant
d8712daf27 0.2.3 2016-10-11 19:44:54 -04:00
Andrew Gallant
7cbbef019f changelog 0.2.3 2016-10-11 19:44:50 -04:00
Andrew Gallant
4d29d886e5 Merge pull request #168 from durka/doc-threads
clarify docs for --threads
2016-10-11 19:27:11 -04:00
Andrew Gallant
247a9398f4 Switch to thread_local crate in lieu of thread_local!.
This is to work around a bug where using a thread_local! was causing
a segfault on macos.

Fixes #164.
2016-10-11 18:23:49 -04:00
Alex Burka
4c3025ab1c clarify docs for --threads 2016-10-11 17:32:51 -04:00
Andrew Gallant
4981991a6e 0.2.2 2016-10-10 22:24:36 -04:00
Andrew Gallant
51440f59cd Don't include HomebrewFormula in crate. 2016-10-10 22:24:28 -04:00
Andrew Gallant
7b8a8d77d0 changelog 0.2.2 2016-10-10 22:18:21 -04:00
Andrew Gallant
4737326ed3 Update regex-syntax for bug fix.
The bug fix was in expression pretty printing. ripgrep parses the regex
into an AST and may do some modifications to it, which requires the
ability to go from string -> AST -> string' -> AST' where string == string'
implies AST == AST'.

Also, add a regression test for the specific regex that tripped the bug.

Fixes #156.
2016-10-10 22:04:29 -04:00
Andrew Gallant
a3537aa32a Update darwin cfg attributes. 2016-10-10 21:48:47 -04:00
Andrew Gallant
d3e118a786 Fix debug expression statement. 2016-10-10 21:48:34 -04:00
Andrew Gallant
4e52059ad6 Disable regression_131 test on darwin.
It's not clear why it's failing. Maybe it doesn't permit certain
characters in file paths?
2016-10-10 21:03:11 -04:00
Andrew Gallant
60c016c243 Fix docopt usage string. Gah. 2016-10-10 20:49:39 -04:00
Andrew Gallant
4665128f25 Clarify documentation for --replace.
Also add a minor clarification for --type-add.

Fixes #147
2016-10-10 20:19:45 -04:00
Andrew Gallant
dde5bd5a80 globset-0.1.0 2016-10-10 20:07:13 -04:00
Andrew Gallant
762ad44f71 add version marker 2016-10-10 20:06:35 -04:00
Andrew Gallant
705386934d Fill in globset/Cargo.toml with more details. 2016-10-10 19:50:21 -04:00
Andrew Gallant
97bbc6ef11 Update appveyor to test subcrates. 2016-10-10 19:35:47 -04:00
Andrew Gallant
27a980c1bc Fix symlink test.
We attempt to run it on Windows, but I'm getting "access denied" errors
when trying to create a file symlink. So we disable the test on Windows.
2016-10-10 19:34:57 -04:00
Andrew Gallant
e8645dc8ae style nits 2016-10-10 19:27:12 -04:00
Andrew Gallant
e96d93034a Finish overhaul of glob matching.
This commit completes the initial move of glob matching to an external
crate, including fixing up cross platform support, polishing the
external crate for others to use and fixing a number of bugs in the
process.

Fixes #87, #127, #131
2016-10-10 19:24:18 -04:00
Andrew Gallant
bc5accc035 Merge pull request #161 from moshen/update-homebrew-readme
Update Homebrew instructions in the README
2016-10-10 19:11:50 -04:00
Andrew Gallant
c9d0ca8257 Merge pull request #157 from CannedYerins/follow-explicit-args
Always follow symlinks on explicit file arguments
2016-10-10 19:11:10 -04:00
Andrew Gallant
45fe4aab96 Merge pull request #155 from theamazingfedex/adding-extra-md-filetype
Adding extra .md filetype for ease of access to Markdown filetypes
2016-10-10 19:09:15 -04:00
Andrew Gallant
97f981fbcb Merge pull request #154 from theamazingfedex/adding-spark-filetype
Adding .spark filetype
2016-10-10 19:08:20 -04:00
Andrew Gallant
59329dcc61 Merge pull request #153 from theamazingfedex/master
Adding .config filetype
2016-10-10 19:08:06 -04:00
Colin Kennedy
604da8eb86 Update Homebrew instructions in the README 2016-10-09 23:45:02 -05:00
Ian Kerins
1c964372ad Always follow symlinks on explicit file arguments. 2016-10-08 22:40:03 -04:00
Daniel Wood
50a961960e added extra .md filetype for ease of access 2016-10-07 14:37:29 -06:00
Daniel Wood
7481c5fe29 added .spark filetype 2016-10-07 14:35:20 -06:00
Daniel Wood
3ae37b0937 added .config filetype 2016-10-07 13:50:26 -06:00
Andrew Gallant
4ee6dbe422 Merge pull request #148 from munyari/patch-1
Change Arch Linux instructions
2016-10-05 09:53:17 -04:00
Panashe Fundira
cd4bdcf810 Change Arch Linux instructions
The `-Syu` flag will do a full system upgrade and then install the package, which is not necessarily the desired behavior. Only the `-S` flag is necessary to install a single package.
See https://wiki.archlinux.org/index.php/Pacman#Installing_specific_packages
https://wiki.archlinux.org/index.php/Pacman#Upgrading_packages
2016-10-05 08:26:19 -04:00
Andrew Gallant
175406df01 Refactor and test glob sets.
This commit goes a long way toward refactoring glob sets so that the
code is easier to maintain going forward. In particular, it makes the
literal optimizations that glob sets used a lot more structured and much
easier to extend. Tests have also been modified to include glob sets.

There's still a bit of polish work left to do before a release.

This also fixes the immediate issue where large gitignore files were
causing ripgrep to slow way down. While we don't technically fix it for
good, we're a lot better about reducing the number of regexes we
compile. In particular, if a gitignore file contains thousands of
patterns that can't be matched more simply using literals, then ripgrep
will slow down again. We could fix this for good by avoiding RegexSet if
the number of regexes grows too large.

Fixes #134.
2016-10-04 20:28:56 -04:00
Andrew Gallant
89811d43d4 Merge pull request #146 from samuelcolvin/add-jinja-type
add jinja type for *.jinja and *.jinja2
2016-10-04 08:25:22 -04:00
Samuel Colvin
f0053682c0 add jinja type for *.jinja and *.jinja2 2016-10-04 13:15:31 +01:00
Andrew Gallant
35045d6105 Merge pull request #143 from moshen/change-brew-formula-name
Fix brew formula name to not conflict with core
2016-10-04 06:49:16 -04:00
Colin Kennedy
95f552fc06 Fix brew formula name to not conflict with core
Since the homebrew-core formula was accepted, we should differentiate
the prebuilt formula available in this tap
2016-10-03 22:30:26 -05:00
Andrew Gallant
48353bea17 Merge pull request #144 from bitshifter/dotcmake
Added *.cmake extension to cmake file type.
2016-10-03 20:51:05 -04:00
Cameron Hart
703d5b558e Added *.cmake extension to cmake file type. 2016-10-04 11:28:49 +11:00
Andrew Gallant
47efea234f Remove i686-darwin.
Apparently 32 bit Mac CPUs are really old at this point. Also, it has
been causing CI to fail lately. It's not worth it.
2016-10-03 17:16:28 -04:00
Andrew Gallant
ca0d8998a2 Merge pull request #139 from moshen/make-a-tap
Make the repo a Homebrew Tap
2016-10-03 17:14:09 -04:00
Andrew Gallant
fdf24317ac Move glob implementation to new crate.
It is isolated and complex enough that it deserves attention all on its
own. It's also eminently reusable.
2016-09-30 19:42:41 -04:00
Andrew Gallant
b9d5f22a4d Stopgap measure for projects with huge gitignore files.
This helps #134 by avoiding a slow regex execution path, but doesn't
actually fix the problem. Namely, we've gone from "so slow I'm not going
to keep waiting for rg to finish" to "wow that was slow but at least it
finished before I lost my patience."
2016-09-30 19:29:52 -04:00
Colin Kennedy
67bb4f040f Make the repo a Homebrew Tap 2016-09-30 12:51:37 -05:00
Andrew Gallant
cee2f09a6d Merge pull request #121 from lilydjwg/master
if --color always, always print with color, even when --vimgrep is given
2016-09-29 16:49:44 -04:00
Andrew Gallant
ced777e91f Merge pull request #133 from akien-mga/pr-appveyor
AppVeyor: Change release description to fit Travis binaries
2016-09-29 09:35:44 -04:00
Rémi Verschelde
e9d9083898 AppVeyor: Change release description to fit Travis binaries 2016-09-29 15:29:59 +02:00
Andrew Gallant
46dff8f4be Be better with short circuiting with --quiet.
It didn't make sense for --quiet to be part of the printer, because --quiet
doesn't just mean "don't print," it also means, "stop after the first
match is found." This needs to be wired all the way up through directory
traversal, and it also needs to cause all of the search workers to quit
as well. We do it with an atomic that is only checked with --quiet is
given.

Fixes #116.
2016-09-28 20:50:50 -04:00
Andrew Gallant
7aa6e87952 clarify 2016-09-28 16:47:10 -04:00
Andrew Gallant
925d0db9f0 Add -s/--case-sensitive flag.
This flag overrides both --smart-case and --ignore-case.

Closes #124.
2016-09-28 16:32:29 -04:00
Andrew Gallant
316ffd87b3 bump docopt to 0.6.86 2016-09-28 15:56:59 -04:00
依云
5943b1effe if --color always, always print with color, even when --vimgrep is given 2016-09-28 20:10:07 +08:00
Andrew Gallant
c42f97b4da Merge pull request #122 from lilydjwg/color-filename
colorize filepath at the beginning of line too
2016-09-28 07:06:34 -04:00
依云
0d9bba7816 colorize filepath at the beginning of line too 2016-09-28 11:54:43 +08:00
Andrew Gallant
3550f2e29a Merge pull request #111 from gsquire/max-depth
Max depth option
2016-09-27 19:46:29 -04:00
Garrett Squire
babe80d498 add a max-depth option for directory traversal
CR and add integration test
2016-09-27 16:14:53 -07:00
Andrew Gallant
3e892a7a80 Correct example with --type-add.
Fixes #118.
2016-09-27 18:35:06 -04:00
Andrew Gallant
1df3f0b793 Merge pull request #114 from cetra3/colorChoice
Create Colour Choice struct to adjust colours depending on platform
2016-09-27 09:43:47 -04:00
cetra3
b3935935cb Add colour choice 2016-09-27 22:51:07 +09:30
Andrew Gallant
67abbf6f22 Merge pull request #115 from nickstenning/update-brew-hashes
Update brew 0.2.1 package hashes
2016-09-27 07:01:27 -04:00
Nick Stenning
7b9f7d7dc6 Update brew 0.2.1 package hashes 2016-09-27 12:01:22 +02:00
Andrew Gallant
7ab29a91d0 fix use of --type-add 2016-09-26 20:58:28 -04:00
Andrew Gallant
9fa38c6232 brew 0.2.1 2016-09-26 20:42:12 -04:00
64 changed files with 11766 additions and 4665 deletions

4
.gitignore vendored
View File

@@ -2,3 +2,7 @@
tags
target
/grep/Cargo.lock
/globset/Cargo.lock
/ignore/Cargo.lock
/termcolor/Cargo.lock
/wincolor/Cargo.lock

View File

@@ -15,9 +15,6 @@ matrix:
- os: linux
rust: nightly
env: TARGET=x86_64-unknown-linux-musl
- os: osx
rust: nightly
env: TARGET=i686-apple-darwin
- os: osx
rust: nightly
env: TARGET=x86_64-apple-darwin
@@ -33,13 +30,10 @@ matrix:
env: TARGET=x86_64-apple-darwin
# Minimum Rust supported channel.
- os: linux
rust: 1.9.0
env: TARGET=x86_64-unknown-linux-musl
- os: linux
rust: 1.9.0
rust: 1.11.0
env: TARGET=x86_64-unknown-linux-gnu
- os: osx
rust: 1.9.0
rust: 1.11.0
env: TARGET=x86_64-apple-darwin
before_install:

View File

@@ -1,3 +1,223 @@
0.3.0
=====
This is a new minor version release of ripgrep that includes two breaking
changes with lots of bug fixes and some new features and performance
improvements. Notably, if you had a problem with colors or piping on Windows
before, then that should now be fixed in this release.
**BREAKING CHANGES**:
* ripgrep now requires Rust 1.11 to compile. Previously, it could build on
Rust 1.9. The cause of this was the move from
[Docopt to Clap](https://github.com/BurntSushi/ripgrep/pull/233)
for argument parsing.
* The `-e/--regexp` flag can no longer accept a pattern starting with a `-`.
There are two work-arounds: `rg -- -foo` and `rg [-]foo` or `rg -e [-]foo`
will all search for the same `-foo` pattern. The cause of this was the move
from [Docopt to Clap](https://github.com/BurntSushi/ripgrep/pull/233)
for argument parsing.
[This may get fixed in the
future.](https://github.com/kbknapp/clap-rs/issues/742).
Performance improvements:
* [PERF #33](https://github.com/BurntSushi/ripgrep/issues/33):
ripgrep now performs similar to GNU grep on small corpora.
* [PERF #136](https://github.com/BurntSushi/ripgrep/issues/136):
ripgrep no longer slows down because of argument parsing when given a large
argument list.
Feature enhancements:
* Added or improved file type filtering for Elixir.
* [FEATURE #7](https://github.com/BurntSushi/ripgrep/issues/7):
Add a `-f/--file` flag that causes ripgrep to read patterns from a file.
* [FEATURE #51](https://github.com/BurntSushi/ripgrep/issues/51):
Add a `--colors` flag that enables one to customize the colors used in
ripgrep's output.
* [FEATURE #138](https://github.com/BurntSushi/ripgrep/issues/138):
Add a `--files-without-match` flag that shows only file paths that contain
zero matches.
* [FEATURE #230](https://github.com/BurntSushi/ripgrep/issues/230):
Add completion files to the release (Bash, Fish and PowerShell).
Bug fixes:
* [BUG #37](https://github.com/BurntSushi/ripgrep/issues/37):
Use correct ANSI escape sequences when `TERM=screen.linux`.
* [BUG #94](https://github.com/BurntSushi/ripgrep/issues/94):
ripgrep now detects stdin on Windows automatically.
* [BUG #117](https://github.com/BurntSushi/ripgrep/issues/117):
Colors should now work correctly and automatically inside mintty.
* [BUG #182](https://github.com/BurntSushi/ripgrep/issues/182):
Colors should now work within Emacs. In particular, `--color=always` will
emit colors regardless of the current environment.
* [BUG #189](https://github.com/BurntSushi/ripgrep/issues/189):
Show less content when running `rg -h`. The full help content can be
accessed with `rg --help`.
* [BUG #210](https://github.com/BurntSushi/ripgrep/issues/210):
Support non-UTF-8 file names on Unix platforms.
* [BUG #231](https://github.com/BurntSushi/ripgrep/issues/231):
Switch from block buffering to line buffering.
* [BUG #241](https://github.com/BurntSushi/ripgrep/issues/241):
Some error messages weren't suppressed when `--no-messages` was used.
0.2.9
=====
Bug fixes:
* [BUG #226](https://github.com/BurntSushi/ripgrep/issues/226):
File paths explicitly given on the command line weren't searched in parallel.
(This was a regression in `0.2.7`.)
* [BUG #228](https://github.com/BurntSushi/ripgrep/issues/228):
If a directory was given to `--ignore-file`, ripgrep's memory usage would
grow without bound.
0.2.8
=====
Bug fixes:
* Fixed a bug with the SIMD/AVX features for using bytecount in commit
`4ca15a`.
0.2.7
=====
Performance improvements:
* [PERF #223](https://github.com/BurntSushi/ripgrep/pull/223):
Added a parallel recursive directory iterator. This results in major
performance improvements on large repositories.
* [PERF #11](https://github.com/BurntSushi/ripgrep/pull/11):
ripgrep now uses the `bytecount` library for counting new lines. In some
cases, ripgrep runs twice as fast. Use
`RUSTFLAGS="-C target-cpu=native" cargo build --release --features 'simd-accel avx-accel'`
to get the fastest possible binary.
Feature enhancements:
* Added or improved file type filtering for Agda, Tex, Taskpaper, Markdown,
asciidoc, textile, rdoc, org, creole, wiki, pod, C#, PDF, C, C++.
* [FEATURE #149](https://github.com/BurntSushi/ripgrep/issues/149):
Add a new `--no-messages` flag that suppresses error messages.
Note that `rg foo 2> /dev/null` also works.
* [FEATURE #159](https://github.com/BurntSushi/ripgrep/issues/159):
Add a new `-m/--max-count` flag that limits the total number of matches
printed for each file searched.
Bug fixes:
* [BUG #199](https://github.com/BurntSushi/ripgrep/issues/199):
Fixed a bug where `-S/--smart-case` wasn't being applied correctly to
literal optimizations.
* [BUG #203](https://github.com/BurntSushi/ripgrep/issues/203):
Mention the full name, ripgrep, in more places. It now appears in
the output of `--help` and `--version`. The repository URL is now also
in the output of `--help` and the man page.
* [BUG #215](https://github.com/BurntSushi/ripgrep/issues/215):
Include small note about how to search for a pattern that starts with a `-`.
0.2.6
=====
Feature enhancements:
* Added or improved file type filtering for Fish.
Bug fixes:
* [BUG #206](https://github.com/BurntSushi/ripgrep/issues/206):
Fixed a regression with `-g/--glob` flag in `0.2.5`.
0.2.5
=====
Feature enhancements:
* Added or improved file type filtering for Groovy, Handlebars, Tcl, zsh and
Python.
* [FEATURE #9](https://github.com/BurntSushi/ripgrep/issues/9):
Support global gitignore config and `.git/info/exclude` files.
* [FEATURE #45](https://github.com/BurntSushi/ripgrep/issues/45):
Add --ignore-file flag for specifying additional ignore files.
* [FEATURE #202](https://github.com/BurntSushi/ripgrep/pull/202):
Introduce a new
[`ignore`](https://github.com/BurntSushi/ripgrep/tree/master/ignore)
crate that encapsulates all of ripgrep's gitignore matching logic.
Bug fixes:
* [BUG #44](https://github.com/BurntSushi/ripgrep/issues/44):
ripgrep runs slowly when given lots of positional arguments that are
directories.
* [BUG #119](https://github.com/BurntSushi/ripgrep/issues/119):
ripgrep didn't reset terminal colors if it was interrupted by `^C`.
Fixed in [PR #187](https://github.com/BurntSushi/ripgrep/pull/187).
* [BUG #184](https://github.com/BurntSushi/ripgrep/issues/184):
Fixed a bug related to interpreting gitignore files in parent directories.
0.2.4
=====
SKIPPED.
0.2.3
=====
Bug fixes:
* [BUG #164](https://github.com/BurntSushi/ripgrep/issues/164):
Fixes a segfault on macos builds.
* [BUG #167](https://github.com/BurntSushi/ripgrep/issues/167):
Clarify documentation for --threads.
0.2.2
=====
Packaging updates:
* `ripgrep` is now in homebrew-core. `brew install ripgrep` will do the trick
on a Mac.
* `ripgrep` is now in the Archlinux community repository.
`pacman -S ripgrep` will do the trick on Archlinux.
* Support has been discontinued for i686-darwin.
* Glob matching has been moved out into its own crate:
[`globset`](https://crates.io/crates/globset).
Feature enhancements:
* Added or improved file type filtering for CMake, config, Jinja, Markdown,
Spark.
* [FEATURE #109](https://github.com/BurntSushi/ripgrep/issues/109):
Add a --max-depth flag for directory traversal.
* [FEATURE #124](https://github.com/BurntSushi/ripgrep/issues/124):
Add -s/--case-sensitive flag. Overrides --smart-case.
* [FEATURE #139](https://github.com/BurntSushi/ripgrep/pull/139):
The `ripgrep` repo is now a Homebrew tap. This is useful for installing
SIMD accelerated binaries, which aren't available in homebrew-core.
Bug fixes:
* [BUG #87](https://github.com/BurntSushi/ripgrep/issues/87),
[BUG #127](https://github.com/BurntSushi/ripgrep/issues/127),
[BUG #131](https://github.com/BurntSushi/ripgrep/issues/131):
Various issues related to glob matching.
* [BUG #116](https://github.com/BurntSushi/ripgrep/issues/116):
--quiet should stop search after first match.
* [BUG #121](https://github.com/BurntSushi/ripgrep/pull/121):
--color always should show colors, even when --vimgrep is used.
* [BUG #122](https://github.com/BurntSushi/ripgrep/pull/122):
Colorize file path at beginning of line.
* [BUG #134](https://github.com/BurntSushi/ripgrep/issues/134):
Processing a large ignore file (thousands of globs) was very slow.
* [BUG #137](https://github.com/BurntSushi/ripgrep/issues/137):
Always follow symlinks when given as an explicit argument.
* [BUG #147](https://github.com/BurntSushi/ripgrep/issues/147):
Clarify documentation for --replace.
0.2.1
=====
Feature enhancements:

247
Cargo.lock generated
View File

@@ -1,24 +1,22 @@
[root]
name = "ripgrep"
version = "0.2.1"
version = "0.3.0"
dependencies = [
"deque 0.3.1 (registry+https://github.com/rust-lang/crates.io-index)",
"docopt 0.6.85 (registry+https://github.com/rust-lang/crates.io-index)",
"bytecount 0.1.4 (registry+https://github.com/rust-lang/crates.io-index)",
"clap 2.18.0 (registry+https://github.com/rust-lang/crates.io-index)",
"ctrlc 2.0.1 (registry+https://github.com/rust-lang/crates.io-index)",
"env_logger 0.3.5 (registry+https://github.com/rust-lang/crates.io-index)",
"fnv 1.0.5 (registry+https://github.com/rust-lang/crates.io-index)",
"glob 0.2.11 (registry+https://github.com/rust-lang/crates.io-index)",
"grep 0.1.3",
"grep 0.1.4",
"ignore 0.1.5",
"kernel32-sys 0.2.2 (registry+https://github.com/rust-lang/crates.io-index)",
"lazy_static 0.2.1 (registry+https://github.com/rust-lang/crates.io-index)",
"libc 0.2.16 (registry+https://github.com/rust-lang/crates.io-index)",
"lazy_static 0.2.2 (registry+https://github.com/rust-lang/crates.io-index)",
"libc 0.2.17 (registry+https://github.com/rust-lang/crates.io-index)",
"log 0.3.6 (registry+https://github.com/rust-lang/crates.io-index)",
"memchr 0.1.11 (registry+https://github.com/rust-lang/crates.io-index)",
"memmap 0.2.3 (registry+https://github.com/rust-lang/crates.io-index)",
"memmap 0.5.0 (registry+https://github.com/rust-lang/crates.io-index)",
"num_cpus 1.1.0 (registry+https://github.com/rust-lang/crates.io-index)",
"regex 0.1.77 (registry+https://github.com/rust-lang/crates.io-index)",
"rustc-serialize 0.3.19 (registry+https://github.com/rust-lang/crates.io-index)",
"term 0.4.4 (registry+https://github.com/rust-lang/crates.io-index)",
"walkdir 0.1.8 (registry+https://github.com/rust-lang/crates.io-index)",
"regex 0.1.80 (registry+https://github.com/rust-lang/crates.io-index)",
"termcolor 0.1.0",
"winapi 0.2.8 (registry+https://github.com/rust-lang/crates.io-index)",
]
@@ -31,22 +29,51 @@ dependencies = [
]
[[package]]
name = "deque"
version = "0.3.1"
name = "ansi_term"
version = "0.9.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "bitflags"
version = "0.7.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "bytecount"
version = "0.1.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"rand 0.3.14 (registry+https://github.com/rust-lang/crates.io-index)",
"simd 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "docopt"
version = "0.6.85"
name = "clap"
version = "2.18.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"lazy_static 0.2.1 (registry+https://github.com/rust-lang/crates.io-index)",
"regex 0.1.77 (registry+https://github.com/rust-lang/crates.io-index)",
"rustc-serialize 0.3.19 (registry+https://github.com/rust-lang/crates.io-index)",
"ansi_term 0.9.0 (registry+https://github.com/rust-lang/crates.io-index)",
"bitflags 0.7.0 (registry+https://github.com/rust-lang/crates.io-index)",
"libc 0.2.17 (registry+https://github.com/rust-lang/crates.io-index)",
"strsim 0.5.1 (registry+https://github.com/rust-lang/crates.io-index)",
"term_size 0.2.1 (registry+https://github.com/rust-lang/crates.io-index)",
"unicode-segmentation 0.1.2 (registry+https://github.com/rust-lang/crates.io-index)",
"unicode-width 0.1.3 (registry+https://github.com/rust-lang/crates.io-index)",
"vec_map 0.6.0 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "crossbeam"
version = "0.2.10"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "ctrlc"
version = "2.0.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"kernel32-sys 0.2.2 (registry+https://github.com/rust-lang/crates.io-index)",
"libc 0.2.17 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi 0.2.8 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
@@ -55,7 +82,7 @@ version = "0.3.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"log 0.3.6 (registry+https://github.com/rust-lang/crates.io-index)",
"regex 0.1.77 (registry+https://github.com/rust-lang/crates.io-index)",
"regex 0.1.80 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
@@ -65,28 +92,49 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "fs2"
version = "0.2.5"
version = "0.3.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"kernel32-sys 0.2.2 (registry+https://github.com/rust-lang/crates.io-index)",
"libc 0.2.16 (registry+https://github.com/rust-lang/crates.io-index)",
"libc 0.2.17 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi 0.2.8 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "glob"
version = "0.2.11"
source = "registry+https://github.com/rust-lang/crates.io-index"
name = "globset"
version = "0.1.2"
dependencies = [
"aho-corasick 0.5.3 (registry+https://github.com/rust-lang/crates.io-index)",
"fnv 1.0.5 (registry+https://github.com/rust-lang/crates.io-index)",
"lazy_static 0.2.2 (registry+https://github.com/rust-lang/crates.io-index)",
"log 0.3.6 (registry+https://github.com/rust-lang/crates.io-index)",
"memchr 0.1.11 (registry+https://github.com/rust-lang/crates.io-index)",
"regex 0.1.80 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "grep"
version = "0.1.3"
version = "0.1.4"
dependencies = [
"log 0.3.6 (registry+https://github.com/rust-lang/crates.io-index)",
"memchr 0.1.11 (registry+https://github.com/rust-lang/crates.io-index)",
"memmap 0.2.3 (registry+https://github.com/rust-lang/crates.io-index)",
"regex 0.1.77 (registry+https://github.com/rust-lang/crates.io-index)",
"regex-syntax 0.3.5 (registry+https://github.com/rust-lang/crates.io-index)",
"memmap 0.5.0 (registry+https://github.com/rust-lang/crates.io-index)",
"regex 0.1.80 (registry+https://github.com/rust-lang/crates.io-index)",
"regex-syntax 0.3.9 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "ignore"
version = "0.1.5"
dependencies = [
"crossbeam 0.2.10 (registry+https://github.com/rust-lang/crates.io-index)",
"globset 0.1.2",
"lazy_static 0.2.2 (registry+https://github.com/rust-lang/crates.io-index)",
"log 0.3.6 (registry+https://github.com/rust-lang/crates.io-index)",
"memchr 0.1.11 (registry+https://github.com/rust-lang/crates.io-index)",
"regex 0.1.80 (registry+https://github.com/rust-lang/crates.io-index)",
"thread_local 0.3.0 (registry+https://github.com/rust-lang/crates.io-index)",
"walkdir 1.0.1 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
@@ -100,12 +148,12 @@ dependencies = [
[[package]]
name = "lazy_static"
version = "0.2.1"
version = "0.2.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "libc"
version = "0.2.16"
version = "0.2.17"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
@@ -118,17 +166,17 @@ name = "memchr"
version = "0.1.11"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"libc 0.2.16 (registry+https://github.com/rust-lang/crates.io-index)",
"libc 0.2.17 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "memmap"
version = "0.2.3"
version = "0.5.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"fs2 0.2.5 (registry+https://github.com/rust-lang/crates.io-index)",
"fs2 0.3.0 (registry+https://github.com/rust-lang/crates.io-index)",
"kernel32-sys 0.2.2 (registry+https://github.com/rust-lang/crates.io-index)",
"libc 0.2.16 (registry+https://github.com/rust-lang/crates.io-index)",
"libc 0.2.17 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi 0.2.8 (registry+https://github.com/rust-lang/crates.io-index)",
]
@@ -137,25 +185,17 @@ name = "num_cpus"
version = "1.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"libc 0.2.16 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "rand"
version = "0.3.14"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"libc 0.2.16 (registry+https://github.com/rust-lang/crates.io-index)",
"libc 0.2.17 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "regex"
version = "0.1.77"
version = "0.1.80"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"aho-corasick 0.5.3 (registry+https://github.com/rust-lang/crates.io-index)",
"memchr 0.1.11 (registry+https://github.com/rust-lang/crates.io-index)",
"regex-syntax 0.3.5 (registry+https://github.com/rust-lang/crates.io-index)",
"regex-syntax 0.3.9 (registry+https://github.com/rust-lang/crates.io-index)",
"simd 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)",
"thread_local 0.2.7 (registry+https://github.com/rust-lang/crates.io-index)",
"utf8-ranges 0.1.3 (registry+https://github.com/rust-lang/crates.io-index)",
@@ -163,12 +203,7 @@ dependencies = [
[[package]]
name = "regex-syntax"
version = "0.3.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "rustc-serialize"
version = "0.3.19"
version = "0.3.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
@@ -182,21 +217,38 @@ version = "0.5.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "term"
version = "0.4.4"
name = "term_size"
version = "0.2.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"kernel32-sys 0.2.2 (registry+https://github.com/rust-lang/crates.io-index)",
"libc 0.2.17 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi 0.2.8 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "termcolor"
version = "0.1.0"
dependencies = [
"wincolor 0.1.0",
]
[[package]]
name = "thread-id"
version = "2.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"kernel32-sys 0.2.2 (registry+https://github.com/rust-lang/crates.io-index)",
"libc 0.2.16 (registry+https://github.com/rust-lang/crates.io-index)",
"libc 0.2.17 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "thread-id"
version = "3.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"kernel32-sys 0.2.2 (registry+https://github.com/rust-lang/crates.io-index)",
"libc 0.2.17 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
@@ -207,14 +259,51 @@ dependencies = [
"thread-id 2.0.0 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "thread_local"
version = "0.3.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"thread-id 3.0.0 (registry+https://github.com/rust-lang/crates.io-index)",
"unreachable 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "unicode-segmentation"
version = "0.1.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "unicode-width"
version = "0.1.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "unreachable"
version = "0.1.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"void 1.0.2 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "utf8-ranges"
version = "0.1.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "vec_map"
version = "0.6.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "void"
version = "1.0.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "walkdir"
version = "0.1.8"
version = "1.0.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"kernel32-sys 0.2.2 (registry+https://github.com/rust-lang/crates.io-index)",
@@ -231,31 +320,47 @@ name = "winapi-build"
version = "0.1.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "wincolor"
version = "0.1.0"
dependencies = [
"kernel32-sys 0.2.2 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi 0.2.8 (registry+https://github.com/rust-lang/crates.io-index)",
]
[metadata]
"checksum aho-corasick 0.5.3 (registry+https://github.com/rust-lang/crates.io-index)" = "ca972c2ea5f742bfce5687b9aef75506a764f61d37f8f649047846a9686ddb66"
"checksum deque 0.3.1 (registry+https://github.com/rust-lang/crates.io-index)" = "1614659040e711785ed8ea24219140654da1729f3ec8a47a9719d041112fe7bf"
"checksum docopt 0.6.85 (registry+https://github.com/rust-lang/crates.io-index)" = "1b88d783674021c5570e7238e17985b9b8c7141d90f33de49031b8d56e7f0bf9"
"checksum ansi_term 0.9.0 (registry+https://github.com/rust-lang/crates.io-index)" = "23ac7c30002a5accbf7e8987d0632fa6de155b7c3d39d0067317a391e00a2ef6"
"checksum bitflags 0.7.0 (registry+https://github.com/rust-lang/crates.io-index)" = "aad18937a628ec6abcd26d1489012cc0e18c21798210f491af69ded9b881106d"
"checksum bytecount 0.1.4 (registry+https://github.com/rust-lang/crates.io-index)" = "49e3c21915578e2300b08d3c174a8ac887e0c6421dff86fdc4d741dc29e5d413"
"checksum clap 2.18.0 (registry+https://github.com/rust-lang/crates.io-index)" = "40046b8a004bf3ba43b9078bf4b9b6d1708406a234848f925dbd7160a374c8a8"
"checksum crossbeam 0.2.10 (registry+https://github.com/rust-lang/crates.io-index)" = "0c5ea215664ca264da8a9d9c3be80d2eaf30923c259d03e870388eb927508f97"
"checksum ctrlc 2.0.1 (registry+https://github.com/rust-lang/crates.io-index)" = "77f98bb69e3fefadcc5ca80a1368a55251f70295168203e01165bcaecb270891"
"checksum env_logger 0.3.5 (registry+https://github.com/rust-lang/crates.io-index)" = "15abd780e45b3ea4f76b4e9a26ff4843258dd8a3eed2775a0e7368c2e7936c2f"
"checksum fnv 1.0.5 (registry+https://github.com/rust-lang/crates.io-index)" = "6cc484842f1e2884faf56f529f960cc12ad8c71ce96cc7abba0a067c98fee344"
"checksum fs2 0.2.5 (registry+https://github.com/rust-lang/crates.io-index)" = "bcd414e5a1a979b931bb92f41b7a54106d3f6d2e6c253e9ce943b7cd468251ef"
"checksum glob 0.2.11 (registry+https://github.com/rust-lang/crates.io-index)" = "8be18de09a56b60ed0edf84bc9df007e30040691af7acd1c41874faac5895bfb"
"checksum fs2 0.3.0 (registry+https://github.com/rust-lang/crates.io-index)" = "640001e1bd865c7c32806292822445af576a6866175b5225aa2087ca5e3de551"
"checksum kernel32-sys 0.2.2 (registry+https://github.com/rust-lang/crates.io-index)" = "7507624b29483431c0ba2d82aece8ca6cdba9382bff4ddd0f7490560c056098d"
"checksum lazy_static 0.2.1 (registry+https://github.com/rust-lang/crates.io-index)" = "49247ec2a285bb3dcb23cbd9c35193c025e7251bfce77c1d5da97e6362dffe7f"
"checksum libc 0.2.16 (registry+https://github.com/rust-lang/crates.io-index)" = "408014cace30ee0f767b1c4517980646a573ec61a57957aeeabcac8ac0a02e8d"
"checksum lazy_static 0.2.2 (registry+https://github.com/rust-lang/crates.io-index)" = "6abe0ee2e758cd6bc8a2cd56726359007748fbf4128da998b65d0b70f881e19b"
"checksum libc 0.2.17 (registry+https://github.com/rust-lang/crates.io-index)" = "044d1360593a78f5c8e5e710beccdc24ab71d1f01bc19a29bcacdba22e8475d8"
"checksum log 0.3.6 (registry+https://github.com/rust-lang/crates.io-index)" = "ab83497bf8bf4ed2a74259c1c802351fcd67a65baa86394b6ba73c36f4838054"
"checksum memchr 0.1.11 (registry+https://github.com/rust-lang/crates.io-index)" = "d8b629fb514376c675b98c1421e80b151d3817ac42d7c667717d282761418d20"
"checksum memmap 0.2.3 (registry+https://github.com/rust-lang/crates.io-index)" = "f20f72ed93291a72e22e8b16bb18762183bb4943f0f483da5b8be1a9e8192752"
"checksum memmap 0.5.0 (registry+https://github.com/rust-lang/crates.io-index)" = "065ce59af31c18ea2c419100bda6247dd4ec3099423202b12f0bd32e529fabd2"
"checksum num_cpus 1.1.0 (registry+https://github.com/rust-lang/crates.io-index)" = "8890e6084723d57d0df8d2720b0d60c6ee67d6c93e7169630e4371e88765dcad"
"checksum rand 0.3.14 (registry+https://github.com/rust-lang/crates.io-index)" = "2791d88c6defac799c3f20d74f094ca33b9332612d9aef9078519c82e4fe04a5"
"checksum regex 0.1.77 (registry+https://github.com/rust-lang/crates.io-index)" = "64b03446c466d35b42f2a8b203c8e03ed8b91c0f17b56e1f84f7210a257aa665"
"checksum regex-syntax 0.3.5 (registry+https://github.com/rust-lang/crates.io-index)" = "279401017ae31cf4e15344aa3f085d0e2e5c1e70067289ef906906fdbe92c8fd"
"checksum rustc-serialize 0.3.19 (registry+https://github.com/rust-lang/crates.io-index)" = "6159e4e6e559c81bd706afe9c8fd68f547d3e851ce12e76b1de7914bab61691b"
"checksum regex 0.1.80 (registry+https://github.com/rust-lang/crates.io-index)" = "4fd4ace6a8cf7860714a2c2280d6c1f7e6a413486c13298bbc86fd3da019402f"
"checksum regex-syntax 0.3.9 (registry+https://github.com/rust-lang/crates.io-index)" = "f9ec002c35e86791825ed294b50008eea9ddfc8def4420124fbc6b08db834957"
"checksum simd 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)" = "63b5847c2d766ca7ce7227672850955802fabd779ba616aeabead4c2c3877023"
"checksum strsim 0.5.1 (registry+https://github.com/rust-lang/crates.io-index)" = "50c069df92e4b01425a8bf3576d5d417943a6a7272fbabaf5bd80b1aaa76442e"
"checksum term 0.4.4 (registry+https://github.com/rust-lang/crates.io-index)" = "3deff8a2b3b6607d6d7cc32ac25c0b33709453ca9cceac006caac51e963cf94a"
"checksum term_size 0.2.1 (registry+https://github.com/rust-lang/crates.io-index)" = "3f7f5f3f71b0040cecc71af239414c23fd3c73570f5ff54cf50e03cef637f2a0"
"checksum thread-id 2.0.0 (registry+https://github.com/rust-lang/crates.io-index)" = "a9539db560102d1cef46b8b78ce737ff0bb64e7e18d35b2a5688f7d097d0ff03"
"checksum thread-id 3.0.0 (registry+https://github.com/rust-lang/crates.io-index)" = "4437c97558c70d129e40629a5b385b3fb1ffac301e63941335e4d354081ec14a"
"checksum thread_local 0.2.7 (registry+https://github.com/rust-lang/crates.io-index)" = "8576dbbfcaef9641452d5cf0df9b0e7eeab7694956dd33bb61515fb8f18cfdd5"
"checksum thread_local 0.3.0 (registry+https://github.com/rust-lang/crates.io-index)" = "50057ca52c629a39aed52d8eb253800cb727875fa6fc7c4b1445f0ac3b50c27c"
"checksum unicode-segmentation 0.1.2 (registry+https://github.com/rust-lang/crates.io-index)" = "b905d0fc2a1f0befd86b0e72e31d1787944efef9d38b9358a9e92a69757f7e3b"
"checksum unicode-width 0.1.3 (registry+https://github.com/rust-lang/crates.io-index)" = "2d6722facc10989f63ee0e20a83cd4e1714a9ae11529403ac7e0afd069abc39e"
"checksum unreachable 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)" = "1f2ae5ddb18e1c92664717616dd9549dde73f539f01bd7b77c2edb2446bdff91"
"checksum utf8-ranges 0.1.3 (registry+https://github.com/rust-lang/crates.io-index)" = "a1ca13c08c41c9c3e04224ed9ff80461d97e121589ff27c753a16cb10830ae0f"
"checksum walkdir 0.1.8 (registry+https://github.com/rust-lang/crates.io-index)" = "c66c0b9792f0a765345452775f3adbd28dde9d33f30d13e5dcc5ae17cf6f3780"
"checksum vec_map 0.6.0 (registry+https://github.com/rust-lang/crates.io-index)" = "cac5efe5cb0fa14ec2f84f83c701c562ee63f6dcc680861b21d65c682adfb05f"
"checksum void 1.0.2 (registry+https://github.com/rust-lang/crates.io-index)" = "6a02e4885ed3bc0f2de90ea6dd45ebcbb66dacffe03547fadbb0eeae2770887d"
"checksum walkdir 1.0.1 (registry+https://github.com/rust-lang/crates.io-index)" = "98da26f00240118fbb7a06fa29579d1b39d34cd6e0505ea5c125b26d5260a967"
"checksum winapi 0.2.8 (registry+https://github.com/rust-lang/crates.io-index)" = "167dc9d6949a9b857f3451275e911c3f44255842c1f7a76f33c55103a909087a"
"checksum winapi-build 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)" = "2d315eee3b34aca4797b2da6b13ed88266e6d612562a0c46390af8299fc699bc"

View File

@@ -1,6 +1,6 @@
[package]
name = "ripgrep"
version = "0.2.1" #:version
version = "0.3.0" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
Line oriented search tool using Rust's regex library. Combines the raw
@@ -12,6 +12,8 @@ repository = "https://github.com/BurntSushi/ripgrep"
readme = "README.md"
keywords = ["regex", "grep", "egrep", "search", "pattern"]
license = "Unlicense/MIT"
exclude = ["HomebrewFormula"]
build = "build.rs"
[[bin]]
bench = false
@@ -23,31 +25,32 @@ name = "integration"
path = "tests/tests.rs"
[dependencies]
deque = "0.3"
docopt = "0.6"
bytecount = "0.1.4"
clap = "2.18"
ctrlc = "2.0"
env_logger = "0.3"
fnv = "1.0"
grep = { version = "0.1.3", path = "grep" }
grep = { version = "0.1.4", path = "grep" }
ignore = { version = "0.1.5", path = "ignore" }
lazy_static = "0.2"
libc = "0.2"
log = "0.3"
memchr = "0.1"
memmap = "0.2"
memmap = "0.5"
num_cpus = "1"
regex = "0.1.77"
rustc-serialize = "0.3"
term = "0.4"
walkdir = "0.1"
termcolor = { version = "0.1.0", path = "termcolor" }
[target.'cfg(windows)'.dependencies]
kernel32-sys = "0.2"
winapi = "0.2"
kernel32-sys = "0.2.2"
winapi = "0.2.8"
[build-dependencies]
clap = "2.18"
lazy_static = "0.2"
[features]
simd-accel = ["regex/simd-accel"]
[dev-dependencies]
glob = "0.2"
avx-accel = ["bytecount/avx-accel"]
simd-accel = ["bytecount/simd-accel", "regex/simd-accel"]
[profile.release]
debug = true

1
HomebrewFormula Symbolic link
View File

@@ -0,0 +1 @@
pkg/brew

View File

@@ -5,7 +5,7 @@ Silver Searcher (an `ack` clone) with the raw speed of GNU grep. `ripgrep` has
first class support on Windows, Mac and Linux, with binary downloads available
for [every release](https://github.com/BurntSushi/ripgrep/releases).
[![Linux build status](https://api.travis-ci.org/BurntSushi/ripgrep.png)](https://travis-ci.org/BurntSushi/ripgrep)
[![Linux build status](https://travis-ci.org/BurntSushi/ripgrep.svg?branch=master)](https://travis-ci.org/BurntSushi/ripgrep)
[![Windows build status](https://ci.appveyor.com/api/projects/status/github/BurntSushi/ripgrep?svg=true)](https://ci.appveyor.com/project/BurntSushi/ripgrep)
[![](https://img.shields.io/crates/v/ripgrep.svg)](https://crates.io/crates/ripgrep)
@@ -15,11 +15,12 @@ Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org).
[![A screenshot of a sample search with ripgrep](http://burntsushi.net/stuff/ripgrep1.png)](http://burntsushi.net/stuff/ripgrep1.png)
### Quick example comparing tools
### Quick examples comparing tools
This example searches the entire Linux kernel source tree (after running
`make defconfig && make -j8`) for `[A-Z]+_SUSPEND`, where all matches must be
words. Timings were collected on a system with an Intel i7-6900K 3.2 GHz.
words. Timings were collected on a system with an Intel i7-6900K 3.2 GHz, and
ripgrep was compiled using the `compile` script in this repo.
Please remember that a single benchmark is never enough! See my
[blog post on `ripgrep`](http://blog.burntsushi.net/ripgrep/)
@@ -27,17 +28,41 @@ for a very detailed comparison with more benchmarks and analysis.
| Tool | Command | Line count | Time |
| ---- | ------- | ---------- | ---- |
| ripgrep | `rg -n -w '[A-Z]+_SUSPEND'` | 450 | **0.245s** |
| ripgrep (Unicode) | `rg -n -w '[A-Z]+_SUSPEND'` | 450 | **0.134s** |
| [The Silver Searcher](https://github.com/ggreer/the_silver_searcher) | `ag -w '[A-Z]+_SUSPEND'` | 450 | 0.753s |
| [git grep](https://www.kernel.org/pub/software/scm/git/docs/git-grep.html) | `LC_ALL=C git grep -E -n -w '[A-Z]+_SUSPEND'` | 450 | 0.823s |
| [git grep](https://www.kernel.org/pub/software/scm/git/docs/git-grep.html) | `LC_ALL=en_US.UTF-8 git grep -E -n -w '[A-Z]+_SUSPEND'` | 450 | 2.880s |
| [git grep (Unicode)](https://www.kernel.org/pub/software/scm/git/docs/git-grep.html) | `LC_ALL=en_US.UTF-8 git grep -E -n -w '[A-Z]+_SUSPEND'` | 450 | 2.880s |
| [sift](https://github.com/svent/sift) | `sift --git -n -w '[A-Z]+_SUSPEND'` | 450 | 3.656s |
| [The Platinum Searcher](https://github.com/monochromegane/the_platinum_searcher) | `pt -w -e '[A-Z]+_SUSPEND'` | 450 | 12.369s |
| [ack](http://beyondgrep.com/) | `ack -w '[A-Z]+_SUSPEND'` | 1878 | 16.952s |
| [ack](https://github.com/petdance/ack2) | `ack -w '[A-Z]+_SUSPEND'` | 1878 | 16.952s |
(Yes, `ack` [has](https://github.com/petdance/ack2/issues/445) a
[bug](https://github.com/petdance/ack2/issues/14).)
Here's another benchmark that disregards gitignore files and searches with a
whitelist instead. The corpus is the same as in the previous benchmark, and the
flags passed to each command ensures that they are doing equivalent work:
| Tool | Command | Line count | Time |
| ---- | ------- | ---------- | ---- |
| ripgrep | `rg -L -u -tc -n -w '[A-Z]+_SUSPEND'` | 404 | **0.108s** |
| [ucg](https://github.com/gvansickle/ucg) | `ucg --type=cc -w '[A-Z]+_SUSPEND'` | 392 | 0.219s |
| [GNU grep](https://www.gnu.org/software/grep/) | `egrep -R -n --include='*.c' --include='*.h' -w '[A-Z]+_SUSPEND'` | 404 | 0.733s |
(`ucg` [has slightly different behavior in the presence of symbolic links](https://github.com/gvansickle/ucg/issues/106).)
And finally, a straight up comparison between ripgrep and GNU grep on a single
large file (~9.3GB,
[`OpenSubtitles2016.raw.en.gz`](http://opus.lingfil.uu.se/OpenSubtitles2016/mono/OpenSubtitles2016.raw.en.gz)):
| Tool | Command | Line count | Time |
| ---- | ------- | ---------- | ---- |
| ripgrep | `rg -w 'Sherlock [A-Z]\w+'` | 5268 | **2.520s** |
| [GNU grep](https://www.gnu.org/software/grep/) | `LC_ALL=C egrep -w 'Sherlock [A-Z]\w+'` | 5268 | 7.143s |
In the above benchmark, passing the `-n` flag (for showing line numbers)
increases the times to `3.081s` for ripgrep and `11.403s` for GNU grep.
### Why should I use `ripgrep`?
* It can replace both The Silver Searcher and GNU grep because it is faster
@@ -82,8 +107,9 @@ Summarizing, `ripgrep` is fast because:
[`RegexSet`](https://doc.rust-lang.org/regex/regex/struct.RegexSet.html).
That means a single file path can be matched against multiple glob patterns
simultaneously.
* Uses a Chase-Lev work-stealing queue for quickly distributing work to
multiple threads.
* It uses a lock-free parallel recursive directory iterator, courtesy of
[`crossbeam`](https://docs.rs/crossbeam) and
[`ignore`](https://docs.rs/ignore).
### Installation
@@ -97,18 +123,53 @@ but you'll need to have the
[Microsoft VC++ 2015 redistributable](https://www.microsoft.com/en-us/download/details.aspx?id=48145)
installed.
If you're a **Homebrew** user, then you can install it with a custom formula
(N.B. `ripgrep` isn't actually in Homebrew yet. This just installs the binary
directly):
If you're a **Mac OS X Homebrew** user, then you can install ripgrep either
from homebrew-core, (compiled with rust stable, no SIMD):
```
$ brew install https://raw.githubusercontent.com/BurntSushi/ripgrep/master/pkg/brew/ripgrep.rb
$ brew install ripgrep
```
or you can install a binary compiled with rust nightly (including SIMD and all
optimizations) by utilizing a custom tap:
```
$ brew tap burntsushi/ripgrep https://github.com/BurntSushi/ripgrep.git
$ brew install burntsushi/ripgrep/ripgrep-bin
```
If you're an **Arch Linux** user, then you can install `ripgrep` from the official repos:
```
$ pacman -Syu ripgrep
$ pacman -S ripgrep
```
If you're a **Gentoo** user, you can install `ripgrep` from the [official repo](https://packages.gentoo.org/packages/sys-apps/ripgrep):
```
$ emerge ripgrep
```
If you're a **Fedora 24+** user, you can install `ripgrep` from [copr](https://copr.fedorainfracloud.org/coprs/carlgeorge/ripgrep/):
```
$ dnf copr enable carlgeorge/ripgrep
$ dnf install ripgrep
```
If you're a **RHEL/CentOS 7** user, you can install `ripgrep` from [copr](https://copr.fedorainfracloud.org/coprs/carlgeorge/ripgrep/):
```
$ yum-config-manager --add-repo=https://copr.fedorainfracloud.org/coprs/carlgeorge/ripgrep/repo/epel-7/carlgeorge-ripgrep-epel-7.repo
$ yum install ripgrep
```
If you're a **Nix** user, you can install `ripgrep` from
[nixpkgs](https://github.com/NixOS/nixpkgs/blob/master/pkgs/tools/text/ripgrep/default.nix):
```
$ nix-env --install ripgrep
$ # (Or using the attribute name, which is also `ripgrep`.)
```
If you're a **Rust programmer**, `ripgrep` can be installed with `cargo`:
@@ -214,10 +275,11 @@ $ rg -Tjs foobar
```
To see a list of types supported, run `rg --type-list`. To add a new type, use
`--type-add`:
`--type-add`, which must be accompanied by a pattern for searching (`rg` won't
persist your type settings):
```
$ rg --type-add 'foo:*.foo,*.foobar'
$ rg --type-add 'foo:*.{foo,foobar}' -tfoo bar
```
The type `foo` will now match any file ending with the `.foo` or `.foobar`
@@ -246,9 +308,12 @@ If you have a Rust nightly compiler, then you can enable optional SIMD
acceleration like so:
```
RUSTFLAGS="-C target-cpu=native" cargo build --release --features simd-accel
RUSTFLAGS="-C target-cpu=native" cargo build --release --features 'simd-accel avx-accel'
```
If your machine doesn't support AVX instructions, then simply remove
`avx-accel` from the features list. Similarly for SIMD.
### Running tests
`ripgrep` is relatively well tested, including both unit tests and integration

View File

@@ -28,6 +28,11 @@ build: false
# TODO modify this phase as you see fit
test_script:
- cargo test --verbose
- cargo test --verbose --manifest-path grep/Cargo.toml
- cargo test --verbose --manifest-path globset/Cargo.toml
- cargo test --verbose --manifest-path ignore/Cargo.toml
- cargo test --verbose --manifest-path wincolor/Cargo.toml
- cargo test --verbose --manifest-path termcolor/Cargo.toml
before_deploy:
# Generate artifacts for release
@@ -41,7 +46,7 @@ before_deploy:
- appveyor PushArtifact ../%PROJECT_NAME%-%APPVEYOR_REPO_TAG_NAME%-%TARGET%.zip
deploy:
description: 'Windows release'
description: 'Automatically deployed release'
# All the zipped artifacts will be deployed
artifact: /.*\.zip/
auth_token:
@@ -57,7 +62,9 @@ deploy:
branches:
only:
- appveyor
- /\d+\.\d+\.\d+/
except:
- master
# - appveyor
# - /\d+\.\d+\.\d+/
# except:
# - master

View File

@@ -1,5 +0,0 @@
These are internal microbenchmarks for tracking the peformance of individual
components inside of ripgrep. At the moment, they aren't heavily used.
For performance benchmarks of ripgrep proper, see the sibling `benchsuite`
directory.

23
build.rs Normal file
View File

@@ -0,0 +1,23 @@
#[macro_use]
extern crate clap;
#[macro_use]
extern crate lazy_static;
use std::fs;
use clap::Shell;
#[allow(dead_code)]
#[path = "src/app.rs"]
mod app;
fn main() {
fs::create_dir_all(env!("OUT_DIR")).unwrap();
let mut app = app::app_short();
app.gen_completions("rg", Shell::Bash, env!("OUT_DIR"));
app.gen_completions("rg", Shell::Fish, env!("OUT_DIR"));
// Zsh seems to fail with a panic.
// app.gen_completions("rg", Shell::Zsh, env!("OUT_DIR"));
app.gen_completions("rg", Shell::PowerShell, env!("OUT_DIR"));
}

View File

@@ -19,6 +19,8 @@ mk_tarball() {
cp target/$TARGET/release/rg "$td/$name/"
cp {doc/rg.1,README.md,UNLICENSE,COPYING,LICENSE-MIT} "$td/$name/"
cp target/release/build/ripgrep-*/out/rg.* "$td/$name/"
cp target/release/build/ripgrep-*/out/_rg.* "$td/$name/"
pushd $td
tar czf "$out_dir/$name.tar.gz" *

View File

@@ -19,6 +19,14 @@ run_test_suite() {
cargo clean --target $TARGET --verbose
cargo build --target $TARGET --verbose
cargo test --target $TARGET --verbose
cargo build --target $TARGET --verbose --manifest-path grep/Cargo.toml
cargo test --target $TARGET --verbose --manifest-path grep/Cargo.toml
cargo build --target $TARGET --verbose --manifest-path globset/Cargo.toml
cargo test --target $TARGET --verbose --manifest-path globset/Cargo.toml
cargo build --target $TARGET --verbose --manifest-path ignore/Cargo.toml
cargo test --target $TARGET --verbose --manifest-path ignore/Cargo.toml
cargo build --target $TARGET --verbose --manifest-path termcolor/Cargo.toml
cargo test --target $TARGET --verbose --manifest-path termcolor/Cargo.toml
# sanity check the file type
file target/$TARGET/debug/rg

View File

@@ -1,5 +1,5 @@
#!/bin/sh
export RUSTFLAGS="-C target-feature=+ssse3"
# export RUSTFLAGS="-C target-cpu=native"
cargo build --release --features simd-accel
# export RUSTFLAGS="-C target-feature=+ssse3"
export RUSTFLAGS="-C target-cpu=native"
cargo build --release --features 'simd-accel avx-accel'

106
doc/rg.1
View File

@@ -1,4 +1,4 @@
.\" Automatically generated by Pandoc 1.17.2
.\" Automatically generated by Pandoc 1.18
.\"
.TH "rg" "1"
.hy
@@ -7,11 +7,11 @@
rg \- recursively search current directory for lines matching a pattern
.SH SYNOPSIS
.PP
rg [\f[I]options\f[]] \-e PATTERN ...
[\f[I]<\f[]path\f[I]> ...\f[]]
.PP
rg [\f[I]options\f[]] <\f[I]pattern\f[]> [\f[I]<\f[]path\f[I]> ...\f[]]
.PP
rg [\f[I]options\f[]] (\-e PATTERN | \-f FILE) ...
[\f[I]<\f[]path\f[I]> ...\f[]]
.PP
rg [\f[I]options\f[]] \-\-files [\f[I]<\f[]path\f[I]> ...\f[]]
.PP
rg [\f[I]options\f[]] \-\-type\-list
@@ -21,8 +21,10 @@ rg [\f[I]options\f[]] \-\-help
rg [\f[I]options\f[]] \-\-version
.SH DESCRIPTION
.PP
rg (ripgrep) combines the usability of The Silver Searcher (an ack
ripgrep (rg) combines the usability of The Silver Searcher (an ack
clone) with the raw speed of grep.
.PP
Project home page: https://github.com/BurntSushi/ripgrep
.SH COMMON OPTIONS
.TP
.B \-a, \-\-text
@@ -46,6 +48,7 @@ Valid values are never, always or auto.
Use PATTERN to search.
This option can be provided multiple times, where all patterns given are
searched.
This is also useful when searching for patterns that start with a dash.
.RS
.RE
.TP
@@ -70,6 +73,7 @@ Show this usage message.
.TP
.B \-i, \-\-ignore\-case
Case insensitive search.
Overridden by \-\-case\-sensitive.
.RS
.RE
.TP
@@ -90,12 +94,6 @@ If a match is found in a file, stop searching that file.
.RS
.RE
.TP
.B \-r, \-\-replace \f[I]ARG\f[]
Replace every match with the string given.
Capture group indices (e.g., $5) and names (e.g., $foo) are supported.
.RS
.RE
.TP
.B \-t, \-\-type \f[I]TYPE\f[] ...
Only search files matching TYPE.
Multiple type flags may be provided.
@@ -145,6 +143,28 @@ Show NUM lines before and after each match.
.RS
.RE
.TP
.B \-\-colors \f[I]SPEC\f[] ...
This flag specifies color settings for use in the output.
This flag may be provided multiple times.
Settings are applied iteratively.
Colors are limited to one of eight choices: red, blue, green, cyan,
magenta, yellow, white and black.
Styles are limited to either nobold or bold.
.RS
.PP
The format of the flag is {type}:{attribute}:{value}.
{type} should be one of path, line or match.
{attribute} can be fg, bg or style.
Value is either a color (for fg and bg) or a text style.
A special format, {type}:none, will clear all color settings for {type}.
.PP
For example, the following command will change the match color to
magenta and the background color for line numbers to yellow:
.PP
rg \-\-colors \[aq]match:fg:magenta\[aq] \-\-colors
\[aq]line:bg:yellow\[aq] foo.
.RE
.TP
.B \-\-column
Show column numbers (1 based) in output.
This only shows the column numbers for the first match on each line.
@@ -165,6 +185,15 @@ Show debug messages.
.RS
.RE
.TP
.B \-f, \-\-file FILE ...
Search for patterns from the given file, with one pattern per line.
When this flag is used or multiple times or in combination with the
\-e/\-\-regexp flag, then all patterns provided are searched.
Empty pattern lines will match all input lines, and the newline is not
counted as part of the pattern.
.RS
.RE
.TP
.B \-\-files
Print each file that would be searched (but don\[aq]t search).
.RS
@@ -175,6 +204,11 @@ Only show path of each file with matches.
.RS
.RE
.TP
.B \-\-files\-without\-match
Only show path of each file with no matches.
.RS
.RE
.TP
.B \-H, \-\-with\-filename
Prefix each match with the file name that contains it.
This is the default when more than one file is searched.
@@ -204,11 +238,32 @@ Search hidden directories and files.
.RS
.RE
.TP
.B \-\-ignore\-file FILE ...
Specify additional ignore files for filtering file paths.
Ignore files should be in the gitignore format and are matched relative
to the current working directory.
These ignore files have lower precedence than all other ignore files.
When specifying multiple ignore files, earlier files have lower
precedence than later files.
.RS
.RE
.TP
.B \-L, \-\-follow
Follow symlinks.
.RS
.RE
.TP
.B \-m, \-\-max\-count NUM
Limit the number of matching lines per file searched to NUM.
.RS
.RE
.TP
.B \-\-maxdepth \f[I]NUM\f[]
Descend at most NUM directories below the command line arguments.
A value of zero searches only the starting\-points themselves.
.RS
.RE
.TP
.B \-\-mmap
Search using memory maps when possible.
This is enabled by default when ripgrep thinks it will be faster.
@@ -217,6 +272,11 @@ context related options.)
.RS
.RE
.TP
.B \-\-no\-messages
Suppress all error messages.
.RS
.RE
.TP
.B \-\-no\-mmap
Never use memory maps, even when they might be faster.
.RS
@@ -252,15 +312,31 @@ Alias for \-\-color=always \-\-heading \-n.
.RS
.RE
.TP
.B \-r, \-\-replace \f[I]ARG\f[]
Replace every match with the string given when printing search results.
Neither this flag nor any other flag will modify your files.
.RS
.PP
Capture group indices (e.g., $5) and names (e.g., $foo) are supported in
the replacement string.
.RE
.TP
.B \-s, \-\-case\-sensitive
Search case sensitively.
This overrides \-\-ignore\-case and \-\-smart\-case.
.RS
.RE
.TP
.B \-S, \-\-smart\-case
Search case insensitively if the pattern is all lowercase.
Search case sensitively otherwise.
This is overridden by either \-\-case\-sensitive or \-\-ignore\-case.
.RS
.RE
.TP
.B \-j, \-\-threads \f[I]ARG\f[]
The number of threads to use.
Defaults to the number of logical CPUs (capped at 6).
0 means use the number of logical CPUs (capped at 6).
[default: 0]
.RS
.RE
@@ -291,10 +367,12 @@ Multiple \-\-type\-add flags can be provided.
Unless \-\-type\-clear is used, globs are added to any existing globs
inside of ripgrep.
Note that this must be passed to every invocation of rg.
Type settings are NOT persisted.
.RS
.RE
.PP
Example: \f[C]\-\-type\-add\ html:*.html\f[]
Example:
\f[C]rg\ \-\-type\-add\ \[aq]foo:*.foo\[aq]\ \-tfoo\ PATTERN\f[]
.RE
.TP
.B \-\-type\-clear \f[I]TYPE\f[] ...
Clear the file type globs previously defined for TYPE.

View File

@@ -4,10 +4,10 @@ rg - recursively search current directory for lines matching a pattern
# SYNOPSIS
rg [*options*] -e PATTERN ... [*<*path*> ...*]
rg [*options*] <*pattern*> [*<*path*> ...*]
rg [*options*] (-e PATTERN | -f FILE) ... [*<*path*> ...*]
rg [*options*] --files [*<*path*> ...*]
rg [*options*] --type-list
@@ -18,9 +18,11 @@ rg [*options*] --version
# DESCRIPTION
rg (ripgrep) combines the usability of The Silver Searcher (an ack clone) with
ripgrep (rg) combines the usability of The Silver Searcher (an ack clone) with
the raw speed of grep.
Project home page: https://github.com/BurntSushi/ripgrep
# COMMON OPTIONS
-a, --text
@@ -35,7 +37,8 @@ the raw speed of grep.
-e, --regexp *PATTERN* ...
: Use PATTERN to search. This option can be provided multiple times, where all
patterns given are searched.
patterns given are searched. This is also useful when searching for patterns
that start with a dash.
-F, --fixed-strings
: Treat the pattern as a literal string instead of a regular expression.
@@ -49,7 +52,7 @@ the raw speed of grep.
: Show this usage message.
-i, --ignore-case
: Case insensitive search.
: Case insensitive search. Overridden by --case-sensitive.
-n, --line-number
: Show line numbers (1-based). This is enabled by default at a tty.
@@ -61,10 +64,6 @@ the raw speed of grep.
: Do not print anything to stdout. If a match is found in a file, stop
searching that file.
-r, --replace *ARG*
: Replace every match with the string given. Capture group indices (e.g., $5)
and names (e.g., $foo) are supported.
-t, --type *TYPE* ...
: Only search files matching TYPE. Multiple type flags may be provided. Use the
--type-list flag to list all available types.
@@ -96,6 +95,22 @@ the raw speed of grep.
-C, --context *NUM*
: Show NUM lines before and after each match.
--colors *SPEC* ...
: This flag specifies color settings for use in the output. This flag may be
provided multiple times. Settings are applied iteratively. Colors are limited
to one of eight choices: red, blue, green, cyan, magenta, yellow, white and
black. Styles are limited to either nobold or bold.
The format of the flag is {type}:{attribute}:{value}. {type} should be one
of path, line or match. {attribute} can be fg, bg or style. Value is either
a color (for fg and bg) or a text style. A special format, {type}:none,
will clear all color settings for {type}.
For example, the following command will change the match color to magenta
and the background color for line numbers to yellow:
rg --colors 'match:fg:magenta' --colors 'line:bg:yellow' foo.
--column
: Show column numbers (1 based) in output. This only shows the column
numbers for the first match on each line. Note that this doesn't try
@@ -108,12 +123,21 @@ the raw speed of grep.
--debug
: Show debug messages.
-f, --file FILE ...
: Search for patterns from the given file, with one pattern per line. When this
flag is used or multiple times or in combination with the -e/--regexp flag,
then all patterns provided are searched. Empty pattern lines will match all
input lines, and the newline is not counted as part of the pattern.
--files
: Print each file that would be searched (but don't search).
-l, --files-with-matches
: Only show path of each file with matches.
--files-without-match
: Only show path of each file with no matches.
-H, --with-filename
: Prefix each match with the file name that contains it. This is the
default when more than one file is searched.
@@ -133,14 +157,32 @@ the raw speed of grep.
: Search hidden directories and files. (Hidden directories and files are
skipped by default.)
--ignore-file FILE ...
: Specify additional ignore files for filtering file paths.
Ignore files should be in the gitignore format and are matched
relative to the current working directory. These ignore files
have lower precedence than all other ignore files. When
specifying multiple ignore files, earlier files have lower
precedence than later files.
-L, --follow
: Follow symlinks.
-m, --max-count NUM
: Limit the number of matching lines per file searched to NUM.
--maxdepth *NUM*
: Descend at most NUM directories below the command line arguments.
A value of zero searches only the starting-points themselves.
--mmap
: Search using memory maps when possible. This is enabled by default
when ripgrep thinks it will be faster. (Note that mmap searching
doesn't currently support the various context related options.)
--no-messages
: Suppress all error messages.
--no-mmap
: Never use memory maps, even when they might be faster.
@@ -164,12 +206,23 @@ the raw speed of grep.
-p, --pretty
: Alias for --color=always --heading -n.
-r, --replace *ARG*
: Replace every match with the string given when printing search results.
Neither this flag nor any other flag will modify your files.
Capture group indices (e.g., $5) and names (e.g., $foo) are supported
in the replacement string.
-s, --case-sensitive
: Search case sensitively. This overrides --ignore-case and --smart-case.
-S, --smart-case
: Search case insensitively if the pattern is all lowercase.
Search case sensitively otherwise.
Search case sensitively otherwise. This is overridden by either
--case-sensitive or --ignore-case.
-j, --threads *ARG*
: The number of threads to use. Defaults to the number of logical CPUs
: The number of threads to use. 0 means use the number of logical CPUs
(capped at 6). [default: 0]
--version
@@ -189,9 +242,10 @@ the raw speed of grep.
: Add a new glob for a particular file type. Only one glob can be added
at a time. Multiple --type-add flags can be provided. Unless --type-clear
is used, globs are added to any existing globs inside of ripgrep. Note that
this must be passed to every invocation of rg.
this must be passed to every invocation of rg. Type settings are NOT
persisted.
Example: `--type-add html:*.html`
Example: `rg --type-add 'foo:*.foo' -tfoo PATTERN`
--type-clear *TYPE* ...
: Clear the file type globs previously defined for TYPE. This only clears

33
globset/Cargo.toml Normal file
View File

@@ -0,0 +1,33 @@
[package]
name = "globset"
version = "0.1.2" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
Cross platform single glob and glob set matching. Glob set matching is the
process of matching one or more glob patterns against a single candidate path
simultaneously, and returning all of the globs that matched.
"""
documentation = "https://docs.rs/globset"
homepage = "https://github.com/BurntSushi/ripgrep/tree/master/globset"
repository = "https://github.com/BurntSushi/ripgrep/tree/master/globset"
readme = "README.md"
keywords = ["regex", "glob", "multiple", "set", "pattern"]
license = "Unlicense/MIT"
[lib]
name = "globset"
bench = false
[dependencies]
aho-corasick = "0.5.3"
fnv = "1.0"
lazy_static = "0.2"
log = "0.3"
memchr = "0.1"
regex = "0.1.77"
[dev-dependencies]
glob = "0.2"
[features]
simd-accel = ["regex/simd-accel"]

122
globset/README.md Normal file
View File

@@ -0,0 +1,122 @@
globset
=======
Cross platform single glob and glob set matching. Glob set matching is the
process of matching one or more glob patterns against a single candidate path
simultaneously, and returning all of the globs that matched.
[![Linux build status](https://api.travis-ci.org/BurntSushi/ripgrep.png)](https://travis-ci.org/BurntSushi/ripgrep)
[![Windows build status](https://ci.appveyor.com/api/projects/status/github/BurntSushi/ripgrep?svg=true)](https://ci.appveyor.com/project/BurntSushi/ripgrep)
[![](https://img.shields.io/crates/v/globset.svg)](https://crates.io/crates/globset)
Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org).
### Documentation
[https://docs.rs/globset](https://docs.rs/globset)
### Usage
Add this to your `Cargo.toml`:
```toml
[dependencies]
globset = "0.1"
```
and this to your crate root:
```rust
extern crate globset;
```
### Example: one glob
This example shows how to match a single glob against a single file path.
```rust
use globset::Glob;
let glob = try!(Glob::new("*.rs")).compile_matcher();
assert!(glob.is_match("foo.rs"));
assert!(glob.is_match("foo/bar.rs"));
assert!(!glob.is_match("Cargo.toml"));
```
### Example: configuring a glob matcher
This example shows how to use a `GlobBuilder` to configure aspects of match
semantics. In this example, we prevent wildcards from matching path separators.
```rust
use globset::GlobBuilder;
let glob = try!(GlobBuilder::new("*.rs")
.literal_separator(true).build()).compile_matcher();
assert!(glob.is_match("foo.rs"));
assert!(!glob.is_match("foo/bar.rs")); // no longer matches
assert!(!glob.is_match("Cargo.toml"));
```
### Example: match multiple globs at once
This example shows how to match multiple glob patterns at once.
```rust
use globset::{Glob, GlobSetBuilder};
let mut builder = GlobSetBuilder::new();
// A GlobBuilder can be used to configure each glob's match semantics
// independently.
builder.add(try!(Glob::new("*.rs")));
builder.add(try!(Glob::new("src/lib.rs")));
builder.add(try!(Glob::new("src/**/foo.rs")));
let set = try!(builder.build());
assert_eq!(set.matches("src/bar/baz/foo.rs"), vec![0, 2]);
```
### Performance
This crate implements globs by converting them to regular expressions, and
executing them with the
[`regex`](https://github.com/rust-lang-nursery/regex)
crate.
For single glob matching, performance of this crate should be roughly on par
with the performance of the
[`glob`](https://github.com/rust-lang-nursery/glob)
crate. (`*_regex` correspond to benchmarks for this library while `*_glob`
correspond to benchmarks for the `glob` library.)
Optimizations in the `regex` crate may propel this library past `glob`,
particularly when matching longer paths.
```
test ext_glob ... bench: 425 ns/iter (+/- 21)
test ext_regex ... bench: 175 ns/iter (+/- 10)
test long_glob ... bench: 182 ns/iter (+/- 11)
test long_regex ... bench: 173 ns/iter (+/- 10)
test short_glob ... bench: 69 ns/iter (+/- 4)
test short_regex ... bench: 83 ns/iter (+/- 2)
```
The primary performance advantage of this crate is when matching multiple
globs against a single path. With the `glob` crate, one must match each glob
synchronously, one after the other. In this crate, many can be matched
simultaneously. For example:
```
test many_short_glob ... bench: 1,063 ns/iter (+/- 47)
test many_short_regex_set ... bench: 186 ns/iter (+/- 11)
```
### Comparison with the [`glob`](https://github.com/rust-lang-nursery/glob) crate
* Supports alternate "or" globs, e.g., `*.{foo,bar}`.
* Can match non-UTF-8 file paths correctly.
* Supports matching multiple globs at once.
* Doesn't provide a recursive directory iterator of matching file paths,
although I believe this crate should grow one eventually.
* Supports case insensitive and require-literal-separator match options, but
**doesn't** support the require-literal-leading-dot option.

View File

@@ -5,37 +5,53 @@ tool itself, see the benchsuite directory.
#![feature(test)]
extern crate glob;
extern crate globset;
#[macro_use]
extern crate lazy_static;
extern crate regex;
extern crate test;
use std::ffi::OsStr;
use std::path::Path;
use globset::{Candidate, Glob, GlobMatcher, GlobSet, GlobSetBuilder};
const EXT: &'static str = "some/a/bigger/path/to/the/crazy/needle.txt";
const EXT_PAT: &'static str = "*.txt";
const SHORT: &'static str = "some/needle.txt";
const SHORT_PAT: &'static str = "some/**/needle.txt";
const LONG: &'static str = "some/a/bigger/path/to/the/crazy/needle.txt";
const LONG_PAT: &'static str = "some/**/needle.txt";
#[allow(dead_code, unused_variables)]
#[path = "../src/glob.rs"]
mod reglob;
fn new_glob(pat: &str) -> glob::Pattern {
glob::Pattern::new(pat).unwrap()
}
fn new_reglob(pat: &str) -> reglob::Set {
let mut builder = reglob::SetBuilder::new();
builder.add(pat).unwrap();
fn new_reglob(pat: &str) -> GlobMatcher {
Glob::new(pat).unwrap().compile_matcher()
}
fn new_reglob_many(pats: &[&str]) -> GlobSet {
let mut builder = GlobSetBuilder::new();
for pat in pats {
builder.add(Glob::new(pat).unwrap());
}
builder.build().unwrap()
}
fn new_reglob_many(pats: &[&str]) -> reglob::Set {
let mut builder = reglob::SetBuilder::new();
for pat in pats {
builder.add(pat).unwrap();
}
builder.build().unwrap()
#[bench]
fn ext_glob(b: &mut test::Bencher) {
let pat = new_glob(EXT_PAT);
b.iter(|| assert!(pat.matches(EXT)));
}
#[bench]
fn ext_regex(b: &mut test::Bencher) {
let set = new_reglob(EXT_PAT);
let cand = Candidate::new(EXT);
b.iter(|| assert!(set.is_match_candidate(&cand)));
}
#[bench]
@@ -47,7 +63,8 @@ fn short_glob(b: &mut test::Bencher) {
#[bench]
fn short_regex(b: &mut test::Bencher) {
let set = new_reglob(SHORT_PAT);
b.iter(|| assert!(set.is_match(SHORT)));
let cand = Candidate::new(SHORT);
b.iter(|| assert!(set.is_match_candidate(&cand)));
}
#[bench]
@@ -59,7 +76,8 @@ fn long_glob(b: &mut test::Bencher) {
#[bench]
fn long_regex(b: &mut test::Bencher) {
let set = new_reglob(LONG_PAT);
b.iter(|| assert!(set.is_match(LONG)));
let cand = Candidate::new(LONG);
b.iter(|| assert!(set.is_match_candidate(&cand)));
}
const MANY_SHORT_GLOBS: &'static [&'static str] = &[
@@ -101,26 +119,3 @@ fn many_short_regex_set(b: &mut test::Bencher) {
let set = new_reglob_many(MANY_SHORT_GLOBS);
b.iter(|| assert_eq!(2, set.matches(MANY_SHORT_SEARCH).iter().count()));
}
// This is the fastest on my system (beating many_glob by about 2x). This
// suggests that a RegexSet needs quite a few regexes (or a larger haystack)
// in order for it to scale.
//
// TODO(burntsushi): come up with a benchmark that uses more complex patterns
// or a longer haystack.
#[bench]
fn many_short_regex_pattern(b: &mut test::Bencher) {
let pats: Vec<_> = MANY_SHORT_GLOBS.iter().map(|&s| {
let pat = reglob::Pattern::new(s).unwrap();
regex::Regex::new(&pat.to_regex()).unwrap()
}).collect();
b.iter(|| {
let mut count = 0;
for pat in &pats {
if pat.is_match(MANY_SHORT_SEARCH) {
count += 1;
}
}
assert_eq!(2, count);
})
}

1300
globset/src/glob.rs Normal file

File diff suppressed because it is too large Load Diff

791
globset/src/lib.rs Normal file
View File

@@ -0,0 +1,791 @@
/*!
The globset crate provides cross platform single glob and glob set matching.
Glob set matching is the process of matching one or more glob patterns against
a single candidate path simultaneously, and returning all of the globs that
matched. For example, given this set of globs:
```ignore
*.rs
src/lib.rs
src/**/foo.rs
```
and a path `src/bar/baz/foo.rs`, then the set would report the first and third
globs as matching.
Single glob matching is also provided and is done by converting globs to
# Example: one glob
This example shows how to match a single glob against a single file path.
```
# fn example() -> Result<(), globset::Error> {
use globset::Glob;
let glob = try!(Glob::new("*.rs")).compile_matcher();
assert!(glob.is_match("foo.rs"));
assert!(glob.is_match("foo/bar.rs"));
assert!(!glob.is_match("Cargo.toml"));
# Ok(()) } example().unwrap();
```
# Example: configuring a glob matcher
This example shows how to use a `GlobBuilder` to configure aspects of match
semantics. In this example, we prevent wildcards from matching path separators.
```
# fn example() -> Result<(), globset::Error> {
use globset::GlobBuilder;
let glob = try!(GlobBuilder::new("*.rs")
.literal_separator(true).build()).compile_matcher();
assert!(glob.is_match("foo.rs"));
assert!(!glob.is_match("foo/bar.rs")); // no longer matches
assert!(!glob.is_match("Cargo.toml"));
# Ok(()) } example().unwrap();
```
# Example: match multiple globs at once
This example shows how to match multiple glob patterns at once.
```
# fn example() -> Result<(), globset::Error> {
use globset::{Glob, GlobSetBuilder};
let mut builder = GlobSetBuilder::new();
// A GlobBuilder can be used to configure each glob's match semantics
// independently.
builder.add(try!(Glob::new("*.rs")));
builder.add(try!(Glob::new("src/lib.rs")));
builder.add(try!(Glob::new("src/**/foo.rs")));
let set = try!(builder.build());
assert_eq!(set.matches("src/bar/baz/foo.rs"), vec![0, 2]);
# Ok(()) } example().unwrap();
```
# Syntax
Standard Unix-style glob syntax is supported:
* `?` matches any single character. (If the `literal_separator` option is
enabled, then `?` can never match a path separator.)
* `*` matches zero or more characters. (If the `literal_separator` option is
enabled, then `*` can never match a path separator.)
* `**` recursively matches directories but are only legal in three situations.
First, if the glob starts with <code>\*\*&#x2F;</code>, then it matches
all directories. For example, <code>\*\*&#x2F;foo</code> matches `foo`
and `bar/foo` but not `foo/bar`. Secondly, if the glob ends with
<code>&#x2F;\*\*</code>, then it matches all sub-entries. For example,
<code>foo&#x2F;\*\*</code> matches `foo/a` and `foo/a/b`, but not `foo`.
Thirdly, if the glob contains <code>&#x2F;\*\*&#x2F;</code> anywhere within
the pattern, then it matches zero or more directories. Using `**` anywhere
else is illegal (N.B. the glob `**` is allowed and means "match everything").
* `{a,b}` matches `a` or `b` where `a` and `b` are arbitrary glob patterns.
(N.B. Nesting `{...}` is not currently allowed.)
* `[ab]` matches `a` or `b` where `a` and `b` are characters. Use
`[!ab]` to match any character except for `a` and `b`.
* Metacharacters such as `*` and `?` can be escaped with character class
notation. e.g., `[*]` matches `*`.
A `GlobBuilder` can be used to prevent wildcards from matching path separators,
or to enable case insensitive matching.
*/
#![deny(missing_docs)]
extern crate aho_corasick;
extern crate fnv;
#[macro_use]
extern crate lazy_static;
#[macro_use]
extern crate log;
extern crate memchr;
extern crate regex;
use std::borrow::Cow;
use std::collections::{BTreeMap, HashMap};
use std::error::Error as StdError;
use std::ffi::{OsStr, OsString};
use std::fmt;
use std::hash;
use std::path::Path;
use std::str;
use aho_corasick::{Automaton, AcAutomaton, FullAcAutomaton};
use regex::bytes::{Regex, RegexBuilder, RegexSet};
use pathutil::{
file_name, file_name_ext, normalize_path, os_str_bytes, path_bytes,
};
use glob::MatchStrategy;
pub use glob::{Glob, GlobBuilder, GlobMatcher};
mod glob;
mod pathutil;
/// Represents an error that can occur when parsing a glob pattern.
#[derive(Clone, Debug, Eq, PartialEq)]
pub enum Error {
/// Occurs when a use of `**` is invalid. Namely, `**` can only appear
/// adjacent to a path separator, or the beginning/end of a glob.
InvalidRecursive,
/// Occurs when a character class (e.g., `[abc]`) is not closed.
UnclosedClass,
/// Occurs when a range in a character (e.g., `[a-z]`) is invalid. For
/// example, if the range starts with a lexicographically larger character
/// than it ends with.
InvalidRange(char, char),
/// Occurs when a `}` is found without a matching `{`.
UnopenedAlternates,
/// Occurs when a `{` is found without a matching `}`.
UnclosedAlternates,
/// Occurs when an alternating group is nested inside another alternating
/// group, e.g., `{{a,b},{c,d}}`.
NestedAlternates,
/// An error associated with parsing or compiling a regex.
Regex(String),
}
impl StdError for Error {
fn description(&self) -> &str {
match *self {
Error::InvalidRecursive => {
"invalid use of **; must be one path component"
}
Error::UnclosedClass => {
"unclosed character class; missing ']'"
}
Error::InvalidRange(_, _) => {
"invalid character range"
}
Error::UnopenedAlternates => {
"unopened alternate group; missing '{' \
(maybe escape '}' with '[}]'?)"
}
Error::UnclosedAlternates => {
"unclosed alternate group; missing '}' \
(maybe escape '{' with '[{]'?)"
}
Error::NestedAlternates => {
"nested alternate groups are not allowed"
}
Error::Regex(ref err) => err,
}
}
}
impl fmt::Display for Error {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
match *self {
Error::InvalidRecursive
| Error::UnclosedClass
| Error::UnopenedAlternates
| Error::UnclosedAlternates
| Error::NestedAlternates
| Error::Regex(_) => {
write!(f, "{}", self.description())
}
Error::InvalidRange(s, e) => {
write!(f, "invalid range; '{}' > '{}'", s, e)
}
}
}
}
fn new_regex(pat: &str) -> Result<Regex, Error> {
RegexBuilder::new(pat)
.dot_matches_new_line(true)
.size_limit(10 * (1 << 20))
.dfa_size_limit(10 * (1 << 20))
.compile()
.map_err(|err| Error::Regex(err.to_string()))
}
fn new_regex_set<I, S>(pats: I) -> Result<RegexSet, Error>
where S: AsRef<str>, I: IntoIterator<Item=S> {
RegexSet::new(pats).map_err(|err| Error::Regex(err.to_string()))
}
type Fnv = hash::BuildHasherDefault<fnv::FnvHasher>;
/// GlobSet represents a group of globs that can be matched together in a
/// single pass.
#[derive(Clone, Debug)]
pub struct GlobSet {
len: usize,
strats: Vec<GlobSetMatchStrategy>,
}
impl GlobSet {
/// Returns true if this set is empty, and therefore matches nothing.
pub fn is_empty(&self) -> bool {
self.len == 0
}
/// Returns the number of globs in this set.
pub fn len(&self) -> usize {
self.len
}
/// Returns true if any glob in this set matches the path given.
pub fn is_match<P: AsRef<Path>>(&self, path: P) -> bool {
self.is_match_candidate(&Candidate::new(path.as_ref()))
}
/// Returns true if any glob in this set matches the path given.
///
/// This takes a Candidate as input, which can be used to amortize the
/// cost of preparing a path for matching.
pub fn is_match_candidate(&self, path: &Candidate) -> bool {
if self.is_empty() {
return false;
}
for strat in &self.strats {
if strat.is_match(path) {
return true;
}
}
false
}
/// Returns the sequence number of every glob pattern that matches the
/// given path.
pub fn matches<P: AsRef<Path>>(&self, path: P) -> Vec<usize> {
self.matches_candidate(&Candidate::new(path.as_ref()))
}
/// Returns the sequence number of every glob pattern that matches the
/// given path.
///
/// This takes a Candidate as input, which can be used to amortize the
/// cost of preparing a path for matching.
pub fn matches_candidate(&self, path: &Candidate) -> Vec<usize> {
let mut into = vec![];
if self.is_empty() {
return into;
}
self.matches_candidate_into(path, &mut into);
into
}
/// Adds the sequence number of every glob pattern that matches the given
/// path to the vec given.
///
/// `into` is is cleared before matching begins, and contains the set of
/// sequence numbers (in ascending order) after matching ends. If no globs
/// were matched, then `into` will be empty.
pub fn matches_into<P: AsRef<Path>>(
&self,
path: P,
into: &mut Vec<usize>,
) {
self.matches_candidate_into(&Candidate::new(path.as_ref()), into);
}
/// Adds the sequence number of every glob pattern that matches the given
/// path to the vec given.
///
/// `into` is is cleared before matching begins, and contains the set of
/// sequence numbers (in ascending order) after matching ends. If no globs
/// were matched, then `into` will be empty.
///
/// This takes a Candidate as input, which can be used to amortize the
/// cost of preparing a path for matching.
pub fn matches_candidate_into(
&self,
path: &Candidate,
into: &mut Vec<usize>,
) {
into.clear();
if self.is_empty() {
return;
}
for strat in &self.strats {
strat.matches_into(path, into);
}
into.sort();
into.dedup();
}
fn new(pats: &[Glob]) -> Result<GlobSet, Error> {
if pats.is_empty() {
return Ok(GlobSet { len: 0, strats: vec![] });
}
let mut lits = LiteralStrategy::new();
let mut base_lits = BasenameLiteralStrategy::new();
let mut exts = ExtensionStrategy::new();
let mut prefixes = MultiStrategyBuilder::new();
let mut suffixes = MultiStrategyBuilder::new();
let mut required_exts = RequiredExtensionStrategyBuilder::new();
let mut regexes = MultiStrategyBuilder::new();
for (i, p) in pats.iter().enumerate() {
match MatchStrategy::new(p) {
MatchStrategy::Literal(lit) => {
lits.add(i, lit);
}
MatchStrategy::BasenameLiteral(lit) => {
base_lits.add(i, lit);
}
MatchStrategy::Extension(ext) => {
exts.add(i, ext);
}
MatchStrategy::Prefix(prefix) => {
prefixes.add(i, prefix);
}
MatchStrategy::Suffix { suffix, component } => {
if component {
lits.add(i, suffix[1..].to_string());
}
suffixes.add(i, suffix);
}
MatchStrategy::RequiredExtension(ext) => {
required_exts.add(i, ext, p.regex().to_owned());
}
MatchStrategy::Regex => {
debug!("glob converted to regex: {:?}", p);
regexes.add(i, p.regex().to_owned());
}
}
}
debug!("built glob set; {} literals, {} basenames, {} extensions, \
{} prefixes, {} suffixes, {} required extensions, {} regexes",
lits.0.len(), base_lits.0.len(), exts.0.len(),
prefixes.literals.len(), suffixes.literals.len(),
required_exts.0.len(), regexes.literals.len());
Ok(GlobSet {
len: pats.len(),
strats: vec![
GlobSetMatchStrategy::Extension(exts),
GlobSetMatchStrategy::BasenameLiteral(base_lits),
GlobSetMatchStrategy::Literal(lits),
GlobSetMatchStrategy::Suffix(suffixes.suffix()),
GlobSetMatchStrategy::Prefix(prefixes.prefix()),
GlobSetMatchStrategy::RequiredExtension(
try!(required_exts.build())),
GlobSetMatchStrategy::Regex(try!(regexes.regex_set())),
],
})
}
}
/// GlobSetBuilder builds a group of patterns that can be used to
/// simultaneously match a file path.
pub struct GlobSetBuilder {
pats: Vec<Glob>,
}
impl GlobSetBuilder {
/// Create a new GlobSetBuilder. A GlobSetBuilder can be used to add new
/// patterns. Once all patterns have been added, `build` should be called
/// to produce a `GlobSet`, which can then be used for matching.
pub fn new() -> GlobSetBuilder {
GlobSetBuilder { pats: vec![] }
}
/// Builds a new matcher from all of the glob patterns added so far.
///
/// Once a matcher is built, no new patterns can be added to it.
pub fn build(&self) -> Result<GlobSet, Error> {
GlobSet::new(&self.pats)
}
/// Add a new pattern to this set.
#[allow(dead_code)]
pub fn add(&mut self, pat: Glob) -> &mut GlobSetBuilder {
self.pats.push(pat);
self
}
}
/// A candidate path for matching.
///
/// All glob matching in this crate operates on `Candidate` values.
/// Constructing candidates has a very small cost associated with it, so
/// callers may find it beneficial to amortize that cost when matching a single
/// path against multiple globs or sets of globs.
#[derive(Clone, Debug)]
pub struct Candidate<'a> {
path: Cow<'a, [u8]>,
basename: Cow<'a, [u8]>,
ext: &'a OsStr,
}
impl<'a> Candidate<'a> {
/// Create a new candidate for matching from the given path.
pub fn new<P: AsRef<Path> + ?Sized>(path: &'a P) -> Candidate<'a> {
let path = path.as_ref();
let basename = file_name(path).unwrap_or(OsStr::new(""));
Candidate {
path: normalize_path(path_bytes(path)),
basename: os_str_bytes(basename),
ext: file_name_ext(basename).unwrap_or(OsStr::new("")),
}
}
fn path_prefix(&self, max: usize) -> &[u8] {
if self.path.len() <= max {
&*self.path
} else {
&self.path[..max]
}
}
fn path_suffix(&self, max: usize) -> &[u8] {
if self.path.len() <= max {
&*self.path
} else {
&self.path[self.path.len() - max..]
}
}
}
#[derive(Clone, Debug)]
enum GlobSetMatchStrategy {
Literal(LiteralStrategy),
BasenameLiteral(BasenameLiteralStrategy),
Extension(ExtensionStrategy),
Prefix(PrefixStrategy),
Suffix(SuffixStrategy),
RequiredExtension(RequiredExtensionStrategy),
Regex(RegexSetStrategy),
}
impl GlobSetMatchStrategy {
fn is_match(&self, candidate: &Candidate) -> bool {
use self::GlobSetMatchStrategy::*;
match *self {
Literal(ref s) => s.is_match(candidate),
BasenameLiteral(ref s) => s.is_match(candidate),
Extension(ref s) => s.is_match(candidate),
Prefix(ref s) => s.is_match(candidate),
Suffix(ref s) => s.is_match(candidate),
RequiredExtension(ref s) => s.is_match(candidate),
Regex(ref s) => s.is_match(candidate),
}
}
fn matches_into(&self, candidate: &Candidate, matches: &mut Vec<usize>) {
use self::GlobSetMatchStrategy::*;
match *self {
Literal(ref s) => s.matches_into(candidate, matches),
BasenameLiteral(ref s) => s.matches_into(candidate, matches),
Extension(ref s) => s.matches_into(candidate, matches),
Prefix(ref s) => s.matches_into(candidate, matches),
Suffix(ref s) => s.matches_into(candidate, matches),
RequiredExtension(ref s) => s.matches_into(candidate, matches),
Regex(ref s) => s.matches_into(candidate, matches),
}
}
}
#[derive(Clone, Debug)]
struct LiteralStrategy(BTreeMap<Vec<u8>, Vec<usize>>);
impl LiteralStrategy {
fn new() -> LiteralStrategy {
LiteralStrategy(BTreeMap::new())
}
fn add(&mut self, global_index: usize, lit: String) {
self.0.entry(lit.into_bytes()).or_insert(vec![]).push(global_index);
}
fn is_match(&self, candidate: &Candidate) -> bool {
self.0.contains_key(&*candidate.path)
}
#[inline(never)]
fn matches_into(&self, candidate: &Candidate, matches: &mut Vec<usize>) {
if let Some(hits) = self.0.get(&*candidate.path) {
matches.extend(hits);
}
}
}
#[derive(Clone, Debug)]
struct BasenameLiteralStrategy(BTreeMap<Vec<u8>, Vec<usize>>);
impl BasenameLiteralStrategy {
fn new() -> BasenameLiteralStrategy {
BasenameLiteralStrategy(BTreeMap::new())
}
fn add(&mut self, global_index: usize, lit: String) {
self.0.entry(lit.into_bytes()).or_insert(vec![]).push(global_index);
}
fn is_match(&self, candidate: &Candidate) -> bool {
if candidate.basename.is_empty() {
return false;
}
self.0.contains_key(&*candidate.basename)
}
#[inline(never)]
fn matches_into(&self, candidate: &Candidate, matches: &mut Vec<usize>) {
if candidate.basename.is_empty() {
return;
}
if let Some(hits) = self.0.get(&*candidate.basename) {
matches.extend(hits);
}
}
}
#[derive(Clone, Debug)]
struct ExtensionStrategy(HashMap<OsString, Vec<usize>, Fnv>);
impl ExtensionStrategy {
fn new() -> ExtensionStrategy {
ExtensionStrategy(HashMap::with_hasher(Fnv::default()))
}
fn add(&mut self, global_index: usize, ext: OsString) {
self.0.entry(ext).or_insert(vec![]).push(global_index);
}
fn is_match(&self, candidate: &Candidate) -> bool {
if candidate.ext.is_empty() {
return false;
}
self.0.contains_key(candidate.ext)
}
#[inline(never)]
fn matches_into(&self, candidate: &Candidate, matches: &mut Vec<usize>) {
if candidate.ext.is_empty() {
return;
}
if let Some(hits) = self.0.get(candidate.ext) {
matches.extend(hits);
}
}
}
#[derive(Clone, Debug)]
struct PrefixStrategy {
matcher: FullAcAutomaton<Vec<u8>>,
map: Vec<usize>,
longest: usize,
}
impl PrefixStrategy {
fn is_match(&self, candidate: &Candidate) -> bool {
let path = candidate.path_prefix(self.longest);
for m in self.matcher.find_overlapping(path) {
if m.start == 0 {
return true;
}
}
false
}
fn matches_into(&self, candidate: &Candidate, matches: &mut Vec<usize>) {
let path = candidate.path_prefix(self.longest);
for m in self.matcher.find_overlapping(path) {
if m.start == 0 {
matches.push(self.map[m.pati]);
}
}
}
}
#[derive(Clone, Debug)]
struct SuffixStrategy {
matcher: FullAcAutomaton<Vec<u8>>,
map: Vec<usize>,
longest: usize,
}
impl SuffixStrategy {
fn is_match(&self, candidate: &Candidate) -> bool {
let path = candidate.path_suffix(self.longest);
for m in self.matcher.find_overlapping(path) {
if m.end == path.len() {
return true;
}
}
false
}
fn matches_into(&self, candidate: &Candidate, matches: &mut Vec<usize>) {
let path = candidate.path_suffix(self.longest);
for m in self.matcher.find_overlapping(path) {
if m.end == path.len() {
matches.push(self.map[m.pati]);
}
}
}
}
#[derive(Clone, Debug)]
struct RequiredExtensionStrategy(HashMap<OsString, Vec<(usize, Regex)>, Fnv>);
impl RequiredExtensionStrategy {
fn is_match(&self, candidate: &Candidate) -> bool {
if candidate.ext.is_empty() {
return false;
}
match self.0.get(candidate.ext) {
None => false,
Some(regexes) => {
for &(_, ref re) in regexes {
if re.is_match(&*candidate.path) {
return true;
}
}
false
}
}
}
#[inline(never)]
fn matches_into(&self, candidate: &Candidate, matches: &mut Vec<usize>) {
if candidate.ext.is_empty() {
return;
}
if let Some(regexes) = self.0.get(candidate.ext) {
for &(global_index, ref re) in regexes {
if re.is_match(&*candidate.path) {
matches.push(global_index);
}
}
}
}
}
#[derive(Clone, Debug)]
struct RegexSetStrategy {
matcher: RegexSet,
map: Vec<usize>,
}
impl RegexSetStrategy {
fn is_match(&self, candidate: &Candidate) -> bool {
self.matcher.is_match(&*candidate.path)
}
fn matches_into(&self, candidate: &Candidate, matches: &mut Vec<usize>) {
for i in self.matcher.matches(&*candidate.path) {
matches.push(self.map[i]);
}
}
}
#[derive(Clone, Debug)]
struct MultiStrategyBuilder {
literals: Vec<String>,
map: Vec<usize>,
longest: usize,
}
impl MultiStrategyBuilder {
fn new() -> MultiStrategyBuilder {
MultiStrategyBuilder {
literals: vec![],
map: vec![],
longest: 0,
}
}
fn add(&mut self, global_index: usize, literal: String) {
if literal.len() > self.longest {
self.longest = literal.len();
}
self.map.push(global_index);
self.literals.push(literal);
}
fn prefix(self) -> PrefixStrategy {
let it = self.literals.into_iter().map(|s| s.into_bytes());
PrefixStrategy {
matcher: AcAutomaton::new(it).into_full(),
map: self.map,
longest: self.longest,
}
}
fn suffix(self) -> SuffixStrategy {
let it = self.literals.into_iter().map(|s| s.into_bytes());
SuffixStrategy {
matcher: AcAutomaton::new(it).into_full(),
map: self.map,
longest: self.longest,
}
}
fn regex_set(self) -> Result<RegexSetStrategy, Error> {
Ok(RegexSetStrategy {
matcher: try!(new_regex_set(self.literals)),
map: self.map,
})
}
}
#[derive(Clone, Debug)]
struct RequiredExtensionStrategyBuilder(
HashMap<OsString, Vec<(usize, String)>>,
);
impl RequiredExtensionStrategyBuilder {
fn new() -> RequiredExtensionStrategyBuilder {
RequiredExtensionStrategyBuilder(HashMap::new())
}
fn add(&mut self, global_index: usize, ext: OsString, regex: String) {
self.0.entry(ext).or_insert(vec![]).push((global_index, regex));
}
fn build(self) -> Result<RequiredExtensionStrategy, Error> {
let mut exts = HashMap::with_hasher(Fnv::default());
for (ext, regexes) in self.0.into_iter() {
exts.insert(ext.clone(), vec![]);
for (global_index, regex) in regexes {
let compiled = try!(new_regex(&regex));
exts.get_mut(&ext).unwrap().push((global_index, compiled));
}
}
Ok(RequiredExtensionStrategy(exts))
}
}
#[cfg(test)]
mod tests {
use super::GlobSetBuilder;
use glob::Glob;
#[test]
fn set_works() {
let mut builder = GlobSetBuilder::new();
builder.add(Glob::new("src/**/*.rs").unwrap());
builder.add(Glob::new("*.c").unwrap());
builder.add(Glob::new("src/lib.rs").unwrap());
let set = builder.build().unwrap();
assert!(set.is_match("foo.c"));
assert!(set.is_match("src/foo.c"));
assert!(!set.is_match("foo.rs"));
assert!(!set.is_match("tests/foo.rs"));
assert!(set.is_match("src/foo.rs"));
assert!(set.is_match("src/grep/src/main.rs"));
let matches = set.matches("src/lib.rs");
assert_eq!(2, matches.len());
assert_eq!(0, matches[0]);
assert_eq!(2, matches[1]);
}
#[test]
fn empty_set_works() {
let set = GlobSetBuilder::new().build().unwrap();
assert!(!set.is_match(""));
assert!(!set.is_match("a"));
}
}

178
globset/src/pathutil.rs Normal file
View File

@@ -0,0 +1,178 @@
use std::borrow::Cow;
use std::ffi::OsStr;
use std::path::Path;
/// The final component of the path, if it is a normal file.
///
/// If the path terminates in ., .., or consists solely of a root of prefix,
/// file_name will return None.
#[cfg(unix)]
pub fn file_name<'a, P: AsRef<Path> + ?Sized>(
path: &'a P,
) -> Option<&'a OsStr> {
use std::os::unix::ffi::OsStrExt;
use memchr::memrchr;
let path = path.as_ref().as_os_str().as_bytes();
if path.is_empty() {
return None;
} else if path.len() == 1 && path[0] == b'.' {
return None;
} else if path.last() == Some(&b'.') {
return None;
} else if path.len() >= 2 && &path[path.len() - 2..] == &b".."[..] {
return None;
}
let last_slash = memrchr(b'/', path).map(|i| i + 1).unwrap_or(0);
Some(OsStr::from_bytes(&path[last_slash..]))
}
/// The final component of the path, if it is a normal file.
///
/// If the path terminates in ., .., or consists solely of a root of prefix,
/// file_name will return None.
#[cfg(not(unix))]
pub fn file_name<'a, P: AsRef<Path> + ?Sized>(
path: &'a P,
) -> Option<&'a OsStr> {
path.as_ref().file_name()
}
/// Return a file extension given a path's file name.
///
/// Note that this does NOT match the semantics of std::path::Path::extension.
/// Namely, the extension includes the `.` and matching is otherwise more
/// liberal. Specifically, the extenion is:
///
/// * None, if the file name given is empty;
/// * None, if there is no embedded `.`;
/// * Otherwise, the portion of the file name starting with the final `.`.
///
/// e.g., A file name of `.rs` has an extension `.rs`.
///
/// N.B. This is done to make certain glob match optimizations easier. Namely,
/// a pattern like `*.rs` is obviously trying to match files with a `rs`
/// extension, but it also matches files like `.rs`, which doesn't have an
/// extension according to std::path::Path::extension.
pub fn file_name_ext(name: &OsStr) -> Option<&OsStr> {
// Yes, these functions are awful, and yes, we are completely violating
// the abstraction barrier of std::ffi. The barrier we're violating is
// that an OsStr's encoding is *ASCII compatible*. While this is obviously
// true on Unix systems, it's also true on Windows because an OsStr uses
// WTF-8 internally: https://simonsapin.github.io/wtf-8/
//
// We should consider doing the same for the other path utility functions.
// Right now, we don't break any barriers, but Windows users are paying
// for it.
//
// Got any better ideas that don't cost anything? Hit me up. ---AG
unsafe fn os_str_as_u8_slice(s: &OsStr) -> &[u8] {
::std::mem::transmute(s)
}
unsafe fn u8_slice_as_os_str(s: &[u8]) -> &OsStr {
::std::mem::transmute(s)
}
if name.is_empty() {
return None;
}
let name = unsafe { os_str_as_u8_slice(name) };
for (i, &b) in name.iter().enumerate().rev() {
if b == b'.' {
return Some(unsafe { u8_slice_as_os_str(&name[i..]) });
}
}
None
}
/// Return raw bytes of a path, transcoded to UTF-8 if necessary.
pub fn path_bytes(path: &Path) -> Cow<[u8]> {
os_str_bytes(path.as_os_str())
}
/// Return the raw bytes of the given OS string, possibly transcoded to UTF-8.
#[cfg(unix)]
pub fn os_str_bytes(s: &OsStr) -> Cow<[u8]> {
use std::os::unix::ffi::OsStrExt;
Cow::Borrowed(s.as_bytes())
}
/// Return the raw bytes of the given OS string, possibly transcoded to UTF-8.
#[cfg(not(unix))]
pub fn os_str_bytes(s: &OsStr) -> Cow<[u8]> {
// TODO(burntsushi): On Windows, OS strings are WTF-8, which is a superset
// of UTF-8, so even if we could get at the raw bytes, they wouldn't
// be useful. We *must* convert to UTF-8 before doing path matching.
// Unfortunate, but necessary.
match s.to_string_lossy() {
Cow::Owned(s) => Cow::Owned(s.into_bytes()),
Cow::Borrowed(s) => Cow::Borrowed(s.as_bytes()),
}
}
/// Normalizes a path to use `/` as a separator everywhere, even on platforms
/// that recognize other characters as separators.
#[cfg(unix)]
pub fn normalize_path(path: Cow<[u8]>) -> Cow<[u8]> {
// UNIX only uses /, so we're good.
path
}
/// Normalizes a path to use `/` as a separator everywhere, even on platforms
/// that recognize other characters as separators.
#[cfg(not(unix))]
pub fn normalize_path(mut path: Cow<[u8]>) -> Cow<[u8]> {
use std::path::is_separator;
for i in 0..path.len() {
if path[i] == b'/' || !is_separator(path[i] as char) {
continue;
}
path.to_mut()[i] = b'/';
}
path
}
#[cfg(test)]
mod tests {
use std::borrow::Cow;
use std::ffi::OsStr;
use super::{file_name_ext, normalize_path};
macro_rules! ext {
($name:ident, $file_name:expr, $ext:expr) => {
#[test]
fn $name() {
let got = file_name_ext(OsStr::new($file_name));
assert_eq!($ext.map(OsStr::new), got);
}
};
}
ext!(ext1, "foo.rs", Some(".rs"));
ext!(ext2, ".rs", Some(".rs"));
ext!(ext3, "..rs", Some(".rs"));
ext!(ext4, "", None::<&str>);
ext!(ext5, "foo", None::<&str>);
macro_rules! normalize {
($name:ident, $path:expr, $expected:expr) => {
#[test]
fn $name() {
let got = normalize_path(Cow::Owned($path.to_vec()));
assert_eq!($expected.to_vec(), got.into_owned());
}
};
}
normalize!(normal1, b"foo", b"foo");
normalize!(normal2, b"foo/bar", b"foo/bar");
#[cfg(unix)]
normalize!(normal3, b"foo\\bar", b"foo\\bar");
#[cfg(not(unix))]
normalize!(normal3, b"foo\\bar", b"foo/bar");
#[cfg(unix)]
normalize!(normal4, b"foo\\bar/baz", b"foo\\bar/baz");
#[cfg(not(unix))]
normalize!(normal4, b"foo\\bar/baz", b"foo/bar/baz");
}

View File

@@ -1,6 +1,6 @@
[package]
name = "grep"
version = "0.1.3" #:version
version = "0.1.4" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
Fast line oriented regex searching as a library.
@@ -15,6 +15,6 @@ license = "Unlicense/MIT"
[dependencies]
log = "0.3"
memchr = "0.1"
memmap = "0.2"
memmap = "0.5"
regex = "0.1.77"
regex-syntax = "0.3.5"

View File

@@ -9,7 +9,7 @@ principled.
*/
use std::cmp;
use regex::bytes::Regex;
use regex::bytes::RegexBuilder;
use syntax::{
Expr, Literals, Lit,
ByteClass, ByteRange, CharClass, ClassRange, Repeater,
@@ -33,7 +33,7 @@ impl LiteralSets {
}
}
pub fn to_regex(&self) -> Option<Regex> {
pub fn to_regex_builder(&self) -> Option<RegexBuilder> {
if self.prefixes.all_complete() && !self.prefixes.is_empty() {
debug!("literal prefixes detected: {:?}", self.prefixes);
// When this is true, the regex engine will do a literal scan.
@@ -79,14 +79,12 @@ impl LiteralSets {
debug!("required literals found: {:?}", req_lits);
let alts: Vec<String> =
req_lits.into_iter().map(|x| bytes_to_regex(x)).collect();
// Literals always compile.
Some(Regex::new(&alts.join("|")).unwrap())
Some(RegexBuilder::new(&alts.join("|")))
} else if lit.is_empty() {
None
} else {
// Literals always compile.
debug!("required literal found: {:?}", show(lit));
Some(Regex::new(&bytes_to_regex(lit)).unwrap())
Some(RegexBuilder::new(&bytes_to_regex(lit)))
}
}
}

View File

@@ -144,14 +144,19 @@ impl GrepBuilder {
let expr = try!(self.parse());
let literals = LiteralSets::create(&expr);
let re = try!(self.regex(&expr));
let required = literals.to_regex().or_else(|| {
let expr = match strip_unicode_word_boundaries(&expr) {
None => return None,
Some(expr) => expr,
};
debug!("Stripped Unicode word boundaries. New AST:\n{:?}", expr);
self.regex(&expr).ok()
});
let required = match literals.to_regex_builder() {
Some(builder) => Some(try!(self.regex_build(builder))),
None => {
match strip_unicode_word_boundaries(&expr) {
None => None,
Some(expr) => {
debug!("Stripped Unicode word boundaries. \
New AST:\n{:?}", expr);
self.regex(&expr).ok()
}
}
}
};
Ok(Grep {
re: re,
required: required,
@@ -162,11 +167,12 @@ impl GrepBuilder {
/// Creates a new regex from the given expression with the current
/// configuration.
fn regex(&self, expr: &Expr) -> Result<Regex> {
let casei =
self.opts.case_insensitive
|| (self.opts.case_smart && !has_uppercase_literal(expr));
RegexBuilder::new(&expr.to_string())
.case_insensitive(casei)
self.regex_build(RegexBuilder::new(&expr.to_string()))
}
/// Builds a new regex from the given builder using the caller's settings.
fn regex_build(&self, builder: RegexBuilder) -> Result<Regex> {
builder
.multi_line(true)
.unicode(true)
.size_limit(self.opts.size_limit)
@@ -182,10 +188,29 @@ impl GrepBuilder {
try!(syntax::ExprBuilder::new()
.allow_bytes(true)
.unicode(true)
.case_insensitive(self.opts.case_insensitive)
.case_insensitive(try!(self.is_case_insensitive()))
.parse(&self.pattern));
let expr = try!(nonl::remove(expr, self.opts.line_terminator));
debug!("regex ast:\n{:#?}", expr);
Ok(try!(nonl::remove(expr, self.opts.line_terminator)))
Ok(expr)
}
/// Determines whether the case insensitive flag should be enabled or not.
///
/// An error is returned if the regex could not be parsed.
fn is_case_insensitive(&self) -> Result<bool> {
if self.opts.case_insensitive {
return Ok(true);
}
if !self.opts.case_smart {
return Ok(false);
}
let expr =
try!(syntax::ExprBuilder::new()
.allow_bytes(true)
.unicode(true)
.parse(&self.pattern));
Ok(!has_uppercase_literal(&expr))
}
}

37
ignore/Cargo.toml Normal file
View File

@@ -0,0 +1,37 @@
[package]
name = "ignore"
version = "0.1.5" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
A fast library for efficiently matching ignore files such as `.gitignore`
against file paths.
"""
documentation = "https://docs.rs/ignore"
homepage = "https://github.com/BurntSushi/ripgrep/tree/master/ignore"
repository = "https://github.com/BurntSushi/ripgrep/tree/master/ignore"
readme = "README.md"
keywords = ["glob", "ignore", "gitignore", "pattern", "file"]
license = "Unlicense/MIT"
[lib]
name = "ignore"
bench = false
[dependencies]
crossbeam = "0.2"
globset = { version = "0.1.2", path = "../globset" }
lazy_static = "0.2"
log = "0.3"
memchr = "0.1"
regex = "0.1.77"
thread_local = "0.3.0"
walkdir = "1"
[dev-dependencies]
tempdir = "0.3.5"
[features]
simd-accel = ["globset/simd-accel"]
[profile.release]
debug = true

66
ignore/README.md Normal file
View File

@@ -0,0 +1,66 @@
ignore
======
The ignore crate provides a fast recursive directory iterator that respects
various filters such as globs, file types and `.gitignore` files. This crate
also provides lower level direct access to gitignore and file type matchers.
[![Linux build status](https://api.travis-ci.org/BurntSushi/ripgrep.png)](https://travis-ci.org/BurntSushi/ripgrep)
[![Windows build status](https://ci.appveyor.com/api/projects/status/github/BurntSushi/ripgrep?svg=true)](https://ci.appveyor.com/project/BurntSushi/ripgrep)
[![](https://img.shields.io/crates/v/ignore.svg)](https://crates.io/crates/ignore)
Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org).
### Documentation
[https://docs.rs/ignore](https://docs.rs/ignore)
### Usage
Add this to your `Cargo.toml`:
```toml
[dependencies]
ignore = "0.1"
```
and this to your crate root:
```rust
extern crate ignore;
```
### Example
This example shows the most basic usage of this crate. This code will
recursively traverse the current directory while automatically filtering out
files and directories according to ignore globs found in files like
`.ignore` and `.gitignore`:
```rust,no_run
use ignore::Walk;
for result in Walk::new("./") {
// Each item yielded by the iterator is either a directory entry or an
// error, so either print the path or the error.
match result {
Ok(entry) => println!("{}", entry.path().display()),
Err(err) => println!("ERROR: {}", err),
}
}
```
### Example: advanced
By default, the recursive directory iterator will ignore hidden files and
directories. This can be disabled by building the iterator with `WalkBuilder`:
```rust,no_run
use ignore::WalkBuilder;
for result in WalkBuilder::new("./").hidden(false).build() {
println!("{:?}", result);
}
```
See the documentation for `WalkBuilder` for many other options.

92
ignore/examples/walk.rs Normal file
View File

@@ -0,0 +1,92 @@
#![allow(dead_code, unused_imports, unused_mut, unused_variables)]
extern crate crossbeam;
extern crate ignore;
extern crate walkdir;
use std::env;
use std::io::{self, Write};
use std::path::Path;
use std::sync::Arc;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;
use crossbeam::sync::MsQueue;
use ignore::WalkBuilder;
use walkdir::WalkDir;
fn main() {
let mut path = env::args().nth(1).unwrap();
let mut parallel = false;
let mut simple = false;
let queue: Arc<MsQueue<Option<DirEntry>>> = Arc::new(MsQueue::new());
if path == "parallel" {
path = env::args().nth(2).unwrap();
parallel = true;
} else if path == "walkdir" {
path = env::args().nth(2).unwrap();
simple = true;
}
let stdout_queue = queue.clone();
let stdout_thread = thread::spawn(move || {
let mut stdout = io::BufWriter::new(io::stdout());
while let Some(dent) = stdout_queue.pop() {
write_path(&mut stdout, dent.path());
}
});
if parallel {
let walker = WalkBuilder::new(path).threads(6).build_parallel();
walker.run(|| {
let queue = queue.clone();
Box::new(move |result| {
use ignore::WalkState::*;
queue.push(Some(DirEntry::Y(result.unwrap())));
Continue
})
});
} else if simple {
let mut stdout = io::BufWriter::new(io::stdout());
let walker = WalkDir::new(path);
for result in walker {
queue.push(Some(DirEntry::X(result.unwrap())));
}
} else {
let mut stdout = io::BufWriter::new(io::stdout());
let walker = WalkBuilder::new(path).build();
for result in walker {
queue.push(Some(DirEntry::Y(result.unwrap())));
}
}
queue.push(None);
stdout_thread.join().unwrap();
}
enum DirEntry {
X(walkdir::DirEntry),
Y(ignore::DirEntry),
}
impl DirEntry {
fn path(&self) -> &Path {
match *self {
DirEntry::X(ref x) => x.path(),
DirEntry::Y(ref y) => y.path(),
}
}
}
#[cfg(unix)]
fn write_path<W: Write>(mut wtr: W, path: &Path) {
use std::os::unix::ffi::OsStrExt;
wtr.write(path.as_os_str().as_bytes()).unwrap();
wtr.write(b"\n").unwrap();
}
#[cfg(not(unix))]
fn write_path<W: Write>(mut wtr: W, path: &Path) {
wtr.write(path.to_string_lossy().as_bytes()).unwrap();
wtr.write(b"\n").unwrap();
}

800
ignore/src/dir.rs Normal file
View File

@@ -0,0 +1,800 @@
// This module provides a data structure, `Ignore`, that connects "directory
// traversal" with "ignore matchers." Specifically, it knows about gitignore
// semantics and precedence, and is organized based on directory hierarchy.
// Namely, every matcher logically corresponds to ignore rules from a single
// directory, and points to the matcher for its corresponding parent directory.
// In this sense, `Ignore` is a *persistent* data structure.
//
// This design was specifically chosen to make it possible to use this data
// structure in a parallel directory iterator.
//
// My initial intention was to expose this module as part of this crate's
// public API, but I think the data structure's public API is too complicated
// with non-obvious failure modes. Alas, such things haven't been documented
// well.
use std::collections::HashMap;
use std::ffi::OsString;
use std::path::{Path, PathBuf};
use std::sync::{Arc, RwLock};
use gitignore::{self, Gitignore, GitignoreBuilder};
use pathutil::{is_hidden, strip_prefix};
use overrides::{self, Override};
use types::{self, Types};
use {Error, Match, PartialErrorBuilder};
/// IgnoreMatch represents information about where a match came from when using
/// the `Ignore` matcher.
#[derive(Clone, Debug)]
pub struct IgnoreMatch<'a>(IgnoreMatchInner<'a>);
/// IgnoreMatchInner describes precisely where the match information came from.
/// This is private to allow expansion to more matchers in the future.
#[derive(Clone, Debug)]
enum IgnoreMatchInner<'a> {
Override(overrides::Glob<'a>),
Gitignore(&'a gitignore::Glob),
Types(types::Glob<'a>),
Hidden,
}
impl<'a> IgnoreMatch<'a> {
fn overrides(x: overrides::Glob<'a>) -> IgnoreMatch<'a> {
IgnoreMatch(IgnoreMatchInner::Override(x))
}
fn gitignore(x: &'a gitignore::Glob) -> IgnoreMatch<'a> {
IgnoreMatch(IgnoreMatchInner::Gitignore(x))
}
fn types(x: types::Glob<'a>) -> IgnoreMatch<'a> {
IgnoreMatch(IgnoreMatchInner::Types(x))
}
fn hidden() -> IgnoreMatch<'static> {
IgnoreMatch(IgnoreMatchInner::Hidden)
}
}
/// Options for the ignore matcher, shared between the matcher itself and the
/// builder.
#[derive(Clone, Copy, Debug)]
struct IgnoreOptions {
/// Whether to ignore hidden file paths or not.
hidden: bool,
/// Whether to read .ignore files.
ignore: bool,
/// Whether to read git's global gitignore file.
git_global: bool,
/// Whether to read .gitignore files.
git_ignore: bool,
/// Whether to read .git/info/exclude files.
git_exclude: bool,
}
impl IgnoreOptions {
/// Returns true if at least one type of ignore rules should be matched.
fn has_any_ignore_options(&self) -> bool {
self.ignore || self.git_global || self.git_ignore || self.git_exclude
}
}
/// Ignore is a matcher useful for recursively walking one or more directories.
#[derive(Clone, Debug)]
pub struct Ignore(Arc<IgnoreInner>);
#[derive(Clone, Debug)]
struct IgnoreInner {
/// A map of all existing directories that have already been
/// compiled into matchers.
///
/// Note that this is never used during matching, only when adding new
/// parent directory matchers. This avoids needing to rebuild glob sets for
/// parent directories if many paths are being searched.
compiled: Arc<RwLock<HashMap<OsString, Ignore>>>,
/// The path to the directory that this matcher was built from.
dir: PathBuf,
/// An override matcher (default is empty).
overrides: Arc<Override>,
/// A file type matcher.
types: Arc<Types>,
/// The parent directory to match next.
///
/// If this is the root directory or there are otherwise no more
/// directories to match, then `parent` is `None`.
parent: Option<Ignore>,
/// Whether this is an absolute parent matcher, as added by add_parent.
is_absolute_parent: bool,
/// The absolute base path of this matcher. Populated only if parent
/// directories are added.
absolute_base: Option<Arc<PathBuf>>,
/// Explicit ignore matchers specified by the caller.
explicit_ignores: Arc<Vec<Gitignore>>,
/// The matcher for .ignore files.
ignore_matcher: Gitignore,
/// A global gitignore matcher, usually from $XDG_CONFIG_HOME/git/ignore.
git_global_matcher: Arc<Gitignore>,
/// The matcher for .gitignore files.
git_ignore_matcher: Gitignore,
/// Special matcher for `.git/info/exclude` files.
git_exclude_matcher: Gitignore,
/// Whether this directory contains a .git sub-directory.
has_git: bool,
/// Ignore config.
opts: IgnoreOptions,
}
impl Ignore {
/// Return the directory path of this matcher.
#[allow(dead_code)]
pub fn path(&self) -> &Path {
&self.0.dir
}
/// Return true if this matcher has no parent.
pub fn is_root(&self) -> bool {
self.0.parent.is_none()
}
/// Returns true if this matcher was added via the `add_parents` method.
pub fn is_absolute_parent(&self) -> bool {
self.0.is_absolute_parent
}
/// Return this matcher's parent, if one exists.
pub fn parent(&self) -> Option<Ignore> {
self.0.parent.clone()
}
/// Create a new `Ignore` matcher with the parent directories of `dir`.
///
/// Note that this can only be called on an `Ignore` matcher with no
/// parents (i.e., `is_root` returns `true`). This will panic otherwise.
pub fn add_parents<P: AsRef<Path>>(
&self,
path: P,
) -> (Ignore, Option<Error>) {
if !self.is_root() {
panic!("Ignore::add_parents called on non-root matcher");
}
let absolute_base = match path.as_ref().canonicalize() {
Ok(path) => Arc::new(path),
Err(_) => {
// There's not much we can do here, so just return our
// existing matcher. We drop the error to be consistent
// with our general pattern of ignoring I/O errors when
// processing ignore files.
return (self.clone(), None);
}
};
// List of parents, from child to root.
let mut parents = vec![];
let mut path = &**absolute_base;
while let Some(parent) = path.parent() {
parents.push(parent);
path = parent;
}
let mut errs = PartialErrorBuilder::default();
let mut ig = self.clone();
for parent in parents.into_iter().rev() {
let mut compiled = self.0.compiled.write().unwrap();
if let Some(prebuilt) = compiled.get(parent.as_os_str()) {
ig = prebuilt.clone();
continue;
}
let (mut igtmp, err) = ig.add_child_path(parent);
errs.maybe_push(err);
igtmp.is_absolute_parent = true;
igtmp.absolute_base = Some(absolute_base.clone());
ig = Ignore(Arc::new(igtmp));
compiled.insert(parent.as_os_str().to_os_string(), ig.clone());
}
(ig, errs.into_error_option())
}
/// Create a new `Ignore` matcher for the given child directory.
///
/// Since building the matcher may require reading from multiple
/// files, it's possible that this method partially succeeds. Therefore,
/// a matcher is always returned (which may match nothing) and an error is
/// returned if it exists.
///
/// Note that all I/O errors are completely ignored.
pub fn add_child<P: AsRef<Path>>(
&self,
dir: P,
) -> (Ignore, Option<Error>) {
let (ig, err) = self.add_child_path(dir.as_ref());
(Ignore(Arc::new(ig)), err)
}
/// Like add_child, but takes a full path and returns an IgnoreInner.
fn add_child_path(&self, dir: &Path) -> (IgnoreInner, Option<Error>) {
static IG_NAMES: &'static [&'static str] = &[".rgignore", ".ignore"];
let mut errs = PartialErrorBuilder::default();
let ig_matcher =
if !self.0.opts.ignore {
Gitignore::empty()
} else {
let (m, err) = create_gitignore(&dir, IG_NAMES);
errs.maybe_push(err);
m
};
let gi_matcher =
if !self.0.opts.git_ignore {
Gitignore::empty()
} else {
let (m, err) = create_gitignore(&dir, &[".gitignore"]);
errs.maybe_push(err);
m
};
let gi_exclude_matcher =
if !self.0.opts.git_exclude {
Gitignore::empty()
} else {
let (m, err) = create_gitignore(&dir, &[".git/info/exclude"]);
errs.maybe_push(err);
m
};
let ig = IgnoreInner {
compiled: self.0.compiled.clone(),
dir: dir.to_path_buf(),
overrides: self.0.overrides.clone(),
types: self.0.types.clone(),
parent: Some(self.clone()),
is_absolute_parent: false,
absolute_base: self.0.absolute_base.clone(),
explicit_ignores: self.0.explicit_ignores.clone(),
ignore_matcher: ig_matcher,
git_global_matcher: self.0.git_global_matcher.clone(),
git_ignore_matcher: gi_matcher,
git_exclude_matcher: gi_exclude_matcher,
has_git: dir.join(".git").is_dir(),
opts: self.0.opts,
};
(ig, errs.into_error_option())
}
/// Returns a match indicating whether the given file path should be
/// ignored or not.
///
/// The match contains information about its origin.
pub fn matched<'a, P: AsRef<Path>>(
&'a self,
path: P,
is_dir: bool,
) -> Match<IgnoreMatch<'a>> {
// We need to be careful with our path. If it has a leading ./, then
// strip it because it causes nothing but trouble.
let mut path = path.as_ref();
if let Some(p) = strip_prefix("./", path) {
path = p;
}
// Match against the override patterns. If an override matches
// regardless of whether it's whitelist/ignore, then we quit and
// return that result immediately. Overrides have the highest
// precedence.
if !self.0.overrides.is_empty() {
let mat =
self.0.overrides.matched(path, is_dir)
.map(IgnoreMatch::overrides);
if !mat.is_none() {
return mat;
}
}
let mut whitelisted = Match::None;
if self.0.opts.has_any_ignore_options() {
let mat = self.matched_ignore(path, is_dir);
if mat.is_ignore() {
return mat;
} else if mat.is_whitelist() {
whitelisted = mat;
}
}
if !self.0.types.is_empty() {
let mat =
self.0.types.matched(path, is_dir).map(IgnoreMatch::types);
if mat.is_ignore() {
return mat;
} else if mat.is_whitelist() {
whitelisted = mat;
}
}
if whitelisted.is_none() && self.0.opts.hidden && is_hidden(path) {
return Match::Ignore(IgnoreMatch::hidden());
}
whitelisted
}
/// Performs matching only on the ignore files for this directory and
/// all parent directories.
fn matched_ignore<'a>(
&'a self,
path: &Path,
is_dir: bool,
) -> Match<IgnoreMatch<'a>> {
let (mut m_ignore, mut m_gi, mut m_gi_exclude, mut m_explicit) =
(Match::None, Match::None, Match::None, Match::None);
let mut saw_git = false;
for ig in self.parents().take_while(|ig| !ig.0.is_absolute_parent) {
if m_ignore.is_none() {
m_ignore =
ig.0.ignore_matcher.matched(path, is_dir)
.map(IgnoreMatch::gitignore);
}
if !saw_git && m_gi.is_none() {
m_gi =
ig.0.git_ignore_matcher.matched(path, is_dir)
.map(IgnoreMatch::gitignore);
}
if !saw_git && m_gi_exclude.is_none() {
m_gi_exclude =
ig.0.git_exclude_matcher.matched(path, is_dir)
.map(IgnoreMatch::gitignore);
}
saw_git = saw_git || ig.0.has_git;
}
if let Some(abs_parent_path) = self.absolute_base() {
let path = abs_parent_path.join(path);
for ig in self.parents().skip_while(|ig|!ig.0.is_absolute_parent) {
if m_ignore.is_none() {
m_ignore =
ig.0.ignore_matcher.matched(&path, is_dir)
.map(IgnoreMatch::gitignore);
}
if !saw_git && m_gi.is_none() {
m_gi =
ig.0.git_ignore_matcher.matched(&path, is_dir)
.map(IgnoreMatch::gitignore);
}
if !saw_git && m_gi_exclude.is_none() {
m_gi_exclude =
ig.0.git_exclude_matcher.matched(&path, is_dir)
.map(IgnoreMatch::gitignore);
}
saw_git = saw_git || ig.0.has_git;
}
}
for gi in self.0.explicit_ignores.iter().rev() {
if !m_explicit.is_none() {
break;
}
m_explicit = gi.matched(&path, is_dir).map(IgnoreMatch::gitignore);
}
let m_global = self.0.git_global_matcher.matched(&path, is_dir)
.map(IgnoreMatch::gitignore);
m_ignore.or(m_gi).or(m_gi_exclude).or(m_global).or(m_explicit)
}
/// Returns an iterator over parent ignore matchers, including this one.
pub fn parents(&self) -> Parents {
Parents(Some(self))
}
/// Returns the first absolute path of the first absolute parent, if
/// one exists.
fn absolute_base(&self) -> Option<&Path> {
self.0.absolute_base.as_ref().map(|p| &***p)
}
}
/// An iterator over all parents of an ignore matcher, including itself.
///
/// The lifetime `'a` refers to the lifetime of the initial `Ignore` matcher.
pub struct Parents<'a>(Option<&'a Ignore>);
impl<'a> Iterator for Parents<'a> {
type Item = &'a Ignore;
fn next(&mut self) -> Option<&'a Ignore> {
match self.0.take() {
None => None,
Some(ig) => {
self.0 = ig.0.parent.as_ref();
Some(ig)
}
}
}
}
/// A builder for creating an Ignore matcher.
#[derive(Clone, Debug)]
pub struct IgnoreBuilder {
/// The root directory path for this ignore matcher.
dir: PathBuf,
/// An override matcher (default is empty).
overrides: Arc<Override>,
/// A type matcher (default is empty).
types: Arc<Types>,
/// Explicit ignore matchers.
explicit_ignores: Vec<Gitignore>,
/// Ignore config.
opts: IgnoreOptions,
}
impl IgnoreBuilder {
/// Create a new builder for an `Ignore` matcher.
///
/// All relative file paths are resolved with respect to the current
/// working directory.
pub fn new() -> IgnoreBuilder {
IgnoreBuilder {
dir: Path::new("").to_path_buf(),
overrides: Arc::new(Override::empty()),
types: Arc::new(Types::empty()),
explicit_ignores: vec![],
opts: IgnoreOptions {
hidden: true,
ignore: true,
git_global: true,
git_ignore: true,
git_exclude: true,
},
}
}
/// Builds a new `Ignore` matcher.
///
/// The matcher returned won't match anything until ignore rules from
/// directories are added to it.
pub fn build(&self) -> Ignore {
let git_global_matcher =
if !self.opts.git_global {
Gitignore::empty()
} else {
let (gi, err) = Gitignore::global();
if let Some(err) = err {
debug!("{}", err);
}
gi
};
Ignore(Arc::new(IgnoreInner {
compiled: Arc::new(RwLock::new(HashMap::new())),
dir: self.dir.clone(),
overrides: self.overrides.clone(),
types: self.types.clone(),
parent: None,
is_absolute_parent: true,
absolute_base: None,
explicit_ignores: Arc::new(self.explicit_ignores.clone()),
ignore_matcher: Gitignore::empty(),
git_global_matcher: Arc::new(git_global_matcher),
git_ignore_matcher: Gitignore::empty(),
git_exclude_matcher: Gitignore::empty(),
has_git: false,
opts: self.opts,
}))
}
/// Add an override matcher.
///
/// By default, no override matcher is used.
///
/// This overrides any previous setting.
pub fn overrides(&mut self, overrides: Override) -> &mut IgnoreBuilder {
self.overrides = Arc::new(overrides);
self
}
/// Add a file type matcher.
///
/// By default, no file type matcher is used.
///
/// This overrides any previous setting.
pub fn types(&mut self, types: Types) -> &mut IgnoreBuilder {
self.types = Arc::new(types);
self
}
/// Adds a new global ignore matcher from the ignore file path given.
pub fn add_ignore(&mut self, ig: Gitignore) -> &mut IgnoreBuilder {
self.explicit_ignores.push(ig);
self
}
/// Enables ignoring hidden files.
///
/// This is enabled by default.
pub fn hidden(&mut self, yes: bool) -> &mut IgnoreBuilder {
self.opts.hidden = yes;
self
}
/// Enables reading `.ignore` files.
///
/// `.ignore` files have the same semantics as `gitignore` files and are
/// supported by search tools such as ripgrep and The Silver Searcher.
///
/// This is enabled by default.
pub fn ignore(&mut self, yes: bool) -> &mut IgnoreBuilder {
self.opts.ignore = yes;
self
}
/// Add a global gitignore matcher.
///
/// Its precedence is lower than both normal `.gitignore` files and
/// `.git/info/exclude` files.
///
/// This overwrites any previous global gitignore setting.
///
/// This is enabled by default.
pub fn git_global(&mut self, yes: bool) -> &mut IgnoreBuilder {
self.opts.git_global = yes;
self
}
/// Enables reading `.gitignore` files.
///
/// `.gitignore` files have match semantics as described in the `gitignore`
/// man page.
///
/// This is enabled by default.
pub fn git_ignore(&mut self, yes: bool) -> &mut IgnoreBuilder {
self.opts.git_ignore = yes;
self
}
/// Enables reading `.git/info/exclude` files.
///
/// `.git/info/exclude` files have match semantics as described in the
/// `gitignore` man page.
///
/// This is enabled by default.
pub fn git_exclude(&mut self, yes: bool) -> &mut IgnoreBuilder {
self.opts.git_exclude = yes;
self
}
}
/// Creates a new gitignore matcher for the directory given.
///
/// Ignore globs are extracted from each of the file names in `dir` in the
/// order given (earlier names have lower precedence than later names).
///
/// I/O errors are ignored.
pub fn create_gitignore(
dir: &Path,
names: &[&str],
) -> (Gitignore, Option<Error>) {
let mut builder = GitignoreBuilder::new(dir);
let mut errs = PartialErrorBuilder::default();
for name in names {
let gipath = dir.join(name);
errs.maybe_push_ignore_io(builder.add(gipath));
}
let gi = match builder.build() {
Ok(gi) => gi,
Err(err) => {
errs.push(err);
GitignoreBuilder::new(dir).build().unwrap()
}
};
(gi, errs.into_error_option())
}
#[cfg(test)]
mod tests {
use std::fs::{self, File};
use std::io::Write;
use std::path::Path;
use tempdir::TempDir;
use dir::IgnoreBuilder;
use gitignore::Gitignore;
use Error;
fn wfile<P: AsRef<Path>>(path: P, contents: &str) {
let mut file = File::create(path).unwrap();
file.write_all(contents.as_bytes()).unwrap();
}
fn mkdirp<P: AsRef<Path>>(path: P) {
fs::create_dir_all(path).unwrap();
}
fn partial(err: Error) -> Vec<Error> {
match err {
Error::Partial(errs) => errs,
_ => panic!("expected partial error but got {:?}", err),
}
}
#[test]
fn explicit_ignore() {
let td = TempDir::new("ignore-test-").unwrap();
wfile(td.path().join("not-an-ignore"), "foo\n!bar");
let (gi, err) = Gitignore::new(td.path().join("not-an-ignore"));
assert!(err.is_none());
let (ig, err) = IgnoreBuilder::new()
.add_ignore(gi).build().add_child(td.path());
assert!(err.is_none());
assert!(ig.matched("foo", false).is_ignore());
assert!(ig.matched("bar", false).is_whitelist());
assert!(ig.matched("baz", false).is_none());
}
#[test]
fn git_exclude() {
let td = TempDir::new("ignore-test-").unwrap();
mkdirp(td.path().join(".git/info"));
wfile(td.path().join(".git/info/exclude"), "foo\n!bar");
let (ig, err) = IgnoreBuilder::new().build().add_child(td.path());
assert!(err.is_none());
assert!(ig.matched("foo", false).is_ignore());
assert!(ig.matched("bar", false).is_whitelist());
assert!(ig.matched("baz", false).is_none());
}
#[test]
fn gitignore() {
let td = TempDir::new("ignore-test-").unwrap();
wfile(td.path().join(".gitignore"), "foo\n!bar");
let (ig, err) = IgnoreBuilder::new().build().add_child(td.path());
assert!(err.is_none());
assert!(ig.matched("foo", false).is_ignore());
assert!(ig.matched("bar", false).is_whitelist());
assert!(ig.matched("baz", false).is_none());
}
#[test]
fn ignore() {
let td = TempDir::new("ignore-test-").unwrap();
wfile(td.path().join(".ignore"), "foo\n!bar");
let (ig, err) = IgnoreBuilder::new().build().add_child(td.path());
assert!(err.is_none());
assert!(ig.matched("foo", false).is_ignore());
assert!(ig.matched("bar", false).is_whitelist());
assert!(ig.matched("baz", false).is_none());
}
// Tests that an .ignore will override a .gitignore.
#[test]
fn ignore_over_gitignore() {
let td = TempDir::new("ignore-test-").unwrap();
wfile(td.path().join(".gitignore"), "foo");
wfile(td.path().join(".ignore"), "!foo");
let (ig, err) = IgnoreBuilder::new().build().add_child(td.path());
assert!(err.is_none());
assert!(ig.matched("foo", false).is_whitelist());
}
// Tests that exclude has lower precedent than both .ignore and .gitignore.
#[test]
fn exclude_lowest() {
let td = TempDir::new("ignore-test-").unwrap();
wfile(td.path().join(".gitignore"), "!foo");
wfile(td.path().join(".ignore"), "!bar");
mkdirp(td.path().join(".git/info"));
wfile(td.path().join(".git/info/exclude"), "foo\nbar\nbaz");
let (ig, err) = IgnoreBuilder::new().build().add_child(td.path());
assert!(err.is_none());
assert!(ig.matched("baz", false).is_ignore());
assert!(ig.matched("foo", false).is_whitelist());
assert!(ig.matched("bar", false).is_whitelist());
}
#[test]
fn errored() {
let td = TempDir::new("ignore-test-").unwrap();
wfile(td.path().join(".gitignore"), "f**oo");
let (_, err) = IgnoreBuilder::new().build().add_child(td.path());
assert!(err.is_some());
}
#[test]
fn errored_both() {
let td = TempDir::new("ignore-test-").unwrap();
wfile(td.path().join(".gitignore"), "f**oo");
wfile(td.path().join(".ignore"), "fo**o");
let (_, err) = IgnoreBuilder::new().build().add_child(td.path());
assert_eq!(2, partial(err.expect("an error")).len());
}
#[test]
fn errored_partial() {
let td = TempDir::new("ignore-test-").unwrap();
wfile(td.path().join(".gitignore"), "f**oo\nbar");
let (ig, err) = IgnoreBuilder::new().build().add_child(td.path());
assert!(err.is_some());
assert!(ig.matched("bar", false).is_ignore());
}
#[test]
fn errored_partial_and_ignore() {
let td = TempDir::new("ignore-test-").unwrap();
wfile(td.path().join(".gitignore"), "f**oo\nbar");
wfile(td.path().join(".ignore"), "!bar");
let (ig, err) = IgnoreBuilder::new().build().add_child(td.path());
assert!(err.is_some());
assert!(ig.matched("bar", false).is_whitelist());
}
#[test]
fn not_present_empty() {
let td = TempDir::new("ignore-test-").unwrap();
let (_, err) = IgnoreBuilder::new().build().add_child(td.path());
assert!(err.is_none());
}
#[test]
fn stops_at_git_dir() {
// This tests that .gitignore files beyond a .git barrier aren't
// matched, but .ignore files are.
let td = TempDir::new("ignore-test-").unwrap();
mkdirp(td.path().join(".git"));
mkdirp(td.path().join("foo/.git"));
wfile(td.path().join(".gitignore"), "foo");
wfile(td.path().join(".ignore"), "bar");
let ig0 = IgnoreBuilder::new().build();
let (ig1, err) = ig0.add_child(td.path());
assert!(err.is_none());
let (ig2, err) = ig1.add_child(ig1.path().join("foo"));
assert!(err.is_none());
assert!(ig1.matched("foo", false).is_ignore());
assert!(ig2.matched("foo", false).is_none());
assert!(ig1.matched("bar", false).is_ignore());
assert!(ig2.matched("bar", false).is_ignore());
}
#[test]
fn absolute_parent() {
let td = TempDir::new("ignore-test-").unwrap();
mkdirp(td.path().join(".git"));
mkdirp(td.path().join("foo"));
wfile(td.path().join(".gitignore"), "bar");
// First, check that the parent gitignore file isn't detected if the
// parent isn't added. This establishes a baseline.
let ig0 = IgnoreBuilder::new().build();
let (ig1, err) = ig0.add_child(td.path().join("foo"));
assert!(err.is_none());
assert!(ig1.matched("bar", false).is_none());
// Second, check that adding a parent directory actually works.
let ig0 = IgnoreBuilder::new().build();
let (ig1, err) = ig0.add_parents(td.path().join("foo"));
assert!(err.is_none());
let (ig2, err) = ig1.add_child(td.path().join("foo"));
assert!(err.is_none());
assert!(ig2.matched("bar", false).is_ignore());
}
#[test]
fn absolute_parent_anchored() {
let td = TempDir::new("ignore-test-").unwrap();
mkdirp(td.path().join(".git"));
mkdirp(td.path().join("src/llvm"));
wfile(td.path().join(".gitignore"), "/llvm/\nfoo");
let ig0 = IgnoreBuilder::new().build();
let (ig1, err) = ig0.add_parents(td.path().join("src"));
assert!(err.is_none());
let (ig2, err) = ig1.add_child("src");
assert!(err.is_none());
assert!(ig1.matched("llvm", true).is_none());
assert!(ig2.matched("llvm", true).is_none());
assert!(ig2.matched("src/llvm", true).is_none());
assert!(ig2.matched("foo", false).is_ignore());
assert!(ig2.matched("src/foo", false).is_ignore());
}
}

607
ignore/src/gitignore.rs Normal file
View File

@@ -0,0 +1,607 @@
/*!
The gitignore module provides a way to match globs from a gitignore file
against file paths.
Note that this module implements the specification as described in the
`gitignore` man page from scratch. That is, this module does *not* shell out to
the `git` command line tool.
*/
use std::cell::RefCell;
use std::env;
use std::fs::File;
use std::io::{self, BufRead, Read};
use std::path::{Path, PathBuf};
use std::str;
use std::sync::Arc;
use globset::{Candidate, GlobBuilder, GlobSet, GlobSetBuilder};
use regex::bytes::Regex;
use thread_local::ThreadLocal;
use pathutil::{is_file_name, strip_prefix};
use {Error, Match, PartialErrorBuilder};
/// Glob represents a single glob in a gitignore file.
///
/// This is used to report information about the highest precedent glob that
/// matched in one or more gitignore files.
#[derive(Clone, Debug)]
pub struct Glob {
/// The file path that this glob was extracted from.
from: Option<PathBuf>,
/// The original glob string.
original: String,
/// The actual glob string used to convert to a regex.
actual: String,
/// Whether this is a whitelisted glob or not.
is_whitelist: bool,
/// Whether this glob should only match directories or not.
is_only_dir: bool,
}
impl Glob {
/// Returns the file path that defined this glob.
pub fn from(&self) -> Option<&Path> {
self.from.as_ref().map(|p| &**p)
}
/// The original glob as it was defined in a gitignore file.
pub fn original(&self) -> &str {
&self.original
}
/// The actual glob that was compiled to respect gitignore
/// semantics.
pub fn actual(&self) -> &str {
&self.actual
}
/// Whether this was a whitelisted glob or not.
pub fn is_whitelist(&self) -> bool {
self.is_whitelist
}
/// Whether this glob must match a directory or not.
pub fn is_only_dir(&self) -> bool {
self.is_only_dir
}
}
/// Gitignore is a matcher for the globs in one or more gitignore files
/// in the same directory.
#[derive(Clone, Debug)]
pub struct Gitignore {
set: GlobSet,
root: PathBuf,
globs: Vec<Glob>,
num_ignores: u64,
num_whitelists: u64,
matches: Arc<ThreadLocal<RefCell<Vec<usize>>>>,
}
impl Gitignore {
/// Creates a new gitignore matcher from the gitignore file path given.
///
/// If it's desirable to include multiple gitignore files in a single
/// matcher, or read gitignore globs from a different source, then
/// use `GitignoreBuilder`.
///
/// This always returns a valid matcher, even if it's empty. In particular,
/// a Gitignore file can be partially valid, e.g., when one glob is invalid
/// but the rest aren't.
///
/// Note that I/O errors are ignored. For more granular control over
/// errors, use `GitignoreBuilder`.
pub fn new<P: AsRef<Path>>(
gitignore_path: P,
) -> (Gitignore, Option<Error>) {
let path = gitignore_path.as_ref();
let parent = path.parent().unwrap_or(Path::new("/"));
let mut builder = GitignoreBuilder::new(parent);
let mut errs = PartialErrorBuilder::default();
errs.maybe_push_ignore_io(builder.add(path));
match builder.build() {
Ok(gi) => (gi, errs.into_error_option()),
Err(err) => {
errs.push(err);
(Gitignore::empty(), errs.into_error_option())
}
}
}
/// Creates a new gitignore matcher from the global ignore file, if one
/// exists.
///
/// The global config file path is specified by git's `core.excludesFile`
/// config option.
///
/// Git's config file location is `$HOME/.gitconfig`. If `$HOME/.gitconfig`
/// does not exist or does not specify `core.excludesFile`, then
/// `$XDG_CONFIG_HOME/git/ignore` is read. If `$XDG_CONFIG_HOME` is not
/// set or is empty, then `$HOME/.config/git/ignore` is used instead.
pub fn global() -> (Gitignore, Option<Error>) {
match gitconfig_excludes_path() {
None => (Gitignore::empty(), None),
Some(path) => {
if !path.is_file() {
(Gitignore::empty(), None)
} else {
Gitignore::new(path)
}
}
}
}
/// Creates a new empty gitignore matcher that never matches anything.
///
/// Its path is empty.
pub fn empty() -> Gitignore {
GitignoreBuilder::new("").build().unwrap()
}
/// Returns the directory containing this gitignore matcher.
///
/// All matches are done relative to this path.
pub fn path(&self) -> &Path {
&*self.root
}
/// Returns true if and only if this gitignore has zero globs, and
/// therefore never matches any file path.
pub fn is_empty(&self) -> bool {
self.set.is_empty()
}
/// Returns the total number of globs, which should be equivalent to
/// `num_ignores + num_whitelists`.
pub fn len(&self) -> usize {
self.set.len()
}
/// Returns the total number of ignore globs.
pub fn num_ignores(&self) -> u64 {
self.num_ignores
}
/// Returns the total number of whitelisted globs.
pub fn num_whitelists(&self) -> u64 {
self.num_whitelists
}
/// Returns whether the given file path matched a pattern in this gitignore
/// matcher.
///
/// `is_dir` should be true if the path refers to a directory and false
/// otherwise.
///
/// The given path is matched relative to the path given when building
/// the matcher. Specifically, before matching `path`, its prefix (as
/// determined by a common suffix of the directory containing this
/// gitignore) is stripped. If there is no common suffix/prefix overlap,
/// then `path` is assumed to be relative to this matcher.
pub fn matched<P: AsRef<Path>>(
&self,
path: P,
is_dir: bool,
) -> Match<&Glob> {
if self.is_empty() {
return Match::None;
}
self.matched_stripped(self.strip(path.as_ref()), is_dir)
}
/// Like matched, but takes a path that has already been stripped.
fn matched_stripped<P: AsRef<Path>>(
&self,
path: P,
is_dir: bool,
) -> Match<&Glob> {
if self.is_empty() {
return Match::None;
}
let path = path.as_ref();
let _matches = self.matches.get_default();
let mut matches = _matches.borrow_mut();
let candidate = Candidate::new(path);
self.set.matches_candidate_into(&candidate, &mut *matches);
for &i in matches.iter().rev() {
let glob = &self.globs[i];
if !glob.is_only_dir() || is_dir {
return if glob.is_whitelist() {
Match::Whitelist(glob)
} else {
Match::Ignore(glob)
};
}
}
Match::None
}
/// Strips the given path such that it's suitable for matching with this
/// gitignore matcher.
fn strip<'a, P: 'a + AsRef<Path> + ?Sized>(
&'a self,
path: &'a P,
) -> &'a Path {
let mut path = path.as_ref();
// A leading ./ is completely superfluous. We also strip it from
// our gitignore root path, so we need to strip it from our candidate
// path too.
if let Some(p) = strip_prefix("./", path) {
path = p;
}
// Strip any common prefix between the candidate path and the root
// of the gitignore, to make sure we get relative matching right.
// BUT, a file name might not have any directory components to it,
// in which case, we don't want to accidentally strip any part of the
// file name.
if !is_file_name(path) {
if let Some(p) = strip_prefix(&self.root, path) {
path = p;
// If we're left with a leading slash, get rid of it.
if let Some(p) = strip_prefix("/", path) {
path = p;
}
}
}
path
}
}
/// Builds a matcher for a single set of globs from a .gitignore file.
pub struct GitignoreBuilder {
builder: GlobSetBuilder,
root: PathBuf,
globs: Vec<Glob>,
}
impl GitignoreBuilder {
/// Create a new builder for a gitignore file.
///
/// The path given should be the path at which the globs for this gitignore
/// file should be matched. Note that paths are always matched relative
/// to the root path given here. Generally, the root path should correspond
/// to the *directory* containing a `.gitignore` file.
pub fn new<P: AsRef<Path>>(root: P) -> GitignoreBuilder {
let root = root.as_ref();
GitignoreBuilder {
builder: GlobSetBuilder::new(),
root: strip_prefix("./", root).unwrap_or(root).to_path_buf(),
globs: vec![],
}
}
/// Builds a new matcher from the globs added so far.
///
/// Once a matcher is built, no new globs can be added to it.
pub fn build(&self) -> Result<Gitignore, Error> {
let nignore = self.globs.iter().filter(|g| !g.is_whitelist()).count();
let nwhite = self.globs.iter().filter(|g| g.is_whitelist()).count();
let set = try!(
self.builder.build().map_err(|err| Error::Glob(err.to_string())));
Ok(Gitignore {
set: set,
root: self.root.clone(),
globs: self.globs.clone(),
num_ignores: nignore as u64,
num_whitelists: nwhite as u64,
matches: Arc::new(ThreadLocal::default()),
})
}
/// Add each glob from the file path given.
///
/// The file given should be formatted as a `gitignore` file.
///
/// Note that partial errors can be returned. For example, if there was
/// a problem adding one glob, an error for that will be returned, but
/// all other valid globs will still be added.
pub fn add<P: AsRef<Path>>(&mut self, path: P) -> Option<Error> {
let path = path.as_ref();
let file = match File::open(path) {
Err(err) => return Some(Error::Io(err).with_path(path)),
Ok(file) => file,
};
let rdr = io::BufReader::new(file);
let mut errs = PartialErrorBuilder::default();
for (i, line) in rdr.lines().enumerate() {
let lineno = (i + 1) as u64;
let line = match line {
Ok(line) => line,
Err(err) => {
errs.push(Error::Io(err).tagged(path, lineno));
break;
}
};
if let Err(err) = self.add_line(Some(path.to_path_buf()), &line) {
errs.push(err.tagged(path, lineno));
}
}
errs.into_error_option()
}
/// Add each glob line from the string given.
///
/// If this string came from a particular `gitignore` file, then its path
/// should be provided here.
///
/// The string given should be formatted as a `gitignore` file.
#[cfg(test)]
fn add_str(
&mut self,
from: Option<PathBuf>,
gitignore: &str,
) -> Result<&mut GitignoreBuilder, Error> {
for line in gitignore.lines() {
try!(self.add_line(from.clone(), line));
}
Ok(self)
}
/// Add a line from a gitignore file to this builder.
///
/// If this line came from a particular `gitignore` file, then its path
/// should be provided here.
///
/// If the line could not be parsed as a glob, then an error is returned.
pub fn add_line(
&mut self,
from: Option<PathBuf>,
mut line: &str,
) -> Result<&mut GitignoreBuilder, Error> {
if line.starts_with("#") {
return Ok(self);
}
if !line.ends_with("\\ ") {
line = line.trim_right();
}
if line.is_empty() {
return Ok(self);
}
let mut glob = Glob {
from: from,
original: line.to_string(),
actual: String::new(),
is_whitelist: false,
is_only_dir: false,
};
let mut literal_separator = false;
let has_slash = line.chars().any(|c| c == '/');
let is_absolute = line.chars().nth(0).unwrap() == '/';
if line.starts_with("\\!") || line.starts_with("\\#") {
line = &line[1..];
} else {
if line.starts_with("!") {
glob.is_whitelist = true;
line = &line[1..];
}
if line.starts_with("/") {
// `man gitignore` says that if a glob starts with a slash,
// then the glob can only match the beginning of a path
// (relative to the location of gitignore). We achieve this by
// simply banning wildcards from matching /.
literal_separator = true;
line = &line[1..];
}
}
// If it ends with a slash, then this should only match directories,
// but the slash should otherwise not be used while globbing.
if let Some((i, c)) = line.char_indices().rev().nth(0) {
if c == '/' {
glob.is_only_dir = true;
line = &line[..i];
}
}
// If there is a literal slash, then we note that so that globbing
// doesn't let wildcards match slashes.
glob.actual = line.to_string();
if has_slash {
literal_separator = true;
}
// If there was a leading slash, then this is a glob that must
// match the entire path name. Otherwise, we should let it match
// anywhere, so use a **/ prefix.
if !is_absolute {
// ... but only if we don't already have a **/ prefix.
if !glob.actual.starts_with("**/") {
glob.actual = format!("**/{}", glob.actual);
}
}
// If the glob ends with `/**`, then we should only match everything
// inside a directory, but not the directory itself. Standard globs
// will match the directory. So we add `/*` to force the issue.
if glob.actual.ends_with("/**") {
glob.actual = format!("{}/*", glob.actual);
}
let parsed = try!(
GlobBuilder::new(&glob.actual)
.literal_separator(literal_separator)
.build()
.map_err(|err| Error::Glob(err.to_string())));
self.builder.add(parsed);
self.globs.push(glob);
Ok(self)
}
}
/// Return the file path of the current environment's global gitignore file.
///
/// Note that the file path returned may not exist.
fn gitconfig_excludes_path() -> Option<PathBuf> {
gitconfig_contents()
.and_then(|data| parse_excludes_file(&data))
.or_else(excludes_file_default)
}
/// Returns the file contents of git's global config file, if one exists.
fn gitconfig_contents() -> Option<Vec<u8>> {
let home = match env::var_os("HOME") {
None => return None,
Some(home) => PathBuf::from(home),
};
let mut file = match File::open(home.join(".gitconfig")) {
Err(_) => return None,
Ok(file) => io::BufReader::new(file),
};
let mut contents = vec![];
file.read_to_end(&mut contents).ok().map(|_| contents)
}
/// Returns the default file path for a global .gitignore file.
///
/// Specifically, this respects XDG_CONFIG_HOME.
fn excludes_file_default() -> Option<PathBuf> {
env::var_os("XDG_CONFIG_HOME")
.and_then(|x| if x.is_empty() { None } else { Some(PathBuf::from(x)) })
.or_else(|| env::home_dir())
.map(|x| x.join("git/ignore"))
}
/// Extract git's `core.excludesfile` config setting from the raw file contents
/// given.
fn parse_excludes_file(data: &[u8]) -> Option<PathBuf> {
// N.B. This is the lazy approach, and isn't technically correct, but
// probably works in more circumstances. I guess we would ideally have
// a full INI parser. Yuck.
lazy_static! {
static ref RE: Regex = Regex::new(
r"(?ium)^\s*excludesfile\s*=\s*(.+)\s*$").unwrap();
};
let caps = match RE.captures(data) {
None => return None,
Some(caps) => caps,
};
str::from_utf8(&caps[1]).ok().map(|s| PathBuf::from(expand_tilde(s)))
}
/// Expands ~ in file paths to the value of $HOME.
fn expand_tilde(path: &str) -> String {
let home = match env::var("HOME") {
Err(_) => return path.to_string(),
Ok(home) => home,
};
path.replace("~", &home)
}
#[cfg(test)]
mod tests {
use std::path::Path;
use super::{Gitignore, GitignoreBuilder};
fn gi_from_str<P: AsRef<Path>>(root: P, s: &str) -> Gitignore {
let mut builder = GitignoreBuilder::new(root);
builder.add_str(None, s).unwrap();
builder.build().unwrap()
}
macro_rules! ignored {
($name:ident, $root:expr, $gi:expr, $path:expr) => {
ignored!($name, $root, $gi, $path, false);
};
($name:ident, $root:expr, $gi:expr, $path:expr, $is_dir:expr) => {
#[test]
fn $name() {
let gi = gi_from_str($root, $gi);
assert!(gi.matched($path, $is_dir).is_ignore());
}
};
}
macro_rules! not_ignored {
($name:ident, $root:expr, $gi:expr, $path:expr) => {
not_ignored!($name, $root, $gi, $path, false);
};
($name:ident, $root:expr, $gi:expr, $path:expr, $is_dir:expr) => {
#[test]
fn $name() {
let gi = gi_from_str($root, $gi);
assert!(!gi.matched($path, $is_dir).is_ignore());
}
};
}
const ROOT: &'static str = "/home/foobar/rust/rg";
ignored!(ig1, ROOT, "months", "months");
ignored!(ig2, ROOT, "*.lock", "Cargo.lock");
ignored!(ig3, ROOT, "*.rs", "src/main.rs");
ignored!(ig4, ROOT, "src/*.rs", "src/main.rs");
ignored!(ig5, ROOT, "/*.c", "cat-file.c");
ignored!(ig6, ROOT, "/src/*.rs", "src/main.rs");
ignored!(ig7, ROOT, "!src/main.rs\n*.rs", "src/main.rs");
ignored!(ig8, ROOT, "foo/", "foo", true);
ignored!(ig9, ROOT, "**/foo", "foo");
ignored!(ig10, ROOT, "**/foo", "src/foo");
ignored!(ig11, ROOT, "**/foo/**", "src/foo/bar");
ignored!(ig12, ROOT, "**/foo/**", "wat/src/foo/bar/baz");
ignored!(ig13, ROOT, "**/foo/bar", "foo/bar");
ignored!(ig14, ROOT, "**/foo/bar", "src/foo/bar");
ignored!(ig15, ROOT, "abc/**", "abc/x");
ignored!(ig16, ROOT, "abc/**", "abc/x/y");
ignored!(ig17, ROOT, "abc/**", "abc/x/y/z");
ignored!(ig18, ROOT, "a/**/b", "a/b");
ignored!(ig19, ROOT, "a/**/b", "a/x/b");
ignored!(ig20, ROOT, "a/**/b", "a/x/y/b");
ignored!(ig21, ROOT, r"\!xy", "!xy");
ignored!(ig22, ROOT, r"\#foo", "#foo");
ignored!(ig23, ROOT, "foo", "./foo");
ignored!(ig24, ROOT, "target", "grep/target");
ignored!(ig25, ROOT, "Cargo.lock", "./tabwriter-bin/Cargo.lock");
ignored!(ig26, ROOT, "/foo/bar/baz", "./foo/bar/baz");
ignored!(ig27, ROOT, "foo/", "xyz/foo", true);
ignored!(ig28, ROOT, "src/*.rs", "src/grep/src/main.rs");
ignored!(ig29, "./src", "/llvm/", "./src/llvm", true);
ignored!(ig30, ROOT, "node_modules/ ", "node_modules", true);
not_ignored!(ignot1, ROOT, "amonths", "months");
not_ignored!(ignot2, ROOT, "monthsa", "months");
not_ignored!(ignot3, ROOT, "/src/*.rs", "src/grep/src/main.rs");
not_ignored!(ignot4, ROOT, "/*.c", "mozilla-sha1/sha1.c");
not_ignored!(ignot5, ROOT, "/src/*.rs", "src/grep/src/main.rs");
not_ignored!(ignot6, ROOT, "*.rs\n!src/main.rs", "src/main.rs");
not_ignored!(ignot7, ROOT, "foo/", "foo", false);
not_ignored!(ignot8, ROOT, "**/foo/**", "wat/src/afoo/bar/baz");
not_ignored!(ignot9, ROOT, "**/foo/**", "wat/src/fooa/bar/baz");
not_ignored!(ignot10, ROOT, "**/foo/bar", "foo/src/bar");
not_ignored!(ignot11, ROOT, "#foo", "#foo");
not_ignored!(ignot12, ROOT, "\n\n\n", "foo");
not_ignored!(ignot13, ROOT, "foo/**", "foo", true);
not_ignored!(
ignot14, "./third_party/protobuf", "m4/ltoptions.m4",
"./third_party/protobuf/csharp/src/packages/repositories.config");
fn bytes(s: &str) -> Vec<u8> {
s.to_string().into_bytes()
}
fn path_string<P: AsRef<Path>>(path: P) -> String {
path.as_ref().to_str().unwrap().to_string()
}
#[test]
fn parse_excludes_file1() {
let data = bytes("[core]\nexcludesFile = /foo/bar");
let got = super::parse_excludes_file(&data).unwrap();
assert_eq!(path_string(got), "/foo/bar");
}
#[test]
fn parse_excludes_file2() {
let data = bytes("[core]\nexcludesFile = ~/foo/bar");
let got = super::parse_excludes_file(&data).unwrap();
assert_eq!(path_string(got), super::expand_tilde("~/foo/bar"));
}
#[test]
fn parse_excludes_file3() {
let data = bytes("[core]\nexcludeFile = /foo/bar");
assert!(super::parse_excludes_file(&data).is_none());
}
// See: https://github.com/BurntSushi/ripgrep/issues/106
#[test]
fn regression_106() {
gi_from_str("/", " ");
}
}

369
ignore/src/lib.rs Normal file
View File

@@ -0,0 +1,369 @@
/*!
The ignore crate provides a fast recursive directory iterator that respects
various filters such as globs, file types and `.gitignore` files. The precise
matching rules and precedence is explained in the documentation for
`WalkBuilder`.
Secondarily, this crate exposes gitignore and file type matchers for use cases
that demand more fine-grained control.
# Example
This example shows the most basic usage of this crate. This code will
recursively traverse the current directory while automatically filtering out
files and directories according to ignore globs found in files like
`.ignore` and `.gitignore`:
```rust,no_run
use ignore::Walk;
for result in Walk::new("./") {
// Each item yielded by the iterator is either a directory entry or an
// error, so either print the path or the error.
match result {
Ok(entry) => println!("{}", entry.path().display()),
Err(err) => println!("ERROR: {}", err),
}
}
```
# Example: advanced
By default, the recursive directory iterator will ignore hidden files and
directories. This can be disabled by building the iterator with `WalkBuilder`:
```rust,no_run
use ignore::WalkBuilder;
for result in WalkBuilder::new("./").hidden(false).build() {
println!("{:?}", result);
}
```
See the documentation for `WalkBuilder` for many other options.
*/
extern crate crossbeam;
extern crate globset;
#[macro_use]
extern crate lazy_static;
#[macro_use]
extern crate log;
extern crate memchr;
extern crate regex;
#[cfg(test)]
extern crate tempdir;
extern crate thread_local;
extern crate walkdir;
use std::error;
use std::fmt;
use std::io;
use std::path::{Path, PathBuf};
pub use walk::{DirEntry, Walk, WalkBuilder, WalkParallel, WalkState};
mod dir;
pub mod gitignore;
mod pathutil;
pub mod overrides;
pub mod types;
mod walk;
/// Represents an error that can occur when parsing a gitignore file.
#[derive(Debug)]
pub enum Error {
/// A collection of "soft" errors. These occur when adding an ignore
/// file partially succeeded.
Partial(Vec<Error>),
/// An error associated with a specific line number.
WithLineNumber { line: u64, err: Box<Error> },
/// An error associated with a particular file path.
WithPath { path: PathBuf, err: Box<Error> },
/// An error associated with a particular directory depth when recursively
/// walking a directory.
WithDepth { depth: usize, err: Box<Error> },
/// An error that occurs when a file loop is detected when traversing
/// symbolic links.
Loop { ancestor: PathBuf, child: PathBuf },
/// An error that occurs when doing I/O, such as reading an ignore file.
Io(io::Error),
/// An error that occurs when trying to parse a glob.
Glob(String),
/// A type selection for a file type that is not defined.
UnrecognizedFileType(String),
/// A user specified file type definition could not be parsed.
InvalidDefinition,
}
impl Error {
/// Returns true if this is a partial error.
///
/// A partial error occurs when only some operations failed while others
/// may have succeeded. For example, an ignore file may contain an invalid
/// glob among otherwise valid globs.
pub fn is_partial(&self) -> bool {
match *self {
Error::Partial(_) => true,
Error::WithLineNumber { ref err, .. } => err.is_partial(),
Error::WithPath { ref err, .. } => err.is_partial(),
Error::WithDepth { ref err, .. } => err.is_partial(),
_ => false,
}
}
/// Returns true if this error is exclusively an I/O error.
pub fn is_io(&self) -> bool {
match *self {
Error::Partial(ref errs) => errs.len() == 1 && errs[0].is_io(),
Error::WithLineNumber { ref err, .. } => err.is_io(),
Error::WithPath { ref err, .. } => err.is_io(),
Error::WithDepth { ref err, .. } => err.is_io(),
Error::Loop { .. } => false,
Error::Io(_) => true,
Error::Glob(_) => false,
Error::UnrecognizedFileType(_) => false,
Error::InvalidDefinition => false,
}
}
/// Returns a depth associated with recursively walking a directory (if
/// this error was generated from a recursive directory iterator).
pub fn depth(&self) -> Option<usize> {
match *self {
Error::WithPath { ref err, .. } => err.depth(),
Error::WithDepth { depth, .. } => Some(depth),
_ => None,
}
}
/// Turn an error into a tagged error with the given file path.
fn with_path<P: AsRef<Path>>(self, path: P) -> Error {
Error::WithPath {
path: path.as_ref().to_path_buf(),
err: Box::new(self),
}
}
/// Turn an error into a tagged error with the given depth.
fn with_depth(self, depth: usize) -> Error {
Error::WithDepth {
depth: depth,
err: Box::new(self),
}
}
/// Turn an error into a tagged error with the given file path and line
/// number. If path is empty, then it is omitted from the error.
fn tagged<P: AsRef<Path>>(self, path: P, lineno: u64) -> Error {
let errline = Error::WithLineNumber {
line: lineno,
err: Box::new(self),
};
if path.as_ref().as_os_str().is_empty() {
return errline;
}
errline.with_path(path)
}
}
impl error::Error for Error {
fn description(&self) -> &str {
match *self {
Error::Partial(_) => "partial error",
Error::WithLineNumber { ref err, .. } => err.description(),
Error::WithPath { ref err, .. } => err.description(),
Error::WithDepth { ref err, .. } => err.description(),
Error::Loop { .. } => "file system loop found",
Error::Io(ref err) => err.description(),
Error::Glob(ref msg) => msg,
Error::UnrecognizedFileType(_) => "unrecognized file type",
Error::InvalidDefinition => "invalid definition",
}
}
}
impl fmt::Display for Error {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
match *self {
Error::Partial(ref errs) => {
let msgs: Vec<String> =
errs.iter().map(|err| err.to_string()).collect();
write!(f, "{}", msgs.join("\n"))
}
Error::WithLineNumber { line, ref err } => {
write!(f, "line {}: {}", line, err)
}
Error::WithPath { ref path, ref err } => {
write!(f, "{}: {}", path.display(), err)
}
Error::WithDepth { ref err, .. } => err.fmt(f),
Error::Loop { ref ancestor, ref child } => {
write!(f, "File system loop found: \
{} points to an ancestor {}",
child.display(), ancestor.display())
}
Error::Io(ref err) => err.fmt(f),
Error::Glob(ref msg) => write!(f, "{}", msg),
Error::UnrecognizedFileType(ref ty) => {
write!(f, "unrecognized file type: {}", ty)
}
Error::InvalidDefinition => {
write!(f, "invalid definition (format is type:glob, e.g., \
html:*.html)")
}
}
}
}
impl From<io::Error> for Error {
fn from(err: io::Error) -> Error {
Error::Io(err)
}
}
impl From<walkdir::Error> for Error {
fn from(err: walkdir::Error) -> Error {
let depth = err.depth();
if let (Some(anc), Some(child)) = (err.loop_ancestor(), err.path()) {
return Error::WithDepth {
depth: depth,
err: Box::new(Error::Loop {
ancestor: anc.to_path_buf(),
child: child.to_path_buf(),
}),
};
}
let path = err.path().map(|p| p.to_path_buf());
let mut ig_err = Error::Io(io::Error::from(err));
if let Some(path) = path {
ig_err = Error::WithPath {
path: path,
err: Box::new(ig_err),
};
}
ig_err
}
}
#[derive(Debug, Default)]
struct PartialErrorBuilder(Vec<Error>);
impl PartialErrorBuilder {
fn push(&mut self, err: Error) {
self.0.push(err);
}
fn push_ignore_io(&mut self, err: Error) {
if !err.is_io() {
self.push(err);
}
}
fn maybe_push(&mut self, err: Option<Error>) {
if let Some(err) = err {
self.push(err);
}
}
fn maybe_push_ignore_io(&mut self, err: Option<Error>) {
if let Some(err) = err {
self.push_ignore_io(err);
}
}
fn into_error_option(mut self) -> Option<Error> {
if self.0.is_empty() {
None
} else if self.0.len() == 1 {
Some(self.0.pop().unwrap())
} else {
Some(Error::Partial(self.0))
}
}
}
/// The result of a glob match.
///
/// The type parameter `T` typically refers to a type that provides more
/// information about a particular match. For example, it might identify
/// the specific gitignore file and the specific glob pattern that caused
/// the match.
#[derive(Clone, Debug)]
pub enum Match<T> {
/// The path didn't match any glob.
None,
/// The highest precedent glob matched indicates the path should be
/// ignored.
Ignore(T),
/// The highest precedent glob matched indicates the path should be
/// whitelisted.
Whitelist(T),
}
impl<T> Match<T> {
/// Returns true if the match result didn't match any globs.
pub fn is_none(&self) -> bool {
match *self {
Match::None => true,
Match::Ignore(_) | Match::Whitelist(_) => false,
}
}
/// Returns true if the match result implies the path should be ignored.
pub fn is_ignore(&self) -> bool {
match *self {
Match::Ignore(_) => true,
Match::None | Match::Whitelist(_) => false,
}
}
/// Returns true if the match result implies the path should be
/// whitelisted.
pub fn is_whitelist(&self) -> bool {
match *self {
Match::Whitelist(_) => true,
Match::None | Match::Ignore(_) => false,
}
}
/// Inverts the match so that `Ignore` becomes `Whitelist` and
/// `Whitelist` becomes `Ignore`. A non-match remains the same.
pub fn invert(self) -> Match<T> {
match self {
Match::None => Match::None,
Match::Ignore(t) => Match::Whitelist(t),
Match::Whitelist(t) => Match::Ignore(t),
}
}
/// Return the value inside this match if it exists.
pub fn inner(&self) -> Option<&T> {
match *self {
Match::None => None,
Match::Ignore(ref t) => Some(t),
Match::Whitelist(ref t) => Some(t),
}
}
/// Apply the given function to the value inside this match.
///
/// If the match has no value, then return the match unchanged.
pub fn map<U, F: FnOnce(T) -> U>(self, f: F) -> Match<U> {
match self {
Match::None => Match::None,
Match::Ignore(t) => Match::Ignore(f(t)),
Match::Whitelist(t) => Match::Whitelist(f(t)),
}
}
/// Return the match if it is not none. Otherwise, return other.
pub fn or(self, other: Self) -> Self {
if self.is_none() {
other
} else {
self
}
}
}

217
ignore/src/overrides.rs Normal file
View File

@@ -0,0 +1,217 @@
/*!
The overrides module provides a way to specify a set of override globs.
This provides functionality similar to `--include` or `--exclude` in command
line tools.
*/
use std::path::Path;
use gitignore::{self, Gitignore, GitignoreBuilder};
use {Error, Match};
/// Glob represents a single glob in an override matcher.
///
/// This is used to report information about the highest precedent glob
/// that matched.
///
/// Note that not all matches necessarily correspond to a specific glob. For
/// example, if there are one or more whitelist globs and a file path doesn't
/// match any glob in the set, then the file path is considered to be ignored.
///
/// The lifetime `'a` refers to the lifetime of the matcher that produced
/// this glob.
#[derive(Clone, Debug)]
pub struct Glob<'a>(GlobInner<'a>);
#[derive(Clone, Debug)]
enum GlobInner<'a> {
/// No glob matched, but the file path should still be ignored.
UnmatchedIgnore,
/// A glob matched.
Matched(&'a gitignore::Glob),
}
impl<'a> Glob<'a> {
fn unmatched() -> Glob<'a> {
Glob(GlobInner::UnmatchedIgnore)
}
}
/// Manages a set of overrides provided explicitly by the end user.
#[derive(Clone, Debug)]
pub struct Override(Gitignore);
impl Override {
/// Returns an empty matcher that never matches any file path.
pub fn empty() -> Override {
Override(Gitignore::empty())
}
/// Returns the directory of this override set.
///
/// All matches are done relative to this path.
pub fn path(&self) -> &Path {
self.0.path()
}
/// Returns true if and only if this matcher is empty.
///
/// When a matcher is empty, it will never match any file path.
pub fn is_empty(&self) -> bool {
self.0.is_empty()
}
/// Returns the total number of ignore globs.
pub fn num_ignores(&self) -> u64 {
self.0.num_whitelists()
}
/// Returns the total number of whitelisted globs.
pub fn num_whitelists(&self) -> u64 {
self.0.num_ignores()
}
/// Returns whether the given file path matched a pattern in this override
/// matcher.
///
/// `is_dir` should be true if the path refers to a directory and false
/// otherwise.
///
/// If there are no overrides, then this always returns `Match::None`.
///
/// If there is at least one whitelist override and `is_dir` is false, then
/// this never returns `Match::None`, since non-matches are interpreted as
/// ignored.
///
/// The given path is matched to the globs relative to the path given
/// when building the override matcher. Specifically, before matching
/// `path`, its prefix (as determined by a common suffix of the directory
/// given) is stripped. If there is no common suffix/prefix overlap, then
/// `path` is assumed to reside in the same directory as the root path for
/// this set of overrides.
pub fn matched<'a, P: AsRef<Path>>(
&'a self,
path: P,
is_dir: bool,
) -> Match<Glob<'a>> {
if self.is_empty() {
return Match::None;
}
let mat = self.0.matched(path, is_dir).invert();
if mat.is_none() && self.num_whitelists() > 0 && !is_dir {
return Match::Ignore(Glob::unmatched());
}
mat.map(move |giglob| Glob(GlobInner::Matched(giglob)))
}
}
/// Builds a matcher for a set of glob overrides.
pub struct OverrideBuilder {
builder: GitignoreBuilder,
}
impl OverrideBuilder {
/// Create a new override builder.
///
/// Matching is done relative to the directory path provided.
pub fn new<P: AsRef<Path>>(path: P) -> OverrideBuilder {
OverrideBuilder {
builder: GitignoreBuilder::new(path),
}
}
/// Builds a new override matcher from the globs added so far.
///
/// Once a matcher is built, no new globs can be added to it.
pub fn build(&self) -> Result<Override, Error> {
Ok(Override(try!(self.builder.build())))
}
/// Add a glob to the set of overrides.
///
/// Globs provided here have precisely the same semantics as a single
/// line in a `gitignore` file, where the meaning of `!` is inverted:
/// namely, `!` at the beginning of a glob will ignore a file. Without `!`,
/// all matches of the glob provided are treated as whitelist matches.
pub fn add(&mut self, glob: &str) -> Result<&mut OverrideBuilder, Error> {
try!(self.builder.add_line(None, glob));
Ok(self)
}
}
#[cfg(test)]
mod tests {
use super::{Override, OverrideBuilder};
const ROOT: &'static str = "/home/andrew/foo";
fn ov(globs: &[&str]) -> Override {
let mut builder = OverrideBuilder::new(ROOT);
for glob in globs {
builder.add(glob).unwrap();
}
builder.build().unwrap()
}
#[test]
fn empty() {
let ov = ov(&[]);
assert!(ov.matched("a.foo", false).is_none());
assert!(ov.matched("a", false).is_none());
assert!(ov.matched("", false).is_none());
}
#[test]
fn simple() {
let ov = ov(&["*.foo", "!*.bar"]);
assert!(ov.matched("a.foo", false).is_whitelist());
assert!(ov.matched("a.foo", true).is_whitelist());
assert!(ov.matched("a.rs", false).is_ignore());
assert!(ov.matched("a.rs", true).is_none());
assert!(ov.matched("a.bar", false).is_ignore());
assert!(ov.matched("a.bar", true).is_ignore());
}
#[test]
fn only_ignores() {
let ov = ov(&["!*.bar"]);
assert!(ov.matched("a.rs", false).is_none());
assert!(ov.matched("a.rs", true).is_none());
assert!(ov.matched("a.bar", false).is_ignore());
assert!(ov.matched("a.bar", true).is_ignore());
}
#[test]
fn precedence() {
let ov = ov(&["*.foo", "!*.bar.foo"]);
assert!(ov.matched("a.foo", false).is_whitelist());
assert!(ov.matched("a.baz", false).is_ignore());
assert!(ov.matched("a.bar.foo", false).is_ignore());
}
#[test]
fn gitignore() {
let ov = ov(&["/foo", "bar/*.rs", "baz/**"]);
assert!(ov.matched("bar/wat/lib.rs", false).is_ignore());
assert!(ov.matched("wat/bar/lib.rs", false).is_whitelist());
assert!(ov.matched("foo", false).is_whitelist());
assert!(ov.matched("wat/foo", false).is_ignore());
assert!(ov.matched("baz", false).is_ignore());
assert!(ov.matched("baz/a", false).is_whitelist());
assert!(ov.matched("baz/a/b", false).is_whitelist());
}
#[test]
fn allow_directories() {
// This tests that directories are NOT ignored when they are unmatched.
let ov = ov(&["*.rs"]);
assert!(ov.matched("foo.rs", false).is_whitelist());
assert!(ov.matched("foo.c", false).is_ignore());
assert!(ov.matched("foo", false).is_ignore());
assert!(ov.matched("foo", true).is_none());
assert!(ov.matched("src/foo.rs", false).is_whitelist());
assert!(ov.matched("src/foo.c", false).is_ignore());
assert!(ov.matched("src/foo", false).is_ignore());
assert!(ov.matched("src/foo", true).is_none());
}
}

108
ignore/src/pathutil.rs Normal file
View File

@@ -0,0 +1,108 @@
use std::ffi::OsStr;
use std::path::Path;
/// Returns true if and only if this file path is considered to be hidden.
#[cfg(unix)]
pub fn is_hidden<P: AsRef<Path>>(path: P) -> bool {
use std::os::unix::ffi::OsStrExt;
if let Some(name) = file_name(path.as_ref()) {
name.as_bytes().get(0) == Some(&b'.')
} else {
false
}
}
/// Returns true if and only if this file path is considered to be hidden.
#[cfg(not(unix))]
pub fn is_hidden<P: AsRef<Path>>(path: P) -> bool {
if let Some(name) = file_name(path.as_ref()) {
name.to_str().map(|s| s.starts_with(".")).unwrap_or(false)
} else {
false
}
}
/// Strip `prefix` from the `path` and return the remainder.
///
/// If `path` doesn't have a prefix `prefix`, then return `None`.
#[cfg(unix)]
pub fn strip_prefix<'a, P: AsRef<Path> + ?Sized>(
prefix: &'a P,
path: &'a Path,
) -> Option<&'a Path> {
use std::os::unix::ffi::OsStrExt;
let prefix = prefix.as_ref().as_os_str().as_bytes();
let path = path.as_os_str().as_bytes();
if prefix.len() > path.len() || prefix != &path[0..prefix.len()] {
None
} else {
Some(&Path::new(OsStr::from_bytes(&path[prefix.len()..])))
}
}
/// Strip `prefix` from the `path` and return the remainder.
///
/// If `path` doesn't have a prefix `prefix`, then return `None`.
#[cfg(not(unix))]
pub fn strip_prefix<'a, P: AsRef<Path> + ?Sized>(
prefix: &'a P,
path: &'a Path,
) -> Option<&'a Path> {
path.strip_prefix(prefix).ok()
}
/// Returns true if this file path is just a file name. i.e., Its parent is
/// the empty string.
#[cfg(unix)]
pub fn is_file_name<P: AsRef<Path>>(path: P) -> bool {
use std::os::unix::ffi::OsStrExt;
use memchr::memchr;
let path = path.as_ref().as_os_str().as_bytes();
memchr(b'/', path).is_none()
}
/// Returns true if this file path is just a file name. i.e., Its parent is
/// the empty string.
#[cfg(not(unix))]
pub fn is_file_name<P: AsRef<Path>>(path: P) -> bool {
path.as_ref().parent().map(|p| p.as_os_str().is_empty()).unwrap_or(false)
}
/// The final component of the path, if it is a normal file.
///
/// If the path terminates in ., .., or consists solely of a root of prefix,
/// file_name will return None.
#[cfg(unix)]
pub fn file_name<'a, P: AsRef<Path> + ?Sized>(
path: &'a P,
) -> Option<&'a OsStr> {
use std::os::unix::ffi::OsStrExt;
use memchr::memrchr;
let path = path.as_ref().as_os_str().as_bytes();
if path.is_empty() {
return None;
} else if path.len() == 1 && path[0] == b'.' {
return None;
} else if path.last() == Some(&b'.') {
return None;
} else if path.len() >= 2 && &path[path.len() - 2..] == &b".."[..] {
return None;
}
let last_slash = memrchr(b'/', path).map(|i| i + 1).unwrap_or(0);
Some(OsStr::from_bytes(&path[last_slash..]))
}
/// The final component of the path, if it is a normal file.
///
/// If the path terminates in ., .., or consists solely of a root of prefix,
/// file_name will return None.
#[cfg(not(unix))]
pub fn file_name<'a, P: AsRef<Path> + ?Sized>(
path: &'a P,
) -> Option<&'a OsStr> {
path.as_ref().file_name()
}

582
ignore/src/types.rs Normal file
View File

@@ -0,0 +1,582 @@
/*!
The types module provides a way of associating globs on file names to file
types.
This can be used to match specific types of files. For example, among
the default file types provided, the Rust file type is defined to be `*.rs`
with name `rust`. Similarly, the C file type is defined to be `*.{c,h}` with
name `c`.
Note that the set of default types may change over time.
# Example
This shows how to create and use a simple file type matcher using the default
file types defined in this crate.
```
use ignore::types::TypesBuilder;
let mut builder = TypesBuilder::new();
builder.add_defaults();
builder.select("rust");
let matcher = builder.build().unwrap();
assert!(matcher.matched("foo.rs", false).is_whitelist());
assert!(matcher.matched("foo.c", false).is_ignore());
```
# Example: negation
This is like the previous example, but shows how negating a file type works.
That is, this will let us match file paths that *don't* correspond to a
particular file type.
```
use ignore::types::TypesBuilder;
let mut builder = TypesBuilder::new();
builder.add_defaults();
builder.negate("c");
let matcher = builder.build().unwrap();
assert!(matcher.matched("foo.rs", false).is_none());
assert!(matcher.matched("foo.c", false).is_ignore());
```
# Example: custom file type definitions
This shows how to extend this library default file type definitions with
your own.
```
use ignore::types::TypesBuilder;
let mut builder = TypesBuilder::new();
builder.add_defaults();
builder.add("foo", "*.foo");
// Another way of adding a file type definition.
// This is useful when accepting input from an end user.
builder.add_def("bar:*.bar");
// Note: we only select `foo`, not `bar`.
builder.select("foo");
let matcher = builder.build().unwrap();
assert!(matcher.matched("x.foo", false).is_whitelist());
// This is ignored because we only selected the `foo` file type.
assert!(matcher.matched("x.bar", false).is_ignore());
```
*/
use std::cell::RefCell;
use std::collections::HashMap;
use std::path::Path;
use std::sync::Arc;
use globset::{GlobBuilder, GlobSet, GlobSetBuilder};
use thread_local::ThreadLocal;
use pathutil::file_name;
use {Error, Match};
const DEFAULT_TYPES: &'static [(&'static str, &'static [&'static str])] = &[
("agda", &["*.agda", "*.lagda"]),
("asciidoc", &["*.adoc", "*.asc", "*.asciidoc"]),
("asm", &["*.asm", "*.s", "*.S"]),
("awk", &["*.awk"]),
("c", &["*.c", "*.h", "*.H"]),
("cbor", &["*.cbor"]),
("clojure", &["*.clj", "*.cljc", "*.cljs", "*.cljx"]),
("cmake", &["*.cmake", "CMakeLists.txt"]),
("coffeescript", &["*.coffee"]),
("creole", &["*.creole"]),
("config", &["*.config"]),
("cpp", &[
"*.C", "*.cc", "*.cpp", "*.cxx",
"*.h", "*.H", "*.hh", "*.hpp",
]),
("cs", &["*.cs"]),
("csharp", &["*.cs"]),
("css", &["*.css"]),
("cython", &["*.pyx"]),
("dart", &["*.dart"]),
("d", &["*.d"]),
("elisp", &["*.el"]),
("elixir", &["*.ex", "*.exs"]),
("erlang", &["*.erl", "*.hrl"]),
("fish", &["*.fish"]),
("fortran", &[
"*.f", "*.F", "*.f77", "*.F77", "*.pfo",
"*.f90", "*.F90", "*.f95", "*.F95",
]),
("fsharp", &["*.fs", "*.fsx", "*.fsi"]),
("go", &["*.go"]),
("groovy", &["*.groovy", "*.gradle"]),
("h", &["*.h", "*.hpp"]),
("hbs", &["*.hbs"]),
("haskell", &["*.hs", "*.lhs"]),
("html", &["*.htm", "*.html"]),
("java", &["*.java"]),
("jinja", &["*.jinja", "*.jinja2"]),
("js", &[
"*.js", "*.jsx", "*.vue",
]),
("json", &["*.json"]),
("jsonl", &["*.jsonl"]),
("lisp", &["*.el", "*.jl", "*.lisp", "*.lsp", "*.sc", "*.scm"]),
("lua", &["*.lua"]),
("m4", &["*.ac", "*.m4"]),
("make", &["gnumakefile", "Gnumakefile", "makefile", "Makefile", "*.mk", "*.mak"]),
("markdown", &["*.markdown", "*.md", "*.mdown", "*.mkdn"]),
("md", &["*.markdown", "*.md", "*.mdown", "*.mkdn"]),
("matlab", &["*.m"]),
("mk", &["mkfile"]),
("ml", &["*.ml"]),
("nim", &["*.nim"]),
("objc", &["*.h", "*.m"]),
("objcpp", &["*.h", "*.mm"]),
("ocaml", &["*.ml", "*.mli", "*.mll", "*.mly"]),
("org", &["*.org"]),
("perl", &["*.perl", "*.pl", "*.PL", "*.plh", "*.plx", "*.pm"]),
("pdf", &["*.pdf"]),
("php", &["*.php", "*.php3", "*.php4", "*.php5", "*.phtml"]),
("pod", &["*.pod"]),
("py", &["*.py"]),
("readme", &["README*", "*README"]),
("r", &["*.R", "*.r", "*.Rmd", "*.Rnw"]),
("rdoc", &["*.rdoc"]),
("rst", &["*.rst"]),
("ruby", &["*.rb"]),
("rust", &["*.rs"]),
("scala", &["*.scala"]),
("sh", &["*.bash", "*.csh", "*.ksh", "*.sh", "*.tcsh"]),
("spark", &["*.spark"]),
("sql", &["*.sql"]),
("sv", &["*.v", "*.vg", "*.sv", "*.svh", "*.h"]),
("swift", &["*.swift"]),
("taskpaper", &["*.taskpaper"]),
("tcl", &["*.tcl"]),
("tex", &["*.tex", "*.ltx", "*.cls", "*.sty", "*.bib"]),
("textile", &["*.textile"]),
("ts", &["*.ts", "*.tsx"]),
("txt", &["*.txt"]),
("toml", &["*.toml", "Cargo.lock"]),
("vala", &["*.vala"]),
("vb", &["*.vb"]),
("vimscript", &["*.vim"]),
("wiki", &["*.mediawiki", "*.wiki"]),
("xml", &["*.xml"]),
("yacc", &["*.y"]),
("yaml", &["*.yaml", "*.yml"]),
];
/// Glob represents a single glob in a set of file type definitions.
///
/// There may be more than one glob for a particular file type.
///
/// This is used to report information about the highest precedent glob
/// that matched.
///
/// Note that not all matches necessarily correspond to a specific glob.
/// For example, if there are one or more selections and a file path doesn't
/// match any of those selections, then the file path is considered to be
/// ignored.
///
/// The lifetime `'a` refers to the lifetime of the underlying file type
/// definition, which corresponds to the lifetime of the file type matcher.
#[derive(Clone, Debug)]
pub struct Glob<'a>(GlobInner<'a>);
#[derive(Clone, Debug)]
enum GlobInner<'a> {
/// No glob matched, but the file path should still be ignored.
UnmatchedIgnore,
/// A glob matched.
Matched {
/// The file type definition which provided the glob.
def: &'a FileTypeDef,
/// The index of the glob that matched inside the file type definition.
which: usize,
/// Whether the selection was negated or not.
negated: bool,
}
}
impl<'a> Glob<'a> {
fn unmatched() -> Glob<'a> {
Glob(GlobInner::UnmatchedIgnore)
}
}
/// A single file type definition.
///
/// File type definitions can be retrieved in aggregate from a file type
/// matcher. File type definitions are also reported when its responsible
/// for a match.
#[derive(Clone, Debug)]
pub struct FileTypeDef {
name: String,
globs: Vec<String>,
}
impl FileTypeDef {
/// Return the name of this file type.
pub fn name(&self) -> &str {
&self.name
}
/// Return the globs used to recognize this file type.
pub fn globs(&self) -> &[String] {
&self.globs
}
}
/// Types is a file type matcher.
#[derive(Clone, Debug)]
pub struct Types {
/// All of the file type definitions, sorted lexicographically by name.
defs: Vec<FileTypeDef>,
/// All of the selections made by the user.
selections: Vec<Selection<FileTypeDef>>,
/// Whether there is at least one Selection::Select in our selections.
/// When this is true, a Match::None is converted to Match::Ignore.
has_selected: bool,
/// A mapping from glob index in the set to two indices. The first is an
/// index into `selections` and the second is an index into the
/// corresponding file type definition's list of globs.
glob_to_selection: Vec<(usize, usize)>,
/// The set of all glob selections, used for actual matching.
set: GlobSet,
/// Temporary storage for globs that match.
matches: Arc<ThreadLocal<RefCell<Vec<usize>>>>,
}
/// Indicates the type of a selection for a particular file type.
#[derive(Clone, Debug)]
enum Selection<T> {
Select(String, T),
Negate(String, T),
}
impl<T> Selection<T> {
fn is_negated(&self) -> bool {
match *self {
Selection::Select(..) => false,
Selection::Negate(..) => true,
}
}
fn name(&self) -> &str {
match *self {
Selection::Select(ref name, _) => name,
Selection::Negate(ref name, _) => name,
}
}
fn map<U, F: FnOnce(T) -> U>(self, f: F) -> Selection<U> {
match self {
Selection::Select(name, inner) => {
Selection::Select(name, f(inner))
}
Selection::Negate(name, inner) => {
Selection::Negate(name, f(inner))
}
}
}
fn inner(&self) -> &T {
match *self {
Selection::Select(_, ref inner) => inner,
Selection::Negate(_, ref inner) => inner,
}
}
}
impl Types {
/// Creates a new file type matcher that never matches any path and
/// contains no file type definitions.
pub fn empty() -> Types {
Types {
defs: vec![],
selections: vec![],
has_selected: false,
glob_to_selection: vec![],
set: GlobSetBuilder::new().build().unwrap(),
matches: Arc::new(ThreadLocal::default()),
}
}
/// Returns true if and only if this matcher has zero selections.
pub fn is_empty(&self) -> bool {
self.selections.is_empty()
}
/// Returns the number of selections used in this matcher.
pub fn len(&self) -> usize {
self.selections.len()
}
/// Return the set of current file type definitions.
///
/// Definitions and globs are sorted.
pub fn definitions(&self) -> &[FileTypeDef] {
&self.defs
}
/// Returns a match for the given path against this file type matcher.
///
/// The path is considered whitelisted if it matches a selected file type.
/// The path is considered ignored if it matches a negated file type.
/// If at least one file type is selected and `path` doesn't match, then
/// the path is also considered ignored.
pub fn matched<'a, P: AsRef<Path>>(
&'a self,
path: P,
is_dir: bool,
) -> Match<Glob<'a>> {
// File types don't apply to directories, and we can't do anything
// if our glob set is empty.
if is_dir || self.set.is_empty() {
return Match::None;
}
// We only want to match against the file name, so extract it.
// If one doesn't exist, then we can't match it.
let name = match file_name(path.as_ref()) {
Some(name) => name,
None if self.has_selected => {
return Match::Ignore(Glob::unmatched());
}
None => {
return Match::None;
}
};
let mut matches = self.matches.get_default().borrow_mut();
self.set.matches_into(name, &mut *matches);
// The highest precedent match is the last one.
if let Some(&i) = matches.last() {
let (isel, iglob) = self.glob_to_selection[i];
let sel = &self.selections[isel];
let glob = Glob(GlobInner::Matched {
def: sel.inner(),
which: iglob,
negated: sel.is_negated(),
});
return if sel.is_negated() {
Match::Ignore(glob)
} else {
Match::Whitelist(glob)
};
}
if self.has_selected {
Match::Ignore(Glob::unmatched())
} else {
Match::None
}
}
}
/// TypesBuilder builds a type matcher from a set of file type definitions and
/// a set of file type selections.
pub struct TypesBuilder {
types: HashMap<String, FileTypeDef>,
selections: Vec<Selection<()>>,
}
impl TypesBuilder {
/// Create a new builder for a file type matcher.
///
/// The builder contains *no* type definitions to start with. A set
/// of default type definitions can be added with `add_defaults`, and
/// additional type definitions can be added with `select` and `negate`.
pub fn new() -> TypesBuilder {
TypesBuilder {
types: HashMap::new(),
selections: vec![],
}
}
/// Build the current set of file type definitions *and* selections into
/// a file type matcher.
pub fn build(&self) -> Result<Types, Error> {
let defs = self.definitions();
let has_selected = self.selections.iter().any(|s| !s.is_negated());
let mut selections = vec![];
let mut glob_to_selection = vec![];
let mut build_set = GlobSetBuilder::new();
for (isel, selection) in self.selections.iter().enumerate() {
let def = match self.types.get(selection.name()) {
Some(def) => def.clone(),
None => {
let name = selection.name().to_string();
return Err(Error::UnrecognizedFileType(name));
}
};
for (iglob, glob) in def.globs.iter().enumerate() {
build_set.add(try!(
GlobBuilder::new(glob)
.literal_separator(true)
.build()
.map_err(|err| Error::Glob(err.to_string()))));
glob_to_selection.push((isel, iglob));
}
selections.push(selection.clone().map(move |_| def));
}
let set = try!(build_set.build().map_err(|err| {
Error::Glob(err.to_string())
}));
Ok(Types {
defs: defs,
selections: selections,
has_selected: has_selected,
glob_to_selection: glob_to_selection,
set: set,
matches: Arc::new(ThreadLocal::default()),
})
}
/// Return the set of current file type definitions.
///
/// Definitions and globs are sorted.
pub fn definitions(&self) -> Vec<FileTypeDef> {
let mut defs = vec![];
for def in self.types.values() {
let mut def = def.clone();
def.globs.sort();
defs.push(def);
}
defs.sort_by(|def1, def2| def1.name().cmp(def2.name()));
defs
}
/// Select the file type given by `name`.
///
/// If `name` is `all`, then all file types currently defined are selected.
pub fn select(&mut self, name: &str) -> &mut TypesBuilder {
if name == "all" {
for name in self.types.keys() {
self.selections.push(Selection::Select(name.to_string(), ()));
}
} else {
self.selections.push(Selection::Select(name.to_string(), ()));
}
self
}
/// Ignore the file type given by `name`.
///
/// If `name` is `all`, then all file types currently defined are negated.
pub fn negate(&mut self, name: &str) -> &mut TypesBuilder {
if name == "all" {
for name in self.types.keys() {
self.selections.push(Selection::Negate(name.to_string(), ()));
}
} else {
self.selections.push(Selection::Negate(name.to_string(), ()));
}
self
}
/// Clear any file type definitions for the type name given.
pub fn clear(&mut self, name: &str) -> &mut TypesBuilder {
self.types.remove(name);
self
}
/// Add a new file type definition. `name` can be arbitrary and `pat`
/// should be a glob recognizing file paths belonging to the `name` type.
///
/// If `name` is `all` or otherwise contains a `:`, then an error is
/// returned.
pub fn add(&mut self, name: &str, glob: &str) -> Result<(), Error> {
if name == "all" || name.contains(':') {
return Err(Error::InvalidDefinition);
}
let (key, glob) = (name.to_string(), glob.to_string());
self.types.entry(key).or_insert_with(|| {
FileTypeDef { name: name.to_string(), globs: vec![] }
}).globs.push(glob);
Ok(())
}
/// Add a new file type definition specified in string form. The format
/// is `name:glob`. Names may not include a colon.
pub fn add_def(&mut self, def: &str) -> Result<(), Error> {
let name: String = def.chars().take_while(|&c| c != ':').collect();
let pat: String = def.chars().skip(name.chars().count() + 1).collect();
if name.is_empty() || pat.is_empty() {
return Err(Error::InvalidDefinition);
}
self.add(&name, &pat)
}
/// Add a set of default file type definitions.
pub fn add_defaults(&mut self) -> &mut TypesBuilder {
static MSG: &'static str = "adding a default type should never fail";
for &(name, exts) in DEFAULT_TYPES {
for ext in exts {
self.add(name, ext).expect(MSG);
}
}
self
}
}
#[cfg(test)]
mod tests {
use super::TypesBuilder;
macro_rules! matched {
($name:ident, $types:expr, $sel:expr, $selnot:expr,
$path:expr) => {
matched!($name, $types, $sel, $selnot, $path, true);
};
(not, $name:ident, $types:expr, $sel:expr, $selnot:expr,
$path:expr) => {
matched!($name, $types, $sel, $selnot, $path, false);
};
($name:ident, $types:expr, $sel:expr, $selnot:expr,
$path:expr, $matched:expr) => {
#[test]
fn $name() {
let mut btypes = TypesBuilder::new();
for tydef in $types {
btypes.add_def(tydef).unwrap();
}
for sel in $sel {
btypes.select(sel);
}
for selnot in $selnot {
btypes.negate(selnot);
}
let types = btypes.build().unwrap();
let mat = types.matched($path, false);
assert_eq!($matched, !mat.is_ignore());
}
};
}
fn types() -> Vec<&'static str> {
vec![
"html:*.html",
"html:*.htm",
"rust:*.rs",
"js:*.js",
"foo:*.{rs,foo}",
]
}
matched!(match1, types(), vec!["rust"], vec![], "lib.rs");
matched!(match2, types(), vec!["html"], vec![], "index.html");
matched!(match3, types(), vec!["html"], vec![], "index.htm");
matched!(match4, types(), vec!["html", "rust"], vec![], "main.rs");
matched!(match5, types(), vec![], vec![], "index.html");
matched!(match6, types(), vec![], vec!["rust"], "index.html");
matched!(match7, types(), vec!["foo"], vec!["rust"], "main.foo");
matched!(not, matchnot1, types(), vec!["rust"], vec![], "index.html");
matched!(not, matchnot2, types(), vec![], vec!["rust"], "main.rs");
matched!(not, matchnot3, types(), vec!["foo"], vec!["rust"], "main.rs");
matched!(not, matchnot4, types(), vec!["rust"], vec!["foo"], "main.rs");
matched!(not, matchnot5, types(), vec!["rust"], vec!["foo"], "main.foo");
}

1378
ignore/src/walk.rs Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -1,7 +1,7 @@
# Contributor: Andrew Gallant <jamslam@gmail.com>
# Maintainer: Andrew Gallant
pkgname=ripgrep
pkgver=0.1.16
pkgver=0.2.3
pkgrel=1
pkgdesc="A search tool that combines the usability of The Silver Searcher with the raw speed of grep."
arch=('i686' 'x86_64')
@@ -9,7 +9,7 @@ url="https://github.com/BurntSushi/ripgrep"
license=('UNLICENSE')
makedepends=('cargo')
source=("https://github.com/BurntSushi/$pkgname/archive/$pkgver.tar.gz")
sha256sums=('6f877018742c9a7557102ccebeedb40d7c779b470a5910a7bdab50ca2ce21532')
sha256sums=('a88531558d2023df76190ea2e52bee50d739eabece8a57df29abbad0c6bdb917')
build() {
cd "$pkgname-$pkgver"
@@ -29,8 +29,9 @@ package() {
install -Dm755 "target/release/rg" "$pkgdir/usr/bin/rg"
install -Dm644 "doc/rg.1" "$pkgdir/usr/share/man/man1/rg.1"
install -Dm644 "README-NEW.md" "$pkgdir/usr/share/doc/ripgrep/README.md"
install -Dm644 "README.md" "$pkgdir/usr/share/doc/ripgrep/README.md"
install -Dm644 "COPYING" "$pkgdir/usr/share/doc/ripgrep/COPYING"
install -Dm644 "LICENSE-MIT" "$pkgdir/usr/share/doc/ripgrep/LICENSE-MIT"
install -Dm644 "UNLICENSE" "$pkgdir/usr/share/doc/ripgrep/UNLICENSE"
install -Dm644 "CHANGELOG.md" "$pkgdir/usr/share/doc/ripgrep/CHANGELOG.md"
}

14
pkg/brew/ripgrep-bin.rb Normal file
View File

@@ -0,0 +1,14 @@
class RipgrepBin < Formula
version '0.2.8'
desc "Search tool like grep and The Silver Searcher."
homepage "https://github.com/BurntSushi/ripgrep"
url "https://github.com/BurntSushi/ripgrep/releases/download/#{version}/ripgrep-#{version}-x86_64-apple-darwin.tar.gz"
sha256 "349aba7561028e869932bae8fd27cd5ce45a68f47f05d426d6701a50a8474aa0"
conflicts_with "ripgrep"
def install
bin.install "rg"
man1.install "rg.1"
end
end

View File

@@ -1,19 +0,0 @@
require 'formula'
class Ripgrep < Formula
version '0.2.0'
desc "Search tool like grep and The Silver Searcher."
homepage "https://github.com/BurntSushi/ripgrep"
if Hardware::CPU.is_64_bit?
url "https://github.com/BurntSushi/ripgrep/releases/download/#{version}/ripgrep-#{version}-x86_64-apple-darwin.tar.gz"
sha256 "f55ef5dac04178bcae0d6c5ba2d09690d326e8c7c3f28e561025b04e1ab81d80"
else
url "https://github.com/BurntSushi/ripgrep/releases/download/#{version}/ripgrep-#{version}-i686-apple-darwin.tar.gz"
sha256 "d901d55ccb48c19067f563d42652dfd8642bf50d28a40c0e2a4d3e866857a93b"
end
def install
bin.install "rg"
man1.install "rg.1"
end
end

455
src/app.rs Normal file
View File

@@ -0,0 +1,455 @@
use std::collections::HashMap;
use clap::{App, AppSettings, Arg};
const ABOUT: &'static str = "
ripgrep (rg) recursively searches your current directory for a regex pattern.
Project home page: https://github.com/BurntSushi/ripgrep
Use -h for short descriptions and --help for more details.";
const USAGE: &'static str = "
rg [OPTIONS] <pattern> [<path> ...]
rg [OPTIONS] [-e PATTERN | -f FILE ]... [<path> ...]
rg [OPTIONS] --files [<path> ...]
rg [OPTIONS] --type-list";
const TEMPLATE: &'static str = "\
{bin} {version}
{author}
{about}
USAGE:{usage}
ARGS:
{positionals}
OPTIONS:
{unified}";
/// Build a clap application with short help strings.
pub fn app_short() -> App<'static, 'static> {
app(false, |k| USAGES[k].short)
}
/// Build a clap application with long help strings.
pub fn app_long() -> App<'static, 'static> {
app(true, |k| USAGES[k].long)
}
/// Build a clap application parameterized by usage strings.
///
/// The function given should take a clap argument name and return a help
/// string. `app` will panic if a usage string is not defined.
///
/// This is an intentionally stand-alone module so that it can be used easily
/// in a `build.rs` script to build shell completion files.
fn app<F>(next_line_help: bool, doc: F) -> App<'static, 'static>
where F: Fn(&'static str) -> &'static str {
let arg = |name| {
Arg::with_name(name).help(doc(name)).next_line_help(next_line_help)
};
let flag = |name| arg(name).long(name);
App::new("ripgrep")
.author(crate_authors!())
.version(crate_version!())
.about(ABOUT)
.max_term_width(100)
.setting(AppSettings::UnifiedHelpMessage)
.usage(USAGE)
.template(TEMPLATE)
// Handle help/version manually to make their output formatting
// consistent with short/long views.
.arg(arg("help-short").short("h"))
.arg(flag("help"))
.arg(flag("version").short("V"))
// First, set up primary positional/flag arguments.
.arg(arg("pattern")
.required_unless_one(&[
"file", "files", "help-short", "help", "regexp", "type-list",
"version",
]))
.arg(arg("path").multiple(true))
.arg(flag("regexp").short("e")
.takes_value(true).multiple(true).number_of_values(1)
.value_name("pattern"))
.arg(flag("files")
// This should also conflict with `pattern`, but the first file
// path will actually be in `pattern`.
.conflicts_with_all(&["file", "regexp", "type-list"]))
.arg(flag("type-list")
.conflicts_with_all(&["file", "files", "pattern", "regexp"]))
// Second, set up common flags.
.arg(flag("text").short("a"))
.arg(flag("count").short("c"))
.arg(flag("color")
.value_name("WHEN")
.takes_value(true)
.hide_possible_values(true)
.possible_values(&["never", "auto", "always", "ansi"]))
.arg(flag("colors").value_name("SPEC")
.takes_value(true).multiple(true).number_of_values(1))
.arg(flag("fixed-strings").short("F"))
.arg(flag("glob").short("g")
.takes_value(true).multiple(true).number_of_values(1)
.value_name("GLOB"))
.arg(flag("ignore-case").short("i"))
.arg(flag("line-number").short("n"))
.arg(flag("no-line-number").short("N"))
.arg(flag("quiet").short("q"))
.arg(flag("type").short("t")
.takes_value(true).multiple(true).number_of_values(1)
.value_name("TYPE"))
.arg(flag("type-not").short("T")
.takes_value(true).multiple(true).number_of_values(1)
.value_name("TYPE"))
.arg(flag("unrestricted").short("u")
.multiple(true))
.arg(flag("invert-match").short("v"))
.arg(flag("word-regexp").short("w"))
// Third, set up less common flags.
.arg(flag("after-context").short("A")
.value_name("NUM").takes_value(true)
.validator(validate_number))
.arg(flag("before-context").short("B")
.value_name("NUM").takes_value(true)
.validator(validate_number))
.arg(flag("context").short("C")
.value_name("NUM").takes_value(true)
.validator(validate_number))
.arg(flag("column"))
.arg(flag("context-separator").value_name("ARG").takes_value(true))
.arg(flag("debug"))
.arg(flag("file").short("f")
.value_name("FILE").takes_value(true)
.multiple(true).number_of_values(1))
.arg(flag("files-with-matches").short("l"))
.arg(flag("files-without-match"))
.arg(flag("with-filename").short("H"))
.arg(flag("no-filename"))
.arg(flag("heading"))
.arg(flag("no-heading"))
.arg(flag("hidden"))
.arg(flag("ignore-file")
.value_name("FILE").takes_value(true)
.multiple(true).number_of_values(1))
.arg(flag("follow").short("L"))
.arg(flag("max-count")
.short("m").value_name("NUM").takes_value(true)
.validator(validate_number))
.arg(flag("maxdepth")
.value_name("NUM").takes_value(true)
.validator(validate_number))
.arg(flag("mmap"))
.arg(flag("no-messages"))
.arg(flag("no-mmap"))
.arg(flag("no-ignore"))
.arg(flag("no-ignore-parent"))
.arg(flag("no-ignore-vcs"))
.arg(flag("null"))
.arg(flag("pretty").short("p"))
.arg(flag("replace").short("r").value_name("ARG").takes_value(true))
.arg(flag("case-sensitive").short("s"))
.arg(flag("smart-case").short("S"))
.arg(flag("threads")
.short("j").value_name("ARG").takes_value(true)
.validator(validate_number))
.arg(flag("vimgrep"))
.arg(flag("type-add")
.value_name("TYPE").takes_value(true)
.multiple(true).number_of_values(1))
.arg(flag("type-clear")
.value_name("TYPE").takes_value(true)
.multiple(true).number_of_values(1))
}
struct Usage {
short: &'static str,
long: &'static str,
}
macro_rules! doc {
($map:expr, $name:expr, $short:expr) => {
doc!($map, $name, $short, $short)
};
($map:expr, $name:expr, $short:expr, $long:expr) => {
$map.insert($name, Usage {
short: $short,
long: concat!($long, "\n "),
});
};
}
lazy_static! {
static ref USAGES: HashMap<&'static str, Usage> = {
let mut h = HashMap::new();
doc!(h, "help-short",
"Show short help output.",
"Show short help output. Use --help to show more details.");
doc!(h, "help",
"Show verbose help output.",
"When given, more details about flags are provided.");
doc!(h, "version",
"Prints version information.");
doc!(h, "pattern",
"A regular expression used for searching.",
"A regular expression used for searching. Multiple patterns \
may be given. To match a pattern beginning with a -, use [-].");
doc!(h, "regexp",
"A regular expression used for searching.",
"A regular expression used for searching. Multiple patterns \
may be given. To match a pattern beginning with a -, use [-].");
doc!(h, "path",
"A file or directory to search.",
"A file or directory to search. Directories are searched \
recursively.");
doc!(h, "files",
"Print each file that would be searched.",
"Print each file that would be searched without actually \
performing the search. This is useful to determine whether a \
particular file is being searched or not.");
doc!(h, "type-list",
"Show all supported file types.",
"Show all supported file types and their corresponding globs.");
doc!(h, "text",
"Search binary files as if they were text.");
doc!(h, "count",
"Only show count of matches for each file.");
doc!(h, "color",
"When to use color. [default: auto]",
"When to use color in the output. The possible values are \
never, auto, always or ansi. The default is auto. When always \
is used, coloring is attempted based on your environment. When \
ansi used, coloring is forcefully done using ANSI escape color \
codes.");
doc!(h, "colors",
"Configure color settings and styles.",
"This flag specifies color settings for use in the output. \
This flag may be provided multiple times. Settings are applied \
iteratively. Colors are limited to one of eight choices: \
red, blue, green, cyan, magenta, yellow, white and black. \
Styles are limited to either nobold or bold.\n\nThe format \
of the flag is {type}:{attribute}:{value}. {type} should be \
one of path, line or match. {attribute} can be fg, bg or style. \
{value} is either a color (for fg and bg) or a text style. \
A special format, {type}:none, will clear all color settings \
for {type}.\n\nFor example, the following command will change \
the match color to magenta and the background color for line \
numbers to yellow:\n\n\
rg --colors 'match:fg:magenta' --colors 'line:bg:yellow' foo.");
doc!(h, "fixed-strings",
"Treat the pattern as a literal string.",
"Treat the pattern as a literal string instead of a regular \
expression. When this flag is used, special regular expression \
meta characters such as (){}*+. do not need to be escaped.");
doc!(h, "glob",
"Include or exclude files/directories.",
"Include or exclude files/directories for searching that \
match the given glob. This always overrides any other \
ignore logic. Multiple glob flags may be used. Globbing \
rules match .gitignore globs. Precede a glob with a ! \
to exclude it.");
doc!(h, "ignore-case",
"Case insensitive search.",
"Case insensitive search. This is overridden by \
--case-sensitive.");
doc!(h, "line-number",
"Show line numbers.",
"Show line numbers (1-based). This is enabled by default when \
searching in a tty.");
doc!(h, "no-line-number",
"Suppress line numbers.",
"Suppress line numbers. This is enabled by default when NOT \
searching in a tty.");
doc!(h, "quiet",
"Do not print anything to stdout.",
"Do not print anything to stdout. If a match is found in a file, \
stop searching. This is useful when ripgrep is used only for \
its exit code.");
doc!(h, "type",
"Only search files matching TYPE.",
"Only search files matching TYPE. Multiple type flags may be \
provided. Use the --type-list flag to list all available \
types.");
doc!(h, "type-not",
"Do not search files matching TYPE.",
"Do not search files matching TYPE. Multiple type-not flags may \
be provided. Use the --type-list flag to list all available \
types.");
doc!(h, "unrestricted",
"Reduce the level of \"smart\" searching.",
"Reduce the level of \"smart\" searching. A single -u \
won't respect .gitignore (etc.) files. Two -u flags will \
additionally search hidden files and directories. Three \
-u flags will additionally search binary files. -uu is \
roughly equivalent to grep -r and -uuu is roughly \
equivalent to grep -a -r.");
doc!(h, "invert-match",
"Invert matching.",
"Invert matching. Show lines that don't match given patterns.");
doc!(h, "word-regexp",
"Only show matches surrounded by word boundaries.",
"Only show matches surrounded by word boundaries. This is \
equivalent to putting \\b before and after all of the search \
patterns.");
doc!(h, "after-context",
"Show NUM lines after each match.");
doc!(h, "before-context",
"Show NUM lines before each match.");
doc!(h, "context",
"Show NUM lines before and after each match.");
doc!(h, "column",
"Show column numbers",
"Show column numbers (1-based). This only shows the column \
numbers for the first match on each line. This does not try \
to account for Unicode. One byte is equal to one column.");
doc!(h, "context-separator",
"Set the context separator string. [default: --]",
"The string used to separate non-contiguous context lines in the \
output. Escape sequences like \\x7F or \\t may be used. The \
default value is --.");
doc!(h, "debug",
"Show debug messages.",
"Show debug messages. Please use this when filing a bug report.");
doc!(h, "file",
"Search for patterns from the given file.",
"Search for patterns from the given file, with one pattern per \
line. When this flag is used or multiple times or in \
combination with the -e/--regexp flag, then all patterns \
provided are searched. Empty pattern lines will match all input \
lines, and the newline is not counted as part of the pattern.");
doc!(h, "files-with-matches",
"Only show the path of each file with at least one match.");
doc!(h, "files-without-match",
"Only show the path of each file that contains zero matches.");
doc!(h, "with-filename",
"Show file name for each match.",
"Prefix each match with the file name that contains it. This is \
the default when more than one file is searched.");
doc!(h, "no-filename",
"Never show the file name for a match.",
"Never show the file name for a match. This is the default when \
one file is searched.");
doc!(h, "heading",
"Show matches grouped by each file.",
"This shows the file name above clusters of matches from each \
file. This is the default mode at a tty.");
doc!(h, "no-heading",
"Don't group matches by each file.",
"Don't group matches by each file. This is the default mode \
when not at a tty.");
doc!(h, "hidden",
"Search hidden files and directories.",
"Search hidden files and directories. By default, hidden files \
and directories are skipped.");
doc!(h, "ignore-file",
"Specify additional ignore files.",
"Specify additional ignore files for filtering file paths. \
Ignore files should be in the gitignore format and are matched \
relative to the current working directory. These ignore files \
have lower precedence than all other ignore files. When \
specifying multiple ignore files, earlier files have lower \
precedence than later files.");
doc!(h, "follow",
"Follow symbolic links.");
doc!(h, "max-count",
"Limit the number of matches.",
"Limit the number of matching lines per file searched to NUM.");
doc!(h, "maxdepth",
"Descend at most NUM directories.",
"Limit the depth of directory traversal to NUM levels beyond \
the paths given. A value of zero only searches the \
starting-points themselves.\n\nFor example, \
'rg --maxdepth 0 dir/' is a no-op because dir/ will not be \
descended into. 'rg --maxdepth 1 dir/' will search only the \
direct children of dir/.");
doc!(h, "mmap",
"Searching using memory maps when possible.",
"Search using memory maps when possible. This is enabled by \
default when ripgrep thinks it will be faster. Note that memory \
map searching doesn't currently support all options, so if an \
incompatible option (e.g., --context) is given with --mmap, \
then memory maps will not be used.");
doc!(h, "no-messages",
"Suppress all error messages.",
"Suppress all error messages. This is equivalent to redirecting \
stderr to /dev/null.");
doc!(h, "no-mmap",
"Never use memory maps.",
"Never use memory maps, even when they might be faster.");
doc!(h, "no-ignore",
"Don't respect ignore files.",
"Don't respect ignore files (.gitignore, .ignore, etc.). This \
implies --no-ignore-parent and --no-ignore-vcs.");
doc!(h, "no-ignore-parent",
"Don't respect ignore files in parent directories.",
"Don't respect ignore files (.gitignore, .ignore, etc.) in \
parent directories.");
doc!(h, "no-ignore-vcs",
"Don't respect VCS ignore files",
"Don't respect version control ignore files (.gitignore, etc.). \
This implies --no-ignore-parent. Note that .ignore files will \
continue to be respected.");
doc!(h, "null",
"Print NUL byte after file names",
"Whenever a file name is printed, follow it with a NUL byte. \
This includes printing file names before matches, and when \
printing a list of matching files such as with --count, \
--files-with-matches and --files. This option is useful for use \
with xargs.");
doc!(h, "pretty",
"Alias for --color always --heading -n.");
doc!(h, "replace",
"Replace matches with string given.",
"Replace every match with the string given when printing \
results. Neither this flag nor any other flag will modify your \
files.\n\nCapture group indices (e.g., $5) and names \
(e.g., $foo) are supported in the replacement string.");
doc!(h, "case-sensitive",
"Search case sensitively.",
"Search case sensitively. This overrides -i/--ignore-case and \
-S/--smart-case.");
doc!(h, "smart-case",
"Smart case search.",
"Searches case insensitively if the pattern is all lowercase. \
Search case sensitively otherwise. This is overridden by \
either -s/--case-sensitive or -i/--ignore-case.");
doc!(h, "threads",
"The approximate number of threads to use.",
"The approximate number of threads to use. A value of 0 (which \
is the default) causes ripgrep to choose the thread count \
using heuristics.");
doc!(h, "vimgrep",
"Show results in vim compatible format.",
"Show results with every match on its own line, including \
line numbers and column numbers. With this option, a line with \
more than one match will be printed more than once.");
doc!(h, "type-add",
"Add a new glob for a file type.",
"Add a new glob for a particular file type. Only one glob can be \
added at a time. Multiple --type-add flags can be provided. \
Unless --type-clear is used, globs are added to any existing \
globs defined inside of ripgrep.\n\nNote that this MUST be \
passed to every invocation of ripgrep. Type settings are NOT \
persisted.\n\nExample: \
rg --type-add 'foo:*.foo' -tfoo PATTERN.");
doc!(h, "type-clear",
"Clear globs for given file type.",
"Clear the file type globs previously defined for TYPE. This \
only clears the default tpye definitions that are found inside \
of ripgrep.\n\nNote that this MUST be passed to every \
invocation of ripgrep. Type settings are NOT persisted.");
h
};
}
fn validate_number(s: String) -> Result<(), String> {
s.parse::<usize>().map(|_|()).map_err(|err| err.to_string())
}

File diff suppressed because it is too large Load Diff

View File

@@ -4,6 +4,11 @@ from (or to) a terminal. Windows and Unix do this differently, so implement
both here.
*/
#[cfg(windows)]
use winapi::minwindef::DWORD;
#[cfg(windows)]
use winapi::winnt::HANDLE;
#[cfg(unix)]
pub fn stdin_is_readable() -> bool {
use std::fs::File;
@@ -44,26 +49,104 @@ pub fn on_stdout() -> bool {
/// Returns true if there is a tty on stdin.
#[cfg(windows)]
pub fn on_stdin() -> bool {
// BUG: https://github.com/BurntSushi/ripgrep/issues/19
// It's not clear to me how to determine whether there is a tty on stdin.
// Checking GetConsoleMode(GetStdHandle(stdin)) != 0 appears to report
// that stdin is a pipe, even if it's not in a cygwin terminal, for
// example.
//
// To fix this, we just assume there is always a tty on stdin. If Windows
// users need to search stdin, they'll have to pass -. Ug.
true
use kernel32::GetStdHandle;
use winapi::winbase::{
STD_INPUT_HANDLE, STD_OUTPUT_HANDLE, STD_ERROR_HANDLE,
};
unsafe {
let stdin = GetStdHandle(STD_INPUT_HANDLE);
if console_on_handle(stdin) {
// False positives aren't possible. If we got a console then
// we definitely have a tty on stdin.
return true;
}
// Otherwise, it's possible to get a false negative. If we know that
// there's a console on stdout or stderr however, then this is a true
// negative.
if console_on_fd(STD_OUTPUT_HANDLE)
|| console_on_fd(STD_ERROR_HANDLE) {
return false;
}
// Otherwise, we can't really tell, so we do a weird hack.
msys_tty_on_handle(stdin)
}
}
/// Returns true if there is a tty on stdout.
#[cfg(windows)]
pub fn on_stdout() -> bool {
use kernel32;
use winapi;
use kernel32::GetStdHandle;
use winapi::winbase::{
STD_INPUT_HANDLE, STD_OUTPUT_HANDLE, STD_ERROR_HANDLE,
};
unsafe {
let fd = winapi::winbase::STD_OUTPUT_HANDLE;
let mut out = 0;
kernel32::GetConsoleMode(kernel32::GetStdHandle(fd), &mut out) != 0
let stdout = GetStdHandle(STD_OUTPUT_HANDLE);
if console_on_handle(stdout) {
// False positives aren't possible. If we got a console then
// we definitely have a tty on stdout.
return true;
}
// Otherwise, it's possible to get a false negative. If we know that
// there's a console on stdin or stderr however, then this is a true
// negative.
if console_on_fd(STD_INPUT_HANDLE) || console_on_fd(STD_ERROR_HANDLE) {
return false;
}
// Otherwise, we can't really tell, so we do a weird hack.
msys_tty_on_handle(stdout)
}
}
/// Returns true if there is an MSYS tty on the given handle.
#[cfg(windows)]
fn msys_tty_on_handle(handle: HANDLE) -> bool {
use std::ffi::OsString;
use std::mem;
use std::os::raw::c_void;
use std::os::windows::ffi::OsStringExt;
use std::slice;
use kernel32::{GetFileInformationByHandleEx};
use winapi::fileapi::FILE_NAME_INFO;
use winapi::minwinbase::FileNameInfo;
use winapi::minwindef::MAX_PATH;
unsafe {
let size = mem::size_of::<FILE_NAME_INFO>();
let mut name_info_bytes = vec![0u8; size + MAX_PATH];
let res = GetFileInformationByHandleEx(
handle,
FileNameInfo,
&mut *name_info_bytes as *mut _ as *mut c_void,
name_info_bytes.len() as u32);
if res == 0 {
return true;
}
let name_info: FILE_NAME_INFO =
*(name_info_bytes[0..size].as_ptr() as *const FILE_NAME_INFO);
let name_bytes =
&name_info_bytes[size..size + name_info.FileNameLength as usize];
let name_u16 = slice::from_raw_parts(
name_bytes.as_ptr() as *const u16, name_bytes.len() / 2);
let name = OsString::from_wide(name_u16)
.as_os_str().to_string_lossy().into_owned();
name.contains("msys-") || name.contains("-pty")
}
}
/// Returns true if there is a console on the given file descriptor.
#[cfg(windows)]
unsafe fn console_on_fd(fd: DWORD) -> bool {
use kernel32::GetStdHandle;
console_on_handle(GetStdHandle(fd))
}
/// Returns true if there is a console on the given handle.
#[cfg(windows)]
fn console_on_handle(handle: HANDLE) -> bool {
use kernel32::GetConsoleMode;
let mut out = 0;
unsafe { GetConsoleMode(handle, &mut out) != 0 }
}

View File

@@ -1,438 +0,0 @@
/*!
The gitignore module provides a way of reading a gitignore file and applying
it to a particular file name to determine whether it should be ignore or not.
The motivation for this submodule is performance and portability:
1. There is a gitignore crate on crates.io, but it uses the standard `glob`
crate and checks patterns one-by-one. This is a reasonable implementation,
but not suitable for the performance we need here.
2. We could shell out to a `git` sub-command like ls-files or status, but it
seems better to not rely on the existence of external programs for a search
tool. Besides, we need to implement this logic anyway to support things like
an .ignore file.
The key implementation detail here is that a single gitignore file is compiled
into a single RegexSet, which can be used to report which globs match a
particular file name. We can then do a quick post-processing step to implement
additional rules such as whitelists (prefix of `!`) or directory-only globs
(suffix of `/`).
*/
// TODO(burntsushi): Implement something similar, but for Mercurial. We can't
// use this exact implementation because hgignore files are different.
use std::cell::RefCell;
use std::error::Error as StdError;
use std::fmt;
use std::fs::File;
use std::io::{self, BufRead};
use std::path::{Path, PathBuf};
use regex;
use glob;
use pathutil::{is_file_name, strip_prefix};
/// Represents an error that can occur when parsing a gitignore file.
#[derive(Debug)]
pub enum Error {
Glob(glob::Error),
Regex(regex::Error),
Io(io::Error),
}
impl StdError for Error {
fn description(&self) -> &str {
match *self {
Error::Glob(ref err) => err.description(),
Error::Regex(ref err) => err.description(),
Error::Io(ref err) => err.description(),
}
}
}
impl fmt::Display for Error {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
match *self {
Error::Glob(ref err) => err.fmt(f),
Error::Regex(ref err) => err.fmt(f),
Error::Io(ref err) => err.fmt(f),
}
}
}
impl From<glob::Error> for Error {
fn from(err: glob::Error) -> Error {
Error::Glob(err)
}
}
impl From<regex::Error> for Error {
fn from(err: regex::Error) -> Error {
Error::Regex(err)
}
}
impl From<io::Error> for Error {
fn from(err: io::Error) -> Error {
Error::Io(err)
}
}
/// Gitignore is a matcher for the glob patterns in a single gitignore file.
#[derive(Clone, Debug)]
pub struct Gitignore {
set: glob::Set,
root: PathBuf,
patterns: Vec<Pattern>,
num_ignores: u64,
num_whitelist: u64,
}
impl Gitignore {
/// Create a new gitignore glob matcher from the given root directory and
/// string containing the contents of a gitignore file.
#[allow(dead_code)]
fn from_str<P: AsRef<Path>>(
root: P,
gitignore: &str,
) -> Result<Gitignore, Error> {
let mut builder = GitignoreBuilder::new(root);
try!(builder.add_str(gitignore));
builder.build()
}
/// Returns true if and only if the given file path should be ignored
/// according to the globs in this gitignore. `is_dir` should be true if
/// the path refers to a directory and false otherwise.
///
/// Before matching path, its prefix (as determined by a common suffix
/// of the directory containing this gitignore) is stripped. If there is
/// no common suffix/prefix overlap, then path is assumed to reside in the
/// same directory as this gitignore file.
pub fn matched<P: AsRef<Path>>(&self, path: P, is_dir: bool) -> Match {
let mut path = path.as_ref();
if let Some(p) = strip_prefix("./", path) {
path = p;
}
// Strip any common prefix between the candidate path and the root
// of the gitignore, to make sure we get relative matching right.
// BUT, a file name might not have any directory components to it,
// in which case, we don't want to accidentally strip any part of the
// file name.
if !is_file_name(path) {
if let Some(p) = strip_prefix(&self.root, path) {
path = p;
}
}
if let Some(p) = strip_prefix("/", path) {
path = p;
}
self.matched_stripped(path, is_dir)
}
/// Like matched, but takes a path that has already been stripped.
pub fn matched_stripped(&self, path: &Path, is_dir: bool) -> Match {
thread_local! {
static MATCHES: RefCell<Vec<usize>> = {
RefCell::new(vec![])
}
};
MATCHES.with(|matches| {
let mut matches = matches.borrow_mut();
self.set.matches_into(path, &mut *matches);
for &i in matches.iter().rev() {
let pat = &self.patterns[i];
if !pat.only_dir || is_dir {
return if pat.whitelist {
Match::Whitelist(pat)
} else {
Match::Ignored(pat)
};
}
}
Match::None
})
}
/// Returns the total number of ignore patterns.
pub fn num_ignores(&self) -> u64 {
self.num_ignores
}
}
/// The result of a glob match.
///
/// The lifetime `'a` refers to the lifetime of the pattern that resulted in
/// a match (whether ignored or whitelisted).
#[derive(Clone, Debug)]
pub enum Match<'a> {
/// The path didn't match any glob in the gitignore file.
None,
/// The last glob matched indicates the path should be ignored.
Ignored(&'a Pattern),
/// The last glob matched indicates the path should be whitelisted.
Whitelist(&'a Pattern),
}
impl<'a> Match<'a> {
/// Returns true if the match result implies the path should be ignored.
#[allow(dead_code)]
pub fn is_ignored(&self) -> bool {
match *self {
Match::Ignored(_) => true,
Match::None | Match::Whitelist(_) => false,
}
}
/// Returns true if the match result didn't match any globs.
pub fn is_none(&self) -> bool {
match *self {
Match::None => true,
Match::Ignored(_) | Match::Whitelist(_) => false,
}
}
/// Inverts the match so that Ignored becomes Whitelisted and Whitelisted
/// becomes Ignored. A non-match remains the same.
pub fn invert(self) -> Match<'a> {
match self {
Match::None => Match::None,
Match::Ignored(pat) => Match::Whitelist(pat),
Match::Whitelist(pat) => Match::Ignored(pat),
}
}
}
/// GitignoreBuilder constructs a matcher for a single set of globs from a
/// .gitignore file.
pub struct GitignoreBuilder {
builder: glob::SetBuilder,
root: PathBuf,
patterns: Vec<Pattern>,
}
/// Pattern represents a single pattern in a gitignore file. It doesn't
/// know how to do glob matching directly, but it does store additional
/// options on a pattern, such as whether it's whitelisted.
#[derive(Clone, Debug)]
pub struct Pattern {
/// The file path that this pattern was extracted from (may be empty).
pub from: PathBuf,
/// The original glob pattern string.
pub original: String,
/// The actual glob pattern string used to convert to a regex.
pub pat: String,
/// Whether this is a whitelisted pattern or not.
pub whitelist: bool,
/// Whether this pattern should only match directories or not.
pub only_dir: bool,
}
impl GitignoreBuilder {
/// Create a new builder for a gitignore file.
///
/// The path given should be the path at which the globs for this gitignore
/// file should be matched.
pub fn new<P: AsRef<Path>>(root: P) -> GitignoreBuilder {
let root = strip_prefix("./", root.as_ref()).unwrap_or(root.as_ref());
GitignoreBuilder {
builder: glob::SetBuilder::new(),
root: root.to_path_buf(),
patterns: vec![],
}
}
/// Builds a new matcher from the glob patterns added so far.
///
/// Once a matcher is built, no new glob patterns can be added to it.
pub fn build(self) -> Result<Gitignore, Error> {
let nignores = self.patterns.iter().filter(|p| !p.whitelist).count();
let nwhitelist = self.patterns.iter().filter(|p| p.whitelist).count();
Ok(Gitignore {
set: try!(self.builder.build()),
root: self.root,
patterns: self.patterns,
num_ignores: nignores as u64,
num_whitelist: nwhitelist as u64,
})
}
/// Add each pattern line from the file path given.
pub fn add_path<P: AsRef<Path>>(&mut self, path: P) -> Result<(), Error> {
let rdr = io::BufReader::new(try!(File::open(&path)));
for line in rdr.lines() {
try!(self.add(&path, &try!(line)));
}
Ok(())
}
/// Add each pattern line from the string given.
pub fn add_str(&mut self, gitignore: &str) -> Result<(), Error> {
for line in gitignore.lines() {
try!(self.add("", line));
}
Ok(())
}
/// Add a line from a gitignore file to this builder.
///
/// If the line could not be parsed as a glob, then an error is returned.
pub fn add<P: AsRef<Path>>(
&mut self,
from: P,
mut line: &str,
) -> Result<(), Error> {
if line.starts_with("#") {
return Ok(());
}
if !line.ends_with("\\ ") {
line = line.trim_right();
}
if line.is_empty() {
return Ok(());
}
let mut pat = Pattern {
from: from.as_ref().to_path_buf(),
original: line.to_string(),
pat: String::new(),
whitelist: false,
only_dir: false,
};
let mut opts = glob::MatchOptions::default();
let has_slash = line.chars().any(|c| c == '/');
let is_absolute = line.chars().nth(0).unwrap() == '/';
if line.starts_with("\\!") || line.starts_with("\\#") {
line = &line[1..];
} else {
if line.starts_with("!") {
pat.whitelist = true;
line = &line[1..];
}
if line.starts_with("/") {
// `man gitignore` says that if a glob starts with a slash,
// then the glob can only match the beginning of a path
// (relative to the location of gitignore). We achieve this by
// simply banning wildcards from matching /.
opts.require_literal_separator = true;
line = &line[1..];
}
}
// If it ends with a slash, then this should only match directories,
// but the slash should otherwise not be used while globbing.
if let Some((i, c)) = line.char_indices().rev().nth(0) {
if c == '/' {
pat.only_dir = true;
line = &line[..i];
}
}
// If there is a literal slash, then we note that so that globbing
// doesn't let wildcards match slashes.
pat.pat = line.to_string();
if has_slash {
opts.require_literal_separator = true;
}
// If there was a leading slash, then this is a pattern that must
// match the entire path name. Otherwise, we should let it match
// anywhere, so use a **/ prefix.
if !is_absolute {
// ... but only if we don't already have a **/ prefix.
if !pat.pat.starts_with("**/") {
pat.pat = format!("**/{}", pat.pat);
}
}
// If the pattern ends with `/**`, then we should only match everything
// inside a directory, but not the directory itself. Standard globs
// will match the directory. So we add `/*` to force the issue.
if pat.pat.ends_with("/**") {
pat.pat = format!("{}/*", pat.pat);
}
try!(self.builder.add_with(&pat.pat, &opts));
self.patterns.push(pat);
Ok(())
}
}
#[cfg(test)]
mod tests {
use super::Gitignore;
macro_rules! ignored {
($name:ident, $root:expr, $gi:expr, $path:expr) => {
ignored!($name, $root, $gi, $path, false);
};
($name:ident, $root:expr, $gi:expr, $path:expr, $is_dir:expr) => {
#[test]
fn $name() {
let gi = Gitignore::from_str($root, $gi).unwrap();
assert!(gi.matched($path, $is_dir).is_ignored());
}
};
}
macro_rules! not_ignored {
($name:ident, $root:expr, $gi:expr, $path:expr) => {
not_ignored!($name, $root, $gi, $path, false);
};
($name:ident, $root:expr, $gi:expr, $path:expr, $is_dir:expr) => {
#[test]
fn $name() {
let gi = Gitignore::from_str($root, $gi).unwrap();
assert!(!gi.matched($path, $is_dir).is_ignored());
}
};
}
const ROOT: &'static str = "/home/foobar/rust/rg";
ignored!(ig1, ROOT, "months", "months");
ignored!(ig2, ROOT, "*.lock", "Cargo.lock");
ignored!(ig3, ROOT, "*.rs", "src/main.rs");
ignored!(ig4, ROOT, "src/*.rs", "src/main.rs");
ignored!(ig5, ROOT, "/*.c", "cat-file.c");
ignored!(ig6, ROOT, "/src/*.rs", "src/main.rs");
ignored!(ig7, ROOT, "!src/main.rs\n*.rs", "src/main.rs");
ignored!(ig8, ROOT, "foo/", "foo", true);
ignored!(ig9, ROOT, "**/foo", "foo");
ignored!(ig10, ROOT, "**/foo", "src/foo");
ignored!(ig11, ROOT, "**/foo/**", "src/foo/bar");
ignored!(ig12, ROOT, "**/foo/**", "wat/src/foo/bar/baz");
ignored!(ig13, ROOT, "**/foo/bar", "foo/bar");
ignored!(ig14, ROOT, "**/foo/bar", "src/foo/bar");
ignored!(ig15, ROOT, "abc/**", "abc/x");
ignored!(ig16, ROOT, "abc/**", "abc/x/y");
ignored!(ig17, ROOT, "abc/**", "abc/x/y/z");
ignored!(ig18, ROOT, "a/**/b", "a/b");
ignored!(ig19, ROOT, "a/**/b", "a/x/b");
ignored!(ig20, ROOT, "a/**/b", "a/x/y/b");
ignored!(ig21, ROOT, r"\!xy", "!xy");
ignored!(ig22, ROOT, r"\#foo", "#foo");
ignored!(ig23, ROOT, "foo", "./foo");
ignored!(ig24, ROOT, "target", "grep/target");
ignored!(ig25, ROOT, "Cargo.lock", "./tabwriter-bin/Cargo.lock");
ignored!(ig26, ROOT, "/foo/bar/baz", "./foo/bar/baz");
ignored!(ig27, ROOT, "foo/", "xyz/foo", true);
ignored!(ig28, ROOT, "src/*.rs", "src/grep/src/main.rs");
ignored!(ig29, "./src", "/llvm/", "./src/llvm", true);
ignored!(ig30, ROOT, "node_modules/ ", "node_modules", true);
not_ignored!(ignot1, ROOT, "amonths", "months");
not_ignored!(ignot2, ROOT, "monthsa", "months");
not_ignored!(ignot3, ROOT, "/src/*.rs", "src/grep/src/main.rs");
not_ignored!(ignot4, ROOT, "/*.c", "mozilla-sha1/sha1.c");
not_ignored!(ignot5, ROOT, "/src/*.rs", "src/grep/src/main.rs");
not_ignored!(ignot6, ROOT, "*.rs\n!src/main.rs", "src/main.rs");
not_ignored!(ignot7, ROOT, "foo/", "foo", false);
not_ignored!(ignot8, ROOT, "**/foo/**", "wat/src/afoo/bar/baz");
not_ignored!(ignot9, ROOT, "**/foo/**", "wat/src/fooa/bar/baz");
not_ignored!(ignot10, ROOT, "**/foo/bar", "foo/src/bar");
not_ignored!(ignot11, ROOT, "#foo", "#foo");
not_ignored!(ignot12, ROOT, "\n\n\n", "foo");
not_ignored!(ignot13, ROOT, "foo/**", "foo", true);
// See: https://github.com/BurntSushi/ripgrep/issues/106
#[test]
fn regression_106() {
Gitignore::from_str("/", " ").unwrap();
}
}

File diff suppressed because it is too large Load Diff

View File

@@ -1,490 +0,0 @@
/*!
The ignore module is responsible for managing the state required to determine
whether a *single* file path should be searched or not.
In general, there are two ways to ignore a particular file:
1. Specify an ignore rule in some "global" configuration, such as a
$HOME/.ignore or on the command line.
2. A specific ignore file (like .gitignore) found during directory traversal.
The `IgnoreDir` type handles ignore patterns for any one particular directory
(including "global" ignore patterns), while the `Ignore` type handles a stack
of `IgnoreDir`s for use during directory traversal.
*/
use std::error::Error as StdError;
use std::ffi::OsString;
use std::fmt;
use std::io;
use std::path::{Path, PathBuf};
use gitignore::{self, Gitignore, GitignoreBuilder, Match, Pattern};
use pathutil::{file_name, is_hidden};
use types::Types;
const IGNORE_NAMES: &'static [&'static str] = &[
".gitignore",
".ignore",
".rgignore",
];
/// Represents an error that can occur when parsing a gitignore file.
#[derive(Debug)]
pub enum Error {
Gitignore(gitignore::Error),
Io {
path: PathBuf,
err: io::Error,
},
}
impl Error {
fn from_io<P: AsRef<Path>>(path: P, err: io::Error) -> Error {
Error::Io { path: path.as_ref().to_path_buf(), err: err }
}
}
impl StdError for Error {
fn description(&self) -> &str {
match *self {
Error::Gitignore(ref err) => err.description(),
Error::Io { ref err, .. } => err.description(),
}
}
}
impl fmt::Display for Error {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
match *self {
Error::Gitignore(ref err) => err.fmt(f),
Error::Io { ref path, ref err } => {
write!(f, "{}: {}", path.display(), err)
}
}
}
}
impl From<gitignore::Error> for Error {
fn from(err: gitignore::Error) -> Error {
Error::Gitignore(err)
}
}
/// Ignore represents a collection of ignore patterns organized by directory.
/// In particular, a stack is maintained, where the top of the stack
/// corresponds to the current directory being searched and the bottom of the
/// stack represents the root of a search. Ignore patterns at the top of the
/// stack take precedence over ignore patterns at the bottom of the stack.
pub struct Ignore {
/// A stack of ignore patterns at each directory level of traversal.
/// A directory that contributes no ignore patterns is `None`.
stack: Vec<IgnoreDir>,
/// A stack of parent directories above the root of the current search.
parent_stack: Vec<IgnoreDir>,
/// A set of override globs that are always checked first. A match (whether
/// it's whitelist or blacklist) trumps anything in stack.
overrides: Overrides,
/// A file type matcher.
types: Types,
/// Whether to ignore hidden files or not.
ignore_hidden: bool,
/// When true, don't look at .gitignore or .ignore files for ignore
/// rules.
no_ignore: bool,
/// When true, don't look at .gitignore files for ignore rules.
no_ignore_vcs: bool,
}
impl Ignore {
/// Create an empty set of ignore patterns.
pub fn new() -> Ignore {
Ignore {
stack: vec![],
parent_stack: vec![],
overrides: Overrides::new(None),
types: Types::empty(),
ignore_hidden: true,
no_ignore: false,
no_ignore_vcs: true,
}
}
/// Set whether hidden files/folders should be ignored (defaults to true).
pub fn ignore_hidden(&mut self, yes: bool) -> &mut Ignore {
self.ignore_hidden = yes;
self
}
/// When set, ignore files are ignored.
pub fn no_ignore(&mut self, yes: bool) -> &mut Ignore {
self.no_ignore = yes;
self
}
/// When set, VCS ignore files are ignored.
pub fn no_ignore_vcs(&mut self, yes: bool) -> &mut Ignore {
self.no_ignore_vcs = yes;
self
}
/// Add a set of globs that overrides all other match logic.
pub fn add_override(&mut self, gi: Gitignore) -> &mut Ignore {
self.overrides = Overrides::new(Some(gi));
self
}
/// Add a file type matcher. The file type matcher has the lowest
/// precedence.
pub fn add_types(&mut self, types: Types) -> &mut Ignore {
self.types = types;
self
}
/// Push parent directories of `path` on to the stack.
pub fn push_parents<P: AsRef<Path>>(
&mut self,
path: P,
) -> Result<(), Error> {
let path = try!(path.as_ref().canonicalize().map_err(|err| {
Error::from_io(path.as_ref(), err)
}));
let mut path = &*path;
let mut saw_git = path.join(".git").is_dir();
let mut ignore_names = IGNORE_NAMES.to_vec();
if self.no_ignore_vcs {
ignore_names.retain(|&name| name != ".gitignore");
}
let mut ignore_dir_results = vec![];
while let Some(parent) = path.parent() {
if self.no_ignore {
ignore_dir_results.push(Ok(IgnoreDir::empty(parent)));
} else {
if saw_git {
ignore_names.retain(|&name| name != ".gitignore");
} else {
saw_git = parent.join(".git").is_dir();
}
let ignore_dir_result =
IgnoreDir::with_ignore_names(parent, ignore_names.iter());
ignore_dir_results.push(ignore_dir_result);
}
path = parent;
}
for ignore_dir_result in ignore_dir_results.into_iter().rev() {
self.parent_stack.push(try!(ignore_dir_result));
}
Ok(())
}
/// Add a directory to the stack.
///
/// Note that even if this returns an error, the directory is added to the
/// stack (and therefore should be popped).
pub fn push<P: AsRef<Path>>(&mut self, path: P) -> Result<(), Error> {
if self.no_ignore {
self.stack.push(IgnoreDir::empty(path));
Ok(())
} else if self.no_ignore_vcs {
self.push_ignore_dir(IgnoreDir::without_vcs(path))
} else {
self.push_ignore_dir(IgnoreDir::new(path))
}
}
/// Pushes the result of building a directory matcher on to the stack.
///
/// If the result given contains an error, then it is returned.
pub fn push_ignore_dir(
&mut self,
result: Result<IgnoreDir, Error>,
) -> Result<(), Error> {
match result {
Ok(id) => {
self.stack.push(id);
Ok(())
}
Err(err) => {
// Don't leave the stack in an inconsistent state.
self.stack.push(IgnoreDir::empty("error"));
Err(err)
}
}
}
/// Pop a directory from the stack.
///
/// This panics if the stack is empty.
pub fn pop(&mut self) {
self.stack.pop().expect("non-empty stack");
}
/// Returns true if and only if the given file path should be ignored.
pub fn ignored<P: AsRef<Path>>(&self, path: P, is_dir: bool) -> bool {
let path = path.as_ref();
let mat = self.overrides.matched(path, is_dir);
if let Some(is_ignored) = self.ignore_match(path, mat) {
return is_ignored;
}
let mut whitelisted = false;
if !self.no_ignore {
for id in self.stack.iter().rev() {
let mat = id.matched(path, is_dir);
if let Some(is_ignored) = self.ignore_match(path, mat) {
if is_ignored {
return true;
}
// If this path is whitelisted by an ignore, then
// fallthrough and let the file type matcher have a say.
whitelisted = true;
break;
}
}
// If the file has been whitelisted, then we have to stop checking
// parent directories. The only thing that can override a whitelist
// at this point is a type filter.
if !whitelisted {
let mut path = path.to_path_buf();
for id in self.parent_stack.iter().rev() {
if let Some(ref dirname) = id.name {
path = Path::new(dirname).join(path);
}
let mat = id.matched(&*path, is_dir);
if let Some(is_ignored) = self.ignore_match(&*path, mat) {
if is_ignored {
return true;
}
// If this path is whitelisted by an ignore, then
// fallthrough and let the file type matcher have a
// say.
whitelisted = true;
break;
}
}
}
}
let mat = self.types.matched(path, is_dir);
if let Some(is_ignored) = self.ignore_match(path, mat) {
if is_ignored {
return true;
}
whitelisted = true;
}
if !whitelisted && self.ignore_hidden && is_hidden(&path) {
debug!("{} ignored because it is hidden", path.display());
return true;
}
false
}
/// Returns true if the given match says the given pattern should be
/// ignored or false if the given pattern should be explicitly whitelisted.
/// Returns None otherwise.
pub fn ignore_match<P: AsRef<Path>>(
&self,
path: P,
mat: Match,
) -> Option<bool> {
let path = path.as_ref();
match mat {
Match::Whitelist(ref pat) => {
debug!("{} whitelisted by {:?}", path.display(), pat);
Some(false)
}
Match::Ignored(ref pat) => {
debug!("{} ignored by {:?}", path.display(), pat);
Some(true)
}
Match::None => None,
}
}
}
/// IgnoreDir represents a set of ignore patterns retrieved from a single
/// directory.
#[derive(Debug)]
pub struct IgnoreDir {
/// The path to this directory as given.
path: PathBuf,
/// The directory name, if one exists.
name: Option<OsString>,
/// A single accumulation of glob patterns for this directory, matched
/// using gitignore semantics.
///
/// This will include patterns from rgignore as well. The patterns are
/// ordered so that precedence applies automatically (e.g., rgignore
/// patterns procede gitignore patterns).
gi: Option<Gitignore>,
// TODO(burntsushi): Matching other types of glob patterns that don't
// conform to gitignore will probably require refactoring this approach.
}
impl IgnoreDir {
/// Create a new matcher for the given directory.
pub fn new<P: AsRef<Path>>(path: P) -> Result<IgnoreDir, Error> {
IgnoreDir::with_ignore_names(path, IGNORE_NAMES.iter())
}
/// Create a new matcher for the given directory.
///
/// Don't respect VCS ignore files.
pub fn without_vcs<P: AsRef<Path>>(path: P) -> Result<IgnoreDir, Error> {
let names = IGNORE_NAMES.iter().filter(|name| **name != ".gitignore");
IgnoreDir::with_ignore_names(path, names)
}
/// Create a new IgnoreDir that never matches anything with the given path.
pub fn empty<P: AsRef<Path>>(path: P) -> IgnoreDir {
IgnoreDir {
path: path.as_ref().to_path_buf(),
name: file_name(path.as_ref()).map(|s| s.to_os_string()),
gi: None,
}
}
/// Create a new matcher for the given directory using only the ignore
/// patterns found in the file names given.
///
/// If no ignore glob patterns could be found in the directory then `None`
/// is returned.
///
/// Note that the order of the names given is meaningful. Names appearing
/// later in the list have precedence over names appearing earlier in the
/// list.
pub fn with_ignore_names<P: AsRef<Path>, S, I>(
path: P,
names: I,
) -> Result<IgnoreDir, Error>
where P: AsRef<Path>, S: AsRef<str>, I: Iterator<Item=S> {
let mut id = IgnoreDir::empty(path);
let mut ok = false;
let mut builder = GitignoreBuilder::new(&id.path);
// The ordering here is important. Later globs have higher precedence.
for name in names {
ok = builder.add_path(id.path.join(name.as_ref())).is_ok() || ok;
}
if !ok {
return Ok(id);
}
id.gi = Some(try!(builder.build()));
Ok(id)
}
/// Returns true if and only if the given file path should be ignored
/// according to the globs in this directory. `is_dir` should be true if
/// the path refers to a directory and false otherwise.
///
/// Before matching path, its prefix (as determined by a common suffix
/// of this directory) is stripped. If there is
/// no common suffix/prefix overlap, then path is assumed to reside
/// directly in this directory.
///
/// If the given path has a `./` prefix then it is stripped before
/// matching.
pub fn matched<P: AsRef<Path>>(&self, path: P, is_dir: bool) -> Match {
self.gi.as_ref()
.map(|gi| gi.matched(path, is_dir))
.unwrap_or(Match::None)
}
}
/// Manages a set of overrides provided explicitly by the end user.
struct Overrides {
gi: Option<Gitignore>,
unmatched_pat: Pattern,
}
impl Overrides {
/// Creates a new set of overrides from the gitignore matcher provided.
/// If no matcher is provided, then the resulting overrides have no effect.
fn new(gi: Option<Gitignore>) -> Overrides {
Overrides {
gi: gi,
unmatched_pat: Pattern {
from: Path::new("<argv>").to_path_buf(),
original: "<none>".to_string(),
pat: "<none>".to_string(),
whitelist: false,
only_dir: false,
},
}
}
/// Returns a match for the given path against this set of overrides.
///
/// If there are no overrides, then this always returns Match::None.
///
/// If there is at least one positive override, then this never returns
/// Match::None (and interpreting non-matches as ignored) unless is_dir
/// is true.
pub fn matched<P: AsRef<Path>>(&self, path: P, is_dir: bool) -> Match {
let path = path.as_ref();
self.gi.as_ref()
.map(|gi| {
let mat = gi.matched_stripped(path, is_dir).invert();
if mat.is_none() && !is_dir {
if gi.num_ignores() > 0 {
return Match::Ignored(&self.unmatched_pat);
}
}
mat
})
.unwrap_or(Match::None)
}
}
#[cfg(test)]
mod tests {
use std::path::Path;
use gitignore::GitignoreBuilder;
use super::IgnoreDir;
macro_rules! ignored_dir {
($name:ident, $root:expr, $gi:expr, $xi:expr, $path:expr) => {
#[test]
fn $name() {
let mut builder = GitignoreBuilder::new(&$root);
builder.add_str($gi).unwrap();
builder.add_str($xi).unwrap();
let gi = builder.build().unwrap();
let id = IgnoreDir {
path: Path::new($root).to_path_buf(),
name: Path::new($root).file_name().map(|s| {
s.to_os_string()
}),
gi: Some(gi),
};
assert!(id.matched($path, false).is_ignored());
}
};
}
macro_rules! not_ignored_dir {
($name:ident, $root:expr, $gi:expr, $xi:expr, $path:expr) => {
#[test]
fn $name() {
let mut builder = GitignoreBuilder::new(&$root);
builder.add_str($gi).unwrap();
builder.add_str($xi).unwrap();
let gi = builder.build().unwrap();
let id = IgnoreDir {
path: Path::new($root).to_path_buf(),
name: Path::new($root).file_name().map(|s| {
s.to_os_string()
}),
gi: Some(gi),
};
assert!(!id.matched($path, false).is_ignored());
}
};
}
const ROOT: &'static str = "/home/foobar/rust/rg";
ignored_dir!(id1, ROOT, "src/main.rs", "", "src/main.rs");
ignored_dir!(id2, ROOT, "", "src/main.rs", "src/main.rs");
ignored_dir!(id3, ROOT, "!src/main.rs", "*.rs", "src/main.rs");
not_ignored_dir!(idnot1, ROOT, "*.rs", "!src/main.rs", "src/main.rs");
}

View File

@@ -1,8 +1,12 @@
extern crate deque;
extern crate docopt;
#![allow(dead_code, unused_imports)]
extern crate bytecount;
#[macro_use]
extern crate clap;
extern crate ctrlc;
extern crate env_logger;
extern crate fnv;
extern crate grep;
extern crate ignore;
#[cfg(windows)]
extern crate kernel32;
#[macro_use]
@@ -14,35 +18,24 @@ extern crate memchr;
extern crate memmap;
extern crate num_cpus;
extern crate regex;
extern crate rustc_serialize;
extern crate term;
extern crate walkdir;
extern crate termcolor;
#[cfg(windows)]
extern crate winapi;
use std::error::Error;
use std::fs::File;
use std::io;
use std::path::Path;
use std::io::Write;
use std::process;
use std::result;
use std::sync::{Arc, Mutex};
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::mpsc;
use std::thread;
use std::cmp;
use deque::{Stealer, Stolen};
use grep::Grep;
use memmap::{Mmap, Protection};
use term::Terminal;
use walkdir::DirEntry;
use termcolor::WriteColor;
use args::Args;
use out::{ColoredTerminal, Out};
use pathutil::strip_prefix;
use printer::Printer;
use search_stream::InputBuffer;
#[cfg(windows)]
use terminal_win::WindowsBuffer;
use worker::Work;
macro_rules! errored {
($($tt:tt)*) => {
@@ -57,25 +50,20 @@ macro_rules! eprintln {
}}
}
mod app;
mod args;
mod atty;
mod gitignore;
mod glob;
mod ignore;
mod out;
mod pathutil;
mod printer;
mod search_buffer;
mod search_stream;
#[cfg(windows)]
mod terminal_win;
mod types;
mod walk;
mod unescape;
mod worker;
pub type Result<T> = result::Result<T, Box<Error + Send + Sync>>;
fn main() {
match Args::parse().and_then(run) {
match Args::parse().map(Arc::new).and_then(run) {
Ok(count) if count == 0 => process::exit(1),
Ok(_) => process::exit(0),
Err(err) => {
@@ -85,140 +73,170 @@ fn main() {
}
}
fn run(args: Args) -> Result<u64> {
let args = Arc::new(args);
let paths = args.paths();
let threads = cmp::max(1, args.threads() - 1);
let isone =
paths.len() == 1 && (paths[0] == Path::new("-") || paths[0].is_file());
fn run(args: Arc<Args>) -> Result<u64> {
if args.never_match() {
return Ok(0);
}
{
let args = args.clone();
ctrlc::set_handler(move || {
let mut writer = args.stdout();
let _ = writer.reset();
let _ = writer.flush();
process::exit(1);
});
}
let threads = args.threads();
if args.files() {
return run_files(args.clone());
}
if args.type_list() {
return run_types(args.clone());
}
if threads == 1 || isone {
return run_one_thread(args.clone());
}
let out = Arc::new(Mutex::new(args.out()));
let mut workers = vec![];
let workq = {
let (workq, stealer) = deque::new();
for _ in 0..threads {
let worker = MultiWorker {
chan_work: stealer.clone(),
out: out.clone(),
outbuf: Some(args.outbuf()),
worker: Worker {
args: args.clone(),
inpbuf: args.input_buffer(),
grep: args.grep(),
match_count: 0,
},
};
workers.push(thread::spawn(move || worker.run()));
}
workq
};
let mut paths_searched: u64 = 0;
for p in paths {
if p == Path::new("-") {
paths_searched += 1;
workq.push(Work::Stdin);
if threads == 1 || args.is_one_path() {
run_files_one_thread(args)
} else {
for ent in try!(args.walker(p)) {
paths_searched += 1;
workq.push(Work::File(ent));
run_files_parallel(args)
}
} else if args.type_list() {
run_types(args)
} else if threads == 1 || args.is_one_path() {
run_one_thread(args)
} else {
run_parallel(args)
}
}
fn run_parallel(args: Arc<Args>) -> Result<u64> {
let bufwtr = Arc::new(args.buffer_writer());
let quiet_matched = QuietMatched::new(args.quiet());
let paths_searched = Arc::new(AtomicUsize::new(0));
let match_count = Arc::new(AtomicUsize::new(0));
args.walker_parallel().run(|| {
let args = args.clone();
let quiet_matched = quiet_matched.clone();
let paths_searched = paths_searched.clone();
let match_count = match_count.clone();
let bufwtr = bufwtr.clone();
let mut buf = bufwtr.buffer();
let mut worker = args.worker();
Box::new(move |result| {
use ignore::WalkState::*;
if quiet_matched.has_match() {
return Quit;
}
let dent = match get_or_log_dir_entry(result, args.no_messages()) {
None => return Continue,
Some(dent) => dent,
};
paths_searched.fetch_add(1, Ordering::SeqCst);
buf.clear();
{
// This block actually executes the search and prints the
// results into outbuf.
let mut printer = args.printer(&mut buf);
let count =
if dent.is_stdin() {
worker.run(&mut printer, Work::Stdin)
} else {
worker.run(&mut printer, Work::DirEntry(dent))
};
match_count.fetch_add(count as usize, Ordering::SeqCst);
if quiet_matched.set_match(count > 0) {
return Quit;
}
}
// BUG(burntsushi): We should handle this error instead of ignoring
// it. See: https://github.com/BurntSushi/ripgrep/issues/200
let _ = bufwtr.print(&buf);
Continue
})
});
if !args.paths().is_empty() && paths_searched.load(Ordering::SeqCst) == 0 {
if !args.no_messages() {
eprint_nothing_searched();
}
}
Ok(match_count.load(Ordering::SeqCst) as u64)
}
fn run_one_thread(args: Arc<Args>) -> Result<u64> {
let stdout = args.stdout();
let mut stdout = stdout.lock();
let mut worker = args.worker();
let mut paths_searched: u64 = 0;
let mut match_count = 0;
for result in args.walker() {
let dent = match get_or_log_dir_entry(result, args.no_messages()) {
None => continue,
Some(dent) => dent,
};
let mut printer = args.printer(&mut stdout);
if match_count > 0 {
if args.quiet() {
break;
}
if let Some(sep) = args.file_separator() {
printer = printer.file_separator(sep);
}
}
paths_searched += 1;
match_count +=
if dent.is_stdin() {
worker.run(&mut printer, Work::Stdin)
} else {
worker.run(&mut printer, Work::DirEntry(dent))
};
}
if !paths.is_empty() && paths_searched == 0 {
eprintln!("No files were searched, which means ripgrep probably \
applied a filter you didn't expect. \
Try running again with --debug.");
}
for _ in 0..workers.len() {
workq.push(Work::Quit);
}
let mut match_count = 0;
for worker in workers {
match_count += worker.join().unwrap();
if !args.paths().is_empty() && paths_searched == 0 {
if !args.no_messages() {
eprint_nothing_searched();
}
}
Ok(match_count)
}
fn run_one_thread(args: Arc<Args>) -> Result<u64> {
let mut worker = Worker {
args: args.clone(),
inpbuf: args.input_buffer(),
grep: args.grep(),
match_count: 0,
};
let paths = args.paths();
let mut term = args.stdout();
let mut paths_searched: u64 = 0;
for p in paths {
if p == Path::new("-") {
paths_searched += 1;
let mut printer = args.printer(&mut term);
if worker.match_count > 0 {
if let Some(sep) = args.file_separator() {
printer = printer.file_separator(sep);
}
}
worker.do_work(&mut printer, WorkReady::Stdin);
} else {
for ent in try!(args.walker(p)) {
paths_searched += 1;
let mut printer = args.printer(&mut term);
if worker.match_count > 0 {
if let Some(sep) = args.file_separator() {
printer = printer.file_separator(sep);
}
}
let file = match File::open(ent.path()) {
Ok(file) => file,
Err(err) => {
eprintln!("{}: {}", ent.path().display(), err);
continue;
}
};
worker.do_work(&mut printer, WorkReady::DirFile(ent, file));
}
fn run_files_parallel(args: Arc<Args>) -> Result<u64> {
let print_args = args.clone();
let (tx, rx) = mpsc::channel::<ignore::DirEntry>();
let print_thread = thread::spawn(move || {
let stdout = print_args.stdout();
let mut printer = print_args.printer(stdout.lock());
let mut file_count = 0;
for dent in rx.iter() {
printer.path(dent.path());
file_count += 1;
}
}
if !paths.is_empty() && paths_searched == 0 {
eprintln!("No files were searched, which means ripgrep probably \
applied a filter you didn't expect. \
Try running again with --debug.");
}
Ok(worker.match_count)
file_count
});
let no_messages = args.no_messages();
args.walker_parallel().run(move || {
let tx = tx.clone();
Box::new(move |result| {
if let Some(dent) = get_or_log_dir_entry(result, no_messages) {
tx.send(dent).unwrap();
}
ignore::WalkState::Continue
})
});
Ok(print_thread.join().unwrap())
}
fn run_files(args: Arc<Args>) -> Result<u64> {
let term = args.stdout();
let mut printer = args.printer(term);
fn run_files_one_thread(args: Arc<Args>) -> Result<u64> {
let stdout = args.stdout();
let mut printer = args.printer(stdout.lock());
let mut file_count = 0;
for p in args.paths() {
if p == Path::new("-") {
printer.path(&Path::new("<stdin>"));
file_count += 1;
} else {
for ent in try!(args.walker(p)) {
printer.path(ent.path());
file_count += 1;
}
}
for result in args.walker() {
let dent = match get_or_log_dir_entry(result, args.no_messages()) {
None => continue,
Some(dent) => dent,
};
printer.path(dent.path());
file_count += 1;
}
Ok(file_count)
}
fn run_types(args: Arc<Args>) -> Result<u64> {
let term = args.stdout();
let mut printer = args.printer(term);
let stdout = args.stdout();
let mut printer = args.printer(stdout.lock());
let mut ty_count = 0;
for def in args.type_defs() {
printer.type_def(def);
@@ -227,135 +245,95 @@ fn run_types(args: Arc<Args>) -> Result<u64> {
Ok(ty_count)
}
enum Work {
Stdin,
File(DirEntry),
Quit,
}
enum WorkReady {
Stdin,
DirFile(DirEntry, File),
}
struct MultiWorker {
chan_work: Stealer<Work>,
out: Arc<Mutex<Out>>,
#[cfg(not(windows))]
outbuf: Option<ColoredTerminal<term::TerminfoTerminal<Vec<u8>>>>,
#[cfg(windows)]
outbuf: Option<ColoredTerminal<WindowsBuffer>>,
worker: Worker,
}
struct Worker {
args: Arc<Args>,
inpbuf: InputBuffer,
grep: Grep,
match_count: u64,
}
impl MultiWorker {
fn run(mut self) -> u64 {
loop {
let work = match self.chan_work.steal() {
Stolen::Empty | Stolen::Abort => continue,
Stolen::Data(Work::Quit) => break,
Stolen::Data(Work::Stdin) => WorkReady::Stdin,
Stolen::Data(Work::File(ent)) => {
match File::open(ent.path()) {
Ok(file) => WorkReady::DirFile(ent, file),
Err(err) => {
eprintln!("{}: {}", ent.path().display(), err);
continue;
}
}
}
};
let mut outbuf = self.outbuf.take().unwrap();
outbuf.clear();
let mut printer = self.worker.args.printer(outbuf);
self.worker.do_work(&mut printer, work);
let outbuf = printer.into_inner();
if !outbuf.get_ref().is_empty() {
let mut out = self.out.lock().unwrap();
out.write(&outbuf);
}
self.outbuf = Some(outbuf);
}
self.worker.match_count
}
}
impl Worker {
fn do_work<W: Terminal + Send>(
&mut self,
printer: &mut Printer<W>,
work: WorkReady,
) {
let result = match work {
WorkReady::Stdin => {
let stdin = io::stdin();
let stdin = stdin.lock();
self.search(printer, &Path::new("<stdin>"), stdin)
}
WorkReady::DirFile(ent, file) => {
let mut path = ent.path();
if let Some(p) = strip_prefix("./", path) {
path = p;
}
if self.args.mmap() {
self.search_mmap(printer, path, &file)
} else {
self.search(printer, path, file)
}
}
};
match result {
Ok(count) => {
self.match_count += count;
}
Err(err) => {
fn get_or_log_dir_entry(
result: result::Result<ignore::DirEntry, ignore::Error>,
no_messages: bool,
) -> Option<ignore::DirEntry> {
match result {
Err(err) => {
if !no_messages {
eprintln!("{}", err);
}
None
}
}
fn search<R: io::Read, W: Terminal + Send>(
&mut self,
printer: &mut Printer<W>,
path: &Path,
rdr: R,
) -> Result<u64> {
self.args.searcher(
&mut self.inpbuf,
printer,
&self.grep,
path,
rdr,
).run().map_err(From::from)
}
fn search_mmap<W: Terminal + Send>(
&mut self,
printer: &mut Printer<W>,
path: &Path,
file: &File,
) -> Result<u64> {
if try!(file.metadata()).len() == 0 {
// Opening a memory map with an empty file results in an error.
// However, this may not actually be an empty file! For example,
// /proc/cpuinfo reports itself as an empty file, but it can
// produce data when it's read from. Therefore, we fall back to
// regular read calls.
return self.search(printer, path, file);
Ok(dent) => {
if let Some(err) = dent.error() {
if !no_messages {
eprintln!("{}", err);
}
}
let ft = match dent.file_type() {
None => return Some(dent), // entry is stdin
Some(ft) => ft,
};
// A depth of 0 means the user gave the path explicitly, so we
// should always try to search it.
if dent.depth() == 0 && !ft.is_dir() {
Some(dent)
} else if ft.is_file() {
Some(dent)
} else {
None
}
}
}
}
fn version() -> String {
let (maj, min, pat) = (
option_env!("CARGO_PKG_VERSION_MAJOR"),
option_env!("CARGO_PKG_VERSION_MINOR"),
option_env!("CARGO_PKG_VERSION_PATCH"),
);
match (maj, min, pat) {
(Some(maj), Some(min), Some(pat)) => {
format!("ripgrep {}.{}.{}", maj, min, pat)
}
_ => "".to_owned(),
}
}
fn eprint_nothing_searched() {
eprintln!("No files were searched, which means ripgrep probably \
applied a filter you didn't expect. \
Try running again with --debug.");
}
/// A simple thread safe abstraction for determining whether a search should
/// stop if the user has requested quiet mode.
#[derive(Clone, Debug)]
pub struct QuietMatched(Arc<Option<AtomicBool>>);
impl QuietMatched {
/// Create a new QuietMatched value.
///
/// If quiet is true, then set_match and has_match will reflect whether
/// a search should quit or not because it found a match.
///
/// If quiet is false, then set_match is always a no-op and has_match
/// always returns false.
pub fn new(quiet: bool) -> QuietMatched {
let atomic = if quiet { Some(AtomicBool::new(false)) } else { None };
QuietMatched(Arc::new(atomic))
}
/// Returns true if and only if quiet mode is enabled and a match has
/// occurred.
pub fn has_match(&self) -> bool {
match *self.0 {
None => false,
Some(ref matched) => matched.load(Ordering::SeqCst),
}
}
/// Sets whether a match has occurred or not.
///
/// If quiet mode is disabled, then this is a no-op.
pub fn set_match(&self, yes: bool) -> bool {
match *self.0 {
None => false,
Some(_) if !yes => false,
Some(ref m) => { m.store(true, Ordering::SeqCst); true }
}
let mmap = try!(Mmap::open(file, Protection::Read));
Ok(self.args.searcher_buffer(
printer,
&self.grep,
path,
unsafe { mmap.as_slice() },
).run())
}
}

View File

@@ -1,374 +0,0 @@
use std::io::{self, Write};
use term::{self, Terminal};
#[cfg(not(windows))]
use term::terminfo::TermInfo;
#[cfg(windows)]
use term::WinConsole;
#[cfg(windows)]
use terminal_win::WindowsBuffer;
/// Out controls the actual output of all search results for a particular file
/// to the end user.
///
/// (The difference between Out and Printer is that a Printer works with
/// individual search results where as Out works with search results for each
/// file as a whole. For example, it knows when to print a file separator.)
pub struct Out {
#[cfg(not(windows))]
term: ColoredTerminal<term::TerminfoTerminal<io::BufWriter<io::Stdout>>>,
#[cfg(windows)]
term: ColoredTerminal<WinConsole<io::Stdout>>,
printed: bool,
file_separator: Option<Vec<u8>>,
}
impl Out {
/// Create a new Out that writes to the wtr given.
#[cfg(not(windows))]
pub fn new(color: bool) -> Out {
let wtr = io::BufWriter::new(io::stdout());
Out {
term: ColoredTerminal::new(wtr, color),
printed: false,
file_separator: None,
}
}
/// Create a new Out that writes to the wtr given.
#[cfg(windows)]
pub fn new(color: bool) -> Out {
Out {
term: ColoredTerminal::new_stdout(color),
printed: false,
file_separator: None,
}
}
/// If set, the separator is printed between matches from different files.
/// By default, no separator is printed.
pub fn file_separator(mut self, sep: Vec<u8>) -> Out {
self.file_separator = Some(sep);
self
}
/// Write the search results of a single file to the underlying wtr and
/// flush wtr.
#[cfg(not(windows))]
pub fn write(
&mut self,
buf: &ColoredTerminal<term::TerminfoTerminal<Vec<u8>>>,
) {
self.write_sep();
match *buf {
ColoredTerminal::Colored(ref tt) => {
let _ = self.term.write_all(tt.get_ref());
}
ColoredTerminal::NoColor(ref buf) => {
let _ = self.term.write_all(buf);
}
}
self.write_done();
}
/// Write the search results of a single file to the underlying wtr and
/// flush wtr.
#[cfg(windows)]
pub fn write(
&mut self,
buf: &ColoredTerminal<WindowsBuffer>,
) {
self.write_sep();
match *buf {
ColoredTerminal::Colored(ref tt) => {
tt.print_stdout(&mut self.term);
}
ColoredTerminal::NoColor(ref buf) => {
let _ = self.term.write_all(buf);
}
}
self.write_done();
}
fn write_sep(&mut self) {
if let Some(ref sep) = self.file_separator {
if self.printed {
let _ = self.term.write_all(sep);
let _ = self.term.write_all(b"\n");
}
}
}
fn write_done(&mut self) {
let _ = self.term.flush();
self.printed = true;
}
}
/// ColoredTerminal provides optional colored output through the term::Terminal
/// trait. In particular, it will dynamically configure itself to use coloring
/// if it's available in the environment.
#[derive(Clone, Debug)]
pub enum ColoredTerminal<T: Terminal + Send> {
Colored(T),
NoColor(T::Output),
}
#[cfg(not(windows))]
impl<W: io::Write + Send> ColoredTerminal<term::TerminfoTerminal<W>> {
/// Create a new output buffer.
///
/// When color is true, the buffer will attempt to support coloring.
pub fn new(wtr: W, color: bool) -> Self {
lazy_static! {
// Only pay for parsing the terminfo once.
static ref TERMINFO: Option<TermInfo> = {
match TermInfo::from_env() {
Ok(info) => Some(info),
Err(err) => {
debug!("error loading terminfo for coloring: {}", err);
None
}
}
};
}
// If we want color, build a term::TerminfoTerminal and see if the
// current environment supports coloring. If not, bail with NoColor. To
// avoid losing our writer (ownership), do this the long way.
if !color {
return ColoredTerminal::NoColor(wtr);
}
let terminfo = match *TERMINFO {
None => return ColoredTerminal::NoColor(wtr),
Some(ref ti) => {
// Ug, this should go away with the next release of `term`.
TermInfo {
names: ti.names.clone(),
bools: ti.bools.clone(),
numbers: ti.numbers.clone(),
strings: ti.strings.clone(),
}
}
};
let tt = term::TerminfoTerminal::new_with_terminfo(wtr, terminfo);
if !tt.supports_color() {
debug!("environment doesn't support coloring");
return ColoredTerminal::NoColor(tt.into_inner());
}
ColoredTerminal::Colored(tt)
}
}
#[cfg(not(windows))]
impl ColoredTerminal<term::TerminfoTerminal<Vec<u8>>> {
/// Clear the give buffer of all search results such that it is reusable
/// in another search.
pub fn clear(&mut self) {
match *self {
ColoredTerminal::Colored(ref mut tt) => {
tt.get_mut().clear();
}
ColoredTerminal::NoColor(ref mut buf) => {
buf.clear();
}
}
}
}
#[cfg(windows)]
impl ColoredTerminal<WindowsBuffer> {
/// Create a new output buffer.
///
/// When color is true, the buffer will attempt to support coloring.
pub fn new_buffer(color: bool) -> Self {
if !color {
ColoredTerminal::NoColor(vec![])
} else {
ColoredTerminal::Colored(WindowsBuffer::new())
}
}
/// Clear the give buffer of all search results such that it is reusable
/// in another search.
pub fn clear(&mut self) {
match *self {
ColoredTerminal::Colored(ref mut win) => win.clear(),
ColoredTerminal::NoColor(ref mut buf) => buf.clear(),
}
}
}
#[cfg(windows)]
impl ColoredTerminal<WinConsole<io::Stdout>> {
/// Create a new output buffer.
///
/// When color is true, the buffer will attempt to support coloring.
pub fn new_stdout(color: bool) -> Self {
if !color {
return ColoredTerminal::NoColor(io::stdout());
}
match WinConsole::new(io::stdout()) {
Ok(win) => ColoredTerminal::Colored(win),
Err(_) => ColoredTerminal::NoColor(io::stdout()),
}
}
}
impl<T: Terminal + Send> ColoredTerminal<T> {
fn map_result<F>(
&mut self,
mut f: F,
) -> term::Result<()>
where F: FnMut(&mut T) -> term::Result<()> {
match *self {
ColoredTerminal::Colored(ref mut w) => f(w),
ColoredTerminal::NoColor(_) => Err(term::Error::NotSupported),
}
}
fn map_bool<F>(
&self,
mut f: F,
) -> bool
where F: FnMut(&T) -> bool {
match *self {
ColoredTerminal::Colored(ref w) => f(w),
ColoredTerminal::NoColor(_) => false,
}
}
}
impl<T: Terminal + Send> io::Write for ColoredTerminal<T> {
fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
match *self {
ColoredTerminal::Colored(ref mut w) => w.write(buf),
ColoredTerminal::NoColor(ref mut w) => w.write(buf),
}
}
fn flush(&mut self) -> io::Result<()> {
Ok(())
}
}
impl<T: Terminal + Send> term::Terminal for ColoredTerminal<T> {
type Output = T::Output;
fn fg(&mut self, fg: term::color::Color) -> term::Result<()> {
self.map_result(|w| w.fg(fg))
}
fn bg(&mut self, bg: term::color::Color) -> term::Result<()> {
self.map_result(|w| w.bg(bg))
}
fn attr(&mut self, attr: term::Attr) -> term::Result<()> {
self.map_result(|w| w.attr(attr))
}
fn supports_attr(&self, attr: term::Attr) -> bool {
self.map_bool(|w| w.supports_attr(attr))
}
fn reset(&mut self) -> term::Result<()> {
self.map_result(|w| w.reset())
}
fn supports_reset(&self) -> bool {
self.map_bool(|w| w.supports_reset())
}
fn supports_color(&self) -> bool {
self.map_bool(|w| w.supports_color())
}
fn cursor_up(&mut self) -> term::Result<()> {
self.map_result(|w| w.cursor_up())
}
fn delete_line(&mut self) -> term::Result<()> {
self.map_result(|w| w.delete_line())
}
fn carriage_return(&mut self) -> term::Result<()> {
self.map_result(|w| w.carriage_return())
}
fn get_ref(&self) -> &Self::Output {
match *self {
ColoredTerminal::Colored(ref w) => w.get_ref(),
ColoredTerminal::NoColor(ref w) => w,
}
}
fn get_mut(&mut self) -> &mut Self::Output {
match *self {
ColoredTerminal::Colored(ref mut w) => w.get_mut(),
ColoredTerminal::NoColor(ref mut w) => w,
}
}
fn into_inner(self) -> Self::Output {
match self {
ColoredTerminal::Colored(w) => w.into_inner(),
ColoredTerminal::NoColor(w) => w,
}
}
}
impl<'a, T: Terminal + Send> term::Terminal for &'a mut ColoredTerminal<T> {
type Output = T::Output;
fn fg(&mut self, fg: term::color::Color) -> term::Result<()> {
(**self).fg(fg)
}
fn bg(&mut self, bg: term::color::Color) -> term::Result<()> {
(**self).bg(bg)
}
fn attr(&mut self, attr: term::Attr) -> term::Result<()> {
(**self).attr(attr)
}
fn supports_attr(&self, attr: term::Attr) -> bool {
(**self).supports_attr(attr)
}
fn reset(&mut self) -> term::Result<()> {
(**self).reset()
}
fn supports_reset(&self) -> bool {
(**self).supports_reset()
}
fn supports_color(&self) -> bool {
(**self).supports_color()
}
fn cursor_up(&mut self) -> term::Result<()> {
(**self).cursor_up()
}
fn delete_line(&mut self) -> term::Result<()> {
(**self).delete_line()
}
fn carriage_return(&mut self) -> term::Result<()> {
(**self).carriage_return()
}
fn get_ref(&self) -> &Self::Output {
(**self).get_ref()
}
fn get_mut(&mut self) -> &mut Self::Output {
(**self).get_mut()
}
fn into_inner(self) -> Self::Output {
// Good golly miss molly...
unimplemented!()
}
}

View File

@@ -8,7 +8,6 @@ with the raw bytes directly.
On large repositories (like chromium), this can have a ~25% performance
improvement on just listing the files to search (!).
*/
use std::ffi::OsStr;
use std::path::Path;
/// Strip `prefix` from the `path` and return the remainder.
@@ -19,6 +18,7 @@ pub fn strip_prefix<'a, P: AsRef<Path> + ?Sized>(
prefix: &'a P,
path: &'a Path,
) -> Option<&'a Path> {
use std::ffi::OsStr;
use std::os::unix::ffi::OsStrExt;
let prefix = prefix.as_ref().as_os_str().as_bytes();
@@ -40,79 +40,3 @@ pub fn strip_prefix<'a, P: AsRef<Path> + ?Sized>(
) -> Option<&'a Path> {
path.strip_prefix(prefix).ok()
}
/// The final component of the path, if it is a normal file.
///
/// If the path terminates in ., .., or consists solely of a root of prefix,
/// file_name will return None.
#[cfg(unix)]
pub fn file_name<'a, P: AsRef<Path> + ?Sized>(
path: &'a P,
) -> Option<&'a OsStr> {
use std::os::unix::ffi::OsStrExt;
use memchr::memrchr;
let path = path.as_ref().as_os_str().as_bytes();
if path.is_empty() {
return None;
} else if path.len() == 1 && path[0] == b'.' {
return None;
} else if path.last() == Some(&b'.') {
return None;
} else if path.len() >= 2 && &path[path.len() - 2..] == &b".."[..] {
return None;
}
let last_slash = memrchr(b'/', path).map(|i| i + 1).unwrap_or(0);
Some(OsStr::from_bytes(&path[last_slash..]))
}
/// The final component of the path, if it is a normal file.
///
/// If the path terminates in ., .., or consists solely of a root of prefix,
/// file_name will return None.
#[cfg(not(unix))]
pub fn file_name<'a, P: AsRef<Path> + ?Sized>(
path: &'a P,
) -> Option<&'a OsStr> {
path.as_ref().file_name()
}
/// Returns true if and only if this file path is considered to be hidden.
#[cfg(unix)]
pub fn is_hidden<P: AsRef<Path>>(path: P) -> bool {
use std::os::unix::ffi::OsStrExt;
if let Some(name) = file_name(path.as_ref()) {
name.as_bytes().get(0) == Some(&b'.')
} else {
false
}
}
/// Returns true if and only if this file path is considered to be hidden.
#[cfg(not(unix))]
pub fn is_hidden<P: AsRef<Path>>(path: P) -> bool {
if let Some(name) = file_name(path.as_ref()) {
name.to_str().map(|s| s.starts_with(".")).unwrap_or(false)
} else {
false
}
}
/// Returns true if this file path is just a file name. i.e., Its parent is
/// the empty string.
#[cfg(unix)]
pub fn is_file_name<P: AsRef<Path>>(path: P) -> bool {
use std::os::unix::ffi::OsStrExt;
use memchr::memchr;
let path = path.as_ref().as_os_str().as_bytes();
memchr(b'/', path).is_none()
}
/// Returns true if this file path is just a file name. i.e., Its parent is
/// the empty string.
#[cfg(not(unix))]
pub fn is_file_name<P: AsRef<Path>>(path: P) -> bool {
path.as_ref().parent().map(|p| p.as_os_str().is_empty()).unwrap_or(false)
}

View File

@@ -1,11 +1,13 @@
use std::error;
use std::fmt;
use std::path::Path;
use std::str::FromStr;
use regex::bytes::Regex;
use term::{Attr, Terminal};
use term::color;
use termcolor::{Color, ColorSpec, ParseColorError, WriteColor};
use pathutil::strip_prefix;
use types::FileTypeDef;
use ignore::types::FileTypeDef;
/// Printer encapsulates all output logic for searching.
///
@@ -33,8 +35,6 @@ pub struct Printer<W> {
heading: bool,
/// Whether to show every match on its own line.
line_per_match: bool,
/// Whether to suppress all output.
quiet: bool,
/// Whether to print NUL bytes after a file path instead of new lines
/// or `:`.
null: bool,
@@ -42,10 +42,12 @@ pub struct Printer<W> {
replace: Option<Vec<u8>>,
/// Whether to prefix each match with the corresponding file name.
with_filename: bool,
/// The color specifications.
colors: ColorSpecs,
}
impl<W: Terminal + Send> Printer<W> {
/// Create a new printer that writes to wtr.
impl<W: WriteColor> Printer<W> {
/// Create a new printer that writes to wtr with the given color settings.
pub fn new(wtr: W) -> Printer<W> {
Printer {
wtr: wtr,
@@ -56,13 +58,19 @@ impl<W: Terminal + Send> Printer<W> {
file_separator: None,
heading: false,
line_per_match: false,
quiet: false,
null: false,
replace: None,
with_filename: false,
colors: ColorSpecs::default(),
}
}
/// Set the color specifications.
pub fn colors(mut self, colors: ColorSpecs) -> Printer<W> {
self.colors = colors;
self
}
/// When set, column numbers will be printed for the first match on each
/// line.
pub fn column(mut self, yes: bool) -> Printer<W> {
@@ -110,12 +118,6 @@ impl<W: Terminal + Send> Printer<W> {
self
}
/// When set, all output is suppressed.
pub fn quiet(mut self, yes: bool) -> Printer<W> {
self.quiet = yes;
self
}
/// Replace every match in each matching line with the replacement string
/// given.
///
@@ -137,12 +139,8 @@ impl<W: Terminal + Send> Printer<W> {
self.has_printed
}
/// Returns true if the printer has been configured to be quiet.
pub fn is_quiet(&self) -> bool {
self.quiet
}
/// Flushes the underlying writer and returns it.
#[allow(dead_code)]
pub fn into_inner(mut self) -> W {
let _ = self.wtr.flush();
self.wtr
@@ -153,11 +151,11 @@ impl<W: Terminal + Send> Printer<W> {
self.write(def.name().as_bytes());
self.write(b": ");
let mut first = true;
for pat in def.patterns() {
for glob in def.globs() {
if !first {
self.write(b", ");
}
self.write(pat.as_bytes());
self.write(glob.as_bytes());
first = false;
}
self.write_eol();
@@ -191,9 +189,6 @@ impl<W: Terminal + Send> Printer<W> {
/// Prints the context separator.
pub fn context_separate(&mut self) {
// N.B. We can't use `write` here because of borrowing restrictions.
if self.quiet {
return;
}
if self.context_separator.is_empty() {
return;
}
@@ -243,12 +238,7 @@ impl<W: Terminal + Send> Printer<W> {
self.write_file_sep();
self.write_heading(path.as_ref());
} else if !self.heading && self.with_filename {
self.write_path(path.as_ref());
if self.null {
self.write(b"\x00");
} else {
self.write(b":");
}
self.write_non_heading_path(path.as_ref());
}
if let Some(line_number) = line_number {
self.line_number(line_number, b':');
@@ -277,8 +267,7 @@ impl<W: Terminal + Send> Printer<W> {
let mut last_written = 0;
for (s, e) in re.find_iter(buf) {
self.write(&buf[last_written..s]);
let _ = self.wtr.fg(color::BRIGHT_RED);
let _ = self.wtr.attr(Attr::Bold);
let _ = self.wtr.set_color(self.colors.matched());
self.write(&buf[s..e]);
let _ = self.wtr.reset();
last_written = e;
@@ -315,30 +304,31 @@ impl<W: Terminal + Send> Printer<W> {
}
fn write_heading<P: AsRef<Path>>(&mut self, path: P) {
if self.wtr.supports_color() {
let _ = self.wtr.fg(color::BRIGHT_GREEN);
let _ = self.wtr.attr(Attr::Bold);
}
let _ = self.wtr.set_color(self.colors.path());
self.write_path(path.as_ref());
let _ = self.wtr.reset();
if self.null {
self.write(b"\x00");
} else {
self.write_eol();
}
if self.wtr.supports_color() {
let _ = self.wtr.reset();
}
fn write_non_heading_path<P: AsRef<Path>>(&mut self, path: P) {
let _ = self.wtr.set_color(self.colors.path());
self.write_path(path.as_ref());
let _ = self.wtr.reset();
if self.null {
self.write(b"\x00");
} else {
self.write(b":");
}
}
fn line_number(&mut self, n: u64, sep: u8) {
if self.wtr.supports_color() {
let _ = self.wtr.fg(color::BRIGHT_BLUE);
let _ = self.wtr.attr(Attr::Bold);
}
let _ = self.wtr.set_color(self.colors.line());
self.write(n.to_string().as_bytes());
if self.wtr.supports_color() {
let _ = self.wtr.reset();
}
let _ = self.wtr.reset();
self.write(&[sep]);
}
@@ -356,9 +346,6 @@ impl<W: Terminal + Send> Printer<W> {
}
fn write(&mut self, buf: &[u8]) {
if self.quiet {
return;
}
self.has_printed = true;
let _ = self.wtr.write_all(buf);
}
@@ -369,9 +356,6 @@ impl<W: Terminal + Send> Printer<W> {
}
fn write_file_sep(&mut self) {
if self.quiet {
return;
}
if let Some(ref sep) = self.file_separator {
self.has_printed = true;
let _ = self.wtr.write_all(sep);
@@ -379,3 +363,362 @@ impl<W: Terminal + Send> Printer<W> {
}
}
}
/// An error that can occur when parsing color specifications.
#[derive(Clone, Debug, Eq, PartialEq)]
pub enum Error {
/// This occurs when an unrecognized output type is used.
UnrecognizedOutType(String),
/// This occurs when an unrecognized spec type is used.
UnrecognizedSpecType(String),
/// This occurs when an unrecognized color name is used.
UnrecognizedColor(String, String),
/// This occurs when an unrecognized style attribute is used.
UnrecognizedStyle(String),
/// This occurs when the format of a color specification is invalid.
InvalidFormat(String),
}
impl error::Error for Error {
fn description(&self) -> &str {
match *self {
Error::UnrecognizedOutType(_) => "unrecognized output type",
Error::UnrecognizedSpecType(_) => "unrecognized spec type",
Error::UnrecognizedColor(_, _) => "unrecognized color name",
Error::UnrecognizedStyle(_) => "unrecognized style attribute",
Error::InvalidFormat(_) => "invalid color spec",
}
}
fn cause(&self) -> Option<&error::Error> {
None
}
}
impl fmt::Display for Error {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
match *self {
Error::UnrecognizedOutType(ref name) => {
write!(f, "Unrecognized output type '{}'. Choose from: \
path, line, match.", name)
}
Error::UnrecognizedSpecType(ref name) => {
write!(f, "Unrecognized spec type '{}'. Choose from: \
fg, bg, style, none.", name)
}
Error::UnrecognizedColor(_, ref msg) => {
write!(f, "{}", msg)
}
Error::UnrecognizedStyle(ref name) => {
write!(f, "Unrecognized style attribute '{}'. Choose from: \
nobold, bold.", name)
}
Error::InvalidFormat(ref original) => {
write!(f, "Invalid color speci format: '{}'. Valid format \
is '(path|line|match):(fg|bg|style):(value)'.",
original)
}
}
}
}
impl From<ParseColorError> for Error {
fn from(err: ParseColorError) -> Error {
Error::UnrecognizedColor(err.invalid().to_string(), err.to_string())
}
}
/// A merged set of color specifications.
#[derive(Clone, Debug, Default, Eq, PartialEq)]
pub struct ColorSpecs {
path: ColorSpec,
line: ColorSpec,
matched: ColorSpec,
}
/// A single color specification provided by the user.
///
/// A `ColorSpecs` can be built by merging a sequence of `Spec`s.
///
/// ## Example
///
/// The only way to build a `Spec` is to parse it from a string. Once multiple
/// `Spec`s have been constructed, then can be merged into a single
/// `ColorSpecs` value.
///
/// ```rust
/// use termcolor::{Color, ColorSpecs, Spec};
///
/// let spec1: Spec = "path:fg:blue".parse().unwrap();
/// let spec2: Spec = "match:bg:green".parse().unwrap();
/// let specs = ColorSpecs::new(&[spec1, spec2]);
///
/// assert_eq!(specs.path().fg(), Some(Color::Blue));
/// assert_eq!(specs.matched().bg(), Some(Color::Green));
/// ```
///
/// ## Format
///
/// The format of a `Spec` is a triple: `{type}:{attribute}:{value}`. Each
/// component is defined as follows:
///
/// * `{type}` can be one of `path`, `line` or `match`.
/// * `{attribute}` can be one of `fg`, `bg` or `style`. `{attribute}` may also
/// be the special value `none`, in which case, `{value}` can be omitted.
/// * `{value}` is either a color name (for `fg`/`bg`) or a style instruction.
///
/// `{type}` controls which part of the output should be styled and is
/// application dependent.
///
/// When `{attribute}` is `none`, then this should cause any existing color
/// settings to be cleared.
///
/// `{value}` should be a color when `{attribute}` is `fg` or `bg`, or it
/// should be a style instruction when `{attribute}` is `style`. When
/// `{attribute}` is `none`, `{value}` must be omitted.
///
/// Valid colors are `black`, `blue`, `green`, `red`, `cyan`, `magenta`,
/// `yellow`, `white`.
///
/// Valid style instructions are `nobold` and `bold`.
#[derive(Clone, Debug, Eq, PartialEq)]
pub struct Spec {
ty: OutType,
value: SpecValue,
}
/// The actual value given by the specification.
#[derive(Clone, Debug, Eq, PartialEq)]
enum SpecValue {
None,
Fg(Color),
Bg(Color),
Style(Style),
}
/// The set of configurable portions of ripgrep's output.
#[derive(Clone, Debug, Eq, PartialEq)]
enum OutType {
Path,
Line,
Match,
}
/// The specification type.
#[derive(Clone, Debug, Eq, PartialEq)]
enum SpecType {
Fg,
Bg,
Style,
None,
}
/// The set of available styles for use in the terminal.
#[derive(Clone, Debug, Eq, PartialEq)]
enum Style {
Bold,
NoBold,
}
impl ColorSpecs {
/// Create color specifications from a list of user supplied
/// specifications.
pub fn new(user_specs: &[Spec]) -> ColorSpecs {
let mut specs = ColorSpecs::default();
for user_spec in user_specs {
match user_spec.ty {
OutType::Path => user_spec.merge_into(&mut specs.path),
OutType::Line => user_spec.merge_into(&mut specs.line),
OutType::Match => user_spec.merge_into(&mut specs.matched),
}
}
specs
}
/// Return the color specification for coloring file paths.
fn path(&self) -> &ColorSpec {
&self.path
}
/// Return the color specification for coloring line numbers.
fn line(&self) -> &ColorSpec {
&self.line
}
/// Return the color specification for coloring matched text.
fn matched(&self) -> &ColorSpec {
&self.matched
}
}
impl Spec {
/// Merge this spec into the given color specification.
fn merge_into(&self, cspec: &mut ColorSpec) {
self.value.merge_into(cspec);
}
}
impl SpecValue {
/// Merge this spec value into the given color specification.
fn merge_into(&self, cspec: &mut ColorSpec) {
match *self {
SpecValue::None => cspec.clear(),
SpecValue::Fg(ref color) => { cspec.set_fg(Some(color.clone())); }
SpecValue::Bg(ref color) => { cspec.set_bg(Some(color.clone())); }
SpecValue::Style(ref style) => {
match *style {
Style::Bold => { cspec.set_bold(true); }
Style::NoBold => { cspec.set_bold(false); }
}
}
}
}
}
impl FromStr for Spec {
type Err = Error;
fn from_str(s: &str) -> Result<Spec, Error> {
let pieces: Vec<&str> = s.split(":").collect();
if pieces.len() <= 1 || pieces.len() > 3 {
return Err(Error::InvalidFormat(s.to_string()));
}
let otype: OutType = try!(pieces[0].parse());
match try!(pieces[1].parse()) {
SpecType::None => Ok(Spec { ty: otype, value: SpecValue::None }),
SpecType::Style => {
if pieces.len() < 3 {
return Err(Error::InvalidFormat(s.to_string()));
}
let style: Style = try!(pieces[2].parse());
Ok(Spec { ty: otype, value: SpecValue::Style(style) })
}
SpecType::Fg => {
if pieces.len() < 3 {
return Err(Error::InvalidFormat(s.to_string()));
}
let color: Color = try!(pieces[2].parse());
Ok(Spec { ty: otype, value: SpecValue::Fg(color) })
}
SpecType::Bg => {
if pieces.len() < 3 {
return Err(Error::InvalidFormat(s.to_string()));
}
let color: Color = try!(pieces[2].parse());
Ok(Spec { ty: otype, value: SpecValue::Bg(color) })
}
}
}
}
impl FromStr for OutType {
type Err = Error;
fn from_str(s: &str) -> Result<OutType, Error> {
match &*s.to_lowercase() {
"path" => Ok(OutType::Path),
"line" => Ok(OutType::Line),
"match" => Ok(OutType::Match),
_ => Err(Error::UnrecognizedOutType(s.to_string())),
}
}
}
impl FromStr for SpecType {
type Err = Error;
fn from_str(s: &str) -> Result<SpecType, Error> {
match &*s.to_lowercase() {
"fg" => Ok(SpecType::Fg),
"bg" => Ok(SpecType::Bg),
"style" => Ok(SpecType::Style),
"none" => Ok(SpecType::None),
_ => Err(Error::UnrecognizedSpecType(s.to_string())),
}
}
}
impl FromStr for Style {
type Err = Error;
fn from_str(s: &str) -> Result<Style, Error> {
match &*s.to_lowercase() {
"bold" => Ok(Style::Bold),
"nobold" => Ok(Style::NoBold),
_ => Err(Error::UnrecognizedStyle(s.to_string())),
}
}
}
#[cfg(test)]
mod tests {
use termcolor::{Color, ColorSpec};
use super::{ColorSpecs, Error, OutType, Spec, SpecValue, Style};
#[test]
fn merge() {
let user_specs: &[Spec] = &[
"match:fg:blue".parse().unwrap(),
"match:none".parse().unwrap(),
"match:style:bold".parse().unwrap(),
];
let mut expect_matched = ColorSpec::new();
expect_matched.set_bold(true);
assert_eq!(ColorSpecs::new(user_specs), ColorSpecs {
path: ColorSpec::default(),
line: ColorSpec::default(),
matched: expect_matched,
});
}
#[test]
fn specs() {
let spec: Spec = "path:fg:blue".parse().unwrap();
assert_eq!(spec, Spec {
ty: OutType::Path,
value: SpecValue::Fg(Color::Blue),
});
let spec: Spec = "path:bg:red".parse().unwrap();
assert_eq!(spec, Spec {
ty: OutType::Path,
value: SpecValue::Bg(Color::Red),
});
let spec: Spec = "match:style:bold".parse().unwrap();
assert_eq!(spec, Spec {
ty: OutType::Match,
value: SpecValue::Style(Style::Bold),
});
let spec: Spec = "line:none".parse().unwrap();
assert_eq!(spec, Spec {
ty: OutType::Line,
value: SpecValue::None,
});
}
#[test]
fn spec_errors() {
let err = "line:nonee".parse::<Spec>().unwrap_err();
assert_eq!(err, Error::UnrecognizedSpecType("nonee".to_string()));
let err = "".parse::<Spec>().unwrap_err();
assert_eq!(err, Error::InvalidFormat("".to_string()));
let err = "foo".parse::<Spec>().unwrap_err();
assert_eq!(err, Error::InvalidFormat("foo".to_string()));
let err = "line:style:italic".parse::<Spec>().unwrap_err();
assert_eq!(err, Error::UnrecognizedStyle("italic".to_string()));
let err = "line:fg:brown".parse::<Spec>().unwrap_err();
match err {
Error::UnrecognizedColor(name, _) => assert_eq!(name, "brown"),
err => assert!(false, "unexpected error: {:?}", err),
}
let err = "foo:fg:brown".parse::<Spec>().unwrap_err();
assert_eq!(err, Error::UnrecognizedOutType("foo".to_string()));
}
}

View File

@@ -10,7 +10,7 @@ use std::cmp;
use std::path::Path;
use grep::Grep;
use term::Terminal;
use termcolor::WriteColor;
use printer::Printer;
use search_stream::{IterLines, Options, count_lines, is_binary};
@@ -26,7 +26,7 @@ pub struct BufferSearcher<'a, W: 'a> {
last_line: usize,
}
impl<'a, W: Send + Terminal> BufferSearcher<'a, W> {
impl<'a, W: WriteColor> BufferSearcher<'a, W> {
pub fn new(
printer: &'a mut Printer<W>,
grep: &'a Grep,
@@ -61,6 +61,15 @@ impl<'a, W: Send + Terminal> BufferSearcher<'a, W> {
self
}
/// If enabled, searching will print the path of files that *don't* match
/// the given pattern.
///
/// Disabled by default.
pub fn files_without_matches(mut self, yes: bool) -> Self {
self.opts.files_without_matches = yes;
self
}
/// Set the end-of-line byte used by this searcher.
pub fn eol(mut self, eol: u8) -> Self {
self.opts.eol = eol;
@@ -81,6 +90,21 @@ impl<'a, W: Send + Terminal> BufferSearcher<'a, W> {
self
}
/// Limit the number of matches to the given count.
///
/// The default is None, which corresponds to no limit.
pub fn max_count(mut self, count: Option<u64>) -> Self {
self.opts.max_count = count;
self
}
/// If enabled, don't show any output and quit searching after the first
/// match is found.
pub fn quiet(mut self, yes: bool) -> Self {
self.opts.quiet = yes;
self
}
/// If enabled, search binary files as if they were text.
pub fn text(mut self, yes: bool) -> Self {
self.opts.text = yes;
@@ -104,11 +128,11 @@ impl<'a, W: Send + Terminal> BufferSearcher<'a, W> {
self.print_match(m.start(), m.end());
}
last_end = m.end();
if self.printer.is_quiet() || self.opts.files_with_matches {
if self.opts.terminate(self.match_count) {
break;
}
}
if self.opts.invert_match {
if self.opts.invert_match && !self.opts.terminate(self.match_count) {
let upto = self.buf.len();
self.print_inverted_matches(last_end, upto);
}
@@ -118,6 +142,9 @@ impl<'a, W: Send + Terminal> BufferSearcher<'a, W> {
if self.opts.files_with_matches && self.match_count > 0 {
self.printer.path(self.path);
}
if self.opts.files_without_matches && self.match_count == 0 {
self.printer.path(self.path);
}
self.match_count
}
@@ -139,6 +166,9 @@ impl<'a, W: Send + Terminal> BufferSearcher<'a, W> {
debug_assert!(self.opts.invert_match);
let mut it = IterLines::new(self.opts.eol, start);
while let Some((s, e)) = it.next(&self.buf[..end]) {
if self.opts.terminate(self.match_count) {
return;
}
self.print_match(s, e);
}
}
@@ -166,10 +196,9 @@ mod tests {
use std::path::Path;
use grep::GrepBuilder;
use term::{Terminal, TerminfoTerminal};
use out::ColoredTerminal;
use printer::Printer;
use termcolor;
use super::BufferSearcher;
@@ -186,15 +215,14 @@ and exhibited clearly, with a label attached.\
&Path::new("/baz.rs")
}
type TestSearcher<'a> =
BufferSearcher<'a, ColoredTerminal<TerminfoTerminal<Vec<u8>>>>;
type TestSearcher<'a> = BufferSearcher<'a, termcolor::NoColor<Vec<u8>>>;
fn search<F: FnMut(TestSearcher) -> TestSearcher>(
pat: &str,
haystack: &str,
mut map: F,
) -> (u64, String) {
let outbuf = ColoredTerminal::NoColor(vec![]);
let outbuf = termcolor::NoColor::new(vec![]);
let mut pp = Printer::new(outbuf).with_filename(true);
let grep = GrepBuilder::new(pat).build().unwrap();
let count = {
@@ -259,6 +287,34 @@ and exhibited clearly, with a label attached.\
assert_eq!(out, "/baz.rs\n");
}
#[test]
fn files_without_matches() {
let (count, out) = search(
"zzzz", SHERLOCK, |s| s.files_without_matches(true));
assert_eq!(0, count);
assert_eq!(out, "/baz.rs\n");
}
#[test]
fn max_count() {
let (count, out) = search(
"Sherlock", SHERLOCK, |s| s.max_count(Some(1)));
assert_eq!(1, count);
assert_eq!(out, "\
/baz.rs:For the Doctor Watsons of this world, as opposed to the Sherlock
");
}
#[test]
fn invert_match_max_count() {
let (count, out) = search(
"zzzz", SHERLOCK, |s| s.invert_match(true).max_count(Some(1)));
assert_eq!(1, count);
assert_eq!(out, "\
/baz.rs:For the Doctor Watsons of this world, as opposed to the Sherlock
");
}
#[test]
fn invert_match() {
let (count, out) = search(

View File

@@ -10,9 +10,10 @@ use std::fmt;
use std::io;
use std::path::{Path, PathBuf};
use bytecount;
use grep::{Grep, Match};
use memchr::{memchr, memrchr};
use term::Terminal;
use termcolor::WriteColor;
use printer::Printer;
@@ -81,9 +82,12 @@ pub struct Options {
pub before_context: usize,
pub count: bool,
pub files_with_matches: bool,
pub files_without_matches: bool,
pub eol: u8,
pub invert_match: bool,
pub line_number: bool,
pub max_count: Option<u64>,
pub quiet: bool,
pub text: bool,
}
@@ -94,9 +98,12 @@ impl Default for Options {
before_context: 0,
count: false,
files_with_matches: false,
files_without_matches: false,
eol: b'\n',
invert_match: false,
line_number: false,
max_count: None,
quiet: false,
text: false,
}
}
@@ -104,14 +111,32 @@ impl Default for Options {
}
impl Options {
/// Both --count and --files-with-matches options imply that we should not
/// display matches at all.
/// Several options (--quiet, --count, --files-with-matches,
/// --files-without-match) imply that we shouldn't ever display matches.
pub fn skip_matches(&self) -> bool {
return self.count || self.files_with_matches;
self.count || self.files_with_matches || self.files_without_matches
|| self.quiet
}
/// Some options (--quiet, --files-with-matches, --files-without-match)
/// imply that we can stop searching after the first match.
pub fn stop_after_first_match(&self) -> bool {
self.files_with_matches || self.files_without_matches || self.quiet
}
/// Returns true if the search should terminate based on the match count.
pub fn terminate(&self, match_count: u64) -> bool {
if match_count > 0 && self.stop_after_first_match() {
return true;
}
if self.max_count.map_or(false, |max| match_count >= max) {
return true;
}
false
}
}
impl<'a, R: io::Read, W: Terminal + Send> Searcher<'a, R, W> {
impl<'a, R: io::Read, W: WriteColor> Searcher<'a, R, W> {
/// Create a new searcher.
///
/// `inp` is a reusable input buffer that is used as scratch space by this
@@ -177,6 +202,14 @@ impl<'a, R: io::Read, W: Terminal + Send> Searcher<'a, R, W> {
self
}
/// If enabled, searching will print the path of files without any matches.
///
/// Disabled by default.
pub fn files_without_matches(mut self, yes: bool) -> Self {
self.opts.files_without_matches = yes;
self
}
/// Set the end-of-line byte used by this searcher.
pub fn eol(mut self, eol: u8) -> Self {
self.opts.eol = eol;
@@ -197,6 +230,21 @@ impl<'a, R: io::Read, W: Terminal + Send> Searcher<'a, R, W> {
self
}
/// Limit the number of matches to the given count.
///
/// The default is None, which corresponds to no limit.
pub fn max_count(mut self, count: Option<u64>) -> Self {
self.opts.max_count = count;
self
}
/// If enabled, don't show any output and quit searching after the first
/// match is found.
pub fn quiet(mut self, yes: bool) -> Self {
self.opts.quiet = yes;
self
}
/// If enabled, search binary files as if they were text.
pub fn text(mut self, yes: bool) -> Self {
self.opts.text = yes;
@@ -259,14 +307,15 @@ impl<'a, R: io::Read, W: Terminal + Send> Searcher<'a, R, W> {
} else if self.opts.files_with_matches {
self.printer.path(self.path);
}
} else if self.match_count == 0 && self.opts.files_without_matches {
self.printer.path(self.path);
}
Ok(self.match_count)
}
#[inline(always)]
fn terminate(&self) -> bool {
self.match_count > 0
&& (self.printer.is_quiet() || self.opts.files_with_matches)
self.opts.terminate(self.match_count)
}
#[inline(always)]
@@ -303,6 +352,9 @@ impl<'a, R: io::Read, W: Terminal + Send> Searcher<'a, R, W> {
debug_assert!(self.opts.invert_match);
let mut it = IterLines::new(self.opts.eol, self.inp.pos);
while let Some((start, end)) = it.next(&self.inp.buf[..upto]) {
if self.terminate() {
return;
}
self.print_match(start, end);
self.inp.pos = end;
}
@@ -510,9 +562,10 @@ impl InputBuffer {
keep_from: usize,
) -> Result<bool, io::Error> {
// Rollover bytes from buf[keep_from..end] and update our various
// pointers. N.B. This could be done with the unsafe ptr::copy, but
// I haven't been able to produce a benchmark that notices a difference
// in performance. (Invariably, ptr::copy is also clearer IMO.)
// pointers. N.B. This could be done with the ptr::copy, but I haven't
// been able to produce a benchmark that notices a difference in
// performance. (Invariably, ptr::copy is seems clearer IMO, but it is
// not safe.)
self.tmp.clear();
self.tmp.extend_from_slice(&self.buf[keep_from..self.end]);
self.buf[0..self.tmp.len()].copy_from_slice(&self.tmp);
@@ -568,89 +621,9 @@ pub fn is_binary(buf: &[u8]) -> bool {
}
/// Count the number of lines in the given buffer.
#[inline(never)]
#[inline(never)]
pub fn count_lines(buf: &[u8], eol: u8) -> u64 {
// This was adapted from code in the memchr crate. The specific benefit
// here is that we can avoid a branch in the inner loop because all we're
// doing is counting.
// The technique to count EOL bytes was adapted from:
// http://bits.stephan-brumme.com/null.html
const LO_U64: u64 = 0x0101010101010101;
const HI_U64: u64 = 0x8080808080808080;
// use truncation
const LO_USIZE: usize = LO_U64 as usize;
const HI_USIZE: usize = HI_U64 as usize;
#[cfg(target_pointer_width = "32")]
const USIZE_BYTES: usize = 4;
#[cfg(target_pointer_width = "64")]
const USIZE_BYTES: usize = 8;
fn count_eol(eol: usize) -> u64 {
// Ideally, this would compile down to a POPCNT instruction, but
// it looks like you need to set RUSTFLAGS="-C target-cpu=native"
// (or target-feature=+popcnt) to get that to work. Bummer.
(eol.wrapping_sub(LO_USIZE) & !eol & HI_USIZE).count_ones() as u64
}
#[cfg(target_pointer_width = "32")]
fn repeat_byte(b: u8) -> usize {
let mut rep = (b as usize) << 8 | b as usize;
rep = rep << 16 | rep;
rep
}
#[cfg(target_pointer_width = "64")]
fn repeat_byte(b: u8) -> usize {
let mut rep = (b as usize) << 8 | b as usize;
rep = rep << 16 | rep;
rep = rep << 32 | rep;
rep
}
fn count_lines_slow(mut buf: &[u8], eol: u8) -> u64 {
let mut count = 0;
while let Some(pos) = memchr(eol, buf) {
count += 1;
buf = &buf[pos + 1..];
}
count
}
let len = buf.len();
let ptr = buf.as_ptr();
let mut count = 0;
// Search up to an aligned boundary...
let align = (ptr as usize) & (USIZE_BYTES - 1);
let mut i = 0;
if align > 0 {
i = cmp::min(USIZE_BYTES - align, len);
count += count_lines_slow(&buf[..i], eol);
}
// ... and search the rest.
let repeated_eol = repeat_byte(eol);
if len >= 2 * USIZE_BYTES {
while i <= len - (2 * USIZE_BYTES) {
unsafe {
let u = *(ptr.offset(i as isize) as *const usize);
let v = *(ptr.offset((i + USIZE_BYTES) as isize)
as *const usize);
count += count_eol(u ^ repeated_eol);
count += count_eol(v ^ repeated_eol);
}
i += USIZE_BYTES * 2;
}
}
count += count_lines_slow(&buf[i..], eol);
count
bytecount::count(buf, eol) as u64
}
/// Replaces a with b in buf.
@@ -790,10 +763,8 @@ mod tests {
use std::path::Path;
use grep::GrepBuilder;
use term::{Terminal, TerminfoTerminal};
use out::ColoredTerminal;
use printer::Printer;
use termcolor;
use super::{InputBuffer, Searcher, start_of_previous_lines};
@@ -833,7 +804,7 @@ fn main() {
type TestSearcher<'a> = Searcher<
'a,
io::Cursor<Vec<u8>>,
ColoredTerminal<TerminfoTerminal<Vec<u8>>>,
termcolor::NoColor<Vec<u8>>,
>;
fn search_smallcap<F: FnMut(TestSearcher) -> TestSearcher>(
@@ -842,7 +813,7 @@ fn main() {
mut map: F,
) -> (u64, String) {
let mut inp = InputBuffer::with_capacity(1);
let outbuf = ColoredTerminal::NoColor(vec![]);
let outbuf = termcolor::NoColor::new(vec![]);
let mut pp = Printer::new(outbuf).with_filename(true);
let grep = GrepBuilder::new(pat).build().unwrap();
let count = {
@@ -859,7 +830,7 @@ fn main() {
mut map: F,
) -> (u64, String) {
let mut inp = InputBuffer::with_capacity(4096);
let outbuf = ColoredTerminal::NoColor(vec![]);
let outbuf = termcolor::NoColor::new(vec![]);
let mut pp = Printer::new(outbuf).with_filename(true);
let grep = GrepBuilder::new(pat).build().unwrap();
let count = {
@@ -1026,6 +997,34 @@ fn main() {
assert_eq!(out, "/baz.rs\n");
}
#[test]
fn files_without_matches() {
let (count, out) = search_smallcap(
"zzzz", SHERLOCK, |s| s.files_without_matches(true));
assert_eq!(0, count);
assert_eq!(out, "/baz.rs\n");
}
#[test]
fn max_count() {
let (count, out) = search_smallcap(
"Sherlock", SHERLOCK, |s| s.max_count(Some(1)));
assert_eq!(1, count);
assert_eq!(out, "\
/baz.rs:For the Doctor Watsons of this world, as opposed to the Sherlock
");
}
#[test]
fn invert_match_max_count() {
let (count, out) = search(
"zzzz", SHERLOCK, |s| s.invert_match(true).max_count(Some(1)));
assert_eq!(1, count);
assert_eq!(out, "\
/baz.rs:For the Doctor Watsons of this world, as opposed to the Sherlock
");
}
#[test]
fn invert_match() {
let (count, out) = search_smallcap(

View File

View File

@@ -1,176 +0,0 @@
/*!
This module contains a Windows-only *in-memory* implementation of the
`term::Terminal` trait.
This particular implementation is a bit idiosyncratic, and the "in-memory"
specification is to blame. In particular, on Windows, coloring requires
communicating with the console synchronously as data is written to stdout.
This is anathema to how ripgrep fundamentally works: by writing search results
to intermediate thread local buffers in order to maximize parallelism.
Eliminating parallelism on Windows isn't an option, because that would negate
a tremendous performance benefit just for coloring.
We've worked around this by providing an implementation of `term::Terminal`
that records precisely where a color or a reset should be invoked, according
to a byte offset in the in memory buffer. When the buffer is actually printed,
we copy the bytes from the buffer to stdout incrementally while invoking the
corresponding console APIs for coloring at the right location.
(Another approach would be to do ANSI coloring unconditionally, then parse that
and translate it to console commands. The advantage of that approach is that
it doesn't require any additional memory for storing offsets. In practice
though, coloring is only used in the terminal, which tends to correspond to
searches that produce very few results with respect to the corpus searched.
Therefore, this is an acceptable trade off. Namely, we do not pay for it when
coloring is disabled.
*/
use std::io;
use term::{self, Terminal};
use term::color::Color;
/// An in-memory buffer that provides Windows console coloring.
#[derive(Clone, Debug)]
pub struct WindowsBuffer {
buf: Vec<u8>,
pos: usize,
colors: Vec<WindowsColor>,
}
/// A color associated with a particular location in a buffer.
#[derive(Clone, Debug)]
struct WindowsColor {
pos: usize,
opt: WindowsOption,
}
/// A color or reset directive that can be translated into an instruction to
/// the Windows console.
#[derive(Clone, Debug)]
enum WindowsOption {
Foreground(Color),
Background(Color),
Reset,
}
impl WindowsBuffer {
/// Create a new empty buffer for Windows console coloring.
pub fn new() -> WindowsBuffer {
WindowsBuffer {
buf: vec![],
pos: 0,
colors: vec![],
}
}
fn push(&mut self, opt: WindowsOption) {
let pos = self.pos;
self.colors.push(WindowsColor { pos: pos, opt: opt });
}
/// Print the contents to the given terminal.
pub fn print_stdout<T: Terminal + Send>(&self, tt: &mut T) {
if !tt.supports_color() {
let _ = tt.write_all(&self.buf);
let _ = tt.flush();
return;
}
let mut last = 0;
for col in &self.colors {
let _ = tt.write_all(&self.buf[last..col.pos]);
match col.opt {
WindowsOption::Foreground(c) => {
let _ = tt.fg(c);
}
WindowsOption::Background(c) => {
let _ = tt.bg(c);
}
WindowsOption::Reset => {
let _ = tt.reset();
}
}
last = col.pos;
}
let _ = tt.write_all(&self.buf[last..]);
let _ = tt.flush();
}
/// Clear the buffer.
pub fn clear(&mut self) {
self.buf.clear();
self.colors.clear();
self.pos = 0;
}
}
impl io::Write for WindowsBuffer {
fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
let n = try!(self.buf.write(buf));
self.pos += n;
Ok(n)
}
fn flush(&mut self) -> io::Result<()> {
Ok(())
}
}
impl Terminal for WindowsBuffer {
type Output = Vec<u8>;
fn fg(&mut self, fg: Color) -> term::Result<()> {
self.push(WindowsOption::Foreground(fg));
Ok(())
}
fn bg(&mut self, bg: Color) -> term::Result<()> {
self.push(WindowsOption::Background(bg));
Ok(())
}
fn attr(&mut self, _attr: term::Attr) -> term::Result<()> {
Err(term::Error::NotSupported)
}
fn supports_attr(&self, _attr: term::Attr) -> bool {
false
}
fn reset(&mut self) -> term::Result<()> {
self.push(WindowsOption::Reset);
Ok(())
}
fn supports_reset(&self) -> bool {
true
}
fn supports_color(&self) -> bool {
true
}
fn cursor_up(&mut self) -> term::Result<()> {
Err(term::Error::NotSupported)
}
fn delete_line(&mut self) -> term::Result<()> {
Err(term::Error::NotSupported)
}
fn carriage_return(&mut self) -> term::Result<()> {
Err(term::Error::NotSupported)
}
fn get_ref(&self) -> &Vec<u8> {
&self.buf
}
fn get_mut(&mut self) -> &mut Vec<u8> {
&mut self.buf
}
fn into_inner(self) -> Vec<u8> {
self.buf
}
}

View File

@@ -1,436 +0,0 @@
/*!
The types module provides a way of associating glob patterns on file names to
file types.
*/
use std::collections::HashMap;
use std::error::Error as StdError;
use std::fmt;
use std::path::Path;
use regex;
use gitignore::{Match, Pattern};
use glob::{self, MatchOptions};
const TYPE_EXTENSIONS: &'static [(&'static str, &'static [&'static str])] = &[
("asm", &["*.asm", "*.s", "*.S"]),
("awk", &["*.awk"]),
("c", &["*.c", "*.h", "*.H"]),
("cbor", &["*.cbor"]),
("clojure", &["*.clj", "*.cljc", "*.cljs", "*.cljx"]),
("cmake", &["CMakeLists.txt"]),
("coffeescript", &["*.coffee"]),
("cpp", &[
"*.C", "*.cc", "*.cpp", "*.cxx",
"*.h", "*.H", "*.hh", "*.hpp",
]),
("csharp", &["*.cs"]),
("css", &["*.css"]),
("cython", &["*.pyx"]),
("dart", &["*.dart"]),
("d", &["*.d"]),
("elisp", &["*.el"]),
("erlang", &["*.erl", "*.hrl"]),
("fortran", &[
"*.f", "*.F", "*.f77", "*.F77", "*.pfo",
"*.f90", "*.F90", "*.f95", "*.F95",
]),
("fsharp", &["*.fs", "*.fsx", "*.fsi"]),
("go", &["*.go"]),
("groovy", &["*.groovy"]),
("haskell", &["*.hs", "*.lhs"]),
("html", &["*.htm", "*.html"]),
("java", &["*.java"]),
("js", &[
"*.js", "*.jsx", "*.vue",
]),
("json", &["*.json"]),
("jsonl", &["*.jsonl"]),
("lisp", &["*.el", "*.jl", "*.lisp", "*.lsp", "*.sc", "*.scm"]),
("lua", &["*.lua"]),
("m4", &["*.ac", "*.m4"]),
("make", &["gnumakefile", "Gnumakefile", "makefile", "Makefile", "*.mk"]),
("markdown", &["*.md"]),
("matlab", &["*.m"]),
("mk", &["mkfile"]),
("ml", &["*.ml"]),
("nim", &["*.nim"]),
("objc", &["*.h", "*.m"]),
("objcpp", &["*.h", "*.mm"]),
("ocaml", &["*.ml", "*.mli", "*.mll", "*.mly"]),
("perl", &["*.perl", "*.pl", "*.PL", "*.plh", "*.plx", "*.pm"]),
("php", &["*.php", "*.php3", "*.php4", "*.php5", "*.phtml"]),
("py", &["*.py"]),
("readme", &["README*", "*README"]),
("r", &["*.R", "*.r", "*.Rmd", "*.Rnw"]),
("rst", &["*.rst"]),
("ruby", &["*.rb"]),
("rust", &["*.rs"]),
("scala", &["*.scala"]),
("sh", &["*.bash", "*.csh", "*.ksh", "*.sh", "*.tcsh"]),
("sql", &["*.sql"]),
("sv", &["*.v", "*.vg", "*.sv", "*.svh", "*.h"]),
("swift", &["*.swift"]),
("tex", &["*.tex", "*.cls", "*.sty"]),
("ts", &["*.ts", "*.tsx"]),
("txt", &["*.txt"]),
("toml", &["*.toml", "Cargo.lock"]),
("vala", &["*.vala"]),
("vb", &["*.vb"]),
("vimscript", &["*.vim"]),
("xml", &["*.xml"]),
("yacc", &["*.y"]),
("yaml", &["*.yaml", "*.yml"]),
];
/// Describes all the possible failure conditions for building a file type
/// matcher.
#[derive(Debug)]
pub enum Error {
/// We tried to select (or negate) a file type that is not defined.
UnrecognizedFileType(String),
/// A user specified file type definition could not be parsed.
InvalidDefinition,
/// There was an error building the matcher (probably a bad glob).
Glob(glob::Error),
/// There was an error compiling a glob as a regex.
Regex(regex::Error),
}
impl StdError for Error {
fn description(&self) -> &str {
match *self {
Error::UnrecognizedFileType(_) => "unrecognized file type",
Error::InvalidDefinition => "invalid definition",
Error::Glob(ref err) => err.description(),
Error::Regex(ref err) => err.description(),
}
}
}
impl fmt::Display for Error {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
match *self {
Error::UnrecognizedFileType(ref ty) => {
write!(f, "unrecognized file type: {}", ty)
}
Error::InvalidDefinition => {
write!(f, "invalid definition (format is type:glob, e.g., \
html:*.html)")
}
Error::Glob(ref err) => err.fmt(f),
Error::Regex(ref err) => err.fmt(f),
}
}
}
impl From<glob::Error> for Error {
fn from(err: glob::Error) -> Error {
Error::Glob(err)
}
}
impl From<regex::Error> for Error {
fn from(err: regex::Error) -> Error {
Error::Regex(err)
}
}
/// A single file type definition.
#[derive(Clone, Debug)]
pub struct FileTypeDef {
name: String,
pats: Vec<String>,
}
impl FileTypeDef {
/// Return the name of this file type.
pub fn name(&self) -> &str {
&self.name
}
/// Return the glob patterns used to recognize this file type.
pub fn patterns(&self) -> &[String] {
&self.pats
}
}
/// Types is a file type matcher.
#[derive(Clone, Debug)]
pub struct Types {
selected: Option<glob::SetYesNo>,
negated: Option<glob::SetYesNo>,
has_selected: bool,
unmatched_pat: Pattern,
}
impl Types {
/// Creates a new file type matcher from the given Gitignore matcher. If
/// not Gitignore matcher is provided, then the file type matcher has no
/// effect.
///
/// If has_selected is true, then at least one file type was selected.
/// Therefore, any non-matches should be ignored.
fn new(
selected: Option<glob::SetYesNo>,
negated: Option<glob::SetYesNo>,
has_selected: bool,
) -> Types {
Types {
selected: selected,
negated: negated,
has_selected: has_selected,
unmatched_pat: Pattern {
from: Path::new("<filetype>").to_path_buf(),
original: "<N/A>".to_string(),
pat: "<N/A>".to_string(),
whitelist: false,
only_dir: false,
},
}
}
/// Creates a new file type matcher that never matches.
pub fn empty() -> Types {
Types::new(None, None, false)
}
/// Returns a match for the given path against this file type matcher.
///
/// The path is considered whitelisted if it matches a selected file type.
/// The path is considered ignored if it matched a negated file type.
/// If at least one file type is selected and path doesn't match, then
/// the path is also considered ignored.
pub fn matched<P: AsRef<Path>>(&self, path: P, is_dir: bool) -> Match {
// If we don't have any matcher, then we can't do anything.
if self.negated.is_none() && self.selected.is_none() {
return Match::None;
}
// File types don't apply to directories.
if is_dir {
return Match::None;
}
let path = path.as_ref();
let name = match path.file_name() {
Some(name) => name.to_string_lossy(),
None if self.has_selected => {
return Match::Ignored(&self.unmatched_pat);
}
None => {
return Match::None;
}
};
if self.negated.as_ref().map(|s| s.is_match(&*name)).unwrap_or(false) {
return Match::Ignored(&self.unmatched_pat);
}
if self.selected.as_ref().map(|s|s.is_match(&*name)).unwrap_or(false) {
return Match::Whitelist(&self.unmatched_pat);
}
if self.has_selected {
Match::Ignored(&self.unmatched_pat)
} else {
Match::None
}
}
}
/// TypesBuilder builds a type matcher from a set of file type definitions and
/// a set of file type selections.
pub struct TypesBuilder {
types: HashMap<String, Vec<String>>,
selected: Vec<String>,
negated: Vec<String>,
}
impl TypesBuilder {
/// Create a new builder for a file type matcher.
pub fn new() -> TypesBuilder {
TypesBuilder {
types: HashMap::new(),
selected: vec![],
negated: vec![],
}
}
/// Build the current set of file type definitions *and* selections into
/// a file type matcher.
pub fn build(&self) -> Result<Types, Error> {
let opts = MatchOptions {
require_literal_separator: true, ..MatchOptions::default()
};
let selected_globs =
if self.selected.is_empty() {
None
} else {
let mut bset = glob::SetBuilder::new();
for name in &self.selected {
let globs = match self.types.get(name) {
Some(globs) => globs,
None => {
let msg = name.to_string();
return Err(Error::UnrecognizedFileType(msg));
}
};
for glob in globs {
try!(bset.add_with(glob, &opts));
}
}
Some(try!(bset.build_yesno()))
};
let negated_globs =
if self.negated.is_empty() {
None
} else {
let mut bset = glob::SetBuilder::new();
for name in &self.negated {
let globs = match self.types.get(name) {
Some(globs) => globs,
None => {
let msg = name.to_string();
return Err(Error::UnrecognizedFileType(msg));
}
};
for glob in globs {
try!(bset.add_with(glob, &opts));
}
}
Some(try!(bset.build_yesno()))
};
Ok(Types::new(
selected_globs, negated_globs, !self.selected.is_empty()))
}
/// Return the set of current file type definitions.
pub fn definitions(&self) -> Vec<FileTypeDef> {
let mut defs = vec![];
for (ref name, ref pats) in &self.types {
let mut pats = pats.to_vec();
pats.sort();
defs.push(FileTypeDef {
name: name.to_string(),
pats: pats,
});
}
defs.sort_by(|def1, def2| def1.name().cmp(def2.name()));
defs
}
/// Select the file type given by `name`.
///
/// If `name` is `all`, then all file types are selected.
pub fn select(&mut self, name: &str) -> &mut TypesBuilder {
if name == "all" {
for name in self.types.keys() {
self.selected.push(name.to_string());
}
} else {
self.selected.push(name.to_string());
}
self
}
/// Ignore the file type given by `name`.
///
/// If `name` is `all`, then all file types are negated.
pub fn negate(&mut self, name: &str) -> &mut TypesBuilder {
if name == "all" {
for name in self.types.keys() {
self.negated.push(name.to_string());
}
} else {
self.negated.push(name.to_string());
}
self
}
/// Clear any file type definitions for the type given.
pub fn clear(&mut self, name: &str) -> &mut TypesBuilder {
self.types.remove(name);
self
}
/// Add a new file type definition. `name` can be arbitrary and `pat`
/// should be a glob recognizing file paths belonging to the `name` type.
pub fn add(&mut self, name: &str, pat: &str) -> &mut TypesBuilder {
self.types.entry(name.to_string())
.or_insert(vec![]).push(pat.to_string());
self
}
/// Add a new file type definition specified in string form. The format
/// is `name:glob`. Names may not include a colon.
pub fn add_def(&mut self, def: &str) -> Result<(), Error> {
let name: String = def.chars().take_while(|&c| c != ':').collect();
let pat: String = def.chars().skip(name.chars().count() + 1).collect();
if name.is_empty() || pat.is_empty() {
return Err(Error::InvalidDefinition);
}
self.add(&name, &pat);
Ok(())
}
/// Add a set of default file type definitions.
pub fn add_defaults(&mut self) -> &mut TypesBuilder {
for &(name, exts) in TYPE_EXTENSIONS {
for ext in exts {
self.add(name, ext);
}
}
self
}
}
#[cfg(test)]
mod tests {
use super::TypesBuilder;
macro_rules! matched {
($name:ident, $types:expr, $sel:expr, $selnot:expr,
$path:expr) => {
matched!($name, $types, $sel, $selnot, $path, true);
};
(not, $name:ident, $types:expr, $sel:expr, $selnot:expr,
$path:expr) => {
matched!($name, $types, $sel, $selnot, $path, false);
};
($name:ident, $types:expr, $sel:expr, $selnot:expr,
$path:expr, $matched:expr) => {
#[test]
fn $name() {
let mut btypes = TypesBuilder::new();
for tydef in $types {
btypes.add_def(tydef).unwrap();
}
for sel in $sel {
btypes.select(sel);
}
for selnot in $selnot {
btypes.negate(selnot);
}
let types = btypes.build().unwrap();
let mat = types.matched($path, false);
assert_eq!($matched, !mat.is_ignored());
}
};
}
fn types() -> Vec<&'static str> {
vec![
"html:*.html",
"html:*.htm",
"rust:*.rs",
"js:*.js",
]
}
matched!(match1, types(), vec!["rust"], vec![], "lib.rs");
matched!(match2, types(), vec!["html"], vec![], "index.html");
matched!(match3, types(), vec!["html"], vec![], "index.htm");
matched!(match4, types(), vec!["html", "rust"], vec![], "main.rs");
matched!(match5, types(), vec![], vec![], "index.html");
matched!(match6, types(), vec![], vec!["rust"], "index.html");
matched!(not, matchnot1, types(), vec!["rust"], vec![], "index.html");
matched!(not, matchnot2, types(), vec![], vec!["rust"], "main.rs");
}

128
src/unescape.rs Normal file
View File

@@ -0,0 +1,128 @@
/// A single state in the state machine used by `unescape`.
#[derive(Clone, Copy, Eq, PartialEq)]
enum State {
/// The state after seeing a `\`.
Escape,
/// The state after seeing a `\x`.
HexFirst,
/// The state after seeing a `\x[0-9A-Fa-f]`.
HexSecond(char),
/// Default state.
Literal,
}
/// Unescapes a string given on the command line. It supports a limited set of
/// escape sequences:
///
/// * \t, \r and \n are mapped to their corresponding ASCII bytes.
/// * \xZZ hexadecimal escapes are mapped to their byte.
pub fn unescape(s: &str) -> Vec<u8> {
use self::State::*;
let mut bytes = vec![];
let mut state = Literal;
for c in s.chars() {
match state {
Escape => {
match c {
'n' => { bytes.push(b'\n'); state = Literal; }
'r' => { bytes.push(b'\r'); state = Literal; }
't' => { bytes.push(b'\t'); state = Literal; }
'x' => { state = HexFirst; }
c => {
bytes.extend(&format!(r"\{}", c).into_bytes());
state = Literal;
}
}
}
HexFirst => {
match c {
'0'...'9' | 'A'...'F' | 'a'...'f' => {
state = HexSecond(c);
}
c => {
bytes.extend(&format!(r"\x{}", c).into_bytes());
state = Literal;
}
}
}
HexSecond(first) => {
match c {
'0'...'9' | 'A'...'F' | 'a'...'f' => {
let ordinal = format!("{}{}", first, c);
let byte = u8::from_str_radix(&ordinal, 16).unwrap();
bytes.push(byte);
state = Literal;
}
c => {
let original = format!(r"\x{}{}", first, c);
bytes.extend(&original.into_bytes());
state = Literal;
}
}
}
Literal => {
match c {
'\\' => { state = Escape; }
c => { bytes.extend(c.to_string().as_bytes()); }
}
}
}
}
match state {
Escape => bytes.push(b'\\'),
HexFirst => bytes.extend(b"\\x"),
HexSecond(c) => bytes.extend(&format!("\\x{}", c).into_bytes()),
Literal => {}
}
bytes
}
#[cfg(test)]
mod tests {
use super::unescape;
fn b(bytes: &'static [u8]) -> Vec<u8> {
bytes.to_vec()
}
#[test]
fn unescape_nul() {
assert_eq!(b(b"\x00"), unescape(r"\x00"));
}
#[test]
fn unescape_nl() {
assert_eq!(b(b"\n"), unescape(r"\n"));
}
#[test]
fn unescape_tab() {
assert_eq!(b(b"\t"), unescape(r"\t"));
}
#[test]
fn unescape_carriage() {
assert_eq!(b(b"\r"), unescape(r"\r"));
}
#[test]
fn unescape_nothing_simple() {
assert_eq!(b(b"\\a"), unescape(r"\a"));
}
#[test]
fn unescape_nothing_hex0() {
assert_eq!(b(b"\\x"), unescape(r"\x"));
}
#[test]
fn unescape_nothing_hex1() {
assert_eq!(b(b"\\xz"), unescape(r"\xz"));
}
#[test]
fn unescape_nothing_hex2() {
assert_eq!(b(b"\\xzz"), unescape(r"\xzz"));
}
}

View File

@@ -1,140 +0,0 @@
/*!
The walk module implements a recursive directory iterator (using the `walkdir`)
crate that can efficiently skip and ignore files and directories specified in
a user's ignore patterns.
*/
use walkdir::{self, DirEntry, WalkDir, WalkDirIterator};
use ignore::Ignore;
/// Iter is a recursive directory iterator over file paths in a directory.
/// Only file paths should be searched are yielded.
pub struct Iter {
ig: Ignore,
it: WalkEventIter,
}
impl Iter {
/// Create a new recursive directory iterator using the ignore patterns
/// and walkdir iterator given.
pub fn new(ig: Ignore, wd: WalkDir) -> Iter {
Iter {
ig: ig,
it: WalkEventIter::from(wd),
}
}
/// Returns true if this entry should be skipped.
#[inline(always)]
fn skip_entry(&self, ent: &DirEntry) -> bool {
if ent.depth() == 0 {
// Never skip the root directory.
return false;
}
if self.ig.ignored(ent.path(), ent.file_type().is_dir()) {
return true;
}
false
}
}
impl Iterator for Iter {
type Item = DirEntry;
#[inline(always)]
fn next(&mut self) -> Option<DirEntry> {
while let Some(ev) = self.it.next() {
match ev {
Err(err) => {
eprintln!("{}", err);
}
Ok(WalkEvent::Exit) => {
self.ig.pop();
}
Ok(WalkEvent::Dir(ent)) => {
if self.skip_entry(&ent) {
self.it.it.skip_current_dir();
// Still need to push this on the stack because we'll
// get a WalkEvent::Exit event for this dir. We don't
// care if it errors though.
let _ = self.ig.push(ent.path());
continue;
}
if let Err(err) = self.ig.push(ent.path()) {
eprintln!("{}", err);
self.it.it.skip_current_dir();
continue;
}
}
Ok(WalkEvent::File(ent)) => {
if self.skip_entry(&ent) {
continue;
}
// If this isn't actually a file (e.g., a symlink), then
// skip it.
if !ent.file_type().is_file() {
continue;
}
return Some(ent);
}
}
}
None
}
}
/// WalkEventIter transforms a WalkDir iterator into an iterator that more
/// accurately describes the directory tree. Namely, it emits events that are
/// one of three types: directory, file or "exit." An "exit" event means that
/// the entire contents of a directory have been enumerated.
struct WalkEventIter {
depth: usize,
it: walkdir::Iter,
next: Option<Result<DirEntry, walkdir::Error>>,
}
#[derive(Debug)]
enum WalkEvent {
Dir(DirEntry),
File(DirEntry),
Exit,
}
impl From<WalkDir> for WalkEventIter {
fn from(it: WalkDir) -> WalkEventIter {
WalkEventIter { depth: 0, it: it.into_iter(), next: None }
}
}
impl Iterator for WalkEventIter {
type Item = walkdir::Result<WalkEvent>;
#[inline(always)]
fn next(&mut self) -> Option<walkdir::Result<WalkEvent>> {
let dent = self.next.take().or_else(|| self.it.next());
let depth = match dent {
None => 0,
Some(Ok(ref dent)) => dent.depth(),
Some(Err(ref err)) => err.depth(),
};
if depth < self.depth {
self.depth -= 1;
self.next = dent;
return Some(Ok(WalkEvent::Exit));
}
self.depth = depth;
match dent {
None => None,
Some(Err(err)) => Some(Err(err)),
Some(Ok(dent)) => {
if dent.file_type().is_dir() {
self.depth += 1;
Some(Ok(WalkEvent::Dir(dent)))
} else {
Some(Ok(WalkEvent::File(dent)))
}
}
}
}
}

291
src/worker.rs Normal file
View File

@@ -0,0 +1,291 @@
use std::fs::File;
use std::io;
use std::path::Path;
use grep::Grep;
use ignore::DirEntry;
use memmap::{Mmap, Protection};
use termcolor::WriteColor;
use pathutil::strip_prefix;
use printer::Printer;
use search_buffer::BufferSearcher;
use search_stream::{InputBuffer, Searcher};
use Result;
pub enum Work {
Stdin,
DirEntry(DirEntry),
}
pub struct WorkerBuilder {
grep: Grep,
opts: Options,
}
#[derive(Clone, Debug)]
struct Options {
mmap: bool,
after_context: usize,
before_context: usize,
count: bool,
files_with_matches: bool,
files_without_matches: bool,
eol: u8,
invert_match: bool,
line_number: bool,
max_count: Option<u64>,
no_messages: bool,
quiet: bool,
text: bool,
}
impl Default for Options {
fn default() -> Options {
Options {
mmap: false,
after_context: 0,
before_context: 0,
count: false,
files_with_matches: false,
files_without_matches: false,
eol: b'\n',
invert_match: false,
line_number: false,
max_count: None,
no_messages: false,
quiet: false,
text: false,
}
}
}
impl WorkerBuilder {
/// Create a new builder for a worker.
///
/// A reusable input buffer and a grep matcher are required, but there
/// are numerous additional options that can be configured on this builder.
pub fn new(grep: Grep) -> WorkerBuilder {
WorkerBuilder {
grep: grep,
opts: Options::default(),
}
}
/// Create the worker from this builder.
pub fn build(self) -> Worker {
let mut inpbuf = InputBuffer::new();
inpbuf.eol(self.opts.eol);
Worker {
grep: self.grep,
inpbuf: inpbuf,
opts: self.opts,
}
}
/// The number of contextual lines to show after each match. The default
/// is zero.
pub fn after_context(mut self, count: usize) -> Self {
self.opts.after_context = count;
self
}
/// The number of contextual lines to show before each match. The default
/// is zero.
pub fn before_context(mut self, count: usize) -> Self {
self.opts.before_context = count;
self
}
/// If enabled, searching will print a count instead of each match.
///
/// Disabled by default.
pub fn count(mut self, yes: bool) -> Self {
self.opts.count = yes;
self
}
/// If enabled, searching will print the path instead of each match.
///
/// Disabled by default.
pub fn files_with_matches(mut self, yes: bool) -> Self {
self.opts.files_with_matches = yes;
self
}
/// If enabled, searching will print the path of files without any matches.
///
/// Disabled by default.
pub fn files_without_matches(mut self, yes: bool) -> Self {
self.opts.files_without_matches = yes;
self
}
/// Set the end-of-line byte used by this searcher.
pub fn eol(mut self, eol: u8) -> Self {
self.opts.eol = eol;
self
}
/// If enabled, matching is inverted so that lines that *don't* match the
/// given pattern are treated as matches.
pub fn invert_match(mut self, yes: bool) -> Self {
self.opts.invert_match = yes;
self
}
/// If enabled, compute line numbers and prefix each line of output with
/// them.
pub fn line_number(mut self, yes: bool) -> Self {
self.opts.line_number = yes;
self
}
/// Limit the number of matches to the given count.
///
/// The default is None, which corresponds to no limit.
pub fn max_count(mut self, count: Option<u64>) -> Self {
self.opts.max_count = count;
self
}
/// If enabled, try to use memory maps for searching if possible.
pub fn mmap(mut self, yes: bool) -> Self {
self.opts.mmap = yes;
self
}
/// If enabled, error messages are suppressed.
///
/// This is disabled by default.
pub fn no_messages(mut self, yes: bool) -> Self {
self.opts.no_messages = yes;
self
}
/// If enabled, don't show any output and quit searching after the first
/// match is found.
pub fn quiet(mut self, yes: bool) -> Self {
self.opts.quiet = yes;
self
}
/// If enabled, search binary files as if they were text.
pub fn text(mut self, yes: bool) -> Self {
self.opts.text = yes;
self
}
}
/// Worker is responsible for executing searches on file paths, while choosing
/// streaming search or memory map search as appropriate.
pub struct Worker {
inpbuf: InputBuffer,
grep: Grep,
opts: Options,
}
impl Worker {
/// Execute the worker with the given printer and work item.
///
/// A work item can either be stdin or a file path.
pub fn run<W: WriteColor>(
&mut self,
printer: &mut Printer<W>,
work: Work,
) -> u64 {
let result = match work {
Work::Stdin => {
let stdin = io::stdin();
let stdin = stdin.lock();
self.search(printer, &Path::new("<stdin>"), stdin)
}
Work::DirEntry(dent) => {
let mut path = dent.path();
let file = match File::open(path) {
Ok(file) => file,
Err(err) => {
if !self.opts.no_messages {
eprintln!("{}: {}", path.display(), err);
}
return 0;
}
};
if let Some(p) = strip_prefix("./", path) {
path = p;
}
if self.opts.mmap {
self.search_mmap(printer, path, &file)
} else {
self.search(printer, path, file)
}
}
};
match result {
Ok(count) => {
count
}
Err(err) => {
if !self.opts.no_messages {
eprintln!("{}", err);
}
0
}
}
}
fn search<R: io::Read, W: WriteColor>(
&mut self,
printer: &mut Printer<W>,
path: &Path,
rdr: R,
) -> Result<u64> {
let searcher = Searcher::new(
&mut self.inpbuf, printer, &self.grep, path, rdr);
searcher
.after_context(self.opts.after_context)
.before_context(self.opts.before_context)
.count(self.opts.count)
.files_with_matches(self.opts.files_with_matches)
.files_without_matches(self.opts.files_without_matches)
.eol(self.opts.eol)
.line_number(self.opts.line_number)
.invert_match(self.opts.invert_match)
.max_count(self.opts.max_count)
.quiet(self.opts.quiet)
.text(self.opts.text)
.run()
.map_err(From::from)
}
fn search_mmap<W: WriteColor>(
&mut self,
printer: &mut Printer<W>,
path: &Path,
file: &File,
) -> Result<u64> {
if try!(file.metadata()).len() == 0 {
// Opening a memory map with an empty file results in an error.
// However, this may not actually be an empty file! For example,
// /proc/cpuinfo reports itself as an empty file, but it can
// produce data when it's read from. Therefore, we fall back to
// regular read calls.
return self.search(printer, path, file);
}
let mmap = try!(Mmap::open(file, Protection::Read));
let searcher = BufferSearcher::new(
printer, &self.grep, path, unsafe { mmap.as_slice() });
Ok(searcher
.count(self.opts.count)
.files_with_matches(self.opts.files_with_matches)
.files_without_matches(self.opts.files_without_matches)
.eol(self.opts.eol)
.line_number(self.opts.line_number)
.invert_match(self.opts.invert_match)
.max_count(self.opts.max_count)
.quiet(self.opts.quiet)
.text(self.opts.text)
.run())
}
}

20
termcolor/Cargo.toml Normal file
View File

@@ -0,0 +1,20 @@
[package]
name = "termcolor"
version = "0.1.0" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
A simple cross platform library for writing colored text to a terminal.
"""
documentation = "https://docs.rs/termcolor"
homepage = "https://github.com/BurntSushi/ripgrep/tree/master/termcolor"
repository = "https://github.com/BurntSushi/ripgrep/tree/master/termcolor"
readme = "README.md"
keywords = ["windows", "win", "color", "ansi", "console"]
license = "Unlicense/MIT"
[lib]
name = "termcolor"
bench = false
[target.'cfg(windows)'.dependencies]
wincolor = { version = "0.1.0", path = "../wincolor" }

88
termcolor/README.md Normal file
View File

@@ -0,0 +1,88 @@
termcolor
=========
A simple cross platform library for writing colored text to a terminal. This
library writes colored text either using standard ANSI escape sequences or
by interacting with the Windows console. Several convenient abstractions
are provided for use in single-threaded or multi-threaded command line
applications.
[![Windows build status](https://ci.appveyor.com/api/projects/status/github/BurntSushi/ripgrep?svg=true)](https://ci.appveyor.com/project/BurntSushi/ripgrep)
[![](https://img.shields.io/crates/v/wincolor.svg)](https://crates.io/crates/wincolor)
[![Linux build status](https://api.travis-ci.org/BurntSushi/ripgrep.png)](https://travis-ci.org/BurntSushi/ripgrep)
[![Windows build status](https://ci.appveyor.com/api/projects/status/github/BurntSushi/ripgrep?svg=true)](https://ci.appveyor.com/project/BurntSushi/ripgrep)
[![](https://img.shields.io/crates/v/termcolor.svg)](https://crates.io/crates/termcolor)
Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org).
### Documentation
[https://docs.rs/termcolor](https://docs.rs/termcolor)
### Usage
Add this to your `Cargo.toml`:
```toml
[dependencies]
termcolor = "0.1"
```
and this to your crate root:
```rust
extern crate termcolor;
```
### Organization
The `WriteColor` trait extends the `io::Write` trait with methods for setting
colors or resetting them.
`Stdout` and `StdoutLock` both satisfy `WriteColor` and are analogous to
`std::io::Stdout` and `std::io::StdoutLock`.
`Buffer` is an in memory buffer that supports colored text. In a parallel
program, each thread might write to its own buffer. A buffer can be printed
to stdout using a `BufferWriter`. The advantage of this design is that
each thread can work in parallel on a buffer without having to synchronize
access to global resources such as the Windows console. Moreover, this design
also prevents interleaving of buffer output.
`Ansi` and `NoColor` both satisfy `WriteColor` for arbitrary implementors of
`io::Write`. These types are useful when you know exactly what you need. An
analogous type for the Windows console is not provided since it cannot exist.
### Example: using `Stdout`
The `Stdout` type in this crate works similarly to `std::io::Stdout`, except
it is augmented with methods for coloring by the `WriteColor` trait. For
example, to write some green text:
```rust
use std::io::Write;
use termcolor::{Color, ColorChoice, ColorSpec, Stdout, WriteColor};
let mut stdout = Stdout::new(ColorChoice::Always);
try!(stdout.set_color(ColorSpec::new().set_fg(Some(Color::Green))));
try!(writeln!(&mut stdout, "green text!"));
```
### Example: using `BufferWriter`
A `BufferWriter` can create buffers and write buffers to stdout. It does *not*
implement `io::Write` or `WriteColor` itself. Instead, `Buffer` implements
`io::Write` and `io::WriteColor`.
This example shows how to print some green text to stdout.
```rust
use std::io::Write;
use termcolor::{BufferWriter, Color, ColorChoice, ColorSpec, WriteColor};
let mut bufwtr = BufferWriter::stdout(ColorChoice::Always);
let mut buffer = bufwtr.buffer();
try!(buffer.set_color(ColorSpec::new().set_fg(Some(Color::Green))));
try!(writeln!(&mut buffer, "green text!"));
try!(bufwtr.print(&buffer));
```

1071
termcolor/src/lib.rs Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -54,6 +54,20 @@ fn path(unix: &str) -> String {
}
}
fn paths(unix: &[&str]) -> Vec<String> {
let mut xs: Vec<_> = unix.iter().map(|s| path(s)).collect();
xs.sort();
xs
}
fn paths_from_stdout(stdout: String) -> Vec<String> {
let mut paths: Vec<_> = stdout.lines().map(|s| {
s.split(":").next().unwrap().to_string()
}).collect();
paths.sort();
paths
}
fn sort_lines(lines: &str) -> String {
let mut lines: Vec<String> =
lines.trim().lines().map(|s| s.to_owned()).collect();
@@ -325,6 +339,15 @@ sherlock!(files_with_matches, "Sherlock", ".", |wd: WorkDir, mut cmd: Command| {
assert_eq!(lines, expected);
});
sherlock!(files_without_matches, "Sherlock", ".",
|wd: WorkDir, mut cmd: Command| {
wd.create("file.py", "foo");
cmd.arg("--files-without-match");
let lines: String = wd.stdout(&mut cmd);
let expected = "file.py\n";
assert_eq!(lines, expected);
});
sherlock!(after_context, |wd: WorkDir, mut cmd: Command| {
cmd.arg("-A").arg("1");
let lines: String = wd.stdout(&mut cmd);
@@ -538,24 +561,26 @@ sherlock:be, to a very large extent, the result of luck. Sherlock Holmes
assert_eq!(lines, expected);
});
#[cfg(not(windows))]
sherlock!(symlink_nofollow, "Sherlock", ".", |wd: WorkDir, mut cmd: Command| {
wd.remove("sherlock");
wd.create_dir("foo");
wd.create_dir("foo/bar");
wd.link("foo/baz", "foo/bar/baz");
wd.link_dir("foo/baz", "foo/bar/baz");
wd.create_dir("foo/baz");
wd.create("foo/baz/sherlock", hay::SHERLOCK);
cmd.current_dir(wd.path().join("foo/bar"));
wd.assert_err(&mut cmd);
});
#[cfg(not(windows))]
sherlock!(symlink_follow, "Sherlock", ".", |wd: WorkDir, mut cmd: Command| {
wd.remove("sherlock");
wd.create_dir("foo");
wd.create_dir("foo/bar");
wd.create_dir("foo/baz");
wd.create("foo/baz/sherlock", hay::SHERLOCK);
wd.link("foo/baz", "foo/bar/baz");
wd.link_dir("foo/baz", "foo/bar/baz");
cmd.arg("-L");
cmd.current_dir(wd.path().join("foo/bar"));
@@ -592,17 +617,6 @@ sherlock!(unrestricted2, "Sherlock", ".", |wd: WorkDir, mut cmd: Command| {
assert_eq!(lines, expected);
});
#[cfg(not(windows))]
sherlock!(unrestricted3, "foo", ".", |wd: WorkDir, mut cmd: Command| {
wd.create("file", "foo\x00bar\nfoo\x00baz\n");
cmd.arg("-uuu");
let lines: String = wd.stdout(&mut cmd);
assert_eq!(lines, "file:foo\x00bar\nfile:foo\x00baz\n");
});
// On Windows, this test uses memory maps, so the NUL bytes don't get replaced.
#[cfg(windows)]
sherlock!(unrestricted3, "foo", ".", |wd: WorkDir, mut cmd: Command| {
wd.create("file", "foo\x00bar\nfoo\x00baz\n");
cmd.arg("-uuu");
@@ -659,7 +673,6 @@ clean!(regression_30, "test", ".", |wd: WorkDir, mut cmd: Command| {
}
wd.create_dir("vendor");
wd.create("vendor/manifest", "test");
cmd.arg("--debug");
let lines: String = wd.stdout(&mut cmd);
let expected = path("vendor/manifest:test\n");
@@ -705,6 +718,13 @@ clean!(regression_67, "test", ".", |wd: WorkDir, mut cmd: Command| {
assert_eq!(lines, path("dir/bar:test\n"));
});
// See: https://github.com/BurntSushi/ripgrep/issues/87
clean!(regression_87, "test", ".", |wd: WorkDir, mut cmd: Command| {
wd.create(".gitignore", "foo\n**no-vcs**");
wd.create("foo", "test");
wd.assert_err(&mut cmd);
});
// See: https://github.com/BurntSushi/ripgrep/issues/90
clean!(regression_90, "test", ".", |wd: WorkDir, mut cmd: Command| {
wd.create(".gitignore", "!.foo");
@@ -753,8 +773,188 @@ clean!(regression_105_part2, "test", ".", |wd: WorkDir, mut cmd: Command| {
assert_eq!(lines, "foo:3:zztest\n");
});
// See: https://github.com/BurntSushi/ripgrep/issues/127
clean!(regression_127, "Sherlock", ".", |wd: WorkDir, mut cmd: Command| {
// Set up a directory hierarchy like this:
//
// .gitignore
// foo/
// sherlock
// watson
//
// Where `.gitignore` contains `foo/sherlock`.
//
// ripgrep should ignore 'foo/sherlock' giving us results only from
// 'foo/watson' but on Windows ripgrep will include both 'foo/sherlock' and
// 'foo/watson' in the search results.
wd.create(".gitignore", "foo/sherlock\n");
wd.create_dir("foo");
wd.create("foo/sherlock", hay::SHERLOCK);
wd.create("foo/watson", hay::SHERLOCK);
let lines: String = wd.stdout(&mut cmd);
let expected = format!("\
{path}:For the Doctor Watsons of this world, as opposed to the Sherlock
{path}:be, to a very large extent, the result of luck. Sherlock Holmes
", path=path("foo/watson"));
assert_eq!(lines, expected);
});
// See: https://github.com/BurntSushi/ripgrep/issues/128
clean!(regression_128, "x", ".", |wd: WorkDir, mut cmd: Command| {
wd.create_bytes("foo", b"01234567\x0b\n\x0b\n\x0b\n\x0b\nx");
cmd.arg("-n");
let lines: String = wd.stdout(&mut cmd);
assert_eq!(lines, "foo:5:x\n");
});
// See: https://github.com/BurntSushi/ripgrep/issues/131
//
// TODO(burntsushi): Darwin doesn't like this test for some reason.
#[cfg(not(target_os = "macos"))]
clean!(regression_131, "test", ".", |wd: WorkDir, mut cmd: Command| {
wd.create(".gitignore", "TopÑapa");
wd.create("TopÑapa", "test");
wd.assert_err(&mut cmd);
});
// See: https://github.com/BurntSushi/ripgrep/issues/137
//
// TODO(burntsushi): Figure out why Windows gives "access denied" errors
// when trying to create a file symlink. For now, disable test on Windows.
#[cfg(not(windows))]
sherlock!(regression_137, "Sherlock", ".", |wd: WorkDir, mut cmd: Command| {
wd.link_file("sherlock", "sym1");
wd.link_file("sherlock", "sym2");
cmd.arg("sym1");
cmd.arg("sym2");
cmd.arg("-j1");
let lines: String = wd.stdout(&mut cmd);
let expected = "\
sherlock:For the Doctor Watsons of this world, as opposed to the Sherlock
sherlock:be, to a very large extent, the result of luck. Sherlock Holmes
sym1:For the Doctor Watsons of this world, as opposed to the Sherlock
sym1:be, to a very large extent, the result of luck. Sherlock Holmes
sym2:For the Doctor Watsons of this world, as opposed to the Sherlock
sym2:be, to a very large extent, the result of luck. Sherlock Holmes
";
assert_eq!(lines, path(expected));
});
// See: https://github.com/BurntSushi/ripgrep/issues/156
clean!(
regression_156,
r#"#(?:parse|include)\s*\(\s*(?:"|')[./A-Za-z_-]+(?:"|')"#,
"testcase.txt",
|wd: WorkDir, mut cmd: Command| {
const TESTCASE: &'static str = r#"#parse('widgets/foo_bar_macros.vm')
#parse ( 'widgets/mobile/foo_bar_macros.vm' )
#parse ("widgets/foobarhiddenformfields.vm")
#parse ( "widgets/foo_bar_legal.vm" )
#include( 'widgets/foo_bar_tips.vm' )
#include('widgets/mobile/foo_bar_macros.vm')
#include ("widgets/mobile/foo_bar_resetpw.vm")
#parse('widgets/foo-bar-macros.vm')
#parse ( 'widgets/mobile/foo-bar-macros.vm' )
#parse ("widgets/foo-bar-hiddenformfields.vm")
#parse ( "widgets/foo-bar-legal.vm" )
#include( 'widgets/foo-bar-tips.vm' )
#include('widgets/mobile/foo-bar-macros.vm')
#include ("widgets/mobile/foo-bar-resetpw.vm")
"#;
wd.create("testcase.txt", TESTCASE);
cmd.arg("-N");
let lines: String = wd.stdout(&mut cmd);
assert_eq!(lines, TESTCASE);
});
// See: https://github.com/BurntSushi/ripgrep/issues/184
clean!(regression_184, "test", ".", |wd: WorkDir, mut cmd: Command| {
wd.create(".gitignore", ".*");
wd.create_dir("foo/bar");
wd.create("foo/bar/baz", "test");
let lines: String = wd.stdout(&mut cmd);
assert_eq!(lines, format!("{}:test\n", path("foo/bar/baz")));
cmd.current_dir(wd.path().join("./foo/bar"));
let lines: String = wd.stdout(&mut cmd);
assert_eq!(lines, "baz:test\n");
});
// See: https://github.com/BurntSushi/ripgrep/issues/199
clean!(regression_199, r"\btest\b", ".", |wd: WorkDir, mut cmd: Command| {
wd.create("foo", "tEsT");
cmd.arg("--smart-case");
let lines: String = wd.stdout(&mut cmd);
assert_eq!(lines, "foo:tEsT\n");
});
// See: https://github.com/BurntSushi/ripgrep/issues/206
clean!(regression_206, "test", ".", |wd: WorkDir, mut cmd: Command| {
wd.create_dir("foo");
wd.create("foo/bar.txt", "test");
cmd.arg("-g").arg("*.txt");
let lines: String = wd.stdout(&mut cmd);
assert_eq!(lines, format!("{}:test\n", path("foo/bar.txt")));
});
// See: https://github.com/BurntSushi/ripgrep/issues/210
#[cfg(unix)]
#[test]
fn regression_210() {
use std::ffi::OsStr;
use std::os::unix::ffi::OsStrExt;
let badutf8 = OsStr::from_bytes(&b"foo\xffbar"[..]);
let wd = WorkDir::new("regression_210");
let mut cmd = wd.command();
wd.create(badutf8, "test");
cmd.arg("-H").arg("test").arg(badutf8);
let out = wd.output(&mut cmd);
assert_eq!(out.stdout, b"foo\xffbar:test\n".to_vec());
}
// See: https://github.com/BurntSushi/ripgrep/issues/228
clean!(regression_228, "test", ".", |wd: WorkDir, mut cmd: Command| {
wd.create_dir("foo");
cmd.arg("--ignore-file").arg("foo");
wd.assert_err(&mut cmd);
});
// See: https://github.com/BurntSushi/ripgrep/issues/7
sherlock!(feature_7, "-fpat", "sherlock", |wd: WorkDir, mut cmd: Command| {
wd.create("pat", "Sherlock\nHolmes");
let lines: String = wd.stdout(&mut cmd);
let expected = "\
For the Doctor Watsons of this world, as opposed to the Sherlock
Holmeses, success in the province of detective work must always
be, to a very large extent, the result of luck. Sherlock Holmes
";
assert_eq!(lines, expected);
});
// See: https://github.com/BurntSushi/ripgrep/issues/7
sherlock!(feature_7_dash, "-f-", ".", |wd: WorkDir, mut cmd: Command| {
let output = wd.pipe(&mut cmd, "Sherlock");
let lines = String::from_utf8_lossy(&output.stdout);
let expected = "\
sherlock:For the Doctor Watsons of this world, as opposed to the Sherlock
sherlock:be, to a very large extent, the result of luck. Sherlock Holmes
";
assert_eq!(lines, expected);
});
// See: https://github.com/BurntSushi/ripgrep/issues/20
sherlock!(feature_20, "Sherlock", ".", |wd: WorkDir, mut cmd: Command| {
sherlock!(feature_20_no_filename, "Sherlock", ".",
|wd: WorkDir, mut cmd: Command| {
cmd.arg("--no-filename");
let lines: String = wd.stdout(&mut cmd);
@@ -765,8 +965,76 @@ be, to a very large extent, the result of luck. Sherlock Holmes
assert_eq!(lines, expected);
});
// See: https://github.com/BurntSushi/ripgrep/issues/45
sherlock!(feature_45_relative_cwd, "test", ".",
|wd: WorkDir, mut cmd: Command| {
wd.create(".not-an-ignore", "foo\n/bar");
wd.create_dir("bar");
wd.create_dir("baz/bar");
wd.create_dir("baz/baz/bar");
wd.create("bar/test", "test");
wd.create("baz/bar/test", "test");
wd.create("baz/baz/bar/test", "test");
wd.create("baz/foo", "test");
wd.create("baz/test", "test");
wd.create("foo", "test");
wd.create("test", "test");
// First, get a baseline without applying ignore rules.
let lines = paths_from_stdout(wd.stdout(&mut cmd));
assert_eq!(lines, paths(&[
"bar/test", "baz/bar/test", "baz/baz/bar/test", "baz/foo",
"baz/test", "foo", "test",
]));
// Now try again with the ignore file activated.
cmd.arg("--ignore-file").arg(".not-an-ignore");
let lines = paths_from_stdout(wd.stdout(&mut cmd));
assert_eq!(lines, paths(&[
"baz/bar/test", "baz/baz/bar/test", "baz/test", "test",
]));
// Now do it again, but inside the baz directory.
// Since the ignore file is interpreted relative to the CWD, this will
// cause the /bar anchored pattern to filter out baz/bar, which is a
// subtle difference between true parent ignore files and manually
// specified ignore files.
let mut cmd = wd.command();
cmd.arg("test").arg(".").arg("--ignore-file").arg("../.not-an-ignore");
cmd.current_dir(wd.path().join("baz"));
let lines = paths_from_stdout(wd.stdout(&mut cmd));
assert_eq!(lines, paths(&["baz/bar/test", "test"]));
});
// See: https://github.com/BurntSushi/ripgrep/issues/45
sherlock!(feature_45_precedence_with_others, "test", ".",
|wd: WorkDir, mut cmd: Command| {
wd.create(".not-an-ignore", "*.log");
wd.create(".ignore", "!imp.log");
wd.create("imp.log", "test");
wd.create("wat.log", "test");
cmd.arg("--ignore-file").arg(".not-an-ignore");
let lines: String = wd.stdout(&mut cmd);
assert_eq!(lines, "imp.log:test\n");
});
// See: https://github.com/BurntSushi/ripgrep/issues/45
sherlock!(feature_45_precedence_internal, "test", ".",
|wd: WorkDir, mut cmd: Command| {
wd.create(".not-an-ignore1", "*.log");
wd.create(".not-an-ignore2", "!imp.log");
wd.create("imp.log", "test");
wd.create("wat.log", "test");
cmd.arg("--ignore-file").arg(".not-an-ignore1");
cmd.arg("--ignore-file").arg(".not-an-ignore2");
let lines: String = wd.stdout(&mut cmd);
assert_eq!(lines, "imp.log:test\n");
});
// See: https://github.com/BurntSushi/ripgrep/issues/68
clean!(feature_68, "test", ".", |wd: WorkDir, mut cmd: Command| {
clean!(feature_68_no_ignore_vcs, "test", ".", |wd: WorkDir, mut cmd: Command| {
wd.create(".gitignore", "foo");
wd.create(".ignore", "bar");
wd.create("foo", "test");
@@ -778,7 +1046,8 @@ clean!(feature_68, "test", ".", |wd: WorkDir, mut cmd: Command| {
});
// See: https://github.com/BurntSushi/ripgrep/issues/70
sherlock!(feature_70, "sherlock", ".", |wd: WorkDir, mut cmd: Command| {
sherlock!(feature_70_smart_case, "sherlock", ".",
|wd: WorkDir, mut cmd: Command| {
cmd.arg("--smart-case");
let lines: String = wd.stdout(&mut cmd);
@@ -798,6 +1067,16 @@ sherlock!(feature_89_files_with_matches, "Sherlock", ".",
assert_eq!(lines, "sherlock\x00");
});
// See: https://github.com/BurntSushi/ripgrep/issues/89
sherlock!(feature_89_files_without_matches, "Sherlock", ".",
|wd: WorkDir, mut cmd: Command| {
wd.create("file.py", "foo");
cmd.arg("--null").arg("--files-without-match");
let lines: String = wd.stdout(&mut cmd);
assert_eq!(lines, "file.py\x00");
});
// See: https://github.com/BurntSushi/ripgrep/issues/89
sherlock!(feature_89_count, "Sherlock", ".",
|wd: WorkDir, mut cmd: Command| {
@@ -831,6 +1110,51 @@ sherlock\x00can extract a clew from a wisp of straw or a flake of cigar ash;
assert_eq!(lines, expected);
});
// See: https://github.com/BurntSushi/ripgrep/issues/109
clean!(feature_109_max_depth, "far", ".", |wd: WorkDir, mut cmd: Command| {
wd.create_dir("one");
wd.create("one/pass", "far");
wd.create_dir("one/too");
wd.create("one/too/many", "far");
cmd.arg("--maxdepth").arg("2");
let lines: String = wd.stdout(&mut cmd);
let expected = path("one/pass:far\n");
assert_eq!(lines, expected);
});
// See: https://github.com/BurntSushi/ripgrep/issues/124
clean!(feature_109_case_sensitive_part1, "test", ".",
|wd: WorkDir, mut cmd: Command| {
wd.create("foo", "tEsT");
cmd.arg("--smart-case").arg("--case-sensitive");
wd.assert_err(&mut cmd);
});
// See: https://github.com/BurntSushi/ripgrep/issues/124
clean!(feature_109_case_sensitive_part2, "test", ".",
|wd: WorkDir, mut cmd: Command| {
wd.create("foo", "tEsT");
cmd.arg("--ignore-case").arg("--case-sensitive");
wd.assert_err(&mut cmd);
});
// See: https://github.com/BurntSushi/ripgrep/issues/159
clean!(feature_159_works, "test", ".", |wd: WorkDir, mut cmd: Command| {
wd.create("foo", "test\ntest");
cmd.arg("-m1");
let lines: String = wd.stdout(&mut cmd);
assert_eq!(lines, "foo:test\n");
});
// See: https://github.com/BurntSushi/ripgrep/issues/159
clean!(feature_159_zero_max, "test", ".", |wd: WorkDir, mut cmd: Command| {
wd.create("foo", "test\ntest");
cmd.arg("-m0");
wd.assert_err(&mut cmd);
});
#[test]
fn binary_nosearch() {
let wd = WorkDir::new("binary_nosearch");

View File

@@ -43,9 +43,14 @@ impl WorkDir {
/// Create a new file with the given name and contents in this directory.
pub fn create<P: AsRef<Path>>(&self, name: P, contents: &str) {
self.create_bytes(name, contents.as_bytes());
}
/// Create a new file with the given name and contents in this directory.
pub fn create_bytes<P: AsRef<Path>>(&self, name: P, contents: &[u8]) {
let path = self.dir.join(name);
let mut file = nice_err(&path, File::create(&path));
nice_err(&path, file.write_all(contents.as_bytes()));
nice_err(&path, file.write_all(contents));
nice_err(&path, file.flush());
}
@@ -71,8 +76,29 @@ impl WorkDir {
}
/// Returns the path to the ripgrep executable.
#[cfg(not(windows))]
pub fn bin(&self) -> PathBuf {
self.root.join("rg")
let path = self.root.join("rg");
if !path.is_file() {
// Looks like a recent version of Cargo changed the cwd or the
// location of the test executable.
self.root.join("../rg")
} else {
path
}
}
/// Returns the path to the ripgrep executable.
#[cfg(windows)]
pub fn bin(&self) -> PathBuf {
let path = self.root.join("rg.exe");
if !path.is_file() {
// Looks like a recent version of Cargo changed the cwd or the
// location of the test executable.
self.root.join("../rg.exe")
} else {
path
}
}
/// Returns the path to this directory.
@@ -83,7 +109,7 @@ impl WorkDir {
/// Creates a directory symlink to the src with the given target name
/// in this directory.
#[cfg(not(windows))]
pub fn link<S: AsRef<Path>, T: AsRef<Path>>(&self, src: S, target: T) {
pub fn link_dir<S: AsRef<Path>, T: AsRef<Path>>(&self, src: S, target: T) {
use std::os::unix::fs::symlink;
let src = self.dir.join(src);
let target = self.dir.join(target);
@@ -91,8 +117,10 @@ impl WorkDir {
nice_err(&target, symlink(&src, &target));
}
/// Creates a directory symlink to the src with the given target name
/// in this directory.
#[cfg(windows)]
pub fn link<S: AsRef<Path>, T: AsRef<Path>>(&self, src: S, target: T) {
pub fn link_dir<S: AsRef<Path>, T: AsRef<Path>>(&self, src: S, target: T) {
use std::os::windows::fs::symlink_dir;
let src = self.dir.join(src);
let target = self.dir.join(target);
@@ -100,6 +128,32 @@ impl WorkDir {
nice_err(&target, symlink_dir(&src, &target));
}
/// Creates a file symlink to the src with the given target name
/// in this directory.
#[cfg(not(windows))]
pub fn link_file<S: AsRef<Path>, T: AsRef<Path>>(
&self,
src: S,
target: T,
) {
self.link_dir(src, target);
}
/// Creates a file symlink to the src with the given target name
/// in this directory.
#[cfg(windows)]
pub fn link_file<S: AsRef<Path>, T: AsRef<Path>>(
&self,
src: S,
target: T,
) {
use std::os::windows::fs::symlink_file;
let src = self.dir.join(src);
let target = self.dir.join(target);
let _ = fs::remove_file(&target);
nice_err(&target, symlink_file(&src, &target));
}
/// Runs and captures the stdout of the given command.
///
/// If the return type could not be created from a string, then this
@@ -120,7 +174,41 @@ impl WorkDir {
/// Gets the output of a command. If the command failed, then this panics.
pub fn output(&self, cmd: &mut process::Command) -> process::Output {
let o = cmd.output().unwrap();
let output = cmd.output().unwrap();
self.expect_success(cmd, output)
}
/// Pipe `input` to a command, and collect the output.
pub fn pipe(
&self,
cmd: &mut process::Command,
input: &str
) -> process::Output {
cmd.stdin(process::Stdio::piped());
cmd.stdout(process::Stdio::piped());
cmd.stderr(process::Stdio::piped());
let mut child = cmd.spawn().unwrap();
// Pipe input to child process using a separate thread to avoid
// risk of deadlock between parent and child process.
let mut stdin = child.stdin.take().expect("expected standard input");
let input = input.to_owned();
let worker = thread::spawn(move || {
write!(stdin, "{}", input)
});
let output = self.expect_success(cmd, child.wait_with_output().unwrap());
worker.join().unwrap().unwrap();
output
}
/// If `o` is not the output of a successful process run
fn expect_success(
&self,
cmd: &process::Command,
o: process::Output
) -> process::Output {
if !o.status.success() {
let suggest =
if o.stderr.is_empty() {

21
wincolor/Cargo.toml Normal file
View File

@@ -0,0 +1,21 @@
[package]
name = "wincolor"
version = "0.1.0" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
A simple Windows specific API for controlling text color in a Windows console.
"""
documentation = "https://docs.rs/wincolor"
homepage = "https://github.com/BurntSushi/ripgrep/tree/master/wincolor"
repository = "https://github.com/BurntSushi/ripgrep/tree/master/wincolor"
readme = "README.md"
keywords = ["windows", "win", "color", "ansi", "console"]
license = "Unlicense/MIT"
[lib]
name = "wincolor"
bench = false
[dependencies]
kernel32-sys = "0.2.2"
winapi = "0.2.8"

44
wincolor/README.md Normal file
View File

@@ -0,0 +1,44 @@
wincolor
========
A simple Windows specific API for controlling text color in a Windows console.
The purpose of this crate is to expose the full inflexibility of the Windows
console without any platform independent abstraction.
[![Windows build status](https://ci.appveyor.com/api/projects/status/github/BurntSushi/ripgrep?svg=true)](https://ci.appveyor.com/project/BurntSushi/ripgrep)
[![](https://img.shields.io/crates/v/wincolor.svg)](https://crates.io/crates/wincolor)
Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org).
### Documentation
[https://docs.rs/wincolor](https://docs.rs/wincolor)
### Usage
Add this to your `Cargo.toml`:
```toml
[dependencies]
wincolor = "0.1"
```
and this to your crate root:
```rust
extern crate wincolor;
```
### Example
This is a simple example that shows how to write text with a foreground color
of cyan and the intense attribute set:
```rust
use wincolor::{Console, Color, Intense};
let mut con = Console::stdout().unwrap();
con.fg(Intense::Yes, Color::Cyan).unwrap();
println!("This text will be intense cyan.");
con.reset().unwrap();
println!("This text will be normal.");
```

29
wincolor/src/lib.rs Normal file
View File

@@ -0,0 +1,29 @@
/*!
This crate provides a safe and simple Windows specific API to control
text attributes in the Windows console. Text attributes are limited to
foreground/background colors, as well as whether to make colors intense or not.
Note that on non-Windows platforms, this crate is empty but will compile.
# Example
```no_run
use wincolor::{Console, Color, Intense};
let mut con = Console::stdout().unwrap();
con.fg(Intense::Yes, Color::Cyan).unwrap();
println!("This text will be intense cyan.");
con.reset().unwrap();
println!("This text will be normal.");
```
*/
#[cfg(windows)]
extern crate kernel32;
#[cfg(windows)]
extern crate winapi;
#[cfg(windows)]
pub use win::*;
#[cfg(windows)]
mod win;

223
wincolor/src/win.rs Normal file
View File

@@ -0,0 +1,223 @@
use std::io;
use std::mem;
use kernel32;
use winapi::{DWORD, HANDLE, WORD};
use winapi::winbase::STD_OUTPUT_HANDLE;
use winapi::wincon::{
FOREGROUND_BLUE as FG_BLUE,
FOREGROUND_GREEN as FG_GREEN,
FOREGROUND_RED as FG_RED,
FOREGROUND_INTENSITY as FG_INTENSITY,
};
const FG_CYAN: DWORD = FG_BLUE | FG_GREEN;
const FG_MAGENTA: DWORD = FG_BLUE | FG_RED;
const FG_YELLOW: DWORD = FG_GREEN | FG_RED;
const FG_WHITE: DWORD = FG_BLUE | FG_GREEN | FG_RED;
/// A Windows console.
///
/// This represents a very limited set of functionality available to a Windows
/// console. In particular, it can only change text attributes such as color
/// and intensity.
///
/// There is no way to "write" to this console. Simply write to
/// stdout or stderr instead, while interleaving instructions to the console
/// to change text attributes.
///
/// A common pitfall when using a console is to forget to flush writes to
/// stdout before setting new text attributes.
#[derive(Debug)]
pub struct Console {
handle: HANDLE,
start_attr: TextAttributes,
cur_attr: TextAttributes,
}
unsafe impl Send for Console {}
impl Drop for Console {
fn drop(&mut self) {
unsafe { kernel32::CloseHandle(self.handle); }
}
}
impl Console {
/// Create a new Console to stdout.
///
/// If there was a problem creating the console, then an error is returned.
pub fn stdout() -> io::Result<Console> {
let mut info = unsafe { mem::zeroed() };
let (handle, res) = unsafe {
let handle = kernel32::GetStdHandle(STD_OUTPUT_HANDLE);
(handle, kernel32::GetConsoleScreenBufferInfo(handle, &mut info))
};
if res == 0 {
return Err(io::Error::last_os_error());
}
let attr = TextAttributes::from_word(info.wAttributes);
Ok(Console {
handle: handle,
start_attr: attr,
cur_attr: attr,
})
}
/// Applies the current text attributes.
fn set(&mut self) -> io::Result<()> {
let attr = self.cur_attr.to_word();
let res = unsafe {
kernel32::SetConsoleTextAttribute(self.handle, attr)
};
if res == 0 {
return Err(io::Error::last_os_error());
}
Ok(())
}
/// Apply the given intensity and color attributes to the console
/// foreground.
///
/// If there was a problem setting attributes on the console, then an error
/// is returned.
pub fn fg(&mut self, intense: Intense, color: Color) -> io::Result<()> {
self.cur_attr.fg_color = color;
self.cur_attr.fg_intense = intense;
self.set()
}
/// Apply the given intensity and color attributes to the console
/// background.
///
/// If there was a problem setting attributes on the console, then an error
/// is returned.
pub fn bg(&mut self, intense: Intense, color: Color) -> io::Result<()> {
self.cur_attr.bg_color = color;
self.cur_attr.bg_intense = intense;
self.set()
}
/// Reset the console text attributes to their original settings.
///
/// The original settings correspond to the text attributes on the console
/// when this `Console` value was created.
///
/// If there was a problem setting attributes on the console, then an error
/// is returned.
pub fn reset(&mut self) -> io::Result<()> {
self.cur_attr = self.start_attr;
self.set()
}
}
/// A representation of text attributes for the Windows console.
#[derive(Copy, Clone, Debug, Eq, PartialEq)]
struct TextAttributes {
fg_color: Color,
fg_intense: Intense,
bg_color: Color,
bg_intense: Intense,
}
impl TextAttributes {
fn to_word(&self) -> WORD {
let mut w = 0;
w |= self.fg_color.to_fg();
w |= self.fg_intense.to_fg();
w |= self.bg_color.to_bg();
w |= self.bg_intense.to_bg();
w as WORD
}
fn from_word(word: WORD) -> TextAttributes {
let attr = word as DWORD;
TextAttributes {
fg_color: Color::from_fg(attr),
fg_intense: Intense::from_fg(attr),
bg_color: Color::from_bg(attr),
bg_intense: Intense::from_bg(attr),
}
}
}
/// Whether to use intense colors or not.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub enum Intense {
Yes,
No,
}
impl Intense {
fn to_bg(&self) -> DWORD {
self.to_fg() << 4
}
fn from_bg(word: DWORD) -> Intense {
Intense::from_fg(word >> 4)
}
fn to_fg(&self) -> DWORD {
match *self {
Intense::No => 0,
Intense::Yes => FG_INTENSITY,
}
}
fn from_fg(word: DWORD) -> Intense {
if word & FG_INTENSITY > 0 {
Intense::Yes
} else {
Intense::No
}
}
}
/// The set of available colors for use with a Windows console.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub enum Color {
Black,
Blue,
Green,
Red,
Cyan,
Magenta,
Yellow,
White,
}
impl Color {
fn to_bg(&self) -> DWORD {
self.to_fg() << 4
}
fn from_bg(word: DWORD) -> Color {
Color::from_fg(word >> 4)
}
fn to_fg(&self) -> DWORD {
match *self {
Color::Black => 0,
Color::Blue => FG_BLUE,
Color::Green => FG_GREEN,
Color::Red => FG_RED,
Color::Cyan => FG_CYAN,
Color::Magenta => FG_MAGENTA,
Color::Yellow => FG_YELLOW,
Color::White => FG_WHITE,
}
}
fn from_fg(word: DWORD) -> Color {
match word & 0b111 {
FG_BLUE => Color::Blue,
FG_GREEN => Color::Green,
FG_RED => Color::Red,
FG_CYAN => Color::Cyan,
FG_MAGENTA => Color::Magenta,
FG_YELLOW => Color::Yellow,
FG_WHITE => Color::White,
_ => Color::Black,
}
}
}