Compare commits

...

145 Commits

Author SHA1 Message Date
Andrew Gallant
4981991a6e 0.2.2 2016-10-10 22:24:36 -04:00
Andrew Gallant
51440f59cd Don't include HomebrewFormula in crate. 2016-10-10 22:24:28 -04:00
Andrew Gallant
7b8a8d77d0 changelog 0.2.2 2016-10-10 22:18:21 -04:00
Andrew Gallant
4737326ed3 Update regex-syntax for bug fix.
The bug fix was in expression pretty printing. ripgrep parses the regex
into an AST and may do some modifications to it, which requires the
ability to go from string -> AST -> string' -> AST' such that
AST == AST', i.e., printing and re-parsing must preserve the AST.

Also, add a regression test for the specific regex that tripped the bug.

Fixes #156.
2016-10-10 22:04:29 -04:00
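A minimal sketch of the round-trip property described above, using hypothetical `parse` and `pretty_print` helpers (the real code relies on the `regex-syntax` crate, whose API isn't shown here): printing a parsed AST and re-parsing the result must yield an equal AST.

```
// Hypothetical stand-ins for regex-syntax's parser and printer.
#[derive(Debug, PartialEq)]
enum Ast {
    Literal(char),
    Concat(Vec<Ast>),
}

fn parse(pattern: &str) -> Ast {
    // Toy parser: treat every character as a literal.
    Ast::Concat(pattern.chars().map(Ast::Literal).collect())
}

fn pretty_print(ast: &Ast) -> String {
    match ast {
        Ast::Literal(c) => c.to_string(),
        Ast::Concat(parts) => parts.iter().map(pretty_print).collect(),
    }
}

#[test]
fn roundtrip_preserves_ast() {
    let ast = parse("abc");            // AST
    let printed = pretty_print(&ast);  // string'
    let reparsed = parse(&printed);    // AST'
    assert_eq!(ast, reparsed);         // AST == AST'
}
```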
Andrew Gallant
a3537aa32a Update darwin cfg attributes. 2016-10-10 21:48:47 -04:00
Andrew Gallant
d3e118a786 Fix debug expression statement. 2016-10-10 21:48:34 -04:00
Andrew Gallant
4e52059ad6 Disable regression_131 test on darwin.
It's not clear why it's failing. Maybe it doesn't permit certain
characters in file paths?
2016-10-10 21:03:11 -04:00
Andrew Gallant
60c016c243 Fix docopt usage string. Gah. 2016-10-10 20:49:39 -04:00
Andrew Gallant
4665128f25 Clarify documentation for --replace.
Also add a minor clarification for --type-add.

Fixes #147
2016-10-10 20:19:45 -04:00
Andrew Gallant
dde5bd5a80 globset-0.1.0 2016-10-10 20:07:13 -04:00
Andrew Gallant
762ad44f71 add version marker 2016-10-10 20:06:35 -04:00
Andrew Gallant
705386934d Fill in globset/Cargo.toml with more details. 2016-10-10 19:50:21 -04:00
Andrew Gallant
97bbc6ef11 Update appveyor to test subcrates. 2016-10-10 19:35:47 -04:00
Andrew Gallant
27a980c1bc Fix symlink test.
We attempt to run it on Windows, but I'm getting "access denied" errors
when trying to create a file symlink. So we disable the test on Windows.
2016-10-10 19:34:57 -04:00
Andrew Gallant
e8645dc8ae style nits 2016-10-10 19:27:12 -04:00
Andrew Gallant
e96d93034a Finish overhaul of glob matching.
This commit completes the initial move of glob matching to an external
crate, including fixing up cross platform support, polishing the
external crate for others to use and fixing a number of bugs in the
process.

Fixes #87, #127, #131
2016-10-10 19:24:18 -04:00
Andrew Gallant
bc5accc035 Merge pull request #161 from moshen/update-homebrew-readme
Update Homebrew instructions in the README
2016-10-10 19:11:50 -04:00
Andrew Gallant
c9d0ca8257 Merge pull request #157 from CannedYerins/follow-explicit-args
Always follow symlinks on explicit file arguments
2016-10-10 19:11:10 -04:00
Andrew Gallant
45fe4aab96 Merge pull request #155 from theamazingfedex/adding-extra-md-filetype
Adding extra .md filetype for ease of access to Markdown filetypes
2016-10-10 19:09:15 -04:00
Andrew Gallant
97f981fbcb Merge pull request #154 from theamazingfedex/adding-spark-filetype
Adding .spark filetype
2016-10-10 19:08:20 -04:00
Andrew Gallant
59329dcc61 Merge pull request #153 from theamazingfedex/master
Adding .config filetype
2016-10-10 19:08:06 -04:00
Colin Kennedy
604da8eb86 Update Homebrew instructions in the README 2016-10-09 23:45:02 -05:00
Ian Kerins
1c964372ad Always follow symlinks on explicit file arguments. 2016-10-08 22:40:03 -04:00
Daniel Wood
50a961960e added extra .md filetype for ease of access 2016-10-07 14:37:29 -06:00
Daniel Wood
7481c5fe29 added .spark filetype 2016-10-07 14:35:20 -06:00
Daniel Wood
3ae37b0937 added .config filetype 2016-10-07 13:50:26 -06:00
Andrew Gallant
4ee6dbe422 Merge pull request #148 from munyari/patch-1
Change Arch Linux instructions
2016-10-05 09:53:17 -04:00
Panashe Fundira
cd4bdcf810 Change Arch Linux instructions
The `-Syu` flag will do a full system upgrade and then install the package, which is not necessarily the desired behavior. Only the `-S` flag is necessary to install a single package.
See https://wiki.archlinux.org/index.php/Pacman#Installing_specific_packages
https://wiki.archlinux.org/index.php/Pacman#Upgrading_packages
2016-10-05 08:26:19 -04:00
Andrew Gallant
175406df01 Refactor and test glob sets.
This commit goes a long way toward refactoring glob sets so that the
code is easier to maintain going forward. In particular, it makes the
literal optimizations that glob sets used a lot more structured and much
easier to extend. Tests have also been modified to include glob sets.

There's still a bit of polish work left to do before a release.

This also fixes the immediate issue where large gitignore files were
causing ripgrep to slow way down. While we don't technically fix it for
good, we're a lot better about reducing the number of regexes we
compile. In particular, if a gitignore file contains thousands of
patterns that can't be matched more simply using literals, then ripgrep
will slow down again. We could fix this for good by avoiding RegexSet if
the number of regexes grows too large.

Fixes #134.
2016-10-04 20:28:56 -04:00
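As a rough illustration of the approach (not the actual `globset` code): the regex crate's `RegexSet` matches one path against many patterns in a single pass, so the cost is paid once per path rather than once per pattern, and literal fast paths keep most patterns out of the regex engine entirely.

```
use regex::RegexSet;

fn main() {
    // A few gitignore-style patterns already translated to regexes. The real
    // globset crate does this translation and tries literal prefix/suffix
    // matches before falling back to a RegexSet.
    let set = RegexSet::new(&[
        r"^target/",       // ignore the build directory
        r"\.o$",           // ignore object files
        r"^docs/.*\.tmp$", // ignore temp files under docs/
    ]).unwrap();

    let path = "docs/notes.tmp";
    // `matches` runs all patterns simultaneously over the input.
    let matched: Vec<usize> = set.matches(path).into_iter().collect();
    println!("{} matched patterns {:?}", path, matched);
}
```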
Andrew Gallant
89811d43d4 Merge pull request #146 from samuelcolvin/add-jinja-type
add jinja type for *.jinja and *.jinja2
2016-10-04 08:25:22 -04:00
Samuel Colvin
f0053682c0 add jinja type for *.jinja and *.jinja2 2016-10-04 13:15:31 +01:00
Andrew Gallant
35045d6105 Merge pull request #143 from moshen/change-brew-formula-name
Fix brew formula name to not conflict with core
2016-10-04 06:49:16 -04:00
Colin Kennedy
95f552fc06 Fix brew formula name to not conflict with core
Since the homebrew-core formula was accepted, we should differentiate
the prebuilt formula available in this tap.
2016-10-03 22:30:26 -05:00
Andrew Gallant
48353bea17 Merge pull request #144 from bitshifter/dotcmake
Added *.cmake extension to cmake file type.
2016-10-03 20:51:05 -04:00
Cameron Hart
703d5b558e Added *.cmake extension to cmake file type. 2016-10-04 11:28:49 +11:00
Andrew Gallant
47efea234f Remove i686-darwin.
Apparently 32 bit Mac CPUs are really old at this point. Also, it has
been causing CI to fail lately. It's not worth it.
2016-10-03 17:16:28 -04:00
Andrew Gallant
ca0d8998a2 Merge pull request #139 from moshen/make-a-tap
Make the repo a Homebrew Tap
2016-10-03 17:14:09 -04:00
Andrew Gallant
fdf24317ac Move glob implementation to new crate.
It is isolated and complex enough that it deserves attention all on its
own. It's also eminently reusable.
2016-09-30 19:42:41 -04:00
Andrew Gallant
b9d5f22a4d Stopgap measure for projects with huge gitignore files.
This helps #134 by avoiding a slow regex execution path, but doesn't
actually fix the problem. Namely, we've gone from "so slow I'm not going
to keep waiting for rg to finish" to "wow that was slow but at least it
finished before I lost my patience."
2016-09-30 19:29:52 -04:00
Colin Kennedy
67bb4f040f Make the repo a Homebrew Tap 2016-09-30 12:51:37 -05:00
Andrew Gallant
cee2f09a6d Merge pull request #121 from lilydjwg/master
if --color always, always print with color, even when --vimgrep is given
2016-09-29 16:49:44 -04:00
Andrew Gallant
ced777e91f Merge pull request #133 from akien-mga/pr-appveyor
AppVeyor: Change release description to fit Travis binaries
2016-09-29 09:35:44 -04:00
Rémi Verschelde
e9d9083898 AppVeyor: Change release description to fit Travis binaries 2016-09-29 15:29:59 +02:00
Andrew Gallant
46dff8f4be Be better with short circuiting with --quiet.
It didn't make sense for --quiet to be part of the printer, because --quiet
doesn't just mean "don't print"; it also means "stop after the first
match is found." This needs to be wired all the way up through directory
traversal, and it also needs to cause all of the search workers to quit
as well. We do it with an atomic that is only checked when --quiet is
given.

Fixes #116.
2016-09-28 20:50:50 -04:00
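A minimal sketch of that idea (not ripgrep's actual internals): workers share an atomic flag, check it before doing more work only when --quiet is in effect, and set it on the first match so the other workers stop early.

```
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

fn main() {
    let quiet = true; // pretend --quiet was given on the command line
    let found = Arc::new(AtomicBool::new(false));

    let handles: Vec<_> = (0..4).map(|worker| {
        let found = Arc::clone(&found);
        thread::spawn(move || {
            for chunk in 0..1_000 {
                // Only pay for the atomic check when --quiet is in effect.
                if quiet && found.load(Ordering::SeqCst) {
                    return; // another worker already found a match
                }
                // Stand-in for "search this chunk"; pretend worker 2 matches.
                if worker == 2 && chunk == 10 {
                    found.store(true, Ordering::SeqCst);
                    return;
                }
            }
        })
    }).collect();

    for h in handles {
        h.join().unwrap();
    }
    println!("match found: {}", found.load(Ordering::SeqCst));
}
```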
Andrew Gallant
7aa6e87952 clarify 2016-09-28 16:47:10 -04:00
Andrew Gallant
925d0db9f0 Add -s/--case-sensitive flag.
This flag overrides both --smart-case and --ignore-case.

Closes #124.
2016-09-28 16:32:29 -04:00
Andrew Gallant
316ffd87b3 bump docopt to 0.6.86 2016-09-28 15:56:59 -04:00
依云
5943b1effe if --color always, always print with color, even when --vimgrep is given 2016-09-28 20:10:07 +08:00
Andrew Gallant
c42f97b4da Merge pull request #122 from lilydjwg/color-filename
colorize filepath at the beginning of line too
2016-09-28 07:06:34 -04:00
依云
0d9bba7816 colorize filepath at the beginning of line too 2016-09-28 11:54:43 +08:00
Andrew Gallant
3550f2e29a Merge pull request #111 from gsquire/max-depth
Max depth option
2016-09-27 19:46:29 -04:00
Garrett Squire
babe80d498 add a max-depth option for directory traversal
Address CR feedback and add an integration test.
2016-09-27 16:14:53 -07:00
Andrew Gallant
3e892a7a80 Correct example with --type-add.
Fixes #118.
2016-09-27 18:35:06 -04:00
Andrew Gallant
1df3f0b793 Merge pull request #114 from cetra3/colorChoice
Create Colour Choice struct to adjust colours depending on platform
2016-09-27 09:43:47 -04:00
cetra3
b3935935cb Add colour choice 2016-09-27 22:51:07 +09:30
Andrew Gallant
67abbf6f22 Merge pull request #115 from nickstenning/update-brew-hashes
Update brew 0.2.1 package hashes
2016-09-27 07:01:27 -04:00
Nick Stenning
7b9f7d7dc6 Update brew 0.2.1 package hashes 2016-09-27 12:01:22 +02:00
Andrew Gallant
7ab29a91d0 fix use of --type-add 2016-09-26 20:58:28 -04:00
Andrew Gallant
9fa38c6232 brew 0.2.1 2016-09-26 20:42:12 -04:00
Andrew Gallant
de79be2db2 0.2.1 2016-09-26 20:02:58 -04:00
Andrew Gallant
416b69bae5 changelog 0.2.1 2016-09-26 20:02:47 -04:00
Andrew Gallant
3e78fce3a3 Don't print empty lines in single threaded mode.
Fixes #99.
2016-09-26 19:57:23 -04:00
Andrew Gallant
7a3fd1f23f Add a --null flag.
This flag causes a NUL byte to follow any file path in ripgrep's output.

Closes #89.
2016-09-26 19:21:17 -04:00
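A tiny sketch of the output difference (hypothetical writer code, not ripgrep's printer): with --null the path terminator becomes a NUL byte instead of a newline, which makes the output safe to pipe into tools like `xargs -0`.

```
use std::io::{self, Write};

// Hypothetical helper: write a matched file path with the chosen terminator.
fn write_path<W: Write>(mut out: W, path: &str, null_terminated: bool) -> io::Result<()> {
    out.write_all(path.as_bytes())?;
    out.write_all(if null_terminated { b"\x00" } else { b"\n" })?;
    Ok(())
}

fn main() -> io::Result<()> {
    let stdout = io::stdout();
    let mut out = stdout.lock();
    write_path(&mut out, "src/main.rs", true)?;  // as with --null
    write_path(&mut out, "src/lib.rs", false)?;  // default behavior
    Ok(())
}
```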
Andrew Gallant
d306403440 Fix an off-by-one error with --column.
Fixes #105.
2016-09-26 19:09:59 -04:00
Andrew Gallant
ebabe1df6a Merge branch 'gitignore_blank_lines' 2016-09-26 18:55:43 -04:00
Andrew Gallant
f27aa3ff6f Add regression test.
Fixes #106.
2016-09-26 18:55:26 -04:00
Tom Jackson
20ccd441f2 Allow (and ignore) whitespace-only lines in .gitignore files
Git considers these to be blank lines.
2016-09-26 18:52:57 -04:00
Andrew Gallant
104d740f76 Don't quit if opening a file fails.
This was already working correctly in multithreaded mode, but in single
threaded mode, a file failing to open caused search to stop. That's bad.

Fixes #98.
2016-09-26 18:44:19 -04:00
Andrew Gallant
2da0eab2b8 Don't initialize ignores for file arguments.
We'll never use them, so it's wasted effort.
2016-09-26 18:44:19 -04:00
Andrew Gallant
b8c7864a02 Merge pull request #107 from kaushalmodi/add-systemverilog-type
Add SystemVerilog (SV) type
2016-09-26 16:54:49 -04:00
Kaushal Modi
ec26995655 Add SystemVerilog (SV) type 2016-09-26 15:24:35 -04:00
Andrew Gallant
a41235a3b5 Merge pull request #100 from emlyn/patch-1
Recognise cljc and cljx extensions as Clojure(script)
2016-09-26 08:59:59 -04:00
Emlyn Corrin
1a91b900e7 Clojure files can also end in cljc or cljx
(see https://github.com/clojure/clojurescript/wiki/Using-cljc)
2016-09-26 09:53:46 +01:00
Andrew Gallant
2b15832655 update brew formula to 0.2.0 2016-09-25 22:50:50 -04:00
Andrew Gallant
b1c52b52d6 0.2.0 2016-09-25 22:32:14 -04:00
Andrew Gallant
109bc3f78e bump grep to 0.1.3 2016-09-25 22:30:17 -04:00
Andrew Gallant
b62195b33f grep 0.1.3 2016-09-25 22:29:35 -04:00
Andrew Gallant
baebfd7add changelog 0.2.0 2016-09-25 22:27:58 -04:00
Andrew Gallant
19e405e5c5 fix windows 2016-09-25 21:48:01 -04:00
Andrew Gallant
f85822266f Don't use an intermediate buffer when --threads=1.
Fixes #8
2016-09-25 21:27:17 -04:00
Andrew Gallant
b034b77798 Don't replace NUL bytes when searching binary files as text.
This was a result of misinterpreting a feature in grep where NUL bytes
are replaced with \n. The primary reason for doing this is to avoid
excessive memory usage on truly binary data. However, grep only does this
when searching binary files as binary, in which case it only reports
whether the file matched or not. When grep is told to search binary data
as text (the -a/--text flag), it doesn't do any replacement, so we
shouldn't either.

In general, this makes sense, because the user is essentially asserting
that a particular file that looks like binary is actually text. In that
case, we shouldn't try to replace any NUL bytes.

ripgrep doesn't actually support searching binary data for whether it
matches or not, so we don't actually need the replace_buf function.
However, it does seem like a potentially useful feature.
2016-09-25 21:26:49 -04:00
Andrew Gallant
278e1168bf Make printing paths a bit faster.
It seems silly, but on *nix, we can just dump the bytes of the path
straight to the terminal. There's no need to do a UTF-8 check, which
can be costly when printing lots of matches.
2016-09-25 21:23:26 -04:00
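A minimal sketch of the trick (assuming nothing about ripgrep's printer): on Unix, a `Path` can be written as raw bytes via `OsStrExt`, skipping UTF-8 validation entirely.

```
use std::io::{self, Write};
use std::path::Path;

#[cfg(unix)]
fn write_path(out: &mut impl Write, path: &Path) -> io::Result<()> {
    use std::os::unix::ffi::OsStrExt;
    // Dump the raw bytes of the path; no UTF-8 check required.
    out.write_all(path.as_os_str().as_bytes())
}

#[cfg(not(unix))]
fn write_path(out: &mut impl Write, path: &Path) -> io::Result<()> {
    // On other platforms, fall back to a lossy UTF-8 conversion.
    out.write_all(path.to_string_lossy().as_bytes())
}

fn main() -> io::Result<()> {
    let stdout = io::stdout();
    let mut out = stdout.lock();
    write_path(&mut out, Path::new("./src/search.rs"))?;
    out.write_all(b"\n")
}
```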
Andrew Gallant
6a8051b258 Don't union inner literals of repetitions.
If we do, this results in extracting `foofoofoo` from `(\wfoo){3}`,
which is wrong. This does prevent us from extracting `foofoofoo` from
`foo{3}`, which is unfortunate, but we miss plenty of other stuff too.
Literal extracting needs a good rethink (all the way down into the regex
engine).

Fixes #93
2016-09-25 20:10:28 -04:00
Andrew Gallant
a13ac3e3d4 On Windows, always consider stdin to be a tty.
This means that `rg pat < file` won't do the expected thing and search
`file`. Instead, it will recursively search the current directory for `pat`.
This isn't ideal, but it is better than the previous behavior, which was to
wait for stdin when running `rg pat`, giving the appearance of hanging
forever. The former is an important use case, but the latter is the
*central* use case of ripgrep, so we should make that work.

`rg` can still be used to search stdin on Windows, it just needs to be
done explicitly. e.g., `rg pat - < file` will search for `pat` in `file`.

Fixes #19
2016-09-25 20:00:29 -04:00
Andrew Gallant
a72467996b Fix Windows compilation error. 2016-09-25 20:00:29 -04:00
Andrew Gallant
9395076468 Merge pull request #92 from svenstaro/patch-1
ripgrep is now in [community]
2016-09-25 18:50:03 -04:00
Sven-Hendrik Haase
a12c63957b ripgrep is now in [community]
The README should reflect that.
2016-09-26 00:48:41 +02:00
Andrew Gallant
982265af70 Move --files-with-matches to less common options. 2016-09-25 18:32:41 -04:00
Andrew Gallant
ed94aedf27 Permit whitelisting hidden files in ignores.
Fixes #90
2016-09-25 18:31:41 -04:00
Andrew Gallant
fd5ae2f795 Add curly brace alternates to glob format.
Closes #80.
2016-09-25 17:28:23 -04:00
Andrew Gallant
3d6a39be06 Fix tests on Windows.
Mostly this is just using \\ instead of / in paths reported by the OS.
2016-09-25 15:45:51 -04:00
Andrew Gallant
e7839f2200 Merge pull request #71 from catchmrbharath/issue46
[Fixes #46] Use 1 less worker thread than number of threads
2016-09-25 15:02:38 -04:00
Andrew Gallant
9dc5464c84 Stop after first match is found with --quiet.
Fixes #77.
2016-09-25 15:01:29 -04:00
Andrew Gallant
95edcd4d3a Merge pull request #42 from andschwa/files-with-matches
Files with matches
2016-09-25 14:53:31 -04:00
Andrew Gallant
d97f404970 Stupid docopt.
It thinks `--type-clear is` is a flag spec.
2016-09-25 14:47:35 -04:00
Andrew Gallant
b2bbd46178 Clarify documentation of --type-add.
This explains it a bit more based on end user feedback. We also fix
the example, which was wrong.

Fixes #82.
2016-09-25 14:37:01 -04:00
Andrew Gallant
82542df5cb Merge pull request #84 from martinlindhe/ts
Add ts type for typescript
2016-09-25 11:33:11 -04:00
Martin Lindhe
e4329037aa Add ts type for typescript 2016-09-25 17:16:15 +02:00
Andrew Gallant
ab0d1c1c79 Be more conservative with stdin.
If no paths are given to ripgrep, only read from stdin if it's a file or
a FIFO. In particular, if something like `rg foo < /dev/null` is used,
then don't try to read from stdin.

Fixes #35, #81
2016-09-25 11:14:54 -04:00
Andrew Gallant
2015c56e8d Merge pull request #62 from martinlindhe/js-wc
--type js: include more extensions
2016-09-25 11:10:47 -04:00
Martin Lindhe
23ad8b989d --type js: include more extensions 2016-09-25 17:06:13 +02:00
Andrew Schwartzmeyer
a8f3d9e87e Add --files-with-matches flag.
Closes #26.

Acts like --count but emits only the paths of files with matches,
suitable for piping to xargs. Both mmap and no-mmap searches terminate
after the first match is found. Documentation updated and tests added.
2016-09-24 21:40:17 -07:00
Bharath M R
9f1aae64f8 [Fixes #46] Use 1 less worker thread than number of threads
The main thread does directory traversal, so the total number of threads
is the main thread plus the number of worker threads. We should have at
least one worker thread.
2016-09-24 19:48:26 -07:00
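A one-line sketch of the thread accounting under that assumption (the main thread walks directories while workers search):

```
use std::cmp;

fn worker_count(requested_threads: usize) -> usize {
    // The main thread does directory traversal, so searching gets the rest,
    // but never fewer than one worker.
    cmp::max(1, requested_threads.saturating_sub(1))
}

fn main() {
    assert_eq!(worker_count(1), 1);
    assert_eq!(worker_count(8), 7);
    println!("workers for --threads=8: {}", worker_count(8));
}
```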
Andrew Gallant
1595f0faf5 Add --smart-case.
It does what it says on the tin.

Closes #70.
2016-09-24 21:51:04 -04:00
Andrew Gallant
8eeb0c0b60 Add --no-ignore-vcs flag.
This flag will respect .ignore but not .gitignore.

Closes #68.
2016-09-24 21:31:24 -04:00
Andrew Gallant
423f2a1927 Permit options with --help/--version.
Fixes #47.
2016-09-24 21:13:24 -04:00
Andrew Gallant
4b5e789a2a Strip trailing whitespace in gitignore patterns.
Fixes #38.
2016-09-24 20:56:24 -04:00
Andrew Gallant
37b731a048 Update brew package to 0.1.17.
Closes #58, Fixes #13
2016-09-24 20:51:07 -04:00
Andrew Gallant
a44735aa87 Tweak memory maps on darwin.
Namely, don't automatically pick memory maps on darwin, ever. They
appear slower than standard read calls.

Closes #36.
2016-09-24 20:48:05 -04:00
Andrew Gallant
6b2efd4d88 If a file is empty, still try to search it.
Files like /proc/cpuinfo advertise themselves as normal files with size 0.
Normally, this isn't a problem, but if ripgrep decided to use a memory map,
it skipped searching when the file was empty, since it's an error to memory
map an empty file. Instead of returning 0, we should just fall back to
standard read calls.

Fixes #55.
2016-09-24 20:45:06 -04:00
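A hedged sketch of that decision (hypothetical search helpers, no real memory-map API, assuming a Linux-style /proc): if the reported length is zero, skip the memory map and fall back to plain reads.

```
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

// Hypothetical stand-in for a memory-map based search.
fn search_via_mmap(_file: &File) -> io::Result<usize> {
    unimplemented!("not shown; assume this maps the file and searches it")
}

// Fallback: plain reads work even when the size is reported as 0.
fn search_via_read(file: &mut File) -> io::Result<usize> {
    let mut buf = String::new();
    file.read_to_string(&mut buf)?;
    Ok(buf.matches("model name").count())
}

fn search(path: &Path) -> io::Result<usize> {
    let mut file = File::open(path)?;
    if file.metadata()?.len() == 0 {
        // Mapping an empty file is an error, and /proc files lie about size,
        // so use standard read calls instead of reporting nothing.
        search_via_read(&mut file)
    } else {
        search_via_mmap(&file)
    }
}

fn main() -> io::Result<()> {
    let count = search(Path::new("/proc/cpuinfo"))?;
    println!("matches: {}", count);
    Ok(())
}
```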
Andrew Gallant
c8227e0cf3 Don't ignore first path when using --files.
This is a docopt oddity, but probably not a bug. If --files is given,
then just interpret the pattern (if not empty) as the first file path.

Fixes #64.
2016-09-24 20:22:02 -04:00
Andrew Gallant
b941c10b90 Fix directory whitelisting.
There was a bug in the translation from a gitignore pattern to a standard
glob where `!/dir` wasn't being interpreted as an absolute path.

Fixes #67.
2016-09-24 20:10:30 -04:00
Andrew Gallant
872a107658 Fix whitelisting precedence.
Once a file is known to be whitelisted, we shouldn't check any ancestor
gitignores.
2016-09-24 20:09:29 -04:00
Andrew Gallant
71ad9bf393 Fix trailing recursive globs in gitignore.
A standard glob of `foo/**` will match `foo`, but gitignore semantics
specify that `foo/**` should only match the contents of `foo` and not
`foo` itself. We capture those semantics by translating `foo/**` to
`foo/**/*`.

Fixes #30.
2016-09-24 19:44:06 -04:00
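A small sketch of that translation step (hypothetical helper, not the globset implementation): a trailing recursive wildcard gets an extra `/*` so the directory itself no longer matches.

```
// Hypothetical translation of a gitignore pattern into a standard glob.
fn translate_trailing_recursive(pattern: &str) -> String {
    if pattern.ends_with("/**") {
        // `foo/**` should match foo's contents but not `foo` itself,
        // which `foo/**/*` captures in standard glob semantics.
        format!("{}/*", pattern)
    } else {
        pattern.to_string()
    }
}

fn main() {
    assert_eq!(translate_trailing_recursive("foo/**"), "foo/**/*");
    assert_eq!(translate_trailing_recursive("foo/*.rs"), "foo/*.rs");
    println!("ok");
}
```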
Andrew Gallant
f733e9ebe4 Fix typo.
Thanks @dmit!
2016-09-24 19:43:01 -04:00
Andrew Gallant
ce85df1d2e Clarify what rg does in --help.
Fixes #24.
2016-09-24 19:26:28 -04:00
Andrew Gallant
a6e3cab65a Add --no-filename flag.
When this flag is set, a filename is never shown for a match.

Closes #20
2016-09-24 19:24:24 -04:00
Andrew Gallant
7b860affbe Change the default output of --files to elide './'.
This is kind of a ticky-tack change. I do think ./ as a prefix is a
reasonable default, *but* we strip ./ when showing search results, so it
does make sense to be consistent.

Fixes #21.
2016-09-24 19:18:48 -04:00
Andrew Gallant
af4dc78537 Update to docopt 0.6.85.
The new version won't panic if printing to stdout fails.

Fixes #22.
2016-09-24 19:14:19 -04:00
Andrew Gallant
9ce0484670 Clarify the documentation of the --type-* flags.
Fixes #15
2016-09-24 18:55:48 -04:00
Andrew Gallant
346bad7dfc Fix handling of absolute patterns in parent gitignore files.
If a gitignore file in a *parent* directory is used, then it must be
matched relative to the directory it's in. ripgrep wasn't actually
adhering to this rule. Consider an example:

  .gitignore
  src
    llvm
      foo

Where `.gitignore` contains `/llvm/` and `foo` contains `test`. When
running `rg test` at the top-level directory, `foo` is correctly searched.
If you `cd` into `src` and re-run the same search, `foo` is ignored because
the `/llvm/` pattern is interpreted with respect to the current working
directory, which is wrong. The problem is that the path of `llvm` is
`./llvm`, which makes it look like it should match.

We fix this by rebuilding the directory path of each file when traversing
gitignores in parent directories. This does come with a small performance
hit.

Fixes #25.
2016-09-24 18:40:50 -04:00
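A rough sketch of the fix's idea (hypothetical code, not ripgrep's matcher): before testing a candidate against a gitignore found in a parent directory, rebuild the path relative to that parent so anchored patterns like `/llvm/` are judged from the right root.

```
use std::path::Path;

// Hypothetical: make `path` relative to the directory containing the
// gitignore before matching, instead of relative to the current directory.
fn relative_to_gitignore<'a>(gitignore_dir: &Path, path: &'a Path) -> Option<&'a Path> {
    path.strip_prefix(gitignore_dir).ok()
}

fn main() {
    let gitignore_dir = Path::new("/repo");          // where .gitignore lives
    let candidate = Path::new("/repo/src/llvm/foo"); // file being considered

    // Matched against `/llvm/` from /repo, `src/llvm/foo` does NOT match,
    // so the file is searched. Matching `./llvm` from inside src/ would
    // have wrongly looked like a hit.
    let rel = relative_to_gitignore(gitignore_dir, candidate).unwrap();
    assert_eq!(rel, Path::new("src/llvm/foo"));
    println!("match against gitignore using: {}", rel.display());
}
```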
Andrew Gallant
56fe93d343 Fix an absolute path name bug.
Namely, if a .gitignore inside a sub-directory has an absolute pattern,
e.g., `/foo/`, then we should match it relative to the directory containing
the .gitignore.
2016-09-24 17:31:24 -04:00
Andrew Gallant
155676b474 Fixes #43. 2016-09-24 16:34:34 -04:00
Andrew Gallant
a3fc4cdded Fix a bug in the translation from a gitignore pattern to a glob.
We were erroneously neglecting to prefix a pattern like `foo/`
with `**/` (to make `**/foo/`) because it had a slash in it. In fact, the
only reason to neglect a **/ prefix is if the pattern already starts
with **/, or if the pattern is absolute.

Fixes #16, #49, #50, #65
2016-09-24 16:29:25 -04:00
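A short sketch of the rule stated above (hypothetical helper): a pattern gets a `**/` prefix unless it is absolute or already starts with `**/`; whether it contains a slash elsewhere is irrelevant.

```
// Hypothetical translation step: decide whether a gitignore pattern needs
// a `**/` prefix so it can match at any directory depth.
fn add_recursive_prefix(pattern: &str) -> String {
    if pattern.starts_with('/') || pattern.starts_with("**/") {
        pattern.to_string()
    } else {
        format!("**/{}", pattern)
    }
}

fn main() {
    // `foo/` has a slash, but it is not absolute, so it still gets a prefix.
    assert_eq!(add_recursive_prefix("foo/"), "**/foo/");
    assert_eq!(add_recursive_prefix("/foo/"), "/foo/");
    assert_eq!(add_recursive_prefix("**/bar"), "**/bar");
    println!("ok");
}
```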
Andrew Gallant
3bec8f3f0a Impl Debug for IgnoreDir. 2016-09-24 16:29:25 -04:00
Andrew Gallant
3b37f12ec0 Merge pull request #69 from dloss/nim-filetype
Add support for the Nim programming language file type
2016-09-24 16:04:58 -04:00
Dirk Loss
a2ed677e03 Add support for the Nim programming language file type 2016-09-24 21:48:33 +02:00
Andrew Gallant
2fb9c3c42c Merge pull request #56 from chrisdoc/feature/swift-file-type
Add support for the Swift programming language file type
2016-09-24 14:34:06 -04:00
Andrew Gallant
447e1ba0e2 Merge pull request #66 from kontomondo/master
FSharp language file type
2016-09-24 14:33:01 -04:00
Konto Mondo
3b45059212 FSharp language file type 2016-09-24 10:30:30 -04:00
chrisdoc
f74078af5b Add support for the Swift programming language file type 2016-09-24 08:42:44 +02:00
Andrew Gallant
5ff9b2f2a2 Merge pull request #41 from BurntSushi/generic-ignore
Switch from .rgignore to .ignore.
2016-09-23 23:14:38 -04:00
Andrew Gallant
cc90511ab2 Switch from .rgignore to .ignore.
But don't actually remove support for .rgignore until the next semver
bump.

Note that this puts us in line with the silver searcher:
https://github.com/ggreer/the_silver_searcher/pull/974

Fixes #40
2016-09-23 22:44:33 -04:00
Andrew Gallant
f5d60a80a8 Merge pull request #28 from ledge23/patch-1
Add VB files to default type list
2016-09-23 22:33:12 -04:00
Andrew Gallant
6fa158f6d3 Merge pull request #29 from jimhester/r_extensions
Add a few more R relevant extensions
2016-09-23 22:32:10 -04:00
Andrew Gallant
ef6dea40ff Merge pull request #39 from JohnVillalovos/master
Prefer https:// over git://
2016-09-23 22:28:20 -04:00
John L. Villalovos
9035c6b7b3 Prefer https:// over git://
1) git is not a secure protocol and is vulnerable to man-in-the-middle
   attacks.
2) git:// is a pain for users behind proxy servers :(

Change-Id: I1901bebbaf8f64b23b070dee8732a6fb13cbdfdd
2016-09-23 16:34:24 -07:00
Andrew Gallant
f5eb36baac Fixing VC++ wording and link.
Kudos to @retep998
2016-09-23 18:39:07 -04:00
Andrew Gallant
6367dd61ba Column numbers should start at 1.
ripgrep was documented to do 1-based indexing, so this is a bug and not
a breaking change.

Fixes #18
2016-09-23 17:11:09 -04:00
Jim Hester
98892de1c1 Add a few more R relevant extensions 2016-09-23 14:48:15 -04:00
Zack Schuster
273c14a45a Add VB files to default type list
Use-case: While not a vogue technology, VB is still commonly taught in university settings and widely used in commercial settings. Working with VB files out-of-the-box would provide a lot of value to `ripgrep` users.

Example: I'm working on converting a legacy app to a modern infrastructure. The legacy app mixes CS and VB files liberally, so I always need to check both. For portability, it would be nice to just be able to ask for `-tcs -tvb` without registering with `--type-add` first.

Tests: I didn't notice any coverage aimed at this part of the code, but if I'm mistaken I'll amend the PR.
2016-09-23 11:44:53 -07:00
Andrew Gallant
b33e9cba69 0.1.17 2016-09-23 11:26:23 -04:00
Andrew Gallant
d5c045469b Don't use panic-on-abort.
We don't really care anyway; it was there as an experiment, and it seems
to be causing problems.

Fixes #14.
2016-09-23 11:25:46 -04:00
Andrew Gallant
0ce82403d4 Switch over to the real README. 2016-09-23 06:56:56 -04:00
Andrew Gallant
d2f95f6e59 bump PKGBUILD 2016-09-22 21:43:51 -04:00
39 changed files with 4307 additions and 1719 deletions

.gitignore (vendored, 1 change)

@@ -2,3 +2,4 @@
tags
target
/grep/Cargo.lock
/globset/Cargo.lock

.travis.yml

@@ -15,9 +15,6 @@ matrix:
- os: linux
rust: nightly
env: TARGET=x86_64-unknown-linux-musl
- os: osx
rust: nightly
env: TARGET=i686-apple-darwin
- os: osx
rust: nightly
env: TARGET=x86_64-apple-darwin

CHANGELOG.md (new file, 146 lines)

@@ -0,0 +1,146 @@
0.2.2
=====
Packaging updates:
* `ripgrep` is now in homebrew-core. `brew install ripgrep` will do the trick
on a Mac.
* `ripgrep` is now in the Archlinux community repository.
`pacman -S ripgrep` will do the trick on Archlinux.
* Support has been discontinued for i686-darwin.
* Glob matching has been moved out into its own crate:
[`globset`](https://crates.io/crates/globset).
Feature enhancements:
* Added or improved file type filtering for CMake, config, Jinja, Markdown
and Spark.
* [FEATURE #109](https://github.com/BurntSushi/ripgrep/issues/109):
Add a --max-depth flag for directory traversal.
* [FEATURE #124](https://github.com/BurntSushi/ripgrep/issues/124):
Add -s/--case-sensitive flag. Overrides --smart-case.
* [FEATURE #139](https://github.com/BurntSushi/ripgrep/pull/139):
The `ripgrep` repo is now a Homebrew tap. This is useful for installing
SIMD accelerated binaries, which aren't available in homebrew-core.
Bug fixes:
* [BUG #87](https://github.com/BurntSushi/ripgrep/issues/87),
[BUG #127](https://github.com/BurntSushi/ripgrep/issues/127),
[BUG #131](https://github.com/BurntSushi/ripgrep/issues/131):
Various issues related to glob matching.
* [BUG #116](https://github.com/BurntSushi/ripgrep/issues/116):
--quiet should stop search after first match.
* [BUG #121](https://github.com/BurntSushi/ripgrep/pull/121):
--color always should show colors, even when --vimgrep is used.
* [BUG #122](https://github.com/BurntSushi/ripgrep/pull/122):
Colorize file path at beginning of line.
* [BUG #134](https://github.com/BurntSushi/ripgrep/issues/134):
Processing a large ignore file (thousands of globs) was very slow.
* [BUG #137](https://github.com/BurntSushi/ripgrep/issues/137):
Always follow symlinks when given as an explicit argument.
* [BUG #147](https://github.com/BurntSushi/ripgrep/issues/147):
Clarify documentation for --replace.
0.2.1
=====
Feature enhancements:
* Added or improved file type filtering for Clojure and SystemVerilog.
* [FEATURE #89](https://github.com/BurntSushi/ripgrep/issues/89):
Add a --null flag that outputs a NUL byte after every file path.
Bug fixes:
* [BUG #98](https://github.com/BurntSushi/ripgrep/issues/98):
Fix a bug in single threaded mode where, if opening a file failed, ripgrep
quit instead of continuing the search.
* [BUG #99](https://github.com/BurntSushi/ripgrep/issues/99):
Fix another bug in single threaded mode where empty lines were being printed
by mistake.
* [BUG #105](https://github.com/BurntSushi/ripgrep/issues/105):
Fix an off-by-one error with --column.
* [BUG #106](https://github.com/BurntSushi/ripgrep/issues/106):
Fix a bug where a whitespace only line in a gitignore file caused ripgrep
to panic (i.e., crash).
0.2.0
=====
Feature enhancements:
* Added or improved file type filtering for VB, R, F#, Swift, Nim, Javascript
and TypeScript.
* [FEATURE #20](https://github.com/BurntSushi/ripgrep/issues/20):
Adds a --no-filename flag.
* [FEATURE #26](https://github.com/BurntSushi/ripgrep/issues/26):
Adds --files-with-matches flag. Like --count, but only prints file paths
and doesn't need to count every match.
* [FEATURE #40](https://github.com/BurntSushi/ripgrep/issues/40):
Switch from using `.rgignore` to `.ignore`. Note that `.rgignore` is
still supported, but deprecated.
* [FEATURE #68](https://github.com/BurntSushi/ripgrep/issues/68):
Add --no-ignore-vcs flag that ignores .gitignore but not .ignore.
* [FEATURE #70](https://github.com/BurntSushi/ripgrep/issues/70):
Add -S/--smart-case flag (disabled by default).
* [FEATURE #80](https://github.com/BurntSushi/ripgrep/issues/80):
Add support for `{foo,bar}` globs.
Many, many bug fixes. Thanks everyone for reporting these and helping make
`ripgrep` better! (Note that I haven't captured every tracking issue here;
some were closed as duplicates.)
* [BUG #8](https://github.com/BurntSushi/ripgrep/issues/8):
Don't use an intermediate buffer when --threads=1. (Permits constant memory
usage.)
* [BUG #15](https://github.com/BurntSushi/ripgrep/issues/15):
Improves the documentation for --type-add.
* [BUG #16](https://github.com/BurntSushi/ripgrep/issues/16),
[BUG #49](https://github.com/BurntSushi/ripgrep/issues/49),
[BUG #50](https://github.com/BurntSushi/ripgrep/issues/50),
[BUG #65](https://github.com/BurntSushi/ripgrep/issues/65):
Some gitignore globs were being treated as anchored when they weren't.
* [BUG #18](https://github.com/BurntSushi/ripgrep/issues/18):
--vimgrep reported incorrect column number.
* [BUG #19](https://github.com/BurntSushi/ripgrep/issues/19):
ripgrep was hanging waiting on stdin in some Windows terminals. Note that
this introduced a new bug:
[#94](https://github.com/BurntSushi/ripgrep/issues/94).
* [BUG #21](https://github.com/BurntSushi/ripgrep/issues/21):
Removes leading `./` when printing file paths.
* [BUG #22](https://github.com/BurntSushi/ripgrep/issues/22):
Running `rg --help | echo` caused `rg` to panic.
* [BUG #24](https://github.com/BurntSushi/ripgrep/issues/24):
Clarify the central purpose of rg in its usage message.
* [BUG #25](https://github.com/BurntSushi/ripgrep/issues/25):
Anchored gitignore globs weren't applied in subdirectories correctly.
* [BUG #30](https://github.com/BurntSushi/ripgrep/issues/30):
Globs like `foo/**` should match contents of `foo`, but not `foo` itself.
* [BUG #35](https://github.com/BurntSushi/ripgrep/issues/35),
[BUG #81](https://github.com/BurntSushi/ripgrep/issues/81):
When automatically detecting stdin, only read if it's a file or a fifo.
i.e., ignore stdin in `rg foo < /dev/null`.
* [BUG #36](https://github.com/BurntSushi/ripgrep/issues/36):
Don't automatically pick memory maps on MacOS. Ever.
* [BUG #38](https://github.com/BurntSushi/ripgrep/issues/38):
Trailing whitespace in gitignore wasn't being ignored.
* [BUG #43](https://github.com/BurntSushi/ripgrep/issues/43):
--glob didn't work with directories.
* [BUG #46](https://github.com/BurntSushi/ripgrep/issues/46):
Use one fewer worker thread than what is provided on CLI.
* [BUG #47](https://github.com/BurntSushi/ripgrep/issues/47):
--help/--version now work even if other options are set.
* [BUG #55](https://github.com/BurntSushi/ripgrep/issues/55):
ripgrep was refusing to search /proc/cpuinfo. Fixed by disabling memory
maps for files with zero size.
* [BUG #64](https://github.com/BurntSushi/ripgrep/issues/64):
The first path given with --files set was ignored.
* [BUG #67](https://github.com/BurntSushi/ripgrep/issues/67):
Sometimes whitelist globs like `!/dir` weren't interpreted as anchored.
* [BUG #77](https://github.com/BurntSushi/ripgrep/issues/77):
When -q/--quiet flag was passed, ripgrep kept searching even after a match
was found.
* [BUG #90](https://github.com/BurntSushi/ripgrep/issues/90):
Permit whitelisting hidden files.
* [BUG #93](https://github.com/BurntSushi/ripgrep/issues/93):
ripgrep was extracting an erroneous inner literal from a repeated pattern.

Cargo.lock (generated, 37 changes)

@@ -1,13 +1,12 @@
[root]
name = "ripgrep"
version = "0.1.16"
version = "0.2.1"
dependencies = [
"deque 0.3.1 (registry+https://github.com/rust-lang/crates.io-index)",
"docopt 0.6.83 (registry+https://github.com/rust-lang/crates.io-index)",
"docopt 0.6.86 (registry+https://github.com/rust-lang/crates.io-index)",
"env_logger 0.3.5 (registry+https://github.com/rust-lang/crates.io-index)",
"fnv 1.0.5 (registry+https://github.com/rust-lang/crates.io-index)",
"glob 0.2.11 (registry+https://github.com/rust-lang/crates.io-index)",
"grep 0.1.2",
"globset 0.1.0",
"grep 0.1.3",
"kernel32-sys 0.2.2 (registry+https://github.com/rust-lang/crates.io-index)",
"lazy_static 0.2.1 (registry+https://github.com/rust-lang/crates.io-index)",
"libc 0.2.16 (registry+https://github.com/rust-lang/crates.io-index)",
@@ -40,7 +39,7 @@ dependencies = [
[[package]]
name = "docopt"
version = "0.6.83"
version = "0.6.86"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"lazy_static 0.2.1 (registry+https://github.com/rust-lang/crates.io-index)",
@@ -74,19 +73,26 @@ dependencies = [
]
[[package]]
name = "glob"
version = "0.2.11"
source = "registry+https://github.com/rust-lang/crates.io-index"
name = "globset"
version = "0.1.0"
dependencies = [
"aho-corasick 0.5.3 (registry+https://github.com/rust-lang/crates.io-index)",
"fnv 1.0.5 (registry+https://github.com/rust-lang/crates.io-index)",
"lazy_static 0.2.1 (registry+https://github.com/rust-lang/crates.io-index)",
"log 0.3.6 (registry+https://github.com/rust-lang/crates.io-index)",
"memchr 0.1.11 (registry+https://github.com/rust-lang/crates.io-index)",
"regex 0.1.77 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "grep"
version = "0.1.2"
version = "0.1.3"
dependencies = [
"log 0.3.6 (registry+https://github.com/rust-lang/crates.io-index)",
"memchr 0.1.11 (registry+https://github.com/rust-lang/crates.io-index)",
"memmap 0.2.3 (registry+https://github.com/rust-lang/crates.io-index)",
"regex 0.1.77 (registry+https://github.com/rust-lang/crates.io-index)",
"regex-syntax 0.3.5 (registry+https://github.com/rust-lang/crates.io-index)",
"regex-syntax 0.3.7 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
@@ -155,7 +161,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"aho-corasick 0.5.3 (registry+https://github.com/rust-lang/crates.io-index)",
"memchr 0.1.11 (registry+https://github.com/rust-lang/crates.io-index)",
"regex-syntax 0.3.5 (registry+https://github.com/rust-lang/crates.io-index)",
"regex-syntax 0.3.7 (registry+https://github.com/rust-lang/crates.io-index)",
"simd 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)",
"thread_local 0.2.7 (registry+https://github.com/rust-lang/crates.io-index)",
"utf8-ranges 0.1.3 (registry+https://github.com/rust-lang/crates.io-index)",
@@ -163,7 +169,7 @@ dependencies = [
[[package]]
name = "regex-syntax"
version = "0.3.5"
version = "0.3.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
@@ -234,11 +240,10 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
[metadata]
"checksum aho-corasick 0.5.3 (registry+https://github.com/rust-lang/crates.io-index)" = "ca972c2ea5f742bfce5687b9aef75506a764f61d37f8f649047846a9686ddb66"
"checksum deque 0.3.1 (registry+https://github.com/rust-lang/crates.io-index)" = "1614659040e711785ed8ea24219140654da1729f3ec8a47a9719d041112fe7bf"
"checksum docopt 0.6.83 (registry+https://github.com/rust-lang/crates.io-index)" = "fc42c6077823a361410c37d47c2535b73a190cbe10838dc4f400fe87c10c8c3b"
"checksum docopt 0.6.86 (registry+https://github.com/rust-lang/crates.io-index)" = "4a7ef30445607f6fc8720f0a0a2c7442284b629cf0d049286860fae23e71c4d9"
"checksum env_logger 0.3.5 (registry+https://github.com/rust-lang/crates.io-index)" = "15abd780e45b3ea4f76b4e9a26ff4843258dd8a3eed2775a0e7368c2e7936c2f"
"checksum fnv 1.0.5 (registry+https://github.com/rust-lang/crates.io-index)" = "6cc484842f1e2884faf56f529f960cc12ad8c71ce96cc7abba0a067c98fee344"
"checksum fs2 0.2.5 (registry+https://github.com/rust-lang/crates.io-index)" = "bcd414e5a1a979b931bb92f41b7a54106d3f6d2e6c253e9ce943b7cd468251ef"
"checksum glob 0.2.11 (registry+https://github.com/rust-lang/crates.io-index)" = "8be18de09a56b60ed0edf84bc9df007e30040691af7acd1c41874faac5895bfb"
"checksum kernel32-sys 0.2.2 (registry+https://github.com/rust-lang/crates.io-index)" = "7507624b29483431c0ba2d82aece8ca6cdba9382bff4ddd0f7490560c056098d"
"checksum lazy_static 0.2.1 (registry+https://github.com/rust-lang/crates.io-index)" = "49247ec2a285bb3dcb23cbd9c35193c025e7251bfce77c1d5da97e6362dffe7f"
"checksum libc 0.2.16 (registry+https://github.com/rust-lang/crates.io-index)" = "408014cace30ee0f767b1c4517980646a573ec61a57957aeeabcac8ac0a02e8d"
@@ -248,7 +253,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
"checksum num_cpus 1.1.0 (registry+https://github.com/rust-lang/crates.io-index)" = "8890e6084723d57d0df8d2720b0d60c6ee67d6c93e7169630e4371e88765dcad"
"checksum rand 0.3.14 (registry+https://github.com/rust-lang/crates.io-index)" = "2791d88c6defac799c3f20d74f094ca33b9332612d9aef9078519c82e4fe04a5"
"checksum regex 0.1.77 (registry+https://github.com/rust-lang/crates.io-index)" = "64b03446c466d35b42f2a8b203c8e03ed8b91c0f17b56e1f84f7210a257aa665"
"checksum regex-syntax 0.3.5 (registry+https://github.com/rust-lang/crates.io-index)" = "279401017ae31cf4e15344aa3f085d0e2e5c1e70067289ef906906fdbe92c8fd"
"checksum regex-syntax 0.3.7 (registry+https://github.com/rust-lang/crates.io-index)" = "48f0573bcee95a48da786f8823465b5f2a1fae288a55407aca991e5b3e0eae11"
"checksum rustc-serialize 0.3.19 (registry+https://github.com/rust-lang/crates.io-index)" = "6159e4e6e559c81bd706afe9c8fd68f547d3e851ce12e76b1de7914bab61691b"
"checksum simd 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)" = "63b5847c2d766ca7ce7227672850955802fabd779ba616aeabead4c2c3877023"
"checksum strsim 0.5.1 (registry+https://github.com/rust-lang/crates.io-index)" = "50c069df92e4b01425a8bf3576d5d417943a6a7272fbabaf5bd80b1aaa76442e"

Cargo.toml

@@ -1,6 +1,6 @@
[package]
name = "ripgrep"
version = "0.1.16" #:version
version = "0.2.2" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
Line oriented search tool using Rust's regex library. Combines the raw
@@ -12,6 +12,7 @@ repository = "https://github.com/BurntSushi/ripgrep"
readme = "README.md"
keywords = ["regex", "grep", "egrep", "search", "pattern"]
license = "Unlicense/MIT"
exclude = ["HomebrewFormula"]
[[bin]]
bench = false
@@ -26,8 +27,8 @@ path = "tests/tests.rs"
deque = "0.3"
docopt = "0.6"
env_logger = "0.3"
fnv = "1.0"
grep = { version = "0.1.2", path = "grep" }
globset = { version = "0.1.0", path = "globset" }
grep = { version = "0.1.3", path = "grep" }
lazy_static = "0.2"
libc = "0.2"
log = "0.3"
@@ -46,9 +47,5 @@ winapi = "0.2"
[features]
simd-accel = ["regex/simd-accel"]
[dev-dependencies]
glob = "0.2"
[profile.release]
debug = true
panic = "abort"

HomebrewFormula (symbolic link, 1 change)

@@ -0,0 +1 @@
pkg/brew


@@ -1,263 +0,0 @@
ripgrep (rg)
------------
`ripgrep` is a command line search tool that combines the usability of The
Silver Searcher (an `ack` clone) with the raw speed of GNU grep. `ripgrep` has
first class support on Windows, Mac and Linux, with binary downloads available
for [every release](https://github.com/BurntSushi/ripgrep/releases).
[![Linux build status](https://api.travis-ci.org/BurntSushi/ripgrep.png)](https://travis-ci.org/BurntSushi/ripgrep)
[![Windows build status](https://ci.appveyor.com/api/projects/status/github/BurntSushi/ripgrep?svg=true)](https://ci.appveyor.com/project/BurntSushi/ripgrep)
[![](https://img.shields.io/crates/v/ripgrep.svg)](https://crates.io/crates/ripgrep)
Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org).
### Screenshot of search results
[![A screenshot of a sample search with ripgrep](http://burntsushi.net/stuff/ripgrep1.png)](http://burntsushi.net/stuff/ripgrep1.png)
### Quick example comparing tools
This example searches the entire Linux kernel source tree (after running
`make defconfig && make -j8`) for `[A-Z]+_SUSPEND`, where all matches must be
words. Timings were collected on a system with an Intel i7-6900K 3.2 GHz.
Please remember that a single benchmark is never enough! See my
[blog post on `ripgrep`](http://blog.burntsushi.net/ripgrep/)
for a very detailed comparison with more benchmarks and analysis.
| Tool | Command | Line count | Time |
| ---- | ------- | ---------- | ---- |
| ripgrep | `rg -n -w '[A-Z]+_SUSPEND'` | 450 | **0.245s** |
| [The Silver Searcher](https://github.com/ggreer/the_silver_searcher) | `ag -w '[A-Z]+_SUSPEND'` | 450 | 0.753s |
| [git grep](https://www.kernel.org/pub/software/scm/git/docs/git-grep.html) | `LC_ALL=C git grep -E -n -w '[A-Z]+_SUSPEND'` | 450 | 0.823s |
| [git grep](https://www.kernel.org/pub/software/scm/git/docs/git-grep.html) | `LC_ALL=en_US.UTF-8 git grep -E -n -w '[A-Z]+_SUSPEND'` | 450 | 2.880s |
| [sift](https://github.com/svent/sift) | `sift --git -n -w '[A-Z]+_SUSPEND'` | 450 | 3.656s |
| [The Platinum Searcher](https://github.com/monochromegane/the_platinum_searcher) | `pt -w -e '[A-Z]+_SUSPEND'` | 450 | 12.369s |
| [ack](http://beyondgrep.com/) | `ack -w '[A-Z]+_SUSPEND'` | 1878 | 16.952s |
(Yes, `ack` [has](https://github.com/petdance/ack2/issues/445) a
[bug](https://github.com/petdance/ack2/issues/14).)
### Why should I use `ripgrep`?
* It can replace both The Silver Searcher and GNU grep because it is faster
than both. (N.B. It is not, strictly speaking, a "drop-in" replacement for
both, but the feature sets are far more similar than different.)
* Like The Silver Searcher, `ripgrep` defaults to recursive directory search
and won't search files ignored by your `.gitignore` files. It also ignores
hidden and binary files by default. `ripgrep` also implements full support
for `.gitignore`, whereas there are many bugs related to that functionality
in The Silver Searcher.
* `ripgrep` can search specific types of files. For example, `rg -tpy foo`
limits your search to Python files and `rg -Tjs foo` excludes Javascript
files from your search. `ripgrep` can be taught about new file types with
custom matching rules.
* `ripgrep` supports many features found in `grep`, such as showing the context
of search results, searching multiple patterns, highlighting matches with
color and full Unicode support. Unlike GNU grep, `ripgrep` stays fast while
supporting Unicode (which is always on).
In other words, use `ripgrep` if you like speed, sane defaults, fewer bugs and
Unicode.
### Is it really faster than everything else?
Yes. A large number of benchmarks with detailed analysis for each is
[available on my blog](http://blog.burntsushi.net/ripgrep/).
Summarizing, `ripgrep` is fast because:
* It is built on top of
[Rust's regex engine](https://github.com/rust-lang-nursery/regex).
Rust's regex engine uses finite automata, SIMD and aggressive literal
optimizations to make searching very fast.
* Rust's regex library maintains performance with full Unicode support by
building UTF-8 decoding directly into its deterministic finite automaton
engine.
* It supports searching with either memory maps or by searching incrementally
with an intermediate buffer. The former is better for single files and the
latter is better for large directories. `ripgrep` chooses the best searching
strategy for you automatically.
* Applies your ignore patterns in `.gitignore` files using a
[`RegexSet`](https://doc.rust-lang.org/regex/regex/struct.RegexSet.html).
That means a single file path can be matched against multiple glob patterns
simultaneously.
* Uses a Chase-Lev work-stealing queue for quickly distributing work to
multiple threads.
### Installation
The binary name for `ripgrep` is `rg`.
[Binaries for `ripgrep` are available for Windows, Mac and
Linux.](https://github.com/BurntSushi/ripgrep/releases) Linux binaries are
static executables. Windows binaries are available either as built with MinGW
(GNU) or with Microsoft Visual C++ (MSVC). When possible, prefer MSVC over GNU,
but you'll need to have the
[Microsoft Visual C++ Build
Tools](http://landinghub.visualstudio.com/visual-cpp-build-tools)
installed.
If you're a **Homebrew** user, then you can install it with a custom formula
(N.B. `ripgrep` isn't actually in Homebrew yet. This just installs the binary
directly):
```
$ brew install https://raw.githubusercontent.com/BurntSushi/ripgrep/master/pkg/brew/ripgrep.rb
```
If you're an **Archlinux** user, then you can install `ripgrep` from the
[`ripgrep` AUR package](https://aur.archlinux.org/packages/ripgrep/), e.g.,
```
$ yaourt -S ripgrep
```
If you're a **Rust programmer**, `ripgrep` can be installed with `cargo`:
```
$ cargo install ripgrep
```
`ripgrep` isn't currently in any other package repositories.
[I'd like to change that](https://github.com/BurntSushi/ripgrep/issues/10).
### Whirlwind tour
The command line usage of `ripgrep` doesn't differ much from other tools that
perform a similar function, so you probably already know how to use `ripgrep`.
The full details can be found in `rg --help`, but let's go on a whirlwind tour.
`ripgrep` detects when it's printing to a terminal, and will automatically
colorize your output and show line numbers, just like The Silver Searcher.
Coloring works on Windows too! Colors can be controlled more granularly with
the `--color` flag.
One last thing before we get started: `ripgrep` assumes UTF-8 *everywhere*. It
can still search files that are invalid UTF-8 (like, say, latin-1), but it will
simply not work on UTF-16 encoded files or other more exotic encodings.
[Support for other encodings may
happen.](https://github.com/BurntSushi/ripgrep/issues/1)
To recursively search the current directory, while respecting all `.gitignore`
files, ignore hidden files and directories and skip binary files:
```
$ rg foobar
```
The above command also respects all `.rgignore` files, including in parent
directories. `.rgignore` files can be used when `.gitignore` files are
insufficient. In all cases, `.rgignore` patterns take precedence over
`.gitignore`.
To ignore all ignore files, use `-u`. To additionally search hidden files
and directories, use `-uu`. To additionally search binary files, use `-uuu`.
(In other words, "search everything, dammit!") In particular, `rg -uuu` is
similar to `grep -a -r`.
```
$ rg -uu foobar # similar to `grep -r`
$ rg -uuu foobar # similar to `grep -a -r`
```
(Tip: If your ignore files aren't being adhered to like you expect, run your
search with the `--debug` flag.)
Make the search case insensitive with `-i`, invert the search with `-v` or
show the 2 lines before and after every search result with `-C2`.
Force all matches to be surrounded by word boundaries with `-w`.
Search and replace (find first and last names and swap them):
```
$ rg '([A-Z][a-z]+)\s+([A-Z][a-z]+)' --replace '$2, $1'
```
Named groups are supported:
```
$ rg '(?P<first>[A-Z][a-z]+)\s+(?P<last>[A-Z][a-z]+)' --replace '$last, $first'
```
Up the ante with full Unicode support, by matching any uppercase Unicode letter
followed by any sequence of lowercase Unicode letters (good luck doing this
with other search tools!):
```
$ rg '(\p{Lu}\p{Ll}+)\s+(\p{Lu}\p{Ll}+)' --replace '$2, $1'
```
Search only files matching a particular glob:
```
$ rg foo -g 'README.*'
```
<!--*-->
Or exclude files matching a particular glob:
```
$ rg foo -g '!*.min.js'
```
Search only HTML and CSS files:
```
$ rg -thtml -tcss foobar
```
Search everything except for Javascript files:
```
$ rg -Tjs foobar
```
To see a list of types supported, run `rg --type-list`. To add a new type, use
`--type-add`:
```
$ rg --type-add 'foo:*.foo,*.foobar'
```
The type `foo` will now match any file ending with the `.foo` or `.foobar`
extensions.
### Regex syntax
The syntax supported is
[documented as part of Rust's regex library](https://doc.rust-lang.org/regex/regex/index.html#syntax).
### Building
`ripgrep` is written in Rust, so you'll need to grab a
[Rust installation](https://www.rust-lang.org/) in order to compile it.
`ripgrep` compiles with Rust 1.9 (stable) or newer. Building is easy:
```
$ git clone git://github.com/BurntSushi/ripgrep
$ cd ripgrep
$ cargo build --release
$ ./target/release/rg --version
0.1.3
```
If you have a Rust nightly compiler, then you can enable optional SIMD
acceleration like so:
```
RUSTFLAGS="-C target-cpu=native" cargo build --release --features simd-accel
```
### Running tests
`ripgrep` is relatively well tested, including both unit tests and integration
tests. To run the full test suite, use:
```
$ cargo test
```
from the repository root.

README.md (271 changes)

@@ -1,6 +1,269 @@
**UNDER DEVELOPMENT.**
ripgrep (rg)
------------
ripgrep combines the usability of the silver searcher with the raw speed of
grep.
`ripgrep` is a command line search tool that combines the usability of The
Silver Searcher (an `ack` clone) with the raw speed of GNU grep. `ripgrep` has
first class support on Windows, Mac and Linux, with binary downloads available
for [every release](https://github.com/BurntSushi/ripgrep/releases).
[![Linux build status](https://api.travis-ci.org/BurntSushi/ripgrep.png)](https://travis-ci.org/BurntSushi/ripgrep)
[![Windows build status](https://ci.appveyor.com/api/projects/status/github/BurntSushi/ripgrep?svg=true)](https://ci.appveyor.com/project/BurntSushi/ripgrep)
[![](https://img.shields.io/crates/v/ripgrep.svg)](https://crates.io/crates/ripgrep)
Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org).
### Screenshot of search results
[![A screenshot of a sample search with ripgrep](http://burntsushi.net/stuff/ripgrep1.png)](http://burntsushi.net/stuff/ripgrep1.png)
### Quick example comparing tools
This example searches the entire Linux kernel source tree (after running
`make defconfig && make -j8`) for `[A-Z]+_SUSPEND`, where all matches must be
words. Timings were collected on a system with an Intel i7-6900K 3.2 GHz.
Please remember that a single benchmark is never enough! See my
[blog post on `ripgrep`](http://blog.burntsushi.net/ripgrep/)
for a very detailed comparison with more benchmarks and analysis.
| Tool | Command | Line count | Time |
| ---- | ------- | ---------- | ---- |
| ripgrep | `rg -n -w '[A-Z]+_SUSPEND'` | 450 | **0.245s** |
| [The Silver Searcher](https://github.com/ggreer/the_silver_searcher) | `ag -w '[A-Z]+_SUSPEND'` | 450 | 0.753s |
| [git grep](https://www.kernel.org/pub/software/scm/git/docs/git-grep.html) | `LC_ALL=C git grep -E -n -w '[A-Z]+_SUSPEND'` | 450 | 0.823s |
| [git grep (Unicode)](https://www.kernel.org/pub/software/scm/git/docs/git-grep.html) | `LC_ALL=en_US.UTF-8 git grep -E -n -w '[A-Z]+_SUSPEND'` | 450 | 2.880s |
| [sift](https://github.com/svent/sift) | `sift --git -n -w '[A-Z]+_SUSPEND'` | 450 | 3.656s |
| [The Platinum Searcher](https://github.com/monochromegane/the_platinum_searcher) | `pt -w -e '[A-Z]+_SUSPEND'` | 450 | 12.369s |
| [ack](http://beyondgrep.com/) | `ack -w '[A-Z]+_SUSPEND'` | 1878 | 16.952s |
(Yes, `ack` [has](https://github.com/petdance/ack2/issues/445) a
[bug](https://github.com/petdance/ack2/issues/14).)
### Why should I use `ripgrep`?
* It can replace both The Silver Searcher and GNU grep because it is faster
than both. (N.B. It is not, strictly speaking, a "drop-in" replacement for
both, but the feature sets are far more similar than different.)
* Like The Silver Searcher, `ripgrep` defaults to recursive directory search
and won't search files ignored by your `.gitignore` files. It also ignores
hidden and binary files by default. `ripgrep` also implements full support
for `.gitignore`, whereas there are many bugs related to that functionality
in The Silver Searcher.
* `ripgrep` can search specific types of files. For example, `rg -tpy foo`
limits your search to Python files and `rg -Tjs foo` excludes Javascript
files from your search. `ripgrep` can be taught about new file types with
custom matching rules.
* `ripgrep` supports many features found in `grep`, such as showing the context
of search results, searching multiple patterns, highlighting matches with
color and full Unicode support. Unlike GNU grep, `ripgrep` stays fast while
supporting Unicode (which is always on).
In other words, use `ripgrep` if you like speed, sane defaults, fewer bugs and
Unicode.
### Is it really faster than everything else?
Yes. A large number of benchmarks with detailed analysis for each is
[available on my blog](http://blog.burntsushi.net/ripgrep/).
Summarizing, `ripgrep` is fast because:
* It is built on top of
[Rust's regex engine](https://github.com/rust-lang-nursery/regex).
Rust's regex engine uses finite automata, SIMD and aggressive literal
optimizations to make searching very fast.
* Rust's regex library maintains performance with full Unicode support by
building UTF-8 decoding directly into its deterministic finite automaton
engine.
* It supports searching with either memory maps or by searching incrementally
with an intermediate buffer. The former is better for single files and the
latter is better for large directories. `ripgrep` chooses the best searching
strategy for you automatically.
* Applies your ignore patterns in `.gitignore` files using a
[`RegexSet`](https://doc.rust-lang.org/regex/regex/struct.RegexSet.html).
That means a single file path can be matched against multiple glob patterns
simultaneously.
* Uses a Chase-Lev work-stealing queue for quickly distributing work to
multiple threads.
### Installation
The binary name for `ripgrep` is `rg`.
[Binaries for `ripgrep` are available for Windows, Mac and
Linux.](https://github.com/BurntSushi/ripgrep/releases) Linux binaries are
static executables. Windows binaries are available either as built with MinGW
(GNU) or with Microsoft Visual C++ (MSVC). When possible, prefer MSVC over GNU,
but you'll need to have the
[Microsoft VC++ 2015 redistributable](https://www.microsoft.com/en-us/download/details.aspx?id=48145)
installed.
If you're a **Mac OS X Homebrew** user, then you can install ripgrep either
from homebrew-core, (compiled with rust stable, no SIMD):
```
$ brew install ripgrep
```
or you can install a binary compiled with rust nightly (including SIMD and all
optimizations) by utilizing a custom tap:
```
$ brew tap burntsushi/ripgrep https://github.com/BurntSushi/ripgrep.git
$ brew install burntsushi/ripgrep/ripgrep-bin
```
If you're an **Arch Linux** user, then you can install `ripgrep` from the official repos:
```
$ pacman -S ripgrep
```
If you're a **Rust programmer**, `ripgrep` can be installed with `cargo`:
```
$ cargo install ripgrep
```
`ripgrep` isn't currently in any other package repositories.
[I'd like to change that](https://github.com/BurntSushi/ripgrep/issues/10).
### Whirlwind tour
The command line usage of `ripgrep` doesn't differ much from other tools that
perform a similar function, so you probably already know how to use `ripgrep`.
The full details can be found in `rg --help`, but let's go on a whirlwind tour.
`ripgrep` detects when it's printing to a terminal, and will automatically
colorize your output and show line numbers, just like The Silver Searcher.
Coloring works on Windows too! Colors can be controlled more granularly with
the `--color` flag.
One last thing before we get started: `ripgrep` assumes UTF-8 *everywhere*. It
can still search files that are invalid UTF-8 (like, say, latin-1), but it will
simply not work on UTF-16 encoded files or other more exotic encodings.
[Support for other encodings may
happen.](https://github.com/BurntSushi/ripgrep/issues/1)
To recursively search the current directory, while respecting all `.gitignore`
files, ignore hidden files and directories and skip binary files:
```
$ rg foobar
```
The above command also respects all `.ignore` files, including in parent
directories. `.ignore` files can be used when `.gitignore` files are
insufficient. In all cases, `.ignore` patterns take precedence over
`.gitignore`.
To ignore all ignore files, use `-u`. To additionally search hidden files
and directories, use `-uu`. To additionally search binary files, use `-uuu`.
(In other words, "search everything, dammit!") In particular, `rg -uuu` is
similar to `grep -a -r`.
```
$ rg -uu foobar # similar to `grep -r`
$ rg -uuu foobar # similar to `grep -a -r`
```
(Tip: If your ignore files aren't being adhered to like you expect, run your
search with the `--debug` flag.)
Make the search case insensitive with `-i`, invert the search with `-v` or
show the 2 lines before and after every search result with `-C2`.
Force all matches to be surrounded by word boundaries with `-w`.
Search and replace (find first and last names and swap them):
```
$ rg '([A-Z][a-z]+)\s+([A-Z][a-z]+)' --replace '$2, $1'
```
Named groups are supported:
```
$ rg '(?P<first>[A-Z][a-z]+)\s+(?P<last>[A-Z][a-z]+)' --replace '$last, $first'
```
Up the ante with full Unicode support, by matching any uppercase Unicode letter
followed by any sequence of lowercase Unicode letters (good luck doing this
with other search tools!):
```
$ rg '(\p{Lu}\p{Ll}+)\s+(\p{Lu}\p{Ll}+)' --replace '$2, $1'
```
Search only files matching a particular glob:
```
$ rg foo -g 'README.*'
```
<!--*-->
Or exclude files matching a particular glob:
```
$ rg foo -g '!*.min.js'
```
Search only HTML and CSS files:
```
$ rg -thtml -tcss foobar
```
Search everything except for JavaScript files:
```
$ rg -Tjs foobar
```
To see a list of types supported, run `rg --type-list`. To add a new type, use
`--type-add`, which must be accompanied by a pattern for searching (`rg` won't
persist your type settings):
```
$ rg --type-add 'foo:*.{foo,foobar}' -tfoo bar
```
The type `foo` will now match any file ending with the `.foo` or `.foobar`
extensions.
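Built-in definitions can also be replaced rather than extended: in the revised
argument handling shown later in this diff, `--type-clear` is applied before
`--type-add`, so a (hypothetical) invocation like the following redefines the
`md` type from scratch for that run:
```
$ rg --type-clear md --type-add 'md:*.md' -tmd foobar
```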
### Regex syntax
The syntax supported is
[documented as part of Rust's regex library](https://doc.rust-lang.org/regex/regex/index.html#syntax).
### Building
`ripgrep` is written in Rust, so you'll need to grab a
[Rust installation](https://www.rust-lang.org/) in order to compile it.
`ripgrep` compiles with Rust 1.9 (stable) or newer. Building is easy:
```
$ git clone https://github.com/BurntSushi/ripgrep
$ cd ripgrep
$ cargo build --release
$ ./target/release/rg --version
0.1.3
```
If you have a Rust nightly compiler, then you can enable optional SIMD
acceleration like so:
```
RUSTFLAGS="-C target-cpu=native" cargo build --release --features simd-accel
```
### Running tests
`ripgrep` is relatively well tested, including both unit tests and integration
tests. To run the full test suite, use:
```
$ cargo test
```
from the repository root.
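The `grep` and `globset` subcrates carry their own tests as well; mirroring the
CI configuration changes below, they can be run by pointing `cargo` at each
crate's manifest:
```
$ cargo test --manifest-path grep/Cargo.toml
$ cargo test --manifest-path globset/Cargo.toml
```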

View File

@@ -28,6 +28,8 @@ build: false
# TODO modify this phase as you see fit
test_script:
- cargo test --verbose
- cargo test --verbose --manifest-path grep/Cargo.toml
- cargo test --verbose --manifest-path globset/Cargo.toml
before_deploy:
# Generate artifacts for release
@@ -41,7 +43,7 @@ before_deploy:
- appveyor PushArtifact ../%PROJECT_NAME%-%APPVEYOR_REPO_TAG_NAME%-%TARGET%.zip
deploy:
description: 'Windows release'
description: 'Automatically deployed release'
# All the zipped artifacts will be deployed
artifact: /.*\.zip/
auth_token:

View File

@@ -1,5 +0,0 @@
These are internal microbenchmarks for tracking the performance of individual
components inside of ripgrep. At the moment, they aren't heavily used.
For performance benchmarks of ripgrep proper, see the sibling `benchsuite`
directory.

View File

@@ -16,8 +16,13 @@ disable_cross_doctests() {
}
run_test_suite() {
cargo clean --target $TARGET --verbose
cargo build --target $TARGET --verbose
cargo test --target $TARGET --verbose
cargo build --target $TARGET --verbose --manifest-path grep/Cargo.toml
cargo test --target $TARGET --verbose --manifest-path grep/Cargo.toml
cargo build --target $TARGET --verbose --manifest-path globset/Cargo.toml
cargo test --target $TARGET --verbose --manifest-path globset/Cargo.toml
# sanity check the file type
file target/$TARGET/debug/rg

View File

@@ -16,9 +16,9 @@ rg [\f[I]options\f[]] \-\-files [\f[I]<\f[]path\f[I]> ...\f[]]
.PP
rg [\f[I]options\f[]] \-\-type\-list
.PP
rg \-\-help
rg [\f[I]options\f[]] \-\-help
.PP
rg \-\-version
rg [\f[I]options\f[]] \-\-version
.SH DESCRIPTION
.PP
rg (ripgrep) combines the usability of The Silver Searcher (an ack
@@ -70,6 +70,7 @@ Show this usage message.
.TP
.B \-i, \-\-ignore\-case
Case insensitive search.
Overridden by \-\-case\-sensitive.
.RS
.RE
.TP
@@ -86,12 +87,7 @@ Suppress line numbers.
.TP
.B \-q, \-\-quiet
Do not print anything to stdout.
.RS
.RE
.TP
.B \-r, \-\-replace \f[I]ARG\f[]
Replace every match with the string given.
Capture group indices (e.g., $5) and names (e.g., $foo) are supported.
If a match is found in a file, stop searching that file.
.RS
.RE
.TP
@@ -169,12 +165,23 @@ Print each file that would be searched (but don\[aq]t search).
.RS
.RE
.TP
.B \-l, \-\-files\-with\-matches
Only show path of each file with matches.
.RS
.RE
.TP
.B \-H, \-\-with\-filename
Prefix each match with the file name that contains it.
This is the default when more than one file is searched.
.RS
.RE
.TP
.B \-\-no\-filename
Never show the filename for a match.
This is the default when one file is searched.
.RS
.RE
.TP
.B \-\-heading
Show the file name above clusters of matches from each file.
This is the default mode at a tty.
@@ -197,6 +204,12 @@ Follow symlinks.
.RS
.RE
.TP
.B \-\-maxdepth \f[I]NUM\f[]
Descend at most NUM directories below the command line arguments.
A value of zero searches only the starting\-points themselves.
.RS
.RE
.TP
.B \-\-mmap
Search using memory maps when possible.
This is enabled by default when ripgrep thinks it will be faster.
@@ -211,8 +224,8 @@ Never use memory maps, even when they might be faster.
.RE
.TP
.B \-\-no\-ignore
Don\[aq]t respect ignore files (.gitignore, .rgignore, etc.) This
implies \-\-no\-ignore\-parent.
Don\[aq]t respect ignore files (.gitignore, .ignore, etc.) This implies
\-\-no\-ignore\-parent.
.RS
.RE
.TP
@@ -221,11 +234,47 @@ Don\[aq]t respect ignore files in parent directories.
.RS
.RE
.TP
.B \-\-no\-ignore\-vcs
Don\[aq]t respect version control ignore files (e.g., .gitignore).
Note that .ignore files will continue to be respected.
.RS
.RE
.TP
.B \-\-null
Whenever a file name is printed, follow it with a NUL byte.
This includes printing filenames before matches, and when printing a
list of matching files such as with \-\-count, \-\-files\-with\-matches
and \-\-files.
.RS
.RE
.TP
.B \-p, \-\-pretty
Alias for \-\-color=always \-\-heading \-n.
.RS
.RE
.TP
.B \-r, \-\-replace \f[I]ARG\f[]
Replace every match with the string given when printing search results.
Neither this flag nor any other flag will modify your files.
.RS
.PP
Capture group indices (e.g., $5) and names (e.g., $foo) are supported in
the replacement string.
.RE
.TP
.B \-s, \-\-case\-sensitive
Search case sensitively.
This overrides \-\-ignore\-case and \-\-smart\-case.
.RS
.RE
.TP
.B \-S, \-\-smart\-case
Search case insensitively if the pattern is all lowercase.
Search case sensitively otherwise.
This is overridden by either \-\-case\-sensitive or \-\-ignore\-case.
.RS
.RE
.TP
.B \-j, \-\-threads \f[I]ARG\f[]
The number of threads to use.
Defaults to the number of logical CPUs (capped at 6).
@@ -254,11 +303,22 @@ Show all supported file types and their associated globs.
.TP
.B \-\-type\-add \f[I]ARG\f[] ...
Add a new glob for a particular file type.
Example: \-\-type\-add html:\f[I]\&.html,\f[].htm
Only one glob can be added at a time.
Multiple \-\-type\-add flags can be provided.
Unless \-\-type\-clear is used, globs are added to any existing globs
inside of ripgrep.
Note that this must be passed to every invocation of rg.
Type settings are NOT persisted.
.RS
.PP
Example:
\f[C]rg\ \-\-type\-add\ \[aq]foo:*.foo\[aq]\ \-tfoo\ PATTERN\f[]
.RE
.TP
.B \-\-type\-clear \f[I]TYPE\f[] ...
Clear the file type globs for TYPE.
Clear the file type globs previously defined for TYPE.
This only clears the default type definitions that are found inside of
ripgrep.
Note that this must be passed to every invocation of rg.
.RS
.RE

View File

@@ -12,9 +12,9 @@ rg [*options*] --files [*<*path*> ...*]
rg [*options*] --type-list
rg --help
rg [*options*] --help
rg --version
rg [*options*] --version
# DESCRIPTION
@@ -49,7 +49,7 @@ the raw speed of grep.
: Show this usage message.
-i, --ignore-case
: Case insensitive search.
: Case insensitive search. Overridden by --case-sensitive.
-n, --line-number
: Show line numbers (1-based). This is enabled by default at a tty.
@@ -58,11 +58,8 @@ the raw speed of grep.
: Suppress line numbers.
-q, --quiet
: Do not print anything to stdout.
-r, --replace *ARG*
: Replace every match with the string given. Capture group indices (e.g., $5)
and names (e.g., $foo) are supported.
: Do not print anything to stdout. If a match is found in a file, stop
searching that file.
-t, --type *TYPE* ...
: Only search files matching TYPE. Multiple type flags may be provided. Use the
@@ -110,10 +107,17 @@ the raw speed of grep.
--files
: Print each file that would be searched (but don't search).
-l, --files-with-matches
: Only show path of each file with matches.
-H, --with-filename
: Prefix each match with the file name that contains it. This is the
default when more than one file is searched.
--no-filename
: Never show the filename for a match. This is the default when
one file is searched.
--heading
: Show the file name above clusters of matches from each file.
This is the default mode at a tty.
@@ -128,6 +132,10 @@ the raw speed of grep.
-L, --follow
: Follow symlinks.
--maxdepth *NUM*
: Descend at most NUM directories below the command line arguments.
A value of zero searches only the starting-points themselves.
--mmap
: Search using memory maps when possible. This is enabled by default
when ripgrep thinks it will be faster. (Note that mmap searching
@@ -137,15 +145,40 @@ the raw speed of grep.
: Never use memory maps, even when they might be faster.
--no-ignore
: Don't respect ignore files (.gitignore, .rgignore, etc.)
: Don't respect ignore files (.gitignore, .ignore, etc.)
This implies --no-ignore-parent.
--no-ignore-parent
: Don't respect ignore files in parent directories.
--no-ignore-vcs
: Don't respect version control ignore files (e.g., .gitignore).
Note that .ignore files will continue to be respected.
--null
: Whenever a file name is printed, follow it with a NUL byte.
This includes printing filenames before matches, and when printing
a list of matching files such as with --count, --files-with-matches
and --files.
-p, --pretty
: Alias for --color=always --heading -n.
-r, --replace *ARG*
: Replace every match with the string given when printing search results.
Neither this flag nor any other flag will modify your files.
Capture group indices (e.g., $5) and names (e.g., $foo) are supported
in the replacement string.
-s, --case-sensitive
: Search case sensitively. This overrides --ignore-case and --smart-case.
-S, --smart-case
: Search case insensitively if the pattern is all lowercase.
Search case sensitively otherwise. This is overridden by either
--case-sensitive or --ignore-case.
-j, --threads *ARG*
: The number of threads to use. Defaults to the number of logical CPUs
(capped at 6). [default: 0]
@@ -164,8 +197,15 @@ the raw speed of grep.
: Show all supported file types and their associated globs.
--type-add *ARG* ...
: Add a new glob for a particular file type.
Example: --type-add html:*.html,*.htm
: Add a new glob for a particular file type. Only one glob can be added
at a time. Multiple --type-add flags can be provided. Unless --type-clear
is used, globs are added to any existing globs inside of ripgrep. Note that
this must be passed to every invocation of rg. Type settings are NOT
persisted.
Example: `rg --type-add 'foo:*.foo' -tfoo PATTERN`
--type-clear *TYPE* ...
: Clear the file type globs for TYPE.
: Clear the file type globs previously defined for TYPE. This only clears
the default type definitions that are found inside of ripgrep. Note
that this must be passed to every invocation of rg.

30
globset/Cargo.toml Normal file
View File

@@ -0,0 +1,30 @@
[package]
name = "globset"
version = "0.1.0" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
Cross platform single glob and glob set matching. Glob set matching is the
process of matching one or more glob patterns against a single candidate path
simultaneously, and returning all of the globs that matched.
"""
documentation = "https://docs.rs/globset"
homepage = "https://github.com/BurntSushi/ripgrep/tree/master/globset"
repository = "https://github.com/BurntSushi/ripgrep/tree/master/globset"
readme = "README.md"
keywords = ["regex", "glob", "multiple", "set", "pattern"]
license = "Unlicense/MIT"
[lib]
name = "globset"
bench = false
[dependencies]
aho-corasick = "0.5.3"
fnv = "1.0"
lazy_static = "0.2"
log = "0.3"
memchr = "0.1"
regex = "0.1.77"
[dev-dependencies]
glob = "0.2"

122
globset/README.md Normal file
View File

@@ -0,0 +1,122 @@
globset
=======
Cross platform single glob and glob set matching. Glob set matching is the
process of matching one or more glob patterns against a single candidate path
simultaneously, and returning all of the globs that matched.
[![Linux build status](https://api.travis-ci.org/BurntSushi/ripgrep.png)](https://travis-ci.org/BurntSushi/ripgrep)
[![Windows build status](https://ci.appveyor.com/api/projects/status/github/BurntSushi/ripgrep?svg=true)](https://ci.appveyor.com/project/BurntSushi/ripgrep)
[![](https://img.shields.io/crates/v/globset.svg)](https://crates.io/crates/globset)
Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org).
### Documentation
[https://docs.rs/globset](https://docs.rs/globset)
### Usage
Add this to your `Cargo.toml`:
```toml
[dependencies]
globset = "0.1"
```
and this to your crate root:
```rust
extern crate globset;
```
### Example: one glob
This example shows how to match a single glob against a single file path.
```rust
use globset::Glob;
let glob = try!(Glob::new("*.rs")).compile_matcher();
assert!(glob.is_match("foo.rs"));
assert!(glob.is_match("foo/bar.rs"));
assert!(!glob.is_match("Cargo.toml"));
```
### Example: configuring a glob matcher
This example shows how to use a `GlobBuilder` to configure aspects of match
semantics. In this example, we prevent wildcards from matching path separators.
```rust
use globset::GlobBuilder;
let glob = try!(GlobBuilder::new("*.rs")
.literal_separator(true).build()).compile_matcher();
assert!(glob.is_match("foo.rs"));
assert!(!glob.is_match("foo/bar.rs")); // no longer matches
assert!(!glob.is_match("Cargo.toml"));
```
### Example: match multiple globs at once
This example shows how to match multiple glob patterns at once.
```rust
use globset::{Glob, GlobSetBuilder};
let mut builder = GlobSetBuilder::new();
// A GlobBuilder can be used to configure each glob's match semantics
// independently.
builder.add(try!(Glob::new("*.rs")));
builder.add(try!(Glob::new("src/lib.rs")));
builder.add(try!(Glob::new("src/**/foo.rs")));
let set = try!(builder.build());
assert_eq!(set.matches("src/bar/baz/foo.rs"), vec![0, 2]);
```
### Performance
This crate implements globs by converting them to regular expressions, and
executing them with the
[`regex`](https://github.com/rust-lang-nursery/regex)
crate.
For single glob matching, performance of this crate should be roughly on par
with the performance of the
[`glob`](https://github.com/rust-lang-nursery/glob)
crate. (Benchmarks with a `*_regex` suffix are for this library, while those
with a `*_glob` suffix are for the `glob` library.)
Optimizations in the `regex` crate may propel this library past `glob`,
particularly when matching longer paths.
```
test ext_glob ... bench: 425 ns/iter (+/- 21)
test ext_regex ... bench: 175 ns/iter (+/- 10)
test long_glob ... bench: 182 ns/iter (+/- 11)
test long_regex ... bench: 173 ns/iter (+/- 10)
test short_glob ... bench: 69 ns/iter (+/- 4)
test short_regex ... bench: 83 ns/iter (+/- 2)
```
The primary performance advantage of this crate is when matching multiple
globs against a single path. With the `glob` crate, one must match each glob
synchronously, one after the other. In this crate, many can be matched
simultaneously. For example:
```
test many_short_glob ... bench: 1,063 ns/iter (+/- 47)
test many_short_regex_set ... bench: 186 ns/iter (+/- 11)
```
### Comparison with the [`glob`](https://github.com/rust-lang-nursery/glob) crate
* Supports alternate "or" globs, e.g., `*.{foo,bar}`.
* Can match non-UTF-8 file paths correctly.
* Supports matching multiple globs at once.
* Doesn't provide a recursive directory iterator of matching file paths,
although I believe this crate should grow one eventually.
* Supports case insensitive and require-literal-separator match options, but
**doesn't** support the require-literal-leading-dot option.
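As a small illustrative sketch of two of the points above, brace alternation
can be combined with case insensitive matching; the `case_insensitive` builder
method name is assumed from the crate documentation rather than shown elsewhere
in this README:
```rust
use globset::GlobBuilder;

// Match Markdown files regardless of extension casing.
let glob = try!(GlobBuilder::new("*.{md,markdown}")
                .case_insensitive(true)
                .build()).compile_matcher();
assert!(glob.is_match("README.md"));
assert!(glob.is_match("NOTES.MARKDOWN"));
assert!(!glob.is_match("notes.txt"));
```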

View File

@@ -5,37 +5,50 @@ tool itself, see the benchsuite directory.
#![feature(test)]
extern crate glob;
extern crate globset;
#[macro_use]
extern crate lazy_static;
extern crate regex;
extern crate test;
use globset::{Candidate, Glob, GlobMatcher, GlobSet, GlobSetBuilder};
const EXT: &'static str = "some/a/bigger/path/to/the/crazy/needle.txt";
const EXT_PAT: &'static str = "*.txt";
const SHORT: &'static str = "some/needle.txt";
const SHORT_PAT: &'static str = "some/**/needle.txt";
const LONG: &'static str = "some/a/bigger/path/to/the/crazy/needle.txt";
const LONG_PAT: &'static str = "some/**/needle.txt";
#[allow(dead_code, unused_variables)]
#[path = "../src/glob.rs"]
mod reglob;
fn new_glob(pat: &str) -> glob::Pattern {
glob::Pattern::new(pat).unwrap()
}
fn new_reglob(pat: &str) -> reglob::Set {
let mut builder = reglob::SetBuilder::new();
builder.add(pat).unwrap();
fn new_reglob(pat: &str) -> GlobMatcher {
Glob::new(pat).unwrap().compile_matcher()
}
fn new_reglob_many(pats: &[&str]) -> GlobSet {
let mut builder = GlobSetBuilder::new();
for pat in pats {
builder.add(Glob::new(pat).unwrap());
}
builder.build().unwrap()
}
fn new_reglob_many(pats: &[&str]) -> reglob::Set {
let mut builder = reglob::SetBuilder::new();
for pat in pats {
builder.add(pat).unwrap();
}
builder.build().unwrap()
#[bench]
fn ext_glob(b: &mut test::Bencher) {
let pat = new_glob(EXT_PAT);
b.iter(|| assert!(pat.matches(EXT)));
}
#[bench]
fn ext_regex(b: &mut test::Bencher) {
let set = new_reglob(EXT_PAT);
let cand = Candidate::new(EXT);
b.iter(|| assert!(set.is_match_candidate(&cand)));
}
#[bench]
@@ -47,7 +60,8 @@ fn short_glob(b: &mut test::Bencher) {
#[bench]
fn short_regex(b: &mut test::Bencher) {
let set = new_reglob(SHORT_PAT);
b.iter(|| assert!(set.is_match(SHORT)));
let cand = Candidate::new(SHORT);
b.iter(|| assert!(set.is_match_candidate(&cand)));
}
#[bench]
@@ -59,7 +73,8 @@ fn long_glob(b: &mut test::Bencher) {
#[bench]
fn long_regex(b: &mut test::Bencher) {
let set = new_reglob(LONG_PAT);
b.iter(|| assert!(set.is_match(LONG)));
let cand = Candidate::new(LONG);
b.iter(|| assert!(set.is_match_candidate(&cand)));
}
const MANY_SHORT_GLOBS: &'static [&'static str] = &[
@@ -101,26 +116,3 @@ fn many_short_regex_set(b: &mut test::Bencher) {
let set = new_reglob_many(MANY_SHORT_GLOBS);
b.iter(|| assert_eq!(2, set.matches(MANY_SHORT_SEARCH).iter().count()));
}
// This is the fastest on my system (beating many_glob by about 2x). This
// suggests that a RegexSet needs quite a few regexes (or a larger haystack)
// in order for it to scale.
//
// TODO(burntsushi): come up with a benchmark that uses more complex patterns
// or a longer haystack.
#[bench]
fn many_short_regex_pattern(b: &mut test::Bencher) {
let pats: Vec<_> = MANY_SHORT_GLOBS.iter().map(|&s| {
let pat = reglob::Pattern::new(s).unwrap();
regex::Regex::new(&pat.to_regex()).unwrap()
}).collect();
b.iter(|| {
let mut count = 0;
for pat in &pats {
if pat.is_match(MANY_SHORT_SEARCH) {
count += 1;
}
}
assert_eq!(2, count);
})
}

1300
globset/src/glob.rs Normal file

File diff suppressed because it is too large Load Diff

753
globset/src/lib.rs Normal file
View File

@@ -0,0 +1,753 @@
/*!
The globset crate provides cross platform single glob and glob set matching.
Glob set matching is the process of matching one or more glob patterns against
a single candidate path simultaneously, and returning all of the globs that
matched. For example, given this set of globs:
```ignore
*.rs
src/lib.rs
src/**/foo.rs
```
and a path `src/bar/baz/foo.rs`, then the set would report the first and third
globs as matching.
Single glob matching is also provided, and is done by converting globs to
regular expressions that are then executed by the `regex` crate.
# Example: one glob
This example shows how to match a single glob against a single file path.
```
# fn example() -> Result<(), globset::Error> {
use globset::Glob;
let glob = try!(Glob::new("*.rs")).compile_matcher();
assert!(glob.is_match("foo.rs"));
assert!(glob.is_match("foo/bar.rs"));
assert!(!glob.is_match("Cargo.toml"));
# Ok(()) } example().unwrap();
```
# Example: configuring a glob matcher
This example shows how to use a `GlobBuilder` to configure aspects of match
semantics. In this example, we prevent wildcards from matching path separators.
```
# fn example() -> Result<(), globset::Error> {
use globset::GlobBuilder;
let glob = try!(GlobBuilder::new("*.rs")
.literal_separator(true).build()).compile_matcher();
assert!(glob.is_match("foo.rs"));
assert!(!glob.is_match("foo/bar.rs")); // no longer matches
assert!(!glob.is_match("Cargo.toml"));
# Ok(()) } example().unwrap();
```
# Example: match multiple globs at once
This example shows how to match multiple glob patterns at once.
```
# fn example() -> Result<(), globset::Error> {
use globset::{Glob, GlobSetBuilder};
let mut builder = GlobSetBuilder::new();
// A GlobBuilder can be used to configure each glob's match semantics
// independently.
builder.add(try!(Glob::new("*.rs")));
builder.add(try!(Glob::new("src/lib.rs")));
builder.add(try!(Glob::new("src/**/foo.rs")));
let set = try!(builder.build());
assert_eq!(set.matches("src/bar/baz/foo.rs"), vec![0, 2]);
# Ok(()) } example().unwrap();
```
# Syntax
Standard Unix-style glob syntax is supported:
* `?` matches any single character. (If the `literal_separator` option is
enabled, then `?` can never match a path separator.)
* `*` matches zero or more characters. (If the `literal_separator` option is
enabled, then `*` can never match a path separator.)
* `**` recursively matches directories but is only legal in three situations.
First, if the glob starts with <code>\*\*&#x2F;</code>, then it matches
all directories. For example, <code>\*\*&#x2F;foo</code> matches `foo`
and `bar/foo` but not `foo/bar`. Secondly, if the glob ends with
<code>&#x2F;\*\*</code>, then it matches all sub-entries. For example,
<code>foo&#x2F;\*\*</code> matches `foo/a` and `foo/a/b`, but not `foo`.
Thirdly, if the glob contains <code>&#x2F;\*\*&#x2F;</code> anywhere within
the pattern, then it matches zero or more directories. Using `**` anywhere
else is illegal (N.B. the glob `**` is allowed and means "match everything").
* `{a,b}` matches `a` or `b` where `a` and `b` are arbitrary glob patterns.
(N.B. Nesting `{...}` is not currently allowed.)
* `[ab]` matches `a` or `b` where `a` and `b` are characters. Use
`[!ab]` to match any character except for `a` and `b`.
* Metacharacters such as `*` and `?` can be escaped with character class
notation. e.g., `[*]` matches `*`.
A `GlobBuilder` can be used to prevent wildcards from matching path separators,
or to enable case insensitive matching.
*/
#![deny(missing_docs)]
extern crate aho_corasick;
extern crate fnv;
#[macro_use]
extern crate lazy_static;
#[macro_use]
extern crate log;
extern crate memchr;
extern crate regex;
use std::borrow::Cow;
use std::collections::{BTreeMap, HashMap};
use std::error::Error as StdError;
use std::ffi::{OsStr, OsString};
use std::fmt;
use std::hash;
use std::path::Path;
use std::str;
use aho_corasick::{Automaton, AcAutomaton, FullAcAutomaton};
use regex::bytes::{Regex, RegexBuilder, RegexSet};
use pathutil::{
file_name, file_name_ext, normalize_path, os_str_bytes, path_bytes,
};
use glob::MatchStrategy;
pub use glob::{Glob, GlobBuilder, GlobMatcher};
mod glob;
mod pathutil;
macro_rules! eprintln {
($($tt:tt)*) => {{
use std::io::Write;
let _ = writeln!(&mut ::std::io::stderr(), $($tt)*);
}}
}
/// Represents an error that can occur when parsing a glob pattern.
#[derive(Clone, Debug, Eq, PartialEq)]
pub enum Error {
/// Occurs when a use of `**` is invalid. Namely, `**` can only appear
/// adjacent to a path separator, or the beginning/end of a glob.
InvalidRecursive,
/// Occurs when a character class (e.g., `[abc]`) is not closed.
UnclosedClass,
/// Occurs when a range in a character class (e.g., `[a-z]`) is invalid. For
/// example, if the range starts with a lexicographically larger character
/// than it ends with.
InvalidRange(char, char),
/// Occurs when a `}` is found without a matching `{`.
UnopenedAlternates,
/// Occurs when a `{` is found without a matching `}`.
UnclosedAlternates,
/// Occurs when an alternating group is nested inside another alternating
/// group, e.g., `{{a,b},{c,d}}`.
NestedAlternates,
/// An error associated with parsing or compiling a regex.
Regex(String),
}
impl StdError for Error {
fn description(&self) -> &str {
match *self {
Error::InvalidRecursive => {
"invalid use of **; must be one path component"
}
Error::UnclosedClass => {
"unclosed character class; missing ']'"
}
Error::InvalidRange(_, _) => {
"invalid character range"
}
Error::UnopenedAlternates => {
"unopened alternate group; missing '{' \
(maybe escape '}' with '[}]'?)"
}
Error::UnclosedAlternates => {
"unclosed alternate group; missing '}' \
(maybe escape '{' with '[{]'?)"
}
Error::NestedAlternates => {
"nested alternate groups are not allowed"
}
Error::Regex(ref err) => err,
}
}
}
impl fmt::Display for Error {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
match *self {
Error::InvalidRecursive
| Error::UnclosedClass
| Error::UnopenedAlternates
| Error::UnclosedAlternates
| Error::NestedAlternates
| Error::Regex(_) => {
write!(f, "{}", self.description())
}
Error::InvalidRange(s, e) => {
write!(f, "invalid range; '{}' > '{}'", s, e)
}
}
}
}
fn new_regex(pat: &str) -> Result<Regex, Error> {
RegexBuilder::new(pat)
.dot_matches_new_line(true)
.size_limit(10 * (1 << 20))
.dfa_size_limit(10 * (1 << 20))
.compile()
.map_err(|err| Error::Regex(err.to_string()))
}
fn new_regex_set<I, S>(pats: I) -> Result<RegexSet, Error>
where S: AsRef<str>, I: IntoIterator<Item=S> {
RegexSet::new(pats).map_err(|err| Error::Regex(err.to_string()))
}
type Fnv = hash::BuildHasherDefault<fnv::FnvHasher>;
/// GlobSet represents a group of globs that can be matched together in a
/// single pass.
#[derive(Clone, Debug)]
pub struct GlobSet {
strats: Vec<GlobSetMatchStrategy>,
}
impl GlobSet {
/// Returns true if any glob in this set matches the path given.
pub fn is_match<P: AsRef<Path>>(&self, path: P) -> bool {
self.is_match_candidate(&Candidate::new(path.as_ref()))
}
/// Returns true if any glob in this set matches the path given.
///
/// This takes a Candidate as input, which can be used to amortize the
/// cost of preparing a path for matching.
pub fn is_match_candidate(&self, path: &Candidate) -> bool {
for strat in &self.strats {
if strat.is_match(path) {
return true;
}
}
false
}
/// Returns the sequence number of every glob pattern that matches the
/// given path.
///
/// This takes a Candidate as input, which can be used to amortize the
/// cost of preparing a path for matching.
pub fn matches<P: AsRef<Path>>(&self, path: P) -> Vec<usize> {
self.matches_candidate(&Candidate::new(path.as_ref()))
}
/// Returns the sequence number of every glob pattern that matches the
/// given path.
///
/// This takes a Candidate as input, which can be used to amortize the
/// cost of preparing a path for matching.
pub fn matches_candidate(&self, path: &Candidate) -> Vec<usize> {
let mut into = vec![];
self.matches_candidate_into(path, &mut into);
into
}
/// Adds the sequence number of every glob pattern that matches the given
/// path to the vec given.
///
/// `into` is cleared before matching begins, and contains the set of
/// sequence numbers (in ascending order) after matching ends. If no globs
/// were matched, then `into` will be empty.
pub fn matches_candidate_into(
&self,
path: &Candidate,
into: &mut Vec<usize>,
) {
into.clear();
for strat in &self.strats {
strat.matches_into(path, into);
}
into.sort();
into.dedup();
}
fn new(pats: &[Glob]) -> Result<GlobSet, Error> {
let mut lits = LiteralStrategy::new();
let mut base_lits = BasenameLiteralStrategy::new();
let mut exts = ExtensionStrategy::new();
let mut prefixes = MultiStrategyBuilder::new();
let mut suffixes = MultiStrategyBuilder::new();
let mut required_exts = RequiredExtensionStrategyBuilder::new();
let mut regexes = MultiStrategyBuilder::new();
for (i, p) in pats.iter().enumerate() {
match MatchStrategy::new(p) {
MatchStrategy::Literal(lit) => {
lits.add(i, lit);
}
MatchStrategy::BasenameLiteral(lit) => {
base_lits.add(i, lit);
}
MatchStrategy::Extension(ext) => {
exts.add(i, ext);
}
MatchStrategy::Prefix(prefix) => {
prefixes.add(i, prefix);
}
MatchStrategy::Suffix { suffix, component } => {
if component {
lits.add(i, suffix[1..].to_string());
}
suffixes.add(i, suffix);
}
MatchStrategy::RequiredExtension(ext) => {
required_exts.add(i, ext, p.regex().to_owned());
}
MatchStrategy::Regex => {
debug!("glob converted to regex: {:?}", p);
regexes.add(i, p.regex().to_owned());
}
}
}
debug!("built glob set; {} literals, {} basenames, {} extensions, \
{} prefixes, {} suffixes, {} required extensions, {} regexes",
lits.0.len(), base_lits.0.len(), exts.0.len(),
prefixes.literals.len(), suffixes.literals.len(),
required_exts.0.len(), regexes.literals.len());
Ok(GlobSet {
strats: vec![
GlobSetMatchStrategy::Extension(exts),
GlobSetMatchStrategy::BasenameLiteral(base_lits),
GlobSetMatchStrategy::Literal(lits),
GlobSetMatchStrategy::Suffix(suffixes.suffix()),
GlobSetMatchStrategy::Prefix(prefixes.prefix()),
GlobSetMatchStrategy::RequiredExtension(
try!(required_exts.build())),
GlobSetMatchStrategy::Regex(try!(regexes.regex_set())),
],
})
}
}
/// GlobSetBuilder builds a group of patterns that can be used to
/// simultaneously match a file path.
pub struct GlobSetBuilder {
pats: Vec<Glob>,
}
impl GlobSetBuilder {
/// Create a new GlobSetBuilder. A GlobSetBuilder can be used to add new
/// patterns. Once all patterns have been added, `build` should be called
/// to produce a `GlobSet`, which can then be used for matching.
pub fn new() -> GlobSetBuilder {
GlobSetBuilder { pats: vec![] }
}
/// Builds a new matcher from all of the glob patterns added so far.
///
/// Once a matcher is built, no new patterns can be added to it.
pub fn build(&self) -> Result<GlobSet, Error> {
GlobSet::new(&self.pats)
}
/// Add a new pattern to this set.
#[allow(dead_code)]
pub fn add(&mut self, pat: Glob) -> &mut GlobSetBuilder {
self.pats.push(pat);
self
}
}
/// A candidate path for matching.
///
/// All glob matching in this crate operates on `Candidate` values.
/// Constructing candidates has a very small cost associated with it, so
/// callers may find it beneficial to amortize that cost when matching a single
/// path against multiple globs or sets of globs.
#[derive(Clone, Debug)]
pub struct Candidate<'a> {
path: Cow<'a, [u8]>,
basename: Cow<'a, [u8]>,
ext: &'a OsStr,
}
impl<'a> Candidate<'a> {
/// Create a new candidate for matching from the given path.
pub fn new<P: AsRef<Path> + ?Sized>(path: &'a P) -> Candidate<'a> {
let path = path.as_ref();
let basename = file_name(path).unwrap_or(OsStr::new(""));
Candidate {
path: normalize_path(path_bytes(path)),
basename: os_str_bytes(basename),
ext: file_name_ext(basename).unwrap_or(OsStr::new("")),
}
}
fn path_prefix(&self, max: usize) -> &[u8] {
if self.path.len() <= max {
&*self.path
} else {
&self.path[..max]
}
}
fn path_suffix(&self, max: usize) -> &[u8] {
if self.path.len() <= max {
&*self.path
} else {
&self.path[self.path.len() - max..]
}
}
}
#[derive(Clone, Debug)]
enum GlobSetMatchStrategy {
Literal(LiteralStrategy),
BasenameLiteral(BasenameLiteralStrategy),
Extension(ExtensionStrategy),
Prefix(PrefixStrategy),
Suffix(SuffixStrategy),
RequiredExtension(RequiredExtensionStrategy),
Regex(RegexSetStrategy),
}
impl GlobSetMatchStrategy {
fn is_match(&self, candidate: &Candidate) -> bool {
use self::GlobSetMatchStrategy::*;
match *self {
Literal(ref s) => s.is_match(candidate),
BasenameLiteral(ref s) => s.is_match(candidate),
Extension(ref s) => s.is_match(candidate),
Prefix(ref s) => s.is_match(candidate),
Suffix(ref s) => s.is_match(candidate),
RequiredExtension(ref s) => s.is_match(candidate),
Regex(ref s) => s.is_match(candidate),
}
}
fn matches_into(&self, candidate: &Candidate, matches: &mut Vec<usize>) {
use self::GlobSetMatchStrategy::*;
match *self {
Literal(ref s) => s.matches_into(candidate, matches),
BasenameLiteral(ref s) => s.matches_into(candidate, matches),
Extension(ref s) => s.matches_into(candidate, matches),
Prefix(ref s) => s.matches_into(candidate, matches),
Suffix(ref s) => s.matches_into(candidate, matches),
RequiredExtension(ref s) => s.matches_into(candidate, matches),
Regex(ref s) => s.matches_into(candidate, matches),
}
}
}
#[derive(Clone, Debug)]
struct LiteralStrategy(BTreeMap<Vec<u8>, Vec<usize>>);
impl LiteralStrategy {
fn new() -> LiteralStrategy {
LiteralStrategy(BTreeMap::new())
}
fn add(&mut self, global_index: usize, lit: String) {
self.0.entry(lit.into_bytes()).or_insert(vec![]).push(global_index);
}
fn is_match(&self, candidate: &Candidate) -> bool {
self.0.contains_key(&*candidate.path)
}
#[inline(never)]
fn matches_into(&self, candidate: &Candidate, matches: &mut Vec<usize>) {
if let Some(hits) = self.0.get(&*candidate.path) {
matches.extend(hits);
}
}
}
#[derive(Clone, Debug)]
struct BasenameLiteralStrategy(BTreeMap<Vec<u8>, Vec<usize>>);
impl BasenameLiteralStrategy {
fn new() -> BasenameLiteralStrategy {
BasenameLiteralStrategy(BTreeMap::new())
}
fn add(&mut self, global_index: usize, lit: String) {
self.0.entry(lit.into_bytes()).or_insert(vec![]).push(global_index);
}
fn is_match(&self, candidate: &Candidate) -> bool {
if candidate.basename.is_empty() {
return false;
}
self.0.contains_key(&*candidate.basename)
}
#[inline(never)]
fn matches_into(&self, candidate: &Candidate, matches: &mut Vec<usize>) {
if candidate.basename.is_empty() {
return;
}
if let Some(hits) = self.0.get(&*candidate.basename) {
matches.extend(hits);
}
}
}
#[derive(Clone, Debug)]
struct ExtensionStrategy(HashMap<OsString, Vec<usize>, Fnv>);
impl ExtensionStrategy {
fn new() -> ExtensionStrategy {
ExtensionStrategy(HashMap::with_hasher(Fnv::default()))
}
fn add(&mut self, global_index: usize, ext: OsString) {
self.0.entry(ext).or_insert(vec![]).push(global_index);
}
fn is_match(&self, candidate: &Candidate) -> bool {
if candidate.ext.is_empty() {
return false;
}
self.0.contains_key(candidate.ext)
}
#[inline(never)]
fn matches_into(&self, candidate: &Candidate, matches: &mut Vec<usize>) {
if candidate.ext.is_empty() {
return;
}
if let Some(hits) = self.0.get(candidate.ext) {
matches.extend(hits);
}
}
}
#[derive(Clone, Debug)]
struct PrefixStrategy {
matcher: FullAcAutomaton<Vec<u8>>,
map: Vec<usize>,
longest: usize,
}
impl PrefixStrategy {
fn is_match(&self, candidate: &Candidate) -> bool {
let path = candidate.path_prefix(self.longest);
for m in self.matcher.find_overlapping(path) {
if m.start == 0 {
return true;
}
}
false
}
fn matches_into(&self, candidate: &Candidate, matches: &mut Vec<usize>) {
let path = candidate.path_prefix(self.longest);
for m in self.matcher.find_overlapping(path) {
if m.start == 0 {
matches.push(self.map[m.pati]);
}
}
}
}
#[derive(Clone, Debug)]
struct SuffixStrategy {
matcher: FullAcAutomaton<Vec<u8>>,
map: Vec<usize>,
longest: usize,
}
impl SuffixStrategy {
fn is_match(&self, candidate: &Candidate) -> bool {
let path = candidate.path_suffix(self.longest);
for m in self.matcher.find_overlapping(path) {
if m.end == path.len() {
return true;
}
}
false
}
fn matches_into(&self, candidate: &Candidate, matches: &mut Vec<usize>) {
let path = candidate.path_suffix(self.longest);
for m in self.matcher.find_overlapping(path) {
if m.end == path.len() {
matches.push(self.map[m.pati]);
}
}
}
}
#[derive(Clone, Debug)]
struct RequiredExtensionStrategy(HashMap<OsString, Vec<(usize, Regex)>, Fnv>);
impl RequiredExtensionStrategy {
fn is_match(&self, candidate: &Candidate) -> bool {
if candidate.ext.is_empty() {
return false;
}
match self.0.get(candidate.ext) {
None => false,
Some(regexes) => {
for &(_, ref re) in regexes {
if re.is_match(&*candidate.path) {
return true;
}
}
false
}
}
}
#[inline(never)]
fn matches_into(&self, candidate: &Candidate, matches: &mut Vec<usize>) {
if candidate.ext.is_empty() {
return;
}
if let Some(regexes) = self.0.get(candidate.ext) {
for &(global_index, ref re) in regexes {
if re.is_match(&*candidate.path) {
matches.push(global_index);
}
}
}
}
}
#[derive(Clone, Debug)]
struct RegexSetStrategy {
matcher: RegexSet,
map: Vec<usize>,
}
impl RegexSetStrategy {
fn is_match(&self, candidate: &Candidate) -> bool {
self.matcher.is_match(&*candidate.path)
}
fn matches_into(&self, candidate: &Candidate, matches: &mut Vec<usize>) {
for i in self.matcher.matches(&*candidate.path) {
matches.push(self.map[i]);
}
}
}
#[derive(Clone, Debug)]
struct MultiStrategyBuilder {
literals: Vec<String>,
map: Vec<usize>,
longest: usize,
}
impl MultiStrategyBuilder {
fn new() -> MultiStrategyBuilder {
MultiStrategyBuilder {
literals: vec![],
map: vec![],
longest: 0,
}
}
fn add(&mut self, global_index: usize, literal: String) {
if literal.len() > self.longest {
self.longest = literal.len();
}
self.map.push(global_index);
self.literals.push(literal);
}
fn prefix(self) -> PrefixStrategy {
let it = self.literals.into_iter().map(|s| s.into_bytes());
PrefixStrategy {
matcher: AcAutomaton::new(it).into_full(),
map: self.map,
longest: self.longest,
}
}
fn suffix(self) -> SuffixStrategy {
let it = self.literals.into_iter().map(|s| s.into_bytes());
SuffixStrategy {
matcher: AcAutomaton::new(it).into_full(),
map: self.map,
longest: self.longest,
}
}
fn regex_set(self) -> Result<RegexSetStrategy, Error> {
Ok(RegexSetStrategy {
matcher: try!(new_regex_set(self.literals)),
map: self.map,
})
}
}
#[derive(Clone, Debug)]
struct RequiredExtensionStrategyBuilder(
HashMap<OsString, Vec<(usize, String)>>,
);
impl RequiredExtensionStrategyBuilder {
fn new() -> RequiredExtensionStrategyBuilder {
RequiredExtensionStrategyBuilder(HashMap::new())
}
fn add(&mut self, global_index: usize, ext: OsString, regex: String) {
self.0.entry(ext).or_insert(vec![]).push((global_index, regex));
}
fn build(self) -> Result<RequiredExtensionStrategy, Error> {
let mut exts = HashMap::with_hasher(Fnv::default());
for (ext, regexes) in self.0.into_iter() {
exts.insert(ext.clone(), vec![]);
for (global_index, regex) in regexes {
let compiled = try!(new_regex(&regex));
exts.get_mut(&ext).unwrap().push((global_index, compiled));
}
}
Ok(RequiredExtensionStrategy(exts))
}
}
#[cfg(test)]
mod tests {
use super::GlobSetBuilder;
use glob::Glob;
#[test]
fn set_works() {
let mut builder = GlobSetBuilder::new();
builder.add(Glob::new("src/**/*.rs").unwrap());
builder.add(Glob::new("*.c").unwrap());
builder.add(Glob::new("src/lib.rs").unwrap());
let set = builder.build().unwrap();
assert!(set.is_match("foo.c"));
assert!(set.is_match("src/foo.c"));
assert!(!set.is_match("foo.rs"));
assert!(!set.is_match("tests/foo.rs"));
assert!(set.is_match("src/foo.rs"));
assert!(set.is_match("src/grep/src/main.rs"));
let matches = set.matches("src/lib.rs");
assert_eq!(2, matches.len());
assert_eq!(0, matches[0]);
assert_eq!(2, matches[1]);
}
}

180
globset/src/pathutil.rs Normal file
View File

@@ -0,0 +1,180 @@
use std::borrow::Cow;
use std::ffi::OsStr;
use std::path::Path;
/// The final component of the path, if it is a normal file.
///
/// If the path terminates in ., .., or consists solely of a root or prefix,
/// file_name will return None.
#[cfg(unix)]
pub fn file_name<'a, P: AsRef<Path> + ?Sized>(
path: &'a P,
) -> Option<&'a OsStr> {
use std::os::unix::ffi::OsStrExt;
use memchr::memrchr;
let path = path.as_ref().as_os_str().as_bytes();
if path.is_empty() {
return None;
} else if path.len() == 1 && path[0] == b'.' {
return None;
} else if path.last() == Some(&b'.') {
return None;
} else if path.len() >= 2 && &path[path.len() - 2..] == &b".."[..] {
return None;
}
let last_slash = memrchr(b'/', path).map(|i| i + 1).unwrap_or(0);
Some(OsStr::from_bytes(&path[last_slash..]))
}
/// The final component of the path, if it is a normal file.
///
/// If the path terminates in ., .., or consists solely of a root or prefix,
/// file_name will return None.
#[cfg(not(unix))]
pub fn file_name<'a, P: AsRef<Path> + ?Sized>(
path: &'a P,
) -> Option<&'a OsStr> {
path.as_ref().file_name()
}
/// Return a file extension given a path's file name.
///
/// Note that this does NOT match the semantics of std::path::Path::extension.
/// Namely, the extension includes the `.` and matching is otherwise more
/// liberal. Specifically, the extension is:
///
/// * None, if the file name given is empty;
/// * None, if there is no embedded `.`;
/// * Otherwise, the portion of the file name starting with the final `.`.
///
/// e.g., A file name of `.rs` has an extension `.rs`.
///
/// N.B. This is done to make certain glob match optimizations easier. Namely,
/// a pattern like `*.rs` is obviously trying to match files with a `rs`
/// extension, but it also matches files like `.rs`, which doesn't have an
/// extension according to std::path::Path::extension.
pub fn file_name_ext(name: &OsStr) -> Option<&OsStr> {
// Yes, these functions are awful, and yes, we are completely violating
// the abstraction barrier of std::ffi. The barrier we're violating is
// that an OsStr's encoding is *ASCII compatible*. While this is obviously
// true on Unix systems, it's also true on Windows because an OsStr uses
// WTF-8 internally: https://simonsapin.github.io/wtf-8/
//
// We should consider doing the same for the other path utility functions.
// Right now, we don't break any barriers, but Windows users are paying
// for it.
//
// Got any better ideas that don't cost anything? Hit me up. ---AG
unsafe fn os_str_as_u8_slice(s: &OsStr) -> &[u8] {
::std::mem::transmute(s)
}
unsafe fn u8_slice_as_os_str(s: &[u8]) -> &OsStr {
::std::mem::transmute(s)
}
if name.is_empty() {
return None;
}
let name = unsafe { os_str_as_u8_slice(name) };
for (i, &b) in name.iter().enumerate().rev() {
if b == b'.' {
return Some(unsafe { u8_slice_as_os_str(&name[i..]) });
}
}
None
}
/// Return raw bytes of a path, transcoded to UTF-8 if necessary.
pub fn path_bytes(path: &Path) -> Cow<[u8]> {
os_str_bytes(path.as_os_str())
}
/// Return the raw bytes of the given OS string, transcoded to UTF-8 if
/// necessary.
#[cfg(unix)]
pub fn os_str_bytes(s: &OsStr) -> Cow<[u8]> {
use std::os::unix::ffi::OsStrExt;
Cow::Borrowed(s.as_bytes())
}
/// Return the raw bytes of the given OS string, transcoded to UTF-8 if
/// necessary.
#[cfg(not(unix))]
pub fn os_str_bytes(s: &OsStr) -> Cow<[u8]> {
// TODO(burntsushi): On Windows, OS strings are WTF-8, which is a superset
// of UTF-8, so even if we could get at the raw bytes, they wouldn't
// be useful. We *must* convert to UTF-8 before doing path matching.
// Unfortunate, but necessary.
match s.to_string_lossy() {
Cow::Owned(s) => Cow::Owned(s.into_bytes()),
Cow::Borrowed(s) => Cow::Borrowed(s.as_bytes()),
}
}
/// Normalizes a path to use `/` as a separator everywhere, even on platforms
/// that recognize other characters as separators.
#[cfg(unix)]
pub fn normalize_path(path: Cow<[u8]>) -> Cow<[u8]> {
// UNIX only uses /, so we're good.
path
}
/// Normalizes a path to use `/` as a separator everywhere, even on platforms
/// that recognize other characters as separators.
#[cfg(not(unix))]
pub fn normalize_path(mut path: Cow<[u8]>) -> Cow<[u8]> {
use std::path::is_separator;
for i in 0..path.len() {
if path[i] == b'/' || !is_separator(path[i] as char) {
continue;
}
path.to_mut()[i] = b'/';
}
path
}
#[cfg(test)]
mod tests {
use std::borrow::Cow;
use std::ffi::OsStr;
use super::{file_name_ext, normalize_path};
macro_rules! ext {
($name:ident, $file_name:expr, $ext:expr) => {
#[test]
fn $name() {
let got = file_name_ext(OsStr::new($file_name));
assert_eq!($ext.map(OsStr::new), got);
}
};
}
ext!(ext1, "foo.rs", Some(".rs"));
ext!(ext2, ".rs", Some(".rs"));
ext!(ext3, "..rs", Some(".rs"));
ext!(ext4, "", None::<&str>);
ext!(ext5, "foo", None::<&str>);
macro_rules! normalize {
($name:ident, $path:expr, $expected:expr) => {
#[test]
fn $name() {
let got = normalize_path(Cow::Owned($path.to_vec()));
assert_eq!($expected.to_vec(), got.into_owned());
}
};
}
normalize!(normal1, b"foo", b"foo");
normalize!(normal2, b"foo/bar", b"foo/bar");
#[cfg(unix)]
normalize!(normal3, b"foo\\bar", b"foo\\bar");
#[cfg(not(unix))]
normalize!(normal3, b"foo\\bar", b"foo/bar");
#[cfg(unix)]
normalize!(normal4, b"foo\\bar/baz", b"foo\\bar/baz");
#[cfg(not(unix))]
normalize!(normal4, b"foo\\bar/baz", b"foo/bar/baz");
}

View File

@@ -1,6 +1,6 @@
[package]
name = "grep"
version = "0.1.2" #:version
version = "0.1.3" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
Fast line oriented regex searching as a library.

View File

@@ -8,7 +8,6 @@ Note that this implementation is incredibly suspicious. We need something more
principled.
*/
use std::cmp;
use std::iter;
use regex::bytes::Regex;
use syntax::{
@@ -181,8 +180,6 @@ fn repeat_range_literals<F: FnMut(&Expr, &mut Literals)>(
lits: &mut Literals,
mut f: F,
) {
use syntax::Expr::*;
if min == 0 {
// This is a bit conservative. If `max` is set, then we could
// treat this as a finite set of alternations. For now, we
@@ -190,8 +187,12 @@ fn repeat_range_literals<F: FnMut(&Expr, &mut Literals)>(
lits.cut();
} else {
let n = cmp::min(lits.limit_size(), min as usize);
let es = iter::repeat(e.clone()).take(n).collect();
f(&Concat(es), lits);
// We only extract literals from a single repetition, even though
// we could do more. e.g., `a{3}` will have `a` extracted instead of
// `aaa`. The reason is that inner literal extraction can't be unioned
// across repetitions. e.g., extracting `foofoofoo` from `(\w+foo){3}`
// is wrong.
f(e, lits);
if n < min as usize {
lits.cut();
}

View File

@@ -52,6 +52,7 @@ pub struct GrepBuilder {
#[derive(Clone, Debug)]
struct Options {
case_insensitive: bool,
case_smart: bool,
line_terminator: u8,
size_limit: usize,
dfa_size_limit: usize,
@@ -61,6 +62,7 @@ impl Default for Options {
fn default() -> Options {
Options {
case_insensitive: false,
case_smart: false,
line_terminator: b'\n',
size_limit: 10 * (1 << 20),
dfa_size_limit: 10 * (1 << 20),
@@ -98,6 +100,18 @@ impl GrepBuilder {
self
}
/// Whether to enable smart case search or not (disabled by default).
///
/// Smart case uses case insensitive search if the regex contains all
/// lowercase literal characters. Otherwise, a case sensitive search is
/// used instead.
///
/// Enabling the case_insensitive flag overrides this.
pub fn case_smart(mut self, yes: bool) -> GrepBuilder {
self.opts.case_smart = yes;
self
}
/// Set the approximate size limit of the compiled regular expression.
///
/// This roughly corresponds to the number of bytes occupied by a
@@ -148,8 +162,11 @@ impl GrepBuilder {
/// Creates a new regex from the given expression with the current
/// configuration.
fn regex(&self, expr: &Expr) -> Result<Regex> {
let casei =
self.opts.case_insensitive
|| (self.opts.case_smart && !has_uppercase_literal(expr));
RegexBuilder::new(&expr.to_string())
.case_insensitive(self.opts.case_insensitive)
.case_insensitive(casei)
.multi_line(true)
.unicode(true)
.size_limit(self.opts.size_limit)
@@ -167,8 +184,9 @@ impl GrepBuilder {
.unicode(true)
.case_insensitive(self.opts.case_insensitive)
.parse(&self.pattern));
let expr = try!(nonl::remove(expr, self.opts.line_terminator));
debug!("regex ast:\n{:#?}", expr);
Ok(try!(nonl::remove(expr, self.opts.line_terminator)))
Ok(expr)
}
}
@@ -274,6 +292,23 @@ impl<'b, 's> Iterator for Iter<'b, 's> {
}
}
fn has_uppercase_literal(expr: &Expr) -> bool {
use syntax::Expr::*;
match *expr {
Literal { ref chars, casei } => {
casei || chars.iter().any(|c| c.is_uppercase())
}
LiteralBytes { ref bytes, casei } => {
casei || bytes.iter().any(|&b| b'A' <= b && b <= b'Z')
}
Group { ref e, .. } => has_uppercase_literal(e),
Repeat { ref e, .. } => has_uppercase_literal(e),
Concat(ref es) => es.iter().any(has_uppercase_literal),
Alternate(ref es) => es.iter().any(has_uppercase_literal),
_ => false,
}
}
#[cfg(test)]
mod tests {
#![allow(unused_imports)]

View File

@@ -1,7 +1,7 @@
# Contributor: Andrew Gallant <jamslam@gmail.com>
# Maintainer: Andrew Gallant
pkgname=ripgrep
pkgver=0.1.15
pkgver=0.1.16
pkgrel=1
pkgdesc="A search tool that combines the usability of The Silver Searcher with the raw speed of grep."
arch=('i686' 'x86_64')
@@ -9,7 +9,7 @@ url="https://github.com/BurntSushi/ripgrep"
license=('UNLICENSE')
makedepends=('cargo')
source=("https://github.com/BurntSushi/$pkgname/archive/$pkgver.tar.gz")
sha256sums=('ced856378c4ca625e4798ccae85418badd22e099fc324bcb162df51824808622')
sha256sums=('6f877018742c9a7557102ccebeedb40d7c779b470a5910a7bdab50ca2ce21532')
build() {
cd "$pkgname-$pkgver"

14
pkg/brew/ripgrep-bin.rb Normal file
View File

@@ -0,0 +1,14 @@
class RipgrepBin < Formula
version '0.2.1'
desc "Search tool like grep and The Silver Searcher."
homepage "https://github.com/BurntSushi/ripgrep"
url "https://github.com/BurntSushi/ripgrep/releases/download/#{version}/ripgrep-#{version}-x86_64-apple-darwin.tar.gz"
sha256 "f8b208239b988708da2e58f848a75bf70ad144e201b3ed99cd323cc5a699625f"
conflicts_with "ripgrep"
def install
bin.install "rg"
man1.install "rg.1"
end
end

View File

@@ -1,19 +0,0 @@
require 'formula'
class Ripgrep < Formula
version '0.1.15'
desc "Search tool like grep and The Silver Searcher."
homepage "https://github.com/BurntSushi/ripgrep"
if Hardware::CPU.is_64_bit?
url "https://github.com/BurntSushi/ripgrep/releases/download/#{version}/ripgrep-#{version}-x86_64-apple-darwin.tar.gz"
sha256 "fc138cd57b533bd65739f3f695322e483fe648736358d853ddb9bcd26d84fdc5"
else
url "https://github.com/BurntSushi/ripgrep/releases/download/#{version}/ripgrep-#{version}-i686-apple-darwin.tar.gz"
sha256 "3ce1f12e49a463bc9dd4cfe2537aa9989a0dc81f7aa6f959ee0d0d82b5f768cb"
end
def install
bin.install "rg"
man1.install "rg.1"
end
end

View File

@@ -39,10 +39,10 @@ Usage: rg [options] -e PATTERN ... [<path> ...]
rg [options] <pattern> [<path> ...]
rg [options] --files [<path> ...]
rg [options] --type-list
rg --help
rg --version
rg [options] --help
rg [options] --version
rg combines the usability of The Silver Searcher with the raw speed of grep.
rg recursively searches your current directory for a regex pattern.
Common options:
-a, --text Search binary files as if they were text.
@@ -62,13 +62,12 @@ Common options:
Precede a glob with a '!' to exclude it.
-h, --help Show this usage message.
-i, --ignore-case Case insensitive search.
Overridden by --case-sensitive.
-n, --line-number Show line numbers (1-based). This is enabled
by default at a tty.
-N, --no-line-number Suppress line numbers.
-q, --quiet Do not print anything to stdout.
-r, --replace ARG Replace every match with the string given.
Capture group indices (e.g., $5) and names
(e.g., $foo) are supported.
-q, --quiet Do not print anything to stdout. If a match is
found in a file, stop searching that file.
-t, --type TYPE ... Only search files matching TYPE. Multiple type
flags may be provided. Use the --type-list flag
to list all available types.
@@ -110,10 +109,17 @@ Less common options:
--files
Print each file that would be searched (but don't search).
-l, --files-with-matches
Only show path of each file with matches.
-H, --with-filename
Prefix each match with the file name that contains it. This is the
default when more than one file is searched.
--no-filename
Never show the filename for a match. This is the default when
one file is searched.
--heading
Show the file name above clusters of matches from each file.
This is the default mode at a tty.
@@ -128,6 +134,10 @@ Less common options:
-L, --follow
Follow symlinks.
--maxdepth NUM
Descend at most NUM directories below the command line arguments.
A value of zero only searches the starting-points themselves.
--mmap
Search using memory maps when possible. This is enabled by default
when ripgrep thinks it will be faster. (Note that mmap searching
@@ -137,15 +147,40 @@ Less common options:
Never use memory maps, even when they might be faster.
--no-ignore
Don't respect ignore files (.gitignore, .rgignore, etc.)
Don't respect ignore files (.gitignore, .ignore, etc.)
This implies --no-ignore-parent.
--no-ignore-parent
Don't respect ignore files in parent directories.
--no-ignore-vcs
Don't respect version control ignore files (e.g., .gitignore).
Note that .ignore files will continue to be respected.
--null
Whenever a file name is printed, follow it with a NUL byte.
This includes printing filenames before matches, and when printing
a list of matching files such as with --count, --files-with-matches
and --files.
-p, --pretty
Alias for --color=always --heading -n.
-r, --replace ARG
Replace every match with the string given when printing search results.
Neither this flag nor any other flag will modify your files.
Capture group indices (e.g., $5) and names (e.g., $foo) are supported
in the replacement string.
-s, --case-sensitive
Search case sensitively. This overrides --ignore-case and --smart-case.
-S, --smart-case
Search case insensitively if the pattern is all lowercase.
Search case sensitively otherwise. This is overridden by
either --case-sensitive or --ignore-case.
-j, --threads ARG
The number of threads to use. Defaults to the number of logical CPUs
(capped at 6). [default: 0]
@@ -163,11 +198,18 @@ File type management options:
Show all supported file types and their associated globs.
--type-add ARG ...
Add a new glob for a particular file type.
Example: --type-add html:*.html,*.htm
Add a new glob for a particular file type. Only one glob can be
added at a time. Multiple --type-add flags can be provided.
Unless --type-clear is used, globs are added to any existing globs
inside of ripgrep. Note that this must be passed to every invocation of
rg. Type settings are NOT persisted.
Example: `rg --type-add 'foo:*.foo' -tfoo PATTERN`
--type-clear TYPE ...
Clear the file type globs for TYPE.
Clear the file type globs previously defined for TYPE. This only clears
the default type definitions that are found inside of ripgrep. Note
that this must be passed to every invocation of rg.
";
/// RawArgs are the args as they are parsed from Docopt. They aren't used
@@ -178,11 +220,13 @@ pub struct RawArgs {
arg_path: Vec<String>,
flag_after_context: usize,
flag_before_context: usize,
flag_case_sensitive: bool,
flag_color: String,
flag_column: bool,
flag_context: usize,
flag_context_separator: String,
flag_count: bool,
flag_files_with_matches: bool,
flag_debug: bool,
flag_files: bool,
flag_follow: bool,
@@ -193,16 +237,21 @@ pub struct RawArgs {
flag_invert_match: bool,
flag_line_number: bool,
flag_fixed_strings: bool,
flag_maxdepth: Option<usize>,
flag_mmap: bool,
flag_no_heading: bool,
flag_no_ignore: bool,
flag_no_ignore_parent: bool,
flag_no_ignore_vcs: bool,
flag_no_line_number: bool,
flag_no_mmap: bool,
flag_no_filename: bool,
flag_null: bool,
flag_pretty: bool,
flag_quiet: bool,
flag_regexp: Vec<String>,
flag_replace: Option<String>,
flag_smart_case: bool,
flag_text: bool,
flag_threads: usize,
flag_type: Vec<String>,
@@ -219,7 +268,6 @@ pub struct RawArgs {
/// Args are transformed/normalized from RawArgs.
#[derive(Debug)]
pub struct Args {
pattern: String,
paths: Vec<PathBuf>,
after_context: usize,
before_context: usize,
@@ -227,6 +275,7 @@ pub struct Args {
column: bool,
context_separator: Vec<u8>,
count: bool,
files_with_matches: bool,
eol: u8,
files: bool,
follow: bool,
@@ -238,14 +287,16 @@ pub struct Args {
invert_match: bool,
line_number: bool,
line_per_match: bool,
maxdepth: Option<usize>,
mmap: bool,
no_ignore: bool,
no_ignore_parent: bool,
no_ignore_vcs: bool,
null: bool,
quiet: bool,
replace: Option<Vec<u8>>,
text: bool,
threads: usize,
type_defs: Vec<FileTypeDef>,
type_list: bool,
types: Types,
with_filename: bool,
@@ -254,12 +305,12 @@ pub struct Args {
impl RawArgs {
/// Convert arguments parsed into a configuration used by ripgrep.
fn to_args(&self) -> Result<Args> {
let pattern = self.pattern();
let paths =
if self.arg_path.is_empty() {
if atty::on_stdin()
|| self.flag_files
|| self.flag_type_list {
|| self.flag_type_list
|| !atty::stdin_is_readable() {
vec![Path::new("./").to_path_buf()]
} else {
vec![Path::new("-").to_path_buf()]
@@ -283,6 +334,9 @@ impl RawArgs {
} else if cfg!(windows) {
// On Windows, memory maps appear faster than read calls. Neat.
true
} else if cfg!(target_os = "macos") {
// On Mac, memory maps appear to suck. Neat.
false
} else {
// If we're only searching a few paths and all of them are
// files, then memory maps are probably faster.
@@ -309,33 +363,26 @@ impl RawArgs {
self.flag_threads
};
let color =
if self.flag_vimgrep {
if self.flag_color == "always" {
true
} else if self.flag_vimgrep {
false
} else if self.flag_color == "auto" {
atty::on_stdout() || self.flag_pretty
} else {
self.flag_color == "always"
false
};
let eol = b'\n';
let mut with_filename = self.flag_with_filename;
if !with_filename {
with_filename = paths.len() > 1 || paths[0].is_dir();
}
let mut btypes = TypesBuilder::new();
btypes.add_defaults();
try!(self.add_types(&mut btypes));
let types = try!(btypes.build());
let grep = try!(
GrepBuilder::new(&pattern)
.case_insensitive(self.flag_ignore_case)
.line_terminator(eol)
.build()
);
with_filename = with_filename && !self.flag_no_filename;
let no_ignore = self.flag_no_ignore || self.flag_unrestricted >= 1;
let hidden = self.flag_hidden || self.flag_unrestricted >= 2;
let text = self.flag_text || self.flag_unrestricted >= 3;
let mut args = Args {
pattern: pattern,
paths: paths,
after_context: after_context,
before_context: before_context,
@@ -343,29 +390,34 @@ impl RawArgs {
column: self.flag_column,
context_separator: unescape(&self.flag_context_separator),
count: self.flag_count,
eol: eol,
files_with_matches: self.flag_files_with_matches,
eol: self.eol(),
files: self.flag_files,
follow: self.flag_follow,
glob_overrides: glob_overrides,
grep: grep,
grep: try!(self.grep()),
heading: !self.flag_no_heading && self.flag_heading,
hidden: hidden,
ignore_case: self.flag_ignore_case,
invert_match: self.flag_invert_match,
line_number: !self.flag_no_line_number && self.flag_line_number,
line_per_match: self.flag_vimgrep,
maxdepth: self.flag_maxdepth,
mmap: mmap,
no_ignore: no_ignore,
no_ignore_parent:
// --no-ignore implies --no-ignore-parent
self.flag_no_ignore_parent || no_ignore,
no_ignore_vcs:
// --no-ignore implies --no-ignore-vcs
self.flag_no_ignore_vcs || no_ignore,
null: self.flag_null,
quiet: self.flag_quiet,
replace: self.flag_replace.clone().map(|s| s.into_bytes()),
text: text,
threads: threads,
type_defs: btypes.definitions(),
type_list: self.flag_type_list,
types: types,
types: try!(self.types()),
with_filename: with_filename,
};
// If stdout is a tty, then apply some special default options.
@@ -384,20 +436,22 @@ impl RawArgs {
Ok(args)
}
fn add_types(&self, types: &mut TypesBuilder) -> Result<()> {
fn types(&self) -> Result<Types> {
let mut btypes = TypesBuilder::new();
btypes.add_defaults();
for ty in &self.flag_type_clear {
types.clear(ty);
btypes.clear(ty);
}
for def in &self.flag_type_add {
try!(types.add_def(def));
try!(btypes.add_def(def));
}
for ty in &self.flag_type {
types.select(ty);
btypes.select(ty);
}
for ty in &self.flag_type_not {
types.negate(ty);
btypes.negate(ty);
}
Ok(())
btypes.build().map_err(From::from)
}
fn pattern(&self) -> String {
@@ -427,6 +481,27 @@ impl RawArgs {
s
}
}
fn eol(&self) -> u8 {
// We might want to make this configurable.
b'\n'
}
fn grep(&self) -> Result<Grep> {
let smart =
self.flag_smart_case
&& !self.flag_ignore_case
&& !self.flag_case_sensitive;
let casei =
self.flag_ignore_case
&& !self.flag_case_sensitive;
GrepBuilder::new(&self.pattern())
.case_smart(smart)
.case_insensitive(casei)
.line_terminator(self.eol())
.build()
.map_err(From::from)
}
}
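The new `grep()` helper resolves three overlapping case flags: `--case-sensitive` overrides everything, `--ignore-case` overrides `--smart-case`, and `--smart-case` only takes effect when neither of the others is given. A small self-contained sketch of that precedence (the `case_mode` helper and `CaseMode` enum are illustrative names, not part of ripgrep):

    #[derive(Debug, PartialEq)]
    enum CaseMode {
        Sensitive,
        Insensitive,
        Smart,
    }

    // Mirrors the flag precedence computed in RawArgs::grep above.
    fn case_mode(case_sensitive: bool, ignore_case: bool, smart_case: bool) -> CaseMode {
        if case_sensitive {
            // --case-sensitive overrides both of the other flags.
            CaseMode::Sensitive
        } else if ignore_case {
            // --ignore-case beats --smart-case.
            CaseMode::Insensitive
        } else if smart_case {
            CaseMode::Smart
        } else {
            CaseMode::Sensitive
        }
    }

    fn main() {
        assert_eq!(case_mode(true, true, true), CaseMode::Sensitive);
        assert_eq!(case_mode(false, true, true), CaseMode::Insensitive);
        assert_eq!(case_mode(false, false, true), CaseMode::Smart);
    }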
impl Args {
@@ -451,7 +526,7 @@ impl Args {
}
}
}
let raw: RawArgs =
let mut raw: RawArgs =
Docopt::new(USAGE)
.and_then(|d| d.argv(argv).version(Some(version())).decode())
.unwrap_or_else(|e| e.exit());
@@ -466,6 +541,13 @@ impl Args {
errored!("failed to initialize logger: {}", err);
}
// *sigh*... If --files is given, then the first path ends up in
// pattern.
if raw.flag_files {
if !raw.arg_pattern.is_empty() {
raw.arg_path.insert(0, raw.arg_pattern.clone());
}
}
raw.to_args().map_err(From::from)
}
@@ -496,6 +578,11 @@ impl Args {
self.mmap
}
/// Whether ripgrep should be quiet or not.
pub fn quiet(&self) -> bool {
self.quiet
}
/// Create a new printer of individual search results that writes to the
/// writer given.
pub fn printer<W: Terminal + Send>(&self, wtr: W) -> Printer<W> {
@@ -505,7 +592,7 @@ impl Args {
.eol(self.eol)
.heading(self.heading)
.line_per_match(self.line_per_match)
.quiet(self.quiet)
.null(self.null)
.with_filename(self.with_filename);
if let Some(ref rep) = self.replace {
p = p.replace(rep.clone());
@@ -517,14 +604,23 @@ impl Args {
/// to the writer given.
pub fn out(&self) -> Out {
let mut out = Out::new(self.color);
if self.heading && !self.count {
out = out.file_separator(b"".to_vec());
} else if self.before_context > 0 || self.after_context > 0 {
out = out.file_separator(self.context_separator.clone());
if let Some(filesep) = self.file_separator() {
out = out.file_separator(filesep);
}
out
}
/// Retrieve the configured file separator.
pub fn file_separator(&self) -> Option<Vec<u8>> {
if self.heading && !self.count && !self.files_with_matches {
Some(b"".to_vec())
} else if self.before_context > 0 || self.after_context > 0 {
Some(self.context_separator.clone())
} else {
None
}
}
/// Create a new buffer for use with searching.
#[cfg(not(windows))]
pub fn outbuf(&self) -> ColoredTerminal<term::TerminfoTerminal<Vec<u8>>> {
@@ -571,9 +667,11 @@ impl Args {
.after_context(self.after_context)
.before_context(self.before_context)
.count(self.count)
.files_with_matches(self.files_with_matches)
.eol(self.eol)
.line_number(self.line_number)
.invert_match(self.invert_match)
.quiet(self.quiet)
.text(self.text)
}
@@ -589,9 +687,11 @@ impl Args {
) -> BufferSearcher<'a, W> {
BufferSearcher::new(printer, grep, path, buf)
.count(self.count)
.files_with_matches(self.files_with_matches)
.eol(self.eol)
.line_number(self.line_number)
.invert_match(self.invert_match)
.quiet(self.quiet)
.text(self.text)
}
@@ -602,7 +702,7 @@ impl Args {
/// Returns a list of type definitions currently loaded.
pub fn type_defs(&self) -> &[FileTypeDef] {
&self.type_defs
self.types.definitions()
}
/// Returns true if ripgrep should print the type definitions currently
@@ -613,16 +713,27 @@ impl Args {
/// Create a new recursive directory iterator at the path given.
pub fn walker(&self, path: &Path) -> Result<walk::Iter> {
let wd = WalkDir::new(path).follow_links(self.follow);
let mut ig = Ignore::new();
ig.ignore_hidden(!self.hidden);
ig.no_ignore(self.no_ignore);
ig.add_types(self.types.clone());
if !self.no_ignore_parent {
try!(ig.push_parents(path));
// Always follow symlinks for explicitly specified files.
let mut wd = WalkDir::new(path).follow_links(
self.follow || path.is_file());
if let Some(maxdepth) = self.maxdepth {
wd = wd.max_depth(maxdepth);
}
if let Some(ref overrides) = self.glob_overrides {
ig.add_override(overrides.clone());
let mut ig = Ignore::new();
// Only register ignore rules if this is a directory. If it's a file,
// then it was explicitly given by the end user, so we always search
// it.
if path.is_dir() {
ig.ignore_hidden(!self.hidden);
ig.no_ignore(self.no_ignore);
ig.no_ignore_vcs(self.no_ignore_vcs);
ig.add_types(self.types.clone());
if !self.no_ignore_parent {
try!(ig.push_parents(path));
}
if let Some(ref overrides) = self.glob_overrides {
ig.add_override(overrides.clone());
}
}
Ok(walk::Iter::new(ig, wd))
}
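The reworked `walker()` makes two decisions per search root: symlinks are followed when `--follow` is given or when the root is itself an explicitly named file, and ignore rules are only installed when the root is a directory. A condensed sketch of that policy, assuming the same semantics (the `walker_policy` helper is hypothetical):

    use std::path::Path;

    // (follow_symlinks, apply_ignore_rules) for a single search root.
    fn walker_policy(follow_flag: bool, path: &Path) -> (bool, bool) {
        // Explicitly named files are always followed through symlinks...
        let follow_links = follow_flag || path.is_file();
        // ...and never filtered by ignore rules; only directories get them.
        let apply_ignore_rules = path.is_dir();
        (follow_links, apply_ignore_rules)
    }

    fn main() {
        let (follow, ignores) = walker_policy(false, Path::new("Cargo.toml"));
        println!("explicit file: follow = {}, ignores = {}", follow, ignores);
        let (follow, ignores) = walker_policy(false, Path::new("src"));
        println!("directory:     follow = {}, ignores = {}", follow, ignores);
    }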


@@ -4,30 +4,58 @@ from (or to) a terminal. Windows and Unix do this differently, so implement
both here.
*/
#[cfg(unix)]
pub fn stdin_is_readable() -> bool {
use std::fs::File;
use std::os::unix::fs::FileTypeExt;
use std::os::unix::io::{FromRawFd, IntoRawFd};
use libc;
let file = unsafe { File::from_raw_fd(libc::STDIN_FILENO) };
let md = file.metadata();
let _ = file.into_raw_fd();
let ft = match md {
Err(_) => return false,
Ok(md) => md.file_type(),
};
ft.is_file() || ft.is_fifo()
}
#[cfg(windows)]
pub fn stdin_is_readable() -> bool {
// It's not clear how to check this on Windows, so assume stdin is always readable.
true
}
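On Unix, `stdin_is_readable` reports whether stdin is a regular file or FIFO; on Windows it simply assumes yes. Together with the tty check, it drives the default-path choice in the args.rs hunk shown earlier: with no explicit paths, ripgrep searches `./` unless data is actually being piped in, in which case it reads stdin (`-`). A rough sketch of that decision, with a hypothetical `default_paths` helper standing in for the real logic:

    use std::path::PathBuf;

    fn default_paths(
        arg_paths: Vec<PathBuf>,
        stdin_is_tty: bool,
        stdin_is_readable: bool,
        files_mode: bool,
        type_list: bool,
    ) -> Vec<PathBuf> {
        if !arg_paths.is_empty() {
            return arg_paths;
        }
        if stdin_is_tty || files_mode || type_list || !stdin_is_readable {
            // Nothing usable on stdin (or a mode that needs the file tree),
            // so default to searching the current directory.
            vec![PathBuf::from("./")]
        } else {
            // Data is being piped in: search stdin, spelled "-".
            vec![PathBuf::from("-")]
        }
    }

    fn main() {
        assert_eq!(default_paths(vec![], true, false, false, false), vec![PathBuf::from("./")]);
        assert_eq!(default_paths(vec![], false, true, false, false), vec![PathBuf::from("-")]);
    }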
/// Returns true if there is a tty on stdin.
#[cfg(unix)]
pub fn on_stdin() -> bool {
use libc;
0 < unsafe { libc::isatty(libc::STDIN_FILENO) }
}
/// Returns true if there is a tty on stdout.
#[cfg(unix)]
pub fn on_stdout() -> bool {
use libc;
0 < unsafe { libc::isatty(libc::STDOUT_FILENO) }
}
/// Returns true if there is a tty on stdin.
#[cfg(windows)]
pub fn on_stdin() -> bool {
use kernel32;
use winapi;
unsafe {
let fd = winapi::winbase::STD_INPUT_HANDLE;
let mut out = 0;
kernel32::GetConsoleMode(kernel32::GetStdHandle(fd), &mut out) != 0
}
// BUG: https://github.com/BurntSushi/ripgrep/issues/19
// It's not clear to me how to determine whether there is a tty on stdin.
// Checking GetConsoleMode(GetStdHandle(stdin)) != 0 appears to report
// that stdin is a pipe, even if it's not in a cygwin terminal, for
// example.
//
// To fix this, we just assume there is always a tty on stdin. If Windows
// users need to search stdin, they'll have to pass -. Ug.
true
}
/// Returns true if there is a tty on stdout.
#[cfg(windows)]
pub fn on_stdout() -> bool {
use kernel32;


@@ -9,7 +9,7 @@ The motivation for this submodule is performance and portability:
2. We could shell out to a `git` sub-command like ls-files or status, but it
seems better to not rely on the existence of external programs for a search
tool. Besides, we need to implement this logic anyway to support things like
an .rgignore file.
an .ignore file.
The key implementation detail here is that a single gitignore file is compiled
into a single RegexSet, which can be used to report which globs match a
@@ -28,15 +28,15 @@ use std::fs::File;
use std::io::{self, BufRead};
use std::path::{Path, PathBuf};
use globset::{self, Candidate, GlobBuilder, GlobSet, GlobSetBuilder};
use regex;
use glob;
use pathutil::strip_prefix;
use pathutil::{is_file_name, strip_prefix};
/// Represents an error that can occur when parsing a gitignore file.
#[derive(Debug)]
pub enum Error {
Glob(glob::Error),
Glob(globset::Error),
Regex(regex::Error),
Io(io::Error),
}
@@ -61,8 +61,8 @@ impl fmt::Display for Error {
}
}
impl From<glob::Error> for Error {
fn from(err: glob::Error) -> Error {
impl From<globset::Error> for Error {
fn from(err: globset::Error) -> Error {
Error::Glob(err)
}
}
@@ -82,7 +82,7 @@ impl From<io::Error> for Error {
/// Gitignore is a matcher for the glob patterns in a single gitignore file.
#[derive(Clone, Debug)]
pub struct Gitignore {
set: glob::Set,
set: GlobSet,
root: PathBuf,
patterns: Vec<Pattern>,
num_ignores: u64,
@@ -115,7 +115,17 @@ impl Gitignore {
if let Some(p) = strip_prefix("./", path) {
path = p;
}
if let Some(p) = strip_prefix(&self.root, path) {
// Strip any common prefix between the candidate path and the root
// of the gitignore, to make sure we get relative matching right.
// BUT, a file name might not have any directory components to it,
// in which case, we don't want to accidentally strip any part of the
// file name.
if !is_file_name(path) {
if let Some(p) = strip_prefix(&self.root, path) {
path = p;
}
}
if let Some(p) = strip_prefix("/", path) {
path = p;
}
self.matched_stripped(path, is_dir)
@@ -130,7 +140,8 @@ impl Gitignore {
};
MATCHES.with(|matches| {
let mut matches = matches.borrow_mut();
self.set.matches_into(path, &mut *matches);
let candidate = Candidate::new(path);
self.set.matches_candidate_into(&candidate, &mut *matches);
for &i in matches.iter().rev() {
let pat = &self.patterns[i];
if !pat.only_dir || is_dir {
@@ -197,7 +208,7 @@ impl<'a> Match<'a> {
/// GitignoreBuilder constructs a matcher for a single set of globs from a
/// .gitignore file.
pub struct GitignoreBuilder {
builder: glob::SetBuilder,
builder: GlobSetBuilder,
root: PathBuf,
patterns: Vec<Pattern>,
}
@@ -225,9 +236,10 @@ impl GitignoreBuilder {
/// The path given should be the path at which the globs for this gitignore
/// file should be matched.
pub fn new<P: AsRef<Path>>(root: P) -> GitignoreBuilder {
let root = strip_prefix("./", root.as_ref()).unwrap_or(root.as_ref());
GitignoreBuilder {
builder: glob::SetBuilder::new(),
root: root.as_ref().to_path_buf(),
builder: GlobSetBuilder::new(),
root: root.to_path_buf(),
patterns: vec![],
}
}
@@ -250,8 +262,19 @@ impl GitignoreBuilder {
/// Add each pattern line from the file path given.
pub fn add_path<P: AsRef<Path>>(&mut self, path: P) -> Result<(), Error> {
let rdr = io::BufReader::new(try!(File::open(&path)));
for line in rdr.lines() {
try!(self.add(&path, &try!(line)));
debug!("gitignore: {}", path.as_ref().display());
for (i, line) in rdr.lines().enumerate() {
let line = match line {
Ok(line) => line,
Err(err) => {
debug!("error reading line {} in {}: {}",
i, path.as_ref().display(), err);
continue;
}
};
if let Err(err) = self.add(&path, &line) {
debug!("error adding gitignore pattern: '{}': {}", line, err);
}
}
Ok(())
}
@@ -272,6 +295,12 @@ impl GitignoreBuilder {
from: P,
mut line: &str,
) -> Result<(), Error> {
if line.starts_with("#") {
return Ok(());
}
if !line.ends_with("\\ ") {
line = line.trim_right();
}
if line.is_empty() {
return Ok(());
}
@@ -282,34 +311,24 @@ impl GitignoreBuilder {
whitelist: false,
only_dir: false,
};
let mut opts = glob::MatchOptions::default();
let mut literal_separator = false;
let has_slash = line.chars().any(|c| c == '/');
// If the line starts with an escaped '!', then remove the escape.
// Otherwise, if it starts with an unescaped '!', then this is a
// whitelist pattern.
match line.chars().nth(0) {
Some('#') => return Ok(()),
Some('\\') => {
match line.chars().nth(1) {
Some('!') | Some('#') => {
line = &line[1..];
}
_ => {}
}
}
Some('!') => {
let is_absolute = line.chars().nth(0).unwrap() == '/';
if line.starts_with("\\!") || line.starts_with("\\#") {
line = &line[1..];
} else {
if line.starts_with("!") {
pat.whitelist = true;
line = &line[1..];
}
Some('/') => {
if line.starts_with("/") {
// `man gitignore` says that if a glob starts with a slash,
// then the glob can only match the beginning of a path
// (relative to the location of gitignore). We achieve this by
// simply banning wildcards from matching /.
opts.require_literal_separator = true;
literal_separator = true;
line = &line[1..];
}
_ => {}
}
// If it ends with a slash, then this should only match directories,
// but the slash should otherwise not be used while globbing.
@@ -320,16 +339,31 @@ impl GitignoreBuilder {
}
}
// If there is a literal slash, then we note that so that globbing
// doesn't let wildcards match slashes. Otherwise, we need to let
// the pattern match anywhere, so we add a `**/` prefix to achieve
// that behavior.
// doesn't let wildcards match slashes.
pat.pat = line.to_string();
if has_slash {
opts.require_literal_separator = true;
} else {
pat.pat = format!("**/{}", pat.pat);
literal_separator = true;
}
try!(self.builder.add_with(&pat.pat, &opts));
// If there was a leading slash, then this is a pattern that must
// match the entire path name. Otherwise, we should let it match
// anywhere, so use a **/ prefix.
if !is_absolute {
// ... but only if we don't already have a **/ prefix.
if !pat.pat.starts_with("**/") {
pat.pat = format!("**/{}", pat.pat);
}
}
// If the pattern ends with `/**`, then we should only match everything
// inside a directory, but not the directory itself. Standard globs
// will match the directory. So we add `/*` to force the issue.
if pat.pat.ends_with("/**") {
pat.pat = format!("{}/*", pat.pat);
}
let parsed = try!(
GlobBuilder::new(&pat.pat)
.literal_separator(literal_separator)
.build());
self.builder.add(parsed);
self.patterns.push(pat);
Ok(())
}
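The rewritten `add()` turns one gitignore line into a glob plus two flags (whitelist, directory-only): comments and blank lines are dropped, trailing whitespace is trimmed unless escaped, a leading `\!` or `\#` is unescaped, `!` marks a whitelist entry, a leading `/` anchors the pattern to the gitignore's directory, a trailing `/` restricts it to directories, unanchored patterns gain a `**/` prefix, and a trailing `/**` gains `/*` so the directory itself is not matched. The following is a simplified, self-contained sketch of that translation; it ignores the `literal_separator` bookkeeping and is not ripgrep's code:

    // Returns (glob, whitelist, only_dir), or None for comments/blank lines.
    fn translate(mut line: &str) -> Option<(String, bool, bool)> {
        if line.starts_with('#') {
            return None; // comment
        }
        if !line.ends_with("\\ ") {
            line = line.trim_end(); // trailing whitespace, unless escaped
        }
        if line.is_empty() {
            return None;
        }
        let mut whitelist = false;
        let mut only_dir = false;
        let is_absolute = line.starts_with('/');
        if line.starts_with("\\!") || line.starts_with("\\#") {
            line = &line[1..]; // unescape a literal leading '!' or '#'
        } else if line.starts_with('!') {
            whitelist = true;
            line = &line[1..];
        }
        let mut glob = line.to_string();
        if glob.starts_with('/') {
            // A leading slash anchors the glob to the gitignore's directory.
            glob.remove(0);
        }
        if glob.ends_with('/') {
            only_dir = true;
            glob.pop();
        }
        if !is_absolute && !glob.starts_with("**/") {
            // Unanchored patterns may match at any depth.
            glob = format!("**/{}", glob);
        }
        if glob.ends_with("/**") {
            // `foo/**` should match the directory's contents, not `foo` itself.
            glob = format!("{}/*", glob);
        }
        Some((glob, whitelist, only_dir))
    }

    fn main() {
        assert_eq!(translate("target/"), Some(("**/target".to_string(), false, true)));
        assert_eq!(translate("/src/*.rs"), Some(("src/*.rs".to_string(), false, false)));
        assert_eq!(translate("!keep.txt"), Some(("**/keep.txt".to_string(), true, false)));
        assert_eq!(translate("# comment"), None);
    }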
@@ -393,10 +427,14 @@ mod tests {
ignored!(ig24, ROOT, "target", "grep/target");
ignored!(ig25, ROOT, "Cargo.lock", "./tabwriter-bin/Cargo.lock");
ignored!(ig26, ROOT, "/foo/bar/baz", "./foo/bar/baz");
ignored!(ig27, ROOT, "foo/", "xyz/foo", true);
ignored!(ig28, ROOT, "src/*.rs", "src/grep/src/main.rs");
ignored!(ig29, "./src", "/llvm/", "./src/llvm", true);
ignored!(ig30, ROOT, "node_modules/ ", "node_modules", true);
not_ignored!(ignot1, ROOT, "amonths", "months");
not_ignored!(ignot2, ROOT, "monthsa", "months");
not_ignored!(ignot3, ROOT, "src/*.rs", "src/grep/src/main.rs");
not_ignored!(ignot3, ROOT, "/src/*.rs", "src/grep/src/main.rs");
not_ignored!(ignot4, ROOT, "/*.c", "mozilla-sha1/sha1.c");
not_ignored!(ignot5, ROOT, "/src/*.rs", "src/grep/src/main.rs");
not_ignored!(ignot6, ROOT, "*.rs\n!src/main.rs", "src/main.rs");
@@ -406,4 +444,14 @@ mod tests {
not_ignored!(ignot10, ROOT, "**/foo/bar", "foo/src/bar");
not_ignored!(ignot11, ROOT, "#foo", "#foo");
not_ignored!(ignot12, ROOT, "\n\n\n", "foo");
not_ignored!(ignot13, ROOT, "foo/**", "foo", true);
not_ignored!(
ignot14, "./third_party/protobuf", "m4/ltoptions.m4",
"./third_party/protobuf/csharp/src/packages/repositories.config");
// See: https://github.com/BurntSushi/ripgrep/issues/106
#[test]
fn regression_106() {
Gitignore::from_str("/", " ").unwrap();
}
}

File diff suppressed because it is too large.


@@ -5,7 +5,7 @@ whether a *single* file path should be searched or not.
In general, there are two ways to ignore a particular file:
1. Specify an ignore rule in some "global" configuration, such as a
$HOME/.rgignore or on the command line.
$HOME/.ignore or on the command line.
2. A specific ignore file (like .gitignore) found during directory traversal.
The `IgnoreDir` type handles ignore patterns for any one particular directory
@@ -14,16 +14,18 @@ of `IgnoreDir`s for use during directory traversal.
*/
use std::error::Error as StdError;
use std::ffi::OsString;
use std::fmt;
use std::io;
use std::path::{Path, PathBuf};
use gitignore::{self, Gitignore, GitignoreBuilder, Match, Pattern};
use pathutil::is_hidden;
use pathutil::{file_name, is_hidden};
use types::Types;
const IGNORE_NAMES: &'static [&'static str] = &[
".gitignore",
".ignore",
".rgignore",
];
@@ -77,7 +79,9 @@ impl From<gitignore::Error> for Error {
pub struct Ignore {
/// A stack of ignore patterns at each directory level of traversal.
/// A directory that contributes no ignore patterns is `None`.
stack: Vec<Option<IgnoreDir>>,
stack: Vec<IgnoreDir>,
/// A stack of parent directories above the root of the current search.
parent_stack: Vec<IgnoreDir>,
/// A set of override globs that are always checked first. A match (whether
/// it's whitelist or blacklist) trumps anything in stack.
overrides: Overrides,
@@ -85,9 +89,11 @@ pub struct Ignore {
types: Types,
/// Whether to ignore hidden files or not.
ignore_hidden: bool,
/// When true, don't look at .gitignore or .agignore files for ignore
/// When true, don't look at .gitignore or .ignore files for ignore
/// rules.
no_ignore: bool,
/// When true, don't look at .gitignore files for ignore rules.
no_ignore_vcs: bool,
}
impl Ignore {
@@ -95,10 +101,12 @@ impl Ignore {
pub fn new() -> Ignore {
Ignore {
stack: vec![],
parent_stack: vec![],
overrides: Overrides::new(None),
types: Types::empty(),
ignore_hidden: true,
no_ignore: false,
no_ignore_vcs: true,
}
}
@@ -114,6 +122,12 @@ impl Ignore {
self
}
/// When set, VCS ignore files are ignored.
pub fn no_ignore_vcs(&mut self, yes: bool) -> &mut Ignore {
self.no_ignore_vcs = yes;
self
}
/// Add a set of globs that overrides all other match logic.
pub fn add_override(&mut self, gi: Gitignore) -> &mut Ignore {
self.overrides = Overrides::new(Some(gi));
@@ -138,10 +152,13 @@ impl Ignore {
let mut path = &*path;
let mut saw_git = path.join(".git").is_dir();
let mut ignore_names = IGNORE_NAMES.to_vec();
if self.no_ignore_vcs {
ignore_names.retain(|&name| name != ".gitignore");
}
let mut ignore_dir_results = vec![];
while let Some(parent) = path.parent() {
if self.no_ignore {
ignore_dir_results.push(Ok(None));
ignore_dir_results.push(Ok(IgnoreDir::empty(parent)));
} else {
if saw_git {
ignore_names.retain(|&name| name != ".gitignore");
@@ -156,7 +173,7 @@ impl Ignore {
}
for ignore_dir_result in ignore_dir_results.into_iter().rev() {
try!(self.push_ignore_dir(ignore_dir_result));
self.parent_stack.push(try!(ignore_dir_result));
}
Ok(())
}
@@ -167,10 +184,13 @@ impl Ignore {
/// stack (and therefore should be popped).
pub fn push<P: AsRef<Path>>(&mut self, path: P) -> Result<(), Error> {
if self.no_ignore {
self.stack.push(None);
return Ok(());
self.stack.push(IgnoreDir::empty(path));
Ok(())
} else if self.no_ignore_vcs {
self.push_ignore_dir(IgnoreDir::without_vcs(path))
} else {
self.push_ignore_dir(IgnoreDir::new(path))
}
self.push_ignore_dir(IgnoreDir::new(path))
}
/// Pushes the result of building a directory matcher on to the stack.
@@ -178,7 +198,7 @@ impl Ignore {
/// If the result given contains an error, then it is returned.
pub fn push_ignore_dir(
&mut self,
result: Result<Option<IgnoreDir>, Error>,
result: Result<IgnoreDir, Error>,
) -> Result<(), Error> {
match result {
Ok(id) => {
@@ -187,7 +207,7 @@ impl Ignore {
}
Err(err) => {
// Don't leave the stack in an inconsistent state.
self.stack.push(None);
self.stack.push(IgnoreDir::empty("error"));
Err(err)
}
}
@@ -207,12 +227,9 @@ impl Ignore {
if let Some(is_ignored) = self.ignore_match(path, mat) {
return is_ignored;
}
if self.ignore_hidden && is_hidden(&path) {
debug!("{} ignored because it is hidden", path.display());
return true;
}
let mut whitelisted = false;
if !self.no_ignore {
for id in self.stack.iter().rev().filter_map(|id| id.as_ref()) {
for id in self.stack.iter().rev() {
let mat = id.matched(path, is_dir);
if let Some(is_ignored) = self.ignore_match(path, mat) {
if is_ignored {
@@ -220,13 +237,43 @@ impl Ignore {
}
// If this path is whitelisted by an ignore, then
// fallthrough and let the file type matcher have a say.
whitelisted = true;
break;
}
}
// If the file has been whitelisted, then we have to stop checking
// parent directories. The only thing that can override a whitelist
// at this point is a type filter.
if !whitelisted {
let mut path = path.to_path_buf();
for id in self.parent_stack.iter().rev() {
if let Some(ref dirname) = id.name {
path = Path::new(dirname).join(path);
}
let mat = id.matched(&*path, is_dir);
if let Some(is_ignored) = self.ignore_match(&*path, mat) {
if is_ignored {
return true;
}
// If this path is whitelisted by an ignore, then
// fallthrough and let the file type matcher have a
// say.
whitelisted = true;
break;
}
}
}
}
let mat = self.types.matched(path, is_dir);
if let Some(is_ignored) = self.ignore_match(path, mat) {
return is_ignored;
if is_ignored {
return true;
}
whitelisted = true;
}
if !whitelisted && self.ignore_hidden && is_hidden(&path) {
debug!("{} ignored because it is hidden", path.display());
return true;
}
false
}
@@ -256,9 +303,12 @@ impl Ignore {
/// IgnoreDir represents a set of ignore patterns retrieved from a single
/// directory.
#[derive(Debug)]
pub struct IgnoreDir {
/// The path to this directory as given.
path: PathBuf,
/// The directory name, if one exists.
name: Option<OsString>,
/// A single accumulation of glob patterns for this directory, matched
/// using gitignore semantics.
///
@@ -272,13 +322,27 @@ pub struct IgnoreDir {
impl IgnoreDir {
/// Create a new matcher for the given directory.
///
/// If no ignore glob patterns could be found in the directory then `None`
/// is returned.
pub fn new<P: AsRef<Path>>(path: P) -> Result<Option<IgnoreDir>, Error> {
pub fn new<P: AsRef<Path>>(path: P) -> Result<IgnoreDir, Error> {
IgnoreDir::with_ignore_names(path, IGNORE_NAMES.iter())
}
/// Create a new matcher for the given directory.
///
/// Don't respect VCS ignore files.
pub fn without_vcs<P: AsRef<Path>>(path: P) -> Result<IgnoreDir, Error> {
let names = IGNORE_NAMES.iter().filter(|name| **name != ".gitignore");
IgnoreDir::with_ignore_names(path, names)
}
/// Create a new IgnoreDir that never matches anything with the given path.
pub fn empty<P: AsRef<Path>>(path: P) -> IgnoreDir {
IgnoreDir {
path: path.as_ref().to_path_buf(),
name: file_name(path.as_ref()).map(|s| s.to_os_string()),
gi: None,
}
}
/// Create a new matcher for the given directory using only the ignore
/// patterns found in the file names given.
///
@@ -291,12 +355,9 @@ impl IgnoreDir {
pub fn with_ignore_names<P: AsRef<Path>, S, I>(
path: P,
names: I,
) -> Result<Option<IgnoreDir>, Error>
) -> Result<IgnoreDir, Error>
where P: AsRef<Path>, S: AsRef<str>, I: Iterator<Item=S> {
let mut id = IgnoreDir {
path: path.as_ref().to_path_buf(),
gi: None,
};
let mut id = IgnoreDir::empty(path);
let mut ok = false;
let mut builder = GitignoreBuilder::new(&id.path);
// The ordering here is important. Later globs have higher precedence.
@@ -304,11 +365,10 @@ impl IgnoreDir {
ok = builder.add_path(id.path.join(name.as_ref())).is_ok() || ok;
}
if !ok {
Ok(None)
} else {
id.gi = Some(try!(builder.build()));
Ok(Some(id))
return Ok(id);
}
id.gi = Some(try!(builder.build()));
Ok(id)
}
/// Returns true if and only if the given file path should be ignored
@@ -359,10 +419,6 @@ impl Overrides {
/// Match::None (and interpreting non-matches as ignored) unless is_dir
/// is true.
pub fn matched<P: AsRef<Path>>(&self, path: P, is_dir: bool) -> Match {
// File types don't apply to directories.
if is_dir {
return Match::None;
}
let path = path.as_ref();
self.gi.as_ref()
.map(|gi| {
@@ -394,6 +450,9 @@ mod tests {
let gi = builder.build().unwrap();
let id = IgnoreDir {
path: Path::new($root).to_path_buf(),
name: Path::new($root).file_name().map(|s| {
s.to_os_string()
}),
gi: Some(gi),
};
assert!(id.matched($path, false).is_ignored());
@@ -411,6 +470,9 @@ mod tests {
let gi = builder.build().unwrap();
let id = IgnoreDir {
path: Path::new($root).to_path_buf(),
name: Path::new($root).file_name().map(|s| {
s.to_os_string()
}),
gi: Some(gi),
};
assert!(!id.matched($path, false).is_ignored());


@@ -1,7 +1,7 @@
extern crate deque;
extern crate docopt;
extern crate env_logger;
extern crate fnv;
extern crate globset;
extern crate grep;
#[cfg(windows)]
extern crate kernel32;
@@ -23,11 +23,13 @@ extern crate winapi;
use std::error::Error;
use std::fs::File;
use std::io;
use std::path::{Path, PathBuf};
use std::path::Path;
use std::process;
use std::result;
use std::sync::{Arc, Mutex};
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;
use std::cmp;
use deque::{Stealer, Stolen};
use grep::Grep;
@@ -59,7 +61,6 @@ macro_rules! eprintln {
mod args;
mod atty;
mod gitignore;
mod glob;
mod ignore;
mod out;
mod pathutil;
@@ -87,24 +88,29 @@ fn main() {
fn run(args: Args) -> Result<u64> {
let args = Arc::new(args);
let paths = args.paths();
let threads = cmp::max(1, args.threads() - 1);
let isone =
paths.len() == 1 && (paths[0] == Path::new("-") || paths[0].is_file());
if args.files() {
return run_files(args.clone());
}
if args.type_list() {
return run_types(args.clone());
}
if paths.len() == 1 && (paths[0] == Path::new("-") || paths[0].is_file()) {
return run_one(args.clone(), &paths[0]);
if threads == 1 || isone {
return run_one_thread(args.clone());
}
let out = Arc::new(Mutex::new(args.out()));
let quiet_matched = QuietMatched::new(args.quiet());
let mut workers = vec![];
let workq = {
let (workq, stealer) = deque::new();
for _ in 0..args.threads() {
for _ in 0..threads {
let worker = MultiWorker {
chan_work: stealer.clone(),
quiet_matched: quiet_matched.clone(),
out: out.clone(),
outbuf: Some(args.outbuf()),
worker: Worker {
@@ -120,11 +126,17 @@ fn run(args: Args) -> Result<u64> {
};
let mut paths_searched: u64 = 0;
for p in paths {
if quiet_matched.has_match() {
break;
}
if p == Path::new("-") {
paths_searched += 1;
workq.push(Work::Stdin);
} else {
for ent in try!(args.walker(p)) {
if quiet_matched.has_match() {
break;
}
paths_searched += 1;
workq.push(Work::File(ent));
}
@@ -145,22 +157,58 @@ fn run(args: Args) -> Result<u64> {
Ok(match_count)
}
fn run_one(args: Arc<Args>, path: &Path) -> Result<u64> {
fn run_one_thread(args: Arc<Args>) -> Result<u64> {
let mut worker = Worker {
args: args.clone(),
inpbuf: args.input_buffer(),
grep: args.grep(),
match_count: 0,
};
let term = args.stdout();
let mut printer = args.printer(term);
let work =
if path == Path::new("-") {
WorkReady::Stdin
let paths = args.paths();
let mut term = args.stdout();
let mut paths_searched: u64 = 0;
for p in paths {
if args.quiet() && worker.match_count > 0 {
break;
}
if p == Path::new("-") {
paths_searched += 1;
let mut printer = args.printer(&mut term);
if worker.match_count > 0 {
if let Some(sep) = args.file_separator() {
printer = printer.file_separator(sep);
}
}
worker.do_work(&mut printer, WorkReady::Stdin);
} else {
WorkReady::PathFile(path.to_path_buf(), try!(File::open(path)))
};
worker.do_work(&mut printer, work);
for ent in try!(args.walker(p)) {
paths_searched += 1;
let mut printer = args.printer(&mut term);
if worker.match_count > 0 {
if args.quiet() {
break;
}
if let Some(sep) = args.file_separator() {
printer = printer.file_separator(sep);
}
}
let file = match File::open(ent.path()) {
Ok(file) => file,
Err(err) => {
eprintln!("{}: {}", ent.path().display(), err);
continue;
}
};
worker.do_work(&mut printer, WorkReady::DirFile(ent, file));
}
}
}
if !paths.is_empty() && paths_searched == 0 {
eprintln!("No files were searched, which means ripgrep probably \
applied a filter you didn't expect. \
Try running again with --debug.");
}
Ok(worker.match_count)
}
@@ -202,11 +250,11 @@ enum Work {
enum WorkReady {
Stdin,
DirFile(DirEntry, File),
PathFile(PathBuf, File),
}
struct MultiWorker {
chan_work: Stealer<Work>,
quiet_matched: QuietMatched,
out: Arc<Mutex<Out>>,
#[cfg(not(windows))]
outbuf: Option<ColoredTerminal<term::TerminfoTerminal<Vec<u8>>>>,
@@ -225,6 +273,9 @@ struct Worker {
impl MultiWorker {
fn run(mut self) -> u64 {
loop {
if self.quiet_matched.has_match() {
break;
}
let work = match self.chan_work.steal() {
Stolen::Empty | Stolen::Abort => continue,
Stolen::Data(Work::Quit) => break,
@@ -243,6 +294,9 @@ impl MultiWorker {
outbuf.clear();
let mut printer = self.worker.args.printer(outbuf);
self.worker.do_work(&mut printer, work);
if self.quiet_matched.set_match(self.worker.match_count > 0) {
break;
}
let outbuf = printer.into_inner();
if !outbuf.get_ref().is_empty() {
let mut out = self.out.lock().unwrap();
@@ -277,17 +331,6 @@ impl Worker {
self.search(printer, path, file)
}
}
WorkReady::PathFile(path, file) => {
let mut path = &*path;
if let Some(p) = strip_prefix("./", path) {
path = p;
}
if self.args.mmap() {
self.search_mmap(printer, path, &file)
} else {
self.search(printer, path, file)
}
}
};
match result {
Ok(count) => {
@@ -322,7 +365,11 @@ impl Worker {
) -> Result<u64> {
if try!(file.metadata()).len() == 0 {
// Opening a memory map with an empty file results in an error.
return Ok(0);
// However, this may not actually be an empty file! For example,
// /proc/cpuinfo reports itself as an empty file, but it can
// produce data when it's read from. Therefore, we fall back to
// regular read calls.
return self.search(printer, path, file);
}
let mmap = try!(Mmap::open(file, Protection::Read));
Ok(self.args.searcher_buffer(
@@ -333,3 +380,28 @@ impl Worker {
).run())
}
}
#[derive(Clone, Debug)]
struct QuietMatched(Arc<Option<AtomicBool>>);
impl QuietMatched {
fn new(quiet: bool) -> QuietMatched {
let atomic = if quiet { Some(AtomicBool::new(false)) } else { None };
QuietMatched(Arc::new(atomic))
}
fn has_match(&self) -> bool {
match *self.0 {
None => false,
Some(ref matched) => matched.load(Ordering::SeqCst),
}
}
fn set_match(&self, yes: bool) -> bool {
match *self.0 {
None => false,
Some(_) if !yes => false,
Some(ref m) => { m.store(true, Ordering::SeqCst); true }
}
}
}
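`QuietMatched` wraps an `AtomicBool` that only exists in `--quiet` mode: the first worker to find a match sets it, and every worker (and the work producer) polls it to stop early. A minimal sketch of the same idea with plain std types (simplified; not ripgrep's code):

    use std::sync::Arc;
    use std::sync::atomic::{AtomicBool, Ordering};
    use std::thread;

    fn main() {
        // Shared "a match was found" flag, analogous to QuietMatched in --quiet mode.
        let matched = Arc::new(AtomicBool::new(false));
        let mut handles = Vec::new();
        for id in 0..4u32 {
            let matched = matched.clone();
            handles.push(thread::spawn(move || {
                for unit in 0..1000u32 {
                    // Every worker checks the flag before doing more work.
                    if matched.load(Ordering::SeqCst) {
                        return;
                    }
                    // Pretend worker 2 finds a match on its 10th unit of work.
                    if id == 2 && unit == 10 {
                        matched.store(true, Ordering::SeqCst);
                        return;
                    }
                }
            }));
        }
        for handle in handles {
            handle.join().unwrap();
        }
        assert!(matched.load(Ordering::SeqCst));
    }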


@@ -48,8 +48,6 @@ impl Out {
/// If set, the separator is printed between matches from different files.
/// By default, no separator is printed.
///
/// If sep is empty, then no file separator is printed.
pub fn file_separator(mut self, sep: Vec<u8>) -> Out {
self.file_separator = Some(sep);
self
@@ -317,3 +315,60 @@ impl<T: Terminal + Send> term::Terminal for ColoredTerminal<T> {
}
}
}
impl<'a, T: Terminal + Send> term::Terminal for &'a mut ColoredTerminal<T> {
type Output = T::Output;
fn fg(&mut self, fg: term::color::Color) -> term::Result<()> {
(**self).fg(fg)
}
fn bg(&mut self, bg: term::color::Color) -> term::Result<()> {
(**self).bg(bg)
}
fn attr(&mut self, attr: term::Attr) -> term::Result<()> {
(**self).attr(attr)
}
fn supports_attr(&self, attr: term::Attr) -> bool {
(**self).supports_attr(attr)
}
fn reset(&mut self) -> term::Result<()> {
(**self).reset()
}
fn supports_reset(&self) -> bool {
(**self).supports_reset()
}
fn supports_color(&self) -> bool {
(**self).supports_color()
}
fn cursor_up(&mut self) -> term::Result<()> {
(**self).cursor_up()
}
fn delete_line(&mut self) -> term::Result<()> {
(**self).delete_line()
}
fn carriage_return(&mut self) -> term::Result<()> {
(**self).carriage_return()
}
fn get_ref(&self) -> &Self::Output {
(**self).get_ref()
}
fn get_mut(&mut self) -> &mut Self::Output {
(**self).get_mut()
}
fn into_inner(self) -> Self::Output {
// Good golly miss molly...
unimplemented!()
}
}


@@ -98,3 +98,21 @@ pub fn is_hidden<P: AsRef<Path>>(path: P) -> bool {
false
}
}
/// Returns true if this file path is just a file name, i.e., its parent is
/// the empty string.
#[cfg(unix)]
pub fn is_file_name<P: AsRef<Path>>(path: P) -> bool {
use std::os::unix::ffi::OsStrExt;
use memchr::memchr;
let path = path.as_ref().as_os_str().as_bytes();
memchr(b'/', path).is_none()
}
/// Returns true if this file path is just a file name, i.e., its parent is
/// the empty string.
#[cfg(not(unix))]
pub fn is_file_name<P: AsRef<Path>>(path: P) -> bool {
path.as_ref().parent().map(|p| p.as_os_str().is_empty()).unwrap_or(false)
}
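Both versions answer the same question; the portable definition (essentially the non-Unix branch above) can be exercised directly to see the intended behavior:

    use std::path::Path;

    // A path is "just a file name" when its parent component is the empty string.
    fn is_file_name(path: &Path) -> bool {
        path.parent().map(|p| p.as_os_str().is_empty()).unwrap_or(false)
    }

    fn main() {
        assert!(is_file_name(Path::new("foo.txt")));
        assert!(!is_file_name(Path::new("src/foo.txt")));
        assert!(!is_file_name(Path::new("/foo.txt")));
    }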


@@ -4,6 +4,7 @@ use regex::bytes::Regex;
use term::{Attr, Terminal};
use term::color;
use pathutil::strip_prefix;
use types::FileTypeDef;
/// Printer encapsulates all output logic for searching.
@@ -24,18 +25,49 @@ pub struct Printer<W> {
/// printed via the match directly, but occasionally we need to insert them
/// ourselves (for example, to print a context separator).
eol: u8,
/// A file separator to show before any matches are printed.
file_separator: Option<Vec<u8>>,
/// Whether to show file name as a heading or not.
///
/// N.B. If with_filename is false, then this setting has no effect.
heading: bool,
/// Whether to show every match on its own line.
line_per_match: bool,
/// Whether to suppress all output.
quiet: bool,
/// Whether to print NUL bytes after a file path instead of new lines
/// or `:`.
null: bool,
/// A string to use as a replacement of each match in a matching line.
replace: Option<Vec<u8>>,
/// Whether to prefix each match with the corresponding file name.
with_filename: bool,
/// The choice of colors.
color_choice: ColorChoice
}
struct ColorChoice {
matched_line: color::Color,
heading: color::Color,
line_number: color::Color
}
impl ColorChoice {
#[cfg(unix)]
pub fn new() -> ColorChoice {
ColorChoice {
matched_line: color::RED,
heading: color::GREEN,
line_number: color::BLUE
}
}
#[cfg(not(unix))]
pub fn new() -> ColorChoice {
ColorChoice {
matched_line: color::BRIGHT_RED,
heading: color::BRIGHT_GREEN,
line_number: color::BRIGHT_BLUE
}
}
}
impl<W: Terminal + Send> Printer<W> {
@@ -47,11 +79,13 @@ impl<W: Terminal + Send> Printer<W> {
column: false,
context_separator: "--".to_string().into_bytes(),
eol: b'\n',
file_separator: None,
heading: false,
line_per_match: false,
quiet: false,
null: false,
replace: None,
with_filename: false,
color_choice: ColorChoice::new()
}
}
@@ -74,6 +108,13 @@ impl<W: Terminal + Send> Printer<W> {
self
}
/// If set, the separator is printed before any matches. By default, no
/// separator is printed.
pub fn file_separator(mut self, sep: Vec<u8>) -> Printer<W> {
self.file_separator = Some(sep);
self
}
/// Whether to show file name as a heading or not.
///
/// N.B. If with_filename is false, then this setting has no effect.
@@ -88,9 +129,10 @@ impl<W: Terminal + Send> Printer<W> {
self
}
/// When set, all output is suppressed.
pub fn quiet(mut self, yes: bool) -> Printer<W> {
self.quiet = yes;
/// Whether to cause NUL bytes to follow file paths instead of other
/// visual separators (like `:`, `-` and `\n`).
pub fn null(mut self, yes: bool) -> Printer<W> {
self.null = yes;
self
}
@@ -138,15 +180,24 @@ impl<W: Terminal + Send> Printer<W> {
/// Prints the given path.
pub fn path<P: AsRef<Path>>(&mut self, path: P) {
self.write(path.as_ref().to_string_lossy().as_bytes());
self.write_eol();
let path = strip_prefix("./", path.as_ref()).unwrap_or(path.as_ref());
self.write_path(path);
if self.null {
self.write(b"\x00");
} else {
self.write_eol();
}
}
/// Prints the given path and a count of the number of matches found.
pub fn path_count<P: AsRef<Path>>(&mut self, path: P, count: u64) {
if self.with_filename {
self.write(path.as_ref().to_string_lossy().as_bytes());
self.write(b":");
self.write_path(path);
if self.null {
self.write(b"\x00");
} else {
self.write(b":");
}
}
self.write(count.to_string().as_bytes());
self.write_eol();
@@ -155,9 +206,6 @@ impl<W: Terminal + Send> Printer<W> {
/// Prints the context separator.
pub fn context_separate(&mut self) {
// N.B. We can't use `write` here because of borrowing restrictions.
if self.quiet {
return;
}
if self.context_separator.is_empty() {
return;
}
@@ -179,7 +227,7 @@ impl<W: Terminal + Send> Printer<W> {
let column =
if self.column {
Some(re.find(&buf[start..end])
.map(|(s, _)| s + 1).unwrap_or(0) as u64)
.map(|(s, _)| s).unwrap_or(0) as u64)
} else {
None
};
@@ -204,16 +252,16 @@ impl<W: Terminal + Send> Printer<W> {
column: Option<u64>,
) {
if self.heading && self.with_filename && !self.has_printed {
self.write_file_sep();
self.write_heading(path.as_ref());
} else if !self.heading && self.with_filename {
self.write(path.as_ref().to_string_lossy().as_bytes());
self.write(b":");
self.write_non_heading_path(path.as_ref());
}
if let Some(line_number) = line_number {
self.line_number(line_number, b':');
}
if let Some(c) = column {
self.write(c.to_string().as_bytes());
self.write((c + 1).to_string().as_bytes());
self.write(b":");
}
if self.replace.is_some() {
@@ -236,7 +284,7 @@ impl<W: Terminal + Send> Printer<W> {
let mut last_written = 0;
for (s, e) in re.find_iter(buf) {
self.write(&buf[last_written..s]);
let _ = self.wtr.fg(color::BRIGHT_RED);
let _ = self.wtr.fg(self.color_choice.matched_line);
let _ = self.wtr.attr(Attr::Bold);
self.write(&buf[s..e]);
let _ = self.wtr.reset();
@@ -254,10 +302,15 @@ impl<W: Terminal + Send> Printer<W> {
line_number: Option<u64>,
) {
if self.heading && self.with_filename && !self.has_printed {
self.write_file_sep();
self.write_heading(path.as_ref());
} else if !self.heading && self.with_filename {
self.write(path.as_ref().to_string_lossy().as_bytes());
self.write(b"-");
self.write_path(path.as_ref());
if self.null {
self.write(b"\x00");
} else {
self.write(b"-");
}
}
if let Some(line_number) = line_number {
self.line_number(line_number, b'-');
@@ -270,19 +323,39 @@ impl<W: Terminal + Send> Printer<W> {
fn write_heading<P: AsRef<Path>>(&mut self, path: P) {
if self.wtr.supports_color() {
let _ = self.wtr.fg(color::BRIGHT_GREEN);
let _ = self.wtr.fg(self.color_choice.heading);
let _ = self.wtr.attr(Attr::Bold);
}
self.write(path.as_ref().to_string_lossy().as_bytes());
self.write_eol();
self.write_path(path.as_ref());
if self.null {
self.write(b"\x00");
} else {
self.write_eol();
}
if self.wtr.supports_color() {
let _ = self.wtr.reset();
}
}
fn write_non_heading_path<P: AsRef<Path>>(&mut self, path: P) {
if self.wtr.supports_color() {
let _ = self.wtr.fg(self.color_choice.heading);
let _ = self.wtr.attr(Attr::Bold);
}
self.write_path(path.as_ref());
if self.wtr.supports_color() {
let _ = self.wtr.reset();
}
if self.null {
self.write(b"\x00");
} else {
self.write(b":");
}
}
fn line_number(&mut self, n: u64, sep: u8) {
if self.wtr.supports_color() {
let _ = self.wtr.fg(color::BRIGHT_BLUE);
let _ = self.wtr.fg(self.color_choice.line_number);
let _ = self.wtr.attr(Attr::Bold);
}
self.write(n.to_string().as_bytes());
@@ -292,10 +365,20 @@ impl<W: Terminal + Send> Printer<W> {
self.write(&[sep]);
}
#[cfg(unix)]
fn write_path<P: AsRef<Path>>(&mut self, path: P) {
use std::os::unix::ffi::OsStrExt;
let path = path.as_ref().as_os_str().as_bytes();
self.write(path);
}
#[cfg(not(unix))]
fn write_path<P: AsRef<Path>>(&mut self, path: P) {
self.write(path.as_ref().to_string_lossy().as_bytes());
}
fn write(&mut self, buf: &[u8]) {
if self.quiet {
return;
}
self.has_printed = true;
let _ = self.wtr.write_all(buf);
}
@@ -304,4 +387,12 @@ impl<W: Terminal + Send> Printer<W> {
let eol = self.eol;
self.write(&[eol]);
}
fn write_file_sep(&mut self) {
if let Some(ref sep) = self.file_separator {
self.has_printed = true;
let _ = self.wtr.write_all(sep);
let _ = self.wtr.write_all(b"\n");
}
}
}


@@ -53,6 +53,14 @@ impl<'a, W: Send + Terminal> BufferSearcher<'a, W> {
self
}
/// If enabled, searching will print the path instead of each match.
///
/// Disabled by default.
pub fn files_with_matches(mut self, yes: bool) -> Self {
self.opts.files_with_matches = yes;
self
}
/// Set the end-of-line byte used by this searcher.
pub fn eol(mut self, eol: u8) -> Self {
self.opts.eol = eol;
@@ -73,6 +81,13 @@ impl<'a, W: Send + Terminal> BufferSearcher<'a, W> {
self
}
/// If enabled, don't show any output and quit searching after the first
/// match is found.
pub fn quiet(mut self, yes: bool) -> Self {
self.opts.quiet = yes;
self
}
/// If enabled, search binary files as if they were text.
pub fn text(mut self, yes: bool) -> Self {
self.opts.text = yes;
@@ -96,6 +111,9 @@ impl<'a, W: Send + Terminal> BufferSearcher<'a, W> {
self.print_match(m.start(), m.end());
}
last_end = m.end();
if self.opts.stop_after_first_match() {
break;
}
}
if self.opts.invert_match {
let upto = self.buf.len();
@@ -104,13 +122,16 @@ impl<'a, W: Send + Terminal> BufferSearcher<'a, W> {
if self.opts.count && self.match_count > 0 {
self.printer.path_count(self.path, self.match_count);
}
if self.opts.files_with_matches && self.match_count > 0 {
self.printer.path(self.path);
}
self.match_count
}
#[inline(always)]
pub fn print_match(&mut self, start: usize, end: usize) {
self.match_count += 1;
if self.opts.count {
if self.opts.skip_matches() {
return;
}
self.count_lines(start);
@@ -237,6 +258,14 @@ and exhibited clearly, with a label attached.\
assert_eq!(out, "/baz.rs:2\n");
}
#[test]
fn files_with_matches() {
let (count, out) = search(
"Sherlock", SHERLOCK, |s| s.files_with_matches(true));
assert_eq!(1, count);
assert_eq!(out, "/baz.rs\n");
}
#[test]
fn invert_match() {
let (count, out) = search(


@@ -80,9 +80,11 @@ pub struct Options {
pub after_context: usize,
pub before_context: usize,
pub count: bool,
pub files_with_matches: bool,
pub eol: u8,
pub invert_match: bool,
pub line_number: bool,
pub quiet: bool,
pub text: bool,
}
@@ -92,12 +94,29 @@ impl Default for Options {
after_context: 0,
before_context: 0,
count: false,
files_with_matches: false,
eol: b'\n',
invert_match: false,
line_number: false,
quiet: false,
text: false,
}
}
}
impl Options {
/// Several options (--quiet, --count, --files-with-matches) imply that
/// we shouldn't ever display matches.
pub fn skip_matches(&self) -> bool {
self.count || self.files_with_matches || self.quiet
}
/// Some options (--quiet, --files-with-matches) imply that we can stop
/// searching after the first match.
pub fn stop_after_first_match(&self) -> bool {
self.files_with_matches || self.quiet
}
}
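These two helpers encode how the output-suppressing flags interact: `--count`, `--files-with-matches`, and `--quiet` all skip printing individual matches, but only the latter two allow the searcher to stop a file after its first match, since counting must keep going. A small sketch restating that logic with assertions (the `Opts` struct here is illustrative, not ripgrep's):

    #[derive(Default)]
    struct Opts {
        count: bool,
        files_with_matches: bool,
        quiet: bool,
    }

    impl Opts {
        // Don't print individual matching lines.
        fn skip_matches(&self) -> bool {
            self.count || self.files_with_matches || self.quiet
        }
        // Stop searching a file after its first match.
        fn stop_after_first_match(&self) -> bool {
            self.files_with_matches || self.quiet
        }
    }

    fn main() {
        let count_only = Opts { count: true, ..Opts::default() };
        assert!(count_only.skip_matches() && !count_only.stop_after_first_match());

        let quiet = Opts { quiet: true, ..Opts::default() };
        assert!(quiet.skip_matches() && quiet.stop_after_first_match());
    }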
impl<'a, R: io::Read, W: Terminal + Send> Searcher<'a, R, W> {
@@ -158,6 +177,14 @@ impl<'a, R: io::Read, W: Terminal + Send> Searcher<'a, R, W> {
self
}
/// If enabled, searching will print the path instead of each match.
///
/// Disabled by default.
pub fn files_with_matches(mut self, yes: bool) -> Self {
self.opts.files_with_matches = yes;
self
}
/// Set the end-of-line byte used by this searcher.
pub fn eol(mut self, eol: u8) -> Self {
self.opts.eol = eol;
@@ -178,6 +205,13 @@ impl<'a, R: io::Read, W: Terminal + Send> Searcher<'a, R, W> {
self
}
/// If enabled, don't show any output and quit searching after the first
/// match is found.
pub fn quiet(mut self, yes: bool) -> Self {
self.opts.quiet = yes;
self
}
/// If enabled, search binary files as if they were text.
pub fn text(mut self, yes: bool) -> Self {
self.opts.text = yes;
@@ -193,7 +227,7 @@ impl<'a, R: io::Read, W: Terminal + Send> Searcher<'a, R, W> {
self.line_count = if self.opts.line_number { Some(0) } else { None };
self.last_match = Match::default();
self.after_context_remaining = 0;
loop {
while !self.terminate() {
let upto = self.inp.lastnl;
self.print_after_context(upto);
if !try!(self.fill()) {
@@ -202,7 +236,7 @@ impl<'a, R: io::Read, W: Terminal + Send> Searcher<'a, R, W> {
if !self.opts.text && self.inp.is_binary {
break;
}
while self.inp.pos < self.inp.lastnl {
while !self.terminate() && self.inp.pos < self.inp.lastnl {
let matched = self.grep.read_match(
&mut self.last_match,
&mut self.inp.buf[..self.inp.lastnl],
@@ -234,12 +268,21 @@ impl<'a, R: io::Read, W: Terminal + Send> Searcher<'a, R, W> {
}
}
}
if self.opts.count && self.match_count > 0 {
self.printer.path_count(self.path, self.match_count);
if self.match_count > 0 {
if self.opts.count {
self.printer.path_count(self.path, self.match_count);
} else if self.opts.files_with_matches {
self.printer.path(self.path);
}
}
Ok(self.match_count)
}
#[inline(always)]
fn terminate(&self) -> bool {
self.match_count > 0 && self.opts.stop_after_first_match()
}
#[inline(always)]
fn fill(&mut self) -> Result<bool, Error> {
let mut keep = self.inp.lastnl;
@@ -281,7 +324,7 @@ impl<'a, R: io::Read, W: Terminal + Send> Searcher<'a, R, W> {
#[inline(always)]
fn print_before_context(&mut self, upto: usize) {
if self.opts.count || self.opts.before_context == 0 {
if self.opts.skip_matches() || self.opts.before_context == 0 {
return;
}
let start = self.last_printed;
@@ -304,7 +347,7 @@ impl<'a, R: io::Read, W: Terminal + Send> Searcher<'a, R, W> {
#[inline(always)]
fn print_after_context(&mut self, upto: usize) {
if self.opts.count || self.after_context_remaining == 0 {
if self.opts.skip_matches() || self.after_context_remaining == 0 {
return;
}
let start = self.last_printed;
@@ -322,7 +365,7 @@ impl<'a, R: io::Read, W: Terminal + Send> Searcher<'a, R, W> {
#[inline(always)]
fn print_match(&mut self, start: usize, end: usize) {
self.match_count += 1;
if self.opts.count {
if self.opts.skip_matches() {
return;
}
self.print_separator(start);
@@ -503,10 +546,6 @@ impl InputBuffer {
if self.first && is_binary(&self.buf[self.end..self.end + n]) {
self.is_binary = true;
}
if self.is_binary {
replace_buf(
&mut self.buf[self.end..self.end + n], b'\x00', self.eol);
}
self.first = false;
// We assume that reading 0 bytes means we've hit EOF.
if n == 0 {
@@ -629,6 +668,7 @@ pub fn count_lines(buf: &[u8], eol: u8) -> u64 {
}
/// Replaces a with b in buf.
#[allow(dead_code)]
fn replace_buf(buf: &mut [u8], a: u8, b: u8) {
if a == b {
return;
@@ -970,7 +1010,7 @@ fn main() {
let text = "Sherlock\n\x00Holmes\n";
let (count, out) = search("Sherlock|Holmes", text, |s| s.text(true));
assert_eq!(2, count);
assert_eq!(out, "/baz.rs:Sherlock\n/baz.rs:Holmes\n");
assert_eq!(out, "/baz.rs:Sherlock\n/baz.rs:\x00Holmes\n");
}
#[test]
@@ -992,6 +1032,14 @@ fn main() {
assert_eq!(out, "/baz.rs:2\n");
}
#[test]
fn files_with_matches() {
let (count, out) = search_smallcap(
"Sherlock", SHERLOCK, |s| s.files_with_matches(true));
assert_eq!(1, count);
assert_eq!(out, "/baz.rs\n");
}
#[test]
fn invert_match() {
let (count, out) = search_smallcap(


@@ -11,16 +11,17 @@ use std::path::Path;
use regex;
use gitignore::{Match, Pattern};
use glob::{self, MatchOptions};
use globset::{self, GlobBuilder, GlobSet, GlobSetBuilder};
const TYPE_EXTENSIONS: &'static [(&'static str, &'static [&'static str])] = &[
("asm", &["*.asm", "*.s", "*.S"]),
("awk", &["*.awk"]),
("c", &["*.c", "*.h", "*.H"]),
("cbor", &["*.cbor"]),
("clojure", &["*.clj", "*.cljs"]),
("cmake", &["CMakeLists.txt"]),
("clojure", &["*.clj", "*.cljc", "*.cljs", "*.cljx"]),
("cmake", &["*.cmake", "CMakeLists.txt"]),
("coffeescript", &["*.coffee"]),
("config", &["*.config"]),
("cpp", &[
"*.C", "*.cc", "*.cpp", "*.cxx",
"*.h", "*.H", "*.hh", "*.hpp",
@@ -36,12 +37,16 @@ const TYPE_EXTENSIONS: &'static [(&'static str, &'static [&'static str])] = &[
"*.f", "*.F", "*.f77", "*.F77", "*.pfo",
"*.f90", "*.F90", "*.f95", "*.F95",
]),
("fsharp", &["*.fs", "*.fsx", "*.fsi"]),
("go", &["*.go"]),
("groovy", &["*.groovy"]),
("haskell", &["*.hs", "*.lhs"]),
("html", &["*.htm", "*.html"]),
("java", &["*.java"]),
("js", &["*.js"]),
("jinja", &["*.jinja", "*.jinja2"]),
("js", &[
"*.js", "*.jsx", "*.vue",
]),
("json", &["*.json"]),
("jsonl", &["*.jsonl"]),
("lisp", &["*.el", "*.jl", "*.lisp", "*.lsp", "*.sc", "*.scm"]),
@@ -49,9 +54,11 @@ const TYPE_EXTENSIONS: &'static [(&'static str, &'static [&'static str])] = &[
("m4", &["*.ac", "*.m4"]),
("make", &["gnumakefile", "Gnumakefile", "makefile", "Makefile", "*.mk"]),
("markdown", &["*.md"]),
("md", &["*.md"]),
("matlab", &["*.m"]),
("mk", &["mkfile"]),
("ml", &["*.ml"]),
("nim", &["*.nim"]),
("objc", &["*.h", "*.m"]),
("objcpp", &["*.h", "*.mm"]),
("ocaml", &["*.ml", "*.mli", "*.mll", "*.mly"]),
@@ -59,17 +66,22 @@ const TYPE_EXTENSIONS: &'static [(&'static str, &'static [&'static str])] = &[
("php", &["*.php", "*.php3", "*.php4", "*.php5", "*.phtml"]),
("py", &["*.py"]),
("readme", &["README*", "*README"]),
("rr", &["*.R"]),
("r", &["*.R", "*.r", "*.Rmd", "*.Rnw"]),
("rst", &["*.rst"]),
("ruby", &["*.rb"]),
("rust", &["*.rs"]),
("scala", &["*.scala"]),
("sh", &["*.bash", "*.csh", "*.ksh", "*.sh", "*.tcsh"]),
("spark", &["*.spark"]),
("sql", &["*.sql"]),
("sv", &["*.v", "*.vg", "*.sv", "*.svh", "*.h"]),
("swift", &["*.swift"]),
("tex", &["*.tex", "*.cls", "*.sty"]),
("ts", &["*.ts", "*.tsx"]),
("txt", &["*.txt"]),
("toml", &["*.toml", "Cargo.lock"]),
("vala", &["*.vala"]),
("vb", &["*.vb"]),
("vimscript", &["*.vim"]),
("xml", &["*.xml"]),
("yacc", &["*.y"]),
@@ -85,7 +97,7 @@ pub enum Error {
/// A user specified file type definition could not be parsed.
InvalidDefinition,
/// There was an error building the matcher (probably a bad glob).
Glob(glob::Error),
Glob(globset::Error),
/// There was an error compiling a glob as a regex.
Regex(regex::Error),
}
@@ -117,8 +129,8 @@ impl fmt::Display for Error {
}
}
impl From<glob::Error> for Error {
fn from(err: glob::Error) -> Error {
impl From<globset::Error> for Error {
fn from(err: globset::Error) -> Error {
Error::Glob(err)
}
}
@@ -151,8 +163,9 @@ impl FileTypeDef {
/// Types is a file type matcher.
#[derive(Clone, Debug)]
pub struct Types {
selected: Option<glob::SetYesNo>,
negated: Option<glob::SetYesNo>,
defs: Vec<FileTypeDef>,
selected: Option<GlobSet>,
negated: Option<GlobSet>,
has_selected: bool,
unmatched_pat: Pattern,
}
@@ -165,11 +178,13 @@ impl Types {
/// If has_selected is true, then at least one file type was selected.
/// Therefore, any non-matches should be ignored.
fn new(
selected: Option<glob::SetYesNo>,
negated: Option<glob::SetYesNo>,
selected: Option<GlobSet>,
negated: Option<GlobSet>,
has_selected: bool,
defs: Vec<FileTypeDef>,
) -> Types {
Types {
defs: defs,
selected: selected,
negated: negated,
has_selected: has_selected,
@@ -185,7 +200,7 @@ impl Types {
/// Creates a new file type matcher that never matches.
pub fn empty() -> Types {
Types::new(None, None, false)
Types::new(None, None, false, vec![])
}
/// Returns a match for the given path against this file type matcher.
@@ -225,6 +240,11 @@ impl Types {
Match::None
}
}
/// Return the set of current file type definitions.
pub fn definitions(&self) -> &[FileTypeDef] {
&self.defs
}
}
/// TypesBuilder builds a type matcher from a set of file type definitions and
@@ -248,14 +268,11 @@ impl TypesBuilder {
/// Build the current set of file type definitions *and* selections into
/// a file type matcher.
pub fn build(&self) -> Result<Types, Error> {
let opts = MatchOptions {
require_literal_separator: true, ..MatchOptions::default()
};
let selected_globs =
if self.selected.is_empty() {
None
} else {
let mut bset = glob::SetBuilder::new();
let mut bset = GlobSetBuilder::new();
for name in &self.selected {
let globs = match self.types.get(name) {
Some(globs) => globs,
@@ -265,16 +282,19 @@ impl TypesBuilder {
}
};
for glob in globs {
try!(bset.add_with(glob, &opts));
let pat = try!(
GlobBuilder::new(glob)
.literal_separator(true).build());
bset.add(pat);
}
}
Some(try!(bset.build_yesno()))
Some(try!(bset.build()))
};
let negated_globs =
if self.negated.is_empty() {
None
} else {
let mut bset = glob::SetBuilder::new();
let mut bset = GlobSetBuilder::new();
for name in &self.negated {
let globs = match self.types.get(name) {
Some(globs) => globs,
@@ -284,13 +304,20 @@ impl TypesBuilder {
}
};
for glob in globs {
try!(bset.add_with(glob, &opts));
let pat = try!(
GlobBuilder::new(glob)
.literal_separator(true).build());
bset.add(pat);
}
}
Some(try!(bset.build_yesno()))
Some(try!(bset.build()))
};
Ok(Types::new(
selected_globs, negated_globs, !self.selected.is_empty()))
selected_globs,
negated_globs,
!self.selected.is_empty(),
self.definitions(),
))
}
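The builder now goes through the external `globset` crate instead of the old in-tree `glob` module: each pattern is compiled with `GlobBuilder` using `literal_separator(true)` (the replacement for `require_literal_separator`) and collected into a `GlobSet`. A minimal usage sketch of that API as it appears in this diff; it requires the `globset` crate, and the example globs are arbitrary:

    use globset::{GlobBuilder, GlobSetBuilder};

    fn main() -> Result<(), globset::Error> {
        let mut bset = GlobSetBuilder::new();
        for glob in &["*.rs", "Cargo.lock"] {
            // literal_separator(true) keeps wildcards from matching '/', the
            // same role require_literal_separator played with the old module.
            let pat = GlobBuilder::new(glob).literal_separator(true).build()?;
            bset.add(pat);
        }
        let set = bset.build()?;
        assert!(set.is_match("main.rs"));
        assert!(set.is_match("Cargo.lock"));
        // With a literal separator, "*.rs" does not cross directory boundaries.
        assert!(!set.is_match("src/main.rs"));
        Ok(())
    }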
/// Return the set of current file type definitions.


@@ -34,6 +34,33 @@ macro_rules! sherlock {
};
}
macro_rules! clean {
($name:ident, $query:expr, $path:expr, $fun:expr) => {
#[test]
fn $name() {
let wd = WorkDir::new(stringify!($name));
let mut cmd = wd.command();
cmd.arg($query).arg($path);
$fun(wd, cmd);
}
};
}
fn path(unix: &str) -> String {
if cfg!(windows) {
unix.replace("/", "\\")
} else {
unix.to_string()
}
}
fn sort_lines(lines: &str) -> String {
let mut lines: Vec<String> =
lines.trim().lines().map(|s| s.to_owned()).collect();
lines.sort();
format!("{}\n", lines.join("\n"))
}
sherlock!(single_file, |wd: WorkDir, mut cmd| {
let lines: String = wd.stdout(&mut cmd);
let expected = "\
@@ -118,7 +145,11 @@ be, to a very large extent, the result of luck. Sherlock Holmes
foo
Sherlock Holmes lives on Baker Street.
";
assert!(lines == expected1 || lines == expected2);
if lines != expected1 {
assert_eq!(lines, expected2);
} else {
assert_eq!(lines, expected1);
}
});
sherlock!(inverted, |wd: WorkDir, mut cmd: Command| {
@@ -280,6 +311,20 @@ sherlock!(glob_negate, "Sherlock", ".", |wd: WorkDir, mut cmd: Command| {
assert_eq!(lines, "file.py:Sherlock\n");
});
sherlock!(count, "Sherlock", ".", |wd: WorkDir, mut cmd: Command| {
cmd.arg("--count");
let lines: String = wd.stdout(&mut cmd);
let expected = "sherlock:2\n";
assert_eq!(lines, expected);
});
sherlock!(files_with_matches, "Sherlock", ".", |wd: WorkDir, mut cmd: Command| {
cmd.arg("--files-with-matches");
let lines: String = wd.stdout(&mut cmd);
let expected = "sherlock\n";
assert_eq!(lines, expected);
});
sherlock!(after_context, |wd: WorkDir, mut cmd: Command| {
cmd.arg("-A").arg("1");
let lines: String = wd.stdout(&mut cmd);
@@ -377,6 +422,11 @@ sherlock!(ignore_git, "Sherlock", ".", |wd: WorkDir, mut cmd: Command| {
wd.assert_err(&mut cmd);
});
sherlock!(ignore_generic, "Sherlock", ".", |wd: WorkDir, mut cmd: Command| {
wd.create(".ignore", "sherlock\n");
wd.assert_err(&mut cmd);
});
sherlock!(ignore_ripgrep, "Sherlock", ".", |wd: WorkDir, mut cmd: Command| {
wd.create(".rgignore", "sherlock\n");
wd.assert_err(&mut cmd);
@@ -492,7 +542,7 @@ sherlock!(symlink_nofollow, "Sherlock", ".", |wd: WorkDir, mut cmd: Command| {
wd.remove("sherlock");
wd.create_dir("foo");
wd.create_dir("foo/bar");
wd.link("foo/baz", "foo/bar/baz");
wd.link_dir("foo/baz", "foo/bar/baz");
wd.create_dir("foo/baz");
wd.create("foo/baz/sherlock", hay::SHERLOCK);
cmd.current_dir(wd.path().join("foo/bar"));
@@ -505,24 +555,16 @@ sherlock!(symlink_follow, "Sherlock", ".", |wd: WorkDir, mut cmd: Command| {
wd.create_dir("foo/bar");
wd.create_dir("foo/baz");
wd.create("foo/baz/sherlock", hay::SHERLOCK);
wd.link("foo/baz", "foo/bar/baz");
wd.link_dir("foo/baz", "foo/bar/baz");
cmd.arg("-L");
cmd.current_dir(wd.path().join("foo/bar"));
let lines: String = wd.stdout(&mut cmd);
if cfg!(windows) {
let expected = "\
baz\\sherlock:For the Doctor Watsons of this world, as opposed to the Sherlock
baz\\sherlock:be, to a very large extent, the result of luck. Sherlock Holmes
";
assert_eq!(lines, expected);
} else {
let expected = "\
let expected = "\
baz/sherlock:For the Doctor Watsons of this world, as opposed to the Sherlock
baz/sherlock:be, to a very large extent, the result of luck. Sherlock Holmes
";
assert_eq!(lines, expected);
}
assert_eq!(lines, path(expected));
});
sherlock!(unrestricted1, "Sherlock", ".", |wd: WorkDir, mut cmd: Command| {
@@ -550,17 +592,6 @@ sherlock!(unrestricted2, "Sherlock", ".", |wd: WorkDir, mut cmd: Command| {
assert_eq!(lines, expected);
});
#[cfg(not(windows))]
sherlock!(unrestricted3, "foo", ".", |wd: WorkDir, mut cmd: Command| {
wd.create("file", "foo\x00bar\nfoo\x00baz\n");
cmd.arg("-uuu");
let lines: String = wd.stdout(&mut cmd);
assert_eq!(lines, "file:foo\nfile:foo\n");
});
// On Windows, this test uses memory maps, so the NUL bytes don't get replaced.
#[cfg(windows)]
sherlock!(unrestricted3, "foo", ".", |wd: WorkDir, mut cmd: Command| {
wd.create("file", "foo\x00bar\nfoo\x00baz\n");
cmd.arg("-uuu");
@@ -574,14 +605,348 @@ sherlock!(vimgrep, "Sherlock|Watson", ".", |wd: WorkDir, mut cmd: Command| {
let lines: String = wd.stdout(&mut cmd);
let expected = "\
sherlock:1:15:For the Doctor Watsons of this world, as opposed to the Sherlock
sherlock:1:56:For the Doctor Watsons of this world, as opposed to the Sherlock
sherlock:3:48:be, to a very large extent, the result of luck. Sherlock Holmes
sherlock:5:11:but Doctor Watson has to have it taken out for him and dusted,
sherlock:1:16:For the Doctor Watsons of this world, as opposed to the Sherlock
sherlock:1:57:For the Doctor Watsons of this world, as opposed to the Sherlock
sherlock:3:49:be, to a very large extent, the result of luck. Sherlock Holmes
sherlock:5:12:but Doctor Watson has to have it taken out for him and dusted,
";
assert_eq!(lines, expected);
});
// See: https://github.com/BurntSushi/ripgrep/issues/16
clean!(regression_16, "xyz", ".", |wd: WorkDir, mut cmd: Command| {
wd.create(".gitignore", "ghi/");
wd.create_dir("ghi");
wd.create_dir("def/ghi");
wd.create("ghi/toplevel.txt", "xyz");
wd.create("def/ghi/subdir.txt", "xyz");
wd.assert_err(&mut cmd);
});
// See: https://github.com/BurntSushi/ripgrep/issues/25
clean!(regression_25, "test", ".", |wd: WorkDir, mut cmd: Command| {
wd.create(".gitignore", "/llvm/");
wd.create_dir("src/llvm");
wd.create("src/llvm/foo", "test");
let lines: String = wd.stdout(&mut cmd);
let expected = path("src/llvm/foo:test\n");
assert_eq!(lines, expected);
cmd.current_dir(wd.path().join("src"));
let lines: String = wd.stdout(&mut cmd);
let expected = path("llvm/foo:test\n");
assert_eq!(lines, expected);
});
// See: https://github.com/BurntSushi/ripgrep/issues/30
clean!(regression_30, "test", ".", |wd: WorkDir, mut cmd: Command| {
if cfg!(windows) {
wd.create(".gitignore", "vendor/**\n!vendor\\manifest");
} else {
wd.create(".gitignore", "vendor/**\n!vendor/manifest");
}
wd.create_dir("vendor");
wd.create("vendor/manifest", "test");
let lines: String = wd.stdout(&mut cmd);
let expected = path("vendor/manifest:test\n");
assert_eq!(lines, expected);
});
// See: https://github.com/BurntSushi/ripgrep/issues/49
clean!(regression_49, "xyz", ".", |wd: WorkDir, mut cmd: Command| {
wd.create(".gitignore", "foo/bar");
wd.create_dir("test/foo/bar");
wd.create("test/foo/bar/baz", "test");
wd.assert_err(&mut cmd);
});
// See: https://github.com/BurntSushi/ripgrep/issues/50
clean!(regression_50, "xyz", ".", |wd: WorkDir, mut cmd: Command| {
wd.create(".gitignore", "XXX/YYY/");
wd.create_dir("abc/def/XXX/YYY");
wd.create_dir("ghi/XXX/YYY");
wd.create("abc/def/XXX/YYY/bar", "test");
wd.create("ghi/XXX/YYY/bar", "test");
wd.assert_err(&mut cmd);
});
// See: https://github.com/BurntSushi/ripgrep/issues/65
clean!(regression_65, "xyz", ".", |wd: WorkDir, mut cmd: Command| {
wd.create(".gitignore", "a/");
wd.create_dir("a");
wd.create("a/foo", "xyz");
wd.create("a/bar", "xyz");
wd.assert_err(&mut cmd);
});
// See: https://github.com/BurntSushi/ripgrep/issues/67
clean!(regression_67, "test", ".", |wd: WorkDir, mut cmd: Command| {
wd.create(".gitignore", "/*\n!/dir");
wd.create_dir("dir");
wd.create_dir("foo");
wd.create("foo/bar", "test");
wd.create("dir/bar", "test");
let lines: String = wd.stdout(&mut cmd);
assert_eq!(lines, path("dir/bar:test\n"));
});
// See: https://github.com/BurntSushi/ripgrep/issues/87
clean!(regression_87, "test", ".", |wd: WorkDir, mut cmd: Command| {
wd.create(".gitignore", "foo\n**no-vcs**");
wd.create("foo", "test");
wd.assert_err(&mut cmd);
});
// See: https://github.com/BurntSushi/ripgrep/issues/90
clean!(regression_90, "test", ".", |wd: WorkDir, mut cmd: Command| {
wd.create(".gitignore", "!.foo");
wd.create(".foo", "test");
let lines: String = wd.stdout(&mut cmd);
assert_eq!(lines, ".foo:test\n");
});
// See: https://github.com/BurntSushi/ripgrep/issues/93
clean!(regression_93, r"(\d{1,3}\.){3}\d{1,3}", ".",
|wd: WorkDir, mut cmd: Command| {
wd.create("foo", "192.168.1.1");
let lines: String = wd.stdout(&mut cmd);
assert_eq!(lines, "foo:192.168.1.1\n");
});
// See: https://github.com/BurntSushi/ripgrep/issues/99
clean!(regression_99, "test", ".",
|wd: WorkDir, mut cmd: Command| {
wd.create("foo1", "test");
wd.create("foo2", "zzz");
wd.create("bar", "test");
cmd.arg("-j1").arg("--heading");
let lines: String = wd.stdout(&mut cmd);
assert_eq!(sort_lines(&lines), sort_lines("bar\ntest\n\nfoo1\ntest\n"));
});
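// `sort_lines` is assumed to be a small helper defined elsewhere in tests.rs;
// it makes the comparison independent of the order in which files are printed
// (even with -j1, directory read order is not guaranteed). A minimal sketch:
fn sort_lines(lines: &str) -> String {
    let mut sorted: Vec<&str> = lines.trim().lines().collect();
    sorted.sort();
    format!("{}\n", sorted.join("\n"))
}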
// See: https://github.com/BurntSushi/ripgrep/issues/105
clean!(regression_105_part1, "test", ".", |wd: WorkDir, mut cmd: Command| {
wd.create("foo", "zztest");
cmd.arg("--vimgrep");
let lines: String = wd.stdout(&mut cmd);
assert_eq!(lines, "foo:1:3:zztest\n");
});
// See: https://github.com/BurntSushi/ripgrep/issues/105
clean!(regression_105_part2, "test", ".", |wd: WorkDir, mut cmd: Command| {
wd.create("foo", "zztest");
cmd.arg("--column");
let lines: String = wd.stdout(&mut cmd);
assert_eq!(lines, "foo:3:zztest\n");
});
// See: https://github.com/BurntSushi/ripgrep/issues/127
clean!(regression_127, "Sherlock", ".", |wd: WorkDir, mut cmd: Command| {
// Set up a directory hierarchy like this:
//
// .gitignore
// foo/
//   sherlock
//   watson
//
// Where `.gitignore` contains `foo/sherlock`.
//
// ripgrep should ignore 'foo/sherlock' and report results only from
// 'foo/watson'. Issue #127 was that Windows included both 'foo/sherlock'
// and 'foo/watson' in the search results.
wd.create(".gitignore", "foo/sherlock\n");
wd.create_dir("foo");
wd.create("foo/sherlock", hay::SHERLOCK);
wd.create("foo/watson", hay::SHERLOCK);
let lines: String = wd.stdout(&mut cmd);
let expected = format!("\
{path}:For the Doctor Watsons of this world, as opposed to the Sherlock
{path}:be, to a very large extent, the result of luck. Sherlock Holmes
", path=path("foo/watson"));
assert_eq!(lines, expected);
});
// See: https://github.com/BurntSushi/ripgrep/issues/131
//
// TODO(burntsushi): Darwin doesn't like this test for some reason.
#[cfg(not(target_os = "macos"))]
clean!(regression_131, "test", ".", |wd: WorkDir, mut cmd: Command| {
wd.create(".gitignore", "TopÑapa");
wd.create("TopÑapa", "test");
wd.assert_err(&mut cmd);
});
// See: https://github.com/BurntSushi/ripgrep/issues/137
//
// TODO(burntsushi): Figure out why Windows gives "access denied" errors
// when trying to create a file symlink. For now, disable test on Windows.
#[cfg(not(windows))]
sherlock!(regression_137, "Sherlock", ".", |wd: WorkDir, mut cmd: Command| {
wd.link_file("sherlock", "sym1");
wd.link_file("sherlock", "sym2");
cmd.arg("sym1");
cmd.arg("sym2");
cmd.arg("-j1");
let lines: String = wd.stdout(&mut cmd);
let expected = "\
sherlock:For the Doctor Watsons of this world, as opposed to the Sherlock
sherlock:be, to a very large extent, the result of luck. Sherlock Holmes
sym1:For the Doctor Watsons of this world, as opposed to the Sherlock
sym1:be, to a very large extent, the result of luck. Sherlock Holmes
sym2:For the Doctor Watsons of this world, as opposed to the Sherlock
sym2:be, to a very large extent, the result of luck. Sherlock Holmes
";
assert_eq!(lines, path(expected));
});
// See: https://github.com/BurntSushi/ripgrep/issues/156
clean!(
regression_156,
r#"#(?:parse|include)\s*\(\s*(?:"|')[./A-Za-z_-]+(?:"|')"#,
"testcase.txt",
|wd: WorkDir, mut cmd: Command| {
const TESTCASE: &'static str = r#"#parse('widgets/foo_bar_macros.vm')
#parse ( 'widgets/mobile/foo_bar_macros.vm' )
#parse ("widgets/foobarhiddenformfields.vm")
#parse ( "widgets/foo_bar_legal.vm" )
#include( 'widgets/foo_bar_tips.vm' )
#include('widgets/mobile/foo_bar_macros.vm')
#include ("widgets/mobile/foo_bar_resetpw.vm")
#parse('widgets/foo-bar-macros.vm')
#parse ( 'widgets/mobile/foo-bar-macros.vm' )
#parse ("widgets/foo-bar-hiddenformfields.vm")
#parse ( "widgets/foo-bar-legal.vm" )
#include( 'widgets/foo-bar-tips.vm' )
#include('widgets/mobile/foo-bar-macros.vm')
#include ("widgets/mobile/foo-bar-resetpw.vm")
"#;
wd.create("testcase.txt", TESTCASE);
cmd.arg("-N");
let lines: String = wd.stdout(&mut cmd);
assert_eq!(lines, TESTCASE);
});
// See: https://github.com/BurntSushi/ripgrep/issues/20
sherlock!(feature_20_no_filename, "Sherlock", ".",
|wd: WorkDir, mut cmd: Command| {
cmd.arg("--no-filename");
let lines: String = wd.stdout(&mut cmd);
let expected = "\
For the Doctor Watsons of this world, as opposed to the Sherlock
be, to a very large extent, the result of luck. Sherlock Holmes
";
assert_eq!(lines, expected);
});
// See: https://github.com/BurntSushi/ripgrep/issues/68
clean!(feature_68_no_ignore_vcs, "test", ".", |wd: WorkDir, mut cmd: Command| {
wd.create(".gitignore", "foo");
wd.create(".ignore", "bar");
wd.create("foo", "test");
wd.create("bar", "test");
cmd.arg("--no-ignore-vcs");
let lines: String = wd.stdout(&mut cmd);
assert_eq!(lines, "foo:test\n");
});
// See: https://github.com/BurntSushi/ripgrep/issues/70
sherlock!(feature_70_smart_case, "sherlock", ".",
|wd: WorkDir, mut cmd: Command| {
cmd.arg("--smart-case");
let lines: String = wd.stdout(&mut cmd);
let expected = "\
sherlock:For the Doctor Watsons of this world, as opposed to the Sherlock
sherlock:be, to a very large extent, the result of luck. Sherlock Holmes
";
assert_eq!(lines, expected);
});
// See: https://github.com/BurntSushi/ripgrep/issues/89
sherlock!(feature_89_files_with_matches, "Sherlock", ".",
|wd: WorkDir, mut cmd: Command| {
cmd.arg("--null").arg("--files-with-matches");
let lines: String = wd.stdout(&mut cmd);
assert_eq!(lines, "sherlock\x00");
});
// See: https://github.com/BurntSushi/ripgrep/issues/89
sherlock!(feature_89_count, "Sherlock", ".",
|wd: WorkDir, mut cmd: Command| {
cmd.arg("--null").arg("--count");
let lines: String = wd.stdout(&mut cmd);
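// With --null, the file name is terminated by a NUL byte instead of ':'.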
assert_eq!(lines, "sherlock\x002\n");
});
// See: https://github.com/BurntSushi/ripgrep/issues/89
sherlock!(feature_89_files, "NADA", ".",
|wd: WorkDir, mut cmd: Command| {
cmd.arg("--null").arg("--files");
let lines: String = wd.stdout(&mut cmd);
assert_eq!(lines, "sherlock\x00");
});
// See: https://github.com/BurntSushi/ripgrep/issues/89
sherlock!(feature_89_match, "Sherlock", ".",
|wd: WorkDir, mut cmd: Command| {
cmd.arg("--null").arg("-C1");
let lines: String = wd.stdout(&mut cmd);
let expected = "\
sherlock\x00For the Doctor Watsons of this world, as opposed to the Sherlock
sherlock\x00Holmeses, success in the province of detective work must always
sherlock\x00be, to a very large extent, the result of luck. Sherlock Holmes
sherlock\x00can extract a clew from a wisp of straw or a flake of cigar ash;
";
assert_eq!(lines, expected);
});
// See: https://github.com/BurntSushi/ripgrep/issues/109
clean!(feature_109_max_depth, "far", ".", |wd: WorkDir, mut cmd: Command| {
wd.create_dir("one");
wd.create("one/pass", "far");
wd.create_dir("one/too");
wd.create("one/too/many", "far");
cmd.arg("--maxdepth").arg("2");
let lines: String = wd.stdout(&mut cmd);
let expected = path("one/pass:far\n");
assert_eq!(lines, expected);
});
// See: https://github.com/BurntSushi/ripgrep/issues/124
clean!(feature_109_case_sensitive_part1, "test", ".",
|wd: WorkDir, mut cmd: Command| {
wd.create("foo", "tEsT");
cmd.arg("--smart-case").arg("--case-sensitive");
wd.assert_err(&mut cmd);
});
// See: https://github.com/BurntSushi/ripgrep/issues/124
clean!(feature_109_case_sensitive_part2, "test", ".",
|wd: WorkDir, mut cmd: Command| {
wd.create("foo", "tEsT");
cmd.arg("--ignore-case").arg("--case-sensitive");
wd.assert_err(&mut cmd);
});
#[test]
fn binary_nosearch() {
let wd = WorkDir::new("binary_nosearch");
@@ -619,7 +984,7 @@ fn binary_search_no_mmap() {
let mut cmd = wd.command();
cmd.arg("-a").arg("--no-mmap").arg("foo").arg("file");
let lines: String = wd.stdout(&mut cmd);
assert_eq!(lines, "foo\nfoo\n");
assert_eq!(lines, "foo\x00bar\nfoo\x00baz\n");
}
#[test]
@@ -632,13 +997,23 @@ fn files() {
let mut cmd = wd.command();
cmd.arg("--files");
let lines: String = wd.stdout(&mut cmd);
assert!(lines == path("file\ndir/file\n")
|| lines == path("dir/file\nfile\n"));
}
// See: https://github.com/BurntSushi/ripgrep/issues/64
#[test]
fn regression_64() {
let wd = WorkDir::new("regression_64");
wd.create_dir("dir");
wd.create_dir("foo");
wd.create("dir/abc", "");
wd.create("foo/abc", "");
let mut cmd = wd.command();
cmd.arg("--files").arg("foo");
let lines: String = wd.stdout(&mut cmd);
assert_eq!(lines, path("foo/abc\n"));
}
#[test]


@@ -83,7 +83,7 @@ impl WorkDir {
/// Creates a directory symlink to the src with the given target name
/// in this directory.
#[cfg(not(windows))]
pub fn link_dir<S: AsRef<Path>, T: AsRef<Path>>(&self, src: S, target: T) {
use std::os::unix::fs::symlink;
let src = self.dir.join(src);
let target = self.dir.join(target);
@@ -91,8 +91,10 @@ impl WorkDir {
nice_err(&target, symlink(&src, &target));
}
/// Creates a directory symlink to the src with the given target name
/// in this directory.
#[cfg(windows)]
pub fn link_dir<S: AsRef<Path>, T: AsRef<Path>>(&self, src: S, target: T) {
use std::os::windows::fs::symlink_dir;
let src = self.dir.join(src);
let target = self.dir.join(target);
@@ -100,6 +102,32 @@ impl WorkDir {
nice_err(&target, symlink_dir(&src, &target));
}
/// Creates a file symlink to the src with the given target name
/// in this directory.
#[cfg(not(windows))]
pub fn link_file<S: AsRef<Path>, T: AsRef<Path>>(
&self,
src: S,
target: T,
) {
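// On Unix a symlink is untyped, so delegating to the directory-symlink
// helper produces an equivalent file symlink.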
self.link_dir(src, target);
}
/// Creates a file symlink to the src with the given target name
/// in this directory.
#[cfg(windows)]
pub fn link_file<S: AsRef<Path>, T: AsRef<Path>>(
&self,
src: S,
target: T,
) {
use std::os::windows::fs::symlink_file;
let src = self.dir.join(src);
let target = self.dir.join(target);
let _ = fs::remove_file(&target);
nice_err(&target, symlink_file(&src, &target));
}
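/// Usage sketch (illustrative only; these names are not from the diff): the
/// symlink helpers above let a test lay out fixtures in its WorkDir.
#[allow(dead_code)]
fn _symlink_usage_sketch() {
    let wd = WorkDir::new("symlink_usage_sketch");
    wd.create_dir("real");
    wd.link_dir("real", "real_alias"); // directory symlink -> "real"
    wd.create("data", "hello");
    wd.link_file("data", "data_alias"); // file symlink -> "data"
}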
/// Runs and captures the stdout of the given command.
///
/// If the return type could not be created from a string, then this