51 Commits

Author SHA1 Message Date
Andrew Gallant
92dc402f7f Switch from Docopt to Clap.
There were two important reasons for the switch:

1. Performance. Docopt does poorly when the argv becomes large, which is
   a reasonable common use case for search tools. (e.g., use with xargs)
2. Better failure modes. Clap knows a lot more about how a particular
   argv might be invalid, and can therefore provide much clearer error
   messages.

While both were important, (1) made it urgent.

Note that since Clap requires at least Rust 1.11, this will in turn
increase the minimum Rust version supported by ripgrep from Rust 1.9 to
Rust 1.11. It is therefore a breaking change, so the soonest release of
ripgrep with Clap will have to be 0.3.

There is also at least one subtle breaking change in real usage.
Previous to this commit, this used to work:

    rg -e -foo

Where this would cause ripgrep to search for the string `-foo`. Clap
currently has problems supporting this use case
(see: https://github.com/kbknapp/clap-rs/issues/742),
but it can be worked around by using this instead:

    rg -e [-]foo

or even

    rg [-]foo

and this still works:

    rg -- -foo

This commit also adds Bash, Fish and PowerShell completion files to the
release, fixes a bug that prevented ripgrep from working on file
paths containing invalid UTF-8 and shows short descriptions in the
output of `-h` but longer descriptions in the output of `--help`.

Fixes #136, Fixes #189, Fixes #210, Fixes #230
2016-11-17 19:53:41 -05:00
Andrew Gallant
f24873c70b Don't ever search directories. 2016-11-06 19:02:14 -05:00
Andrew Gallant
9fc9f368f5 Always search paths given by user.
This permits doing `rg -a test /dev/sda1` for example, where as before
/dev/sda1 was skipped because it wasn't a regular file.
2016-11-06 18:23:50 -05:00
Andrew Gallant
77ad7588ae Add --no-messages flag.
This flag is similar to what's found in grep: it will suppress all error
messages, such as those shown when a particular file couldn't be read.

Closes #149
2016-11-06 14:36:08 -05:00
Andrew Gallant
58aca2efb2 Add -m/--max-count flag.
This flag limits the number of matches printed *per file*.

Closes #159
2016-11-06 13:09:53 -05:00
Andrew Gallant
b272be25fa Add parallel recursive directory iterator.
This adds a new walk type in the `ignore` crate, `WalkParallel`, which
provides a way for recursively iterating over a set of paths in parallel
while respecting various ignore rules.

The API is a bit strange, as a closure producing a closure isn't
something one often sees, but it does seem to work well.

This also allowed us to simplify much of the worker logic in ripgrep
proper, where MultiWorker is now gone.
2016-11-05 21:45:55 -04:00
Brian Campbell
79a8d0ab3f Reset the terminal when Ctrl-C is pressed
If a user hits Ctrl-C to exit out of a search in the middle of printing
a line, we don't want to leave the terminal colors screwed up for them.
Catch Ctrl-C using the ctrlc crate, obtain a stdout lock to ensure that
other threads don't continue writing after we do so, reset the terminal,
and exit the program.

Closes #119
2016-10-29 21:23:05 -04:00
Andrew Gallant
d79add341b Move all gitignore matching to separate crate.
This PR introduces a new sub-crate, `ignore`, which primarily provides a
fast recursive directory iterator that respects ignore files like
gitignore and other configurable filtering rules based on globs or even
file types.

This results in a substantial source of complexity moved out of ripgrep's
core and into a reusable component that others can now (hopefully)
benefit from.

While much of the ignore code carried over from ripgrep's core, a
substantial portion of it was rewritten with the following goals in
mind:

1. Reuse matchers built from gitignore files across directory iteration.
2. Design the matcher data structure to be amenable for parallelizing
   directory iteration. (Indeed, writing the parallel iterator is the
   next step.)

Fixes #9, #44, #45
2016-10-29 20:48:59 -04:00
Andrew Gallant
247a9398f4 Switch to thread_local crate in lieu of thread_local!.
This is to work around a bug where using a thread_local! was causing
a segfault on macos.

Fixes #164.
2016-10-11 18:23:49 -04:00
Andrew Gallant
fdf24317ac Move glob implementation to new crate.
It is isolated and complex enough that it deserves attention all on its
own. It's also eminently reusable.
2016-09-30 19:42:41 -04:00
Andrew Gallant
46dff8f4be Be better with short circuiting with --quiet.
It didn't make sense for --quiet to be part of the printer, because --quiet
doesn't just mean "don't print," it also means, "stop after the first
match is found." This needs to be wired all the way up through directory
traversal, and it also needs to cause all of the search workers to quit
as well. We do it with an atomic that is only checked with --quiet is
given.

Fixes #116.
2016-09-28 20:50:50 -04:00
Andrew Gallant
3e78fce3a3 Don't print empty lines in single threaded mode.
Fixes #99.
2016-09-26 19:57:23 -04:00
Andrew Gallant
104d740f76 Don't quit if opening a file fails.
This was already working correctly in multithreaded mode, but in single
threaded mode, a file failing to open caused search to stop. That's bad.

Fixes #98.
2016-09-26 18:44:19 -04:00
Andrew Gallant
f85822266f Don't use an intermediate buffer when --threads=1.
Fixes #8
2016-09-25 21:27:17 -04:00
Andrew Gallant
e7839f2200 Merge pull request #71 from catchmrbharath/issue46
[Fixes #46] Use 1 less worker thread than number of threads
2016-09-25 15:02:38 -04:00
Bharath M R
9f1aae64f8 [Fixes #46] Use 1 less worker thread than number of threads
The main thread does directory traversal. Hence
number of threads = main Thread + number of worker threads.
We should have atleast one worker thread.
2016-09-24 19:48:26 -07:00
Andrew Gallant
6b2efd4d88 If a file is empty, still try to search it.
Files like /proc/cpuinfo will advertise themselves as a normal file with
size 0. Normally, this isn't a problem, but if ripgrep decides to use a
memory map, it skipped searching if the file was empty since it's an error
to memory map an empty file. Instead of returning 0, we should just fall
back to standard read calls.

Fixes #55.
2016-09-24 20:45:06 -04:00
Andrew Gallant
69095cf5c3 Add an error message for catching a common failure mode.
If you're in a directory that has a parent .gitignore (like, your $HOME),
then it can cause ripgrep to simply not do anything depending on your
ignore rules.

There are probably other scenarios where ripgrep applies some filter that
an end user doesn't expect, so try to catch the worst case (when ripgrep
doesn't search anything).
2016-09-20 20:25:24 -04:00
Andrew Gallant
0e46171e3b Rework glob sets.
We try to reduce the pressure on regexes and offload some of it to
Aho-Corasick or exact lookups.
2016-09-15 22:06:04 -04:00
Andrew Gallant
c24f8fd50f Replace crossbeam with deque.
deque appears faster.
2016-09-14 07:40:46 -04:00
Andrew Gallant
983c7fd6f9 We don't use thread_local any more, so remove it. 2016-09-13 21:21:36 -04:00
Andrew Gallant
fdca74148d Stream results when feasible.
For example, when only a single file (or stdin) is being searched, then we
should be able to print directly to the terminal instead of intermediate
buffers. (The buffers are only necessary for parallelism.)

Closes #4.
2016-09-13 21:11:46 -04:00
Andrew Gallant
37544c092f We don't need regex-syntax directly in ripgrep. 2016-09-11 13:25:37 -04:00
Andrew Gallant
e3da726836 Rename search module to search_stream.
The name better reflects the difference between it and the search_buffer
module.
2016-09-10 00:08:42 -04:00
Andrew Gallant
5b36c86c15 Rejigger the atty detection stuff. 2016-09-10 00:05:20 -04:00
Andrew Gallant
0766617e07 Refactor how coloring is done.
All in the name of appeasing Windows.
2016-09-08 21:46:14 -04:00
Andrew Gallant
0042dce949 Hack in Windows console coloring.
The code has suffered and needs refactoring/commenting. BUT... IT WORKS!
2016-09-07 21:54:28 -04:00
Andrew Gallant
ca058d7584 Add support for memory maps.
I though plain `read` had usurped them, but when searching a very small
number of files, mmaps can be around 20% faster on Linux. It'd be really
unfortunate to leave that on the table.

Mmap searching doesn't support contexts yet, but we probably don't really
care. And duplicating that logic doesn't sound fun. Without contexts, mmap
searching is delightfully simple.
2016-09-06 21:47:33 -04:00
Andrew Gallant
9948e0ca07 Only create the Grep searcher once. 2016-09-06 19:33:19 -04:00
Andrew Gallant
02ac331529 Whoops. Remove other bits of parking lot. 2016-09-05 19:55:31 -04:00
Andrew Gallant
2bda77c414 Fix deps so that others can build it. 2016-09-05 18:22:12 -04:00
Andrew Gallant
7a149c20fe More progress. With coloring! 2016-09-05 17:36:41 -04:00
Andrew Gallant
d8d7560fd0 TODOs and some cleanup/refactoring. 2016-09-05 10:15:13 -04:00
Andrew Gallant
812cdb13c6 Lots of progress:
- Refactored interaction between CLI args and rest of xrep.
  - Filling in a lot more options, including file type filtering.
  - Fixing some bugs in globbing/ignoring.
  - More documentation.
2016-09-05 00:52:23 -04:00
Andrew Gallant
0bf278e72f making search work (finally) 2016-09-03 21:48:23 -04:00
Andrew Gallant
c2b5577cba progress on after contexts 2016-09-03 01:11:14 -04:00
Andrew Gallant
062aa5ef76 Switch to Chase-Lev work stealing queue.
It seems to be a touch faster.
2016-09-02 23:38:27 -04:00
Andrew Gallant
5450aed9a8 Make "before" context work.
No line numbers. And match inverting is broken.

This is awful.
2016-09-01 21:56:23 -04:00
Andrew Gallant
d011cea053 The search code is a mess, but...
... we now support inverted matches and line numbers!
2016-08-29 22:44:15 -04:00
Andrew Gallant
c809679cf2 Lots of improvements. Most notably, removal of memory maps for searching.
Memory maps appear to degrade quite a bit in the presence of multithreading.

Also, switch to lock free data structures for synchronization. Give each
worker an input and output buffer which require no synchronization.
2016-08-28 20:18:34 -04:00
Andrew Gallant
1c8379f55a Implementing core functionality.
Initially experimenting with crossbeam to manage synchronization.
2016-08-28 01:37:12 -04:00
Andrew Gallant
065c449980 File path filtering works and is pretty fast.
I'm pretty disappointed by the performance of regex sets. They are
apparently spending a lot of their time in construction of the DFA,
which probably means that the DFA is just too big.

It turns out that it's actually faster to build an *additional* normal
regex with the alternation of every glob and use it as a first-pass
filter over every file path. If there's a match, only then do we try the
more expensive RegexSet.
2016-08-27 01:01:06 -04:00
Andrew Gallant
b55ecf34c7 globbing by regex 2016-08-25 21:44:37 -04:00
Andrew Gallant
076eeff3ea update 2016-08-05 00:10:58 -04:00
Andrew Gallant
0163b39faa refactor progress 2016-06-20 16:55:13 -04:00
Andrew Gallant
8d9d602945 update 2016-04-03 21:22:09 -04:00
Andrew Gallant
07bff7409b tweaks 2016-03-30 22:24:59 -04:00
Andrew Gallant
79a51029c1 progress 2016-03-29 21:21:34 -04:00
Andrew Gallant
4ae67a8587 progress 2016-03-28 20:07:25 -04:00
Andrew Gallant
403bb72a4d beating 'grep -E' on some things 2016-03-10 20:48:44 -05:00