969 Commits

Author SHA1 Message Date
Andrew Gallant
2c3897585d ignore-0.4.22 2024-01-06 14:27:44 -05:00
Andrew Gallant
6e9141a9ca deps: update everything 2024-01-06 14:26:52 -05:00
Andrew Gallant
c8e4a84519 cli: prefix all non-fatal error messages with 'rg: '
Fixes #2694
2024-01-06 14:15:52 -05:00
Andrew Gallant
f02a50a69d changelog: various updates 2024-01-06 13:59:52 -05:00
fe9lix
b9c774937f ignore: fix reference cycle for compiled matchers
It looks like there is a reference cycle caused by the compiled
matchers (compiled HashMap holds ref to Ignore and Ignore holds ref
to HashMap). Using weak refs fixes issue #2690 in my test project.
Also confirmed via before and after when profiling the code, see the
attached screenshots in #2692.

Fixes #2690
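
A minimal sketch of the weak-reference idea described above, not the
actual patch. `Ignore`, `Cache`, `add_matcher` and `lookup` are
hypothetical stand-ins. The shared map stores only `Weak` handles, so a
matcher that itself holds the map no longer forms a cycle with it.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock, Weak};

/// Shared cache of compiled matchers (hypothetical name and shape).
type Cache = Arc<RwLock<HashMap<String, Weak<Ignore>>>>;

/// Hypothetical stand-in for a matcher that also holds the shared cache.
struct Ignore {
    cache: Cache,
}

impl Ignore {
    /// Number of cache entries, including entries whose matcher is gone.
    fn cache_len(&self) -> usize {
        self.cache.read().unwrap().len()
    }
}

/// Builds a matcher and registers it in the cache. Only a `Weak` handle is
/// stored, so the cache alone does not keep the matcher alive.
fn add_matcher(cache: &Cache, dir: &str) -> Arc<Ignore> {
    let ig = Arc::new(Ignore { cache: Arc::clone(cache) });
    cache
        .write()
        .unwrap()
        .insert(dir.to_string(), Arc::downgrade(&ig));
    ig
}

/// Looks up a cached matcher. Returns `None` if it was never added or has
/// already been dropped.
fn lookup(cache: &Cache, dir: &str) -> Option<Arc<Ignore>> {
    cache.read().unwrap().get(dir).and_then(|weak| weak.upgrade())
}
```

With a strong `Arc<Ignore>` in the map instead, dropping the last outside
handle would leave the matcher and the map keeping each other alive.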
2024-01-06 12:50:42 -05:00
Andrew Gallant
67dd809a80 ignore: add some 'allow(dead_code)' annotations
I don't usually like doing this and would prefer to just delete unused
code, but I don't have the context required to understand why this code
is unused. A refresh of this crate is on the (distant) horizon, so I'll
just leave these here for now to squash the warnings.
2024-01-06 12:25:06 -05:00
Jan Verbeek
e0a85678e1 complete/fish: improve shell completions for fish
- Stop using `-n __fish_use_subcommand`. This had the effect of
ignoring options if a positional argument has already been given, but
that's not how ripgrep works.

- Only suggest negation options if the option they're negating is
passed (e.g., only complete `--no-pcre2` if `--pcre2` is present). The
zsh completions already do this.

- Take into account whether an option takes an argument. If an option
is not a switch then it won't suggest further options until the
argument is given, e.g. `-C<tab>` won't suggest options but `-i<tab>`
will.

- Suggest correct arguments for options. We already completed a fixed
set of choices where available, but now we go further:

  - Filenames are only suggested for options that take filenames.

  - `--pre` and `--hostname-bin` suggest binaries from `$PATH`.

  - `-t`/`--type`/&c use `--type-list` for suggestions, like in zsh,
  with a preview of the glob patterns.

  - `--encoding` uses a hardcoded list extracted from the zsh
  completions. This has been refactored into a separate file, and the
  range globs (`{1..5}`) replaced by comma globs (`{1,2,3,4,5}`) since
  those work in both shells. I verified that this produces the same
  list as before in zsh, and the same list in fish (albeit in a
  different order).

PR #2684
2024-01-06 10:39:35 -05:00
David Gilman
23af5fb043 doc: update MSRV in README
PR #2673
2024-01-06 10:22:26 -05:00
Andrew Gallant
5dec4b8e37 ci: drop custom Cross images
It looks like these aren't needed any more? I'm not sure why to be
honest. I suspect it's because we no longer need asciidoc(tor)? to
generate man pages. And I believe tests that require things like `zstd`
are automatically skipped if `zstd` isn't installed.
2024-01-06 10:21:34 -05:00
Younes El-karama
827082a33a ci: add more ARM build configurations to CI and release workflows
... it turns out that rustembedded/cross:armv7-unknown-linux-musleabi
doesn't exist. And looking more closely, it looks like the Cross project
has decided to shake things up and publish images to ghcr instead. So we
migrate everything over to that.
2024-01-06 10:21:34 -05:00
Andrew Gallant
6c2a550e1e deps: update everything
This drops a dependency on memoffset due to a crossbeam-epoch update.
w00t.
2024-01-04 19:46:29 -05:00
Andrew Gallant
8e8fc9c503 deps: bump pcre2-sys to 0.2.8
This release contains some extra logic to disable the JIT on musleabi
targets.
2024-01-04 19:44:28 -05:00
Andrew Gallant
2057023dc5 readme: update benchmarks
We add a few more too.
2024-01-03 16:21:04 -05:00
Andrew Gallant
3f2fe0afee deps: update everything
This also drops a dependency on scopeguard, courtesy of crossbeam-epoch
dropping it. Not sure why they did, but fine by me.
2023-12-17 09:37:33 -05:00
amesgen
56c7ad175a ignore/types: add Lean
Ref: https://lean-lang.org/

PR #2678
2023-12-07 11:46:00 -05:00
Timo Wilken
5b7a30846f doc: fix Guix install instructions
`guix install` should not be run using `sudo`, as per
<https://packages.guix.gnu.org/packages/ripgrep/>.

PR #2669
2023-11-30 10:54:54 -05:00
Patrick Williams
2a4dba3fbf ignore/types: add meson.options
Starting with meson 1.1, there is a preference for using meson.options
instead of meson_options.txt.  Add the new filename to the meson set.

PR #2666
2023-11-29 19:03:12 -05:00
liberodark
84d65865e6 doc: add Void Linux installation instructions
PR #2665
2023-11-29 07:49:20 -05:00
Andrew Gallant
d9aaa11873 pkg/brew: update tap 2023-11-28 16:23:16 -05:00
Andrew Gallant
67ad9917ad 14.0.3 2023-11-28 16:18:14 -05:00
Andrew Gallant
daa157b5f9 core: actually implement --sortr=path
This is an embarrassing oversight. A `todo!()` actually made its way
into a release! Oof.

This was working in ripgrep 13, but I had redone some aspects of sorting
and this just got left undone.

Fixes #2664
2023-11-28 16:17:14 -05:00
Andrew Gallant
ca5e294ad6 pkg/brew: update tap 2023-11-27 21:44:06 -05:00
Andrew Gallant
6c7947b819 14.0.2 2023-11-27 21:38:21 -05:00
Andrew Gallant
9acb4a5405 deps: bump grep to 0.3.1 2023-11-27 21:37:41 -05:00
Andrew Gallant
0096c74c11 grep-0.3.1 2023-11-27 21:36:54 -05:00
Andrew Gallant
8c48355b03 deps: bump grep-printer to 0.2.1 2023-11-27 21:36:44 -05:00
Andrew Gallant
f9b86de963 grep-printer-0.2.1 2023-11-27 21:36:02 -05:00
Andrew Gallant
d23b74975a deps: bump grep-searcher to 0.1.13 2023-11-27 21:35:53 -05:00
Andrew Gallant
a5cbdb3dfe grep-searcher-0.1.13 2023-11-27 21:34:58 -05:00
Andrew Gallant
b6bac8484e cargo: add release-lto profile
The idea is to build ripgrep with as much optimization as possible.

This makes compilation times absolutely obscene. They jump from <10
seconds to 30+ seconds on my i9-12900K. I don't even want to know how
long CI would take with these.

I tried some ad hoc benchmarks and could not notice any meaningful
improvement with the LTO binary versus the normal release profile.
Because of that, I still don't think it's worth bloating the release
cycle times.

Ref #1225
2023-11-27 21:31:03 -05:00
Andrew Gallant
805fa32d18 searcher: work around NUL line terminator bug
As the FIXME comment says, ripgrep is not yet using the new line
terminator option in regex-automata exposed for exactly this purpose.
Because of that, line anchors like `(?m:^)` and `(?m:$)` will only match
`\n` as a line terminator. This means that when --null-data is used in
combination with --line-regexp, the anchors inserted by --line-regexp
will not match correctly. This is only a big deal in the "fast" path,
which requires the regex engine to deal with line terminators itself
correctly. The slow path strips line terminators regardless of what they
are, and so the line anchors can match (begin/end of haystack).

Fixes #2658
2023-11-27 21:17:12 -05:00
Andrew Gallant
2d518dd1f9 release: tweak how sha256sum is invoked
The output would ideally just have the basename of the file and not a
meaningless relative path.

Fixes #2654
2023-11-27 21:17:12 -05:00
Jan Verbeek
8575d26179 complete/fish: Fix syntax for negated options
And also, negated options don't take arguments.

Specifically, the fish completion generator currently forgets to add
`-l` to negation options, leading to a list of these errors:

    complete: too many arguments

    ~/.config/fish/completions/rg.fish (line 146):
    complete -c rg -n '__fish_use_subcommand'  no-sort-files -d '(DEPRECATED) Sort results by file path.'
    ^
    from sourcing file ~/.config/fish/completions/rg.fish

    (Type 'help complete' for related documentation)

To reproduce, run `fish -c 'rg --generate=complete-fish | source'`.

It also potentially suggests a list of choices for negation options,
even though those never take arguments. That case doesn't occur with
any of the current options but it's an easy fix.

Fixes #2659, Closes #2655
2023-11-27 21:17:12 -05:00
Jon Jensen
2e81a7adfe doc: fix typo that was preventing interpolation
Closes #2662
2023-11-27 21:17:12 -05:00
Andrew Gallant
cd5440fb62 changelog: fix wording
Ref: https://news.ycombinator.com/item?id=38425790
2023-11-26 17:58:30 -05:00
Andrew Gallant
2ee690e87a pkg/brew: update tap 2023-11-26 17:37:52 -05:00
Andrew Gallant
59f86a45d3 14.0.1 2023-11-26 16:33:35 -05:00
Andrew Gallant
2d31af38a2 cargo: include pkg/windows in crate package
Fixes #2653
2023-11-26 16:32:59 -05:00
Andrew Gallant
0da1176e7d pkg/brew: update tap 2023-11-26 15:27:09 -05:00
Andrew Gallant
eeffcd50b7 doc: add step to run 'cargo package' 2023-11-26 15:25:23 -05:00
Andrew Gallant
625743d7c8 grep-0.3.0 2023-11-26 15:24:09 -05:00
Andrew Gallant
3d0171040a grep-printer-0.2.0 2023-11-26 15:21:40 -05:00
Andrew Gallant
93429d0f85 14.0.0 2023-11-26 14:19:31 -05:00
Andrew Gallant
9c4b0baf10 deps: bump grep to 0.2.13 2023-11-26 14:18:53 -05:00
Andrew Gallant
179487aaed grep-0.2.13 2023-11-26 14:18:17 -05:00
Andrew Gallant
b407d62b63 deps: bump grep-searcher to 0.1.12 2023-11-26 14:18:03 -05:00
Andrew Gallant
9bd1e737bc grep-searcher-0.1.12 2023-11-26 14:17:26 -05:00
Andrew Gallant
c12231c621 deps: bump grep-pcre2 to 0.1.7 2023-11-26 14:17:11 -05:00
Andrew Gallant
b0df573834 grep-pcre2-0.1.7 2023-11-26 14:16:46 -05:00
Andrew Gallant
85b2ceecd1 deps: bump grep-regex to 0.1.12 2023-11-26 14:16:31 -05:00
Andrew Gallant
fee7ac79f1 grep-regex-0.1.12 2023-11-26 14:15:44 -05:00
Andrew Gallant
54d5540c10 deps: bump grep-matcher to 0.1.7 2023-11-26 14:15:34 -05:00
Andrew Gallant
d0251c77fe grep-matcher-0.1.7 2023-11-26 14:13:54 -05:00
Andrew Gallant
6aa5993d4b deps: bump grep-cli to 0.1.10 2023-11-26 14:13:40 -05:00
Andrew Gallant
6f78d211bf grep-cli-0.1.10 2023-11-26 14:13:03 -05:00
Andrew Gallant
51aa339830 deps: bump ignore to 0.4.21 2023-11-26 14:12:55 -05:00
Andrew Gallant
381c521d02 ignore-0.4.21 2023-11-26 14:12:16 -05:00
Andrew Gallant
57495db10e deps: bump globset to 0.4.14 2023-11-26 14:11:43 -05:00
Andrew Gallant
47e37175ca globset-0.4.14 2023-11-26 14:11:05 -05:00
Andrew Gallant
8697946718 release/doc: set date in man page 2023-11-26 14:10:07 -05:00
Andrew Gallant
8058859701 changelog: add link for reporting perf improvements/regressions 2023-11-26 14:05:23 -05:00
Andrew Gallant
e9ff90c8ff changelog: updates for the 14.0.0 release 2023-11-26 14:03:59 -05:00
Andrew Gallant
bf9f74ea5b doc: progress 2023-11-26 13:32:39 -05:00
Andrew Gallant
9b5091b895 deps: bump to memmap2 0.9.0 2023-11-26 13:32:39 -05:00
Andrew Gallant
a4f165e3ab deps: bump everything 2023-11-26 13:32:39 -05:00
Andrew Gallant
d1def67000 deps: bump pcre2 to 0.2.6 2023-11-26 13:32:20 -05:00
Andrew Gallant
56af4d4a74 cli: add simple flag suggestions
We look for similar flag names via Jaccard index on ngrams. In my
experience this tends to work better than Levenshtein or other edit
distance based metrics. Principally because it allows for out-of-order
suggestions. For example, --case-smart will result in a suggestion for
--smart-case, even though the edit distance between them is pretty big.

This is something Clap did for us. I initially thought it wasn't
necessary to add this back in, but I realized it wouldn't be much work
and might actually be helpful to folks.
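
A rough sketch of the Jaccard-on-ngrams idea, not ripgrep's actual
code. The function names, the choice of trigrams and the threshold are
all assumptions made for illustration.

```rust
use std::collections::HashSet;

/// Collects the set of character n-grams (windows of `n` chars) in `s`.
fn ngrams(s: &str, n: usize) -> HashSet<Vec<char>> {
    let chars: Vec<char> = s.chars().collect();
    if n == 0 || chars.len() < n {
        return HashSet::new();
    }
    chars.windows(n).map(|w| w.to_vec()).collect()
}

/// Jaccard index of the two n-gram sets: |A ∩ B| / |A ∪ B|.
/// Returns 0.0 when both sets are empty.
fn jaccard(a: &str, b: &str, n: usize) -> f64 {
    let (x, y) = (ngrams(a, n), ngrams(b, n));
    let union = x.union(&y).count();
    if union == 0 {
        return 0.0;
    }
    x.intersection(&y).count() as f64 / union as f64
}

/// Returns the known flag most similar to `unknown`, provided its score
/// reaches `threshold`.
fn suggest<'a>(unknown: &str, flags: &[&'a str], threshold: f64) -> Option<&'a str> {
    flags
        .iter()
        .copied()
        .map(|flag| (jaccard(unknown, flag, 3), flag))
        .filter(|&(score, _)| score >= threshold)
        .max_by(|a, b| a.0.total_cmp(&b.0))
        .map(|(_, flag)| flag)
}
```

Because the score depends only on which n-grams occur and not on where,
`case-smart` and `smart-case` share five of their eleven distinct
trigrams and score 5/11, despite the large edit distance between them.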
2023-11-26 09:55:44 -05:00
Andrew Gallant
b0f6645408 ci: remove local deb build-and-publish script
I moved this to GitHub Actions. w00t.
2023-11-25 18:27:52 -05:00
Andrew Gallant
3dbe371fe4 ci: add Debian release build
Previously, we were running 'cargo deb' locally. But the release process
is a little simpler now thanks to GitHub Actions and the 'gh' tool, so I
felt comfortable putting the 'deb' generation in CI.

Now the only real manual part of release asset creation is the M2
release, but that should hopefully be automated once GitHub makes Apple
silicon runners available for free.
2023-11-25 18:20:05 -05:00
Andrew Gallant
30d06b3b4c changelog: note that --no-ignore --ignore-vcs works as expected
This fix fell out of the move off of Clap.

Closes #1376
2023-11-25 15:03:53 -05:00
Andrew Gallant
6a055d922c doc: clarify errors for -z/--search-zip
Fixes #1622
2023-11-25 15:03:53 -05:00
Andrew Gallant
e007523229 doc: note the precedence of -t/--type
Fixes #1635
2023-11-25 15:03:53 -05:00
Andrew Gallant
88353c80da doc: be more explicit about ripgrep's behavior when printing to a tty
Fixes #1709
2023-11-25 15:03:53 -05:00
Andrew Gallant
cd3bcce42d changelog: mention M2 binaries for releases
Fixes #1737
2023-11-25 15:03:53 -05:00
Andrew Gallant
1ea3552f2d changelog: mention perf improvement for inner literals
Fixes #1746
2023-11-25 15:03:53 -05:00
Andrew Gallant
9ed7565fcb cli: error when searching for NUL
Basically, unless the -a/--text flag is given, it is generally always an
error to search for an explicit NUL byte because the binary detection
will prevent it from matching.

Fixes #1838
2023-11-25 15:03:53 -05:00
Andrew Gallant
7bb9f35d2d doc: clarify that --pre can accept any kind of path
Fixes #2046
2023-11-25 15:03:53 -05:00
Andrew Gallant
b138d5740a log: add message about number of threads used
Closes #2122
2023-11-25 15:03:53 -05:00
Andrew Gallant
3f0c8c2900 doc: improve -r/--replace docs
It looks like this was done a while ago, but it didn't get added to the
CHANGELOG or connected with the corresponding issue.

Fixes #2201
2023-11-25 15:03:53 -05:00
Andrew Gallant
0e6e9417f1 log: add message when a binary file is skipped
The way we do this is a little hokey but I believe it is correct.

Fixes #2246
2023-11-25 15:03:53 -05:00
Andrew Gallant
fded2a5fe1 doc: add cargo-binstall instructions
Closes #2298
2023-11-25 15:03:53 -05:00
Andrew Gallant
e14eeb288f doc: mention that --stats is always implied by --json
Fixes #2337
2023-11-25 15:03:53 -05:00
Andrew Gallant
1cbcefddc9 doc: add more warnings about --vimgrep
The --vimgrep flag has some severe footguns when using a pattern that
matches very frequently. We had already written some docs to warn about
that, but now we also include a suggestion to avoid exorbitant heap
usage.

Closes #2505
2023-11-25 15:03:53 -05:00
Andrew Gallant
4fec9ffca8 doc: make the opening line a bit more descriptive
This mimics what was written in the man page.

Closes #2401
2023-11-25 15:03:53 -05:00
Andrew Gallant
00225a035b doc: improve --sort=path
This clarifies that the paths are not sorted in a fully lexicographic
order, but that / is treated specially.

Fixes #2418
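
To illustrate how treating `/` specially differs from a plain
lexicographic sort, here is a small sketch. It assumes the ordering is
the component-wise one given by `std::path::Path`'s `Ord` impl. The
helper names are hypothetical.

```rust
use std::cmp::Ordering;
use std::path::Path;

/// Orders two paths component by component, so that `/` acts as a
/// separator rather than as an ordinary byte.
fn cmp_by_component(a: &str, b: &str) -> Ordering {
    Path::new(a).cmp(Path::new(b))
}

/// Sorts paths component-wise.
fn sort_by_path(paths: &mut [&str]) {
    paths.sort_by(|a, b| cmp_by_component(a, b));
}

/// Sorts paths as plain strings, for comparison.
fn sort_lexicographic(paths: &mut [&str]) {
    paths.sort();
}
```

For `a-b/x` and `a/y`, the string sort puts `a-b/x` first because `-`
sorts before `/`, while the component-wise sort puts `a/y` first because
the component `a` sorts before `a-b`.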
2023-11-25 15:03:53 -05:00
Andrew Gallant
286de9564e cli: rejigger --version to include PCRE2 info
This adds info about whether PCRE2 is available or not to the output of
--version. Essentially, --version now subsumes --pcre2-version, although
we do retain the former because it (usefully) emits an exit code based
on whether PCRE2 is available or not.

Closes #2645
2023-11-25 15:03:53 -05:00
Andrew Gallant
038524a580 printer: trim before applying max column windowing
Previously, we were applying the -M/--max-columns flag *before* trimming
prefix ASCII whitespace. But this doesn't make a whole lot of sense. We
should be trimming first, since the result of trimming is ultimately what
we'll be printing and that's what -M/--max-columns should be applied to.

Fixes #2458
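
A small sketch of the ordering described above: trim first, then apply
the column limit to what is left. The names are hypothetical, and
measuring columns in bytes is a simplifying assumption.

```rust
/// What gets printed for one matching line.
#[derive(Debug, PartialEq)]
enum Rendered<'a> {
    /// The (possibly trimmed) line is printed as-is.
    Line(&'a [u8]),
    /// The line exceeds the column limit and is omitted.
    Omitted,
}

/// Strips leading ASCII whitespace from `line`.
fn trim_prefix_whitespace(line: &[u8]) -> &[u8] {
    let start = line
        .iter()
        .position(|b| !b.is_ascii_whitespace())
        .unwrap_or(line.len());
    &line[start..]
}

/// Trims first (if requested), then checks the limit against the trimmed
/// line, since that is what would actually be printed.
fn render(line: &[u8], trim: bool, max_columns: Option<usize>) -> Rendered<'_> {
    let line = if trim { trim_prefix_whitespace(line) } else { line };
    match max_columns {
        Some(max) if line.len() > max => Rendered::Omitted,
        _ => Rendered::Line(line),
    }
}
```

With a limit of 5, a line made of eight spaces followed by `foo` is
printed as `foo` when trimming is on, and omitted only when it is off.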
2023-11-25 15:03:53 -05:00
Andrew Gallant
8f9557d183 changelog: mention shell completion generation feature
Closes #2425
2023-11-25 15:03:53 -05:00
Andrew Gallant
58e7d2ea63 doc: add docs about .ignore/.rgignore in parent directories
Closes #2479
2023-11-25 15:03:53 -05:00
Andrew Gallant
b7df9f8caa changelog: mention --field-match-separator bug fix
This was probably fixed in the migration off of Clap.

Closes #2519
2023-11-25 15:03:53 -05:00
Andrew Gallant
ebb986e767 logging: show heuristic information and decision
When one does not provide any paths to ripgrep to search, it has to
guess between searching stdin and the current working directory. It is
possible for this guess to be wrong, and having the heuristics and the
choice in the debug logs is useful for diagnosing this.

The failure mode here is still pretty bad because you need to know to
reach for the `--debug` flag in the first place. Namely, the typical
failure mode is that ripgrep tries to search stdin while the intent is
for it to search the current working directory, and thus likely blocking
forever waiting for data on stdin.

(Arguably this is a problem with the process architecture that invokes
ripgrep. It shouldn't give ripgrep an open stdin handle that isn't
closed.)

Closes #2524
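
A simplified sketch of such a heuristic, returning the decision together
with the reason so that both can be written to the debug log. The
`StdinKind` categories and the exact rules are assumptions made for
illustration and are not ripgrep's actual logic. In a real program the
terminal case could be detected with `std::io::IsTerminal`.

```rust
/// Hypothetical summary of what stdin is attached to.
#[derive(Clone, Copy, Debug, PartialEq)]
enum StdinKind {
    Terminal,
    File,
    Pipe,
    Unknown,
}

/// What to search when no paths were given on the command line.
#[derive(Debug, PartialEq)]
enum Target {
    Stdin,
    CurrentDir,
}

/// Guesses the search target and returns the reason for the guess, so
/// that the caller can log both.
fn guess_target(stdin: StdinKind) -> (Target, &'static str) {
    match stdin {
        StdinKind::Terminal => (Target::CurrentDir, "stdin is a terminal"),
        StdinKind::File => (Target::Stdin, "stdin is a regular file"),
        StdinKind::Pipe => (Target::Stdin, "stdin is a pipe"),
        StdinKind::Unknown => (Target::CurrentDir, "stdin type is unknown"),
    }
}
```

The `Pipe` case is the failure mode described above: a parent process
that leaves an open pipe on stdin and never closes it leads to a search
that waits forever.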
2023-11-25 15:03:53 -05:00
Andrew Gallant
a2907db2de faq: update donation section to mention sponsorships 2023-11-21 19:05:58 -05:00
Andrew Gallant
470ad1d072 faq: rewrite the section on shell completions 2023-11-21 19:02:07 -05:00
Tavian Barnes
6d7550d58e ignore: Avoid contention on num_pending
Previously, every worker would increment the shared num_pending count on
every new work item, and decrement it after finishing them, leading to
lots of contention.  Now, we only track the number of workers actively
running, so there is no contention except when workers go to sleep or
wake up.

Closes #2642
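
A minimal sketch of counting active workers instead of pending work
items. The type and method names are hypothetical, and the termination
rule (the last worker to go idle ends the traversal) is an assumption
about how such a counter would be used.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Tracks how many workers are currently running (that is, not asleep).
struct ActiveWorkers {
    active: AtomicUsize,
}

impl ActiveWorkers {
    /// All workers start out active.
    fn new(workers: usize) -> Self {
        ActiveWorkers { active: AtomicUsize::new(workers) }
    }

    /// Called when a worker runs out of work and is about to sleep.
    /// Returns true if it was the last active worker, in which case no
    /// one is left to produce more work.
    fn deactivate(&self) -> bool {
        self.active.fetch_sub(1, Ordering::AcqRel) == 1
    }

    /// Called when a sleeping worker wakes up to look for work again.
    fn activate(&self) {
        self.active.fetch_add(1, Ordering::AcqRel);
    }
}
```

The shared counter is touched only on sleep and wake transitions, not
once per work item, which is where the reduction in contention comes
from.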
2023-11-21 18:39:32 -05:00
Andrew Gallant
af55fc2b38 cli: make -d a short flag for --max-depth
Interestingly, ripgrep now only has two available ASCII letter short
flags remaining: -k and -y.

Closes #2643, Closes #2644
2023-11-21 18:39:32 -05:00
Andrew Gallant
3d2f49f6fe changelog: --pretty now behaves more sensibly
This actually just kind of fell out of the migration off of Clap as a
result of treating `-p/--pretty` more rigorously as an alias for
`--line-number --heading --color always`.

Fixes #2381, Closes #2637
2023-11-21 18:39:32 -05:00
Andrew Gallant
50b2472438 ci: strip release binaries on macOS
We were purportedly doing this already, but actually weren't because of
confusion in the `if` condition.

Closes #2636
2023-11-21 18:39:32 -05:00
Andrew Gallant
ae2a09915f printer: drop dependency on base64 crate
Instead, we just roll our own. A slow version of this is pretty simple
to do, and that's what we write here. The `base64` crate supports a lot
more functionality and is quite fast, but we care about neither of those
things for this particular aspect of ripgrep. (base64 is only used for
non-UTF-8 data or file paths, which are both quite rare.)
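
For illustration, a simple and deliberately slow encoder along these
lines might look like the sketch below. It implements the standard
alphabet with `=` padding. The function name is hypothetical and this is
not the code from the commit.

```rust
/// The standard base64 alphabet.
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Encodes `bytes` as padded base64, three input bytes at a time.
fn base64_encode(bytes: &[u8]) -> String {
    let mut out = String::with_capacity((bytes.len() + 2) / 3 * 4);
    for chunk in bytes.chunks(3) {
        // Pack up to three bytes into one 24-bit group, zero-filling any
        // missing trailing bytes.
        let b0 = chunk[0] as u32;
        let b1 = chunk.get(1).copied().unwrap_or(0) as u32;
        let b2 = chunk.get(2).copied().unwrap_or(0) as u32;
        let group = (b0 << 16) | (b1 << 8) | b2;

        // Emit four 6-bit digits, replacing the digits that cover only
        // missing bytes with padding.
        out.push(ALPHABET[(group >> 18) as usize & 0x3F] as char);
        out.push(ALPHABET[(group >> 12) as usize & 0x3F] as char);
        if chunk.len() > 1 {
            out.push(ALPHABET[(group >> 6) as usize & 0x3F] as char);
        } else {
            out.push('=');
        }
        if chunk.len() > 2 {
            out.push(ALPHABET[group as usize & 0x3F] as char);
        } else {
            out.push('=');
        }
    }
    out
}
```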
2023-11-21 18:39:32 -05:00
Andrew Gallant
9c84575229 printer: drop dependency on serde_derive
As suggested by @epage[1].

Ad hoc timings on my i9-12900K:

    before cargo build: 4.91s
    before cargo build release: 8.05s
    after cargo build: 4.69s
    after cargo build release: 7.83s

... pretty underwhelming if you ask me. Ah well. And on my M2 mac mini:

    before cargo build: 6.18s
    before cargo build release: 14.50s
    after cargo build: 5.52s
    after cargo build release: 13.44s

Still kind of underwhelming, but definitely better. It shaves a full
second off of compile times in release mode. I went back to my
i9-12900K, but passed `-j1` to `cargo build` to force single threaded
mode:

    before cargo build: 19.44s
    before cargo build release: 50.64s
    after cargo build: 16.76s
    after cargo build release: 48.00s

Which seems pretty consistent with the modest improvements above.

Looking at `cargo build --timings`, the beefiest chunk of time is spent
in compiling `regex-automata`, by far. This is fine because it's core
functionality. I wish a fast general purpose regex engine with its
internals exposed as a separately versioned library didn't require so
much code... Blech.

[1]: https://old.reddit.com/r/rust/comments/17rd8ww/faster_compilation_with_the_parallel_frontend_in/k8igjlg/
2023-11-21 18:39:32 -05:00
Andrew Gallant
cddb5f57f8 printer: rejigger how we use serde_derive
The idea is that by bringing derives in via serde's optional feature, it
was inhibiting compilation speed[1]. We try to fix that by depending on
`serde_derive` as a distinct dependency.

It does seem to improve overall compilation time, but only by about 0.5
seconds. With that said, my machine has a lot of cores, so it's possible
this will help more on less powerful CPUs.

[1]: https://old.reddit.com/r/rust/comments/17rd8ww/faster_compilation_with_the_parallel_frontend_in/k8igjlg/
2023-11-21 18:39:32 -05:00
Andrew Gallant
5dc424d302 doc: scrub mentions of asciidoc/asciidoctor
This optional dependency is now finally dropped. So ends a long journey
of trying to generate man pages in a lightweight and dependable way. The
only thing I could figure out how to make work reliably was to just
learn how to write roff myself. Yay.
2023-11-21 18:39:32 -05:00
Andrew Gallant
040d8f2171 ci: improve docs for manual build-and-publish scripts 2023-11-21 18:39:32 -05:00
Andrew Gallant
c81caa673b core: fix file separator bug
I introduced a regression in the migration off of Clap by having
both the buffer writer and the printer be responsible for printing file
separators in multi-threaded search. The buffer writer owns that
responsibility in multi-threaded search.
2023-11-21 18:39:32 -05:00
Andrew Gallant
082245dadb cli: replace clap with lexopt and supporting code
ripgrep began its life with docopt for argument parsing. Then it moved
to Clap and stayed there for a number of years. Clap has served ripgrep
well, and it probably could continue to serve ripgrep well, but I ended
up deciding to move off of it.

Why?

The first time I had the thought of moving off of Clap was during the
2->3->4 transition. I thought the 3.x and 4.x releases were great, but
for me, it ended up moving a little too quickly. Since the release of
4.x was telegraphed around when 3.x came out, I decided to just hold off
and wait to migrate to 4.x instead of doing a 3.x migration followed
shortly by another 4.x migration. Of course, I just never ended up doing
the migration at all. I never got around to it and there just wasn't a
compelling reason for me to upgrade. While I never investigated it, I
saw an upgrade as a non-trivial amount of work in part because I didn't
encapsulate the usage of Clap enough.

The above is just what got me started thinking about it. It wasn't
enough to get me to move off of it on its own. What ended up pushing me
over the edge was a combination of factors:

* As mentioned above, I didn't want to run on the migration treadmill.
This has proven to not be much of an issue, but at the time of the
2->3->4 releases, I didn't know how long Clap 4.x would be out before a
5.x would come out.
* The release of lexopt[1] caught my eye. IMO, that crate demonstrates
exactly how something new can arrive on the scene and just thoroughly
solve a problem minimalistically. It has the docs, the reasoning, the
simple API, the tests and good judgment. It gets all the weird corner
cases right that Clap also gets right (and is part of why I was
originally attracted to Clap).
* I have an overall desire to reduce the size of my dependency tree. In
part because a smaller dependency tree tends to correlate with better
compile times, but also in part because it reduces my reliance on and
trust in others. It lets me be the "master" of ripgrep's destiny by reducing
the amount of behavior that is the result of someone else's decision
(whether good or bad).
* I perceived that Clap solves a more general problem than what I
actually need solved. Despite the vast number of flags that ripgrep has,
its requirements are actually pretty simple. We just need simple
switches and flags that support one value. No multi-value flags. No
sub-commands. And probably a lot of other functionality that Clap has
that makes it so flexible for so many different use cases. (I'm being
hand wavy on the last point.)

With all that said, perhaps most importantly, the future of ripgrep
possibly demands a more flexible CLI argument parser. In today's world,
I would really like, for example, flags like `--type` and `--type-not`
to be able to accumulate their repeated values into a single sequence
while respecting the order they appear on the CLI. For example, prior
to this migration, `rg regex-automata -Tlock -ttoml` would not return
results in `Cargo.lock` in this repository because the `-Tlock` always
took priority even though `-ttoml` appeared after it. But with this
migration, `-ttoml` now correctly overrides `-Tlock`. We would like to
do similar things for `-g/--glob` and `--iglob` and potentially even
now introduce a `-G/--glob-not` flag instead of requiring users to use
`!` to negate a glob. (Which I had done originally to work-around this
problem.) And some day, I'd like to add some kind of boolean matching to
ripgrep perhaps similar to how `git grep` does it. (Although I haven't
thought too carefully on a design yet.) In order to do that, I perceive
it would be difficult to implement correctly in Clap.
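
A sketch of the order-respecting behavior for `-t`/`-T`, under stated
assumptions: `TypeChange`, `type_matches` and `is_searched` are
hypothetical names, the two type definitions are simplified stand-ins
for real glob sets, and "the last matching flag decides" is one way to
model the override described above.

```rust
/// One `-t`/`-T` occurrence, kept in command-line order.
enum TypeChange {
    Select(&'static str),
    Negate(&'static str),
}

/// Simplified, illustrative type definitions: does `file` belong to the
/// file type `ty`?
fn type_matches(ty: &str, file: &str) -> bool {
    match ty {
        "toml" => file.ends_with(".toml") || file == "Cargo.lock",
        "lock" => file.ends_with(".lock"),
        _ => false,
    }
}

/// Walks the flags from last to first. The last flag whose type matches
/// the file decides whether the file is searched.
fn is_searched(changes: &[TypeChange], file: &str) -> bool {
    for change in changes.iter().rev() {
        match change {
            TypeChange::Select(ty) if type_matches(ty, file) => return true,
            TypeChange::Negate(ty) if type_matches(ty, file) => return false,
            _ => {}
        }
    }
    // No flag mentions this file: search it only if nothing was selected.
    !changes.iter().any(|c| matches!(c, TypeChange::Select(_)))
}
```

Under this model, `-Tlock -ttoml` searches `Cargo.lock` because
`-ttoml` comes last, while `-ttoml -Tlock` does not.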

I believe that this last point is possible to implement correctly in
Clap 2.x, although it is awkward to do so. I have not looked closely
enough at the Clap 4.x API to know whether it's still possible there. In
any case, these were enough reasons to move off of Clap and own more of
the argument parsing process myself.

This did require a few things:

* I had to write my own logic for how arguments are combined into one
single state object. Of course, I wanted this. This was part of the
upside. But it's still code I didn't have to write for Clap.
* I had to write my own shell completion generator.
* I had to write my own `-h/--help` output generator.
* I also had to write my own man page generator. Well, I had to do this
with Clap 2.x too, although my understanding is that Clap 4.x supports
this. With that said, without having tried it, my guess is that I
probably wouldn't have liked the output it generated because I
ultimately had to write most of the roff by hand myself to get the man
page I wanted. (This also had the benefit of dropping the build
dependency on asciidoc/asciidoctor.)

While this is definitely a fair bit of extra work, it overall only cost
me a couple days. IMO, that's a good trade off given that this code is
unlikely to change again in any substantial way. And it should also
allow for more flexible semantics going forward.

Fixes #884, Fixes #1648, Fixes #1701, Fixes #1814, Fixes #1966

[1]: https://docs.rs/lexopt/0.3.0/lexopt/index.html
2023-11-20 23:51:53 -05:00
Andrew Gallant
c33f623719 cargo: explicitly configure musl to be statically linked
It looks like the musl target will, at some point, default to be
dynamically linked. This config knob should make it so that it's always
statically linked.

Ref https://github.com/rust-lang/compiler-team/issues/422
Ref https://github.com/rust-lang/compiler-team/issues/422#issuecomment-812135847
2023-11-20 23:51:53 -05:00
Jonas Platte
824778c009 globset: add GlobSet::builder
This avoids needing to import and call GlobSetBuilder::new explicitly.

Closes #2635
2023-11-20 23:51:53 -05:00
Kento Okamoto
922bad2b92 ignore: improve 'excludesFile' parsing
This permits the value to be surrounded in double quotes. It's still not
perfect, but probably better than it was. Getting this to be more
correct will likely require writing (or using) a real parser, which I'm
not particularly inclined to do at present.

Fixes #2392, Closes #2629
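
A rough sketch of what accepting a double-quoted value can look like.
The function name is hypothetical, and like the change described above
this is not a real gitconfig parser: it ignores sections, escapes and
comments.

```rust
/// Extracts the value of `excludesFile` from the text of a gitconfig
/// file, permitting the value to be wrapped in double quotes.
fn parse_excludes_file(config: &str) -> Option<String> {
    for line in config.lines() {
        let Some((key, value)) = line.split_once('=') else {
            continue;
        };
        if !key.trim().eq_ignore_ascii_case("excludesfile") {
            continue;
        }
        let value = value.trim();
        // Strip one surrounding pair of double quotes, if present.
        let value = value
            .strip_prefix('"')
            .and_then(|v| v.strip_suffix('"'))
            .unwrap_or(value);
        return Some(value.to_string());
    }
    None
}
```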
2023-11-20 23:51:53 -05:00
Andrew Gallant
538ba956dc deps: bump regex and regex-automata 2023-11-20 23:51:53 -05:00
Andrew Gallant
443c057042 deps: bump regex, regex-automata and regex-syntax 2023-11-20 23:51:53 -05:00
Andrew Gallant
5b88515faf build: a bit of clean-up
This does just a smidge of polishing in the build script source code.
2023-11-20 23:51:53 -05:00
Andrew Gallant
92c81b1225 core: switch to anyhow
This commit adds `anyhow` as a dependency and switches over to it from
Box<dyn Error>.

It actually looks like I've kept all of my errors rather shallow, such
that we don't get a huge benefit from anyhow at present. But now that
anyhow is in use, I expect to use its "context" feature more going
forward.
2023-11-20 23:51:53 -05:00
Tavian Barnes
53679e4c43 ignore: simplify the work-stealing strategy
There's no particular reason for this change. I happened to be looking
at the code again and realized that stealing from your left neighbour
or your right neighbour shouldn't make a difference (and indeed perf is
the same in my benchmarks).

Closes #2624
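
A sketch of a one-directional stealing order. This is an illustration
only: the direction chosen here is arbitrary, the function names are
hypothetical, and `Mutex<VecDeque<T>>` stands in for a real
work-stealing deque.

```rust
use std::collections::VecDeque;
use std::sync::Mutex;

/// Returns the order in which worker `index` (out of `n` workers) visits
/// its peers: every other worker once, starting with the next neighbour
/// and wrapping around.
fn steal_order(index: usize, n: usize) -> Vec<usize> {
    (index + 1..n).chain(0..index).collect()
}

/// Tries to steal one item for worker `index` from the first peer, in
/// `steal_order`, that has any work queued.
fn steal<T>(queues: &[Mutex<VecDeque<T>>], index: usize) -> Option<T> {
    steal_order(index, queues.len())
        .into_iter()
        .find_map(|i| queues[i].lock().unwrap().pop_front())
}
```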
2023-11-20 23:51:53 -05:00
Andrew Gallant
8b766a2522 ripgrep: disable hyperlinks by default
As a result of discussion in #2611, it seems prudent to disable
hyperlinks by default. Ideally they would be enabled, but it looks like
some environments may barf on them. Since this is the first release with
hyperlink support, it makes sense to me at least to make users opt into
them. This does not preclude enabling them by default in future
releases.
2023-11-20 23:51:53 -05:00
Andrew Gallant
c21302b409 regex: tweak inner literal heuristic
Previously, we had logic to skip our own inner literal optimization if
the regex itself was already (likely) accelerated. It turns out that the
presence of a Unicode word boundary can defeat acceleration to a point.
It's likely enough that even if the underlying regex is accelerated, it
would be prudent to do our own inner literal optimization if the pattern
has a Unicode word boundary.

Normally a Unicode word boundary doesn't defeat literal optimizations,
since even the slower engines can make use of *prefix* literal
optimizations. But a regex can be accelerated via its own inner or
suffix literal optimizations, and those require the use of a DFA (or
lazy DFA). Since DFAs crap out on haystacks that contain a non-ASCII
Unicode scalar value when the regex contains a Unicode word boundary, it
follows that an "accelerated" regex can still wind up being quite slow.

(An "accelerated" regex can also slow down because of restrictions on
avoiding quadratic behavior, but I believe this happens less frequently
and is not as severe as the slow down as a result of Unicode word
boundaries. Namely, avoiding quadratic behavior just means giving up on
the inner literal optimization for a single search. In which case, the
regex engine can still fall back to a normal forward DFA. That will
definitely be slower than an inner literal optimization done by ripgrep,
but not quite as dramatic as it would be when DFAs can't be used at
all.)
2023-11-20 23:51:53 -05:00
Andrew Gallant
8a5b81716a deps: update dependencies
Specifically, regex-syntax 0.8.1 has this fix:
f082244720
2023-11-20 23:51:53 -05:00
Andrew Gallant
7099e174ac cargo: remove dependency patches
I'm too lazy to fixup old commits.
2023-10-09 20:29:52 -04:00
Andrew Gallant
dd810779d4 changelog: add another note about -w/--word-regexp bugs
This was fixed a few commits ago when we updated to regex-automata 0.4
(regex 1.10).

Fixes #2623
2023-10-09 20:29:52 -04:00
Andrew Gallant
5011f6e9f1 changelog: add perf bug fix for \b
Like the previous CHANGELOG entry, this marks a bug that was fixed
likely with the introduction of regex 1.9:

    $ hyperfine "rg-13.0.0 -ic '\bfoo\b \bbar\b' git-3a06386e.txt" "rg -ic '\bfoo\b \bbar\b' git-3a06386e.txt"
    Benchmark 1: rg-13.0.0 -ic '\bfoo\b \bbar\b' git-3a06386e.txt
      Time (mean ± σ):      1.034 s ±  0.011 s    [User: 1.030 s, System: 0.004 s]
      Range (min … max):    1.021 s …  1.053 s    10 runs

    Benchmark 2: rg -ic '\bfoo\b \bbar\b' git-3a06386e.txt
      Time (mean ± σ):       6.3 ms ±   0.3 ms    [User: 4.6 ms, System: 1.6 ms]
      Range (min … max):     5.6 ms …   7.3 ms    343 runs

    Summary
      'rg -ic '\bfoo\b \bbar\b' git-3a06386e.txt' ran
      164.95 ± 7.70 times faster than 'rg-13.0.0 -ic '\bfoo\b \bbar\b' git-3a06386e.txt'

This was not fixed by making \b itself faster, but rather, by improving
inner literal extraction. In particular, if the regex doesn't have any
literals extracted, then search time can still be quite slow:

    $ time rg-13.0.0 -ic '\b[a-z]{3}\b\s\b[a-z]{3}\b' git-3a06386e.txt
    57538

    real    0.427
    user    0.423
    sys     0.003
    maxmem  46 MB
    faults  0
    $ time rg -ic '\b[a-z]{3}\b\s\b[a-z]{3}\b' git-3a06386e.txt
    57538

    real    0.337
    user    0.333
    sys     0.003
    maxmem  46 MB
    faults  0

But then again, so is grep, because grep doesn't benefit from any
literal optimizations either:

    $ time grep -E -ic '\b[a-z]{3}\b\s\b[a-z]{3}\b' git-3a06386e.txt
    62396

    real    1.316
    user    1.292
    sys     0.007
    maxmem  13 MB
    faults  7

The count mismatch should probably be investigated.

Fixes #1760
2023-10-09 20:29:52 -04:00
Andrew Gallant
a2799ccb41 changelog: add bug fix for \b
This was probably fixed in a past commit where I bumped the regex engine
to 1.9 (or perhaps more precisely, regex-automata 0.3). But I didn't
track it as fixed at the time.

Fixes #1275
2023-10-09 20:29:52 -04:00
Andrew Gallant
a13b5e0196 deps: update various crates 2023-10-09 20:29:52 -04:00
Andrew Gallant
9626f16757 progress 2023-10-09 20:29:52 -04:00
Andrew Gallant
f7ff34fdf9 searcher: simplify 'replace_bytes' routine
I did this in the course of trying to optimize it. I don't believe I
made it any faster, but the refactoring led to code that I think is
more readable.
2023-10-09 20:29:52 -04:00
Andrew Gallant
b9de003f81 matcher: add a bunch of inline annotations
Many of these functions should be inlineable, but I'm not 100% sure
that they can be inlined without these annotations. We don't want to
force things, but we do try and nudge the compiler in the right
direction.
2023-10-09 20:29:52 -04:00
Andrew Gallant
1659fb9b43 printer: hand-roll decimal formatting
It seems like a trifle, but if the match frequency is high enough, the
allocation+formatting of line numbers (and columns and byte offsets)
starts to matter. We squash that part of the profile in this commit by
doing our own decimal formatting. I speculate that we get a speed-up
from this by avoiding the formatting machinery and also a possible
allocation.

An alternative would be to use the `itoa` crate, and it is indeed
marginally faster in ad hoc benchmarks, but I'm satisfied enough with
this solution.
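
A minimal sketch of what hand-rolled decimal formatting can look like, assuming a fixed-size stack buffer so that no heap allocation is needed. The `DecimalBuf` name and layout are hypothetical and are not taken from grep-printer:

```rust
/// Hypothetical helper (not ripgrep's actual type): holds the base-10
/// digits of a `u64` in a stack buffer, right-aligned.
struct DecimalBuf {
    buf: [u8; 20], // u64::MAX has exactly 20 decimal digits
    start: usize,  // index of the most significant digit
}

impl DecimalBuf {
    fn new(mut n: u64) -> DecimalBuf {
        let mut buf = [b'0'; 20];
        let mut start = buf.len();
        // Write digits from least significant to most significant.
        // The loop runs at least once so that 0 is formatted as "0".
        loop {
            start -= 1;
            buf[start] = b'0' + (n % 10) as u8;
            n /= 10;
            if n == 0 {
                break;
            }
        }
        DecimalBuf { buf, start }
    }

    /// The formatted digits, ready to be written to the output as-is.
    fn as_bytes(&self) -> &[u8] {
        &self.buf[self.start..]
    }
}
```

A caller would write `DecimalBuf::new(line_number).as_bytes()` straight to the output, bypassing `std::fmt`.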
2023-10-09 20:29:52 -04:00
Andrew Gallant
dd1bc5b898 printer: sprinkle in a few #[inline] annotations
These seem to help when ripgrep emits a lot of output, especially when
the --column flag is used.
2023-10-09 20:29:52 -04:00
Andrew Gallant
c9bfbe1e3d deps: bump regex and regex-automata
This brings in a fix for a bug I found during ad hoc benchmarking:
aa4e4c7120
2023-10-09 20:29:52 -04:00
Andrew Gallant
88524a2b52 core: dedup patterns
ripgrep does not, and likely never will, report which pattern matched.
Because of that, we can dedup the patterns via just their concrete
syntax without any fuss.

This is somewhat of a pathological case because you don't expect the end
user to pass duplicate patterns in general. But if the end user
generated a list of, say, names and did not dedup them, then ripgrep
could end up spending a lot of extra time on those duplicates if there
are many of them. By deduping them explicitly in the application, we
essentially remove their extra cost completely.
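
As a rough illustration of deduping by concrete syntax only, a sketch along these lines keeps the first occurrence of each pattern string and drops later copies. The function name is hypothetical, and this is not the code in crates/core:

```rust
use std::collections::HashSet;

/// Hypothetical helper: removes exact duplicate pattern strings while
/// preserving the order in which patterns were first given.
fn dedup_patterns(patterns: Vec<String>) -> Vec<String> {
    let mut seen = HashSet::new();
    let mut out = Vec::with_capacity(patterns.len());
    for pat in patterns {
        // `insert` returns false if the string was already present.
        if seen.insert(pat.clone()) {
            out.push(pat);
        }
    }
    out
}
```

Because only the text is compared, patterns that are equivalent but spelled differently (say, `a|b` and `b|a`) are both kept.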
2023-10-09 20:29:52 -04:00
Andrew Gallant
9c6732bd26 printer: remove 'subl' alias
It was apparently using a format specific to a particular plugin. I did
know that, but apparently the plugin is not ubiquitous or de facto
standard[1]. Thus, including it I think just leads to more confusion. We
definitely do not want to be in the business of bundling aliases for
every conceivable plugin to different editors, so just drop it. We
expose the ability to write your own format for exactly this sort of
reason.

[1]: https://github.com/BurntSushi/ripgrep/discussions/2611#discussioncomment-7138302
2023-10-09 20:29:52 -04:00
Andrew Gallant
392bb0944a core: polish the core of ripgrep
This I believe finishes our quest to do mechanical updates to ripgrep's
style, bringing it in line with my current practice (loosely speaking).
2023-10-09 20:29:52 -04:00
Andrew Gallant
90b849912f deps: bump what we can 2023-10-09 20:29:52 -04:00
Andrew Gallant
6d17b3ed68 deps: drop thread_local, lazy_static and once_cell
This is largely made possible by the addition of std::sync::OnceLock to
the standard library, and the memory pool available in regex-automata.
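
For readers unfamiliar with `std::sync::OnceLock`, a small sketch of the lazily initialized static pattern that previously called for lazy_static or once_cell. The `default_types` function and its contents are made up for illustration:

```rust
use std::sync::OnceLock;

/// Hypothetical example: a value computed on first use and then shared
/// for the life of the process, with no external crate.
fn default_types() -> &'static Vec<&'static str> {
    static TYPES: OnceLock<Vec<&'static str>> = OnceLock::new();
    // The closure runs at most once, even when called from many threads.
    TYPES.get_or_init(|| vec!["rust", "py", "md"])
}
```

Every call after the first returns a reference to the same value.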
2023-10-09 20:29:52 -04:00
Andrew Gallant
f16ea0812d ignore: polish
Like previous commits, we do a bit of polishing and bring the style up
to my current practice.
2023-10-09 20:29:52 -04:00
Andrew Gallant
be9e308999 globset: use a Pool from regex-automata
In the time before, we just used a RegexSet from the regex crate. That
allocated unconditionally, so there was nothing we could do and it
didn't expose any APIs to reuse that memory. But now that we're using
the lower level regex-automata, we can reuse a PatternSet.

Ideally we would just provide a way for the caller to build a PatternSet
(perhaps via an opaque type) so that we don't have to shuffle data into
a PatternSet and then back into the caller's `Vec<usize>`. But this at
least avoids allocating for every search.
2023-10-09 20:29:52 -04:00
Andrew Gallant
d53b7310ee searcher: polish
This updates some dependencies and brings code style in line with my
current practice.
2023-10-09 20:29:52 -04:00
Andrew Gallant
e30bbb8cff grep: update to the 2021 edition 2023-10-09 20:29:52 -04:00
Andrew Gallant
7f45640401 globset: polishing
This brings the code in line with my current style. It also inlines the
dozen or so lines of code for FNV hashing instead of bringing in a
micro-crate for it. Finally, it drops the dependency on regex in favor
of using regex-syntax and regex-automata directly.
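
To give a sense of how little code FNV hashing needs, here is a sketch of the 64-bit FNV-1a variant. That globset inlines exactly this variant is an assumption; the constants are the standard published FNV offset basis and prime:

```rust
/// 64-bit FNV-1a over a byte slice (assumed variant, for illustration).
fn fnv1a_64(bytes: &[u8]) -> u64 {
    const OFFSET_BASIS: u64 = 0xcbf29ce484222325;
    const PRIME: u64 = 0x100000001b3;
    let mut hash = OFFSET_BASIS;
    for &b in bytes {
        // FNV-1a: xor the byte in first, then multiply by the prime.
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(PRIME);
    }
    hash
}
```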
2023-10-09 20:29:52 -04:00
Andrew Gallant
0951820f63 core: doc and logging touchups 2023-10-09 20:29:52 -04:00
Lucas Trzesniewski
c3e85f2b44 printer: fix a few issues in the hyperlink docs
Closes #2612
2023-10-09 20:29:52 -04:00
Andrew Gallant
3ad7a0d95e crates: remove hard-coded links
And use rustdoc's native intra-crate links. So much nicer.
2023-10-09 20:29:52 -04:00
Andrew Gallant
82d3183a04 regex: some minor polish
I think I already did a clean-up of this crate when I moved it to regex
1.9, so the polish here is very minor.
2023-10-09 20:29:52 -04:00
Andrew Gallant
798f8981eb pcre2: small polishing 2023-10-09 20:29:52 -04:00
Andrew Gallant
96f01b92a0 matcher: polish the grep-matcher crate
Not much here. Just updating to reflect my current style and bringing
the crate to the 2021 edition.
2023-10-09 20:29:52 -04:00
Linda_pp
abfa65c2c1 ignore/types: add *.sarif for SARIF format files
[SARIF] is a format for reporting static analysis results. It is [used
by GitHub CodeQL][GH] for example.

Here are some samples from Microsoft's VSCode extension:

https://github.com/microsoft/sarif-vscode-extension/tree/main/samples

The SARIF format is built on top of JSON.

[SARIF]: https://docs.oasis-open.org/sarif/sarif/v2.1.0/csprd01/sarif-v2.1.0-csprd01.html
[GH]: https://docs.github.com/en/code-security/code-scanning/integrating-with-code-scanning/sarif-support-for-code-scanning

PR #2620
2023-10-05 13:23:29 -04:00
Andrew Gallant
f608d4d9b3 hyperlink: rejigger how hyperlinks work
This essentially takes the work done in #2483 and does a bit of a
facelift. A brief summary:

* We reduce the hyperlink API we expose to just the format, a
  configuration and an environment.
* We move buffer management into a hyperlink-specific interpolator.
* We expand the documentation on --hyperlink-format.
* We rewrite the hyperlink format parser to be a simple state machine
  with support for escaping '{{' and '}}'.
* We remove the 'gethostname' dependency and instead insist on the
  caller to provide the hostname. (So grep-printer doesn't get it
  itself, but the application will.) Similarly for the WSL prefix.
* Probably some other things.

Overall, the general structure of #2483 was kept. The biggest change is
probably requiring the caller to pass in things like a hostname instead
of having the crate do it. I did this for a couple reasons:

1. I feel uncomfortable with code deep inside the printing logic
   reaching out into the environment to assume responsibility for
   retrieving the hostname. This feels more like an application-level
   responsibility. Arguably, path canonicalization falls into this same
   bucket, but it is more difficult to rip that out. (And we can do it
   in the future in a backwards compatible fashion I think.)
2. I wanted to permit end users to tell ripgrep about their system's
   hostname in their own way, e.g., by running a custom executable. I
   want this because I know at least for my own use cases, I sometimes
   log into systems using an SSH hostname that is distinct from the
   system's actual hostname (usually because the system is shared in
   some way or changing its hostname is not allowed/practical).

I think that's about it.
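
The state-machine approach to format parsing mentioned above can be sketched roughly as follows. This is a simplified illustration, not the grep-printer parser: it interpolates in one pass, supports only `{path}` and `{line}`, and the function name, state names and error strings are all hypothetical:

```rust
/// Hypothetical sketch: substitutes `{path}` and `{line}` into `format`,
/// treating `{{` and `}}` as escapes for literal braces.
fn interpolate(format: &str, path: &str, line: u64) -> Result<String, String> {
    enum State {
        Verbatim,   // copying literal text
        SawOpen,    // just saw one '{'
        SawClose,   // just saw one '}' outside of a variable
        InVariable, // collecting a variable name
    }
    let mut out = String::new();
    let mut name = String::new();
    let mut state = State::Verbatim;
    for ch in format.chars() {
        state = match (state, ch) {
            (State::Verbatim, '{') => State::SawOpen,
            (State::Verbatim, '}') => State::SawClose,
            (State::Verbatim, c) => {
                out.push(c);
                State::Verbatim
            }
            // "{{" is an escaped literal '{'.
            (State::SawOpen, '{') => {
                out.push('{');
                State::Verbatim
            }
            (State::SawOpen, '}') => return Err("empty variable name".to_string()),
            (State::SawOpen, c) => {
                name.push(c);
                State::InVariable
            }
            // "}}" is an escaped literal '}'.
            (State::SawClose, '}') => {
                out.push('}');
                State::Verbatim
            }
            (State::SawClose, _) => return Err("unmatched '}'".to_string()),
            (State::InVariable, '}') => {
                match name.as_str() {
                    "path" => out.push_str(path),
                    "line" => out.push_str(&line.to_string()),
                    other => return Err(format!("unknown variable '{}'", other)),
                }
                name.clear();
                State::Verbatim
            }
            (State::InVariable, c) => {
                name.push(c);
                State::InVariable
            }
        };
    }
    match state {
        State::Verbatim => Ok(out),
        _ => Err("unterminated brace".to_string()),
    }
}
```

Note that the hostname is absent here on purpose: in the design described above it is supplied by the caller rather than looked up by the printer.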

Closes #665, Closes #2483
2023-09-25 14:39:54 -04:00
Andrew Gallant
23e21133ba printer: move PathPrinter into grep-printer
I originally did not put PathPrinter into grep-printer because I
considered it somewhat extraneous to what a "grep" program does, and
also that its implementation was rather simple. But now with hyperlink
support, its implementation has grown a smidge more complicated. And
more importantly, its existence required exposing a lot more of the
hyperlink guts. Without it, we can keep things like HyperlinkPath and
HyperlinkSpan completely private.

We can now also keep `PrinterPath` completely private as well. And this
is a breaking change.
2023-09-25 14:39:54 -04:00
Andrew Gallant
09905560ff printer: clean-up
Like a previous commit did for the grep-cli crate, this does some
polishing to the grep-printer crate. We aren't able to achieve as much
as we did with grep-cli, but we at least eliminate all rust-analyzer
lints and group imports in the way I've been doing recently.

Next we'll start doing some more invasive changes.
2023-09-25 14:39:54 -04:00
Andrew Gallant
25a7145c79 cli: add new 'hostname' function
This will enable us to query for the current system's hostname in both
Unix and Windows environments.

We could have pulled in the 'gethostname' crate for this, but:

1. I'm not a huge fan of micro-crates.
2. The 'gethostname' crate panics if an error occurs. (Which, to be
fair, an error should never occur, but it seems plausible on borked
systems? ripgrep runs in a lot of places, so I'd rather not take the
chance of a panic bringing down ripgrep for an optional convenience
feature.)
3. The 'gethostname' crate uses the 'windows-targets' crate from
Microsoft. This is arguably the "right" thing to do, but ripgrep
doesn't use them yet and they appear high-churn.

So I just added a safe wrapper to do this to winapi-util[1] and then
inlined the Unix version here. This brings in no extra dependencies and
the routine is fallible so that callers can recover from potentially
strange failures.

[1]: https://github.com/BurntSushi/winapi-util/pull/14
2023-09-25 14:39:54 -04:00
Andrew Gallant
19a08bee8a cli: clean-up crate
This does a variety of polishing.

1. Deprecate the tty methods in favor of std's IsTerminal trait.
2. Trim down un-needed dependencies.
3. Use bstr to implement escaping.
4. Various aesthetic polishing.

I'm doing this as prep work before adding more to this crate. And as
part of a general effort toward reducing ripgrep's dependencies.
2023-09-25 14:39:54 -04:00
Lucas Trzesniewski
1a50324013 printer: add hyperlinks
This commit represents the initial work to get hyperlinks working and
was submitted as part of PR #2483. Subsequent commits largely retain the
functionality and structure of the hyperlink support added here, but
rejigger some things around.
2023-09-25 14:39:54 -04:00
Andrew Gallant
86ef683308 deps: update everything
Notably, this includes termcolor 1.3, which comes with hyperlink
support.
2023-09-20 11:52:42 -04:00
Tavian Barnes
d938e955af ignore: use work-stealing stack instead of Arc<Mutex<Vec<_>>>
This represents yet another iteration on how `ignore` enqueues and
distributes work in parallel. The original implementation used a
multi-producer/multi-consumer thread safe queue from crossbeam. At some
point, I migrated to a simple `Arc<Mutex<Vec<_>>>` and treated it as a
stack so that we did depth first traversal. This helped with memory
usage in very wide directories.

But it turns out that a naive stack-behind-a-mutex can be quite a bit
slower than something that's a little smarter, such as a work-stealing
stack used in this commit. My hypothesis for why this helps is that
without the stealing component, work distribution can get stuck in
sub-optimal configurations that depend on which directory entries get
assigned to a particular worker. It's likely that this can result in
some workers getting "more" work than others, just by chance, and thus
remain idle. But the work-stealing approach heads that off.

This does re-introduce a dependency on parts of crossbeam which is kind
of a bummer, but it's carrying its weight for now.
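
To make the idea concrete, here is a std-only sketch of a work-stealing stack. It is a stand-in for illustration, not the crossbeam-based code in this commit; the `WorkStacks` type is hypothetical, as is the choice to steal the oldest item from a victim:

```rust
use std::collections::VecDeque;
use std::sync::Mutex;

/// Hypothetical sketch: one deque per worker. A worker treats its own
/// deque as a stack (depth first) and steals from others when it is empty.
struct WorkStacks<T> {
    stacks: Vec<Mutex<VecDeque<T>>>,
}

impl<T> WorkStacks<T> {
    fn new(workers: usize) -> Self {
        let stacks = (0..workers).map(|_| Mutex::new(VecDeque::new())).collect();
        WorkStacks { stacks }
    }

    /// Push work onto the given worker's own stack.
    fn push(&self, worker: usize, item: T) {
        self.stacks[worker].lock().unwrap().push_back(item);
    }

    /// Pop the newest item from our own stack, or else steal the oldest
    /// item from the first non-empty stack belonging to another worker.
    fn pop(&self, worker: usize) -> Option<T> {
        if let Some(item) = self.stacks[worker].lock().unwrap().pop_back() {
            return Some(item);
        }
        let n = self.stacks.len();
        for offset in 1..n {
            let victim = (worker + offset) % n;
            if let Some(item) = self.stacks[victim].lock().unwrap().pop_front() {
                return Some(item);
            }
        }
        None
    }
}
```

With a single shared stack, an idle worker contends on one mutex. Here an idle worker only touches another worker's lock when its own stack has run dry.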

Closes #1823, Closes #2591
Ref https://github.com/sharkdp/fd/issues/28
2023-09-20 11:52:42 -04:00
Thilo Uttendorfer
cad1f5fae2 ignore: fix filtering when searching subdirectories
When searching subdirectories the path was not correctly built and
included duplicate parts. This fix will remove the duplicate part if
possible.

Fixes #1757, Closes #2295
2023-09-20 11:52:42 -04:00
dana
2198bd92fa github: convert bug-report issue template to issue form
Trying this to see how well it works.

PR #2560
2023-09-18 11:07:46 -04:00
Andrew Gallant
a4387ed491 deps: bump to aho-corasick 1.1.0
This brings in aarch64 SIMD support for Teddy[1]. In effect, it means
searches that are multiple (but a small number of) literals extracted
will likely get much faster on aarch64 (i.e., Apple silicon). For
example, from the PR, on my M2 mac mini:

    $ time rg-before-teddy-aarch64 -i -c 'Sherlock Holmes' OpenSubtitles2018.half.en
    3055

    real    8.196
    user    7.726
    sys     0.469
    maxmem  5728 MB
    faults  17

    $ time rg-after-teddy-aarch64 -i -c 'Sherlock Holmes' OpenSubtitles2018.half.en
    3055

    real    1.127
    user    0.701
    sys     0.425
    maxmem  4880 MB
    faults  13

w00t.

[1]: https://github.com/BurntSushi/aho-corasick/pull/129
2023-09-18 09:35:06 -04:00
Andrew Gallant
d2a409f89f deps: bump to memchr 2.6.3
This brings in a fix for line counting when SIMD isn't available[1].

[1]: https://github.com/BurntSushi/memchr/pull/137
2023-09-02 14:40:45 -04:00
Andrew Gallant
6cdb99ea61 deps: drop bytecount in favor of memchr_iter(..).count()
As of the memchr 2.6 release, its Iterator::count method is specialized
to only count the number of occurrences instead of finding the offset of
each occurrence. This replaces ripgrep's use of the bytecount crate.
While micro-benchmarks suggest that memchr's method has better
throughput than bytecount, it turned out to be an illusion. Namely, on a
~13GB haystack prior to this change:

    $ time rg-bytecount 'You killed my friend, my best friend, my lifelong friend!' OpenSubtitles2018.raw.en --line-number
    441450441:- You killed my friend, my best friend, my lifelong friend!

    real    1.473
    user    1.186
    sys     0.286
    maxmem  12512 MB
    faults  0

And then after:

    $ time rg 'You killed my friend, my best friend, my lifelong friend!' OpenSubtitles2018.raw.en --line-number
    441450441:- You killed my friend, my best friend, my lifelong friend!

    real    1.532
    user    1.280
    sys     0.250
    maxmem  12512 MB
    faults  0

But perf is just about in the same ballpark. That's good enough for me
at the moment in order to drop the extra dependency.

I did this because the marginal cost of adding the Iterator::count()
specialization to memchr was extremely small.
2023-09-02 12:25:34 -04:00
Andrew Gallant
551ad3bada deps: update bstr 2023-09-02 12:15:15 -04:00
Andrew Gallant
8856f72df5 deps: update the regex family of crates 2023-09-02 12:14:50 -04:00
Yochem van Rosmalen
d596f6ebd0 ignore/types: add *.vsh to V type
PR #2604
2023-08-31 08:51:07 -04:00
Christian Vallentin
6cd9479634 ignore: implement FusedIterator for Walk
PR #2567
2023-08-28 22:55:19 -04:00
Andrew Gallant
3bfa125b2e ci: replace mips with powerpc64, aarch64 and s390x
We drop our MIPS target because it no longer works.[1] We were
previously using it as a means of testing ripgrep in a big endian
environment. So to achieve that without MIPS, we test on powerpc64 and
s390x. (No particular reason to do both, but why not.)

We also add aarch64 as a proxy for at least ensuring everything works
for the same architecture as Apple silicon. It's not a guarantee that
everything works, but it seems better than nothing until we can actually
test Apple silicon in CI.

[1]: c788378d6f
2023-08-28 22:45:46 -04:00
Andrew Gallant
51765f2f4c ignore: apply rustfmt
I believe this happened because rustfmt now knows how to format `let ...
else` constructs.
2023-08-28 20:09:26 -04:00
Andrew Gallant
67abd49678 deps: bump everything else 2023-08-28 20:00:41 -04:00
Andrew Gallant
a7fe296772 deps: bump regex, regex-automata and regex-syntax 2023-08-28 19:59:09 -04:00
Andrew Gallant
f75991538b deps: bump memchr to 2.6.0
This in particular brings in a PR[1] that provides huge speedups on
aarch64 (e.g., Apple silicon).

[1]: https://github.com/BurntSushi/memchr/pull/129
2023-08-28 19:56:59 -04:00
mataha
962d47e6a1 ignore/types: add Prolog file types
This improves the Prolog file type rules.

* `.pl` is the most common extension in the wild, though `.pro` is
   preferred in places where file extension may clash with Perl[1].
* `.P` is used for compatibility with XSB Prolog dialect[2].

PR #2590

[1]: https://www.swi-prolog.org/pldoc/man?section=fileext
[2]: https://www.swi-prolog.org/pldoc/man?section=xsb-source
2023-08-21 10:53:56 -04:00
mataha
19b6a45abb ignore/types: tweak Gradle file types
This PR extends Gradle file types with the following:

 - Kotlin DSL buildscripts (`*.gradle.kts`)
 - Gradle Java properties (`gradle.properties`)
 - wrapper files (`gradle-wrapper.*`)
 - wrapper scripts (`gradlew`, `gradlew.bat`)

PR #2587
2023-08-20 18:49:02 -04:00
Andrew Gallant
c51790b56d deps: update everything 2023-08-15 11:09:46 -04:00
Andrew Gallant
2af3734e0c deps: update aho-corasick
This brings in [1,2], which improves memory usage substantially when
Aho-Corasick is used.

[1]: https://github.com/BurntSushi/aho-corasick/pull/120
[2]: https://github.com/BurntSushi/aho-corasick/pull/121
2023-08-15 11:08:41 -04:00
Andrew Gallant
61733f6378 globset-0.4.13 2023-08-05 09:34:36 -04:00
Andrew Gallant
7227e94ce5 globset: use non-capture groups in regex transform
We currently implement globs by converting them to regexes, and in doing
so, sometimes use grouping. In all but one case, we used non-capturing
groups. But for alternations, we used capturing groups, which was likely
just an oversight. We don't make use of capture groups at all, and while
they usually don't have any overhead, they lead to weird cases like this
one: https://github.com/rust-lang/regex/issues/1059

That particular issue is also a bug in the regex crate itself, which is
fixed in https://github.com/rust-lang/regex/pull/1062. Note though that
the bug fix in the regex crate is required. Even with this patch to
globset, memory usage is reduced (by about half in rust-lang/regex#1059)
but is not returned to where it was prior to the regex 1.9 release.
2023-08-05 09:33:57 -04:00
Andrew Gallant
341a19e0d0 regex: fix fast path for -w/--word-regexp flag (#2576)
It turns out our fast path for -w/--word-regexp wasn't quite correct in
some cases. Namely, we use `(?m:^|\W)(<original-regex>)(?m:\W|$)` as the
implementation of -w/--word-regexp since `\b(<original-regex>)\b` has
some unintuitive results in certain cases, specifically when
<original-regex> matches non-word characters at match boundaries.

The problem is that using this formulation means that you need to
extract the capture group around <original-regex> to find the "real"
match, since the surrounding (^|\W) and (\W|$) aren't part of the match.
This is fine, but the capture group engine is usually slow, so we have a
fast path where we try to deduce the correct match boundary after an
initial match (before running capture groups). The problem is that doing
this is rather tricky because it's hard to know, in general, whether the
`^` or the `\W` matched.

This still doesn't seem quite right overall, but we at least fix one
more case.

Fixes #2574
2023-07-31 08:51:09 -04:00
Vidar
fed4fea217 ignore/types: add csproj
Supports the .NET C# Project file extension.

PR #2575
2023-07-31 07:08:44 -04:00
Andrew Gallant
053a1669bb globset-0.4.12 2023-07-26 19:51:38 -04:00
David Tolnay
31d3f16254 api: impl Deserialize for GlobSet
PR #2569
2023-07-26 19:51:22 -04:00
Andrew Gallant
304a60e8e9 grep-cli-0.1.9 2023-07-18 13:25:23 -04:00
Andrew Gallant
1d35859861 globset-0.4.11 2023-07-12 12:58:43 -04:00
mataha
601e122e9f ignore/types: add Windows Command Prompt files
This PR adds `*.bat` and `*.cmd` file types.

In doing so, it makes a distinction between batch files (old standard
from the MS-DOS era) and command scripts (new flavor - can operate on
batch files, although `*.cmd` is preferred for various reasons, the
main one being batch files will set `ERRORLEVEL` following inconsistent
MS-DOS style rules[1]).

PR #2556

[1]: https://groups.google.com/g/microsoft.public.win2000.cmdprompt.admin/c/XHeUq8oe2wk/m/LIEViGNmkK0J#i106
2023-07-10 15:58:17 -04:00
Andrew Gallant
efb2e8ce1e ci/release: use latest OS versions 2023-07-09 10:14:03 -04:00
xEgoist
8d464e5c78 ci/release: add sha256 sums to release artifacts
Fixes #1924, Closes #2168
2023-07-09 10:14:03 -04:00
Andrew Gallant
d67809d6c4 github: remove dependabot configuration
This does not seem to have worked at all. For example, there were
Actions being used that were clearly deprecated/archived[1]. But
Dependabot didn't make a peep. So just get rid of it to avoid the false
sense that someone is checking our dependencies for us.

[1]: https://github.com/BurntSushi/ripgrep/pull/2360
2023-07-09 10:14:03 -04:00
nguyenvukhang
6abb962f0d cli: fix non-path sorting behavior
Previously, sorting worked by sorting the parents and then sorting the
children within each parent. This was done during traversal, but it only
works when sorting parents preserves the overall order. This generally
only works for '--sort path' in ascending order.

This commit fixes the rest of the sorting behavior by collecting all of
the paths to search and then sorting them before searching. We only
collect all of the paths when sorting was requested.
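
A small sketch of why collecting first is needed for non-path sort keys. The `Entry` type and `sorted_by_modified` function are hypothetical, and the modification time is reduced to a plain integer:

```rust
/// Hypothetical stand-in for a directory entry and its sort key.
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    path: String,
    modified: u64,
}

/// Sorts the fully collected list by modification time, breaking ties
/// by path so that the result is deterministic.
fn sorted_by_modified(mut entries: Vec<Entry>, reverse: bool) -> Vec<Entry> {
    entries.sort_by(|a, b| {
        let ord = a.modified.cmp(&b.modified).then_with(|| a.path.cmp(&b.path));
        if reverse { ord.reverse() } else { ord }
    });
    entries
}
```

Given `a/x` (30), `a/z` (20) and `b/y` (10), the correct order is `b/y`, `a/z`, `a/x`. Sorting directory by directory during traversal cannot produce that order, since it would have to visit `b/` before `a/` without knowing the children's timestamps.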

Fixes #2243, Closes #2361
2023-07-09 10:14:03 -04:00
Edoardo Pirovano
6d95c130d5 cli: add --stop-on-nonmatch flag
This causes ripgrep to stop searching an individual file after it has
found a non-matching line. But this only occurs after it has found a
matching line.
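
The rule can be sketched like so, using a plain substring test in place of a regex. The function name is hypothetical and this is not the searcher's actual code:

```rust
/// Hypothetical sketch of the --stop-on-nonmatch rule: collect matching
/// lines, and stop at the first non-matching line seen after a match.
fn matching_lines<'a>(lines: &[&'a str], needle: &str) -> Vec<&'a str> {
    let mut out = Vec::new();
    for &line in lines {
        if line.contains(needle) {
            out.push(line);
        } else if !out.is_empty() {
            // A non-match after at least one match ends the search.
            break;
        }
        // Non-matching lines before the first match are skipped.
    }
    out
}
```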

Fixes #1790, Closes #1930
2023-07-08 18:52:42 -04:00
Garrett Thornburg
4782ebd5e0 core: lock stdout before printing an error message to stderr
Adds a new eprintln_locked macro which locks STDOUT before logging
to STDERR. This patch also replaces instances of eprintln with
eprintln_locked to avoid interleaving lines.
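
The idea, sketched as a function rather than a macro so that it is easy to test. `write_err_locked` is a hypothetical name and takes the error writer as a parameter, whereas the macro in this commit writes to stderr directly:

```rust
use std::io::{self, Write};

/// Hypothetical sketch: hold the stdout lock for the duration of a write
/// to the error stream, so that other threads printing to stdout cannot
/// interleave their output with the error message.
fn write_err_locked<W: Write>(err: &mut W, msg: &str) -> io::Result<()> {
    // The guard is released when it goes out of scope at function exit.
    let _stdout_guard = io::stdout().lock();
    writeln!(err, "{}", msg)
}
```

In real use the writer would be `io::stderr()`, as in `write_err_locked(&mut io::stderr(), "some error")`.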

Fixes #1941, Closes #1968
2023-07-08 18:52:42 -04:00
piegames
4993d29a16 globset: add 'escape' routine
Fixes #2060, Closes #2061
2023-07-08 18:52:42 -04:00
Seth Stadick
23adbd6795 cli: force binary existence check
Previously, we were only doing a binary existence check on Windows. And
in fact, the main point there wasn't binary existence, but ensuring we
didn't accidentally resolve a binary name relative to the CWD, which
could result in executing a program one didn't mean to run.

However, it is useful to be able to check whether a binary exists on any
platform when associating a glob with a binary. If the binary doesn't
exist, then the association can fail eagerly and let some other glob
apply.

Closes #1946
2023-07-08 18:52:42 -04:00
Kevin Svetlitski
9df8ab42b1 cargo: reduce the size of the .crate file published to crates.io
None of this stuff is needed for the main ripgrep crate.

Closes #1940
2023-07-08 18:52:42 -04:00
Michal Terepeta
cb7501ff11 doc: clarify the comment on Worker.work_done
We call `work_done` only once the work has been actually performed
(otherwise `num_pending` could go to 0 before the actual work is done).

Closes #2039
2023-07-08 18:52:42 -04:00
Kyle Todeschini
3b66f37a31 doc: improve -r/--replace flag syntax docs
Fixes #2108, Closes #2123
2023-07-08 18:52:42 -04:00
Andrew Gallant
3eccb7c363 readme: add 'yum-utils' to RHEL/Centos instructions
Closes #2103
2023-07-08 18:52:42 -04:00
kotborealis
f30a30867e ignore/types: name aliases for file types
We also make py/python, md/markdown and ts/typescript aliases of one
another.

Note that this only introduces aliases at the point where default types
are defined. This just makes them a bit easier to read/write, and also
makes it easier to expose more names that describe the same thing.

Fixes #1857, Closes #1895
2023-07-08 18:52:42 -04:00
Klas Mellbourn
7313dca472 ignore/types: add 'typescript' alias for 'ts'
Closes #2009
2023-07-08 18:52:42 -04:00
Tama McGlinn
99bf2b01dc ignore/types: add Ada filetypes, including gprbuild and alire
*.adb and *.ads are the usual extensions for Ada source code,
and *.gpr indicates a GPRbuild project file used for Ada, and
these days often being combined with alire for package dependency
resolution. Alire stores a bunch of files named alire.toml in
different directories in your (gitignored) cache/dependencies/...

Closes #2013
2023-07-08 18:52:42 -04:00
Juan Francisco Cantero Hurtado
ee1360cc07 ignore/types: add raku extensions to ignore types
Closes #2117
2023-07-08 18:52:42 -04:00
Andrew Gallant
db6bb21a62 windows: attempt to enable long path support for MSVC targets
See the README and comments in the build.rs. Basically, this embeds an
XML file that I guess is a way of setting configuration knobs on
Windows. One of those knobs is enabling long path support. You still
need to enable it in your registry (lol), but this will handle the other
half of it.

Fixes #364, Closes #2049
2023-07-08 18:52:42 -04:00
Andrew Gallant
da7c81fb96 ignore/types: add MDX format to Markdown types
Ref https://mdxjs.com/

Closes #2142
2023-07-08 18:52:42 -04:00
chrispy
a4e3d56de1 ignore/types: add DITA (Darwin Information Typing Architecture)
Closes #2148
2023-07-08 18:52:42 -04:00
Ludi Rehak
7c83b90f95 doc: fix typo
Closes #2153
2023-07-08 18:52:42 -04:00
cuishuang
97b5b7769c doc: fix some typos
Closes #2195
2023-07-08 18:52:42 -04:00
dana
2708f9e81d complete: add extra-verbose support to _rg_types
When the extra-verbose style is set for the types tag, completed types
are displayed along with the patterns they correspond to. This can be
enabled by e.g. adding the following to .zshrc:

  zstyle ':completion:*:rg:*:types' extra-verbose true

This change also makes _rg_types use the actual rg specified on the
command line to look up types, and it fixes a mangled complete-all
style check.

Fixes #2195
2023-07-08 18:52:42 -04:00
Richard Sternagel
f3241fd657 cli: '--no-ignore-dot' should also ignore '.rgignore'
Fixes #2198, Closes #2202
2023-07-08 18:52:42 -04:00
Andrew Gallant
cfe357188d ignore/types: fix formatting 2023-07-08 18:52:42 -04:00
edam
792451e331 ignore/types: added V type
V (http://vlang.io) uses '.v' files.

Closes #2302
2023-07-08 18:52:42 -04:00
Andrew Gallant
7dafd58a32 readme: use 'sudo' more consistently
I definitely wonder whether I should just drop 'sudo' from the install
instructions and just rely on the user to "know" to do it. But some
commands legitimately do not require 'sudo', so there are actual
differences. Overall, this feels clearer to me but reasonable people can
disagree.
2023-07-08 18:52:42 -04:00
Andrew Savchenko
b92550b67b readme: add install command for ALT Linux
Closes #2330
2023-07-08 18:52:42 -04:00
Kevin Ushey
383d3b336b doc: add '--hidden' to example configuration
This increases visibility of the fact that hidden files are skipped by
default.

Closes #2356
2023-07-08 18:52:42 -04:00
James McKinney
fc7e634395 ci/release: Use GITHUB_REF_NAME instead of GITHUB_REF
This is a nice quality of life improvement.

Closes #2358
2023-07-08 18:52:42 -04:00
James McKinney
c9584b035b ci/release: use GitHub CLI
The old actions I was using are apparently archived because they make
use of deprecated features (like `set-output`). Sigh.

Closes #2360
2023-07-08 18:52:42 -04:00
Alex Rawson
f34fd5c4b6 globset: introduce option to keep empty alternates
Add a method GlobBuilder::empty_alternates and supporting mechanisms.

Ref #1368
Closes #2369
2023-07-08 18:52:42 -04:00
Jérome Eertmans
d51c6c005a globset: permit deserializing Glob from String
Closes #2386, Closes #2388
2023-07-08 18:52:42 -04:00
Jakub Wilk
ea05881319 readme: fix awkward grammar
Closes #2402
2023-07-08 18:52:42 -04:00
sitiom
1d4e3df19c readme: add winget installation section
Closes #2409
2023-07-08 18:52:42 -04:00
Mark Sisson
0f6181d309 ignore/types: add USD to the default file types
Closes #2432
2023-07-08 18:52:42 -04:00
Sam James
e902e2fef4 ignore/types: add Gentoo eclass type
Eclasses are "ebuild libraries" and generally if you're filtering
for/filtering out an ebuild/eclass, you don't want the other either.

Followup to 4dfea016b9

Closes #2437
2023-07-08 18:52:42 -04:00
angrycandy
07cbfee225 ignore/types: improve Elixir globs
Closes #2450
2023-07-08 18:52:42 -04:00
Andrew Gallant
d675844510 core: don't let context flags override each other
This matches the behavior of GNU grep which does not ignore
before-context and after-context completely if the context flag is also
provided.

Note that this change wasn't done just to match GNU grep. In this case,
GNU grep has the more sensible behavior.
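
One reading of the resulting behavior, sketched under the assumption that -A and -B each take precedence for their own side and -C fills in whichever side was not given explicitly. The function and parameter names are hypothetical:

```rust
/// Hypothetical sketch: resolve the effective (before, after) context
/// line counts from the optional values of -C, -B and -A.
fn resolve_context(
    context: Option<usize>,
    before: Option<usize>,
    after: Option<usize>,
) -> (usize, usize) {
    // -C supplies the default for both sides; 0 means no context lines.
    let both = context.unwrap_or(0);
    (before.unwrap_or(both), after.unwrap_or(both))
}
```

Under this reading, `-C3 -A1` yields three lines before and one line after, instead of one flag wiping out the other entirely.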

Fixes #2288, Closes #2451
2023-07-08 18:52:42 -04:00
Andrew Gallant
54e609d657 doc: add another example for the config file
Closes #2453
2023-07-08 18:52:42 -04:00
Misaki
43bbcca06f doc: note '-n' and '-N' override each other
Closes #2460
2023-07-08 18:52:42 -04:00
Eric Arellano
ad9bfdd981 ignore/gitignore: expose gitconfig_excludes_path
I have reservations about this, but it looks useful and doesn't seem
terribly onerous to support. The `ignore` crate will really always need
to have some kind of logic supporting this in some form I think.

Closes #2482
2023-07-08 18:52:42 -04:00
Gal Ofri
36194c2742 test: test that regex inline flags work as intended
This was originally fixed by using non-capturing groups when joining
patterns in crates/core/args.rs, but before that landed, it ended up
getting fixed via a refactor in the course of migrating to regex 1.9.
Namely, it's now fixed by pushing pattern joining down into the regex
layer, so that patterns can be joined in the most effective way
possible.

Still, #2488 contains a useful test, so we bring that in here. The
test actually failed for `rg -e ')('`, since it expected the command to
fail with a syntax error. But my refactor actually causes this command
to succeed. And indeed, #2488 worked around this by special casing a
single pattern. That work-around fixes it for the single pattern case,
but doesn't fix it for the -w or -X or multi-pattern case. So for now,
we're content to leave well enough alone. The only real way to fix this
for real is to parse each regexp individually and verify that each is
valid on its own. It's not clear that doing so is worth it.

Fixes #2480, Closes #2488
2023-07-08 18:52:42 -04:00
Jakub Jirutka
0c1cbd99f3 ignore: tweak regex crate features
This removes most of the Unicode features as they aren't currently
used. We can always add them back later if necessary.

We can avoid the unicode-perl feature by changing `\s` to `[[:space:]]`,
which uses the ASCII-only definition of `\s`. Since we don't expect
non-ASCII whitespace in git config files, this seems okay.

Closes #2502
2023-07-08 18:52:42 -04:00
Jon Parise
96cfc0ed13 ignore/types: add 'graphql' type
GraphQL file extensions: .graphql and .graphqls (schema)

We could also add `.gql`, but perhaps it's less correct to do so. We'll
start conservatively here, and we can always add `.gql` later.

Closes #2439, Closes #2508
2023-07-08 18:52:42 -04:00
mataha
da8ecddce9 cli: make resolve_binary take COM executables into account
When `resolve_binary()` attempts to resolve a path to a program on
Windows while searching for a program in `PATH` without an extension,
`ripgrep` will assume the extension of the file to be `.exe` as it's
the *de facto* standard, which will work most (99.99%) of the time...

...unless the binary is a COM executable (we're on Windows, duh).
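
The lookup described above can be sketched roughly as follows. This is not the code from `grep-cli`; the helper `candidates` and the probing order (`.com` before `.exe`) are assumptions made for illustration.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: lists the paths worth probing for `prog` inside
/// `dir`. A name that already has an extension is used as given; otherwise
/// the bare name is tried first, then `.com`, then `.exe` (assumed order).
fn candidates(dir: &str, prog: &str) -> Vec<PathBuf> {
    let path = Path::new(dir).join(prog);
    if path.extension().is_some() {
        return vec![path];
    }
    let mut out = vec![path.clone()];
    for ext in ["com", "exe"] {
        out.push(path.with_extension(ext));
    }
    out
}
```

A caller would walk each directory in `PATH`, call `candidates`, and return the first path that exists and is a file.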

Closes #2523
2023-07-08 18:52:42 -04:00
Yifei Teng
545a7dc759 ignore/types: add cml to the default types list
It's used in Fuchsia to mean "component manifest language."[1]

[1]: https://fuchsia.dev/reference/cml?hl=en

Closes #2529
2023-07-08 18:52:42 -04:00
Jonathan Schwender
16f783832e doc: update rust-version in Cargo.toml
The MSRV got bumped a little bit ago, so this is just catchup.

Closes #2539
2023-07-08 18:52:42 -04:00
Andrew Gallant
f4d07b9cbd grep-cli-0.1.8 2023-07-05 17:09:09 -04:00
Andrew Gallant
0b6eccf4d3 ci: try to fix CI 2023-07-05 14:04:29 -04:00
Andrew Gallant
3ac4541e9f regex: remove old inner literal extractor
(It had already been removed from the crate.)
2023-07-05 14:04:29 -04:00
Andrew Gallant
7b72e982f2 deps: update everything 2023-07-05 14:04:29 -04:00
Andrew Gallant
a68db3ac02 deps: drop temporary patch and move to bstr 1.6
Now that regex 1.9 is out, we can depend on it from crates.io.
2023-07-05 14:04:29 -04:00
Andrew Gallant
b12905daca deps: update everything 2023-07-05 14:04:29 -04:00
Andrew Gallant
ca740d9ace regex: add new inner literal extractor
This is mostly a copy of the prefix literal extractor in regex-syntax,
but with a tweaked notion of Seq that keeps track of whether it's a
prefix of an expression or not. If it isn't, then we can't cross it as a
suffix to another Seq.

This new extractor should be a lot more robust than the old one. We
actually will keep going through the regex to try and find the "best"
literals to search for (according to some heuristic).
2023-07-05 14:04:29 -04:00
Andrew Gallant
e80c102dee regex: tweak formatting of regex-automata version spec
This makes it easier to enable the `logging` feature for regex-automata.

I wish I could just enable it unconditionally, but it winds up producing
a lot of output because ripgrep uses regexes for things other than the
primary search (like every glob). Sigh.
2023-07-05 14:04:29 -04:00
Andrew Gallant
8ac66a9e04 regex: refactor matcher construction
This does a little bit of refactoring so that we can pass both a
ConfiguredHIR and a Regex to the inner literal extraction routine.

One downside of this approach is that a regex object hangs on to a
ConfiguredHIR. But the extra memory usage is probably negligible. A
benefit though is that converting the HIR to its concrete syntax is now
lazy and only happens when logging is enabled.
2023-07-05 14:04:29 -04:00
Andrew Gallant
04dde9a4eb regex: tweak DFA settings
This increases the limits a bit for when the regex engine will build and
use a fully compiled DFA. They can be faster in some circumstances. For
example, '(?-u)^\w{30,}$' gets a nice speed boost from state
acceleration.

We are also able to remove `regex` proper as a dependency. Wow.
2023-07-05 14:04:29 -04:00
Andrew Gallant
81341702af regex: push more pattern handling to matcher construction
Previously, ripgrep core was responsible for escaping regex patterns and
implementing the --line-regexp flag. This commit moves that
responsibility down into the matchers such that ripgrep just needs to
hand the patterns it gets off to the matcher builder. The builder will
then take care of escaping and all that.

This was done to make pattern construction completely owned by the
matcher builders. With the arrival of regex-automata, this means we can
move to the HIR very quickly and then never move back to the concrete
syntax. We can then build our regex directly from the HIR. This overall
can save quite a bit of time, especially when searching for large
dictionaries.

We still aren't quite as fast as GNU grep when searching something on
the scale of /usr/share/dict/words, but we are basically within spitting
distance. Prior to this, we were about an order of magnitude slower.

This architecture in particular lets us write a pretty simple fast path
that avoids AST parsing and HIR translation entirely: the case where one
is just searching for a literal. In that case, we can hand construct the
HIR directly.
2023-07-05 14:04:29 -04:00
Andrew Gallant
d34c5c88a7 globset: fix build error in tests
I guess we haven't been testing with the Serde feature enabled? Weird.
2023-07-05 14:04:29 -04:00
Andrew Gallant
4b8aa91ae5 deps: update to pcre2 0.2.4
0.2.4 updates to PCRE2 10.42 and has a few other nice changes. For
example, when `utf` is enabled, the crate will always set the
PCRE2_MATCH_INVALID_UTF option. That means we no longer need to do
transcoding or UTF-8 validity checks.

Because of this, we actually get to remove one of the two uses of
`unsafe` in ripgrep's `main` program.

(This also updates a couple other dependencies for convenience.)
2023-07-05 14:04:29 -04:00
Andrew Gallant
a775b493fd regex: small cleanups
Just some small polishing. We also get rid of thread_local in favor of
using regex-automata, mostly just in the name of reducing dependencies.
(We should eventually be able to drop thread_local completely.)
2023-07-05 14:04:29 -04:00
Andrew Gallant
a6dbff502f regex: s/locations/captures
Now that we use regex-automata, we no longer use any type with
"locations" in it. Instead, that's mostly legacy from the top-level
regex crate.
2023-07-05 14:04:29 -04:00
Andrew Gallant
51480d57a6 regex: simplify AST analysis a bit
The verbatim literal stuff hasn't been used for a while and I don't
foresee it being used. If it's really needed, it would probably be better
to just implement it by looking at the pattern string itself, which
avoids parsing it into an AST altogether.
2023-07-05 14:04:29 -04:00
Andrew Gallant
d9bd261be8 regex: some small cleanup in 'strip.rs'
We also utilize bstr's methods to get rid of some helpers we had written
by hand.
2023-07-05 14:04:29 -04:00
Andrew Gallant
9d62eb997a BREAKING: regex: finally remove CRLF hack
Now that Rust's regex crate finally supports a CRLF mode, we can remove
this giant hack in ripgrep to enable it. (And assuredly did not work in
all cases.)

The way this works in the regex engine is actually subtly different than
what ripgrep previously did. Namely, --crlf would previously treat
either \r\n or \n as a line terminator. But now it treats \r\n, \n and
\r as line terminators. In effect, it is implemented by treating \r and
\n as line terminators, but ^ and $ will never match at a position
between a \r and a \n.
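
The rule in the paragraph above can be written down as two small predicates. This is a sketch of the described semantics only, not the regex engine's implementation; the function names are assumptions, and `at` must not exceed `haystack.len()`.

```rust
/// `^` in CRLF mode: matches at the start of the haystack, after a `\n`,
/// or after a `\r` that is not immediately followed by `\n`.
fn is_line_start(haystack: &[u8], at: usize) -> bool {
    at == 0
        || haystack[at - 1] == b'\n'
        || (haystack[at - 1] == b'\r'
            && (at >= haystack.len() || haystack[at] != b'\n'))
}

/// `$` in CRLF mode: matches at the end of the haystack, before a `\r`,
/// or before a `\n` that is not immediately preceded by `\r`.
fn is_line_end(haystack: &[u8], at: usize) -> bool {
    at == haystack.len()
        || haystack[at] == b'\r'
        || (haystack[at] == b'\n' && (at == 0 || haystack[at - 1] != b'\r'))
}
```

For the haystack `a\r\nb\rc\nd`, `is_line_end` holds before the `\r\n` pair, before the lone `\r`, before the lone `\n` and at the very end, while neither predicate holds at the position between `\r` and `\n`.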

So basically this means that $ will end up matching in more cases than
it might be intended to, but I don't expect this to be a big problem in
practice.

Note that passing --crlf to ripgrep and enabling CRLF mode in the regex
via the `R` inline flag (e.g., `(?R:$)`) are subtly different. The `R`
flag just controls the regex engine, but --crlf instructs all of ripgrep
to use \r\n as a line terminator. There are likely some inconsistencies
or corner cases that are wrong as a result of this cognitive dissonance,
but we choose to leave well enough alone for now.

Fixing this for real will probably require re-thinking how line
terminators are handled in ripgrep. For example, one "problem" with how
they're handled now is that ripgrep will re-insert its own line
terminators when printing output instead of copying the input. This is
maybe not so great and perhaps unexpected. (ripgrep probably can't get
away with not inserting any line terminators. Users probably expect
files that don't end with a line terminator whose last line matches to
have a line terminator inserted.)
2023-07-05 14:04:29 -04:00
Andrew Gallant
e028ea3792 regex: migrate grep-regex to regex-automata
We just do a "basic" dumb migration. We don't try to improve anything
here.
2023-07-05 14:04:29 -04:00
Andrew Gallant
1035f6b1ff deps: initial migration steps to regex 1.9
This leaves the grep-regex crate in tatters. Pretty much the entire
thing needs to be re-worked. The upshot is that it should result in some
big simplifications. I hope.

The idea here is to drop down and actually use regex-automata 0.3
instead of the regex crate itself.
2023-07-05 14:04:29 -04:00
Andrew Gallant
a7f1276021 readme: update Debian instructions
We probably don't need to mention Buster specifically nor Debian
unstable since ripgrep has been in Debian for a while now.

But we can't just get rid of the `deb` file either, because Debian might
package a very old version.

Fixes #2531
2023-06-12 07:50:13 -04:00
Martin Nordholts
4fcb1b2202 cli: replace atty with std::io::IsTerminal
The `atty` crate is unmaintained[1] and `std::io::IsTerminal` was
stabilized in Rust 1.70.
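
For readers unfamiliar with the replacement API, a minimal sketch follows. `std::io::IsTerminal` and its `is_terminal` method are the real standard library items; the `output_mode` policy and its return values are purely illustrative assumptions.

```rust
use std::io::{self, IsTerminal};

/// Illustrative policy only: picks a label based on whether output goes to
/// a terminal. Kept separate so it can be exercised without a real tty.
fn output_mode(is_tty: bool) -> &'static str {
    if is_tty {
        "interactive"
    } else {
        "piped"
    }
}

/// Asks the standard library whether stdout is attached to a terminal,
/// which is the check the `atty` crate used to provide.
fn stdout_mode() -> &'static str {
    output_mode(io::stdout().is_terminal())
}
```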

[1]: https://rustsec.org/advisories/RUSTSEC-2021-0145.html

PR #2526
2023-06-05 14:00:46 -04:00
Francois Marier
949092fd22 ignore/types: add 'mdwn' to Markdown
PR #2520
2023-05-26 14:44:41 -04:00
Andrew Gallant
4a7e7094ad deps: update everything else 2023-05-25 13:06:13 -04:00
Andrew Gallant
fc0d9b90a9 deps: bump regex to 1.8.3
This brings in an update from the regex crate that fixes a matching bug
for particular kinds of alternations of literals.

Fixes #2518
2023-05-25 13:06:13 -04:00
Ville Skyttä
335aa4937a ignore/types: add *.pyi for Python
https://peps.python.org/pep-0484/#stub-files

PR #2517
2023-05-23 07:10:02 -04:00
Adam Reichold
803c447845 searcher: re-enable mmap on 32-bit architectures
memmap2 v0.3.0 introduced a regression when trying to map files larger than 4GB
on 32-bit architectures[1] which was subsequently fixed in v0.3.1[2].

This commit bumps locked version of the memmap2 dependency to the current v0.5.0
and reverts fdfc418be5 to re-enable mmap on 32-bit
architectures as a different approach to fixing [3].

This was tested to report matches from the end of a 5GB file using MinGW and Wine.

Ref #1911, PR #2000 

[1] 5e271224c8
[2] 9aa838aed9
[3] https://github.com/BurntSushi/ripgrep/issues/1911
2023-05-19 08:23:53 -04:00
Andrew Gallant
c5415adbe8 deps: update everything
This does unfortunately bring in both regex-syntax 0.6 and 0.7, but
we'll fix that once regex 1.9 is out.
2023-05-16 13:14:23 -04:00
Andrew Gallant
251376597f deps: update minimum version of grep crate
Ref #2516
2023-05-16 13:13:34 -04:00
Andrew Gallant
e593f5b7ee grep-0.2.12 2023-05-16 13:12:45 -04:00
Andrew Gallant
6b19be2477 crates/grep: remove 'deny(missing_docs)'
This crate is only a shim over a bunch of other crates. I'm not sure
that there's anything to add to each of the `pub extern` items. So
instead of just writing fluff, I removed the lint.

Fixes #2516
2023-05-16 13:10:42 -04:00
Ryan Whitehouse
041544853c doc: fix --quiet docs
The wording was previously inverted, which had the opposite
meaning as was intended.

Fixes #1962
2023-03-28 07:22:59 -04:00
Manu
a7ae9e4043 ignore/types: add support for docker-compose files
Default file is docker-compose.yml and the documentation
mentions overrides in the form of docker-compose.*.yml.

PR #2469
2023-03-21 12:56:38 -04:00
Andrew Gallant
595e7845b8 readme: add a link to delta's support for ripgrep
Ref: https://github.com/BurntSushi/ripgrep/issues/86#issuecomment-1469717706
2023-03-15 08:02:04 -04:00
David Ringo
44fb9fce2c ignore/types: add *.sln for msbuild
.sln is the extension for Visual Studio Project Solution files, one of
the file types accepted as inputs by MSBuild.

PR #2415
2023-02-09 21:20:49 -05:00
Vincent Bockaert
339c46a6ed ignore/types: enhance terraform default filter
The default filter for terraform only checks for *.tf files, but there
are quite a few other terraform filetypes.

The explanation for all of them can be found below (including link to
documentation from Hashicorp at time of writing)

- *.tf.json & *.tfvars.json is to capture the files written in
  JSON-based variant of the Terraform language
    - https://developer.hashicorp.com/terraform/language/files
- *.tfvars is used to supply variables
    - https://developer.hashicorp.com/terraform/cloud-docs/workspaces/variables#6-auto-tfvars-variable-files
- .terraform.lock.hcl is used as a Dependency lock file
    - https://developer.hashicorp.com/terraform/language/files/dependency-lock
- terraform.rc & .terraformrc, *.tfrc
    - https://developer.hashicorp.com/terraform/cli/config/config-file

PR #2412
2023-02-09 12:57:01 -05:00
Andrew Gallant
fe97c0a152 ignore-0.4.20 2023-01-15 08:21:02 -05:00
Christian Vallentin
826f3fad5b ignore/api: add Clone and Debug impls for OverrideBuilder
PR #2397
2023-01-15 08:16:27 -05:00
Andrew Gallant
bc55049327 readme: update MSRV in README
... this was apparently long outdated, wow.
2023-01-05 12:09:46 -05:00
Andrew Gallant
d58e9353fc deps: update to grep 0.2.11 2023-01-05 09:13:47 -05:00
Andrew Gallant
ca60fef4db grep-0.2.11 2023-01-05 09:12:49 -05:00
Andrew Gallant
a25307d6c8 deps: update to grep-printer 0.1.7 2023-01-05 09:12:37 -05:00
Andrew Gallant
b80947a8b3 grep-printer-0.1.7 2023-01-05 09:11:16 -05:00
Andrew Gallant
ad793a0d8f deps: update to grep-searcher 0.1.11 2023-01-05 09:07:49 -05:00
Andrew Gallant
120e55e7c7 grep-searcher-0.1.11 2023-01-05 09:07:09 -05:00
Andrew Gallant
3941a7701d deps: update to grep-pcre2 0.1.6 2023-01-05 09:06:52 -05:00
Andrew Gallant
96e130fbf9 grep-pcre2-0.1.6 2023-01-05 09:05:59 -05:00
Andrew Gallant
180c4eaf8b deps: update to grep-regex 0.1.11 2023-01-05 09:05:39 -05:00
Andrew Gallant
81529288cf grep-regex-0.1.11 2023-01-05 09:02:55 -05:00
Andrew Gallant
bcc7473a87 deps: update to grep-matcher 0.1.6 2023-01-05 09:02:40 -05:00
Andrew Gallant
bc78c644db grep-matcher-0.1.6 2023-01-05 09:00:33 -05:00
Andrew Gallant
dc7267a0fb deps: update to grep-cli 0.1.7 2023-01-05 08:58:47 -05:00
Andrew Gallant
3224324e25 grep-cli-0.1.7 2023-01-05 08:57:31 -05:00
Andrew Gallant
0f61f08eb1 deps: update to ignore 0.4.19 2023-01-05 08:57:05 -05:00
Andrew Gallant
a0e8dbe9df ignore-0.4.19 2023-01-05 08:55:46 -05:00
Andrew Gallant
e95254a86f deps: remove ignore's dependency on crossbeam-utils
Scoped threads are now part of std.
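
As a reminder of what the std replacement looks like, here is a small sketch using `std::thread::scope`. The `parallel_sum` function is a made-up example and has nothing to do with how the `ignore` crate's walker uses scoped threads.

```rust
use std::thread;

/// Sums `values` by handing each chunk to its own scoped thread. The threads
/// may borrow `values` directly because `thread::scope` joins them all
/// before it returns.
fn parallel_sum(values: &[u64], chunk_size: usize) -> u64 {
    // `chunks` panics on a size of zero, so clamp it.
    let chunk_size = chunk_size.max(1);
    thread::scope(|s| {
        // Spawn every worker first, then join them in order.
        let handles: Vec<_> = values
            .chunks(chunk_size)
            .map(|chunk| s.spawn(move || chunk.iter().sum::<u64>()))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().unwrap())
            .sum::<u64>()
    })
}
```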
2023-01-05 08:51:08 -05:00
Andrew Gallant
2f484d8ce5 deps: update to globset 0.4.10 2023-01-05 08:49:58 -05:00
Andrew Gallant
364772ddd2 globset-0.4.10 2023-01-05 08:45:47 -05:00
Andrew Gallant
2e207833bc deps: upgrade to jemallocator 0.5 2023-01-05 08:33:43 -05:00
Andrew Gallant
92b35a65f8 deps: upgrade to base64 0.20 2023-01-05 08:21:49 -05:00
Andrew Gallant
ac8fecbbf2 deps: upgrade bstr to 1.1 2023-01-05 08:21:15 -05:00
Andrew Gallant
8596817374 deps: do semver compatible upgrades 2023-01-05 08:16:32 -05:00
Andrew Gallant
28bff84a0a deps: remove 'num_cpus'
Now that std::thread::available_parallelism is a thing, we no longer
need num_cpus.
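
A minimal sketch of the std call that takes over from `num_cpus`. The wrapper name `default_threads` and the fallback of one thread are assumptions; ripgrep's own thread-count heuristics are not shown.

```rust
use std::thread;

/// Returns the parallelism the standard library reports, falling back to a
/// single thread if the value cannot be determined.
fn default_threads() -> usize {
    thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1)
}
```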
2023-01-05 08:15:09 -05:00
Alex Touchet
61101289fa cargo: set rust-version
This should hopefully make compilation errors from using
an older-than-supported compiler more helpful.

PR #2373
2022-12-21 07:37:09 -05:00
Andrew Gallant
13faa39b66 deps: update all dependencies within semver
Note that this adds a new dependency, 'unicode-ident', and removes
'unicode-xid'. I looked briefly at 'unicode-ident' and all looks okay.
It is also permissively licensed.
2022-12-20 09:23:29 -05:00
Andrew Gallant
6b61271bbb benchsuite/runs: add another run of the benchmarks
Looks like ripgrep is still the king. ;-)
2022-12-16 11:24:10 -05:00
Andrew Gallant
1be86392e0 benchsuite: pass '-a' to ugrep in some cases
It looks like it incorrectly treats a file that is purely valid UTF-8 as
a binary file, which in turn effectively renders all of the Russian
subtitle benchmarks moot for ugrep. So we pass '-a' to force ugrep to
treat the file as text.

This technically gives ugrep an edge because it now no longer needs to
look to see if the haystack is binary or not. In practice this is
usually implemented using highly optimized SIMD routines (e.g.,
'memchr'), so it tends not to matter much. We might also consider
passing '-a' to all grep commands. But... I think using '-a' is the less
common case and we should try to benchmark the common case.
2022-12-16 11:21:58 -05:00
Andrew Gallant
63058453fa benchsuite: update URLs
This removes the old commented out URLs for the 2016 subtitles that
don't work any more. I should probably upload the files to a more stable
URL.

This also switches to a 'https://' GitHub URL as I believe the 'git://'
URLs are no longer supported.
2022-12-16 11:20:45 -05:00
Armin Brauns
7f23cd63a5 ignore/types: add automated test for sortedness
People occasionally get this wrong and I've been manually
checking it. Instead, let's have CI do it automatically.
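
A check of this kind can be sketched in a few lines. The helper below is hypothetical and operates on a plain list of type names; the actual test in the `ignore` crate may be structured differently.

```rust
/// Returns the first adjacent pair of names that is out of order, if any.
/// Equal neighbours are accepted; only a strictly descending pair counts.
fn first_unsorted<'a>(names: &[&'a str]) -> Option<(&'a str, &'a str)> {
    names
        .windows(2)
        .find(|w| w[0] > w[1])
        .map(|w| (w[0], w[1]))
}
```

A unit test would then assert that `first_unsorted` returns `None` for the default types list, and the returned pair makes the failure message point at the offending entries.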

PR #2351
2022-11-14 08:31:07 -05:00
Andrew Gallant
8905d54a9f msrv: bump to Rust 1.65.0
This matches the latest stable release of Rust and lets us use nice
things like 'let else'.
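
For context, a small sketch of the 'let else' form. The function `parse_type_def` is a made-up example that splits a `name:glob` string; it is not taken from ripgrep.

```rust
/// Splits a `name:glob` definition, bailing out early when the input does
/// not have the expected shape.
fn parse_type_def(def: &str) -> Option<(&str, &str)> {
    // `let else` binds on success and forces a diverging branch otherwise.
    let Some((name, glob)) = def.split_once(':') else {
        return None;
    };
    if name.is_empty() || glob.is_empty() {
        return None;
    }
    Some((name, glob))
}
```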
2022-11-14 07:56:17 -05:00
Armin Brauns
25a4eaf5ae ignore/types: add devicetree filetype
See: https://www.devicetree.org/

PR #2349
2022-11-14 07:42:57 -05:00
jgart
0000157917 readme: add guix installation instructions
PR #2344
2022-11-02 08:10:54 -04:00
jgart
65b1b0e38a ignore/types: add carp
See: https://github.com/carp-lang/Carp

PR #2343
2022-11-01 07:17:00 -04:00
Glenn Slotte
c032cda4b7 ignore/types: add ReScript and ReasonML
PR #2340
2022-10-29 13:49:19 -04:00
Marcin Nowak-Liebiediew
eab044d829 ignore/types: add motoko and candid
See: https://github.com/dfinity/candid
See: https://github.com/dfinity/motoko

PR #2335
2022-10-20 09:22:41 -04:00
Andrew Gallant
55e62a4411 readme: add more links to overview
Many of the features are documented in the GUIDE, so let's just link to
them.
2022-10-19 11:06:44 -04:00
Andrew Gallant
5b2f614aad readme: add note about 'rg -uuu'
I'm not sure about putting this in such a prominent spot, and it does
bloat the introductory paragraph a bit, but it seems like an important
special case.
2022-10-19 09:52:37 -04:00
dependabot[bot]
4386b8e805 ci: bump actions/checkout from 2 to 3 (#2318)
Bumps [actions/checkout](https://github.com/actions/checkout) from 2 to 3.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v2...v3)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-09-29 08:18:47 -04:00
dependabot[bot]
6b012d8129 ci: bump actions/upload-release-asset from 1.0.1 to 1.0.2 (#2317)
Bumps [actions/upload-release-asset](https://github.com/actions/upload-release-asset) from 1.0.1 to 1.0.2.
- [Release notes](https://github.com/actions/upload-release-asset/releases)
- [Commits](https://github.com/actions/upload-release-asset/compare/v1.0.1...v1.0.2)

---
updated-dependencies:
- dependency-name: actions/upload-release-asset
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-09-29 08:15:36 -04:00
LingMan
a928ca4221 ci: enable Dependabot for the Actions workflows
Dependabot automatically files PRs for updatable dependencies. As
configured it watches all workflow files in `.github/workflows` for
possible updates to any of the Actions depended upon.

We specifically do not enable Dependabot for other things, in order to
avoid running in a hamster wheel.

Closes #2315
2022-09-29 07:44:30 -04:00
LingMan
d1570defbf ci: remove fetch-depth parameter from the checkout action
It is already set to 1 by default.

Closes #2316
2022-09-29 07:44:19 -04:00
LingMan
b732c23e36 ci: use cargo check's --check option directly 2022-09-29 07:44:13 -04:00
LingMan
49965703fa ci: switch to using '@master' dtolnay action
The `v1` tag exists but isn't really supported.

This mirrors [1]. See also [2].

[1]: 50086e74da
[2]: https://github.com/BurntSushi/bstr/pull/122#issuecomment-1201930916
2022-09-29 07:43:29 -04:00
LingMan
609838aebd ci: use latest runner images in CI
The `ubuntu-18.04` image is deprecated and will be removed by
2023-04-01[1][2] with scheduled brownouts starting on 2022-10-03.
Update all images to the latest available versions.

[1]: https://github.blog/changelog/2022-08-09-github-actions-the-ubuntu-18-04-actions-runner-image-is-being-deprecated-and-will-be-removed-by-12-1-22/
[2]: https://github.com/actions/runner-images/issues/6002
2022-09-29 07:43:10 -04:00
Dave Rolsky
515f120b5c doc: fix typo
PR #2313
2022-09-24 13:23:59 -04:00
Linda_pp
a66315d232 ignore/types: add *.cjs, *.mjs, *.cts, *.mts
These are used by both Node.js and TypeScript to indicate that a file
is CommonJS or ES.

Node.js: https://nodejs.org/api/esm.html

TypeScript: https://www.typescriptlang.org/docs/handbook/esm-node.html#new-file-extensions

PR #2297
2022-08-31 08:11:13 -04:00
Nacho Barrientos
bdf10ab7c0 ignore/types: add embedded puppet templates
.epp files are getting more and more common in Puppet code bases so it
makes sense I think to include them as part of the "puppet" type.

https://puppet.com/docs/puppet/7/lang_template_epp.html

PR #2141
2022-08-21 12:32:03 -04:00
John Saigle
a02678800b ignore/types: add Solidity
See: https://soliditylang.org/about/

PR #2284
2022-08-17 09:37:32 -04:00
Andrew Gallant
387df97d85 ripgrep: add /.github/ to whitelist
It's pretty common to want to search this, since it defines the CI
configuration of the project.
2022-08-17 08:31:22 -04:00
David Marzal
a9d97a1dda doc: add '-.' as short flag for '--hidden'
PR #2279
2022-08-10 08:03:04 -04:00
drebelsky
3bb71b0cb8 doc: fix a few typos
PR #2274
2022-08-06 14:29:27 -04:00
Malte
87b33c96c0 ignore/types: improve 'markdown' and 'php' types
This adds some lesser known extensions.

Notably, it adds php7 and php8, but not php6. Apparently,
php6 was never a thing: https://wiki.php.net/rfc/php6

PR #2263
2022-07-18 10:35:09 -04:00
Andrew Gallant
5e975c43f8 doc: appease rustdoc 2022-07-15 10:13:55 -04:00
Andrew Gallant
7efa2e46d3 grep-0.2.10 2022-07-15 10:06:53 -04:00
Andrew Gallant
db0b92b62d grep: bump grep-searcher to 0.1.10
This was a result of leaving a stray 'dbg!'.
2022-07-15 10:06:31 -04:00
Andrew Gallant
33b81cac48 grep-searcher-0.1.10 2022-07-15 10:05:46 -04:00
Andrew Gallant
6a13a4f64d searcher: remove stray 'dbg!' 2022-07-15 10:05:20 -04:00
Andrew Gallant
b13d835d95 grep-0.2.9 2022-07-15 10:03:06 -04:00
Andrew Gallant
d53506b7f7 grep: bump 'grep-regex' and 'grep-searcher'
To 0.1.10 and 0.1.9, respectively.
2022-07-15 10:02:41 -04:00
Andrew Gallant
78a35d4d43 grep-searcher-0.1.9 2022-07-15 10:02:24 -04:00
Andrew Gallant
a933d0bc90 searcher: bump grep-regex dep to 0.1.10 2022-07-15 10:02:06 -04:00
Andrew Gallant
2cae30e399 grep-regex-0.1.10 2022-07-15 10:01:42 -04:00
Andrew Gallant
8e57989cd2 regex: fix matching bug when text anchors are used
It turns out that if there are text anchors (that is, \A or \z, or ^/$
when multi-line is disabled), then the "fast" line searching path isn't
quite correct. Since searching without multi-line mode is exceptionally
rare, we just look for the presence of text anchors and specifically
disable the line terminator option in 'grep-regex'. This in turn
inhibits the "fast" line searching path.

Fixes #2260
2022-07-15 09:53:39 -04:00
Andrew Gallant
b9f5835534 ci: switch to dtolnay/rust-toolchain
The actions-rs/toolchain project appears dead. dtolnay's also seems more
sustainable given its simplicity, but it does enough to suit our needs.
2022-07-14 13:48:14 -04:00
tleb
e70778e89d ignore/types: add dts to default types
See: https://devicetree-specification.readthedocs.io/en/v0.3/source-language.html

PR #2255
2022-07-07 12:24:12 -04:00
zhimoe
87c4a2b4b1 doc: fix typo
PR #2248
2022-06-26 18:49:54 -04:00
Kian-Meng Ang
0aa31676e3 doc: fix typos
PR #2245
2022-06-24 09:58:20 -04:00
Andrew Gallant
9f0e88bcb1 ignore: fix gitignore parsing bug for trailing \/
When a glob pattern ended with a \/, and since we permit backslash
escapes, the glob parser gave a "dangling escape" error. Which is weird,
because the \ is clearly not dangling.

The issue is that the layer above the glob parser, the gitignore parser,
was stripping the trailing / so that it wouldn't be part of the matching
logic. Of course, stripping the trailing / while it is escaped without
removing the backslash escape is wrong. So we do that here.
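
The fix can be pictured with the following sketch. The helper `strip_dir_suffix` is hypothetical and deliberately simple: it does not handle an escaped backslash followed by a slash, and it is not the code from the gitignore parser.

```rust
/// Strips a trailing `/` that marks a directory-only pattern. If that slash
/// was escaped, the escaping backslash is removed as well so the glob parser
/// never sees a dangling `\`. Returns the glob and a directory-only flag.
fn strip_dir_suffix(pattern: &str) -> (String, bool) {
    let mut glob = pattern.to_string();
    if !glob.ends_with('/') {
        return (glob, false);
    }
    glob.pop();
    if glob.ends_with('\\') {
        glob.pop();
    }
    (glob, true)
}
```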

Fixes #2236
2022-06-14 10:40:37 -04:00
Alex Touchet
eb4b389846 globset/readme: update version number and some links
PR #2232
2022-06-11 14:17:32 -04:00
Andrew Gallant
dc337bab0a deps: update to globset 0.4.9 2022-06-10 14:11:20 -04:00
Andrew Gallant
2cfb338530 globset-0.4.9 2022-06-10 14:10:34 -04:00
Sergio Benitez
48646e3451 globset: make 'log' an optional feature
PR #1910
2022-06-10 14:10:09 -04:00
Andrew Gallant
985394a19e deps: update to packed_simd_2 0.3.8
It broke on latest nightly. I'm *very* close to just removing the
'simd-accel' feature altogether.

Fixes #2230
2022-06-10 09:39:17 -04:00
jgart
ec36f8c3ff ignore/types: add pants
See: https://www.pantsbuild.org/

PR #2228
2022-06-08 13:29:17 -04:00
jpe90
a726d03641 ignore/types: add hare to default types
PR #2219
2022-05-22 20:08:45 -04:00
Andrew Gallant
91afd4214a printer: fix duplicative replacement in multiline mode
This furthers our kludge of dealing with PCRE2's look-around in the
printer. Because of our bad abstraction boundaries, we added a kludge to
deal with PCRE2 look-around by extending the bytes we search by a fixed
amount to hopefully permit any look-around to operate. But because of
that kludge, we wind up over extending ourselves in some cases and
dragging along those extra bytes.

We had fixed this for simple searching by simply rejecting any matches
past the end point. But we didn't do the same for replacements. So this
commit extends our kludge to replacements.
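
The "reject matches past the end point" rule amounts to something like the sketch below. The function name and the `(start, end)` tuple representation are assumptions for illustration, and the matches are assumed to arrive sorted by start offset.

```rust
/// Keeps only the matches that begin before `end`. Matches found solely
/// because the searched bytes were extended for look-around are dropped.
fn matches_in_range(matches: &[(usize, usize)], end: usize) -> Vec<(usize, usize)> {
    matches
        .iter()
        .copied()
        // Stop at the first match that starts at or beyond the end point.
        .take_while(|&(start, _)| start < end)
        .collect()
}
```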

Thanks to @sonohgong for diagnosing the problem and proposing a fix. I
mostly went with their solution, but added the new replacement routine
as an internal helper rather than a new API in the 'grep-matcher'
crate.

Fixes #2095, Fixes #2208
2022-05-11 14:44:58 -04:00
Keith Smiley
4dc6c73c5a ignore/types: improve Bazel globs
MODULE.bazel is a new file, and WORKSPACE.bazel was always supported
similar to BUILD.bazel vs BUILD.

PR #2203
2022-05-09 11:50:34 -04:00
Alex Touchet
36d03b4101 cargo: use SPDX license format for all crates
This was done for the main crate in d11a3b3377.

See also #987.

PR #2204
2022-05-09 07:52:11 -04:00
Conrad Meyer
d161acb0a3 ignore/types: add '*.hh' to C++ headers
Like .hpp, .hh is an occasionally used extension for C++ headers
(to distinguish them from C headers). At least one popular project,
FreeBSD, uses this extension.

See also: https://docs.fileformat.com/programming/hh/

PR #2192
2022-04-25 07:38:03 -04:00
Matrix Dai
30ee6f08ee ignore/types: add '*.asp' for asp type
The `*.asp` was not included in the type "asp" when it was added.
https://github.com/BurntSushi/ripgrep/pull/1134

PR #2188
2022-04-19 10:36:14 -04:00
Andrew Gallant
ced5b92aa9 deps: bump memmap2 to 0.5
Looking at the memmap2 CHANGELOG, there don't appear to be any breaking
changes that impact us.
2022-03-21 08:59:05 -04:00
Andrew Gallant
191315a2ea deps: update everything
Surprisingly looks like no new dependencies were added! Yay! And we
removed an extra copy of 'cfg-if' due to what appears to be an update
in 'packed_simd_2'.

Otherwise, all updates appear to be minor things.
2022-03-21 08:59:05 -04:00
Andrew Gallant
5370064f00 warnings: remove/tweak some dead code
It looks like the dead code detector got better, so do a little code
cleanup.
2022-03-21 08:59:05 -04:00
arcsi42
b6189c659e ci: fix failing nightly-arm build on ci workflow
This commit updates the Ubuntu install script to include brotli and
zstd, which are needed for tests.

We also fix the Ubuntu install script to work in environments that
don't have 'sudo'. Instead of creating a totally separate script, we
preserve a single point of truth for these things and just make the
script a bit more flexible.

NOT seen in this commit is that we have built and updated the arm Docker
image. I'm hoping this fixes the GLIBC version issues we're seeing in
CI.

Fixes #2130, Closes #2132
2022-03-21 08:59:05 -04:00
Mateusz Konieczny
0b36942f68 doc: transcoding is done in addition to search
Even if transcoding were faster than search, it would still incur a
performance penalty. We make this clearer by tweaking the wording.

PR #2079
2021-11-22 09:48:42 -05:00
mi-wada
7e05cde008 cli: improve configuration failure mode
This improves the error message printed when ripgrep can't read the
file path pointed to by RIPGREP_CONFIG_PATH. Specifically, before this
change:

    $ RIPGREP_CONFIG_PATH=no_exist_path rg 'search regex'
    no_exist_path: No such file or directory (os error 2)

And now after this change:

    $ RIPGREP_CONFIG_PATH=no_exist_path rg 'search regex'
    failed to read the file specified in RIPGREP_CONFIG_PATH: no_exist_path: No such file or directory (os error 2)

In the above examples, the first failure mode looks obvious, but that's
only because RIPGREP_CONFIG_PATH is being set at the same time that we
run the command. Often, the environment variable is set elsewhere and
the error message could be confusing outside of that context.
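
The general technique, attaching context to an I/O error before reporting it, can be sketched as follows. The function `read_config` and its `Result<String, String>` signature are assumptions for illustration; only the message prefix is taken from the example output above.

```rust
use std::fs;

/// Reads the config file at `path`, prefixing any failure with the name of
/// the environment variable the path came from.
fn read_config(path: &str) -> Result<String, String> {
    fs::read_to_string(path).map_err(|err| {
        format!(
            "failed to read the file specified in RIPGREP_CONFIG_PATH: {}: {}",
            path, err
        )
    })
}
```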

Closes #1990
2021-11-15 10:29:34 -05:00
jgart
418d048b27 ignore/types: add fennel
https://fennel-lang.org/

PR #2069
2021-11-15 09:58:09 -05:00
Josh Triplett
009dda1488 ignore: if require_git is false, don't stat .git
I've confirmed via strace that this eliminates a pile of stat calls.

PR #2052
2021-11-12 08:37:05 -05:00
Linda_pp
ba535fb5a3 ignore/types: improve 'vim' and 'vimscript' types
This adds various Vim config files to the glob patterns.

PR #2044
2021-10-27 10:59:44 -04:00
jgart
427aaeeb2e ignore/types: add lilypond
This adds file detection for lilypond: https://lilypond.org/

PR #2038
2021-10-24 11:22:07 -04:00
jgart
f5cff746bc ignore/types: add hy
This adds file detection for hy: http://hylang.org/

PR #2033
2021-10-22 08:16:48 -04:00
Philip Munksgaard
457f53b7ee ignore/types: fix futhark type extension
Previously, the 'fut' type only matched files called '.fut', while in
reality we want to match all files with the '.fut' extension. This
commit fixes that issue.

PR #2027
2021-10-19 09:15:19 -04:00
jgart
eb35f7978e ignore/types: add janet
This adds file detection for janet:
https://janet-lang.org/

PR #2018
2021-10-14 07:56:55 -04:00
Markus Dosch
fc69bd366c readme: update install commands for Debian/Ubuntu
This got overlooked during the last release.

PR #2016
2021-10-12 11:08:14 -04:00
Dash
9b01a8f9ae doc: add -F/--fixed-strings to "common options"
#607 is the top result for the search "ripgrep disable regex". I think
it makes sense to add it to the user guide, since it's a very useful
flag.

PR #1945
2021-07-21 20:52:25 -04:00
Andrew Gallant
0ff5dd2360 doc: --field-match-separator's default value is ':'
The docs were out of sync with the implementation. Likely a
copy-and-paste error.

Fixes #1939
2021-07-19 08:07:40 -04:00
Joe Lencioni
3c7819301b doc: fix typo "used" -> "use"
PR #1936
2021-07-14 10:12:30 -04:00
jgart
699e651db2 ignore/types: add texinfo
https://www.gnu.org/software/texinfo/

PR #1934
2021-07-13 07:59:23 -04:00
Eyal
9eddb71b8e ignore/types: add CUDA
Fixes #1918
2021-06-30 09:50:53 -04:00
Andrew Gallant
abf115228e changelog: add #1911 bug fix 2021-06-26 12:57:11 -04:00
Andrew Gallant
fdfc418be5 searcher: disable mmap searching on non-64 bit
It looks like it's possible for mmap to succeed on 32-bit systems even
when the full file can't be addressed in memory. This used to work prior
to ripgrep 13, but (maybe) something about statically linking vcruntime
has caused this to now fail.

It's no big deal to disable mmap searching on 32-bit, so we just do that
instead of returning incorrect results.

Fixes #1911
2021-06-26 12:53:59 -04:00
Sergio Benitez
5bf74362b9 doc: fix typo in --glob flag docs
PR #1899
2021-06-24 08:09:00 -04:00
Kostya M
431ea38620 ignore/types: add file extensions for Crystal
It sounds like Projectfile is no longer being used,
but we should keep it around in case folks are
still using it. It's unlikely that its presence will
do much if any harm.

PR #1904
2021-06-20 08:24:41 -04:00
Andrew Gallant
caba5c4348 globset-0.4.8 2021-06-18 13:30:32 -04:00
Gleb Pomykalov
07f97d42cf globset: fix compilation when serde is enabled
PR #1903
2021-06-18 13:30:47 -04:00
kotborealis
e33d6e73f5 doc: fix formatting of nested list
Markdown wants 4 spaces, not 2.

PR #1894
2021-06-15 10:35:16 -04:00
Andrew Gallant
478da4f271 pkg: fix version number for 13.0.0 release
Fixes #1896
2021-06-15 10:30:01 -04:00
Andrew Gallant
7ce66f73cf regex: update regression test
Sadly, PCRE2 has different behavior (but doesn't panic). We should look
into that, but for now, this is good enough.

Also, update the CHANGELOG.

Ref #1891
2021-06-12 16:22:30 -04:00
Andrew Gallant
bc76a30c23 regex: fix -w when regex can match empty string
This is a weird bug where our optimization for handling -w more quickly
than we would otherwise failed. In particular, if the original regex can
match the empty string, then our word boundary detection would produce
invalid indices at which to start the next search. We "fix" it by simply
bailing when the indices are known to be incorrect.

This wasn't a problem in a previous release since ripgrep 13 tweaked how
word boundaries are detected in commit efd9cfb2.

Fixes #1891
2021-06-12 14:18:53 -04:00
Andrew Gallant
5e81c60b35 ci: use musl to build debian artifact
Previously, I was trying to be a good citizen and let ripgrep use the
system libc. But it turns out that building ripgrep on Arch with a newer
version of glibc than what is in Ubuntu results in the whole thing
breaking. Arguably, I should build the Debian artifact on an Ubuntu or
Debian machine of an appropriate version, but that's too much work. If
people really want that, then they can install some ancient version of
ripgrep from their Ubuntu/Debian repo.

Since we were already statically linking PCRE2, we go the whole nine
yards and statically link the entire thing.

Fixes #1890
2021-06-12 13:36:57 -04:00
Andrew Gallant
b3e5ae9d28 changelog: add template for next entry 2021-06-12 08:43:49 -04:00
Andrew Gallant
a024f14fdd pkg: update brew tap version to 13.0.0 2021-06-12 08:43:30 -04:00
Andrew Gallant
8c30c8294a release: work around GitHub Actions weirdness 2021-06-12 08:40:48 -04:00
Andrew Gallant
c44d263419 release: add note about pushing changes 2021-06-12 08:13:29 -04:00
Andrew Gallant
af6b6c543b 13.0.0 2021-06-12 08:12:24 -04:00
Andrew Gallant
1a4fec8b4a changelog: final prep before ripgrep 13 release 2021-06-12 08:11:51 -04:00
Andrew Gallant
c8d8ab8ded deps/grep: update minimal versions 2021-06-12 08:08:58 -04:00
Andrew Gallant
1d53ed2744 grep-0.2.8 2021-06-12 08:08:32 -04:00
Andrew Gallant
29696d1455 deps/printer: update minimal versions 2021-06-12 08:08:18 -04:00
Andrew Gallant
57ce623a57 grep-printer-0.1.6 2021-06-12 08:07:46 -04:00
Andrew Gallant
f1c656de40 deps/searcher: update minimal versions 2021-06-12 08:07:28 -04:00
Andrew Gallant
dd47582619 grep-searcher-0.1.8 2021-06-12 08:06:58 -04:00
Andrew Gallant
9b88cf8b72 deps/pcre2: update minimal versions 2021-06-12 08:06:50 -04:00
Andrew Gallant
6668d7ba8a grep-pcre2-0.1.5 2021-06-12 08:06:29 -04:00
Andrew Gallant
008da5dca4 pcre2: update minimal version to 0.2.3 2021-06-12 08:05:56 -04:00
Andrew Gallant
a34df1f690 deps/regex: update minimal versions 2021-06-12 08:05:36 -04:00
Andrew Gallant
7f3fd6f7ce grep-regex-0.1.9 2021-06-12 08:03:56 -04:00
Andrew Gallant
6331a7ac18 deps/matcher: update minimal versions 2021-06-12 08:03:47 -04:00
Andrew Gallant
cd4386bd9b grep-matcher-0.1.5 2021-06-12 08:02:30 -04:00
Andrew Gallant
cdc20c5685 deps/cli: update minimal versions 2021-06-12 08:02:18 -04:00
Andrew Gallant
0cf2b98df2 grep-cli-0.1.6 2021-06-12 08:01:22 -04:00
Andrew Gallant
9efdbf74a1 deps/ignore: update minimal versions 2021-06-12 08:01:13 -04:00
Andrew Gallant
53cb9a779e release: add step about making sure 'master' is in sync
Otherwise, if we start doing crate releases from the local checkout
(with git tags) and it turns out that origin/master has newer commits,
rebasing local master will then invalidate those tags.
2021-06-12 07:59:47 -04:00
Andrew Gallant
14860b0f16 ignore-0.4.18 2021-06-12 07:59:07 -04:00
Andrew Gallant
0eb1a1e7c9 deps/globset: update minimal versions 2021-06-12 07:58:46 -04:00
Andrew Gallant
5631e5c7a0 globset-0.4.7 2021-06-12 07:56:56 -04:00
Andrew Gallant
21644408f2 release: tweak 'cargo outdated' advice
I do run --aggressive, although I've been ignoring the clap 3 update for
what seems like forever since it's still in beta.
2021-06-12 07:54:51 -04:00
Andrew Gallant
0ee85a89f5 deps: update to memmap2
Looking at the changelog for memmap2, the only breaking change was to
MmapOptions, which we don't use. So no migration is needed.
2021-06-12 07:53:42 -04:00
Andrew Gallant
ed9d37959f deps: updates libc and syn 2021-06-12 07:52:04 -04:00
Andrew Gallant
9f924ee187 msrv: bump to Rust 1.52.1
This matches the latest stable release of Rust.
2021-06-01 21:07:37 -04:00
Andrew Gallant
35c5db6d1a deps: update everything
Removes two dependencies! autocfg and byteorder.
2021-06-01 21:07:37 -04:00
Andrew Gallant
e824531e38 edition: manual changes
This is mostly just about removing 'extern crate' everywhere and fixing
the fallout.
2021-06-01 21:07:37 -04:00
Andrew Gallant
af54069c51 edition: run 'cargo fix --edition --edition-idioms --all' 2021-06-01 21:07:37 -04:00
Andrew Gallant
77a9e99964 edition: set edition=2018 2021-06-01 21:07:37 -04:00
Andrew Gallant
459a9c5637 edition: initial 'cargo fix --edition' run 2021-06-01 21:07:37 -04:00
Andrew Gallant
e4c4540f6a changelog: fix typo and add Ruby to type improvement list 2021-06-01 11:57:16 -04:00
Ulysse Buonomo
5d0f2b0fc0 ignore/types: config.ru and *.rbw Ruby
PR #1886
2021-06-01 10:57:09 -04:00
Andrew Gallant
079a23b515 changelog: a bit of polish
I think I'm just waiting on the CVE to be published at this point.
2021-06-01 06:59:06 -04:00
Andrew Gallant
6e27649af1 github: add note about file types 2021-06-01 06:26:13 -04:00
Andrew Gallant
df83b8b444 ci: re-work github actions release
This combines the tips from #1820 and the patch submitted in #1675.
The latter wasn't taken as-is because I didn't agree with some of the
changes, and in particular, it removed the ability to easily test the
release on a branch with a dummy tag name. I've tried to add that back
here with the 'rg_version' output. Overall though, using outputs is
indeed much simpler.

Closes #1675, Closes #1820
2021-05-31 21:51:18 -04:00
Andrew Gallant
e48a17e189 changelog: prep for ripgrep 13 release 2021-05-31 21:51:18 -04:00
Andrew Gallant
fbb2cfed28 printer: trim line terminator before doing replacements
This is basically the same bug as #1401, but applied to replacements
instead of --only-matching.

Fixes #1739
2021-05-31 21:51:18 -04:00
Andrew Gallant
af8b27ffae changelog: fish completions are staying
In a previous release, I announced that Fish completions were being
removed. But the Fish project decided to remove theirs and have
ripgrep's stay.

Closes #1577
2021-05-31 21:51:18 -04:00
Martin Pool
8a4071eea9 globset: expand docs and impl Default for GlobSet
Closes #1882, Closes #1883
2021-05-31 21:51:18 -04:00
Andrew Gallant
ee23ab5173 printer: trim line terminator before finding submatches
This fixes a bug where PCRE2 look-around could change the result of a
match if it observed a line terminator in the printer. And in
particular, this is precisely how the searcher operates: the line is
considered unto itself *without* the line terminator.

Fixes #1401
2021-05-31 21:51:18 -04:00
Andrew Gallant
efd9cfb2fc grep: fix bugs in handling multi-line look-around
This commit hacks in a bug fix for handling look-around across multiple
lines. The main problem is that by the time the matching lines are sent
to the printer, the surrounding context---which some look-behind or
look-ahead might have matched---could have been dropped if it wasn't
part of the set of matching lines. Therefore, when the printer re-runs
the regex engine in some cases (to do replacements, color matches, etc
etc), it won't be guaranteed to see the same matches that the searcher
found.

Overall, this is a giant clusterfuck and suggests that the way I divided
the abstraction boundary between the printer and the searcher is just
wrong. It's likely that the searcher needs to handle more of the work of
matching and pass that info on to the printer. The tricky part is that
this additional work isn't always needed. Ultimately, this means a
serious re-design of the interface between searching and printing. Sigh.

The way this fix works is to smuggle the underlying buffer used by the
searcher through into the printer. Since these bugs only impact
multi-line search (otherwise, searches are only limited to matches
across a single line), and since multi-line search always requires
having the entire file contents in a single contiguous slice (memory
mapped or on the heap), it follows that the buffer we pass through when
we need it is, in fact, the entire haystack. So this commit refactors
the printer's regex searching to use that buffer instead of the intended
bundle of bytes containing just the relevant matching portions of that
same buffer.

There is one last little hiccup: PCRE2 doesn't seem to have a way to
specify an ending position for a search. So when we re-run the search to
find matches, we can't say, "but don't search past here." Since the
buffer is likely to contain the entire file, we really cannot do
anything here other than specify a fixed upper bound on the number of
bytes to search. So if look-ahead goes more than N bytes beyond the
match, this code will break by simply being unable to find the match. In
practice, this is probably pretty rare. I believe that if we did a
better fix for this bug by fixing the interfaces, then we'd probably try
to have PCRE2 find the pertinent matches up front so that it never needs
to re-discover them.

Fixes #1412
2021-05-31 21:51:18 -04:00
Andrew Gallant
656aa12649 printer: fix multi-line replacement bug
This commit fixes a subtle bug in multi-line replacement of line
terminators.

The problem is that even though ripgrep supports multi-line searches, it
is *still* line oriented. It still needs to print line numbers, for
example. For this reason, there are various parts in the printer that
iterate over lines in order to format them into the desired output.

This turns out to be problematic in some cases. #1311 documents one of
those cases (with line numbers enabled to highlight a point later):

    $ printf "hello\nworld\n" | rg -n -U "\n" -r "?"
    1:hello?
    2:world?

But the desired output is this:

    $ printf "hello\nworld\n" | rg -n -U "\n" -r "?"
    1:hello?world?

At first I had thought that the main problem was that the printer was
taking ownership of writing line terminators, even if the input already
had them. But it's more subtle than that. If we fix that issue, we get
output like this instead:

    $ printf "hello\nworld\n" | rg -n -U "\n" -r "?"
    1:hello?2:world?

Notice how '2:' is printed before 'world?'. The reason it works this way
is because matches are reported to the printer in a line oriented way.
That is, the printer gets a block of lines. The searcher guarantees that
all matches that start or end in any of those lines also end or start in
another line in that same block. As a result, the printer uses this
assumption: once it has processed a block of lines, the next match will
begin on a new and distinct line. Thus, things like '2:' are printed.

This is generally all fine and good, but an impedance mismatch arises
when replacements are used. Because now, the replacement can be used to
change the "block of lines" approach. Now, in terms of the output, the
subsequent match might actually continue the current line since the
replacement might get rid of the concept of lines altogether.

We can sometimes work around this. For example:

    $ printf "hello\nworld\n" | rg -U "\n(.)?" -r '?$1'
    hello?world?

Why does this work? It's because the '(.)' after the '\n' causes the
match to overlap between lines. Thus, the searcher guarantees that the
block sent to the printer contains every line.

And therein lies the solution: all we need to do is tweak the multi-line
searcher so that it combines lines with matches that are directly adjacent,
instead of requiring at least one byte of overlap. Fixing that solves
the issue above. It does cause some tests to fail:

* The binary3 test in the searcher crate fails because adjacent line
  matches are now part of one block, and that block is scanned for
  binary data. To preserve the essence of the test, we insert a couple
  dummy lines to split up the blocks.
* The JSON CRLF test. It was testing that we didn't output any messages
  with an empty 'submatches' array. That is indeed still the case. The
  difference is that the messages got combined because of the adjacent
  line merging behavior. This is a slight change to the output, but is
  still correct.
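
The merging rule described above can be sketched roughly as follows. This
is an illustration under simplifying assumptions, not the searcher's real
code: `merge_blocks` and its `merge_adjacent` switch are hypothetical, and
matching lines are represented as sorted, half-open byte ranges.

```rust
/// Merge sorted, half-open byte ranges `[start, end)` of matching lines
/// into blocks. With `merge_adjacent` set, ranges that merely touch
/// (`start == previous end`) are combined; otherwise at least one byte
/// of overlap is required.
fn merge_blocks(ranges: &[(usize, usize)], merge_adjacent: bool) -> Vec<(usize, usize)> {
    let mut blocks: Vec<(usize, usize)> = Vec::new();
    for &(start, end) in ranges {
        if let Some(last) = blocks.last_mut() {
            let touches = merge_adjacent && start == last.1;
            if start < last.1 || touches {
                // Extend the current block instead of starting a new one.
                last.1 = last.1.max(end);
                continue;
            }
        }
        blocks.push((start, end));
    }
    blocks
}
```

For "hello\nworld\n", the two lines occupy bytes 0..6 and 6..12. Requiring
overlap keeps them as two blocks, while merging adjacent ranges yields one
block covering 0..12, which is what lets the replacement join the lines.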

Fixes #1311
2021-05-31 21:51:18 -04:00
Andrew Gallant
fc31aedcf3 printer: vimgrep now only prints one line
It turns out that the vimgrep format really only wants one line per
match, even when that match spans multiple lines.

We continue to support the previous behavior (print all lines in a
match) in the `grep-printer` crate. We add a new option to enable the
"only print the first line" behavior, and unconditionally enable it in
ripgrep. We can do that because the option has no effect in single-line
mode, since, well, in that case matches are guaranteed to span one line
anyway.

Fixes #1866
2021-05-31 21:51:18 -04:00
Anthony Huang
578e1992fa cli: add --field-{context,match}-separator flags
These flags permit configuring the bytes used to delimit fields in match
or context lines, where "fields" are things like the file path, line
number, column number and the match/context itself.
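
To make "fields" concrete, here is a small sketch of joining them with a
configurable separator. The function `format_fields` is hypothetical and
much simpler than the printer's real formatting code:

```rust
/// Join the fields of one output line with a configurable separator.
/// `sep` is arbitrary bytes, so the result is built as bytes too.
fn format_fields(path: &str, line_number: u64, text: &str, sep: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(path.as_bytes());
    out.extend_from_slice(sep);
    out.extend_from_slice(line_number.to_string().as_bytes());
    out.extend_from_slice(sep);
    out.extend_from_slice(text.as_bytes());
    out
}
```

Passing `b":"` produces the familiar 'path:line:text' shape, and any other
byte sequence can be substituted without touching the rest of the line.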

Fixes #1842, Closes #1871
2021-05-31 21:51:18 -04:00
Austin Wise
46d0130597 cargo: statically link binary on Windows/MSVC
Before this change, rg.exe depended on vcruntime140.dll, which does not
exist on a fresh install of Windows.

Closes #1613
2021-05-31 21:51:18 -04:00
Andres Suarez
7534d5144f globset: fix recursive suffix over matching
Previously, 'foo/**' would match 'foo', but it shouldn't have. In this
case, not matching 'foo' is what is documented and also seems consistent
with other recursive globbing implementations (like that in zsh).

This also updates the prefix extractor to pull 'foo/' out of 'foo/**'.
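
A rough model of the intended semantics, assuming a glob of the shape
'<prefix>/**' and ignoring every other glob feature. The helper
`matches_recursive_suffix` is hypothetical and is not how globset compiles
patterns:

```rust
/// Models a glob of the form '<prefix>/**': a path matches only if it
/// continues past `prefix` with a '/', so `prefix` itself does not match.
fn matches_recursive_suffix(prefix: &str, path: &str) -> bool {
    match path.strip_prefix(prefix) {
        Some(rest) => rest.starts_with('/'),
        None => false,
    }
}
```

Under this model 'foo/bar' matches 'foo/**' while 'foo' and 'foobar/baz'
do not.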

Closes #1756
2021-05-31 21:51:18 -04:00
Richard Khoury
a28e664abd ignore: check ignore rules before issuing stat calls
This seems like an obvious optimization but becomes critical when
filesystem operations even as simple as stat can result in significant
overheads; an example of this was a bespoke filesystem layer in Windows
that hosted files remotely and would download them on-demand when
particular filesystem operations occurred. Users of this system who
ensured correct file-type filters were being used could still get
unnecessary file access resulting in large downloads.
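
The ordering can be sketched like this. It is only an illustration:
`should_visit` and both closures are hypothetical stand-ins for the real
ignore matching and the real stat call.

```rust
/// Decide whether to visit an entry. The cheap, purely path-based ignore
/// check runs first, so the `stat` stand-in is only invoked for entries
/// that survive it.
fn should_visit<I, S>(path: &str, is_ignored: I, mut stat_is_file: S) -> bool
where
    I: Fn(&str) -> bool,
    S: FnMut(&str) -> bool,
{
    if is_ignored(path) {
        // No filesystem access has happened for this entry.
        return false;
    }
    stat_is_file(path)
}
```

Counting calls to the second closure shows that an ignored path never
triggers it, which is the property that matters on filesystems where stat
is expensive.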

Fixes #1657, Closes #1660
2021-05-31 21:51:18 -04:00
Pen Tree
0ca96e004c printer: fix context bug when --max-count is used
In the case where after-context is requested with a match count limit,
we need to be careful not to reset the state tracking the remaining
context lines.
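
One way to read this, as a simplified sketch rather than the printer's
actual state machine: once the match limit is reached, later matching lines
are treated as plain context and must not refill the after-context counter.
The function `lines_to_print` and its parameters are hypothetical.

```rust
/// Returns the indices of lines to print, given which lines match.
fn lines_to_print(is_match: &[bool], max_count: usize, after_context: usize) -> Vec<usize> {
    let mut printed = Vec::new();
    let mut matches_seen = 0;
    let mut after_remaining = 0;
    for (i, &m) in is_match.iter().enumerate() {
        if m && matches_seen < max_count {
            // A counted match: print it and (re)start the context window.
            matches_seen += 1;
            after_remaining = after_context;
            printed.push(i);
        } else if after_remaining > 0 {
            // Past the limit, a matching line is just context: it is
            // printed, but does not reset `after_remaining`.
            after_remaining -= 1;
            printed.push(i);
        } else if matches_seen >= max_count {
            // Limit reached and context exhausted: stop.
            break;
        }
    }
    printed
}
```

With a limit of one match and two lines of after-context, three
consecutive matching lines produce exactly three printed lines.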

Fixes #1380, Closes #1642
2021-05-31 21:51:18 -04:00
Alessandro Menezes
2295061e80 searcher: do UTF-8 BOM sniffing like UTF-16
Previously, we were only looking for the UTF-16 BOM for determining
whether to do transcoding or not. But we should also look for the UTF-8
BOM as well.
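
A minimal sketch of BOM sniffing that covers all three marks. The `Bom`
enum and `sniff_bom` function are hypothetical names for this example; the
byte sequences themselves are the standard ones.

```rust
#[derive(Debug, PartialEq, Eq)]
enum Bom {
    Utf8,
    Utf16Le,
    Utf16Be,
}

/// Inspect the first bytes of a file for a byte order mark.
fn sniff_bom(bytes: &[u8]) -> Option<Bom> {
    if bytes.starts_with(&[0xEF, 0xBB, 0xBF]) {
        Some(Bom::Utf8)
    } else if bytes.starts_with(&[0xFF, 0xFE]) {
        Some(Bom::Utf16Le)
    } else if bytes.starts_with(&[0xFE, 0xFF]) {
        Some(Bom::Utf16Be)
    } else {
        None
    }
}
```

A caller would use the result to decide whether to transcode, or in the
UTF-8 case, to skip the three BOM bytes.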

Fixes #1638, Closes #1697
2021-05-31 21:51:18 -04:00
Raimon Grau
53c4855517 ignore/types: add red
See: https://www.red-lang.org/

Closes #1663
2021-05-31 21:51:18 -04:00
Simon Morgan
121e0135c1 ignore/types: replace duplicate glob with *.aspx.vb
*.aspx.cs was listed twice and the VB variant is missing.

Closes #1683
2021-05-31 21:51:18 -04:00
tillyboy
c53c4c0ade doc: explain ignore rules a bit more
Closes #1600
2021-05-31 21:51:18 -04:00
João Marcos
4566882521 cli: add -. as short option for --hidden
This is somewhat non-standard, but it seems nice on the surface: short
flag names are in short supply, --hidden is probably somewhat common and
-. has an obvious connection with how hidden files are named on Unix.

Closes #1680
2021-05-31 21:51:18 -04:00
Andrew Gallant
12dd455ee9 printer: fix \r\n line terminator handling
This fixes a bug where it was assumed that 'is_suffix' when CRLF
handling was enabled meant that '\r\n' was present. But that's not the
case, and it is intentional that 'is_suffix' only looks for '\n'. (Which
is why #1803 wasn't taken, which tries to fix this by changing
'is_suffix'.)
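
The distinction can be sketched as follows. This is an illustration only:
`trim_line_terminator` is a hypothetical helper, not the printer's code.
The point is that a trailing '\n' says nothing about whether a '\r'
precedes it, so the '\r' has to be checked separately.

```rust
/// Strip one trailing line terminator. Only '\n' marks the end of a line;
/// when `crlf` is enabled, a '\r' immediately before it is removed as
/// well, but its presence is never assumed.
fn trim_line_terminator(line: &[u8], crlf: bool) -> &[u8] {
    let mut line = line;
    if line.ends_with(b"\n") {
        line = &line[..line.len() - 1];
        if crlf && line.ends_with(b"\r") {
            line = &line[..line.len() - 1];
        }
    }
    line
}
```

With CRLF handling on, both "foo\r\n" and "foo\n" trim to "foo", and
nothing is cut from a line that ends in a bare '\n'.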

Fixes #1765, Closes #1803
2021-05-31 21:51:18 -04:00
goto-engineering
e6cac8b119 cli: print warning if nothing was searched
This was once part of ripgrep, but at some point, was unintentionally
removed. The value of this warning is that since ripgrep tries to be
"smart" by default, it can be surprising if it doesn't search certain
things. This warning covers the case when ripgrep searches *nothing*,
which happens somewhat more frequently than you might expect. e.g., If
you're searching within an ignored directory.

Note that for now, we only print this message when the user has not
supplied any explicit paths. It's not clear that we want to print this
otherwise, and in particular, it seems that the message shows up too
eagerly. e.g., 'rg foo does-not-exist' will both print an error about
'does-not-exist' not existing, *and* the message about no files being
searched, which seems annoying in this case. We can always refine this
logic later.

Fixes #1404, Closes #1762
2021-05-31 21:51:18 -04:00
Marco Ieni
0f502a9439 cargo: remove "readme" field
It is apparently no longer required since a README.md file is
automatically detected:
https://doc.rust-lang.org/cargo/reference/manifest.html#the-readme-field

Closes #1770
2021-05-31 21:51:18 -04:00
Ilya Grigoriev
51d2db7f19 doc: document '{a,b}' glob syntax
This syntax does not exist in `git`, so it is not documented in `man
gitignore`. There is a question of whether it *should* exist, but as
long as it does, it should be documented somewhere.

See also:
https://github.com/BurntSushi/ripgrep/issues/1221
https://github.com/BurntSushi/ripgrep/issues/1368

Closes #1816
2021-05-31 21:51:18 -04:00
Marco Ieni
b3a6a69f9d ci: check docs for all crates
This also replaces '--all' in Cargo commands with '--workspace'. The
former has apparently been deprecated.

We also fix a couple warnings that this new step detected.

Closes #1848
2021-05-31 21:51:18 -04:00
Jade
26a29c750e doc: clarify --files-with-matches and --files-without-match
Ref https://github.com/BurntSushi/ripgrep/issues/103#issuecomment-763083510

Closes #1869
2021-05-31 21:51:18 -04:00
Varik Valefor
beda5f70dc doc: improve wording
This tightens up the wording in ripgrep's opening description. It's used
in several places, so we update all of them.

Closes #1881
2021-05-31 21:51:18 -04:00
Vasili Revelas
5af7707a35 cli: fix process leak
If ripgrep was called in a way where the entire contents of a file
aren't read (like --files-with-matches, among other methods), and if the
file was read through an external process, then ripgrep would never reap
that process.

We fix this by introducing an explicit 'close' method, which we now call
when using decompression or preprocessor searches.

The implementation of 'close' is a little hokey. In particular, when we
close stdout, this usually results in a broken pipe, and, consequently,
a non-zero code returned once the child process is reaped. This is
"situation normal," so we invent a (hopefully portable) heuristic for
detecting it.

Fixes #1766, Closes #1767
2021-05-31 21:51:18 -04:00
Vasili Revelas
3f33a83a5f searcher: remove variable shadowing
The previous variable name was the same as one of the method arguments.
2021-05-31 21:51:18 -04:00
Andrew Gallant
35b52d33b9 regex: add unit tests for non-matching anchor bytes
This is in addition to the integration level test added in
581a35e568.
2021-05-31 21:51:18 -04:00
Andrew Gallant
a77b914e7a args: make --passthru and -A/-B/-C override each other
Fixes #1868
2021-05-31 21:51:18 -04:00
Andrew Gallant
2e2af50a4d doc: add vulnerability report docs
Fixes #1773
2021-05-29 09:53:18 -04:00
Andrew Gallant
229d1a8d41 cli: fix arbitrary execution of program bug
This fixes a bug only present on Windows that would permit someone to
execute an arbitrary program if they crafted an appropriate directory
tree. Namely, if someone put an executable named 'xz.exe' in the root of
a directory tree and one ran 'rg -z foo' from the root of that tree,
then the 'xz.exe' executable in that tree would execute if there are any
'xz' files anywhere in the tree.

The root cause of this problem is that 'CreateProcess' on Windows will
implicitly look in the current working directory for an executable when
it is given a relative path to a program. Rust's standard library allows
this behavior to occur, so we work around it here. We work around it by
explicitly resolving programs like 'xz' via 'PATH'. That way, we only
ever pass an absolute path to 'CreateProcess', which avoids the implicit
behavior of checking the current working directory.
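
The idea of resolving via 'PATH' can be sketched with the standard library
alone. This is not the code in the cli crate: `resolve_program` is a
hypothetical helper, the `exists` callback is injected purely so the lookup
can be exercised without touching disk, and skipping relative 'PATH'
entries is a choice made for this sketch. Windows details such as
executable extensions are left out.

```rust
use std::env;
use std::ffi::OsStr;
use std::path::{Path, PathBuf};

/// Resolve a bare program name to an absolute path using only the
/// directories listed in `path_var`. Relative entries (such as '.') are
/// skipped, so the current working directory is never consulted.
fn resolve_program<F>(program: &str, path_var: &OsStr, exists: F) -> Option<PathBuf>
where
    F: Fn(&Path) -> bool,
{
    env::split_paths(path_var)
        .filter(|dir| dir.is_absolute())
        .map(|dir| dir.join(program))
        .find(|candidate| exists(candidate.as_path()))
}
```

If nothing is found, the caller can refuse to spawn anything rather than
handing a relative name to the operating system.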

This fix doesn't apply to non-Windows systems as it is believed to only
impact Windows. In theory, the bug could apply on Unix if '.' is in
one's PATH, but at that point, you reap what you sow.

While the extent to which this is a security problem isn't clear, I
think users generally expect to be able to download or clone
repositories from the Internet and run ripgrep on them without fear of
anything too awful happening. Being able to execute an arbitrary program
probably violates that expectation. Therefore, CVE-2021-3013[1] was
created for this issue.

We apply the same logic to the --pre command, since the --pre command is
likely in a user's config file and it would be surprising for something
that the user is searching to modify which preprocessor command is used.

The --pre and -z/--search-zip flags are the only two ways that ripgrep
will invoke external programs, so this should cover any possible
exploitable cases of this bug.

[1] - https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-3013
2021-05-29 09:36:48 -04:00
Andrew Gallant
8ec6ef373f changelog: sync with commits since last release
I'm hoping to get a release out soon, and this is the first step.
2021-05-29 08:26:46 -04:00
Andrew Gallant
581a35e568 impl: fix --multiline anchored match bug
This fixes a bug where using \A or (?-m)^ in combination with
-U/--multiline would permit matches that aren't anchored to the
beginning of the file. The underlying cause was an optimization that
occurred when mmaps couldn't be used. Namely, ripgrep tries to still
read the input incrementally if it knows the pattern can't match through
a new line. But the detection logic was flawed, since it didn't account
for line anchors. This commit fixes that.

Fixes #1878, Fixes #1879
2021-05-29 07:37:28 -04:00
jack1142
ba965962fe ignore/types: add po files to supported types
See: https://www.gnu.org/software/gettext/manual/html_node/PO-Files.html

Closes #1875
2021-05-28 12:06:10 -04:00
Andrew Gallant
94e4b8e301 printer: fix --vimgrep for multi-line mode
It turned out that --vimgrep wasn't quite getting the column of each
match correctly. Instead of printing column numbers relative to the
current line, it was printing column numbers as byte offsets relative to
where the match began. To fix this, we simply subtract the offset of the
line number from the beginning of the match. If the beginning of the
match came before the start of the current line, then there's really
nothing sensible we can do other than to use a column number of 1, which
we now document.
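
In code, the computation amounts to something like the following sketch,
where both arguments are byte offsets into the same buffer. The function
name `vimgrep_column` is hypothetical:

```rust
/// Compute the 1-based column of a match relative to the line being
/// printed. If the match started before this line, fall back to column 1.
fn vimgrep_column(match_start: usize, line_start: usize) -> usize {
    match_start.saturating_sub(line_start) + 1
}
```

The saturating subtraction is what yields column 1 when the match begins
on an earlier line.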

Interestingly, existing tests were checking that the previous behavior
was intended. My only defense is that I somehow tricked myself into
thinking it was a byte offset instead of a column number.

Kudos to @bfrg for calling this out in #1866:
https://github.com/BurntSushi/ripgrep/issues/1866#issuecomment-841635553
2021-05-15 08:27:59 -04:00
Alessandro Caputo
2af77242c5 doc: fix typo in --engine flag docs
Fixes #1862
2021-05-08 15:35:44 -04:00
Andrew Gallant
3f4c4188c1 deps: update to regex 1.5.2
This brings in a performance bug fix, merged in
https://github.com/rust-lang/regex/pull/768.

Fixes #1860.
2021-05-01 07:44:47 -04:00
Andrew Gallant
ce4b587055 deps: update everything
It looks like no new dependencies have been introduced. Yay!

This update was primarily motivated to bring regex 1.5 in with its new
memmem implementation from the memchr crate.
2021-04-30 20:26:32 -04:00
Eliaz Bobadilla
be63122508 doc: add links to Spanish translation
PR #1856
2021-04-21 11:14:11 -04:00
Dan Bjorge
92286ad4d2 doc: clarify --hidden definition
On Windows, we didn't previously document that ripgrep
respected both the prefix-dot convention _and_ the "hidden"
attribute on files.

Fixes #1847
2021-04-15 19:21:26 -04:00
jgart
4ebe8375ec ignore/types: add mint
PR #1844
2021-04-04 08:00:12 -04:00
Andrew Gallant
7923d25228 core: add a 'trace' message
This message will emit the binary detection mechanism being used for
each file.

This does not noticeably increase the number of log messages, as the
'trace' level is already used for emitting messages for every file
searched.

This trace message was added in the course of investigating #1838.
2021-03-31 13:54:00 -04:00
aricha1940
1c3eebefec searcher: update outdated comment for buffer size
Looks like this was accidentally left set to 8 in commit 46fb77c.

PR #1839
2021-03-31 08:18:38 -04:00
Andrew Gallant
64ac2ebe0f tests: fix tests for buffer size change
Sadly, there were several tests that are coupled to the size of the
buffer used by ripgrep. Making the tests agnostic to the size is
difficult. And it's annoying to fix the tests. But we rarely change the
buffer size, so ¯\_(ツ)_/¯.
2021-03-23 18:14:18 -04:00
Andrew Gallant
46fb77c20c searcher: bump buffer size
This increases the initial buffer size from 8KB to 64KB. This actually
leads to a reasonably noticeable improvement in at least one work-load,
and is unlikely to regress in any other case. Also, since Rust programs
(at least on Linux) seem to always use a minimum of 6-8MB of memory,
adding an extra 56KB is negligible.
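
For a sense of what the change means, here is a sketch using the standard
library's `BufReader` as a stand-in. ripgrep's searcher manages its own
line buffer, so this is an analogy and not the changed code; the names
`BUFFER_CAPACITY` and `buffered` are hypothetical.

```rust
use std::io::{BufReader, Read};

/// The new initial capacity described above: 64KB instead of 8KB.
const BUFFER_CAPACITY: usize = 64 * 1024;

/// Wrap any reader in a buffer with the larger initial capacity.
fn buffered<R: Read>(reader: R) -> BufReader<R> {
    BufReader::with_capacity(BUFFER_CAPACITY, reader)
}
```

The difference from an 8KB buffer is the 56KB mentioned above.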

Before:

    $ hyperfine -i "rg 'zqzqzqzq' OpenSubtitles2018.raw.en --no-mmap"
    Benchmark #1: rg 'zqzqzqzq' OpenSubtitles2018.raw.en --no-mmap
      Time (mean ± σ):      2.109 s ±  0.012 s    [User: 565.5 ms, System: 1541.6 ms]
      Range (min … max):    2.094 s …  2.128 s    10 runs

After:

    $ hyperfine -i "rg 'zqzqzqzq' OpenSubtitles2018.raw.en --no-mmap"
    Benchmark #1: rg 'zqzqzqzq' OpenSubtitles2018.raw.en --no-mmap
      Time (mean ± σ):      1.802 s ±  0.006 s    [User: 462.3 ms, System: 1337.9 ms]
      Range (min … max):    1.795 s …  1.814 s    10 runs
2021-03-23 17:45:02 -04:00
Allen Wild
6a1c3253e0 ci: fix deb build script in clean checkout
If ripgrep hasn't been built yet (i.e. target/debug/ doesn't exist),
then cargo-out-dir can't find OUT_DIR and the copy commands fail. Fix by
running cargo build before finding OUT_DIR.

Also add a check to fail early with a sensible error message when
asciidoctor isn't installed, rather than failing because of a missing
rg.1 file after the build.

PR #1831
2021-03-20 13:37:50 -04:00
Andrew Gallant
c7730d1f3a deps: bump regex and regex-syntax 2021-03-11 21:20:25 -05:00
Hanif Ariffin
c5ea5a13df gitignore: add HTML files generated by cargo -Z timings
PR #1801
2021-02-12 11:09:56 -05:00
Sergei Vorobev
9c8d873a75 ignore/types: improve bazel globs
Adds *.BUILD and *.bazelrc.

PR #1789
2021-01-30 18:22:48 -05:00
Andrew Gallant
7899a4b931 regex: s/CachedThreadLocal/ThreadLocal
CachedThreadLocal has been deprecated. We bump thread_local's minimal
version corresponding to that deprecation as well.
2021-01-25 10:38:05 -05:00
Andrew Gallant
ae55a4e872 deps: update everything
Most of these updates come from releases I've made, and the rest appear
minor. No new dependencies have been added, and `const_fn` was removed.
Yay.
2021-01-17 18:55:17 -05:00
Andrew Gallant
3a1780d841 deps: replace memmap with memmap2
memmap is unmaintained at this point and it is being flagged as a
RUSTSEC advisory in ripgrep. This doesn't seem like that big of a deal
to me honestly, but memmap2 looks like a fine choice at this point.

Fixes #1785, Closes #1786
2021-01-17 18:49:51 -05:00
Andrew Gallant
a6d05475fb ignore-0.4.17 2020-11-23 10:25:33 -05:00
Roey Darwish Dror
020c5453a5 cli: fix stdin detection for Powershell on Unix
It seems that PowerShell uses sockets instead of FIFOs to redirect the
output between commands. So add `is_socket` to our `is_readable_stdin`
check.

This seems unlikely to cause problems and it is probably more generally
correct than what we had before. In theory, it could cause problems if
it produces false positives, in which case, ripgrep will try to read
stdin when it should search the current working directory. (And this
usually winds up manifesting as ripgrep blocking forever.) But, if the
stdin handle reports itself as a socket, then it seems like we should
read it.

Fixes #1741, Closes #1742
2020-11-23 10:23:34 -05:00
Ed Page
873abecbf1 ignore: provide underlying IO Error
`ignore::Error` wraps `std::io::Error` with additional information
(as well as exposing non-IO errors). For people wanting to inspect what
the error is, they have to recursively match the Enum. This provides
`io_error` and `into_io_error` helpers to do this for the user.
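
The kind of recursive matching such a helper saves can be sketched with a
simplified, hypothetical error type. `WalkError` below is not
`ignore::Error`; it only mirrors the idea of wrapper variants around an
inner error.

```rust
use std::io;

/// A simplified, hypothetical stand-in for a nested error type.
#[allow(dead_code)]
#[derive(Debug)]
enum WalkError {
    WithPath { path: String, err: Box<WalkError> },
    WithLineNumber { line: u64, err: Box<WalkError> },
    Io(io::Error),
    Other(String),
}

impl WalkError {
    /// Peel off wrapper variants until an I/O error is found, if any.
    fn io_error(&self) -> Option<&io::Error> {
        match self {
            WalkError::WithPath { err, .. } => err.io_error(),
            WalkError::WithLineNumber { err, .. } => err.io_error(),
            WalkError::Io(err) => Some(err),
            WalkError::Other(_) => None,
        }
    }
}
```

A caller can then check something like the error kind in one step instead
of matching on every wrapper by hand.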

PR #1740
2020-11-23 10:19:31 -05:00
tleb
8c73833efc readme: fix link to .deb
This is a common thing to forget to do after a release.
2020-11-22 09:56:02 -05:00
James Harr
44e69ba627 ignore/types: add yang file type
YANG is described in RFC 6020
https://tools.ietf.org/html/rfc6020

PR #1736
2020-11-20 09:41:29 -05:00
Andrew Gallant
13d77ab646 ci: update to GITHUB_ENV
Apparently ::set-env has been completely disabled. Sigh.
2020-11-16 19:17:36 -05:00
Andrew Gallant
d97fb72d84 doc: update CI links in crate READMEs
I switched to GitHub Actions long ago, which replaces both Travis and
AppVeyor.

Fixes #1732
2020-11-16 19:07:16 -05:00
Andrew Gallant
d6365117e2 doc: sync --help output with man page
The man page had the correct usage hints, but the -h/--help output was
using an older more incorrect version of the hints.

Closes #1730 (again)
2020-11-15 15:27:23 -05:00
Andrew Gallant
f32e906012 doc: clarify that CLI invocation must always be valid
This comes up as a corner case where folks provide -e/--regexp in a
configuration file and then expect to be able to run 'rg' with no args.
However, ripgrep fails because it still expects at least one pattern
even though one was specified in the config file.

This occurs because ripgrep has to parse its CLI parameters before
reading the config file. (For log output settings and to handle the
--no-config flag.) This initial parse will fail if there are no patterns
specified.

The only way to solve this that I can see is to somehow relax the
requirements of the initial parse. But this is problematic because we
would still need to enforce those requirements in cases where we don't
do a second parse (when no config file is present).

All in all, this doesn't seem like a problem that is worth solving.

Closes #1730
2020-11-15 15:00:08 -05:00
Taiki Endo
59644d4592 ci: install cross from crates.io
A new release of cross has been put out, so we
no longer need to install it from git.

PR #1728
2020-11-09 07:25:41 -05:00
Alex Touchet
3ca324fda7 doc: update several links to use https
PR #1724
2020-11-03 10:33:36 -05:00
Stefan VanBuren
8782f8200c doc: add missing backtick in FAQ
PR #1723
2020-11-03 10:32:38 -05:00
Andrew Gallant
2819212f89 printer: tweak binary detection message format
This roughly matches similar changes made in GNU grep recently.
2020-11-02 10:52:51 -05:00
Andrew Gallant
810be0b348 deps: update base64 to 0.13.0 2020-11-02 10:52:51 -05:00
Andrew Gallant
a28bb1e953 deps: bring in all semver updates
This brings in all other semver updates.

This did require updating some tests, since bstr changed its debug
output for NUL bytes to be a bit more idiomatic.
2020-11-02 10:52:51 -05:00
Andrew Gallant
3ef63dacbe deps: targeted update of some dependencies
This updates encoding_rs, crossbeam-utils and crossbeam-channel. This
serves two purposes. The encoding_rs update fixes a compilation failure
on the latest nightly. The crossbeam updates are good sense and to
reduce duplicate dependencies such as cfg-if. (Although, we note that
the log crate still pulls in cfg-if 0.1, so ripgrep has a duplicate
dependency there for now. But it's very small.)

Fixes #1721, Closes #1705
2020-11-02 10:52:51 -05:00
Vanessa McHale
e1ac18ef06 ignore/types: add Futhark
See: https://futhark-lang.org/

PR #1720
2020-10-31 12:10:15 -04:00
Brandon Adams
ba3f9673ad ignore/types: generalize bazel type a bit
Bazel supports `BUILD.bazel` as well as `WORKSPACE.bazel`. In
addition, it is common to ship BUILD/WORKSPACE templates for
external repositories suffixed with .bazel for easier tool
recognition.

Co-authored-by: Brandon Adams <brandon.adams@imc.com>

PR #1716
2020-10-23 12:24:30 -04:00
Andrew Gallant
c777e2cd57 globset-0.4.6 2020-10-21 21:10:43 -04:00
Ajeet D'Souza
e5639cf22d globset: remove regex unicode dependency
Since the translation from a glob to a regex always
disables Unicode in the regex, it follows that we shouldn't
need regex's Unicode features enabled.

Now, ripgrep enables Unicode features in its regex
dependency and of course uses them, which will cause
globset to have it enabled in the ripgrep build as well. So
this doesn't actually change anything for ripgrep. But this
does slim things down for folks using globset independently
of ripgrep.

PR #1712
2020-10-19 14:29:05 -04:00
Dương Đỗ Minh Châu
86c843a44b ignore/types: add a type for minified files
Fixes #1710, PR #1711
2020-10-19 09:10:54 -04:00
Andrew Gallant
2b1637d1db doc: clarify how -S/--smart-case works
Whether or not smart case kicks in can be a little subtle in some cases.
So we document the specific conditions in which it applies. These
conditions were taken directly from the public API docs of the
`grep-regex` crate:
https://docs.rs/grep-regex/0.1.8/grep_regex/struct.RegexMatcherBuilder.html#method.case_smart

Fixes #1708
2020-10-17 18:55:44 -04:00
Andrew Pyatkov
6301e20ee4 ignore/types: add flatbuffers type
See: https://google.github.io/flatbuffers/

PR #1707
2020-10-16 20:19:16 -04:00
dana
145cef2eff doc: elaborate on the function of -u/--unrestricted
Fixes #1703
2020-10-16 09:52:42 -04:00
Andrew Gallant
20534fad04 benchsuite/runs: add updated benchmark, with ugrep 2020-10-14 17:01:45 -04:00
Andrew Gallant
de0c24f31c benchsuite: add ugrep commands to benchmarks 2020-10-14 17:00:35 -04:00
Andrew Gallant
c55e7af675 benchsuite: remove -a flag from grep
It's not quite clear why I added this originally. ripgrep doesn't have
its `-a` flag enabled. It's possible I tricked myself into adding it
because ripgrep's binary detection has evolved to be more like GNU
grep's nowadays.

In any case, using `-a` on data that is non-binary can only improve
performance because it removes the overhead for checking whether the
data is binary or not. So this was giving an artificial boost to GNU
grep.
2020-10-14 15:16:25 -04:00
Andrew Gallant
5ebb3ad039 benchsuite: remove sift, pt and ucg
None of these tools got particularly popular (except for pt briefly),
but they do not appear to be active projects nowadays. While ucg was
fast, sift and pt were excruciatingly slow in a number of cases that
required special care in the benchmarks.

This also fixes the ordering of benchmark output to reflect the ordering
in the source of the benchsuite script.
2020-10-14 15:16:07 -04:00
Andrew Gallant
b0066274cb benchsuite: update subtitle URLs
Since the English subtitle file actually changed its content, we tweak
the benchmark to use a slightly bigger sample that more closely matches
the file size of the Russian subtitle file.

Also, the BurntSushi/linux repo has been updated and I've confirmed that
it builds on my Linux machine.

Fixes #1257
2020-10-14 14:17:23 -04:00
Josh Soref
def993bad1 spelling: fix various misspellings
These were found by the check spelling action[1] and reported
here[2].

PR #1685 

[1] - https://github.com/marketplace/actions/check-spelling
[2] - 6f02d05671 (commitcomment-42625778)
2020-09-22 10:29:16 -04:00
Andrew Gallant
f511849c81 doc: fix FAQ ordering
The actual answers were in a different order than the table of contents.
This commit corrects that. No content has been changed.
2020-09-13 09:33:14 -04:00
Andrew Gallant
e6e50054b0 doc: document cygwin path translation behavior
Kudos to @Pyker for posting more details about this.

Closes #1277
2020-09-13 09:29:28 -04:00
Andrew Gallant
11c7b2ae17 deps: upgrade pcre2-sys to 0.2.5
This brings in a PR that disables the JIT on certain Apple targets since
it doesn't appear to build.

See: https://github.com/BurntSushi/rust-pcre2/pull/16
2020-08-27 09:37:53 -04:00
Andrew Gallant
ac7d4c99b9 deps: bump pcre2-sys again
The pcre2-sys 0.2.3 release was bunk, since it didn't include the PCRE2
source for some reason.
2020-08-19 19:03:50 -04:00
Andrew Gallant
b5681e3694 deps: bump pcre2-sys
This should bring a compilation time improvement when building static
builds of PCRE2 by enabling parallelism for C compilation.

Kudos to @JoshTriplett for the tip!
2020-08-19 18:55:16 -04:00
Andy Freeland
fc2a99bb1f ignore/types: add vcl (#1659)
VCL is the Varnish Configuration Language used by Varnish and Fastly.

https://varnish-cache.org/docs/trunk/users-guide/vcl.html

PR #1659
2020-08-19 16:28:14 -04:00
Raimon Grau (rgrau)
ffd4c9ccba ignore/types: add racket
PR #1628
2020-06-25 08:51:32 -04:00
jtrakk
a16bfcb3d6 ignore/types: add dvc
This provides support for DVC files (https://dvc.org/).

PR #1608
2020-06-09 07:44:09 -04:00
Martin Michlmayr
1b2c1dc675 doc: fix typos
PR #1605
2020-06-04 09:06:09 -04:00
Andrew Gallant
b1e3de246c changelog: add empty TBD section to CHANGELOG
And update the release checklist to mention this process.
2020-05-29 09:49:45 -04:00
Andrew Gallant
bb36fc1bf8 pkg: update brew tap version to 12.1.1 2020-05-29 09:48:19 -04:00
Andrew Gallant
7cb211378a 12.1.1 2020-05-29 09:26:47 -04:00
Andrew Gallant
a73c0a21d9 changelog: 12.1.1 2020-05-29 09:26:33 -04:00
Andrew Gallant
0b965f900c doc: small release checklist updates
In particular, explicitly note when to update the CHANGELOG.

Also, tweak the ripgrep introductory message.
2020-05-29 09:21:19 -04:00
Andrew Gallant
a2f90747c9 core: update minimal dependency versions 2020-05-29 09:18:59 -04:00
Andrew Gallant
f97cc623f7 grep-0.2.7 2020-05-29 09:17:24 -04:00
Andrew Gallant
f35de5c523 grep: update minimal dependency versions 2020-05-29 09:17:08 -04:00
Andrew Gallant
c9bb78ceba grep-cli-0.1.5 2020-05-29 09:14:18 -04:00
Andrew Gallant
72bdde6771 ignore-0.4.16 2020-05-29 09:13:02 -04:00
Andrew Gallant
d66712a452 deps: update all dependencies 2020-05-29 09:11:50 -04:00
Andy Salerno
e8822ce97a ignore/doc: update misleading documentation
This likely originated from a bad copy/paste.

PR #1596
2020-05-24 23:12:53 -04:00
Andrew Gallant
a700b75843 doc: clarify capture group indices
And in particular, note the special $0 index, which corresponds to the
entire match.

Fixes #1591
2020-05-21 22:22:51 -04:00
Gerion Entrup
b72ad8f8aa ignore/types: add meson filetype
Closes #1586, PR #1587
2020-05-18 14:01:35 -04:00
Andrew Gallant
1980630f17 doc: fix egregious markup output
We use '+++' syntax to output a literal '**' for a '--glob' example.
This '+++' syntax is pretty ugly when rendered literally via --help. We
fix this by hackily inserting the '+++' syntax for its one specific case
that we need it during man page generation.

Not ideal but it works. And --help still has some '*foo*' markup, but we
live with that for now.

Fixes #1581
2020-05-13 08:13:05 -04:00
Andrew Gallant
1e9a481a66 doc: more release checklist updates 2020-05-09 11:43:37 -04:00
Andrew Gallant
bacfca174e pkg: update brew tap to version 12.1.0
This also removes Fish shell completions. See #1577 for more details.
2020-05-09 11:42:39 -04:00
Andrew Gallant
6162b000a3 changelog: 12.1.0 2020-05-09 11:36:44 -04:00
Andrew Gallant
2658bd4e46 12.1.0 2020-05-09 11:13:33 -04:00
Andrew Gallant
4b8e1f030e doc: add more detail to release checklist
Getting the crate order right is important, so document it.
2020-05-09 11:12:51 -04:00
Andrew Gallant
72807462e8 deps: update minimal versions for dependencies 2020-05-09 10:39:43 -04:00
Andrew Gallant
08dee094dd grep-0.2.6 2020-05-09 10:37:29 -04:00
Andrew Gallant
caa53b7b09 grep: update minimal dependency versions 2020-05-09 10:37:08 -04:00
Andrew Gallant
c5d6141562 grep-printer-0.1.5 2020-05-09 10:33:02 -04:00
Andrew Gallant
c0f0492b98 grep-regex-0.1.8 2020-05-09 10:31:29 -04:00
Andrew Gallant
568018386b ignore-0.4.15 2020-05-09 10:27:19 -04:00
Andrew Gallant
6219d29c24 doc: add 'cargo outdated' step to release checklist
It's just good sense to make sure everything is updated if possible.
2020-05-09 10:26:00 -04:00
Andrew Gallant
b458cf39f2 deps: update to base64 0.12
No code changes were necessary.
2020-05-09 10:25:37 -04:00
Andrew Gallant
3fd2694fbc deps: update all dependencies
Everything looks pretty minor.
2020-05-09 09:05:51 -04:00
Andrew Gallant
b56315ea84 changelog: add #1550 to CHANGELOG 2020-05-08 23:37:17 -04:00
Andrew Gallant
fac47906e6 doc: add a release checklist
The steps are numerous, subtle and complex enough that it's worth
writing them down. In particular, getting the order correct is
important. (i.e., If we released to crates.io first and the GitHub
release infrastructure failed, then we'd be in a pickle.)
2020-05-08 23:24:40 -04:00
Andrew Gallant
e02bb6b99a changelog: add downstream notices 2020-05-08 23:24:40 -04:00
Chayoung You
16a1221fc7 doc: use asciidoctor instead of a2x
AsciiDoc development is continued under asciidoctor. See
https://github.com/asciidoc/asciidoc.

We do however fallback to a2x if asciidoctor is not present. This is to
ease migration, but at some point, it's likely that support for a2x will
be dropped.

Originally reported downstream:
https://github.com/Homebrew/linuxbrew-core/issues/19885

Closes #1544
2020-05-08 23:24:40 -04:00
Casey Rodarmor
793c1179cc ignore: allow filtering with predicate
Adds `WalkBuilder::filter_entry` that takes a predicate to be applied to
all entries. If the predicate returns `false` on a given entry, that
entry and all children will be skipped.
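
The pruning behavior described above can be modeled without the crate. This is a minimal sketch over a toy in-memory tree, not the `ignore` crate's implementation: `Node`, `walk`, `sample_tree` and `visited_names` are hypothetical names, and the sketch assumes the predicate is also applied to the top-level entry.

```rust
/// A toy in-memory directory entry (hypothetical; not the crate's `DirEntry`).
struct Node {
    name: &'static str,
    children: Vec<Node>,
}

/// Visit `node` and its descendants in order, skipping any entry (and its
/// whole subtree) for which `keep` returns `false`.
fn walk<P: Fn(&Node) -> bool>(node: &Node, keep: &P, out: &mut Vec<&'static str>) {
    if !keep(node) {
        return;
    }
    out.push(node.name);
    for child in &node.children {
        walk(child, keep, out);
    }
}

fn leaf(name: &'static str) -> Node {
    Node { name, children: vec![] }
}

/// repo/{src/main.rs, target/debug/rg}
fn sample_tree() -> Node {
    Node {
        name: "repo",
        children: vec![
            Node { name: "src", children: vec![leaf("main.rs")] },
            Node {
                name: "target",
                children: vec![Node { name: "debug", children: vec![leaf("rg")] }],
            },
        ],
    }
}

/// Names visited when the predicate rejects anything called "target".
fn visited_names() -> Vec<&'static str> {
    let mut out = Vec::new();
    walk(&sample_tree(), &|n: &Node| n.name != "target", &mut out);
    out
}

fn main() {
    println!("{:?}", visited_names());
}
```

Because the predicate is checked before descending, nothing under `target` is ever visited, which is the point of filtering at the entry level rather than filtering results afterwards.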

Fixes #1555, Closes #1557
2020-05-08 23:24:40 -04:00
Wieland Hoffmann
df7a3bfc7f grep-cli: support files compressed by compress(1)
While Linux distributions (at least Arch Linux, RHEL, Debian) do not support
compressing files with compress(1), macOS & AIX do (the utility is part of
POSIX). Additionally, gzip is able to uncompress such compressed files and
provides an `uncompress` binary.

Closes #1547
2020-05-08 23:24:40 -04:00
Andrew Gallant
28f2a93cae doc: shorten -h/--help prelude
It has grown quite long. It would be nice if we could shorten this only
when -h is used and keep it long for --help, but it seems clap doesn't
let this happen. (It does have `about` and `long_about` options, but
they don't work, even when I disable the use of the template.)

The longer prelude is now only available in the man page.

This addresses #189.
2020-05-08 23:24:40 -04:00
Andrew Gallant
0eb2501b6e doc: add a section about --pre to the GUIDE
Fixes #1252
2020-05-08 23:24:40 -04:00
Andrew Gallant
184c15882e doc: add -U/--multiline to common options 2020-05-08 23:24:40 -04:00
Andrew Gallant
64a4dee495 cli: improve invalid UTF-8 pattern error message
When a pattern with invalid UTF-8 is given, the error message suggests
unqualified use of hex escape sequences to match arbitrary bytes. But
you *also* need to disable Unicode mode. So include that in the error
message.

Fixes #1339
2020-05-08 23:24:40 -04:00
Andrew Gallant
50840ea43b doc: note how to escape a '$' in --replace
Fixes #1524
2020-05-08 23:24:40 -04:00
Andrew Gallant
17dcc2bf51 doc: clarify that *files* override gitignores
This attempts to fix some mild confusion that came up as part of #1574.
Specifically:
https://github.com/BurntSushi/ripgrep/issues/1574#issuecomment-625780436
2020-05-08 23:24:40 -04:00
Andrew Gallant
9a858e4909 doc: add config file note for --type-{add,clear}
This clarifies that persistence is possible via a configuration file.

Fixes #1571
2020-05-08 23:24:40 -04:00
Andrew Gallant
cbfbe9312f snap: remove snapcraft configuration
This hasn't been updated in ages and it's not clear what purpose it's
serving.
2020-05-08 23:24:40 -04:00
Andrew Gallant
7ed9a31819 printer: fix --count-matches output
In order to implement --count-matches, we simply re-execute the regex on
the spans reported by the searcher. The spans always correspond to the
lines that participated in the match. This is the correct thing to do,
except when the regex contains look-ahead (or look-behind).

In particular, the look-around permits the regex's match success to
depend on an arbitrary point before or after the lines actually
reported as participating in the match. Since only the matched lines are
reported to the printer, it is possible for subsequent searching on
those lines to fail.

A true fix for this would somehow make the total span available to the
printer. But that seems tricky since it isn't always available. For
PCRE2's case in multiline mode, it is available because we force it to
be so for correctness.

For now, we simply detect this corner case heuristically. If the match
count is zero, then it necessarily means there is some kind of
look-around that isn't matching. So we set the match count to 1. This is
probably incorrect in some cases, although my brain can't quite come up
with a concrete example. Nevertheless, this is strictly better than the
status quo.
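
The fallback described above amounts to very little code. This is a minimal sketch, not the printer's actual implementation: `count_matches` is a hypothetical name, and the `recount` closure stands in for re-running the regex on the reported lines.

```rust
/// Re-count matches within the lines the searcher reported. The searcher has
/// already established that these lines matched, so a recount of zero can
/// only mean the re-executed pattern could not see its look-around context.
/// In that case, report one match instead of zero.
fn count_matches(reported_lines: &str, recount: impl Fn(&str) -> u64) -> u64 {
    let n = recount(reported_lines);
    if n == 0 {
        1
    } else {
        n
    }
}

fn main() {
    // A plain substring count stands in for the regex here.
    let n = count_matches("foo bar foo", |s| s.matches("foo").count() as u64);
    println!("{}", n);
}
```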

Fixes #1573
2020-05-08 23:24:40 -04:00
Andrew Gallant
a2e6aec7a4 tests: add new regression test for fixed inner literal bug
This adds a new test case for a bug (#1537) that has already been fixed.
Or more precisely, a new bug with the same root cause.

Closes #1559
2020-04-23 08:37:04 -04:00
Andrew Gallant
73103df6d9 deps: small dependency updates 2020-04-18 11:33:27 -04:00
Andrew Gallant
139f186e57 crates/ignore: switch to depth first traversal
This replaces the use of channels in the parallel directory traversal
with a simple stack. The primary motivation for this change is to reduce
peak memory usage. In particular, when using a channel (which is a
queue), we wind up visiting files in a breadth first fashion. Using a
stack switches us to a depth first traversal. While there are no real
intrinsic differences, depth first traversal generally tends to use less
memory because directory trees are more commonly wide than they are
deep.

In particular, the queue/stack size itself is not the only concern. In
one recent case documented in #1550, a user wanted to search all Rust
crates. The directory structure was shallow but extremely wide, with a
single directory containing all crates. This in turn results in
descending into each of those directories and building a gitignore
matcher for each (since most crates have `.gitignore` files) before ever
searching a single file. This means that ripgrep has all such matchers
in memory simultaneously, which winds up using quite a bit of memory.

In a depth first traversal, peak memory usage is much lower because
gitignore matchers are built and discarded more quickly. In the case of
searching all crates, the peak memory usage decrease is dramatic. On my
system, it shrinks by an order of magnitude, from almost 1GB to 50MB. The
decline in peak memory usage is consistent across other use cases as
well, but is typically more modest. For example, searching the Linux
repo has a 50% decrease in peak memory usage and searching the Chromium
repo has a 25% decrease in peak memory usage.

Search times generally remain unchanged, although some ad hoc benchmarks
that I typically run have gotten a bit slower. As far as I can tell,
this appears to be the result of scheduling changes. Namely, the depth first
traversal seems to result in searching some very large files towards the
end of the search, which reduces the effectiveness of parallelism and
makes the overall search take longer. This seems to suggest that a stack
isn't optimal. It would instead perhaps be better to prioritize
searching larger files first, but it's not quite clear how to do this
without introducing more overhead (getting the file size for each file
requires a stat call).
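
The queue-versus-stack effect can be seen in a toy model. This is a minimal sketch, not the crate's traversal code: it walks a hypothetical two-level tree (one root, `dirs` directories, `files` files in each) and counts only pending entries, so it says nothing about the size of the gitignore matchers themselves.

```rust
use std::collections::VecDeque;

#[derive(Clone, Copy)]
enum Entry {
    Root,
    Dir,
    File,
}

/// Walk the toy tree and return the peak number of pending entries.
/// Popping from the back makes the container a stack (depth first);
/// popping from the front makes it a queue (breadth first).
fn peak_pending(dirs: usize, files: usize, depth_first: bool) -> usize {
    let mut pending = VecDeque::new();
    pending.push_back(Entry::Root);
    let mut peak = 1;
    loop {
        let next = if depth_first {
            pending.pop_back()
        } else {
            pending.pop_front()
        };
        let entry = match next {
            Some(e) => e,
            None => break,
        };
        // Directories expand into their children; files expand into nothing.
        let (child, n) = match entry {
            Entry::Root => (Entry::Dir, dirs),
            Entry::Dir => (Entry::File, files),
            Entry::File => continue,
        };
        for _ in 0..n {
            pending.push_back(child);
        }
        peak = peak.max(pending.len());
    }
    peak
}

fn main() {
    println!("breadth first: {}", peak_pending(100, 10, false));
    println!("depth first:   {}", peak_pending(100, 10, true));
}
```

With 100 directories of 10 files each, the queue peaks at 1000 pending entries because every directory is expanded before any file is consumed, while the stack peaks at 109.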

Fixes #1550
2020-04-18 11:33:03 -04:00
Andrew Gallant
afb325f733 readme: fix ordering of benchmarks
Results remain the same. I just didn't order them correctly.
2020-04-16 12:03:46 -04:00
Andrew Gallant
40af352d74 github: add necessary metadata 2020-04-14 16:28:09 -04:00
Andrew Gallant
3f1d4b397d github: switch to new issue template format
And also point folks toward Discussions.
2020-04-14 16:23:47 -04:00
Andrew Gallant
a75b4d122a doc: fix newline escape
Fixes #1551
2020-04-13 08:49:27 -04:00
Simon Robin
f51b762c6d pkg: fix brew tap version
It wasn't updated after the 12.0.1 release, even though the
SHA values were.

PR #1545
2020-04-07 19:45:53 -04:00
Andrew Gallant
49de7b119c ci: disable man page check
It appears to be intermittently failing. Specifically, a2x seems to be
failing occasionally with no apparent reason why. The error message it
gives is inscrutable. Sigh.
2020-04-01 21:18:04 -04:00
Andrew Gallant
1c4b5adb7b regex: fix another inner literal bug
It looks like `is_simple` wasn't quite correct.

I can't wait until this code is rewritten. It is still not quite clearly
correct to me.

Fixes #1537
2020-04-01 20:37:48 -04:00
Marius Schulz
3d6a58faff doc: fix typo in help description
PR #1536
2020-03-30 17:31:16 -04:00
Andrew Gallant
5b6ca04e39 ci: upgrade to actions/checkout@v2
In particular, this appears to fix an extremely annoying bug that was
causing PR builds to fail if they were re-run.

For more details:
https://github.com/actions/checkout/issues/23#issuecomment-572688577
2020-03-30 17:09:41 -04:00
Andrew Gallant
47f20c2661 pkg: update brew tap to 12.0.1 2020-03-29 19:18:57 -04:00
Andrew Gallant
1d5b1011e5 12.0.1 2020-03-29 18:59:40 -04:00
Andrew Gallant
1bb30b72fc changelog: prepare for 12.0.1 release, redux 2020-03-29 18:50:31 -04:00
Andrew Gallant
09a4b75baf ignore-0.4.14 2020-03-29 18:49:01 -04:00
Andrew Gallant
58c428827d changelog: prepare for 12.0.1 release 2020-03-29 18:47:46 -04:00
Andrew Gallant
b9bb04b793 deps: minor dependency updates 2020-03-29 18:47:15 -04:00
Zoltan Puskas
4dfea016b9 ignore/types: add ebuild type
Add support for Gentoo's portage package manager spec files:
https://wiki.gentoo.org/wiki/Portage
2020-03-29 18:44:04 -04:00
Andrew Gallant
3193d57ac1 ci: attempt to fix CI
It looks like a2x isn't working, so take a shot at fixing it.
2020-03-28 21:36:29 -04:00
Andrew Gallant
67c0f576b6 ignore-0.4.13 2020-03-22 21:08:37 -04:00
Andrew Gallant
543f99dbf1 grep-regex-0.1.7 2020-03-22 21:08:19 -04:00
Andrew Gallant
0ea65efd6d regex: special case literal extraction
In a prior commit, we fixed a performance problem with the -w flag by
doing a little extra work to extract literals. It turns out that using
literals in this case when the -w flag is NOT used results in a
performance regression. The reasoning is that we end up using a "fast"
regex as a prefilter when the regex engine itself uses its own
equivalent prefilter, so ripgrep ends up redoing a fair amount of work.

Instead, we only do this extra work when we know the -w flag is enabled.
2020-03-22 21:02:51 -04:00
Paul A. Patience
20deae6497 tests: fix typo in test name
PR #1528
2020-03-22 07:43:16 -04:00
Andrew Gallant
655e33219a crates.io: remove badges
... and don't replace them with anything because crates.io does not
support GitHub Actions yet. But it's almost there:
https://github.com/rust-lang/crates.io/pull/1838

Thanks @atouchet for noticing this.
2020-03-17 17:50:37 -04:00
Andrew Gallant
8ba6ccd159 ignore: fix failing test
This fixes fallout from fixing #1520.
2020-03-16 19:16:24 -04:00
Andrew Gallant
34edb8123a ignore: squash noisy error message
We should not assume that the commondir file actually exists. If it
doesn't, then just move on. This otherwise emits an error message when
searching normal submodules, which is not OK.

This regression was introduced in #1446.

Fixes #1520
2020-03-16 18:50:02 -04:00
Andrew Gallant
5b30c2aed6 ci: fix deb build script 2020-03-15 22:11:32 -04:00
Andrew Gallant
bf1027a83e pkg: update brew tap to 12.0.0 2020-03-15 22:10:08 -04:00
Andrew Gallant
031264e5fb ci: tweak release name
This is consistent with prior releases.
2020-03-15 22:07:22 -04:00
Andrew Gallant
b9cd95faf1 release: 12.0.0, take 2 2020-03-15 21:54:11 -04:00
Andrew Gallant
92daa34eb3 ripgrep: release 12.0.0 2020-03-15 21:42:54 -04:00
Andrew Gallant
a8c1fb7c88 changelog: prepare for 12.0.0 release 2020-03-15 21:06:45 -04:00
Andrew Gallant
52ec68799c ci: make script names consistent 2020-03-15 21:06:45 -04:00
Andrew Gallant
c0d78240df ci: remove Travis and appveyor specific stuff 2020-03-15 21:06:45 -04:00
Andrew Gallant
cda9acb876 ci: rebuild release infrastructure on GitHub Actions 2020-03-15 21:06:45 -04:00
Andrew Gallant
1ece50694e readme: update file size 2020-03-15 13:27:31 -04:00
Andrew Gallant
f3a966bcbc readme: add 'Unicode' label to ugrep 2020-03-15 13:26:02 -04:00
Andrew Gallant
a38913b63a readme: update benchmarks
This also updates the corpora used, so previous times (and counts) are
not comparable.

We also remove some tools, like pt, sift and ucg, since they appear to
be no longer maintained. ag isn't really maintained either, but it still
has significant mind share, so we retain a benchmark for it.

We also upgrade ack to version 3, and remove the clarification on how
`-w` is implemented.

We also add `git grep -P` (uses PCRE2) which appears to be much faster
than `git grep -E`.

Finally, we add ugrep which is a new up and comer in this space.

Fixes #1474
2020-03-15 13:21:18 -04:00
Andrew Gallant
e772a95b58 regex: avoid using literal optimizations when whitespace is detected
If a literal is entirely whitespace, then it's quite likely that it is
very common. So when that case occurs, just don't do (inner) literal
optimizations at all.

The regex engine may still make sub-optimal decisions here, but that's a
problem for another day.

Fixes #1087
2020-03-15 13:19:14 -04:00
Andrew Gallant
9dd4bf8d7f style: fix rust-analyzer lint warnings 2020-03-15 13:19:14 -04:00
Andrew Gallant
c4c43c733e cli: add --no-ignore-files flag
The purpose of this flag is to force ripgrep to ignore all --ignore-file
flags (whether they come before or after --no-ignore-files).

This flag can be overridden with --ignore-files.

Fixes #1466
2020-03-15 13:19:14 -04:00
Andrew Gallant
447506ebe0 doc: clarify globbing behavior
Fixes #1442, Fixes #1478
2020-03-15 13:19:14 -04:00
Andrew Gallant
12e4180985 doc: remove CPU features from man pages
It doesn't really belong in the man page since it's an artifact of a
build/runtime configuration. Moreover, it inhibits reproducible builds.

Fixes #1441
2020-03-15 13:19:14 -04:00
Andrew Gallant
daa8319398 doc: note ripgrep's stdin behavior
Fixes #1439
2020-03-15 13:19:14 -04:00
pierrenn
3a6a24a52a cli: add engine flag
This permits switching between the different regex engine modes that
ripgrep supports. The purpose of this flag is to make it easier to
extend ripgrep with additional regex engines.

Closes #1488, Closes #1502
2020-03-15 09:30:58 -04:00
pierrenn
aab3d80374 args: refactor to permit adding other engines
This is in preparation for adding a new --engine flag which is intended
to eventually supplant --auto-hybrid-regex.

While there are no immediate plans to add more regex engines to ripgrep,
this is intended to make it easier to maintain a patch to ripgrep with
an additional regex engine. See #1488 for more details.
2020-03-15 09:24:28 -04:00
Andrew Gallant
1856cda77b style: fix rust-analyzer lints in core 2020-03-15 09:04:54 -04:00
Andrew Gallant
7340d8dbbe deps: update everything
This adds one new dependency, maybe-uninit, which is brought in by
crossbeam-channel[1]. This is to apparently fix some unsound code
without bumping the MSRV. Since ripgrep uses the latest stable release
of Rust, the maybe-uninit crate should compile down to nothing and just
re-export std's `MaybeUninit` type.

[1] - https://github.com/crossbeam-rs/crossbeam/pull/458
2020-03-15 08:32:33 -04:00
chip
50d2047ae2 crates: update URLs in Cargo.toml
This corrects an oversight when the repo was re-organized to
have its crates moved into a 'crates' sub-directory.

PR #1505
2020-02-28 20:31:43 -05:00
Wolf Honore
227436624f ignore/types: add coq type
PR #1504
2020-02-28 19:11:29 -05:00
pierreN
5bfdd3a652 ci: fix ci by removing fetch-depth 1
It's not clear why removing this makes things work. I've submitted
PRs that passed CI with fetch-depth=1. Maybe it only fails when
PRs are submitted from external contributors?

Either way, for now, we remove this and absorb the extra cost in
order to get PRs passing CI again.

PR #1501
2020-02-27 08:53:06 -05:00
Andrew Gallant
ecec6147d1 doc: be more vague in the FAQ
The existing vagueness was not enough to prevent people from lawyering
me over it.
2020-02-22 09:13:31 -05:00
Lucien Greathouse
db7a8cdcb5 globset: Implement serde::{Serialize, Deserialize} for Glob
PR #1492
2020-02-21 07:40:47 -05:00
Andrew Gallant
eef7a7e7ff readme: update CI badge 2020-02-20 18:15:15 -05:00
Andrew Gallant
4176050cdd ignore: another simplification
Again, thanks to @zsugabubus!
2020-02-20 17:26:34 -05:00
Andrew Gallant
109460fce2 ignore: simplify parallel worker initialization
We can just ask the channel whether any work has been loaded. Normally
querying a channel for its length is a strong predictor of bugs, but in
this case, we do it before we ever attempt a `recv`, so it should work.

Kudos to @zsugabubus for suggesting this!
2020-02-20 16:50:41 -05:00
Andrew Gallant
da3431b478 ci: switch build to GitHub Actions 2020-02-20 16:07:51 -05:00
Andrew Gallant
f314b0d55f ignore: fix parallel traversal
It turns out that the previous version wasn't quite correct. Namely, it
was possible for the following sequence to occur:

1. Consider that all workers, except for one, are `waiting`.
2. The last remaining worker finds one more job to do and sends it on
   the channel.
3. One of the previously `waiting` workers wakes up from the job that
   the last running worker sent, but `self.resume()` has not been
   called yet.
4. The last worker, from (2), calls `get_work` and sees that the
   channel has nothing on it, so it executes `self.waiting() ==
   1`. Since the worker in (3) hasn't called `self.resume()` yet,
   `self.waiting() == 1` evaluates to true.
5. This sets off a chain reaction that stops all workers, despite the
   fact that (3) got more work (which could itself spawn more work).

The end result is that the traversal may terminate while there are still
outstanding work items to process. This problem was observed through
spurious failures in CI. I was not actually able to reproduce the bug
locally.

We fix this by changing our strategy to detect termination using a
counter. Namely, we increment the counter just before sending new work
and decrement the counter just after finishing work. In this way, we
guarantee that the counter only ever reaches 0 once there is no more
work to process.
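
The counter strategy can be sketched with the standard library alone. This is an illustration under stated assumptions, not the crate's code: a mutex-guarded stack stands in for the channel, idle workers simply yield instead of blocking, and `run`, `pending` and `processed` are hypothetical names. Each work item of depth `d > 0` spawns two items of depth `d - 1`.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;

/// Process a tree of work items with `threads` workers and return how many
/// items were processed.
fn run(threads: usize, root_depth: u32) -> usize {
    let stack = Arc::new(Mutex::new(Vec::new()));
    let pending = Arc::new(AtomicUsize::new(0));
    let processed = Arc::new(AtomicUsize::new(0));

    // Increment the counter just before sending the first item.
    pending.fetch_add(1, Ordering::SeqCst);
    stack.lock().unwrap().push(root_depth);

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let (stack, pending, processed) =
                (stack.clone(), pending.clone(), processed.clone());
            thread::spawn(move || loop {
                // The lock guard is dropped at the end of this statement.
                let item = stack.lock().unwrap().pop();
                match item {
                    Some(depth) => {
                        if depth > 0 {
                            for _ in 0..2 {
                                // Increment just before sending new work.
                                pending.fetch_add(1, Ordering::SeqCst);
                                stack.lock().unwrap().push(depth - 1);
                            }
                        }
                        processed.fetch_add(1, Ordering::SeqCst);
                        // Decrement just after finishing this item.
                        pending.fetch_sub(1, Ordering::SeqCst);
                    }
                    None => {
                        // No queued work. Stop only if nothing is in flight.
                        if pending.load(Ordering::SeqCst) == 0 {
                            break;
                        }
                        thread::yield_now();
                    }
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    processed.load(Ordering::SeqCst)
}

fn main() {
    println!("processed {} items", run(4, 10));
}
```

Since an item's children are counted before the item itself is marked finished, the counter cannot reach zero while any item is queued or still being processed.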

See #1337 for more discussion. Many thanks to @zsugabubus for helping me
work through this.
2020-02-20 16:07:51 -05:00
Andrew Gallant
fab5c812f3 tests: add debugging output
The transient failures appear to be persisting and they are quite
difficult to debug. So include a full directory listing in the output of
every test failure.
2020-02-20 16:07:51 -05:00
Andrew Gallant
c824d095a7 tests: use std::env::consts::EXE_SUFFIX
This avoids a conditional compilation knob and is likely more portable.
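
For reference, a minimal sketch of the idea; `binary_name` is a hypothetical helper, not the name used in the test suite. `std::env::consts::EXE_SUFFIX` is ".exe" on Windows and empty on most other targets.

```rust
use std::env::consts::EXE_SUFFIX;

/// Build the file name of an executable for the current target platform
/// without any `#[cfg(windows)]` branching.
fn binary_name(stem: &str) -> String {
    format!("{}{}", stem, EXE_SUFFIX)
}

fn main() {
    println!("{}", binary_name("rg"));
}
```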
2020-02-20 16:07:51 -05:00
Andrew Gallant
ee21897ebd tests: make 'cross test' work
The reason why it wasn't working was the integration tests. Namely, the
integration tests attempted to execute the 'rg' binary directly from
inside cross's docker container. But this obviously doesn't work when
'rg' was compiled for a totally different architecture.

Cross normally does this by hooking into the Rust test infrastructure
and causing tests to run with 'qemu'. But our integration tests didn't
do that. This commit fixes our test setup to check for cross's
environment variable that points to the 'qemu' binary. Once we have
that, we just use 'qemu-foo rg' instead of 'rg'. Piece of cake.
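
A minimal sketch of the approach, assuming the runner is handed in as a plain string. `test_command` is a hypothetical helper, and the variable name `CROSS_RUNNER` read in `main` is an assumption for illustration; the commit does not name the variable.

```rust
use std::process::Command;

/// Build the command used to launch a test binary. If an emulator command
/// is given (for example "qemu-aarch64"), the binary becomes its first
/// argument. Otherwise the binary is executed directly.
fn test_command(runner: Option<&str>, bin: &str) -> Command {
    match runner {
        Some(r) if !r.is_empty() => {
            let mut cmd = Command::new(r);
            cmd.arg(bin);
            cmd
        }
        _ => Command::new(bin),
    }
}

fn main() {
    // Assumed variable name; adjust to whatever the test environment sets.
    let runner = std::env::var("CROSS_RUNNER").ok();
    let cmd = test_command(runner.as_deref(), "target/debug/rg");
    println!("{:?}", cmd);
}
```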
2020-02-20 16:07:51 -05:00
Andrew Gallant
0373f6ddb0 ci: soft-disable Travis and AppVeyor 2020-02-20 16:07:51 -05:00
asymmetric
b44554c803 ignore/types: add K type
Adds support for files used by the K executable semantic framework:
http://www.kframework.org/index.php/Main_Page

PR #1493
2020-02-19 07:07:09 -05:00
Andrew Gallant
0874aa115c repo: make ripgrep build with the new organization 2020-02-17 19:24:53 -05:00
Andrew Gallant
fdd8510fdd repo: move all source code in crates directory
The top-level listing was just getting a bit too long for my taste. So
put all of the code in one directory and shrink the large top-level mess
to a small top-level mess.

NOTE: This commit only contains renames. The subsequent commit will
actually make ripgrep build again. We do it this way with the naive hope
that this will make it easier for git history to track the renames.
Sigh.
2020-02-17 19:24:53 -05:00
Andrew Gallant
0bc4f0447b style: rustfmt everything
This is why I was so intent on clearing the PR queue. This will
effectively invalidate all existing patches, so I wanted to start from a
clean slate.

We do make one little tweak: we put the default type definitions in
their own file and tell rustfmt to keep its grubby mitts off of it. We
also sort it lexicographically and hopefully will enforce that from here
on.
2020-02-17 19:24:53 -05:00
Andrew Gallant
c95f29e3ba ci: check rustfmt in Travis 2020-02-17 19:24:53 -05:00
Andrew Gallant
3644208b03 ci: set MSRV to Rust 1.41.0
The next release will be ripgrep 12, so we bump to the latest stable
release of Rust.
2020-02-17 19:24:53 -05:00
Andrew Gallant
66f045e055 changelog: add commit links
... now that we have stable identifiers.
2020-02-17 17:34:19 -05:00
zsugabubus
3d59bd98aa ignore: rework inter-thread messaging
Change the meaning of `Quit` message. Now it means terminate. The final
"dance" is unnecessary, because by the time quitting begins, no thread
will ever spawn a new `Work`. The trick was to replace the heuristic
spin-loop with blocking receive.

Closes #1337
2020-02-17 17:16:28 -05:00
Andrew Gallant
52d7f47420 ignore: treat symbolic links to directories as directories
Due to how walkdir works if symlinks are not followed, symlinks to
directories are seen as simple files by ripgrep. This caused a panic
in some cases due to receiving a WalkEvent::Exit event without a
corresponding WalkEvent::Dir event.

This is fixed by looking at the metadata of the file in the case of a
symlink to determine if it's a directory. We are careful to only do
this stat check when the depth of the entry is 0, as this bug only
impacts us when 1) we aren't following symlinks generally and 2) the
user provides a symlinked directory that we do follow as a top-level
path to search.
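
The depth check can be sketched with `std::fs` alone. This is an illustration, not the crate's code: `treat_as_dir` is a hypothetical name. `symlink_metadata` inspects the link itself, while `metadata` follows it, so the extra stat is only paid for top-level symlinks.

```rust
use std::fs;
use std::path::Path;

/// Decide whether an entry should be walked as a directory when symlinks
/// are not being followed in general.
fn treat_as_dir(path: &Path, depth: usize) -> bool {
    // Look at the entry itself without following symlinks.
    let link_meta = match fs::symlink_metadata(path) {
        Ok(meta) => meta,
        Err(_) => return false,
    };
    if link_meta.file_type().is_dir() {
        return true;
    }
    // Only a top-level path given by the user gets the extra stat that
    // follows the link to see whether it points at a directory.
    if depth == 0 && link_meta.file_type().is_symlink() {
        return fs::metadata(path).map(|m| m.is_dir()).unwrap_or(false);
    }
    false
}

fn main() {
    println!("{}", treat_as_dir(Path::new("."), 0));
}
```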

Fixes #1389, Closes #1397
2020-02-17 17:16:28 -05:00
Andrew Gallant
75cbe88fa2 cli: add --no-unicode, deprecate --no-pcre2-unicode
This adds a universal --no-unicode flag that is intended to work for all
supported regex engines. There is no point in retaining
--no-pcre2-unicode, so we make them aliases to the new flags and
deprecate them.
2020-02-17 17:16:28 -05:00
Andrew Gallant
711426a632 cli: add --no-require-git flag
This flag prevents ripgrep from requiring one to search a git repository
in order to respect git-related ignore rules (global, .gitignore and
local excludes). This actually corresponds to behavior ripgrep had long
ago, but #934 changed that. It turns out that users were relying on this
buggy behavior. In most cases, fixing it is as simple as converting one's
rules to .ignore or .rgignore files. Unfortunately, there are other use
cases---like Perforce automatically respecting .gitignore files---that
make a strong case for ripgrep to at least support this.

The UX of a flag like this is absolutely atrocious. It's so obscure that
it's really not worth explicitly calling it out anywhere. Moreover, the
error cases that occur when this flag isn't used (but its behavior is
desirable) will not be intuitive, do not seem easily detectable and will
not guide users to this flag. Nevertheless, the motivation for this is
just barely strong enough for me to begrudgingly accept this.

Fixes #1414, Closes #1416
2020-02-17 17:16:28 -05:00
Andrew Gallant
01eeec56bb deb: fix fish completion install location
It looks like `completions` is owned by Fish itself. Third party
completions should go in `vendor_completions.d`.

Fixes #1485
2020-02-17 17:16:28 -05:00
Jakub Wieczorek
322fc75a3d ignore: make walker visit untraversable directories
This commit fixes an inconsistency between the serial and the parallel
directory walkers around visiting a directory for which the user holds
insufficient permissions to descend into.

The serial walker does produce a successful entry for a directory that
it cannot descend into due to insufficient permissions. However, before
this change that has not been the case for the parallel walker, which
would produce an `Err` item not only when descending into a directory
that it cannot read from but also for the directory entry itself.

This change brings the behaviour of the parallel variant in line with
that of the serial one.

Fixes #1346, Closes #1365
2020-02-17 17:16:28 -05:00
Jakub Wieczorek
b435eaafc8 grep-regex: fix inner literal extraction bug
This appears to be another transcription bug from copying this code from
the prefix literal detection from inside the regex crate. Namely, when
it comes to inner literals, we only want to treat counted repetition as
two separate cases: the case when the minimum match is 0 and the case
when the minimum match is more than 0. In the former case, we treat
`e{0,n}` as `e*` and in the latter we treat `e{m,n}` where `m >= 1` as
just `e`.

We could definitely do better here. For example, this means regexes like
`(foo){10}` will only have `foo` extracted as a literal, where searching
for the full literal would likely be faster.

The actual bug here was that we were not implementing this logic
correctly. Namely, we weren't always "cutting" the literals in the
second case to prevent them from being expanded.

Fixes #1319, Closes #1367
2020-02-17 17:16:28 -05:00
Ed Page
f8e70294d5 ignore: allow post-processing at end-of-thread
On top of the parallel-walk's closures, this provides a Visitor API.
This clarifies the role of the two different closures in the `run`
API and allows implementing `Drop` for post-processing once traversal
is finished.

The closure API is maintained not just for compatibility but also
for convenience in simple cases.

Fixes #469, Closes #1430
2020-02-17 17:16:28 -05:00
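
As a rough illustration of the end-of-thread post-processing idea in f8e70294d5, the sketch below uses only the standard library. `CountingVisitor` and `count_in_parallel` are hypothetical names and are not the `ignore` crate's API; the only point is that a per-thread value can flush its results from `Drop` once its thread is done.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical per-thread visitor: counts entries locally, without locking.
struct CountingVisitor {
    local: usize,
    total: Arc<Mutex<usize>>,
}

impl CountingVisitor {
    fn visit(&mut self, _entry: &str) {
        self.local += 1;
    }
}

impl Drop for CountingVisitor {
    /// Runs once when the owning thread finishes: merge the local count.
    fn drop(&mut self) {
        *self.total.lock().unwrap() += self.local;
    }
}

/// Hypothetical driver: one thread and one visitor per chunk of entries.
fn count_in_parallel(chunks: Vec<Vec<String>>) -> usize {
    let total = Arc::new(Mutex::new(0usize));
    let handles: Vec<_> = chunks
        .into_iter()
        .map(|chunk| {
            let mut visitor = CountingVisitor { local: 0, total: Arc::clone(&total) };
            thread::spawn(move || {
                for entry in &chunk {
                    visitor.visit(entry);
                }
                // `visitor` is dropped here, at the end of the thread.
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let merged = *total.lock().unwrap();
    merged
}
```

Each thread touches the shared mutex only once, in `Drop`, which is the kind of post-processing a plain per-entry closure cannot express directly.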
Ed Page
578e2d47a8 core: simplify parallel walking using borrows
This changes ripgrep to use ignore's new support for borrowing data when
walking in parallel.
2020-02-17 17:16:28 -05:00
Ed Page
9f7c2ebc09 ignore: allow parallel walker to borrow data
This makes it so the caller can more easily refactor from
single-threaded to multi-threaded walking. If they want to support both,
this makes it easier to do so with a single initialization code-path. In
particular, it side-steps the need to put everything into an `Arc`.

This is not a breaking change because it strictly increases the number
of allowed inputs to `WalkParallel::run`.

Closes #1410, Closes #1432
2020-02-17 17:16:28 -05:00
Andrew Gallant
5c1eac41a3 changelog: highlight a bad performance regression 2020-02-17 17:16:28 -05:00
Johannes Altmanninger
6f2b79f584 ignore: use git commondir for sourcing .git/info/exclude
Git looks for this file in GIT_COMMON_DIR, which is usually the same
as GIT_DIR (.git). However, when searching inside a linked worktree,
.git is usually a file that contains the path of the actual git dir,
which in turn contains a file "commondir" which references the directory
where info/exclude may reside, alongside other configuration shared across
all worktrees. This directory is usually the git dir of the main worktree.

Unlike git this does *not* read environment variables GIT_DIR and
GIT_COMMON_DIR, because it is not clear how to interpret them when
searching multiple repositories.

Fixes #1445, Closes #1446
2020-02-17 17:16:28 -05:00
Andrew Gallant
0c3b673e4c cli: make ripgrep work in non-existent directories
It turns out that querying the CWD while in a directory that no longer
exists results in an error. Since the CWD is queried every time ripgrep
starts---whether it needs it or not---for dealing with glob matching,
ripgrep winds up being completely useless inside a non-existent
directory.

We fix this in a few different ways:

* Firstly, if std::env::current_dir() fails, then we fall back to trying
  to read the `PWD` environment variable.
* If that fails, then we return a more sensible error message so that a
  user can at least react to the problem. Previously, the error message
  was inscrutable.
* Finally, we try to avoid the problem altogether by building empty glob
  matchers if no globs were provided, thus side-stepping querying the
  CWD completely.

Fixes #1291, Closes #1400
2020-02-17 17:16:28 -05:00
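
The first two fallback steps of 0c3b673e4c can be sketched as follows. This is a minimal sketch, not ripgrep's actual code: `resolve_cwd` and `working_dir` are hypothetical names, and the error text is made up for illustration.

```rust
use std::env;
use std::ffi::OsString;
use std::io;
use std::path::PathBuf;

/// Hypothetical helper: prefer the real CWD, fall back to `PWD`, and
/// otherwise return an error that says what went wrong.
fn resolve_cwd(current: io::Result<PathBuf>, pwd: Option<OsString>) -> io::Result<PathBuf> {
    match current {
        Ok(dir) => Ok(dir),
        Err(err) => match pwd {
            // Only trust PWD if it is actually set to something.
            Some(pwd) if !pwd.is_empty() => Ok(PathBuf::from(pwd)),
            _ => Err(io::Error::new(
                io::ErrorKind::Other,
                format!(
                    "failed to get current working directory: {} \
                     (and the PWD environment variable is not set)",
                    err
                ),
            )),
        },
    }
}

/// Hypothetical entry point that wires in the real environment.
fn working_dir() -> io::Result<PathBuf> {
    resolve_cwd(env::current_dir(), env::var_os("PWD"))
}
```

Splitting the decision from the environment lookups keeps the fallback logic testable without deleting a directory out from under the process.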
Naveen Nathan
297b428c8c cli: add --no-ignore-exclude flag
This commit adds a new --no-ignore-exclude flag that permits disabling
the use of .git/info/exclude filtering. Local exclusions are manual
configurations to a repository and are not shared, so it is sometimes
useful to disable to get a consistent view of a repository.

This also adds a new section to the man page that describes automatic
filtering.

Closes #1420
2020-02-17 17:16:28 -05:00
Manfred Endres
804b43ecd8 globset: implement FromStr for Glob
The `globset::Glob` type's [`new`] function creates a new value from a
`&str` parameter and returns a `Result<Glob, Error>`. This is
exactly what [`std::str::FromStr::from_str`][`std::str::FromStr`] defines.
Libraries like [`clap`] use [`std::str::FromStr`] to create objects from
provided commandline arguments. This change makes this library usable
without a newtype wrapper.

[`std::str::FromStr`]: 	https://doc.rust-lang.org/std/str/trait.FromStr.html
[`clap`]:		https://docs.rs/clap/2.33.0/clap/macro.value_t.html
[`new`]:		https://docs.rs/globset/0.4.4/globset/struct.Glob.html#method.new

Closes #1447
2020-02-17 17:16:28 -05:00
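
The pattern behind 804b43ecd8 is to forward `FromStr::from_str` to an existing fallible constructor. The sketch below shows it on a stand-in type, since it uses only the standard library: `Pattern` and `PatternError` are hypothetical, and the empty-string check is invented purely so the constructor has a failure case (it says nothing about how `globset::Glob::new` validates input).

```rust
use std::str::FromStr;

/// Hypothetical stand-in for a glob type with a fallible constructor.
#[derive(Debug, PartialEq)]
struct Pattern(String);

/// Hypothetical error type for the stand-in.
#[derive(Debug, PartialEq)]
struct PatternError(String);

impl Pattern {
    fn new(s: &str) -> Result<Pattern, PatternError> {
        // Invented validation rule, only here to exercise the error path.
        if s.is_empty() {
            Err(PatternError("empty pattern".to_string()))
        } else {
            Ok(Pattern(s.to_string()))
        }
    }
}

impl FromStr for Pattern {
    type Err = PatternError;

    /// Delegates to `new`, so `str::parse` works without a newtype wrapper.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        Pattern::new(s)
    }
}
```

With the impl in place, callers (including argument parsers built on `FromStr`) can write `"*.rs".parse::<Pattern>()`.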
Lucien Greathouse
2263b8ac92 globset: add GlobMatcher::glob
This exposes the underlying `Glob` used to compile the matcher. This can
be useful for wrapping up the glob matcher in other types.

Closes #1454
2020-02-17 17:16:28 -05:00
Andrew Gallant
cd8ec38a68 grep-regex: add fast path for -w/--word-regexp
Previously, ripgrep would always defer to the regex engine's capturing
matches in order to implement word matching. Namely, ripgrep would
determine the correct match offsets via a capturing group, since the
word regex is itself generated from the user supplied regex.

Unfortunately, the regex engine's capturing mode is still fairly slow,
so this commit adds a fast path to avoid capturing mode in the vast
majority of cases. See comments in the code for details.
2020-02-17 17:16:28 -05:00
Andrew Gallant
6a0e0147e0 grep-regex: improve literal detection with -w
When the -w/--word-regexp was used, ripgrep would in many cases fail to
apply literal optimizations. This occurs specifically when the regex
given by the user is an alternation of literals with no common prefixes
or suffixes, e.g.,

    rg -w 'foo|bar|baz|quux'

In this case, the inner literal detector fails. Normally, this would
result in literal prefixes being detected by the regex engine. But
because of the -w/--word-regexp flag, the actual regex that we run ends
up looking like this:

    (^|\W)(foo|bar|baz|quux)($|\W)

which of course defeats any prefix or suffix literal optimizations in
the regex crate's somewhat naive extractor. (A better extractor could
still do literal optimizations in the above case.)

So this commit fixes this by falling back to prefix or suffix literals
when they're available instead of prematurely giving up and assuming the
regex engine will do the rest.
2020-02-17 17:16:28 -05:00
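
To make the problem in 6a0e0147e0 concrete, the wrapping shown in the commit message can be reproduced with a one-line helper. `word_regex` is a hypothetical name, and the exact wrapper ripgrep generates may differ from the simplified form quoted above.

```rust
/// Hypothetical helper: wrap a user pattern the way the commit message
/// shows for -w/--word-regexp.
fn word_regex(pattern: &str) -> String {
    // The user's alternation ends up inside a group, after `(^|\W)`, so the
    // overall regex no longer begins with any of the user's literals.
    format!(r"(^|\W)({})($|\W)", pattern)
}
```

Because the wrapped regex starts with `(^|\W)`, a prefix extractor that only looks at the front of the regex has nothing literal to work with, which is why the fallback to the original pattern's prefix or suffix literals helps.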
Andrew Gallant
ad97e9c93f grep-regex: improve inner literal detection
This fixes an interesting performance bug where the inner literal
extractor would sometimes choose a sub-optimal literal. For example,
consider the regex:

    \x20+Sherlock Holmes\x20+

(The `\x20` is the ASCII code for a space character, which we use here
to just make it clearer. It otherwise does not matter.)

Previously, this would see the initial \x20 and then stop collecting
literals after the `+` repetition operator. This was because the inner
literal detector was adapted from the prefix literal detector, which had
to stop here. Namely, while \x20S would be a valid prefix (for example),
\x20\x20S would also be a valid prefix. As would \x20\x20\x20S and so
on. So the prefix detector would have to stop at the repetition
operator. Otherwise, only searching for \x20S could potentially scan
farther than the starting position of the next match.

However, for inner literals, this calculus no longer makes sense. We can
freely search for, e.g., \x20S without missing matches that start with
\x20\x20S precisely because we know this is an inner literal which may
not correspond to the start of a match.

With this fix, the literal that is now detected is

    \x20Sherlock Holmes\x20

Which is much better. We achieve this by no longer "cutting" literals
after seeing a `+` repetition operator. Instead, we permit literals to
continue to be extended.

The reason why this is important is because using \x20 as the literal to
search for is generally bad juju since it is so common. In fact, we
should probably add more logic here to either avoid such things or give
up entirely on the inner literal optimization if it detected a literal
that we think is very common. But we punt on such things here.
2020-02-17 17:16:28 -05:00
Robert Irelan
24f8a3e5ec doc: document all file type
This adds it to the guide and the docs for the --type flag.

Fixes #1344, Closes #1472
2020-02-17 17:16:28 -05:00
Mikko Vedru
1bdb767851 doc: improve docs for --sort and --sortr flags
I improved the help documentation in the following manner and for the
following reasons:

1. It's only logical to put the default sub-option on the first possible
line, as well as to separately mention that it is indeed the default
sub-option.

2. Additional options for the flags should describe the main points of
their purpose without requiring the user to read the whole help entry. In
my opinion, the sub-options' influence on multi-threading and speed is
important enough to warrant its inclusion in each sub-option's
description line.

Closes #1434
2020-02-17 17:16:28 -05:00
Andreas Stieger
a4897eca23 readme: simplify openSUSE instructions
Closes #1436
2020-02-17 17:16:28 -05:00
Collin Styles
a070722ff2 cli: add --include-zero flag
This flag, when used in conjunction with --count or --count-matches,
will print a result for each file searched even if there were zero
matches in that file. This is off by default but can be enabled to make
ripgrep behave more like grep.

This also clarifies some of the defaults for the
grep-printer::SummaryBuilder type.

Closes #1370, Closes #1405
2020-02-17 17:16:28 -05:00
Matěj Cepl
4628d77808 ignore/types: add spec file type
This is for RPM package SPEC files.

Fixes #946, Closes #1449
2020-02-17 17:16:28 -05:00
Ximin Luo
f8418c6a52 explicitly declare lazy_static dependency
`benches/bench.rs` uses lazy_static but Cargo.toml does not declare a
dependency on it. This causes rustc to use its own internal private
copy instead. Sometimes this causes unintuitive errors like this Debian
bug: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=942243

The underlying issue is https://github.com/rust-lang/rust#27812 but it
can be avoided by explicitly declaring the dependency, which you are
supposed to do anyways.

Closes #1435
2020-02-17 17:16:28 -05:00
luh2
040ca45ba0 ignore/types: add xhtml to xml file type
Closes #1426
2020-02-17 17:16:28 -05:00
Andrew Gallant
91470572cd changelog: add notes about new file types 2020-02-17 17:16:28 -05:00
Sven-Hendrik Haase
027adbf485 ignore/types: add 'diff' file type
This includes .patch and .diff files.

Fixes #1418, Closes #1419
2020-02-17 17:16:28 -05:00
Mohammad AlSaleh
e71eedf0eb cli: add --no-context-separator flag
--context-separator='' still adds a new line separator, which could
still potentially be useful. So we add a new `--no-context-separator`
flag that completely disables context separators even when the -A/-B/-C
context flags are used.

Closes #1390
2020-02-17 17:16:28 -05:00
Andrew Gallant
88f46d12f1 tests: remove existing test directory
I'm surprised this wasn't caught until now, but if a test directory
already exists, then it was reused. This can result in hard to debug
problems with tests when, e.g., file names are changed and a recursive
search is executed.
2020-02-17 17:16:28 -05:00
sharkdp
a18cf6ec39 ignore: add existence check for ignore files
This commit adds a simple `.exists()` check for `.gitignore`,
`.ignore`, and other similar files before actually calling
`File::open(…)` in `GitIgnoreBuilder::add`.

The reason is that a simple existence check via `stat` can be faster
than actually trying to `open` the file, see
https://stackoverflow.com/a/12774387/704831. As we typically expect(?)
the number of directories *without* ignore files to be much larger
than the number of directories *with* ignore files, this leads to an
overall speedup.

The performance gain is not huge for `rg`, but can be quite significant
if more `.gitignore`-like files are added via
`add_custom_ignore_filename`. The speedup is *larger* for folders with
*low* files-per-directory ratios.

Note though that we do not do this check on Windows until a specific
analysis there suggests this is beneficial. Namely, Windows generally
has slower file system operations, so it's not clear whether this
speculative check is actually a benefit or not.

Benchmark results
-----------------

`rg --files` in my home folder (200k results, 6.5 files per directory):

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./rg-master --files` | 396.4 ± 3.2 | 390.9 | 400.0 | 1.05 |
| `./rg-feature --files` | 376.0 ± 3.6 | 369.3 | 383.5 | 1.00 |

`rg --files --hidden` in my home folder (800k results, 5.4
files per directory)

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `./rg-master --files --hidden` | 1.575 ± 0.012 | 1.560 | 1.597 | 1.06 |
| `./rg-feature --files --hidden` | 1.479 ± 0.011 | 1.464 | 1.496 | 1.00 |

`rg --files` in the chromium-79.0.3915.2 source tree (300k results, 12.7 files per
directory)

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `~/rg-master --files` | 445.2 ± 5.3 | 435.6 | 453.0 | 1.04 |
| `~/rg-feature --files` | 428.9 ± 7.0 | 418.2 | 440.0 | 1.00 |

`rg --files` in the linux-5.3 source tree (65k results, 15.1
files per directory)

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./rg-master --files` | 94.5 ± 1.9 | 89.8 | 98.5 | 1.02 |
| `./rg-feature --files` | 92.6 ± 2.7 | 88.4 | 98.7 | 1.00 |

Closes #1381
2020-02-17 17:16:28 -05:00
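
The check described in a18cf6ec39 amounts to a cheap `stat` before an `open`, skipped on Windows. A minimal sketch, assuming a hypothetical helper `open_if_exists` (the real change lives inside the `ignore` crate's builder and is shaped differently):

```rust
use std::fs::File;
use std::io;
use std::path::Path;

/// Hypothetical helper: return `None` when the file is known to be absent,
/// otherwise the result of actually opening it.
fn open_if_exists(path: &Path) -> Option<io::Result<File>> {
    // The speculative existence check is only applied off Windows, since
    // the commit leaves Windows alone pending a dedicated analysis.
    if cfg!(not(windows)) && !path.exists() {
        return None;
    }
    Some(File::open(path))
}
```

In directories without ignore files, which the commit expects to be the common case, this takes the early return and never calls `File::open`.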
Gibson Fahnestock
c78c3236a8 readme: remove outdated SIMD info
Looks like the upstream brew Formula [0][] now has SIMD support, so
remove the extraneous info now that the custom tap is no longer needed
[1][].

[0]: https://github.com/Homebrew/homebrew-core/blob/master/Formula/ripgrep.rb
[1]: f3083e4574

PR #1431
2020-02-15 17:19:22 -05:00
Sorin Sbarnea
7cf21600cd readme: document CentOS 8 support
The ripgrep install instructions are also valid for version 7, and the
tool works without problems there too.

PR #1428
2020-02-15 17:16:57 -05:00
Jonathan Mast
647b0d3977 ignore/types: add HAML and ERB
These are commonly used templating languages for Ruby, so add their
extensions to the filetypes list for convenient filtering.

PR #1407
2020-02-15 09:18:32 -05:00
Jeff S
e572fc1683 ignore/types: add slim, slime, and skim templates
PR #1391
2020-02-15 09:17:46 -05:00
Andrew Gallant
9cb93abd11 ignore: allow use of Error::description
We can remove it in the next semver incompatible release.
2020-02-10 06:44:21 -05:00
Luca Kredel
41695c66fa ignore/types: add typoscript file type
Add the file types for TypoScript - the configuration language of the
TYPO3 CMS.

PR #1477
2020-02-07 08:41:00 -05:00
Andrew Gallant
cb0dfda936 faq: add section about donations
This is asked often enough that it's worth having a canonical answer.
2020-02-05 13:09:11 -05:00
Andrew Gallant
74d1fe59e9 deps: update everything 2020-01-30 18:33:40 -05:00
Andrew Gallant
9fd1e202e0 deps: update regex, regex-syntax and aho-corasick
Notably, this brings in a bug fix reported by @okdana:
https://github.com/rust-lang/regex/issues/640
2020-01-30 18:32:56 -05:00
Robert Irelan
e76807b1b5 ignore/types: add *.org_archive to org file type
.org_archive is the default extension for Org archive files, created when
entries from an Org-mode file are archived (see
<https://orgmode.org/org.html#Moving-subtrees>). These files are still in Org
mode format, so it's worth searching them at the same time as non-archive Org
mode files.

PR #1475
2020-01-29 13:59:34 -05:00
Andrew Gallant
f8fb65f7e3 globset: fix benchmarks
There were apparently a lot of unused things, including lazy_static.
2020-01-27 16:45:12 -05:00
Tristan Waddington
98de8d248a ignore/types: make 'gradle' its own type
This change maintains the existing behavior of the 'groovy' type, which
includes both .groovy and .gradle files.

PR #1470
2020-01-23 06:51:11 -05:00
Crestwave
c358700dfb readme: add instructions for Haiku x86_64 and x86_gcc2
PR #1465
2020-01-21 07:34:24 -05:00
Alex Touchet
8670a4a969 readme: update outdated links
PR #1463
2020-01-21 07:32:54 -05:00
Oliver Newman
e3b1f86908 doc: add missing "will" to the user guide
PR #1462
2020-01-20 17:26:08 -05:00
Jan Verbeek
46b07bb2ee ignore/types: fix postscript globs
The postscript globs were missing asterisks, so they were treated as
literal filenames.

PR #1461
2020-01-20 07:48:57 -05:00
Andrew Gallant
8bdf84e3a8 deps: update everything 2020-01-16 19:47:23 -05:00
Andrew Gallant
5a6e17fcc1 deps: various updates
Most of these updates (sans thread_local) are from crates I maintain
that have seen updates recently.

Notably, this includes a bump to `termcolor 1.1.0` which includes
support for respecting `NO_COLOR`. This commit therefore means that
ripgrep now supports `NO_COLOR`.

As an added bonus, we drop a dependency on Windows. (Although the total
amount of code compiled remains the same.)

Closes #1186
2020-01-11 10:09:10 -05:00
Andrew Gallant
00bfcd14a6 ignore-0.4.11 2020-01-10 15:08:27 -05:00
Andrew Gallant
bf0ddc4675 ci: fix musl docker build
Looks like the old japaric images are bunk. We update our docker image
to be based on the new rustembedded images and configure cross to use
it.

Turns out that this wasn't due to a stale docker image, but rather, a
bug in cross: https://github.com/rust-embedded/cross/issues/357
We work around that bug by installing the master branch of cross. Sigh.
2020-01-10 15:07:47 -05:00
Andrew Gallant
0fb3f6a159 ci: disable github actions for now
The CI build failures are annoying and distracting. Hopefully soon I'll
be able to invest more time in the switch.
2020-01-10 15:07:47 -05:00
Andrew Gallant
837fb5e21f deps: update to crossbeam-channel 0.4
Closes #1427
2020-01-10 15:07:47 -05:00
Andrew Gallant
2e1815606e deps: update to bytecount 0.6
Looks like there aren't any major changes other than dependency updates.
2020-01-10 15:07:47 -05:00
Andrew Gallant
cb2f6ddc61 deps: update to thread_local 1.0
We also update the pcre2 and regex dependencies, which removes any other
lingering uses of thread_local 0.3.
2020-01-10 15:07:47 -05:00
Andrew Gallant
bd7a42602f deps: bump to base64 0.11 2020-01-10 15:07:47 -05:00
Andrew Gallant
528ce56e1b deps: run cargo update
The only new dependency is an unused target specific dependency hermit
via the atty crate.
2020-01-10 15:07:47 -05:00
Yevgen Antymyrov
8892bf648c doc: fix typo in FAQ 2019-09-25 08:13:27 -04:00
Jonathan Clem
8cb7271b64 ci: get GitHub Actions running again
Basically, matrix.os needs to be defined for every build. We
were commenting out some of the builds in order to debug
CI in the `include` section, but we also need to comment them
out in the `build` section.
2019-09-11 09:08:24 -04:00
Andrew Gallant
4858267f3b ci: initial github actions config 2019-08-31 09:24:44 -04:00
Andrew Gallant
5011dba2fd ignore: remove unused parameter 2019-08-28 20:21:34 -04:00
Andrew Gallant
e14f9195e5 deps: update everything 2019-08-28 20:18:47 -04:00
Andrew Gallant
ef0e7af56a deps: update bstr to 0.2.7
The new bstr release contains a small performance bug fix where some
trivial methods weren't being inlined.
2019-08-11 10:41:05 -04:00
Todd Walton
b266818aa5 doc: use XDG_CONFIG_HOME in comments
XDG_CONFIG_DIR does not actually exist.

PR #1347
2019-08-09 13:37:37 -04:00
LawAbidingCactus
81415ae52d doc: update to reflect glob matching behavior change
Specifically, paths containing a `/` are not allowed to match any
other slash in the path, even as a prefix. So `!.git` is the correct
incantation for ignoring a `.git` directory that occurs anywhere
in the path.
2019-08-07 13:47:18 -04:00
Andrew Gallant
5c4584aa7c grep-regex-0.1.5 2019-08-06 09:51:13 -04:00
Andrew Gallant
0972c6e7c7 grep-searcher-0.1.6 2019-08-06 09:50:52 -04:00
Andrew Gallant
0a372bf2e4 deps: update ignore 2019-08-06 09:50:35 -04:00
Andrew Gallant
345124a7fa ignore-0.4.10 2019-08-06 09:47:45 -04:00
Andrew Gallant
31807f805a deps: drop tempfile
We were only using it to create temporary directories for `ignore`
tests, but it pulls in a bunch of dependencies and we don't really need
randomness. So just use our own simple wrapper instead.
2019-08-06 09:46:05 -04:00
Andrew Gallant
4de227fd9a deps: update everything
Mostly this just updates regex and its assorted dependencies. This does
drop utf8-ranges and ucd-util, in accordance with changes to
regex-syntax and regex.
2019-08-05 13:50:55 -04:00
jimbo1qaz
d7ce274722 readme: Debian Buster is stable now
PR #1338
2019-08-04 08:06:10 -04:00
Andrew Gallant
5b10328f41 changelog: update with bug fix 2019-08-02 07:37:27 -04:00
Andrew Gallant
813c676eca searcher: fix roll buffer bug
This commit fixes a subtle bug in how the line buffer was rolling its
contents. Specifically, when ripgrep searches without memory maps,
it uses a "roll" buffer for incremental line oriented search without
needing to read the entire file into memory at once. The roll buffer
works by reading a chunk of bytes from the file into memory, and then
searching everything in that buffer up to the last `\n` byte. The bytes
*after* the last `\n` byte are preserved, since they likely correspond
to *part* of the next line. Once ripgrep is done searching the buffer,
it "rolls" the buffer such that the start of the next line is at the
beginning of the buffer, and then ripgrep reads more data into the
buffer starting at the (possibly) partial end of that line.

The implication of this strategy, necessarily so, is that a buffer must
be big enough to fit a single line in memory. This is because the regex
engine needs a contiguous block of memory to search, so there is no way
to search anything smaller than a single line. So if a file contains a
single line with 7.5 million bytes, then the buffer will grow to be at
least that size. (Many files have super long lines like this, but they
tend to be *binary* files, which ripgrep will detect and stop searching
unless the user forces it with the `-a/--text` flag. So in practice,
they aren't usually a problem. However, in this case, #1335 found a case
where a plain text file had a line with 7.5 million bytes.)

Now, for performance reasons, ripgrep reuses these buffers across its
search. Typically, it will create `N` of these line buffers when it
starts (where `N` is the number of threads it is using), and then reuse
them without creating any new ones as it searches through files.

This means that if you search a file with a very long line, that buffer
will expand to be big enough to store that line. ripgrep never contracts
these buffers, so once it searches the next file, ripgrep will continue
to use this large buffer. While it might be prudent to contract these
buffers in some circumstances, this isn't otherwise inherently a
problem. The memory has already been allocated, and there isn't much
cost to using it, other than the fact that ripgrep hangs on to it and
never gives it back to the OS.

However, the `roll` implementation described above had a really
important bug in it that was impacted by the size of the buffer.
Specifically, it used the following to "roll" the partial line at the
end of the buffer to the beginning:

    self.buf.copy_within_str(self.pos.., 0);

Which means that if the buffer is very large, ripgrep will copy
*everything* from `self.pos` (which might be very small, e.g., for small
files) to the end of the buffer, and move it to the beginning of the
buffer. This will happen repeatedly each time the buffer is used to
search small files, which winds up being quite a large slow down if the
line was exceptionally large (say, megabytes).

It turns out that copying everything is completely unnecessary. We only
need to copy the remainder of the last read to the beginning of the
buffer. Everything *after* the last read in the buffer is just free
space that can be filled for the next read. So, all we need to do is
copy just those bytes:

    self.buf.copy_within_str(self.pos..self.end, 0);

... which is typically much much smaller than the rest of the buffer.

This was likely also causing small performance losses in other cases as
well. For example, when searching a lot of small files, ripgrep would
likely do a lot more copying than necessary. Although, given that the
default buffer size is 8KB, this extra copying was likely pretty small,
and was thus harder to observe.

Fixes #1335
2019-08-02 07:23:27 -04:00
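
The difference between the two `roll` variants in 813c676eca can be sketched with the standard library's slice `copy_within`. `RollBuffer` is a simplified, hypothetical stand-in for the real line buffer; only the field names `buf`, `pos` and `end` are taken from the commit message.

```rust
/// Simplified, hypothetical stand-in for the searcher's line buffer.
struct RollBuffer {
    /// Backing storage; may be far larger than the bytes currently in use.
    buf: Vec<u8>,
    /// Start of the bytes that have not been searched yet.
    pos: usize,
    /// End of the bytes filled by the last read.
    end: usize,
}

impl RollBuffer {
    /// Move the unsearched tail `pos..end` to the front of the buffer.
    fn roll(&mut self) {
        let len = self.end - self.pos;
        // Copy only the live bytes. Copying `self.pos..` instead would also
        // drag along all of the unused capacity after `end`.
        self.buf.copy_within(self.pos..self.end, 0);
        self.pos = 0;
        self.end = len;
    }
}
```

With a buffer that once grew to hold a multi-megabyte line, the bounded range copies a handful of bytes per roll instead of the whole allocation.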
Andrew Gallant
f625d72b6f pkg: update brew tap to 11.0.2 2019-08-01 19:39:53 -04:00
Andrew Gallant
3de31f7527 ci: fix musl deployment
The docker image that the Linux binary is now built in does not have
AsciiDoc installed, so set up Cross to point to my own image with those
tools installed.
2019-08-01 18:41:44 -04:00
Andrew Gallant
e402d6c260 ripgrep: release 11.0.2 2019-08-01 18:02:15 -04:00
Andrew Gallant
48b5bdc441 src: remove old directories
termcolor has had its own repository for a while now. No need for these
redirects any more.
2019-08-01 17:49:28 -04:00
Andrew Gallant
709ca91f50 ignore: release 0.4.9 2019-08-01 17:48:37 -04:00
Andrew Gallant
9c220f9a9b grep-regex: release 0.1.4 2019-08-01 17:47:45 -04:00
Andrew Gallant
9085bed139 grep-matcher: release 0.1.3 2019-08-01 17:46:59 -04:00
Andrew Gallant
931ab35f76 changelog: start work on 11.0.2 release 2019-08-01 17:42:38 -04:00
Andrew Gallant
b5e5979ff1 deps: update everything
This drops `spin` and `autocfg`, yay.
2019-08-01 17:42:38 -04:00
Andrew Gallant
052c857da0 doc: mention .ignore and .rgignore more prominently
Fixes #1284
2019-08-01 17:37:46 -04:00
Andrew Gallant
5e84e784c8 doc: add translations section
We note that they may not be up to date and are unofficial.

Fixes #1246
2019-08-01 17:37:46 -04:00
Andrew Gallant
01e8e11621 doc: improve PCRE2 failure mode documentation
If a user tries to search for an explicit `\n` character in a PCRE2
regex, ripgrep won't report an error and instead will (likely) silently
fail to match.

Fixes #1261
2019-08-01 17:32:44 -04:00
Ninan John
9268ff8e8d ripgrep: fix bug when CWD has directory named -
Specifically, when searching stdin, if the current directory has a
directory named `-`, then the `--with-filename` flag would automatically
be turned on. This is because `--with-filename` is automatically enabled
when ripgrep is given a single path that is a directory. When ripgrep is
given empty arguments, and if it is searching stdin, then its default
path list is simply `["-"]`. The `is_dir` check passes, and
`--with-filename` gets enabled.

This commit fixes the problem by checking whether the path is `-` first.
If so, then we assume it isn't a directory. This is fine, since if it is
a directory and one asks to search it explicitly, then ripgrep will
interpret `-` as stdin anyway (which is arguably a bug on its own, but
probably not one worth fixing).

Fixes #1223, Closes #1292
2019-08-01 17:27:23 -04:00
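
The fix in 9268ff8e8d boils down to testing for `-` before asking the file system. A minimal sketch with hypothetical helper names (`is_stdin_marker`, `treat_as_dir`), not ripgrep's actual functions:

```rust
use std::path::Path;

/// Hypothetical helper: `-` is the conventional spelling for stdin.
fn is_stdin_marker(path: &Path) -> bool {
    path == Path::new("-")
}

/// Hypothetical helper: decide whether a path argument counts as a
/// directory for the purpose of enabling --with-filename.
fn treat_as_dir(path: &Path) -> bool {
    // Check for `-` first, so that a directory that happens to be named
    // `-` in the CWD never makes stdin look like a directory search.
    !is_stdin_marker(path) && path.is_dir()
}
```

The order of the two checks matters: `is_dir` alone is what produced the original bug.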
dana
c2cb0a4de4 ripgrep: add --glob-case-insensitive
This flag forces -g/--glob patterns to be treated case-insensitively, as with
--iglob patterns.

Fixes #1293
2019-08-01 17:08:58 -04:00
Andrew Gallant
adb9332f52 regex: fix -F aho-corasick optimization
It turns out that when the -F flag was used, if any of the patterns
contained a regex meta character (such as `.`), then we wound up
escaping the pattern first before handing it off to Aho-Corasick, which
treats all patterns literally.

We continue to apply band-aids here and just avoid Aho-Corasick if
there is an escape in any of the literal patterns. This is unfortunate,
but making this work better requires more refactoring, and the right
solution is to get this optimization pushed down into the regex engine.

Fixes #1334
2019-08-01 16:58:12 -04:00
Matthew Davidson
bc37c32717 ignore/types: add edn type from Clojure ecosystem
PR #1330
2019-07-29 16:43:28 -04:00
Andrew Gallant
08ae4da2b7 deps: update them
There are some nice removals. It looks like rand has slimmed down, and
smallvec is gone now as well.
2019-07-25 07:52:33 -04:00
Andrew Gallant
7ac95c1f50 deps: bump ignore 2019-07-24 12:56:47 -04:00
Andrew Gallant
7a6903bd4e ignore-0.4.8 2019-07-24 12:56:01 -04:00
Tiziano Santoro
9801fae29f ignore: support compilation on wasm
Currently the crate assumes that exactly one of `cfg(windows)` or
`cfg(unix)` is true, but this is not actually the case, for instance
when compiling for `wasm32`.

Implement the missing functions so that the crate can compile on other
platforms, even though those functions will always return an error.

PR #1327
2019-07-24 12:55:37 -04:00
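
The approach in 9801fae29f, adding an always-failing fallback for targets that are neither Unix nor Windows, can be sketched with `cfg` attributes. This is a simplified two-way version under stated assumptions: `device_num` is used as an illustrative function name, and the Windows-specific branch is omitted and folded into the fallback because it would need more than the standard library.

```rust
use std::io;
use std::path::Path;

/// Unix: the device number is available from file metadata.
#[cfg(unix)]
fn device_num(path: &Path) -> io::Result<u64> {
    use std::os::unix::fs::MetadataExt;
    path.metadata().map(|md| md.dev())
}

/// Everything else (e.g. wasm32) in this sketch: compile, but always fail.
#[cfg(not(unix))]
fn device_num(_path: &Path) -> io::Result<u64> {
    Err(io::Error::new(
        io::ErrorKind::Other,
        "device numbers are not supported on this platform",
    ))
}
```

Exactly one definition is compiled per target, so callers can use `device_num` unconditionally and handle the error at run time.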
Miloš Stojanović
abdf7140d7 readme: fix broken link to Scoop bucket
PR #1324
2019-07-20 12:03:46 -04:00
Conrad Olega
b83e7968ef ignore/types: add Robot Framework
PR #1322
2019-07-14 08:12:34 -04:00
Hugo Locurcio
8ebc113847 doc: improve docs for --replace flag
Specifically, we document shell-specific caveats related to the `--replace`
flag.

PR #1318
2019-07-04 11:42:35 -04:00
Andrew Gallant
785c1f1766 release: globset, grep-cli, grep-printer, grep-searcher 2019-06-26 16:53:30 -04:00
Andrew Gallant
8b734cb490 deps: update everything 2019-06-26 16:51:06 -04:00
Andrew Gallant
b93762ea7a bstr: update everything to bstr 0.2 2019-06-26 16:47:33 -04:00
Andrew Gallant
34677d2622 search: a few small touchups 2019-06-18 20:23:47 -04:00
Andrew Gallant
d1389db2e3 search: better errors for preprocessor commands
If a preprocessor command could not be started, we now show some
additional context with the error message. Previously, it showed
something like this:

  some/file: No such file or directory (os error 2)

Which is itself pretty misleading. Now it shows:

  some/file: preprocessor command could not start: '"nonexist" "some/file"': No such file or directory (os error 2)

Fixes #1302
2019-06-16 19:02:02 -04:00
Andrew Gallant
50bcb7409e deps: update everything 2019-06-16 18:38:45 -04:00
Andrew Gallant
7b9972c308 style: fix deprecations
Use `dyn` for trait objects and use `..=` for inclusive ranges.
2019-06-16 18:37:51 -04:00
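The two deprecations fixed in 7b9972c308 above can be illustrated with a
minimal sketch. The function names `describe` and `is_digit` are
hypothetical and do not come from ripgrep; they only show the `dyn` and
`..=` syntax the commit message refers to.

```rust
use std::fmt::Display;

// Trait objects are written with an explicit `dyn` (previously a bare
// `&Display` was accepted).
fn describe(value: &dyn Display) -> String {
    format!("value: {}", value)
}

// Inclusive ranges in patterns are written with `..=` (previously `...`).
fn is_digit(b: u8) -> bool {
    match b {
        b'0'..=b'9' => true,
        _ => false,
    }
}
```

For example, `describe(&5)` yields "value: 5" and `is_digit(b'7')` is true.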
Hitesh Jasani
9f000c2910 ignore/types: add more nim types
PR #1297
2019-06-12 14:02:28 -04:00
skierpage
392682d352 doc: point regex doc link to the latest version
The latest doc is different, e.g. adds "symmetric differences" under
https://docs.rs/regex/*/regex/#character-classes

PR #1287
2019-06-01 08:44:55 -04:00
Andrew Gallant
7d3f794588 ignore: remove .git check in some cases
When we know we aren't going to process gitignores, we shouldn't waste
the syscall in every directory to check for a git repo.
2019-05-29 18:06:11 -04:00
bruce-one
290fd2a7b6 readme: mention Zstandard and Brotli
Also alphabetise the list.

PR #1288
2019-05-29 13:37:31 -04:00
Fabian Würfl
d1e4d28f30 readme: remove outdated statement
Issue #10 already states that "ripgrep is now in most or all of the major
package repositories."

PR #1280
2019-05-14 18:44:50 -04:00
Andrew Gallant
5ce2d7351d ci: use cross for musl x86_64 builds
This is necessary because jemalloc + musl + Ubuntu 16.04 is apparently
broken.

Moreover, jemalloc doesn't support i686, so we accept the performance
regression there.

See also: https://github.com/gnzlbg/jemallocator/issues/124
2019-04-25 11:12:14 -04:00
Andrew Gallant
9dcfd9a205 deps: bump pcre2-sys to 0.2.1
This brings in a bug fix that no longer tries to run `git` to update the
submodule if the `git` command doesn't exist.

This is useful in more restricted build contexts where `git` isn't
installed, such as in the Docker image used for running `cross`.
2019-04-25 11:12:14 -04:00
Andrew Gallant
36b276c6d0 printer: remove unnecessary mut 2019-04-24 17:22:27 -04:00
Andrew Gallant
03bf37ff4a alloc: use jemalloc when building with musl
It turns out that musl's allocator is slow enough to cause a fairly
noticeable performance regression when ripgrep is built as a static
binary with musl. We fix this by using jemalloc when building with musl.

We continue to use the default system allocator in all other scenarios.
Namely, glibc's allocator doesn't noticeably regress performance compared
to jemalloc. But we could add more targets to this logic if other
system allocators (macOS, Windows) prove to be slow.

This wasn't necessary before because rustc recently stopped using jemalloc
by default.

Fixes #1268
2019-04-24 17:21:38 -04:00
Andrew Gallant
e7829c05d3 cli: fix bug where last byte was stripped
In an effort to strip line terminators, we assumed their existence. But
a pattern file may not end with a line terminator, so we shouldn't
unconditionally strip them.

We fix this by moving to bstr's line handling, which does this for us
automatically.
2019-04-19 07:11:44 -04:00
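The fix described in e7829c05d3 above amounts to stripping a line
terminator only when one is actually present. The following is a
standard-library-only sketch of that idea; it is not bstr's
implementation, and `strip_line_terminator` is a hypothetical name.

```rust
// Drop one trailing `\r\n` or `\n` if present; otherwise return the line
// unchanged, so a final line without a terminator keeps its last byte.
fn strip_line_terminator(line: &[u8]) -> &[u8] {
    if line.ends_with(b"\r\n") {
        &line[..line.len() - 2]
    } else if line.ends_with(b"\n") {
        &line[..line.len() - 1]
    } else {
        line
    }
}
```

With this shape, a pattern file whose last line is `foo` with no trailing
newline still yields the pattern `foo` rather than `fo`.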
Rory O’Kane
a6222939f9 readme: mention --pcre2 as long form of -P
This is for consistency with the short and long flags given in other
bullet points. I originally assumed there was no long flag for `-P`
because none was given here.

PR #1254
2019-04-16 21:22:48 -04:00
Rory O’Kane
6ffd434232 readme: mention --auto-hybrid-regex in advantages
This feature solves a major reason I was skeptical of using ripgrep, so
I think it’s good to mention it in the section about why one should use
it.

I use backreferences a lot, so I had previously thought that ripgrep
would provide no speed advantage over ag, since I would always have
`-P` enabled. But when I saw `--auto-hybrid-regex` in the 11.0.0
changelog, I learned that ripgrep can use it to speed up simple queries
while still allowing me to write backreferences.

PR #1253
2019-04-16 17:21:40 -04:00
Andrew Gallant
1f1cd9b467 pkg: update brew tap to 11.0.1 2019-04-16 13:39:56 -04:00
Andrew Gallant
973de50c9e ripgrep: release 11.0.1, take 2 2019-04-16 13:11:28 -04:00
Andrew Gallant
5f8805a496 ripgrep: release 11.0.1 2019-04-16 13:10:29 -04:00
Andrew Gallant
fdde2bcd38 deps: update regex to 1.1.6
This brings in a fix for a regression introduced in ripgrep 11.

Fixes #1247
2019-04-16 08:34:30 -04:00
Gerard de Melo
7b3fe6b325 doc: fix typo in FAQ
PR #1248
2019-04-16 08:32:30 -04:00
Max Horn
b3dd3ae203 ignore/types: add GAP
Add support for file types used by the GAP language, a research system
for computational discrete algebra; see <https://www.gap-system.org>

PR #1249
2019-04-16 08:31:58 -04:00
Andrew Gallant
f3083e4574 readme: remove brew tap instructions
The brew tap isn't really needed any more, since SIMD is now
automatically enabled in all binaries.
2019-04-15 18:32:33 -04:00
Andrew Gallant
d03e30707e pkg: update brew tap to 11.0.0 2019-04-15 18:32:10 -04:00
Andrew Gallant
d7f57d9aab ripgrep: release 11.0.0 2019-04-15 18:09:40 -04:00
Andrew Gallant
1a2a24ea74 grep: release 0.2.4 2019-04-15 18:03:46 -04:00
Andrew Gallant
d66610b295 grep-cli: release 0.1.2 2019-04-15 18:02:44 -04:00
Andrew Gallant
019ae1989b grep-printer: release 0.1.2 2019-04-15 18:00:49 -04:00
Andrew Gallant
36d3f235dc grep-searcher: release 0.1.4 2019-04-15 17:59:22 -04:00
Andrew Gallant
79018eb693 grep-pcre2: release 0.1.3 2019-04-15 17:57:03 -04:00
Andrew Gallant
44cd344438 grep-regex: release 0.1.3 2019-04-15 17:56:04 -04:00
Andrew Gallant
e493e54b9b grep-matcher: release 0.1.2 2019-04-15 17:53:29 -04:00
Andrew Gallant
8e8215aa65 ignore: release 0.4.7 2019-04-15 17:50:37 -04:00
Andrew Gallant
3fe701498e doc: add note about --pre-glob
There was a performance warning in the --pre docs, but it didn't mention
--pre-glob as a possible mitigation.
2019-04-15 17:47:48 -04:00
Andrew Gallant
e79085e9e4 release: globset 0.4.3 2019-04-15 14:07:03 -04:00
Andrew Gallant
764c197022 complete: fix typo 2019-04-15 07:04:57 -04:00
Andrew Gallant
ef1611b5f5 ripgrep: max-column-preview --> max-columns-preview
Credit to @okdana for catching this. This naming is a bit more
consistent with the existing --max-columns flag.
2019-04-15 06:51:51 -04:00
Andrew Gallant
45d12abbc5 changelog: small fixups 2019-04-14 20:21:55 -04:00
Andrew Gallant
5fde8391f9 changelog: backfill it
I went through every commit since the 0.10.0 release and added anything
that I thought was missing.
2019-04-14 20:04:01 -04:00
Marco Herrn
3edb11c513 ignore/types: add additional java files
- .jspx for XHTML JSP files
- .properties for Java Properties files (resource bundles, etc.)

Closes #1242
2019-04-14 19:38:24 -04:00
Andrew Gallant
ed144be775 ci: bump MSRV to 1.34.0 2019-04-14 19:29:27 -04:00
Andrew Gallant
967e7ad0de ripgrep: add --auto-hybrid-regex flag
This flag, when set, will automatically dispatch to PCRE2 if the given
regex cannot be compiled by Rust's regex engine. If both engines fail to
compile the regex, then both errors are surfaced.

Closes #1155
2019-04-14 19:29:27 -04:00
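The dispatch described in 967e7ad0de above (try the default engine, fall
back to the other, surface both errors if both fail) can be sketched
generically. This is only an illustration under assumptions:
`compile_hybrid` and its closure parameters are hypothetical and stand in
for the real engine constructors, which are not shown in this log.

```rust
// Try the default engine first. If it fails, try the fallback engine.
// If both fail, return both errors so the caller can report each one.
fn compile_hybrid<T, E1, E2>(
    default_engine: impl FnOnce() -> Result<T, E1>,
    fallback_engine: impl FnOnce() -> Result<T, E2>,
) -> Result<T, (E1, E2)> {
    match default_engine() {
        Ok(matcher) => Ok(matcher),
        Err(e1) => fallback_engine().map_err(|e2| (e1, e2)),
    }
}
```

The fallback closure is only invoked when the first one fails, so simple
patterns never pay for the second engine.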
Andrew Gallant
9952ba2068 deps: update glob dev-dependency 2019-04-14 19:29:27 -04:00
Andrew Gallant
b751758d60 deps: update everything 2019-04-14 19:29:27 -04:00
Andrew Gallant
8f14cb18a5 ripgrep: increase pcre2's default JIT stack size
The default stack size is 32KB, and this increases it to 10MB. 32KB is
pretty paltry in the environments in which ripgrep runs, and 10MB is
easily afforded as a maximum size. (The size limit we set for Rust's
regex engine is considerably larger.)

This was motivated by the fact that JIT stack limits have been
observed to be hit in the wild:
https://github.com/Microsoft/vscode/issues/64606
2019-04-14 19:29:27 -04:00
Andrew Gallant
da9d720431 ripgrep: add --pcre2-version flag
This flag will output details about the version of PCRE2 that ripgrep
is using (if any).
2019-04-14 19:29:27 -04:00
Andrew Gallant
a9d71a0368 pcre2: add a few re-exports
This adds the top-level is_jit_available and version free functions from
the underlying pcre2 crate, and also forwards the max_jit_stack_size
option.
2019-04-14 19:29:27 -04:00
Andrew Gallant
f3646242cc deps: use pcre2 0.2.0
This comes with PCRE 10.32 and a few new options we'll use in subsequent
commits.
2019-04-14 19:29:27 -04:00
Andrew Gallant
601f212a0b ripgrep: add -I as a short option for --no-filename
This flag is commonly used in pipelines and it can be annoying to write
it out every time you need it.

Ideally, we would use -h for this to match GNU grep, but -h is used to
print help output.

Closes #1185
2019-04-14 19:29:27 -04:00
Andrew Gallant
5a565354f8 versioning: next version will be ripgrep 11
This sets up the release announcement and briefly describes the
versioning change. The actual version change itself won't happen until
the release.

Closes #1172
2019-04-14 19:29:27 -04:00
Andrew Gallant
2a6532ae71 doc: note cases of exorbitant memory usage
Fixes #1189
2019-04-14 19:29:27 -04:00
Andrew Gallant
ece1f50cfe printer: support previews for long lines
This commit adds support for showing a preview of long lines. While the
default still remains as completely suppressing the entire line, this
new functionality will show the first N graphemes of a matching line,
including the number of matches that are suppressed.

This was unfortunately a fairly invasive change to the printer that
required a bit of refactoring. On the bright side, the single line
and multi-line coloring are now more unified than they were before.

Closes #1078
2019-04-14 19:29:27 -04:00
Andrew Gallant
a7d26c8f14 binary: rejigger ripgrep's handling of binary files
This commit attempts to surface binary filtering in a slightly more
user friendly way. Namely, before, ripgrep would silently stop
searching a file if it detected a NUL byte, even if it had previously
printed a match. This can lead to the user quite reasonably assuming
that there are no more matches, since a partial search is fairly
unintuitive. (ripgrep has this behavior by default because it really
wants to NOT search binary files at all, just like it doesn't search
gitignored or hidden files.)

With this commit, if a match has already been printed and ripgrep detects
a NUL byte, then it will print a warning message indicating that the search
stopped prematurely.

Moreover, this commit adds a new flag, --binary, which causes ripgrep to
stop filtering binary files, but in a way that still avoids dumping
binary data into terminals. That is, the --binary flag makes ripgrep
behave more like grep's default behavior.

For files explicitly specified in a search, e.g., `rg foo some-file`,
then no binary filtering is applied (just like no gitignore and no
hidden file filtering is applied). Instead, ripgrep behaves as if you
gave the --binary flag for all explicitly given files.

This was a fairly invasive change, and potentially increases the UX
complexity of ripgrep around binary files. (Before, there were two
binary modes, whereas now there are three.) However, ripgrep is now a
bit louder with warning messages when binary file detection might
otherwise be hiding potential matches, so hopefully this is a net
improvement.

Finally, the `-uuu` convenience now maps to `--no-ignore --hidden
--binary`, since this is closer to the actual intent of the
`--unrestricted` flag, i.e., to reduce ripgrep's smart filtering. As a
consequence, `rg -uuu foo` should now search roughly the same number of
bytes as `grep -r foo`, and `rg -uuua foo` should search roughly the
same number of bytes as `grep -ra foo`. (The "roughly" weasel word is
used because grep's and ripgrep's binary file detection might differ
somewhat---perhaps based on buffer sizes---which can impact exactly what
is and isn't searched.)

See the numerous tests in tests/binary.rs for intended behavior.

Fixes #306, Fixes #855
2019-04-14 19:29:27 -04:00
Andrew Gallant
bd222ae93f regex: fix HIR analysis bug
An alternate can be empty at this point, so we must handle it. We didn't
before because the regex engine actually disallows empty alternates,
however, this code runs before the regex compiler rejects the regex.
2019-04-14 19:29:27 -04:00
hupfdule
4359d8aac0 ignore/types: add more extensions for xml
This includes:

    *.dtd for Document Type Definitions
    *.xsl and *.xslt for XSL Transformation descriptions
    *.xsd for XML Schema definitions
    *.xjb for JAXB bindings
    *.rng for Relax NG files
    *.sch for Schematron files

PR #1243
2019-04-09 15:17:57 -04:00
tonypai
308819fb1f ignore/types: add lock files
Treat anything with a `.lock` extension as a lock file, with
an extra rule or two for special cases, e.g., package-lock.json.
2019-04-09 10:24:48 -04:00
Andrew Gallant
09108b7fda regex: make multi-literal searcher faster
This makes the case of searching for a dictionary of a very large number
of literals much much faster. (~10x or so.) In particular, we achieve this
by short-circuiting the construction of a full regex when we know we have
a simple alternation of literals. Building the regex for a large dictionary
(>100,000 literals) turns out to be quite slow, even if it internally will
dispatch to Aho-Corasick.

Even that isn't quite enough. It turns out that even *parsing* such a regex
is quite slow. So when the -F/--fixed-strings flag is set, we short
circuit regex parsing completely and jump straight to Aho-Corasick.

We aren't quite as fast as GNU grep here, but it's much closer (less than
2x slower).

In general, this is somewhat of a hack. In particular, it seems plausible
that this optimization could be implemented entirely in the regex engine.
Unfortunately, the regex engine's internals are just not amenable to this
at all, so it would require a larger refactoring effort. For now, it's
good enough to add this fairly simple hack at a higher level.

Unfortunately, if you don't pass -F/--fixed-strings, then ripgrep will
be slower, because of the aforementioned missing optimization. Moreover,
passing flags like `-i` or `-S` will cause ripgrep to abandon this
optimization and fall back to something potentially much slower. Again,
this fix really needs to happen inside the regex engine, although we
might be able to special case -i when the input literals are pure ASCII
via Aho-Corasick's `ascii_case_insensitive`.

Fixes #497, Fixes #838
2019-04-07 19:11:03 -04:00
Andrew Gallant
743d64f2e4 deps: update to clap 2.33 2019-04-06 10:35:08 -04:00
lesnyrumcajs
5962abc465 searcher: add option to disable BOM sniffing
This commit adds a new encoding feature where the -E/--encoding flag
will now accept a value of 'none'. When given this value, all encoding
related machinery is disabled and ripgrep will search the raw bytes of
the file, including the BOM if it's present.

Closes #1207, Closes #1208
2019-04-06 10:35:08 -04:00
dana
1604a18db3 ignore/types: add *.am and *.in for C/C++/make
PR #1205
2019-04-06 08:02:04 -04:00
luzpaz
9eeb0b01ce readme: add Repology badge
This adds a badge to the README.md file that, when clicked, shows users
whether their OS/distro carries the latest version of ripgrep.

PR #1213
2019-04-06 08:00:40 -04:00
dana
df4400209a ripgrep: remove extra new-line after Clap output
PR #1222
2019-04-06 07:59:36 -04:00
Andrew Gallant
77439f99a4 deps: add bstr to Cargo.lock 2019-04-05 23:24:08 -04:00
Andrew Gallant
be7d6dd9ce regex: print out final regex in trace mode
This is useful for debugging to see what regex is actually being run.
We put this as a trace since the regex can be quite gnarly. (It is not
pretty printed.)
2019-04-05 23:24:08 -04:00
Andrew Gallant
9f15e3b671 regex: fix a perf bug when using -w flag
When looking for an inner literal to speed up searches, if only a prefix
is found, then we generally give up doing inner literal optimizations since
the regex engine will generally handle it for us. Unfortunately, this
decision was being made *before* we wrap the regex in (^|\W)...($|\W) when
using the -w/--word-regexp flag, which would then defeat the literal
optimizations inside the regex engine.

We fix this with a bit of a hack that says, "if we're doing a word regexp,
then give me back any literal you find, even if it's a prefix."
2019-04-05 23:24:08 -04:00
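To see why the ordering in 9f15e3b671 above matters, here is a rough
sketch of the -w/--word-regexp wrapping. `wrap_word` is a hypothetical
name, and the non-capturing group around the pattern is an assumption;
only the `(^|\W)...($|\W)` shape comes from the commit message.

```rust
// Wrap a pattern so that it only matches at word boundaries. Once wrapped,
// a literal that was a prefix of the original pattern is no longer a
// prefix of the final regex.
fn wrap_word(pattern: &str) -> String {
    format!(r"(^|\W)(?:{})($|\W)", pattern)
}
```

For the pattern `foo\d+`, the literal `foo` is a prefix before wrapping but
sits behind `(^|\W)` afterwards, which is why a prefix-only decision made
before wrapping can miss the optimization.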
Andrew Gallant
254b8b67bb globset: small perf improvements
This tweaks the path handling functions slightly to make them a hair
faster. In particular, `file_name` is called on every path that ripgrep
visits, and it was possible to remove a few branches without changing
behavior.
2019-04-05 23:24:08 -04:00
Andrew Gallant
8a7f43b84d globset: use bstr
This simplifies the various path related functions and pushes more platform
dependent code down into bstr. This likely also makes things a bit more
efficient on Windows, since we now only do a single UTF-8 check for each
file path.
2019-04-05 23:24:08 -04:00
Andrew Gallant
d968a27ed5 cli: use bstr
This uses bstr in the unescaping logic. This lets us remove some platform
specific code, and also lets us remove a hacked UTF-8 decoder on raw
bytes.
2019-04-05 23:24:08 -04:00
Andrew Gallant
9b8f5cbaba config: switch to using bstrs
This lets us implement correct Unicode trimming and also simplifies the
parsing logic a bit. This also removes the last platform specific bits of
code in ripgrep core.
2019-04-05 23:24:08 -04:00
Andrew Gallant
c52da74ac3 printer: use bstr
This starts the usage of bstr in the printer. We don't use it too much
yet, but it comes in handy for implementing PrinterPath and lets us push
down some platform specific code into bstr.
2019-04-05 23:24:08 -04:00
Andrew Gallant
7dcbff9a9b searcher: partially migrate to bstr
This commit causes grep-searcher to use byte strings internally for its
line buffer support. We manage to remove a use of `unsafe` by doing this
(by pushing it down into `bstr`).

We stop short of using byte strings everywhere else because we rely
heavily on the `impl ops::Index<[u8]> for grep_matcher::Match` impl,
which isn't available for byte strings. (It is premature to make bstr a
public dep of a core crate like grep-matcher, but maybe some day.)
2019-04-05 23:24:08 -04:00
Andrew Gallant
bef1f0e770 ci: switch to xenial (#1234)
Rust is having problems with trusty, in particular, see this bug I
filed: https://github.com/rust-lang/rust/issues/59411

This was purportedly fixed in
https://github.com/rust-lang/rust/pull/59468,
but it appears the issue is still occurring.

This commit tries to update to Ubuntu 16.04 in the hope that it will fix
this problem.
2019-04-03 19:52:34 -04:00
Andrew Gallant
cd9815cb37 deps: update to aho-corasick 0.7
We do the simplest possible change to migrate to the new version.

Fixes #1228
2019-04-03 13:51:26 -04:00
Andrew Gallant
3f22c3a658 deps: update everything
This updates all dependencies to their latest versions.

We tolerate a duplicative aho-corasick for now, which we will fix in the
next commit.
2019-04-03 13:07:26 -04:00
Andrew Gallant
0913972104 deps: bump encoding_rs_io
This brings in a new API for disabling BOM sniffing.

This is part of the work toward completing
https://github.com/BurntSushi/ripgrep/issues/1207
2019-03-03 16:36:34 -05:00
Andrew Gallant
f19b84fb23 regex: bump regex dep to fix match bug
See

* 661bf53d5b
* edf45e6f5f

for details on the bug fix, which was in the regex engine.

Fixes #1203
2019-02-27 17:42:14 -05:00
Andrew Gallant
59fc583aeb readme: include details about filtering
Despite the fact that we mention this in several places, people are
still surprised by ripgrep's "smart" filtering.
2019-02-27 08:01:23 -05:00
Andrew Gallant
1c7c4e6640 deps: update tempfile 2019-02-21 16:32:17 -05:00
Andrew Gallant
69c5e3938d deps: bump smallvec
This gets rid of the unmaintained crates `unreachable` and `void`. Yay!
2019-02-21 16:31:48 -05:00
Andrew Gallant
d9cf05ad50 deps: update to aho-corasick 0.6.10
This brings in a fix for this bug:
https://github.com/BurntSushi/aho-corasick/issues/37

Fixes #1079
2019-02-16 11:39:33 -05:00
Andrew Gallant
af8b6caebb deps: update various dependencies 2019-02-16 09:39:42 -05:00
Andrew Gallant
c84cfb6756 grep-regex-0.1.2 2019-02-16 09:30:06 -05:00
Andrew Gallant
895e26a000 ci: don't do releases on all tags
This attempts to make Appveyor more conservative in what tags it thinks
are releases. I don't know for sure, but it looks like the previous
regex could match anywhere, so we anchor it.

Fixes #1195
2019-02-10 12:51:56 -05:00
Andrew Gallant
8c95290ff6 deps: miscellaneous updates 2019-02-10 07:45:08 -05:00
Andrew Gallant
d6feeb7ff2 grep-searcher-0.1.3 2019-02-10 07:42:37 -05:00
Andrew Gallant
626ed00c19 searcher: revert big-endian patch
This undoes the patch to stop using bytecount on big-endian
architectures. In particular, we bump our bytecount dependency to the
latest release, which has a fix.

This reverts commit a4868b8835.

Fixes #1144 (again), Closes #1194
2019-02-10 07:40:32 -05:00
Andrew Gallant
332ad18401 tests: use const constructor for atomics
We did this in 05411b2b for core ripgrep, but didn't carry it over to
tests.
2019-02-09 16:27:25 -05:00
Andrew Gallant
fc3cf41247 grep-searcher-0.1.2 2019-02-09 16:13:07 -05:00
Andrew Gallant
a4868b8835 searcher: use naive line counting on big-endian
This patches out bytecount's "fast" vectorized algorithm on big-endian
machines, where it has been observed to fail. Going forward, bytecount
should probably fix this on their end, but for now, we take a small
performance hit on big-endian machines.

Fixes #1144
2019-02-09 16:13:07 -05:00
John Schmidt
f99b991117 ignore/types: add zig
PR #1191
2019-02-08 08:12:40 -05:00
Andrew Gallant
de0bc78982 deps: bump encoding_rs to 0.8.16
This brings in an updated `encoding_rs` crate that uses `packed_simd`,
which compiles on the latest nightly. Compilation times do appear to be
impacted significantly though.

Fixes #1175 (again)
2019-02-07 17:05:14 -05:00
Steffen Banhardt
147e96914c ignore/types: *.dtx and *.ins added for tex
PR #1182
2019-01-31 09:06:19 -05:00
Andrew Gallant
0abc40c23c readme: bump MSRV
We bumped it a while back in the CI configuration, but didn't update the
README.
2019-01-29 13:10:43 -05:00
Andrew Gallant
f768796e4f deps: update other deps 2019-01-29 13:08:56 -05:00
Andrew Gallant
da0c0c4705 deps: update to crossbeam-channel 0.3.8
This drops dependencies on parking_lot and rand from ripgrep.

(rand is still used for tests.)
2019-01-29 13:07:37 -05:00
Andrew Gallant
05411b2b32 deprecated: remove use of ATOMIC_BOOL_INIT
Our MSRV is high enough that we can use const functions now.
2019-01-29 13:05:16 -05:00
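The change in 05411b2b32 above relies on `AtomicBool::new` being a const
function, so it can initialize a static directly. A minimal sketch
follows; the names `SEEN` and `mark_seen` are hypothetical and not taken
from ripgrep.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Previously this would have been written with ATOMIC_BOOL_INIT.
static SEEN: AtomicBool = AtomicBool::new(false);

// Set the flag and return the value it held before the call.
fn mark_seen() -> bool {
    SEEN.swap(true, Ordering::SeqCst)
}
```

The first call to `mark_seen()` returns false and every later call returns
true.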
Andrew Gallant
cc93db3b18 cargo: include auto-generated message
This is going to be annoying for a while if one switches between the
latest nightly compiler and older compilers. Sigh.
2019-01-29 13:04:40 -05:00
Alex Macleod
049354b766 readme: remove EOL Fedora install instructions
Fedora 27 and below are past their EOL, so it can now be said that it's
supported regularly on Fedora.

PR #1177
2019-01-28 08:15:36 -05:00
Andrew Gallant
386dd2806d changelog: BUG #916
This was fixed by bumping the MSRV above Rust 1.28.

Fixes #916
2019-01-27 13:15:17 -05:00
Andrew Gallant
5fe9a954e6 changelog: BUG #1154 2019-01-27 13:05:50 -05:00
Andrew Gallant
f158a42a71 ignore: correctly detect hidden files on Windows
This commit fixes a bug where ripgrep only treated files beginning with
a `.` as hidden. On Windows, we continue this tradition, but
additionally check whether a file has the special Windows "hidden"
attribute set. If so, we treat it as a hidden file.

In order to make this work without an additional stat call, we had to
rearrange some of the plumbing from the directory traverser.

Fixes #1154
2019-01-27 12:11:52 -05:00
Andrew Gallant
5724391d39 doc: small updates to the FAQ and GUIDE
Notably, ripgrep can do multiline search now. We also update the
supported compression format list and replace deprecated flags like
`--sort-files` with `--sort path`.
2019-01-26 16:19:09 -05:00
Andrew Gallant
0df71240ff search: fix -F and -f interaction bug
This fixes what appears to be a pretty egregious regression where the
`-F/--fixed-strings` flag wasn't being applied to patterns supplied via
the `-f/--file` flag. The same bug existed for the `-x/--line-regexp`
flag as well, which we fix here.

Fixes #1176
2019-01-26 16:01:52 -05:00
Andrew Gallant
f3164f2615 exit: tweak exit status logic
This changes how ripgrep emits exit status codes. In particular, any error
that occurs while searching will now cause ripgrep to emit a `2` exit
code, whereas it previously would emit either a `0` or a `1` code based
on whether it matched or not. That is, ripgrep would only emit a `2` exit
code for a catastrophic error.

This tweak includes additional logic that GNU grep adheres to, which seems
like good sense. Namely, if -q/--quiet is given, and an error occurs and
a match occurs, then ripgrep will emit a `0` exit code.

Closes #1159
2019-01-26 15:44:49 -05:00
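The exit status rules described in f3164f2615 above can be written down as
a small decision function. This is a sketch of the rules as stated in the
commit message, not ripgrep's actual code; `exit_code` and its parameters
are hypothetical.

```rust
// matched: at least one match was found
// errored: at least one error occurred while searching
// quiet:   -q/--quiet was given
fn exit_code(matched: bool, errored: bool, quiet: bool) -> i32 {
    if errored && !(quiet && matched) {
        // Any search error yields 2, except in quiet mode with a match.
        2
    } else if matched {
        0
    } else {
        1
    }
}
```

So `exit_code(true, true, false)` is 2, while `exit_code(true, true, true)`
is 0, mirroring the GNU grep behavior the message mentions.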
Andrew Gallant
31d3e24130 args: prevent panicking in 'rg -h | rg'
Previously, we relied on clap to handle printing either an error
message, or --help/--version output, in addition to setting the exit
status code. Unfortunately, for --help/--version output, clap was
panicking if the write failed, which can happen in fairly common
scenarios via a broken pipe error. e.g., `rg -h | head`.

We fix this by using clap's "safe" API and doing the printing ourselves.
We also set the exit code to `2` when an invalid command has been given.

Fixes #1125 and partially addresses #1159
2019-01-26 14:39:40 -05:00
Andrew Gallant
bf842dbc7f doc: add note about inverted flags
Fixes #1091
2019-01-26 14:13:06 -05:00
Andrew Gallant
6d5dba85bd doc: clarify automatic encoding detection
Fixes #1103
2019-01-26 13:55:47 -05:00
Andrew Gallant
afb89bcdad fmt: shorten --ignore-file-case-insensitive description 2019-01-26 13:45:02 -05:00
Andrew Gallant
332dc56372 changelog: BUG #1095 2019-01-26 13:40:59 -05:00
Andrew Gallant
12a6ca45f9 config: add --no-ignore-dot flag
This flag causes ripgrep to ignore `.ignore` files.

Closes #1138
2019-01-26 13:40:12 -05:00
Andrew Gallant
9d703110cf regex: make CRLF hack more robust
This commit improves the CRLF hack to be more robust. In particular, in
addition to rewriting `$` as `(?:\r??$)`, we now strip `\r` from the end
of a match if and only if the regex has an ending line anchor required for
a match. This doesn't quite make the hack 100% correct, but should fix most
use cases in practice. An example of a regex that will still be incorrect
is `foo|bar$`, since the analysis isn't quite sophisticated enough to
determine that a `\r` can be safely stripped from any match. Even if we
fix that, regexes like `foo\r|bar$` still won't be handled correctly. Alas,
more work on this front should really be focused on enabling this in the
regex engine itself.

The specific cause of this bug was that grep-searcher was sneakily
stripping CRLF from matching lines when it really shouldn't have. We remove
that code now, and instead rely on better match semantics provided at a
lower level.

Fixes #1095
2019-01-26 12:34:28 -05:00
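The match-trimming half of the CRLF hack in 9d703110cf above can be
sketched as follows. The `line_anchored_end` flag is an assumption that
stands in for the regex analysis the commit describes, and
`trim_match_end` is a hypothetical name.

```rust
// Strip a trailing `\r` from a match only when the regex requires an
// ending line anchor to match; otherwise the `\r` may be a legitimate
// part of the match and must be kept.
fn trim_match_end(matched: &[u8], line_anchored_end: bool) -> &[u8] {
    if line_anchored_end && matched.ends_with(b"\r") {
        &matched[..matched.len() - 1]
    } else {
        matched
    }
}
```

A regex such as `foo|bar$` illustrates the limitation noted in the message:
only one branch is anchored, so a single boolean for the whole regex
cannot tell whether stripping is safe for a given match.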
Andrew Gallant
e99b6bda0e deps: bump regex-syntax to 0.6.5
This is necessary for the use of the new is_line_anchored_{start,end}
APIs.
2019-01-26 12:20:02 -05:00
Andrew Gallant
276e2c9b9a searcher: always strip BOM
This fixes a bug where a BOM prefix was included. While this was somewhat
intentional in order to have a faithful "UTF8 passthru" option, in
practice, this causes problems such as breaking patterns like `^` in a
really non-obvious way.

The actual fix was to add a new API to encoding_rs_io, which this commit
brings in.

Fixes #1163
2019-01-25 17:18:57 -05:00
Andrew Gallant
9a9f54d44c readme: encoding_rs's SIMD support is broken
Add a note about it to the README.

Also, remove mention of the avx-accel feature since it no longer exists.
(bytecount now uses runtime detection to enable SIMD support.)

Fixes #1175
2019-01-24 07:00:53 -05:00
Andrew Gallant
47833b9ce7 deps: update removal of grep devdeps 2019-01-23 20:14:37 -05:00
Awad Mackie
44a9e37737 ignore/types: add method for retrieving file type definition
Fixes #1116, Closes #1120
2019-01-23 20:08:48 -05:00
Andrew Gallant
8fd05cacee changelog: BUG #1121 2019-01-23 20:06:01 -05:00
Rob Lourens
4691d11034 ripgrep: don't skip stdout in --files mode
Specifically, this avoids triggering Windows antimalware when in --files mode.

See also #600.

Fixes #1121
2019-01-23 20:04:44 -05:00
Andrew Gallant
519a6b68af grep: remove unused dependencies
We remove these for now, but we'll eventually add them back once the
examples get more fleshed out.

Closes #1043
2019-01-23 20:01:32 -05:00
Andrew Gallant
9c940b45f4 globset: permit ** to appear anywhere
Previously, `man gitignore` specified that `**` was invalid unless it
was used in one of a few specific circumstances, i.e., `**`, `a/**`,
`**/b` or `a/**/b`. That is, `**` always had to be surrounded by either
a path separator or the beginning/end of the pattern.

It turns out that git itself has treated `**` outside the above contexts
as valid for quite a while, so there was an inconsistency between the
spec `man gitignore` and the implementation, and it wasn't clear which
was actually correct.

@okdana filed a bug against git[1] and got this fixed. The spec was wrong,
which has now been fixed[2] and updated[3].

This commit brings ripgrep in line with git and treats `**` outside of
the above contexts as two consecutive `*` patterns. We deprecate the
`InvalidRecursive` error since it is no longer used.

Fixes #373, Fixes #1098

[1] - https://public-inbox.org/git/C16A9F17-0375-42F9-90A9-A92C9F3D8BBA@dana.is
[2] - 627186d020
[3] - https://git-scm.com/docs/gitignore
2019-01-23 19:59:39 -05:00
Andrew Gallant
0a167021c3 changelog: BUG #1174 2019-01-23 19:19:26 -05:00
Andrew Gallant
aeaa5fc1b1 globset: fix repeated use of **
This fixes a bug where repeated use of ** didn't behave as it should. In
particular, each use of `**` added a new directory depth
requirement. For example, something like `**/**/b` would match
`foo/bar/b`, but it wouldn't match `foo/b` even though it should. In
particular, `**` semantics demand "infinite" depth, so repeated uses of
`**` should just coalesce as if only one was given.

We do this coalescing in the parser. It's a little tricky because we
treat `**/a`, `a/**` and `a/**/b` as distinct tokens with their own
regex conversions. We also test the crap out of it.

Fixes #1174
2019-01-23 19:15:02 -05:00
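The coalescing described in aeaa5fc1b1 above can be illustrated at the
string level. Note the assumption: the commit does this in the glob parser
over tokens, whereas this sketch simply works on `/`-separated components,
and `coalesce_recursive` is a hypothetical name.

```rust
// Collapse runs of consecutive `**` path components into a single `**`,
// so that `**/**/b` is treated the same as `**/b`.
fn coalesce_recursive(glob: &str) -> String {
    let mut parts: Vec<&str> = Vec::new();
    for part in glob.split('/') {
        if part == "**" && parts.last() == Some(&"**") {
            // Skip a `**` that directly follows another `**`.
            continue;
        }
        parts.push(part);
    }
    parts.join("/")
}
```

After coalescing, `**/**/b` becomes `**/b`, which places no minimum depth
requirement on the path and can therefore match `foo/b`.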
Andrew Gallant
7048a06c31 changelog: BUG #1173 2019-01-23 18:14:16 -05:00
Andrew Gallant
23be3cf850 ignore: fix handling of **
When deciding whether to add the `**/` prefix or not, we should choose
not to add it if the pattern is simply a bare `**`. Previously, we were
only not adding it if it was `**/`, which is correct, but we also need
to do it for `**` since `**` can already match anywhere.

There's likely a more principled solution to this, but this works for
now.

Fixes #1173
2019-01-23 18:12:35 -05:00
Andrew Gallant
b48bbf527d changelog: PR #1093 2019-01-23 17:56:18 -05:00
dana
8eabe47b57 ignore: always use literal_separator for gitignore patterns (#1093)
PR #1093
2019-01-23 17:54:28 -05:00
Michele Bologna
ff712bfd9d readme: add instructions for openSUSE 15.0
PR #1088
2019-01-22 21:46:11 -05:00
Mika Dede
a7f2d48234 printer: fix path handling in summarizer
This commit fixes a bug where both of the following commands always
reported an error:

    rg --files-with-matches foo file
    rg --files-without-match foo file

In particular, the printer was erroneously respecting the `path` option
even when the summary kind was `PathWithMatch` or `PathWithoutMatch`. The
documented behavior is that those summary kinds always require a path,
and thus, the `path` option has no effect. We fix this by correcting the
case analysis.

This also fixes a bug where the exit code for `--files-without-match`
was not set correctly. We update the printer's `has_match` method to
report the correct value.

Fixes #1106, Closes #1130
2019-01-22 21:37:23 -05:00
Andrew Gallant
57500ad013 changelog: brotli/zstd addition 2019-01-22 20:57:28 -05:00
dana
0b04553aff grep-cli: support Brotli/Zstd decompression
Fixes #1099
2019-01-22 20:56:16 -05:00
dana
1ae121122f ignore/types: add/update brotli, bzip2, gzip, xz, zstd 2019-01-22 20:56:16 -05:00
Andrew Gallant
688003e51c ripgrep: ban rustfmt 2019-01-22 20:07:26 -05:00
David Torosyan
718a00f6f2 ripgrep: add --ignore-file-case-insensitive
The --ignore-file-case-insensitive flag causes all
.gitignore/.rgignore/.ignore files to have their globs matched without
regard for case. Because this introduces a potentially significant
performance regression, this is always disabled by default. Users that
need case insensitive matching can enable it on a case by case basis.

Closes #1164, Closes #1170
2019-01-22 20:03:59 -05:00
Andrew Gallant
7cbc535d70 edition: fix build.rs 2019-01-19 10:46:57 -05:00
Andrew Gallant
7a6a40bae1 edition: move core ripgrep to Rust 2018 2019-01-19 10:44:30 -05:00
Andrew Gallant
1e9ee2cc85 deps: update memmap 2019-01-19 10:44:30 -05:00
Andrew Gallant
968491f8e9 deps: update to bytecount 0.5
bytecount now uses runtime dispatch for enabling SIMD, which means we no
longer need the avx-accel features. We remove it from ripgrep since the
next release will be a minor version bump, but leave them as no-ops for
the crates that previously used it.
2019-01-19 10:44:30 -05:00
Andrew Gallant
63b0f31a22 deps: update various dependencies
We also increase the MSRV to 1.32, the current stable release, which sets
the stage for migrating to Rust 2018.
2019-01-19 10:44:30 -05:00
P M
7ecee299a5 ignore/types: add QML
PR #1165
2019-01-18 06:48:47 -05:00
David Håsäther
dd396ff34e doc: fix typo
PR #1161
2019-01-14 06:50:30 -05:00
Andrew Gallant
fb0a82f3c3 grep-printer: add macro docs, redux 2019-01-11 09:18:09 -05:00
Andrew Gallant
dbc8ca9cc1 grep-searcher: add docs for assert_eq_printed
Looks like the deny(missing_docs) lint got a bit stronger.
2019-01-11 09:03:00 -05:00
Marco Hinz
c3db8db93d doc: fix typo 2019-01-05 11:18:05 -05:00
Andrew Gallant
17ef4c40f3 ignore-0.4.6 2018-12-30 08:46:09 -05:00
Andrew Gallant
a9e0477ea8 ignore: permit use of deprecated trim_right 2018-12-30 08:44:59 -05:00
Andrew Gallant
b3c5773266 deps: bump ignore 2018-12-30 08:43:18 -05:00
Andrew Gallant
118b950085 ignore-0.4.5 2018-12-15 08:44:10 -05:00
Andrew Gallant
b45b2f58ea deps: update most other dependencies
This commit is the result of doing:

  $ cargo update
  $ cargo update -p encoding_rs --precise 0.8.10

where the latter line prevents encoding_rs from updating to 0.8.11 (or
newer). In particular, the 0.8.11 release increased the minimum Rust
version to 1.29, whereas ripgrep 0.10.x is still on 1.28. We stay on an
older version for now until ripgrep is ready to move to 0.11.x.
2018-12-15 08:42:14 -05:00
Andrew Gallant
662a9bc73d deps: update to crossbeam-channel 0.3
This also requires corresponding updates to both rand and rand_core. Doing
an update of rand without doing an update of rand_core results in
compilation errors because two distinct versions of rand_core are included
in the build, and the traits they expose are distinct and incompatible.

We also switch over to using tempfile instead of tempdir, which drops the
last remaining thing keeping rand 0.4 in the build.

Fixes #1141, Fixes #1142
2018-12-15 08:40:04 -05:00
Andrew Gallant
401add0a99 deps: update regex and regex-syntax
This brings in some new Unicode properties, such as \p{Emoji}.

It is now also technically possible to construct a regex that recognizes
grapheme clusters.
2018-12-09 16:33:37 -05:00
Simon Morgan
f81b72721b ignore/types: add ASP
PR #1134
2018-12-07 16:19:33 -05:00
Antony Lee
1d4fccaadc ignore/types: add postscript
Although postscript/encapsulated postscript is usually thought of as a
binary format, it's actually mostly ASCII, so ripgrep will not ignore
these files.

The situation is basically the same as for pdf, which is also already
present in the list of known filetypes.

PR #1118
2018-11-23 09:46:11 -05:00
Matteo Bertini
09e464e674 ignore/types: add more Cython file types
From the [Cython file types](https://cython.readthedocs.io/en/latest/src/userguide/language_basics.html?highlight=pxi#cython-file-types) paragraph on the official docs:

> There are three file types in Cython:
>    The implementation files, carrying a .py or .pyx suffix.
>    The definition files, carrying a .pxd suffix.
>    The include files, carrying a .pxi suffix.

PR #1113
2018-11-19 07:37:00 -05:00
Jon Parise
31adff6f3c ignore/types: add Apache Thrift
PR #1102
2018-11-07 07:42:13 -05:00
Andrew Gallant
b41e596327 doc: escape braces in AsciiDoc
This commit fixes a bug where AsciiDoc would drop any line containing a
'{foo}' because it interpreted it as an undefined attribute reference:

> Simple attribute references take the form {<name>}. If the attribute name
> is defined its text value is substituted otherwise the line containing the
> reference is dropped from the output.

See: https://www.methods.co.nz/asciidoc/chunked/ch30.html

We fix this by simply replacing all occurrences of '{' and '}' with
their escaped forms: '&#123;' and '&#125;'.

Fixes #1101
2018-11-06 06:57:16 -05:00
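The replacement described in the commit above amounts to a plain string substitution. The sketch below uses a hypothetical function name, `escape_asciidoc_braces`; where the substitution happens in ripgrep's documentation build is not shown here.

```rust
/// Hypothetical helper: replace literal braces with their numeric character
/// references so AsciiDoc does not treat `{foo}` as an attribute reference.
fn escape_asciidoc_braces(text: &str) -> String {
    text.replace('{', "&#123;").replace('}', "&#125;")
}
```

The order of the two replacements does not matter, since neither escaped form contains a brace.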
Andrew Gallant
fb62266620 deps: update encoding_rs
This commit bumps the version of encoding_rs to use the latest release.
This appears to fix a panic in UTF-16 decoding.

Fixes #1089
2018-10-22 06:50:35 -04:00
Dave Lee
acf226c39d ignore/types: add BUILD.bazel to bazel file type
PR #1074
2018-10-02 18:00:04 -04:00
Mathieu Bridon
8299625e48 ignore/types: add buildstream
BuildStream is a Free Software tool for building/integrating software stacks: https://buildstream.gitlab.io/buildstream/

It uses recipes written in YAML, in files with the `.bst` extension.

PR #1071
2018-09-28 08:32:24 -04:00
Andrew Gallant
db256c87eb ripgrep: suggest -U/--multiline
When a "\n literal is not allowed" error is reported, ripgrep will now
suggest the use of the -U/--multiline flag, which enables matching
newlines.

Fixes #1055
2018-09-25 16:56:04 -04:00
Andrew Gallant
ba533f390e grep-searcher: update to encoding_rs_io 0.1.3
This update includes a work-around for a presumed bug in encoding_rs
that causes a panic:
https://github.com/hsivonen/encoding_rs/issues/34

Specifically, to reproduce this in ripgrep, one can run the following:

    $ curl -LO https://cache.ruby-lang.org/pub/ruby/2.5/ruby-2.5.1.tar.gz
    $ tar xf ruby-2.5.1.tar.gz
    $ rg ZZZZZ ruby-2.5.1/test/rexml/data/t63-2.svg
    thread 'main' panicked at 'index out of bounds: the len is 1 but the index is 1'

Fixes #1052
2018-09-25 16:56:04 -04:00
Andrew Gallant
ba503eb677 grep-regex: fix inner literal detection
It seems the inner literal detector fails spectacularly in cases of
concatenations that involve groups. The issue here is that if the prefix
of a group inside a concatenation can match the empty string, then any
literals generated to that point in the concatenation need to be cut
such that they are never extended. The detector isn't really built to
handle this case, so we just act conservatively and cut literals whenever we
see a sub-group. This may make some regexes slower, but the inner
literal detector already misses plenty of cases.

Literal detection (including in the regex engine) is a key component
that needs to be completely rethought at some point.

Fixes #1064
2018-09-25 16:56:04 -04:00
Andrew Gallant
f72c2dfd90 readme: touch up README
Make the wording consistent.
2018-09-14 11:33:56 -04:00
Sylvestre Ledru
c0aa58b4f7 Ripgrep is also available in Ubuntu (from Cosmic) 2018-09-14 08:41:05 +02:00
ykgmfq
184ee4c328 deb: add section info
Put it in the same section as
https://packages.debian.org/stretch/grep

PR #1051
2018-09-13 08:17:24 -04:00
Gabe Berke-Williams
e82fbf2c46 doc: fix typo
"cretion" -> "creation"

PR #1045
2018-09-10 06:49:48 -04:00
Andrew Gallant
eb18da0450 pcre2: use jit_if_available
This will allow PCRE2 to fall back to non-JIT matching when running on
platforms without JIT support.

ref https://github.com/BurntSushi/rust-pcre2/issues/3
2018-09-08 17:12:14 -04:00
Andrew Gallant
0f7494216f readme: update dpkg version 2018-09-08 10:46:40 -04:00
Andrew Chin
442a278635 readme: fancy regexes are not supported by default
PR #1042
2018-09-07 17:43:24 -04:00
Andrew Gallant
7ebed3ace6 pkg: update brew tap to 0.10.0 2018-09-07 14:43:59 -04:00
Andrew Gallant
8a7db1a918 ci: tweak deployment conditions 2018-09-07 14:07:52 -04:00
Andrew Gallant
ce80d794c0 changelog: add release date 2018-09-07 14:00:23 -04:00
Andrew Gallant
c5d467a2ab ci: always force PCRE2 static builds for releases 2018-09-07 14:00:23 -04:00
Andrew Gallant
a62cd553c2 ci: clean up appveyor
Remove some outdated comments and unused config. Also, make the regex for
matching tags a bit more specific.
2018-09-07 14:00:22 -04:00
Andrew Gallant
ce5188335b ci: remove 'branch' condition for deployment
Travis docs[1] say this is ignored when 'tags' is used.

[1] - https://docs.travis-ci.com/user/deployment/#conditional-releases-with-on
2018-09-07 14:00:22 -04:00
Andrew Gallant
b7a456ae83 deb: add completions
This commit adds Bash, zsh and fish completions to the Debian binary
package.

Fixes #1032
2018-09-07 14:00:22 -04:00
Andrew Gallant
d14f0b37d6 deps: update versions for all crates
I don't think every change here is needed, but this ensures we're using
the latest version of every direct dependency.
2018-09-07 14:00:22 -04:00
Andrew Gallant
3ddc3c040f deps: minor updates 2018-09-07 13:03:01 -04:00
Andrew Gallant
eeaa42ecaf scripts: add copy-examples
This is a preliminary script to copy example code from a Markdown file
into a crate's example directory.

This is intended to be used for the upcoming libripgrep guide, but we
don't commit any examples yet.
2018-09-07 12:27:48 -04:00
Andrew Gallant
3797a2a5cb simplegrep: touch up 2018-09-07 12:24:50 -04:00
Andrew Gallant
0e2f8f7b47 grep: add clap and regex dev dependencies to grep
These are (or will be) used in grep's examples.
2018-09-07 12:06:05 -04:00
Andrew Gallant
3dd4b77dfb grep-searcher: add Box<...> impl for Sink
We initially did not have this impl because the first revision of the Sink
trait was much more complicated. In particular, each method was
parameterized over a Matcher. But not every Sink impl actually needs a
Matcher, and it is just as easy to borrow a Matcher explicitly, so the
added parameterization wasn't holding its own.

This does permit Sink implementations to be used as trait objects. One
key use case here is to reduce compile times, since there is quite a bit
of code inside grep-searcher that is parameterized on Sink. Unfortunately,
that code is *also* parameterized on Matcher, and the various printers in
grep-printer are also parameterized on Matcher, which means Sink trait
objects are necessary but not sufficient for a major reduction in compile
times. Unfortunately, the path to making Matcher object safe isn't quite
clear. Extension traits maybe? There's also stuff in the Serde ecosystem
that might help, but the type shenanigans can get pretty gnarly.
2018-09-07 12:06:05 -04:00
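The forwarding impl described in the commit above can be sketched with a toy trait. The `Sink` trait below is a deliberately simplified stand-in with a single hypothetical method; it is not grep-searcher's actual trait or signature. The point is only the shape of the `Box<S>` impl, which lets a `Box<dyn Sink>` be passed wherever a generic `S: Sink` is expected.

```rust
/// Simplified stand-in for a sink trait (assumption: one method, returning
/// whether the search should continue).
trait Sink {
    fn matched(&mut self, line: &str) -> bool;
}

/// Forwarding impl: a boxed sink is itself a sink. `?Sized` is what allows
/// `Box<dyn Sink>` to qualify.
impl<S: Sink + ?Sized> Sink for Box<S> {
    fn matched(&mut self, line: &str) -> bool {
        (**self).matched(line)
    }
}

/// A trivial sink that collects every matching line.
struct Collect(Vec<String>);

impl Sink for Collect {
    fn matched(&mut self, line: &str) -> bool {
        self.0.push(line.to_string());
        true
    }
}

/// Generic search routine: reports each line containing `needle` to the sink
/// and returns the number of lines reported.
fn search_lines<S: Sink>(haystack: &str, needle: &str, mut sink: S) -> usize {
    let mut count = 0;
    for line in haystack.lines() {
        if line.contains(needle) {
            count += 1;
            if !sink.matched(line) {
                break;
            }
        }
    }
    count
}
```

Passing `Box::new(Collect(Vec::new())) as Box<dyn Sink>` to `search_lines` compiles only because of the forwarding impl. The routine is still generic over the sink type here, which mirrors the commit's caveat that trait objects alone do not remove the other type parameters.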
Andrew Gallant
3b5cdea862 doc: minor touchups to API docs 2018-09-07 12:06:05 -04:00
Andrew Gallant
54b3e9eb10 grep-printer: delete unused code 2018-09-07 12:06:05 -04:00
Andrew Gallant
56e8864426 grep-matcher: add LineTerminator::is_suffix
This centralizes the logic for checking whether a line has a line
terminator or not.
2018-09-07 12:06:04 -04:00
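As a rough illustration of the centralized check mentioned in the commit above, the sketch below defines a hypothetical `LineTerm` type. It is not grep-matcher's `LineTerminator`, whose representation and exact semantics (particularly for CRLF) may differ.

```rust
/// Hypothetical line terminator type, used only for illustration.
#[derive(Clone, Copy)]
enum LineTerm {
    /// A single terminating byte, such as b'\n' or b'\0'.
    Byte(u8),
    /// The two-byte sequence "\r\n".
    Crlf,
}

impl LineTerm {
    /// Returns true if `line` ends with this line terminator.
    fn is_suffix(self, line: &[u8]) -> bool {
        match self {
            LineTerm::Byte(b) => line.last() == Some(&b),
            LineTerm::Crlf => line.ends_with(b"\r\n"),
        }
    }
}
```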
Andrew Gallant
b8f619d16e readme: a few clarifications 2018-09-07 12:06:04 -04:00
Andrew Gallant
83dff33326 deps: update various deps 2018-09-04 23:29:22 -04:00
Andrew Gallant
003c3695f4 deps: update grep version 2018-09-04 23:29:05 -04:00
Andrew Gallant
10777c150d grep-0.2.1 2018-09-04 23:25:39 -04:00
Andrew Gallant
827179250b changelog: assign feature id 2018-09-04 23:24:22 -04:00
Andrew Gallant
fd22cd520b windows: fix unused warnings on Windows 2018-09-04 23:18:55 -04:00
Andrew Gallant
241bc8f8fc ripgrep: add --pre-glob flag
The --pre-glob flag is like the --glob flag, except it applies to filtering
files through the preprocessor instead of for search. This makes it
possible to apply the preprocessor to only a small subset of files, which
can greatly reduce the process overhead of using a preprocessor when
searching large directories.
2018-09-04 23:18:55 -04:00
Andrew Gallant
b6e30124e0 ripgrep: add --line-buffered and --block-buffered
These flags provide granular control over ripgrep's buffering strategy.
The --line-buffered flag can be genuinely useful in certain types of shell
pipelines. The --block-buffered flag has a murkier use case, but we add it
for completeness.
2018-09-04 23:18:55 -04:00
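The difference between the two buffering strategies named in the commit above can be sketched with the standard library alone. This is not ripgrep's implementation: `CountingWriter` is a hypothetical stand-in for stdout that counts how often the underlying writer is actually written to.

```rust
use std::io::{self, BufWriter, LineWriter, Write};

/// Hypothetical stand-in for stdout that counts underlying write calls.
#[derive(Default)]
struct CountingWriter {
    writes: usize,
}

impl Write for CountingWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.writes += 1;
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Line buffering: a complete line is forwarded to the underlying writer as
/// soon as its newline is written.
fn line_buffered_writes() -> io::Result<usize> {
    let mut out = LineWriter::new(CountingWriter::default());
    out.write_all(b"match 1\n")?;
    Ok(out.get_ref().writes)
}

/// Block buffering: a short line stays in the in-memory buffer until the
/// buffer fills up or is flushed.
fn block_buffered_writes() -> io::Result<usize> {
    let mut out = BufWriter::new(CountingWriter::default());
    out.write_all(b"match 1\n")?;
    Ok(out.get_ref().writes)
}
```

In a shell pipeline, the first behavior is what lets a downstream consumer see each match promptly, at the cost of more write calls.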
Andrew Gallant
4846d63539 grep-cli: introduce new grep-cli crate
This commit moves a lot of "utility" code from ripgrep core into
grep-cli. Any one of these things might not be worth creating a new
crate, but combining everything together results in a fair number of
convenience routines that make up a decent sized crate.

There is potentially more we could move into the crate, but much of what
remains in ripgrep core is almost entirely dealing with the number of
flags we support.

In the course of moving things to the grep-cli crate, we clean up
a lot of gunk and improve failure modes in a number of cases. In
particular, we've fixed a bug where other processes could deadlock if
they write too much to stderr.

Fixes #990
2018-09-04 23:18:55 -04:00
helloer
13c47530a6 ignore/types: add pascal type
PR #1036
2018-09-03 07:25:07 -04:00
Jakub Wilk
328f4369e6 doc: fix typos 2018-08-31 11:59:28 -04:00
Andrew Gallant
04518e32e7 deps: update other crates 2018-08-30 23:03:07 -04:00
Andrew Gallant
f2eaf5b977 deps: update termcolor for perf tweaks 2018-08-30 22:57:01 -04:00
Andrew Gallant
3edeeca6e9 changelog: fix typo 2018-08-29 18:46:34 -04:00
Andrew Gallant
c41b353009 changelog: update
This brings the changelog up to date with HEAD and rewords a few things.
2018-08-29 18:25:08 -04:00
Aaron Power
d18839f3dc ignore: add into_path for DirEntry (#1031)
This commit adds ignore::DirEntry::into_path to match
the corresponding method on walkdir::DirEntry.
2018-08-28 18:27:34 -04:00
Andrew Gallant
8f978a3cf7 doc: clarify and fix typo
Clarify that --byte-offset may be wrong if the source isn't being read
directly.

Also tweak the README a bit. And remove a damned Oxford comma.
2018-08-27 21:21:37 -04:00
Andrew Gallant
87b745454d ripgrep: use 'ignore' for skipping stdout
This removes ripgrep-specific code for filtering files that correspond to
stdout and instead uses the 'ignore' crate's functionality for doing the
same.
2018-08-27 21:18:53 -04:00
Andrew Gallant
e5bb750995 ignore: add 'stdout' skipping to the walker
This commit adds a new 'skip_stdout' option to the directory walker. When
enabled, it will skip yielding any directory entries that are believed to
correspond to stdout for the current process. This is useful for filtering
out 'results' in a command like 'grep -r foo > results' in order to avoid
an unbounded feedback mechanism.
2018-08-27 21:18:53 -04:00
dana
d599f0b3c7 complete: don't complete bare pattern after -f 2018-08-27 07:56:40 -04:00
Andrew Gallant
40e310a9f9 ripgrep: add --sort and --sortr flags
These flags each accept one of five choices: none, path, modified,
accessed or created. The value indicates how the results are sorted.
For --sort, results are sorted in ascending order whereas for --sortr,
results are sorted in descending order.

Closes #404
2018-08-26 18:42:25 -04:00
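The ascending versus descending distinction in the commit above can be sketched for the `path` choice only. `sort_paths` is a hypothetical helper, not ripgrep's code, and the other choices (modified, accessed, created) would need file metadata that this sketch leaves out.

```rust
use std::path::PathBuf;

/// Hypothetical helper: sort paths in ascending order, or in descending order
/// when `descending` is true (the assumed analogue of --sortr path).
fn sort_paths(mut paths: Vec<PathBuf>, descending: bool) -> Vec<PathBuf> {
    paths.sort();
    if descending {
        paths.reverse();
    }
    paths
}
```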
Andrew Gallant
510f15f4da ignore: add sort_by_file_path builder method
This permits callers to sort entries by their full file path, which makes
it easy to query for various file statistics.

It would have been better to provide a comparator on DirEntry itself,
similar to how walkdir does it, but this seems to require quite a bit of
work to make the types work out, assuming we want to continue to use
walkdir's sorting support (we do).
2018-08-26 18:42:25 -04:00
Andrew Gallant
f9ce7a84a8 ignore: add 'same_file_system' option
This commit adds a 'same_file_system' option to the walk builder. For
single threaded walking, it defers to the walkdir crate, which has the
same option. The bulk of this commit implements this flag for the parallel
walker. We add one very feeble test for this.

The parallel walker is now officially a complete mess.

Closes #321
2018-08-26 18:42:25 -04:00
Andrew Gallant
1b6089674e deps: more updates 2018-08-26 18:42:25 -04:00
Andrew Gallant
05a0389555 ripgrep: use winapi-util for stdin_is_readable 2018-08-25 00:30:15 -04:00
Andrew Gallant
16353bad6e deps: update various deps
This includes a new crate, winapi-util, that is now used in wincolor,
walkdir and same-file.
2018-08-25 00:19:40 -04:00
Tim Kilbourn
fe442de091 changelog: fix typo
Fuchsia is a pain to spell.

PR #1026
2018-08-23 13:17:27 -04:00
Andrew Gallant
1bb8b7170f doc: clarify use of SIMD features
You need a nightly compiler.

Ref #188
2018-08-23 09:56:37 -04:00
Andrew Gallant
55ed698a98 deps: update walkdir minimum version
We'll want to be using the new `same_file_system` option soon.
2018-08-23 09:54:45 -04:00
Andrew Gallant
f1e025873f deps: update dependencies
This includes an update to walkdir 2.2.2, which includes a
`same_file_system` option.
2018-08-22 20:50:24 -04:00
Andrew Gallant
033ad2b8e4 deps: update clap
Update clap to the latest version.

Also, drop the ansi_term dependency by disabling color output in clap's
error messages.
2018-08-21 23:10:34 -04:00
Andrew Gallant
098a8ee843 deps: various patch upgrades 2018-08-21 23:05:52 -04:00
Andrew Gallant
2f3dbf5fee ignore: fix false positive in path_is_symlink
This commit fixes a bug where the first path always reported itself as
a symlink via `path_is_symlink`.

Part of this fix includes updating walkdir to 2.2.1, which also includes
a corresponding bug fix.

Fixes #984
2018-08-21 23:05:52 -04:00
Andrew Gallant
5c80e4adb6 release: better support for binary Debian package
This commit beefs up the package metadata used by the 'cargo deb' tool to
produce a binary dpkg. In particular, we now include ripgrep's man page.

This commit includes a new script, 'ci/build_deb.sh', which will handle
the build process for a dpkg, which has become a bit more nuanced than
just running 'cargo deb'. We don't (yet) run this script in CI.

Fixes #842
2018-08-21 23:05:52 -04:00
Andrew Gallant
fcd1853031 doc: update ripgrep's description
This now mentions PCRE2 support.
2018-08-21 23:05:52 -04:00
Andrew Gallant
74a89be641 grep-printer: fix bug in printing truncated lines
When emitting color, the printer wasn't checking whether the line
exceeded the maximum allowed length.
2018-08-21 23:05:52 -04:00
Andrew Gallant
5b1ce8bdc2 tests: touch up tests on Windows
This fixes warnings and adds an additional invalid UTF-8 test that will
run on Windows.
2018-08-21 23:05:52 -04:00
Andrew Gallant
1529ce3341 ripgrep: remove workaround for std bug
This commit undoes a work-around for a bug in Rust's standard library
that prevented correct file type detection on Windows in OneDrive
directories. We remove the work-around because we are moving to a
latest-stable Rust version policy, which has included this fix for a while
now.

ref #705, https://github.com/rust-lang/rust/issues/46484
2018-08-21 23:05:52 -04:00
Andrew Gallant
95a4f15916 ignore: clarify docs for DirEntry::error
Fixes #953
2018-08-21 23:05:52 -04:00
Andrew Gallant
0eef05142a ripgrep: move minimum version to Rust stable
This also updates some code to make use of our more liberal versioning
requirement, including the use of crossbeam-channel instead of the MsQueue
from the older and unmaintained crossbeam 0.3. This does regrettably add
a sizable number of dependencies, however, compile times seem mostly
unaffected.

Closes #1019
2018-08-21 23:05:52 -04:00
Andrew Gallant
edd6eb4e06 ripgrep: make --no-pcre2-unicode the canonical flag
Previously, we used --pcre2-unicode as the canonical flag despite the
fact that it is enabled by default, which is inconsistent with how we
handle other similar flags.

The reason why --pcre2-unicode was made the canonical flag was to make
it easier to discover since it would be sorted near the --pcre2 flag. To
solve that problem, we simply start a convention that lists related
flags in the docs.

Fixes #1022
2018-08-21 23:05:52 -04:00
Andrew Gallant
7ac9782970 doc: fix typo 2018-08-20 18:00:14 -04:00
Andrew Gallant
180054d7dc doc: caveats 2018-08-20 17:58:29 -04:00
Andrew Gallant
7eaaa04c69 ripgrep: small cleanups 2018-08-20 17:34:45 -04:00
Andrew Gallant
87a627631c doc: add section on PCRE2 performance 2018-08-20 17:34:45 -04:00
Andrew Gallant
9df60e164e deps: update other dependencies to latest 2018-08-20 17:34:45 -04:00
Andrew Gallant
afa06c518a deps: update libripgrep crate versions
This prepares them for an initial 0.1.0 release.
2018-08-20 17:34:45 -04:00
Andy Freeland
e46aeb34f8 ignore/types: add .mako and .mao for Mako templates
I've personally never seen `.mao`, but GitHub includes it in Linguist: 
4f11062304/lib/linguist/languages.yml (L2702-L2709)
2018-08-20 15:26:49 -04:00
dana
d8f187e990 complete: add completion reference guide 2018-08-20 11:53:19 -04:00
dana
7d93d2ab05 ripgrep: add --no-multiline-dotall 2018-08-20 07:50:00 -04:00
dana
9ca2d68e94 ripgrep: fix typos in option descriptions 2018-08-20 07:50:00 -04:00
dana
60b0e3ff80 complete: update wording, exclusion, &c. 2018-08-20 07:50:00 -04:00
dana
3a1c081c13 test_complete: match certain long options in description bodies 2018-08-20 07:50:00 -04:00
Andrew Gallant
d5c0b03030 changelog: massive update for libripgrep
This commit updates the CHANGELOG to reflect all the work done to make
libripgrep a reality.

* Closes #162 (libripgrep)
* Closes #176 (multiline search)
* Closes #188 (opt-in PCRE2 support)
* Closes #244 (JSON output)
* Closes #416 (Windows CRLF support)
* Closes #917 (trim prefix whitespace)
* Closes #993 (add --null-data flag)
* Closes #997 (--passthru works with --replace)

* Fixes #2 (memory maps and context handling work)
* Fixes #200 (ripgrep stops when pipe is closed)
* Fixes #389 (more intuitive `-w/--word-regexp`)
* Fixes #643 (detection of stdin on Windows is better)
* Fixes #441, Fixes #690, Fixes #980 (empty matching lines are weird)
* Fixes #764 (coalesce color escapes)
* Fixes #922 (memory maps failing is no big deal)
* Fixes #937 (color escapes no longer used for empty matches)
* Fixes #940 (--passthru does not impact exit status)
* Fixes #1013 (show runtime CPU features in --version output)
2018-08-20 07:10:19 -04:00
Andrew Gallant
eb184d7711 tests: re-tool integration tests
This basically rewrites every integration test. We reduce the amount of
magic involved here in terms of which arguments are being passed to
ripgrep processes. To make up for the boiler plate saved by the magic,
we make the Dir (formerly WorkDir) type a bit nicer to use, along with a
new TestCommand that wraps a std::process::Command. In exchange, we get
tests that are easier to read and write.

We also run every test with the `--pcre2` flag to make sure that works,
when PCRE2 is available.
2018-08-20 07:10:19 -04:00
Andrew Gallant
bb110c1ebe ripgrep: migrate to libripgrep
This commit does the work to delete the old `grep` crate and effectively
rewrite most of ripgrep core to use the new libripgrep crates. The new
`grep` crate is now a facade that collects the various crates that make
up libripgrep.

The most complex part of ripgrep core is now arguably the translation
between command line parameters and the library options, which is
ultimately where we want to be.
2018-08-20 07:10:19 -04:00
Andrew Gallant
d9ca529356 libripgrep: initial commit introducing libripgrep
libripgrep is not any one library, but rather, a collection of libraries
that roughly separate the following key distinct phases in a grep
implementation:

  1. Pattern matching (e.g., by a regex engine).
  2. Searching a file using a pattern matcher.
  3. Printing results.

Ultimately, both (1) and (3) are defined by de-coupled interfaces, of
which there may be multiple implementations. Namely, (1) is satisfied by
the `Matcher` trait in the `grep-matcher` crate and (3) is satisfied by
the `Sink` trait in the `grep2` crate. The searcher (2) ties everything
together and finds results using a matcher and reports those results
using a `Sink` implementation.

Closes #162
2018-08-20 07:10:19 -04:00
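The three phases listed in the commit above can be sketched with toy traits. Everything below is an assumption made for illustration: the trait names mirror the commit message, but the method signatures are invented and far simpler than those of the real libripgrep crates.

```rust
/// Phase 1 (toy version): pattern matching. Returns the byte range of the
/// first match in a line, if any.
trait Matcher {
    fn find(&self, line: &str) -> Option<(usize, usize)>;
}

/// Phase 3 (toy version): receiving results.
trait Sink {
    fn matched(&mut self, line_number: usize, line: &str);
}

/// A matcher for a literal substring.
struct Literal(String);

impl Matcher for Literal {
    fn find(&self, line: &str) -> Option<(usize, usize)> {
        line.find(&self.0).map(|start| (start, start + self.0.len()))
    }
}

/// A sink that formats each match as `line_number:line`.
struct Lines(Vec<String>);

impl Sink for Lines {
    fn matched(&mut self, line_number: usize, line: &str) {
        self.0.push(format!("{}:{}", line_number, line));
    }
}

/// Phase 2 (toy version): the searcher ties a matcher and a sink together.
fn search<M: Matcher, S: Sink>(matcher: &M, haystack: &str, sink: &mut S) {
    for (index, line) in haystack.lines().enumerate() {
        if matcher.find(line).is_some() {
            sink.matched(index + 1, line);
        }
    }
}
```

Because the searcher only depends on the two traits, a different regex engine or a different output format can be swapped in without touching the search loop, which is the decoupling the commit message describes.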
Sylvestre Ledru
0958837ee1 readme: ripgrep is available in Debian Buster
PR #1016
2018-08-17 06:35:43 -04:00
Andrew Gallant
94be3bd4bb grep: remove senseless test
It was pulling in a sizable data file and doesn't appear to be testing
anything meaningful that isn't covered by a variety of other tests.
2018-08-15 19:52:50 -04:00
woky
deb1de6e1e ignore/types: add *.sbt to scala type
Sbt is currently the most used Scala build tool, which uses
*.sbt files, which are basically Scala.

PR #1010
2018-08-14 06:29:27 -07:00
Vanessa McHale
6afdf15d85 ignore/types: add Idris, Dhall and ATS
And also improve Haskell detection.

PR #1007
2018-08-07 13:10:19 -04:00
Jonatan Hamberg
6cda7b24e9 readme: update debian link to 0.9.0
PR #1006
2018-08-07 07:50:08 -04:00
llogiq
ad9befbc1d deps: update bytecount to 0.3.2
PR #1003
2018-08-06 06:44:16 -04:00
Andrew Gallant
e86d3d95c2 pkg: update brew tap to 0.9.0 2018-08-03 17:04:36 -04:00
Andrew Gallant
6799dcfc0e release: 0.9.0 2018-08-03 16:13:31 -04:00
217 changed files with 52999 additions and 27598 deletions

21
.cargo/config.toml Normal file

@@ -0,0 +1,21 @@
# On Windows MSVC, statically link the C runtime so that the resulting EXE does
# not depend on the vcruntime DLL.
#
# See: https://github.com/BurntSushi/ripgrep/pull/1613
[target.x86_64-pc-windows-msvc]
rustflags = ["-C", "target-feature=+crt-static"]
[target.i686-pc-windows-msvc]
rustflags = ["-C", "target-feature=+crt-static"]
# Do the same for MUSL targets. At the time of writing (2023-10-23), this is
# the default. But the plan is for the default to change to dynamic linking.
# The whole point of MUSL with respect to ripgrep is to create a fully
# statically linked executable.
#
# See: https://github.com/rust-lang/compiler-team/issues/422
# See: https://github.com/rust-lang/compiler-team/issues/422#issuecomment-812135847
[target.x86_64-unknown-linux-musl]
rustflags = [
"-C", "target-feature=+crt-static",
"-C", "link-self-contained=yes",
]

101
.github/ISSUE_TEMPLATE/bug_report.yml vendored Normal file

@@ -0,0 +1,101 @@
name: Bug Report
description: An issue with ripgrep or any of its crates (ignore, globset, etc.).
body:
  - type: markdown
    attributes:
      value: |
        Please review the following common issues before filing a bug. You may also be interested in reading the [FAQ](https://github.com/BurntSushi/ripgrep/blob/master/FAQ.md)
        and the [user guide](https://github.com/BurntSushi/ripgrep/blob/master/GUIDE.md).
        * Unable to search for text with leading dash/hyphen: This is not a bug. Use `rg -- -mytext` or `rg -e -mytext`. See #102, #215, #624.
        * Unable to build with old version of Rust. This is not a bug. ripgrep tracks the latest stable release of Rust. See #1019, #1433, #2534.
        * ripgrep package is broken or out of date. ripgrep's author does not maintain packages for Red Hat, Ubuntu, Arch, Homebrew, WinGet, etc. If you have an issue with one of these, please contact your package maintainer. See #1637, #2264, #2459.
  - type: checkboxes
    id: issue-not-common
    attributes:
      label: Please tick this box to confirm you have reviewed the above.
      options:
        - label: I have a different issue.
          required: true
  - type: textarea
    id: ripgrep-version
    attributes:
      label: What version of ripgrep are you using?
      description: Enter the output of `rg --version`.
      placeholder: ex. ripgrep 13.0.0
    validations:
      required: true
  - type: textarea
    id: install-method
    attributes:
      label: How did you install ripgrep?
      description: |
        If you installed ripgrep with snap and are getting strange file permission or file not found errors, then please do not file a bug. Instead, use one of the GitHub binary releases.
        Please report any other issues with downstream ripgrep packages to their respective maintainers as mentioned above.
      placeholder: ex. Cargo, APT, Homebrew
    validations:
      required: true
  - type: textarea
    id: operating-system
    attributes:
      label: What operating system are you using ripgrep on?
      description: Enter the name and version of your operating system.
      placeholder: ex. Debian 12.0, macOS 13.4.1
    validations:
      required: true
  - type: textarea
    id: description
    attributes:
      label: Describe your bug.
      description: Give a high level description of the bug.
      placeholder: ex. ripgrep fails to return the expected matches when...
    validations:
      required: true
  - type: textarea
    id: steps-to-reproduce
    attributes:
      label: What are the steps to reproduce the behavior?
      description: |
        If possible, please include both your search patterns and the corpus on which you are searching. Unless the bug is very obvious, then it is unlikely that it will be fixed if the ripgrep maintainers cannot reproduce it.
        If the corpus is too big and you cannot decrease its size, file the bug anyway and the ripgrep maintainers will help figure out next steps.
      placeholder: >
        ex. Run `rg bar` in a directory containing a file with the lines 'bar' and 'barbaz'
    validations:
      required: true
  - type: textarea
    id: actual-behavior
    attributes:
      label: What is the actual behavior?
      description: |
        Show the command you ran and the actual output. **Include the `--debug` flag in your invocation of ripgrep.**
        If the output is large, put it in a gist: <https://gist.github.com/>
        If the output is small, put it in code fences (see placeholder text).
      placeholder: |
        ex.
        ```
        $ rg --debug bar
        DEBUG|grep_regex::literal|crates/regex/src/literal.rs:58: literal prefixes detected: Literals { lits: [Complete(bar)], limit_size: 250, limit_class: 10 }
        ...
        ```
    validations:
      required: true
  - type: textarea
    id: expected-behavior
    attributes:
      label: What is the expected behavior?
      description: What do you think ripgrep should have done?
      placeholder: ex. ripgrep should have returned 2 matches
    validations:
      required: true

6
.github/ISSUE_TEMPLATE/config.yml vendored Normal file

@@ -0,0 +1,6 @@
blank_issues_enabled: true
contact_links:
  - name: Ask a question
    about: |
      You've come to seek help or want to discuss something related to ripgrep.
    url: https://github.com/BurntSushi/ripgrep/discussions/new

View File

@@ -0,0 +1,20 @@
---
name: Feature request
about: Suggest a new feature for ripgrep
title: ''
labels: ''
assignees: ''
---
#### Describe your feature request
Please describe the behavior you want and the motivation. Please also provide
examples of how ripgrep would be used if your feature request were added.
If you're not sure what to write here, then try imagining what the ideal
documentation of your new feature would look like in ripgrep's man page. Then
try to write it.
If you're requesting the addition or change of default file types, please open
a PR. We can discuss it there if necessary.

217
.github/workflows/ci.yml vendored Normal file

@@ -0,0 +1,217 @@
name: ci
on:
  pull_request:
  push:
    branches:
    - master
  schedule:
  - cron: '00 01 * * *'
# The section is needed to drop write-all permissions that are granted on
# `schedule` event. By specifying any permission explicitly all others are set
# to none. By using the principle of least privilege the damage a compromised
# workflow can do (because of an injection or compromised third party tool or
# action) is restricted. Currently the workflow doesn't need any additional
# permission except for pulling the code. Adding labels to issues, commenting
# on pull-requests, etc. may need additional permissions:
#
# Syntax for this section:
# https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#permissions
#
# Reference for how to assign permissions on a job-by-job basis:
# https://docs.github.com/en/actions/using-jobs/assigning-permissions-to-jobs
#
# Reference for available permissions that we can enable if needed:
# https://docs.github.com/en/actions/security-guides/automatic-token-authentication#permissions-for-the-github_token
permissions:
  # to fetch code (actions/checkout)
  contents: read
jobs:
  test:
    name: test
    env:
      # For some builds, we use cross to test on 32-bit and big-endian
      # systems.
      CARGO: cargo
      # When CARGO is set to CROSS, this is set to `--target matrix.target`.
      # Note that we only use cross on Linux, so setting a target on a
      # different OS will just use normal cargo.
      TARGET_FLAGS:
      # When CARGO is set to CROSS, TARGET_DIR includes matrix.target.
      TARGET_DIR: ./target
      # Bump this as appropriate. We pin to a version to make sure CI
      # continues to work as cross releases in the past have broken things
      # in subtle ways.
      CROSS_VERSION: v0.2.5
      # Emit backtraces on panics.
      RUST_BACKTRACE: 1
    runs-on: ${{ matrix.os }}
    strategy:
      fail-fast: false
      matrix:
        include:
        - build: pinned
          os: ubuntu-latest
          rust: 1.74.0
        - build: stable
          os: ubuntu-latest
          rust: stable
        - build: beta
          os: ubuntu-latest
          rust: beta
        - build: nightly
          os: ubuntu-latest
          rust: nightly
        - build: stable-musl
          os: ubuntu-latest
          rust: stable
          target: x86_64-unknown-linux-musl
        - build: stable-x86
          os: ubuntu-latest
          rust: stable
          target: i686-unknown-linux-gnu
        - build: stable-aarch64
          os: ubuntu-latest
          rust: stable
          target: aarch64-unknown-linux-gnu
        - build: stable-arm-gnueabihf
          os: ubuntu-latest
          rust: stable
          target: armv7-unknown-linux-gnueabihf
        - build: stable-arm-musleabihf
          os: ubuntu-latest
          rust: stable
          target: armv7-unknown-linux-musleabihf
        - build: stable-arm-musleabi
          os: ubuntu-latest
          rust: stable
          target: armv7-unknown-linux-musleabi
        - build: stable-powerpc64
          os: ubuntu-latest
          rust: stable
          target: powerpc64-unknown-linux-gnu
        - build: stable-s390x
          os: ubuntu-latest
          rust: stable
          target: s390x-unknown-linux-gnu
        - build: macos
          os: macos-latest
          rust: nightly
        - build: win-msvc
          os: windows-2022
          rust: nightly
        - build: win-gnu
          os: windows-2022
          rust: nightly-x86_64-gnu
    steps:
    - name: Checkout repository
      uses: actions/checkout@v4
    - name: Install packages (Ubuntu)
if: matrix.os == 'ubuntu-latest'
run: |
ci/ubuntu-install-packages
- name: Install Rust
uses: dtolnay/rust-toolchain@master
with:
toolchain: ${{ matrix.rust }}
- name: Use Cross
if: matrix.os == 'ubuntu-latest' && matrix.target != ''
run: |
# In the past, new releases of 'cross' have broken CI. So for now, we
# pin it. We also use their pre-compiled binary releases because cross
# has over 100 dependencies and takes a bit to compile.
dir="$RUNNER_TEMP/cross-download"
mkdir "$dir"
echo "$dir" >> $GITHUB_PATH
cd "$dir"
curl -LO "https://github.com/cross-rs/cross/releases/download/$CROSS_VERSION/cross-x86_64-unknown-linux-musl.tar.gz"
tar xf cross-x86_64-unknown-linux-musl.tar.gz
echo "CARGO=cross" >> $GITHUB_ENV
echo "TARGET_FLAGS=--target ${{ matrix.target }}" >> $GITHUB_ENV
echo "TARGET_DIR=./target/${{ matrix.target }}" >> $GITHUB_ENV
- name: Show command used for Cargo
run: |
echo "cargo command is: ${{ env.CARGO }}"
echo "target flag is: ${{ env.TARGET_FLAGS }}"
echo "target dir is: ${{ env.TARGET_DIR }}"
- name: Build ripgrep and all crates
run: ${{ env.CARGO }} build --verbose --workspace ${{ env.TARGET_FLAGS }}
- name: Build ripgrep with PCRE2
run: ${{ env.CARGO }} build --verbose --workspace --features pcre2 ${{ env.TARGET_FLAGS }}
# This is useful for debugging problems when the expected build artifacts
# (like shell completions and man pages) aren't generated.
- name: Show build.rs stderr
shell: bash
run: |
set +x
stderr="$(find "${{ env.TARGET_DIR }}/debug" -name stderr -print0 | xargs -0 ls -t | head -n1)"
if [ -s "$stderr" ]; then
echo "===== $stderr ===== "
cat "$stderr"
echo "====="
fi
set -x
- name: Run tests with PCRE2 (sans cross)
if: matrix.target == ''
run: ${{ env.CARGO }} test --verbose --workspace --features pcre2 ${{ env.TARGET_FLAGS }}
- name: Run tests without PCRE2 (with cross)
# These tests should actually work, but they almost double the runtime.
# Every integration test spins up qemu to run 'rg', and when PCRE2 is
# enabled, every integration test is run twice: once with the default
# regex engine and once with PCRE2.
if: matrix.target != ''
run: ${{ env.CARGO }} test --verbose --workspace ${{ env.TARGET_FLAGS }}
- name: Test zsh shell completions (Unix, sans cross)
# We could test this when using Cross, but we'd have to execute the
# 'rg' binary (done in test-complete) with qemu, which is a pain and
# doesn't really gain us much. If shell completion works in one place,
# it probably works everywhere.
if: matrix.target == '' && matrix.os != 'windows-2022'
shell: bash
run: ci/test-complete
- name: Print hostname detected by grep-cli crate
shell: bash
run: ${{ env.CARGO }} test --manifest-path crates/cli/Cargo.toml ${{ env.TARGET_FLAGS }} --lib print_hostname -- --nocapture
- name: Print available short flags
shell: bash
run: ${{ env.CARGO }} test --bin rg ${{ env.TARGET_FLAGS }} flags::defs::tests::available_shorts -- --nocapture
rustfmt:
runs-on: ubuntu-latest
steps:
- name: Checkout repository
uses: actions/checkout@v4
- name: Install Rust
uses: dtolnay/rust-toolchain@master
with:
toolchain: stable
components: rustfmt
- name: Check formatting
run: cargo fmt --all --check
docs:
runs-on: ubuntu-latest
steps:
- name: Checkout repository
uses: actions/checkout@v4
- name: Install Rust
uses: dtolnay/rust-toolchain@master
with:
toolchain: stable
- name: Check documentation
env:
RUSTDOCFLAGS: -D warnings
run: cargo doc --no-deps --document-private-items --workspace
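
The `Use Cross` step in the `test` job above comes down to one decision: use `cross` only on Ubuntu runners with a non-empty target, and plain `cargo` everywhere else. A minimal sketch of that decision as a standalone shell function follows. The name `select_cargo` is hypothetical and is not part of the workflow; the values it prints mirror what the step writes to `$GITHUB_ENV`.

```shell
# Hypothetical helper mirroring the 'Use Cross' step. Given a runner OS and a
# (possibly empty) target triple, print CARGO, TARGET_FLAGS and TARGET_DIR as
# KEY=VALUE lines.
select_cargo() {
  _os="$1"
  _target="${2:-}"
  if [ "$_os" = "ubuntu-latest" ] && [ -n "$_target" ]; then
    # Cross-compilation: build artifacts land under ./target/<triple>.
    printf 'CARGO=cross\n'
    printf 'TARGET_FLAGS=--target %s\n' "$_target"
    printf 'TARGET_DIR=./target/%s\n' "$_target"
  else
    # Native build: the workflow defaults are left untouched.
    printf 'CARGO=cargo\n'
    printf 'TARGET_FLAGS=\n'
    printf 'TARGET_DIR=./target\n'
  fi
}
```

Calling `select_cargo ubuntu-latest s390x-unknown-linux-gnu` prints the `cross` settings, while `select_cargo macos-latest ""` prints the plain `cargo` defaults.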

371
.github/workflows/release.yml vendored Normal file

@@ -0,0 +1,371 @@
name: release
# Only do the release on x.y.z tags.
on:
push:
tags:
- "[0-9]+.[0-9]+.[0-9]+"
# We need this to be able to create releases.
permissions:
contents: write
jobs:
# The create-release job runs purely to initialize the GitHub release itself,
# and names the release after the `x.y.z` tag that was pushed. It's separate
# from building the release so that we only create the release once.
create-release:
name: create-release
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Get the release version from the tag
if: env.VERSION == ''
run: echo "VERSION=${{ github.ref_name }}" >> $GITHUB_ENV
- name: Show the version
run: |
echo "version is: $VERSION"
- name: Check that tag version and Cargo.toml version are the same
shell: bash
run: |
if ! grep -q "version = \"$VERSION\"" Cargo.toml; then
echo "version does not match Cargo.toml" >&2
exit 1
fi
- name: Create GitHub release
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: gh release create $VERSION --draft --verify-tag --title $VERSION
outputs:
version: ${{ env.VERSION }}
build-release:
name: build-release
needs: ['create-release']
runs-on: ${{ matrix.os }}
env:
# For some builds, we use cross to test on 32-bit and big-endian
# systems.
CARGO: cargo
# When CARGO is set to CROSS, this is set to `--target matrix.target`.
TARGET_FLAGS:
# When CARGO is set to CROSS, TARGET_DIR includes matrix.target.
TARGET_DIR: ./target
# Bump this as appropriate. We pin to a version to make sure CI
# continues to work as cross releases in the past have broken things
# in subtle ways.
CROSS_VERSION: v0.2.5
# Emit backtraces on panics.
RUST_BACKTRACE: 1
# Build static releases with PCRE2.
PCRE2_SYS_STATIC: 1
strategy:
fail-fast: false
matrix:
include:
- build: linux
os: ubuntu-latest
rust: nightly
target: x86_64-unknown-linux-musl
strip: x86_64-linux-musl-strip
- build: stable-x86
os: ubuntu-latest
rust: stable
target: i686-unknown-linux-gnu
strip: x86_64-linux-gnu-strip
qemu: i386
- build: stable-aarch64
os: ubuntu-latest
rust: stable
target: aarch64-unknown-linux-gnu
strip: aarch64-linux-gnu-strip
qemu: qemu-aarch64
- build: stable-arm-gnueabihf
os: ubuntu-latest
rust: stable
target: armv7-unknown-linux-gnueabihf
strip: arm-linux-gnueabihf-strip
qemu: qemu-arm
- build: stable-arm-musleabihf
os: ubuntu-latest
rust: stable
target: armv7-unknown-linux-musleabihf
strip: arm-linux-musleabihf-strip
qemu: qemu-arm
- build: stable-arm-musleabi
os: ubuntu-latest
rust: stable
target: armv7-unknown-linux-musleabi
strip: arm-linux-musleabi-strip
qemu: qemu-arm
- build: stable-powerpc64
os: ubuntu-latest
rust: stable
target: powerpc64-unknown-linux-gnu
strip: powerpc64-linux-gnu-strip
qemu: qemu-ppc64
- build: stable-s390x
os: ubuntu-latest
rust: stable
target: s390x-unknown-linux-gnu
strip: s390x-linux-gnu-strip
qemu: qemu-s390x
- build: macos
os: macos-latest
rust: nightly
target: x86_64-apple-darwin
- build: win-msvc
os: windows-latest
rust: nightly
target: x86_64-pc-windows-msvc
- build: win-gnu
os: windows-latest
rust: nightly-x86_64-gnu
target: x86_64-pc-windows-gnu
- build: win32-msvc
os: windows-latest
rust: nightly
target: i686-pc-windows-msvc
steps:
- name: Checkout repository
uses: actions/checkout@v4
- name: Install packages (Ubuntu)
if: matrix.os == 'ubuntu-latest'
shell: bash
run: |
ci/ubuntu-install-packages
- name: Install Rust
uses: dtolnay/rust-toolchain@master
with:
toolchain: ${{ matrix.rust }}
target: ${{ matrix.target }}
- name: Use Cross
if: matrix.os == 'ubuntu-latest' && matrix.target != ''
shell: bash
run: |
# In the past, new releases of 'cross' have broken CI. So for now, we
# pin it. We also use their pre-compiled binary releases because cross
# has over 100 dependencies and takes a bit to compile.
dir="$RUNNER_TEMP/cross-download"
mkdir "$dir"
echo "$dir" >> $GITHUB_PATH
cd "$dir"
curl -LO "https://github.com/cross-rs/cross/releases/download/$CROSS_VERSION/cross-x86_64-unknown-linux-musl.tar.gz"
tar xf cross-x86_64-unknown-linux-musl.tar.gz
echo "CARGO=cross" >> $GITHUB_ENV
- name: Set target variables
shell: bash
run: |
echo "TARGET_FLAGS=--target ${{ matrix.target }}" >> $GITHUB_ENV
echo "TARGET_DIR=./target/${{ matrix.target }}" >> $GITHUB_ENV
- name: Show command used for Cargo
shell: bash
run: |
echo "cargo command is: ${{ env.CARGO }}"
echo "target flag is: ${{ env.TARGET_FLAGS }}"
echo "target dir is: ${{ env.TARGET_DIR }}"
- name: Build release binary
shell: bash
run: |
${{ env.CARGO }} build --verbose --release --features pcre2 ${{ env.TARGET_FLAGS }}
if [ "${{ matrix.os }}" = "windows-latest" ]; then
bin="target/${{ matrix.target }}/release/rg.exe"
else
bin="target/${{ matrix.target }}/release/rg"
fi
echo "BIN=$bin" >> $GITHUB_ENV
- name: Strip release binary (macos)
if: matrix.os == 'macos-latest'
shell: bash
run: strip "$BIN"
- name: Strip release binary (cross)
if: env.CARGO == 'cross'
shell: bash
run: |
docker run --rm -v \
"$PWD/target:/target:Z" \
"ghcr.io/cross-rs/${{ matrix.target }}:main" \
"${{ matrix.strip }}" \
"/$BIN"
- name: Determine archive name
shell: bash
run: |
version="${{ needs.create-release.outputs.version }}"
echo "ARCHIVE=ripgrep-$version-${{ matrix.target }}" >> $GITHUB_ENV
- name: Creating directory for archive
shell: bash
run: |
mkdir -p "$ARCHIVE"/{complete,doc}
cp "$BIN" "$ARCHIVE"/
cp {README.md,COPYING,UNLICENSE,LICENSE-MIT} "$ARCHIVE"/
cp {CHANGELOG.md,FAQ.md,GUIDE.md} "$ARCHIVE"/doc/
- name: Generate man page and completions (no emulation)
if: matrix.qemu == ''
shell: bash
run: |
"$BIN" --version
"$BIN" --generate complete-bash > "$ARCHIVE/complete/rg.bash"
"$BIN" --generate complete-fish > "$ARCHIVE/complete/rg.fish"
"$BIN" --generate complete-powershell > "$ARCHIVE/complete/_rg.ps1"
"$BIN" --generate complete-zsh > "$ARCHIVE/complete/_rg"
"$BIN" --generate man > "$ARCHIVE/doc/rg.1"
- name: Generate man page and completions (emulation)
if: matrix.qemu != ''
shell: bash
run: |
docker run --rm -v \
"$PWD/target:/target:Z" \
"ghcr.io/cross-rs/${{ matrix.target }}:main" \
"${{ matrix.qemu }}" "/$BIN" --version
docker run --rm -v \
"$PWD/target:/target:Z" \
"ghcr.io/cross-rs/${{ matrix.target }}:main" \
"${{ matrix.qemu }}" "/$BIN" \
--generate complete-bash > "$ARCHIVE/complete/rg.bash"
docker run --rm -v \
"$PWD/target:/target:Z" \
"ghcr.io/cross-rs/${{ matrix.target }}:main" \
"${{ matrix.qemu }}" "/$BIN" \
--generate complete-fish > "$ARCHIVE/complete/rg.fish"
docker run --rm -v \
"$PWD/target:/target:Z" \
"ghcr.io/cross-rs/${{ matrix.target }}:main" \
"${{ matrix.qemu }}" "/$BIN" \
--generate complete-powershell > "$ARCHIVE/complete/_rg.ps1"
docker run --rm -v \
"$PWD/target:/target:Z" \
"ghcr.io/cross-rs/${{ matrix.target }}:main" \
"${{ matrix.qemu }}" "/$BIN" \
--generate complete-zsh > "$ARCHIVE/complete/_rg"
docker run --rm -v \
"$PWD/target:/target:Z" \
"ghcr.io/cross-rs/${{ matrix.target }}:main" \
"${{ matrix.qemu }}" "/$BIN" \
--generate man > "$ARCHIVE/doc/rg.1"
- name: Build archive (Windows)
shell: bash
if: matrix.os == 'windows-latest'
run: |
7z a "$ARCHIVE.zip" "$ARCHIVE"
certutil -hashfile "$ARCHIVE.zip" SHA256 > "$ARCHIVE.zip.sha256"
echo "ASSET=$ARCHIVE.zip" >> $GITHUB_ENV
echo "ASSET_SUM=$ARCHIVE.zip.sha256" >> $GITHUB_ENV
- name: Build archive (Unix)
shell: bash
if: matrix.os != 'windows-latest'
run: |
tar czf "$ARCHIVE.tar.gz" "$ARCHIVE"
shasum -a 256 "$ARCHIVE.tar.gz" > "$ARCHIVE.tar.gz.sha256"
echo "ASSET=$ARCHIVE.tar.gz" >> $GITHUB_ENV
echo "ASSET_SUM=$ARCHIVE.tar.gz.sha256" >> $GITHUB_ENV
- name: Upload release archive
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
shell: bash
run: |
version="${{ needs.create-release.outputs.version }}"
gh release upload "$version" ${{ env.ASSET }} ${{ env.ASSET_SUM }}
build-release-deb:
name: build-release-deb
needs: ['create-release']
runs-on: ubuntu-latest
env:
TARGET: x86_64-unknown-linux-musl
# Emit backtraces on panics.
RUST_BACKTRACE: 1
# Since we're distributing the dpkg, we don't know whether the user will
# have PCRE2 installed, so just do a static build.
PCRE2_SYS_STATIC: 1
steps:
- name: Checkout repository
uses: actions/checkout@v4
- name: Install packages (Ubuntu)
shell: bash
run: |
ci/ubuntu-install-packages
- name: Install Rust
uses: dtolnay/rust-toolchain@master
with:
toolchain: nightly
target: ${{ env.TARGET }}
- name: Install cargo-deb
shell: bash
run: |
cargo install cargo-deb
# 'cargo deb' does not seem to provide a way to specify an asset that is
# created at build time, such as ripgrep's man page. To work around this,
# we force a debug build, copy out the man page (and shell completions)
# produced from that build, put it into a predictable location and then
# build the deb, which knows where to look.
- name: Build debug binary to create release assets
shell: bash
run: |
cargo build --target ${{ env.TARGET }}
bin="target/${{ env.TARGET }}/debug/rg"
echo "BIN=$bin" >> $GITHUB_ENV
- name: Create deployment directory
shell: bash
run: |
dir=deployment/deb
mkdir -p "$dir"
echo "DEPLOY_DIR=$dir" >> $GITHUB_ENV
- name: Generate man page
shell: bash
run: |
"$BIN" --generate man > "$DEPLOY_DIR/rg.1"
- name: Generate shell completions
shell: bash
run: |
"$BIN" --generate complete-bash > "$DEPLOY_DIR/rg.bash"
"$BIN" --generate complete-fish > "$DEPLOY_DIR/rg.fish"
"$BIN" --generate complete-zsh > "$DEPLOY_DIR/_rg"
- name: Build release binary
shell: bash
run: |
cargo deb --profile deb --target ${{ env.TARGET }}
version="${{ needs.create-release.outputs.version }}"
echo "DEB_DIR=target/${{ env.TARGET }}/debian" >> $GITHUB_ENV
echo "DEB_NAME=ripgrep_$version-1_amd64.deb" >> $GITHUB_ENV
- name: Create sha256 sum of deb file
shell: bash
run: |
cd "$DEB_DIR"
sum="$DEB_NAME.sha256"
shasum -a 256 "$DEB_NAME" > "$sum"
echo "SUM=$sum" >> $GITHUB_ENV
- name: Upload release archive
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
shell: bash
run: |
cd "$DEB_DIR"
version="${{ needs.create-release.outputs.version }}"
gh release upload "$version" "$DEB_NAME" "$SUM"
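
The `create-release` job above checks the pushed tag against `Cargo.toml` with `grep -q`. Under plain `grep`, each `.` in the version is a regex wildcard, so a slightly stricter variant could match the string literally. The sketch below is an assumption about how such a check might be hardened, not what the workflow does; `check_version` is a hypothetical name.

```shell
# Hypothetical, stricter variant of the tag/Cargo.toml check.
# -F treats the pattern as a literal string, so "14.1.0" no longer matches
# "14x1x0". As in the workflow, any line containing the string counts.
check_version() {
  _version="$1"
  _manifest="${2:-Cargo.toml}"
  if ! grep -qF "version = \"$_version\"" "$_manifest"; then
    echo "version does not match $_manifest" >&2
    return 1
  fi
}
```

Run as `check_version "$VERSION"` from the repository root, it returns a non-zero status when the tag and the manifest disagree.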

5
.gitignore vendored

@@ -7,6 +7,7 @@ target
/termcolor/Cargo.lock
/wincolor/Cargo.lock
/deployment
/.idea
# Snapcraft files
stage
@@ -15,3 +16,7 @@ parts
*.snap
*.pyc
ripgrep*_source.tar.bz2
# Cargo timings
cargo-timing-*.html
cargo-timing.html

1
.ignore Normal file

@@ -0,0 +1 @@
!/.github/


@@ -1,107 +0,0 @@
language: rust
env:
global:
- PROJECT_NAME: ripgrep
- RUST_BACKTRACE: full
addons:
apt:
packages:
# For generating man page.
- libxslt1-dev
- asciidoc
- docbook-xsl
- xsltproc
- libxml2-utils
# Needed for completion-function test.
- zsh
# Needed for testing decompression search.
- xz-utils
- liblz4-tool
matrix:
fast_finish: true
include:
# Nightly channel.
# All *nix releases are done on the nightly channel to take advantage
# of the regex library's multiple pattern SIMD search.
- os: linux
rust: nightly
env: TARGET=i686-unknown-linux-musl
- os: linux
rust: nightly
env: TARGET=x86_64-unknown-linux-musl
- os: osx
rust: nightly
# XML_CATALOG_FILES is apparently necessary for asciidoc on macOS.
env: TARGET=x86_64-apple-darwin XML_CATALOG_FILES=/usr/local/etc/xml/catalog
- os: linux
rust: nightly
env: TARGET=arm-unknown-linux-gnueabihf GCC_VERSION=4.8
addons:
apt:
packages:
- gcc-4.8-arm-linux-gnueabihf
- binutils-arm-linux-gnueabihf
- libc6-armhf-cross
- libc6-dev-armhf-cross
# For generating man page.
- libxslt1-dev
- asciidoc
- docbook-xsl
- xsltproc
- libxml2-utils
# Beta channel. We enable these to make sure there are no regressions in
# Rust beta releases.
- os: linux
rust: beta
env: TARGET=x86_64-unknown-linux-musl
- os: linux
rust: beta
env: TARGET=x86_64-unknown-linux-gnu
# Minimum Rust supported channel. We enable these to make sure ripgrep
# continues to work on the advertised minimum Rust version.
- os: linux
rust: 1.23.0
env: TARGET=x86_64-unknown-linux-gnu
- os: linux
rust: 1.23.0
env: TARGET=x86_64-unknown-linux-musl
- os: linux
rust: 1.23.0
env: TARGET=arm-unknown-linux-gnueabihf GCC_VERSION=4.8
addons:
apt:
packages:
- gcc-4.8-arm-linux-gnueabihf
- binutils-arm-linux-gnueabihf
- libc6-armhf-cross
- libc6-dev-armhf-cross
# For generating man page.
- libxslt1-dev
- asciidoc
- docbook-xsl
- xsltproc
- libxml2-utils
install: ci/install.sh
script: ci/script.sh
before_deploy: ci/before_deploy.sh
deploy:
provider: releases
file_glob: true
file: deployment/${PROJECT_NAME}-${TRAVIS_TAG}-${TARGET}.tar.gz
skip_cleanup: true
on:
condition: $TRAVIS_RUST_VERSION = nightly
branch: master
tags: true
api_key:
secure: "IbSnsbGkxSydR/sozOf1/SRvHplzwRUHzcTjM7BKnr7GccL86gRPUrsrvD103KjQUGWIc1TnK1YTq5M0Onswg/ORDjqa1JEJPkPdPnVh9ipbF7M2De/7IlB4X4qXLKoApn8+bx2x/mfYXu4G+G1/2QdbaKK2yfXZKyjz0YFx+6CNrVCT2Nk8q7aHvOOzAL58vsG8iPDpupuhxlMDDn/UhyOWVInmPPQ0iJR1ZUJN8xJwXvKvBbfp3AhaBiAzkhXHNLgBR8QC5noWWMXnuVDMY3k4f3ic0V+p/qGUCN/nhptuceLxKFicMCYObSZeUzE5RAI0/OBW7l3z2iCoc+TbAnn+JrX/ObJCfzgAOXAU3tLaBFMiqQPGFKjKg1ltSYXomOFP/F7zALjpvFp4lYTBajRR+O3dqaxA9UQuRjw27vOeUpMcga4ZzL4VXFHzrxZKBHN//XIGjYAVhJ1NSSeGpeJV5/+jYzzWKfwSagRxQyVCzMooYFFXzn8Yxdm3PJlmp3GaAogNkdB9qKcrEvRINCelalzALPi0hD/HUDi8DD2PNTCLLMo6VSYtvc685Zbe+KgNzDV1YyTrRCUW6JotrS0r2ULLwnsh40hSB//nNv3XmwNmC/CmW5QAnIGj8cBMF4S2t6ohADIndojdAfNiptmaZOIT6owK7bWMgPMyopo="
branches:
only:
# Pushes and PR to the master branch
- master
# Ruby regex to match tags. Required, or travis won't trigger deploys when
# a new tag is pushed.
- /^\d+\.\d+\.\d+.*$/
notifications:
email:
on_success: never


@@ -1,14 +1,774 @@
0.9.0 (TBD)
===========
This is a new minor version release of ripgrep that mostly contains bug fixes.
14.1.0 (TBD)
============
This is a minor release with a few small new features and bug fixes.
Releases provided on GitHub for `x86` and `x86_64` will now work on all target
CPUs, and will also automatically take advantage of features found on modern
CPUs (such as AVX2) for additional optimizations.
Bug fixes:
* [BUG #2690](https://github.com/BurntSushi/ripgrep/issues/2690):
Fix unbounded memory growth in the `ignore` crate.
Feature enhancements:
* Added or improved file type filtering for Lean and Meson.
* [FEATURE #2684](https://github.com/BurntSushi/ripgrep/issues/2684):
Improve completions for the `fish` shell.
* [FEATURE #2702](https://github.com/BurntSushi/ripgrep/pull/2702):
Add release binaries for `armv7-unknown-linux-gnueabihf`,
`armv7-unknown-linux-musleabihf` and `armv7-unknown-linux-musleabi`.
14.0.3 (2023-11-28)
===================
This is a patch release with a bug fix for the `--sortr` flag.
Bug fixes:
* [BUG #2664](https://github.com/BurntSushi/ripgrep/issues/2664):
Fix `--sortr=path`. I left a `todo!()` in the source. Oof.
14.0.2 (2023-11-27)
===================
This is a patch release with a few small bug fixes.
Bug fixes:
* [BUG #2654](https://github.com/BurntSushi/ripgrep/issues/2654):
Fix `deb` release sha256 sum file.
* [BUG #2658](https://github.com/BurntSushi/ripgrep/issues/2658):
Fix partial regression in the behavior of `--null-data --line-regexp`.
* [BUG #2659](https://github.com/BurntSushi/ripgrep/issues/2659):
Fix Fish shell completions.
* [BUG #2662](https://github.com/BurntSushi/ripgrep/issues/2662):
Fix typo in documentation for `-i/--ignore-case`.
14.0.1 (2023-11-26)
===================
This is a patch release meant to fix `cargo install ripgrep` on Windows.
Bug fixes:
* [BUG #2653](https://github.com/BurntSushi/ripgrep/issues/2653):
Include `pkg/windows/Manifest.xml` in crate package.
14.0.0 (2023-11-26)
===================
ripgrep 14 is a new major version release of ripgrep that has some new
features, performance improvements and a lot of bug fixes.
The headlining feature in this release is hyperlink support. In this release,
they are an opt-in feature but may change to an opt-out feature in the future.
To enable them, try passing `--hyperlink-format default`. If you use [VS Code],
then try passing `--hyperlink-format vscode`. Please [report your experience
with hyperlinks][report-hyperlinks], positive or negative.
[VS Code]: https://code.visualstudio.com/
[report-hyperlinks]: https://github.com/BurntSushi/ripgrep/discussions/2611
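Since hyperlinks are opt-in, one way to try them out is a small shell wrapper. The sketch below assumes `rg` is on `PATH`. The function name `rgl` and the `RG_HYPERLINK_FORMAT` variable are hypothetical conveniences, not ripgrep features; only the `--hyperlink-format` flag and its `default` and `vscode` values come from the notes above.

```shell
# Hypothetical wrapper: run rg with hyperlinks enabled.
# RG_HYPERLINK_FORMAT is an invented variable for this sketch; it selects the
# value passed to --hyperlink-format and falls back to "default".
rgl() {
  rg --hyperlink-format "${RG_HYPERLINK_FORMAT:-default}" "$@"
}
```

With this in a shell startup file, `rgl pattern` searches with the default format and `RG_HYPERLINK_FORMAT=vscode rgl pattern` uses the VS Code format.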
Another headlining development in this release is that it contains a rewrite
of its regex engine. You generally shouldn't notice any changes, except that
some searches may get faster. You can read more about the [regex engine rewrite
on my blog][regex-internals]. Please [report your performance improvements or
regressions that you notice][report-perf].
[report-perf]: https://github.com/BurntSushi/ripgrep/discussions/2652
Finally, ripgrep switched the library it uses for argument parsing. Users
should not notice a difference in most cases (error messages have changed
somewhat), but flag overrides should generally be more consistent. For example,
things like `--no-ignore --ignore-vcs` work as one would expect (disables all
filtering related to ignore rules except for rules found in version control
systems such as `git`).
[regex-internals]: https://blog.burntsushi.net/regex-internals/
**BREAKING CHANGES**:
* `rg -C1 -A2` used to be equivalent to `rg -A2`, but now it is equivalent to
`rg -B1 -A2`. That is, `-A` and `-B` no longer completely override `-C`.
Instead, they only partially override `-C`.
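One way to read the new rule is that a specific flag wins for its own side and `-C` fills in whichever side was not given. The function below is an illustrative model of that reading, not ripgrep's actual implementation. The name `resolve_context` and its argument convention (an empty string means the flag was not passed) are hypothetical.

```shell
# Illustrative model only. Arguments are the values of -A, -B and -C, in that
# order; pass an empty string for a flag that was not given.
# Prints "<before> <after>": the number of context lines on each side.
resolve_context() {
  _after="$1"
  _before="$2"
  _both="${3:-0}"   # without -C, an unspecified side gets no context
  # Each side uses its own flag if given, otherwise the -C value.
  printf '%s %s\n' "${_before:-$_both}" "${_after:-$_both}"
}
```

For example, `resolve_context 2 "" 1` models `rg -C1 -A2` and prints `1 2`, the same result as `rg -B1 -A2`.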
Build process changes:
* ripgrep's shell completions and man page are now created by running ripgrep
with a new `--generate` flag. For example, `rg --generate man` will write a
man page in `roff` format on stdout. The release archives have not changed.
* The optional build dependency on `asciidoc` or `asciidoctor` has been
dropped. Previously, it was used to produce ripgrep's man page. ripgrep now
owns this process itself by writing `roff` directly.
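Packaging scripts can call `--generate` directly instead of collecting build-time artifacts. The function below is a sketch of such a script. The name `generate_assets` and the output file names are assumptions for illustration; the `--generate` modes (`man`, `complete-bash`, `complete-fish`, `complete-zsh`) are the ones used in the release workflow in this diff.

```shell
# Hypothetical packaging helper: write ripgrep's man page and shell
# completions into a directory using `rg --generate`.
# $1 is the rg command to run, $2 is the output directory.
generate_assets() {
  _rg="$1"
  _out="$2"
  mkdir -p "$_out" || return 1
  "$_rg" --generate man           > "$_out/rg.1"    || return 1
  "$_rg" --generate complete-bash > "$_out/rg.bash" || return 1
  "$_rg" --generate complete-fish > "$_out/rg.fish" || return 1
  "$_rg" --generate complete-zsh  > "$_out/_rg"     || return 1
}
```

A call such as `generate_assets ./target/release/rg dist` would leave `rg.1`, `rg.bash`, `rg.fish` and `_rg` in `dist/`.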
Performance improvements:
* [PERF #1746](https://github.com/BurntSushi/ripgrep/issues/1746):
Make some cases with inner literals faster.
* [PERF #1760](https://github.com/BurntSushi/ripgrep/issues/1760):
Make most searches with `\b` look-arounds (among others) much faster.
* [PERF #2591](https://github.com/BurntSushi/ripgrep/pull/2591):
Parallel directory traversal now uses work stealing for faster searches.
* [PERF #2642](https://github.com/BurntSushi/ripgrep/pull/2642):
Parallel directory traversal has some contention reduced.
Feature enhancements:
* Added or improved file type filtering for Ada, DITA, Elixir, Fuchsia, Gentoo,
Gradle, GraphQL, Markdown, Prolog, Raku, TypeScript, USD, V.
* [FEATURE #665](https://github.com/BurntSushi/ripgrep/issues/665):
Add a new `--hyperlink-format` flag that turns file paths into hyperlinks.
* [FEATURE #1709](https://github.com/BurntSushi/ripgrep/issues/1709):
Improve documentation of ripgrep's behavior when stdout is a tty.
* [FEATURE #1737](https://github.com/BurntSushi/ripgrep/issues/1737):
Provide binaries for Apple silicon.
* [FEATURE #1790](https://github.com/BurntSushi/ripgrep/issues/1790):
Add new `--stop-on-nonmatch` flag.
* [FEATURE #1814](https://github.com/BurntSushi/ripgrep/issues/1814):
Flags are now categorized in `-h/--help` output and ripgrep's man page.
* [FEATURE #1838](https://github.com/BurntSushi/ripgrep/issues/1838):
An error is shown when searching for NUL bytes with binary detection enabled.
* [FEATURE #2195](https://github.com/BurntSushi/ripgrep/issues/2195):
When `extra-verbose` mode is enabled in zsh, show extra file type info.
* [FEATURE #2298](https://github.com/BurntSushi/ripgrep/issues/2298):
Add instructions for installing ripgrep using `cargo binstall`.
* [FEATURE #2409](https://github.com/BurntSushi/ripgrep/pull/2409):
Added installation instructions for `winget`.
* [FEATURE #2425](https://github.com/BurntSushi/ripgrep/pull/2425):
Shell completions (and man page) can be created via `rg --generate`.
* [FEATURE #2524](https://github.com/BurntSushi/ripgrep/issues/2524):
The `--debug` flag now indicates whether stdin or `./` is being searched.
* [FEATURE #2643](https://github.com/BurntSushi/ripgrep/issues/2643):
Make `-d` a short flag for `--max-depth`.
* [FEATURE #2645](https://github.com/BurntSushi/ripgrep/issues/2645):
The `--version` output will now also contain PCRE2 availability information.
Bug fixes:
* [BUG #884](https://github.com/BurntSushi/ripgrep/issues/884):
Don't error when `-v/--invert-match` is used multiple times.
* [BUG #1275](https://github.com/BurntSushi/ripgrep/issues/1275):
Fix bug with `\b` assertion in the regex engine.
* [BUG #1376](https://github.com/BurntSushi/ripgrep/issues/1376):
Using `--no-ignore --ignore-vcs` now works as one would expect.
* [BUG #1622](https://github.com/BurntSushi/ripgrep/issues/1622):
Add note about error messages to `-z/--search-zip` documentation.
* [BUG #1648](https://github.com/BurntSushi/ripgrep/issues/1648):
Fix bug where sometimes short flags with values, e.g., `-M 900`, would fail.
* [BUG #1701](https://github.com/BurntSushi/ripgrep/issues/1701):
Fix bug where some flags could not be repeated.
* [BUG #1757](https://github.com/BurntSushi/ripgrep/issues/1757):
Fix bug where searching a sub-directory didn't have ignores applied correctly.
* [BUG #1891](https://github.com/BurntSushi/ripgrep/issues/1891):
Fix bug when using `-w` with a regex that can match the empty string.
* [BUG #1911](https://github.com/BurntSushi/ripgrep/issues/1911):
Disable mmap searching in all non-64-bit environments.
* [BUG #1966](https://github.com/BurntSushi/ripgrep/issues/1966):
Fix bug where ripgrep can panic when printing to stderr.
* [BUG #2046](https://github.com/BurntSushi/ripgrep/issues/2046):
Clarify that `--pre` can accept any kind of path in the documentation.
* [BUG #2108](https://github.com/BurntSushi/ripgrep/issues/2108):
Improve docs for `-r/--replace` syntax.
* [BUG #2198](https://github.com/BurntSushi/ripgrep/issues/2198):
Fix bug where `--no-ignore-dot` would not ignore `.rgignore`.
* [BUG #2201](https://github.com/BurntSushi/ripgrep/issues/2201):
Improve docs for `-r/--replace` flag.
* [BUG #2288](https://github.com/BurntSushi/ripgrep/issues/2288):
`-A` and `-B` now only each partially override `-C`.
* [BUG #2236](https://github.com/BurntSushi/ripgrep/issues/2236):
Fix gitignore parsing bug where a trailing `\/` resulted in an error.
* [BUG #2243](https://github.com/BurntSushi/ripgrep/issues/2243):
Fix `--sort` flag for values other than `path`.
* [BUG #2246](https://github.com/BurntSushi/ripgrep/issues/2246):
Add note in `--debug` logs when binary files are ignored.
* [BUG #2337](https://github.com/BurntSushi/ripgrep/issues/2337):
Improve docs to mention that `--stats` is always implied by `--json`.
* [BUG #2381](https://github.com/BurntSushi/ripgrep/issues/2381):
Make `-p/--pretty` override flags like `--no-line-number`.
* [BUG #2392](https://github.com/BurntSushi/ripgrep/issues/2392):
Improve global git config parsing of the `excludesFile` field.
* [BUG #2418](https://github.com/BurntSushi/ripgrep/pull/2418):
Clarify sorting semantics of `--sort=path`.
* [BUG #2458](https://github.com/BurntSushi/ripgrep/pull/2458):
Make `--trim` run before `-M/--max-columns` takes effect.
* [BUG #2479](https://github.com/BurntSushi/ripgrep/issues/2479):
Add documentation about `.ignore`/`.rgignore` files in parent directories.
* [BUG #2480](https://github.com/BurntSushi/ripgrep/issues/2480):
Fix bug when using inline regex flags with `-e/--regexp`.
* [BUG #2505](https://github.com/BurntSushi/ripgrep/issues/2505):
Improve docs for `--vimgrep` by mentioning footguns and some work-arounds.
* [BUG #2519](https://github.com/BurntSushi/ripgrep/issues/2519):
Fix incorrect default value in documentation for `--field-match-separator`.
* [BUG #2523](https://github.com/BurntSushi/ripgrep/issues/2523):
Make executable searching take `.com` into account on Windows.
* [BUG #2574](https://github.com/BurntSushi/ripgrep/issues/2574):
Fix bug in `-w/--word-regexp` that would result in incorrect match offsets.
* [BUG #2623](https://github.com/BurntSushi/ripgrep/issues/2623):
Fix a number of bugs with the `-w/--word-regexp` flag.
* [BUG #2636](https://github.com/BurntSushi/ripgrep/pull/2636):
Strip release binaries for macOS.
13.0.0 (2021-06-12)
===================
ripgrep 13 is a new major version release of ripgrep that primarily contains
bug fixes, some performance improvements and a few minor breaking changes.
There is also a fix for a security vulnerability on Windows
([CVE-2021-3013](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-3013)).
Some highlights:
A new short flag, `-.`, has been added. It is an alias for the `--hidden` flag,
which instructs ripgrep to search hidden files and directories.
ripgrep is now using a new
[vectorized implementation of `memmem`](https://github.com/BurntSushi/memchr/pull/82),
which accelerates many common searches. If you notice any performance
regressions (or major improvements), I'd love to hear about them through an
issue report!
Also, for Windows users targeting MSVC, Cargo will now build fully static
executables of ripgrep. The release binaries for ripgrep 13 have been compiled
using this configuration.
**BREAKING CHANGES**:
**Binary detection output has changed slightly.**
In this release, a small tweak has been made to the output format when a binary
file is detected. Previously, it looked like this:
```
Binary file FOO matches (found "\0" byte around offset XXX)
```
Now it looks like this:
```
FOO: binary file matches (found "\0" byte around offset XXX)
```
**vimgrep output in multi-line now only prints the first line for each match.**
See [issue 1866](https://github.com/BurntSushi/ripgrep/issues/1866) for more
discussion on this. Previously, every line in a match was duplicated, even
when it spanned multiple lines. There are no changes to vimgrep output when
multi-line mode is disabled.
**In multi-line mode, --count is now equivalent to --count-matches.**
This appears to match how `pcre2grep` implements `--count`. Previously, ripgrep
would produce outright incorrect counts. Another alternative would be to simply
count the number of lines---even if it's more than the number of matches---but
that seems highly unintuitive.
**FULL LIST OF FIXES AND IMPROVEMENTS:**
Security fixes:
* [CVE-2021-3013](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-3013):
Fixes a security hole on Windows where running ripgrep with either the
`-z/--search-zip` or `--pre` flags can result in running arbitrary
executables from the current directory.
* [VULN #1773](https://github.com/BurntSushi/ripgrep/issues/1773):
This is the public facing issue tracking CVE-2021-3013. ripgrep's README
now contains a section describing how to report a vulnerability.
Performance improvements:
* [PERF #1657](https://github.com/BurntSushi/ripgrep/discussions/1657):
Check if a file should be ignored first before issuing stat calls.
* [PERF memchr#82](https://github.com/BurntSushi/memchr/pull/82):
ripgrep now uses a new vectorized implementation of `memmem`.
Feature enhancements:
* Added or improved file type filtering for ASP, Bazel, dvc, FlatBuffers,
Futhark, minified files, Mint, pofiles (from GNU gettext), Racket, Red, Ruby,
VCL, Yang.
* [FEATURE #1404](https://github.com/BurntSushi/ripgrep/pull/1404):
ripgrep now prints a warning if nothing is searched.
* [FEATURE #1613](https://github.com/BurntSushi/ripgrep/pull/1613):
Cargo will now produce static executables on Windows when using MSVC.
* [FEATURE #1680](https://github.com/BurntSushi/ripgrep/pull/1680):
Add `-.` as a short flag alias for `--hidden`.
* [FEATURE #1842](https://github.com/BurntSushi/ripgrep/issues/1842):
Add `--field-{context,match}-separator` for customizing field delimiters.
* [FEATURE #1856](https://github.com/BurntSushi/ripgrep/pull/1856):
The README now links to a
[Spanish translation](https://github.com/UltiRequiem/traducciones/tree/master/ripgrep).
Bug fixes:
* [BUG #1277](https://github.com/BurntSushi/ripgrep/issues/1277):
Document cygwin path translation behavior in the FAQ.
* [BUG #1739](https://github.com/BurntSushi/ripgrep/issues/1739):
Fix bug where replacements were buggy if the regex matched a line terminator.
* [BUG #1311](https://github.com/BurntSushi/ripgrep/issues/1311):
Fix multi-line bug where a search & replace for `\n` didn't work as expected.
* [BUG #1401](https://github.com/BurntSushi/ripgrep/issues/1401):
Fix buggy interaction between PCRE2 look-around and `-o/--only-matching`.
* [BUG #1412](https://github.com/BurntSushi/ripgrep/issues/1412):
Fix multi-line bug with searches using look-around past matching lines.
* [BUG #1577](https://github.com/BurntSushi/ripgrep/issues/1577):
Fish shell completions will continue to be auto-generated.
* [BUG #1642](https://github.com/BurntSushi/ripgrep/issues/1642):
Fixes a bug where using `-m` and `-A` printed more matches than the limit.
* [BUG #1703](https://github.com/BurntSushi/ripgrep/issues/1703):
Clarify the function of `-u/--unrestricted`.
* [BUG #1708](https://github.com/BurntSushi/ripgrep/issues/1708):
Clarify how `-S/--smart-case` works.
* [BUG #1730](https://github.com/BurntSushi/ripgrep/issues/1730):
Clarify that CLI invocation must always be valid, regardless of config file.
* [BUG #1741](https://github.com/BurntSushi/ripgrep/issues/1741):
Fix stdin detection when using PowerShell in UNIX environments.
* [BUG #1756](https://github.com/BurntSushi/ripgrep/pull/1756):
Fix bug where `foo/**` would match `foo`, but it shouldn't.
* [BUG #1765](https://github.com/BurntSushi/ripgrep/issues/1765):
Fix panic when `--crlf` is used in some cases.
* [BUG #1638](https://github.com/BurntSushi/ripgrep/issues/1638):
Correctly sniff UTF-8 and do transcoding, like we do for UTF-16.
* [BUG #1816](https://github.com/BurntSushi/ripgrep/issues/1816):
Add documentation for glob alternate syntax, e.g., `{a,b,..}`.
* [BUG #1847](https://github.com/BurntSushi/ripgrep/issues/1847):
Clarify how the `--hidden` flag works.
* [BUG #1866](https://github.com/BurntSushi/ripgrep/issues/1866#issuecomment-841635553):
Fix bug when computing column numbers in `--vimgrep` mode.
* [BUG #1868](https://github.com/BurntSushi/ripgrep/issues/1868):
Fix bug where `--passthru` and `-A/-B/-C` did not override each other.
* [BUG #1869](https://github.com/BurntSushi/ripgrep/pull/1869):
Clarify docs for `--files-with-matches` and `--files-without-match`.
* [BUG #1878](https://github.com/BurntSushi/ripgrep/issues/1878):
Fix bug where `\A` could produce unanchored matches in multiline search.
* [BUG 94e4b8e3](https://github.com/BurntSushi/ripgrep/commit/94e4b8e3):
Fix column numbers when `--vimgrep` is used with `-U/--multiline`.
12.1.1 (2020-05-29)
===================
ripgrep 12.1.1 is a patch release that fixes a couple small bugs. In
particular, the ripgrep 12.1.0 release did not tag new releases for all of its
in-tree dependencies. As a result, building ripgrep with dependencies from
crates.io would produce a different build than compiling ripgrep from source on the
`12.1.0` tag. Namely, some crates like `grep-cli` had unreleased changes.
Bug fixes:
* [BUG #1581](https://github.com/BurntSushi/ripgrep/issues/1581):
Corrects some egregious markup output in `--help`.
* [BUG #1591](https://github.com/BurntSushi/ripgrep/issues/1591):
Mention the special `$0` capture group in docs for the `-r/--replace` flag.
* [BUG #1602](https://github.com/BurntSushi/ripgrep/issues/1602):
Fix failing test resulting from out-of-sync dependencies.
12.1.0 (2020-05-09)
===================
ripgrep 12.1.0 is a small minor version release that mostly includes bug fixes
and documentation improvements. This release also contains some important
notices for downstream packagers.
**Notices for downstream ripgrep package maintainers:**
* Fish shell completions will be removed in the ripgrep 13 release.
See [#1577](https://github.com/BurntSushi/ripgrep/issues/1577)
for more details.
* ripgrep has switched from `a2x` to `asciidoctor` to generate the man page.
If `asciidoctor` is not present, then ripgrep will currently fall back to
`a2x`. Support for `a2x` will be dropped in the ripgrep 13 release.
See [#1544](https://github.com/BurntSushi/ripgrep/issues/1544)
for more details.
Feature enhancements:
* [FEATURE #1547](https://github.com/BurntSushi/ripgrep/pull/1547):
Support decompressing `.Z` files via `uncompress`.
Bug fixes:
* [BUG #1252](https://github.com/BurntSushi/ripgrep/issues/1252):
Add a section on the `--pre` flag to the GUIDE.
* [BUG #1339](https://github.com/BurntSushi/ripgrep/issues/1339):
Improve error message when a pattern with invalid UTF-8 is provided.
* [BUG #1524](https://github.com/BurntSushi/ripgrep/issues/1524):
Note how to escape a `$` when using `--replace`.
* [BUG #1537](https://github.com/BurntSushi/ripgrep/issues/1537):
Fix match bug caused by inner literal optimization.
* [BUG #1544](https://github.com/BurntSushi/ripgrep/issues/1544):
ripgrep now uses `asciidoctor` instead of `a2x` to generate its man page.
* [BUG #1550](https://github.com/BurntSushi/ripgrep/issues/1550):
Substantially reduce peak memory usage when searching wide directories.
* [BUG #1571](https://github.com/BurntSushi/ripgrep/issues/1571):
Add note about configuration files in `--type-{add,clear}` docs.
* [BUG #1573](https://github.com/BurntSushi/ripgrep/issues/1573):
Fix incorrect `--count-matches` output when using look-around.
12.0.1 (2020-03-29)
===================
ripgrep 12.0.1 is a small patch release that includes a minor bug fix relating
to superfluous error messages when searching git repositories with sub-modules.
This was a regression introduced in the 12.0.0 release.
Bug fixes:
* [BUG #1520](https://github.com/BurntSushi/ripgrep/issues/1520):
Don't emit spurious error messages in git repositories with submodules.
12.0.0 (2020-03-15)
===================
ripgrep 12 is a new major version release of ripgrep that contains many bug
fixes, several important performance improvements and a few minor new features.
In a near future release, I am hoping to add an
[indexing feature](https://github.com/BurntSushi/ripgrep/issues/1497)
to ripgrep, which will dramatically speed up searching by building an index.
Feedback would very much be appreciated, especially on the user experience
which will be difficult to get right.
This release has no known breaking changes.
Deprecations:
* The `--no-pcre2-unicode` flag is deprecated. Instead, use the `--no-unicode`
flag, which applies to both the default regex engine and PCRE2. For now,
`--no-pcre2-unicode` and `--pcre2-unicode` are aliases to `--no-unicode`
and `--unicode`, respectively. The `--[no-]pcre2-unicode` flags may be
removed in a future release.
* The `--auto-hybrid-regex` flag is deprecated. Instead, use the new `--engine`
flag with the `auto` value.
Performance improvements:
* [PERF #1087](https://github.com/BurntSushi/ripgrep/pull/1087):
ripgrep is smarter when detected literals are whitespace.
* [PERF #1381](https://github.com/BurntSushi/ripgrep/pull/1381):
Directory traversal is sped up with speculative ignore-file existence checks.
* [PERF cd8ec38a](https://github.com/BurntSushi/ripgrep/commit/cd8ec38a):
Improve inner literal detection to cover more cases more effectively.
e.g., ` +Sherlock Holmes +` now has ` Sherlock Holmes ` extracted instead
of ` `.
* [PERF 6a0e0147](https://github.com/BurntSushi/ripgrep/commit/6a0e0147):
Improve literal detection when the `-w/--word-regexp` flag is used.
* [PERF ad97e9c9](https://github.com/BurntSushi/ripgrep/commit/ad97e9c9):
Improve overall performance of the `-w/--word-regexp` flag.
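The idea behind inner literal detection can be sketched as a prefilter: any match of ` +Sherlock Holmes +` must contain the literal ` Sherlock Holmes `, so lines without it can be skipped before the regex runs. The following Python sketch is only an illustration of that idea, not ripgrep's implementation; `search_line` is a hypothetical helper.

```python
import re

pattern = re.compile(r" +Sherlock Holmes +")

# Every match of the pattern above necessarily contains this substring.
required_literal = " Sherlock Holmes "


def search_line(line):
    """Run the regex only on lines that pass the cheap substring check."""
    if required_literal not in line:
        return None
    return pattern.search(line)
```

A longer required literal rejects far more lines than a single space would, which is why extracting ` Sherlock Holmes ` instead of ` ` matters.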
Feature enhancements:
* Added or improved file type filtering for erb, diff, Gradle, HAML, Org,
Postscript, Skim, Slim, Slime, RPM Spec files, Typoscript, xml.
* [FEATURE #1370](https://github.com/BurntSushi/ripgrep/pull/1370):
Add `--include-zero` flag that shows files searched without matches.
* [FEATURE #1390](https://github.com/BurntSushi/ripgrep/pull/1390):
Add `--no-context-separator` flag that always hides context separators.
* [FEATURE #1414](https://github.com/BurntSushi/ripgrep/pull/1414):
Add `--no-require-git` flag to allow ripgrep to respect gitignores anywhere.
* [FEATURE #1420](https://github.com/BurntSushi/ripgrep/pull/1420):
Add `--no-ignore-exclude` to disregard rules in `.git/info/exclude` files.
* [FEATURE #1466](https://github.com/BurntSushi/ripgrep/pull/1466):
Add `--no-ignore-files` flag to disable all `--ignore-file` flags.
* [FEATURE #1488](https://github.com/BurntSushi/ripgrep/pull/1488):
Add `--engine` flag for easier switching between regex engines.
* [FEATURE 75cbe88f](https://github.com/BurntSushi/ripgrep/commit/75cbe88f):
Add `--no-unicode` flag. This works on all supported regex engines.
Bug fixes:
* [BUG #1291](https://github.com/BurntSushi/ripgrep/issues/1291):
ripgrep now works in non-existent directories.
* [BUG #1319](https://github.com/BurntSushi/ripgrep/issues/1319):
Fix match bug due to errant literal detection.
* [**BUG #1335**](https://github.com/BurntSushi/ripgrep/issues/1335):
Fixes a performance bug when searching plain text files with very long lines.
This was a serious performance regression in some cases.
* [BUG #1344](https://github.com/BurntSushi/ripgrep/issues/1344):
Document usage of `--type all`.
* [BUG #1389](https://github.com/BurntSushi/ripgrep/issues/1389):
Fixes a bug where ripgrep would panic when searching a symlinked directory.
* [BUG #1439](https://github.com/BurntSushi/ripgrep/issues/1439):
Improve documentation for ripgrep's automatic stdin detection.
* [BUG #1441](https://github.com/BurntSushi/ripgrep/issues/1441):
Remove CPU features from man page.
* [BUG #1442](https://github.com/BurntSushi/ripgrep/issues/1442),
[BUG #1478](https://github.com/BurntSushi/ripgrep/issues/1478):
Improve documentation of the `-g/--glob` flag.
* [BUG #1445](https://github.com/BurntSushi/ripgrep/issues/1445):
ripgrep now respects ignore rules from .git/info/exclude in worktrees.
* [BUG #1485](https://github.com/BurntSushi/ripgrep/issues/1485):
Fish shell completions from the release Debian package are now installed to
`/usr/share/fish/vendor_completions.d/rg.fish`.
11.0.2 (2019-08-01)
===================
ripgrep 11.0.2 is a new patch release that fixes a few bugs, including a
performance regression and a matching bug when using the `-F/--fixed-strings`
flag.
Feature enhancements:
* [FEATURE #1293](https://github.com/BurntSushi/ripgrep/issues/1293):
Added `--glob-case-insensitive` flag that makes `--glob` behave as `--iglob`.
Bug fixes:
* [BUG #1246](https://github.com/BurntSushi/ripgrep/issues/1246):
Add translations to README, starting with an unofficial Chinese translation.
* [BUG #1259](https://github.com/BurntSushi/ripgrep/issues/1259):
Fix bug where the last byte of a `-f file` was stripped if it wasn't a `\n`.
* [BUG #1261](https://github.com/BurntSushi/ripgrep/issues/1261):
Document that no error is reported when searching for `\n` with `-P/--pcre2`.
* [BUG #1284](https://github.com/BurntSushi/ripgrep/issues/1284):
Mention `.ignore` and `.rgignore` more prominently in the README.
* [BUG #1292](https://github.com/BurntSushi/ripgrep/issues/1292):
Fix bug where `--with-filename` was sometimes enabled incorrectly.
* [BUG #1268](https://github.com/BurntSushi/ripgrep/issues/1268):
Fix major performance regression in GitHub `x86_64-linux` binary release.
* [BUG #1302](https://github.com/BurntSushi/ripgrep/issues/1302):
Show better error messages when a non-existent preprocessor command is given.
* [BUG #1334](https://github.com/BurntSushi/ripgrep/issues/1334):
Fix match regression with `-F` flag when patterns contain meta characters.
11.0.1 (2019-04-16)
===================
ripgrep 11.0.1 is a new patch release that fixes a search regression introduced
in the previous 11.0.0 release. In particular, ripgrep can enter an infinite
loop for some search patterns when searching invalid UTF-8.
Bug fixes:
* [BUG #1247](https://github.com/BurntSushi/ripgrep/issues/1247):
Fix search bug that can cause ripgrep to enter an infinite loop.
11.0.0 (2019-04-15)
===================
ripgrep 11 is a new major version release of ripgrep that contains many bug
fixes, some performance improvements and a few feature enhancements. Notably,
ripgrep's user experience for binary file filtering has been improved. See the
[guide's new section on binary data](GUIDE.md#binary-data) for more details.
This release also marks a change in ripgrep's versioning. Whereas the previous
version was `0.10.0`, this version is `11.0.0`. Moving forward, ripgrep's
major version will be increased a few times per year. ripgrep will continue to
be conservative with respect to backwards compatibility, but may occasionally
introduce breaking changes, which will always be documented in this CHANGELOG.
See [issue 1172](https://github.com/BurntSushi/ripgrep/issues/1172) for a bit
more detail on why this versioning change was made.
This release increases the **minimum supported Rust version** from 1.28.0 to
1.34.0.
**BREAKING CHANGES**:
* ripgrep has tweaked its exit status codes to be more like GNU grep's. Namely,
if a non-fatal error occurs during a search, then ripgrep will now always
emit a `2` exit status code, regardless of whether a match is found or not.
Previously, ripgrep would only emit a `2` exit status code for a catastrophic
error (e.g., regex syntax error). One exception to this is if ripgrep is run
with `-q/--quiet`. In that case, if an error occurs and a match is found,
then ripgrep will exit with a `0` exit status code.
* Supplying the `-u/--unrestricted` flag three times is now equivalent to
supplying `--no-ignore --hidden --binary`. Previously, `-uuu` was equivalent
to `--no-ignore --hidden --text`. The difference is that `--binary` disables
binary file filtering without potentially dumping binary data into your
terminal. That is, `rg -uuu foo` should now be equivalent to `grep -r foo`.
* The `avx-accel` feature of ripgrep has been removed since it is no longer
necessary. All uses of AVX in ripgrep are now enabled automatically via
runtime CPU feature detection. The `simd-accel` feature does remain available
(only for enabling SIMD for transcoding), however, it does increase
compilation times substantially at the moment.
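The exit status rules from the first item above can be summarized as a small decision function. This is a sketch of the rules as described here, not ripgrep's code: `exit_status` is a hypothetical helper, and the `0`/`1` results for the error-free case assume the usual grep convention (match found vs. no match).

```python
def exit_status(matched, had_error, quiet=False):
    """Return the process exit status for a finished search (sketch)."""
    if had_error:
        # Exception: with -q/--quiet, a match still yields 0 despite an error.
        if quiet and matched:
            return 0
        # Any error now yields 2, whether or not a match was found.
        return 2
    # Assumed grep-like convention: 0 if something matched, 1 otherwise.
    return 0 if matched else 1
```

For example, `exit_status(matched=True, had_error=True)` gives `2`, whereas the same search under `-q` gives `0`.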
Performance improvements:
* [PERF #497](https://github.com/BurntSushi/ripgrep/issues/497),
[PERF #838](https://github.com/BurntSushi/ripgrep/issues/838):
Make `rg -F -f dictionary-of-literals` much faster.
Feature enhancements:
* Added or improved file type filtering for Apache Thrift, ASP, Bazel, Brotli,
BuildStream, bzip2, C, C++, Cython, gzip, Java, Make, Postscript, QML, Tex,
XML, xz, zig and zstd.
* [FEATURE #855](https://github.com/BurntSushi/ripgrep/issues/855):
Add `--binary` flag for disabling binary file filtering.
* [FEATURE #1078](https://github.com/BurntSushi/ripgrep/pull/1078):
Add `--max-columns-preview` flag for showing a preview of long lines.
* [FEATURE #1099](https://github.com/BurntSushi/ripgrep/pull/1099):
Add support for Brotli and Zstd to the `-z/--search-zip` flag.
* [FEATURE #1138](https://github.com/BurntSushi/ripgrep/pull/1138):
Add `--no-ignore-dot` flag for ignoring `.ignore` files.
* [FEATURE #1155](https://github.com/BurntSushi/ripgrep/pull/1155):
Add `--auto-hybrid-regex` flag for automatically falling back to PCRE2.
* [FEATURE #1159](https://github.com/BurntSushi/ripgrep/pull/1159):
ripgrep's exit status logic should now match GNU grep. See updated man page.
* [FEATURE #1164](https://github.com/BurntSushi/ripgrep/pull/1164):
Add `--ignore-file-case-insensitive` for case insensitive ignore globs.
* [FEATURE #1185](https://github.com/BurntSushi/ripgrep/pull/1185):
Add `-I` flag as a short option for the `--no-filename` flag.
* [FEATURE #1207](https://github.com/BurntSushi/ripgrep/pull/1207):
Add `none` value to `-E/--encoding` to forcefully disable all transcoding.
* [FEATURE da9d7204](https://github.com/BurntSushi/ripgrep/commit/da9d7204):
Add `--pcre2-version` for showing PCRE2 version information.
Bug fixes:
* [BUG #306](https://github.com/BurntSushi/ripgrep/issues/306),
[BUG #855](https://github.com/BurntSushi/ripgrep/issues/855):
Improve the user experience for ripgrep's binary file filtering.
* [BUG #373](https://github.com/BurntSushi/ripgrep/issues/373),
[BUG #1098](https://github.com/BurntSushi/ripgrep/issues/1098):
`**` is now accepted as valid syntax anywhere in a glob.
* [BUG #916](https://github.com/BurntSushi/ripgrep/issues/916):
ripgrep no longer hangs when searching `/proc` with a zombie process present.
* [BUG #1052](https://github.com/BurntSushi/ripgrep/issues/1052):
Fix bug where ripgrep could panic when transcoding UTF-16 files.
* [BUG #1055](https://github.com/BurntSushi/ripgrep/issues/1055):
Suggest `-U/--multiline` when a pattern contains a `\n`.
* [BUG #1063](https://github.com/BurntSushi/ripgrep/issues/1063):
Always strip a BOM if it's present, even for UTF-8.
* [BUG #1064](https://github.com/BurntSushi/ripgrep/issues/1064):
Fix inner literal detection that could lead to incorrect matches.
* [BUG #1079](https://github.com/BurntSushi/ripgrep/issues/1079):
Fixes a bug where the order of globs could result in missing a match.
* [BUG #1089](https://github.com/BurntSushi/ripgrep/issues/1089):
Fix another bug where ripgrep could panic when transcoding UTF-16 files.
* [BUG #1091](https://github.com/BurntSushi/ripgrep/issues/1091):
Add note about inverted flags to the man page.
* [BUG #1093](https://github.com/BurntSushi/ripgrep/pull/1093):
Fix handling of literal slashes in gitignore patterns.
* [BUG #1095](https://github.com/BurntSushi/ripgrep/issues/1095):
Fix corner cases involving the `--crlf` flag.
* [BUG #1101](https://github.com/BurntSushi/ripgrep/issues/1101):
Fix AsciiDoc escaping for man page output.
* [BUG #1103](https://github.com/BurntSushi/ripgrep/issues/1103):
Clarify what `--encoding auto` does.
* [BUG #1106](https://github.com/BurntSushi/ripgrep/issues/1106):
`--files-with-matches` and `--files-without-match` work with one file.
* [BUG #1121](https://github.com/BurntSushi/ripgrep/issues/1121):
Fix bug that was triggering Windows antimalware when using the `--files`
flag.
* [BUG #1125](https://github.com/BurntSushi/ripgrep/issues/1125),
[BUG #1159](https://github.com/BurntSushi/ripgrep/issues/1159):
ripgrep shouldn't panic for `rg -h | rg` and should emit correct exit status.
* [BUG #1144](https://github.com/BurntSushi/ripgrep/issues/1144):
Fixes a bug where line numbers could be wrong on big-endian machines.
* [BUG #1154](https://github.com/BurntSushi/ripgrep/issues/1154):
Windows files with "hidden" attribute are now treated as hidden.
* [BUG #1173](https://github.com/BurntSushi/ripgrep/issues/1173):
Fix handling of `**` patterns in gitignore files.
* [BUG #1174](https://github.com/BurntSushi/ripgrep/issues/1174):
Fix handling of repeated `**` patterns in gitignore files.
* [BUG #1176](https://github.com/BurntSushi/ripgrep/issues/1176):
Fix bug where `-F`/`-x` weren't applied to patterns given via `-f`.
* [BUG #1189](https://github.com/BurntSushi/ripgrep/issues/1189):
Document cases where ripgrep may use a lot of memory.
* [BUG #1203](https://github.com/BurntSushi/ripgrep/issues/1203):
Fix a matching bug related to the suffix literal optimization.
* [BUG 8f14cb18](https://github.com/BurntSushi/ripgrep/commit/8f14cb18):
Increase the default stack size for PCRE2's JIT.
0.10.0 (2018-09-07)
===================
This is a new minor version release of ripgrep that contains some major new
features, a huge number of bug fixes, and is the first release based on
libripgrep. The entirety of ripgrep's core search and printing code has been
rewritten and generalized so that anyone can make use of it.
Major new features include PCRE2 support, multi-line search and a JSON output
format.
**BREAKING CHANGES**:
* The minimum version of Rust required to compile ripgrep now tracks the
latest stable version of Rust. Patch releases will continue to compile with
the same version of Rust as the previous patch release, but new minor
versions will use the current stable version of the Rust compiler as their
minimum supported version.
* The match semantics of `-w/--word-regexp` have changed slightly. They used
to be `\b(?:<your pattern>)\b`, but now it's
`(?:^|\W)(?:<your pattern>)(?:$|\W)`. This matches the behavior of GNU grep
and is believed to be closer to the intended semantics of the flag. See
[#389](https://github.com/BurntSushi/ripgrep/issues/389) for more details.
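The difference between the two `-w/--word-regexp` wrappers shows up when the pattern itself starts or ends with a non-word character. The sketch below uses Python's `re` module to compare them; it illustrates the two regex forms quoted above and says nothing about how ripgrep reports match bounds. The helper names are hypothetical.

```python
import re


def old_word_regex(pattern):
    # Previous semantics: word boundaries on both sides.
    return re.compile(r"\b(?:" + pattern + r")\b")


def new_word_regex(pattern):
    # New semantics: start/end of line or a non-word character on each side.
    return re.compile(r"(?:^|\W)(?:" + pattern + r")(?:$|\W)")


line = "x -2 y"
old = old_word_regex("-2").search(line)
new = new_word_regex("-2").search(line)
```

With the old form there is no word boundary between the space and `-`, so `-2` is not found. The new form accepts it because `-2` is surrounded by non-word characters.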
Feature enhancements:
* [FEATURE #162](https://github.com/BurntSushi/ripgrep/issues/162):
libripgrep is now a thing. The primary crate is
[`grep`](https://docs.rs/grep).
* [FEATURE #176](https://github.com/BurntSushi/ripgrep/issues/176):
Add `-U/--multiline` flag that permits matching over multiple lines.
* [FEATURE #188](https://github.com/BurntSushi/ripgrep/issues/188):
Add `-P/--pcre2` flag that gives support for look-around and backreferences.
* [FEATURE #244](https://github.com/BurntSushi/ripgrep/issues/244):
Add `--json` flag that prints results in a JSON Lines format.
* [FEATURE #321](https://github.com/BurntSushi/ripgrep/issues/321):
Add `--one-file-system` flag to skip directories on different file systems.
* [FEATURE #404](https://github.com/BurntSushi/ripgrep/issues/404):
Add `--sort` and `--sortr` flags for more sorting. Deprecate `--sort-files`.
* [FEATURE #416](https://github.com/BurntSushi/ripgrep/issues/416):
Add `--crlf` flag to permit `$` to work with carriage returns on Windows.
* [FEATURE #917](https://github.com/BurntSushi/ripgrep/issues/917):
The `--trim` flag strips prefix whitespace from all lines printed.
* [FEATURE #993](https://github.com/BurntSushi/ripgrep/issues/993):
Add `--null-data` flag, which makes ripgrep use NUL as a line terminator.
* [FEATURE #997](https://github.com/BurntSushi/ripgrep/issues/997):
The `--passthru` flag now works with the `--replace` flag.
* [FEATURE #1038-1](https://github.com/BurntSushi/ripgrep/issues/1038):
Add `--line-buffered` and `--block-buffered` for forcing a buffer strategy.
* [FEATURE #1038-2](https://github.com/BurntSushi/ripgrep/issues/1038):
Add `--pre-glob` for filtering files through the `--pre` flag.
Bug fixes:
* [BUG #2](https://github.com/BurntSushi/ripgrep/issues/2):
Searching with non-zero context can now use memory maps if appropriate.
* [BUG #200](https://github.com/BurntSushi/ripgrep/issues/200):
ripgrep will now stop correctly when its output pipe is closed.
* [BUG #389](https://github.com/BurntSushi/ripgrep/issues/389):
The `-w/--word-regexp` flag now works more intuitively.
* [BUG #643](https://github.com/BurntSushi/ripgrep/issues/643):
Detection of readable stdin has improved on Windows.
* [BUG #441](https://github.com/BurntSushi/ripgrep/issues/441),
[BUG #690](https://github.com/BurntSushi/ripgrep/issues/690),
[BUG #980](https://github.com/BurntSushi/ripgrep/issues/980):
Matching empty lines now works correctly in several corner cases.
* [BUG #764](https://github.com/BurntSushi/ripgrep/issues/764):
Color escape sequences now coalesce, which reduces output size.
* [BUG #842](https://github.com/BurntSushi/ripgrep/issues/842):
Add man page to binary Debian package.
* [BUG #922](https://github.com/BurntSushi/ripgrep/issues/922):
ripgrep is now more robust with respect to memory maps failing.
* [BUG #937](https://github.com/BurntSushi/ripgrep/issues/937):
Color escape sequences are no longer emitted for empty matches.
* [BUG #940](https://github.com/BurntSushi/ripgrep/issues/940):
Context from the `--passthru` flag should not impact process exit status.
* [BUG #984](https://github.com/BurntSushi/ripgrep/issues/984):
Fixes bug in `ignore` crate where first path was always treated as a symlink.
* [BUG #990](https://github.com/BurntSushi/ripgrep/issues/990):
Read stderr asynchronously when running a process.
* [BUG #1013](https://github.com/BurntSushi/ripgrep/issues/1013):
Add compile time and runtime CPU features to `--version` output.
* [BUG #1028](https://github.com/BurntSushi/ripgrep/pull/1028):
Don't complete bare pattern after `-f` in zsh.
0.9.0 (2018-08-03)
==================
This is a new minor version release of ripgrep that contains some minor new
features and a panoply of bug fixes.
Releases provided on GitHub for `x86_64` will now work on all target CPUs, and
will also automatically take advantage of features found on modern CPUs (such
as AVX2) for additional optimizations.
This release increases the **minimum supported Rust version** from 1.20.0 to
1.23.0.
It is anticipated that the next release of ripgrep (0.10.0) will provide
multi-line search support and a JSON output format.
**BREAKING CHANGES**:
* When `--count` and `--only-matching` are provided simultaneously, the
@@ -27,7 +787,7 @@ This release increases the **minimum supported Rust version** from 1.20.0 to
Feature enhancements:
* Added or improved file type filtering for Android, Bazel, Fuschia, Haskell,
* Added or improved file type filtering for Android, Bazel, Fuchsia, Haskell,
Java and Puppet.
* [FEATURE #411](https://github.com/BurntSushi/ripgrep/issues/411):
Add a `--stats` flag, which emits aggregate statistics after search results.
@@ -37,7 +797,7 @@ Feature enhancements:
* [FEATURE #702](https://github.com/BurntSushi/ripgrep/issues/702):
Support `\u{..}` Unicode escape sequences.
* [FEATURE #812](https://github.com/BurntSushi/ripgrep/issues/812):
Add `-b/--byte-offset` flag that reports byte offset of each matching line.
Add `-b/--byte-offset` flag that shows the byte offset of each matching line.
* [FEATURE #814](https://github.com/BurntSushi/ripgrep/issues/814):
Add `--count-matches` flag, which is like `--count`, but for each match.
* [FEATURE #880](https://github.com/BurntSushi/ripgrep/issues/880):
@@ -81,7 +841,7 @@ Bug fixes:
* [BUG #852](https://github.com/BurntSushi/ripgrep/issues/852):
Be robust with respect to `ENOMEM` errors returned by `mmap`.
* [BUG #853](https://github.com/BurntSushi/ripgrep/issues/853):
Upgrade `grep` crate to `regex-syntax 0.5.0`.
Upgrade `grep` crate to `regex-syntax 0.6.0`.
* [BUG #893](https://github.com/BurntSushi/ripgrep/issues/893):
Improve support for git submodules.
* [BUG #900](https://github.com/BurntSushi/ripgrep/issues/900):
@@ -151,7 +911,7 @@ Bug fixes:
0.8.0 (2018-02-11)
==================
This is a new minor version releae of ripgrep that satisfies several popular
This is a new minor version release of ripgrep that satisfies several popular
feature requests (config files, search compressed files, true colors), fixes
many bugs and improves the quality of life for ripgrep maintainers. This
release also includes greatly improved documentation in the form of a
@@ -849,7 +1609,7 @@ Bug fixes:
=====
Feature enhancements:
* Added or improved file type filtering for VB, R, F#, Swift, Nim, Javascript,
* Added or improved file type filtering for VB, R, F#, Swift, Nim, JavaScript,
TypeScript
* [FEATURE #20](https://github.com/BurntSushi/ripgrep/issues/20):
Adds a --no-filename flag.

Cargo.lock generated
View File

@@ -1,431 +1,537 @@
# This file is automatically @generated by Cargo.
# It is not intended for manual editing.
version = 3
[[package]]
name = "aho-corasick"
version = "0.6.6"
version = "1.1.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b2969dcb958b36655471fc61f7e416fa76033bdd4bfed0678d8fee1e2d07a1f0"
dependencies = [
"memchr 2.0.1 (registry+https://github.com/rust-lang/crates.io-index)",
"memchr",
]
[[package]]
name = "ansi_term"
version = "0.11.0"
name = "anyhow"
version = "1.0.79"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "080e9890a082662b09c1ad45f567faeeb47f22b5fb23895fbe1e651e718e25ca"
[[package]]
name = "autocfg"
version = "1.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d468802bab17cbc0cc575e9b053f41e72aa36bfa6b7f55e3529ffa43161b97fa"
[[package]]
name = "bstr"
version = "1.9.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c48f0051a4b4c5e0b6d365cd04af53aeaa209e3cc15ec2cdb69e73cc87fbd0dc"
dependencies = [
"winapi 0.3.5 (registry+https://github.com/rust-lang/crates.io-index)",
"memchr",
"regex-automata",
"serde",
]
[[package]]
name = "atty"
version = "0.2.11"
name = "cc"
version = "1.0.83"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f1174fb0b6ec23863f8b971027804a42614e347eafb0a95bf0b12cdae21fc4d0"
dependencies = [
"libc 0.2.42 (registry+https://github.com/rust-lang/crates.io-index)",
"termion 1.5.1 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi 0.3.5 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "bitflags"
version = "1.0.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "bytecount"
version = "0.3.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"simd 0.2.2 (registry+https://github.com/rust-lang/crates.io-index)",
"jobserver",
"libc",
]
[[package]]
name = "cfg-if"
version = "0.1.4"
version = "1.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "baf1de4339761588bc0619e3cbc0120ee582ebb74b53b4efbf79117bd2da40fd"
[[package]]
name = "clap"
version = "2.32.0"
name = "crossbeam-channel"
version = "0.5.10"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "82a9b73a36529d9c47029b9fb3a6f0ea3cc916a261195352ba19e770fc1748b2"
dependencies = [
"ansi_term 0.11.0 (registry+https://github.com/rust-lang/crates.io-index)",
"atty 0.2.11 (registry+https://github.com/rust-lang/crates.io-index)",
"bitflags 1.0.3 (registry+https://github.com/rust-lang/crates.io-index)",
"strsim 0.7.0 (registry+https://github.com/rust-lang/crates.io-index)",
"textwrap 0.10.0 (registry+https://github.com/rust-lang/crates.io-index)",
"unicode-width 0.1.5 (registry+https://github.com/rust-lang/crates.io-index)",
"cfg-if",
"crossbeam-utils",
]
[[package]]
name = "crossbeam"
version = "0.3.2"
name = "crossbeam-deque"
version = "0.8.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "fca89a0e215bab21874660c67903c5f143333cab1da83d041c7ded6053774751"
dependencies = [
"cfg-if",
"crossbeam-epoch",
"crossbeam-utils",
]
[[package]]
name = "crossbeam-epoch"
version = "0.9.17"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0e3681d554572a651dda4186cd47240627c3d0114d45a95f6ad27f2f22e7548d"
dependencies = [
"autocfg",
"cfg-if",
"crossbeam-utils",
]
[[package]]
name = "crossbeam-utils"
version = "0.8.18"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c3a430a770ebd84726f584a90ee7f020d28db52c6d02138900f22341f866d39c"
dependencies = [
"cfg-if",
]
[[package]]
name = "encoding_rs"
version = "0.8.4"
version = "0.8.33"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7268b386296a025e474d5140678f75d6de9493ae55a5d709eeb9dd08149945e1"
dependencies = [
"cfg-if 0.1.4 (registry+https://github.com/rust-lang/crates.io-index)",
"simd 0.2.2 (registry+https://github.com/rust-lang/crates.io-index)",
"cfg-if",
"packed_simd",
]
[[package]]
name = "encoding_rs_io"
version = "0.1.1"
version = "0.1.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1cc3c5651fb62ab8aa3103998dade57efdd028544bd300516baa31840c252a83"
dependencies = [
"encoding_rs 0.8.4 (registry+https://github.com/rust-lang/crates.io-index)",
"encoding_rs",
]
[[package]]
name = "fnv"
version = "1.0.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "fuchsia-zircon"
version = "0.3.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"bitflags 1.0.3 (registry+https://github.com/rust-lang/crates.io-index)",
"fuchsia-zircon-sys 0.3.3 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "fuchsia-zircon-sys"
version = "0.3.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "glob"
version = "0.2.11"
version = "0.3.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d2fabcfbdc87f4758337ca535fb41a6d701b65693ce38287d856d1674551ec9b"
[[package]]
name = "globset"
version = "0.4.1"
version = "0.4.14"
dependencies = [
"aho-corasick 0.6.6 (registry+https://github.com/rust-lang/crates.io-index)",
"fnv 1.0.6 (registry+https://github.com/rust-lang/crates.io-index)",
"glob 0.2.11 (registry+https://github.com/rust-lang/crates.io-index)",
"log 0.4.3 (registry+https://github.com/rust-lang/crates.io-index)",
"memchr 2.0.1 (registry+https://github.com/rust-lang/crates.io-index)",
"regex 1.0.2 (registry+https://github.com/rust-lang/crates.io-index)",
"aho-corasick",
"bstr",
"glob",
"log",
"regex-automata",
"regex-syntax",
"serde",
"serde_json",
]
[[package]]
name = "grep"
version = "0.1.9"
version = "0.3.1"
dependencies = [
"log 0.4.3 (registry+https://github.com/rust-lang/crates.io-index)",
"memchr 2.0.1 (registry+https://github.com/rust-lang/crates.io-index)",
"regex 1.0.2 (registry+https://github.com/rust-lang/crates.io-index)",
"regex-syntax 0.6.2 (registry+https://github.com/rust-lang/crates.io-index)",
"grep-cli",
"grep-matcher",
"grep-pcre2",
"grep-printer",
"grep-regex",
"grep-searcher",
"termcolor",
"walkdir",
]
[[package]]
name = "grep-cli"
version = "0.1.10"
dependencies = [
"bstr",
"globset",
"libc",
"log",
"termcolor",
"winapi-util",
]
[[package]]
name = "grep-matcher"
version = "0.1.7"
dependencies = [
"memchr",
"regex",
]
[[package]]
name = "grep-pcre2"
version = "0.1.7"
dependencies = [
"grep-matcher",
"log",
"pcre2",
]
[[package]]
name = "grep-printer"
version = "0.2.1"
dependencies = [
"bstr",
"grep-matcher",
"grep-regex",
"grep-searcher",
"log",
"serde",
"serde_json",
"termcolor",
]
[[package]]
name = "grep-regex"
version = "0.1.12"
dependencies = [
"bstr",
"grep-matcher",
"log",
"regex-automata",
"regex-syntax",
]
[[package]]
name = "grep-searcher"
version = "0.1.13"
dependencies = [
"bstr",
"encoding_rs",
"encoding_rs_io",
"grep-matcher",
"grep-regex",
"log",
"memchr",
"memmap2",
"regex",
]
[[package]]
name = "ignore"
version = "0.4.3"
version = "0.4.22"
dependencies = [
"crossbeam 0.3.2 (registry+https://github.com/rust-lang/crates.io-index)",
"globset 0.4.1",
"lazy_static 1.0.2 (registry+https://github.com/rust-lang/crates.io-index)",
"log 0.4.3 (registry+https://github.com/rust-lang/crates.io-index)",
"memchr 2.0.1 (registry+https://github.com/rust-lang/crates.io-index)",
"regex 1.0.2 (registry+https://github.com/rust-lang/crates.io-index)",
"same-file 1.0.2 (registry+https://github.com/rust-lang/crates.io-index)",
"tempdir 0.3.7 (registry+https://github.com/rust-lang/crates.io-index)",
"thread_local 0.3.5 (registry+https://github.com/rust-lang/crates.io-index)",
"walkdir 2.1.4 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi 0.3.5 (registry+https://github.com/rust-lang/crates.io-index)",
"bstr",
"crossbeam-channel",
"crossbeam-deque",
"globset",
"log",
"memchr",
"regex-automata",
"same-file",
"walkdir",
"winapi-util",
]
[[package]]
name = "lazy_static"
version = "1.0.2"
name = "itoa"
version = "1.0.10"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b1a46d1a171d865aa5f83f92695765caa047a9b4cbae2cbf37dbd613a793fd4c"
[[package]]
name = "jemalloc-sys"
version = "0.5.4+5.3.0-patched"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ac6c1946e1cea1788cbfde01c993b52a10e2da07f4bac608228d1bed20bfebf2"
dependencies = [
"cc",
"libc",
]
[[package]]
name = "jemallocator"
version = "0.5.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a0de374a9f8e63150e6f5e8a60cc14c668226d7a347d8aee1a45766e3c4dd3bc"
dependencies = [
"jemalloc-sys",
"libc",
]
[[package]]
name = "jobserver"
version = "0.1.27"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8c37f63953c4c63420ed5fd3d6d398c719489b9f872b9fa683262f8edd363c7d"
dependencies = [
"libc",
]
[[package]]
name = "lexopt"
version = "0.3.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "baff4b617f7df3d896f97fe922b64817f6cd9a756bb81d40f8883f2f66dcb401"
[[package]]
name = "libc"
version = "0.2.42"
version = "0.2.151"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "302d7ab3130588088d277783b1e2d2e10c9e9e4a16dd9050e6ec93fb3e7048f4"
[[package]]
name = "libm"
version = "0.2.8"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "4ec2a862134d2a7d32d7983ddcdd1c4923530833c9f2ea1a44fc5fa473989058"
[[package]]
name = "log"
version = "0.4.3"
version = "0.4.20"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"cfg-if 0.1.4 (registry+https://github.com/rust-lang/crates.io-index)",
]
checksum = "b5e6163cb8c49088c2c36f57875e58ccd8c87c7427f7fbd50ea6710b2f3f2e8f"
[[package]]
name = "memchr"
version = "2.0.1"
version = "2.7.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "523dc4f511e55ab87b694dc30d0f820d60906ef06413f93d4d7a1385599cc149"
[[package]]
name = "memmap2"
version = "0.9.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "45fd3a57831bf88bc63f8cebc0cf956116276e97fef3966103e96416209f7c92"
dependencies = [
"libc 0.2.42 (registry+https://github.com/rust-lang/crates.io-index)",
"libc",
]
[[package]]
name = "memmap"
version = "0.6.2"
name = "num-traits"
version = "0.2.17"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "39e3200413f237f41ab11ad6d161bc7239c84dcb631773ccd7de3dfe4b5c267c"
dependencies = [
"libc 0.2.42 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi 0.3.5 (registry+https://github.com/rust-lang/crates.io-index)",
"autocfg",
"libm",
]
[[package]]
name = "num_cpus"
version = "1.8.0"
name = "packed_simd"
version = "0.3.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1f9f08af0c877571712e2e3e686ad79efad9657dbf0f7c3c8ba943ff6c38932d"
dependencies = [
"libc 0.2.42 (registry+https://github.com/rust-lang/crates.io-index)",
"cfg-if",
"num-traits",
]
[[package]]
name = "rand"
version = "0.4.2"
name = "pcre2"
version = "0.2.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "4c9d53a8ea5fc3d3568d3de4bebc12606fd0eb8234c602576f1f1ee4880488a7"
dependencies = [
"fuchsia-zircon 0.3.3 (registry+https://github.com/rust-lang/crates.io-index)",
"libc 0.2.42 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi 0.3.5 (registry+https://github.com/rust-lang/crates.io-index)",
"libc",
"log",
"pcre2-sys",
]
[[package]]
name = "redox_syscall"
version = "0.1.40"
name = "pcre2-sys"
version = "0.2.8"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "25b8a7b5253a4465b873a21ee7e8d6ec561a57eed5d319621bec36bea35c86ae"
dependencies = [
"cc",
"libc",
"pkg-config",
]
[[package]]
name = "redox_termios"
version = "0.1.1"
name = "pkg-config"
version = "0.3.28"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "69d3587f8a9e599cc7ec2c00e331f71c4e69a5f9a4b8a6efd5b07466b9736f9a"
[[package]]
name = "proc-macro2"
version = "1.0.76"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "95fc56cda0b5c3325f5fbbd7ff9fda9e02bb00bb3dac51252d2f1bfa1cb8cc8c"
dependencies = [
"redox_syscall 0.1.40 (registry+https://github.com/rust-lang/crates.io-index)",
"unicode-ident",
]
[[package]]
name = "quote"
version = "1.0.35"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "291ec9ab5efd934aaf503a6466c5d5251535d108ee747472c3977cc5acc868ef"
dependencies = [
"proc-macro2",
]
[[package]]
name = "regex"
version = "1.0.2"
version = "1.10.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "380b951a9c5e80ddfd6136919eef32310721aa4aacd4889a8d39124b026ab343"
dependencies = [
"aho-corasick 0.6.6 (registry+https://github.com/rust-lang/crates.io-index)",
"memchr 2.0.1 (registry+https://github.com/rust-lang/crates.io-index)",
"regex-syntax 0.6.2 (registry+https://github.com/rust-lang/crates.io-index)",
"thread_local 0.3.5 (registry+https://github.com/rust-lang/crates.io-index)",
"utf8-ranges 1.0.0 (registry+https://github.com/rust-lang/crates.io-index)",
"aho-corasick",
"memchr",
"regex-automata",
"regex-syntax",
]
[[package]]
name = "regex-automata"
version = "0.4.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5f804c7828047e88b2d32e2d7fe5a105da8ee3264f01902f796c8e067dc2483f"
dependencies = [
"aho-corasick",
"memchr",
"regex-syntax",
]
[[package]]
name = "regex-syntax"
version = "0.6.2"
version = "0.8.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"ucd-util 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "remove_dir_all"
version = "0.5.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"winapi 0.3.5 (registry+https://github.com/rust-lang/crates.io-index)",
]
checksum = "c08c74e62047bb2de4ff487b251e4a92e24f48745648451635cec7d591162d9f"
[[package]]
name = "ripgrep"
version = "0.8.1"
version = "14.0.3"
dependencies = [
"atty 0.2.11 (registry+https://github.com/rust-lang/crates.io-index)",
"bytecount 0.3.1 (registry+https://github.com/rust-lang/crates.io-index)",
"clap 2.32.0 (registry+https://github.com/rust-lang/crates.io-index)",
"encoding_rs 0.8.4 (registry+https://github.com/rust-lang/crates.io-index)",
"encoding_rs_io 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)",
"globset 0.4.1",
"grep 0.1.9",
"ignore 0.4.3",
"lazy_static 1.0.2 (registry+https://github.com/rust-lang/crates.io-index)",
"libc 0.2.42 (registry+https://github.com/rust-lang/crates.io-index)",
"log 0.4.3 (registry+https://github.com/rust-lang/crates.io-index)",
"memchr 2.0.1 (registry+https://github.com/rust-lang/crates.io-index)",
"memmap 0.6.2 (registry+https://github.com/rust-lang/crates.io-index)",
"num_cpus 1.8.0 (registry+https://github.com/rust-lang/crates.io-index)",
"regex 1.0.2 (registry+https://github.com/rust-lang/crates.io-index)",
"same-file 1.0.2 (registry+https://github.com/rust-lang/crates.io-index)",
"termcolor 1.0.1 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi 0.3.5 (registry+https://github.com/rust-lang/crates.io-index)",
"anyhow",
"bstr",
"grep",
"ignore",
"jemallocator",
"lexopt",
"log",
"serde",
"serde_derive",
"serde_json",
"termcolor",
"textwrap",
"walkdir",
]
[[package]]
name = "ryu"
version = "1.0.16"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f98d2aa92eebf49b69786be48e4477826b256916e84a57ff2a4f21923b48eb4c"
[[package]]
name = "same-file"
version = "1.0.2"
version = "1.0.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "93fc1dc3aaa9bfed95e02e6eadabb4baf7e3078b0bd1b4d7b6b0b68378900502"
dependencies = [
"winapi 0.3.5 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi-util",
]
[[package]]
name = "simd"
version = "0.2.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "strsim"
version = "0.7.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "tempdir"
version = "0.3.7"
name = "serde"
version = "1.0.195"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "63261df402c67811e9ac6def069e4786148c4563f4b50fd4bf30aa370d626b02"
dependencies = [
"rand 0.4.2 (registry+https://github.com/rust-lang/crates.io-index)",
"remove_dir_all 0.5.1 (registry+https://github.com/rust-lang/crates.io-index)",
"serde_derive",
]
[[package]]
name = "serde_derive"
version = "1.0.195"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "46fe8f8603d81ba86327b23a2e9cdf49e1255fb94a4c5f297f6ee0547178ea2c"
dependencies = [
"proc-macro2",
"quote",
"syn",
]
[[package]]
name = "serde_json"
version = "1.0.111"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "176e46fa42316f18edd598015a5166857fc835ec732f5215eac6b7bdbf0a84f4"
dependencies = [
"itoa",
"ryu",
"serde",
]
[[package]]
name = "syn"
version = "2.0.48"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0f3531638e407dfc0814761abb7c00a5b54992b849452a0646b7f65c9f770f3f"
dependencies = [
"proc-macro2",
"quote",
"unicode-ident",
]
[[package]]
name = "termcolor"
version = "1.0.1"
version = "1.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ff1bc3d3f05aff0403e8ac0d92ced918ec05b666a43f83297ccef5bea8a3d449"
dependencies = [
"wincolor 1.0.0 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "termion"
version = "1.5.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"libc 0.2.42 (registry+https://github.com/rust-lang/crates.io-index)",
"redox_syscall 0.1.40 (registry+https://github.com/rust-lang/crates.io-index)",
"redox_termios 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi-util",
]
[[package]]
name = "textwrap"
version = "0.10.0"
version = "0.16.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"unicode-width 0.1.5 (registry+https://github.com/rust-lang/crates.io-index)",
]
checksum = "222a222a5bfe1bba4a77b45ec488a741b3cb8872e5e499451fd7d0129c9c7c3d"
[[package]]
name = "thread_local"
version = "0.3.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"lazy_static 1.0.2 (registry+https://github.com/rust-lang/crates.io-index)",
"unreachable 1.0.0 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "ucd-util"
version = "0.1.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "unicode-width"
version = "0.1.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "unreachable"
version = "1.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"void 1.0.2 (registry+https://github.com/rust-lang/crates.io-index)",
]
[[package]]
name = "utf8-ranges"
version = "1.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "void"
version = "1.0.2"
name = "unicode-ident"
version = "1.0.12"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3354b9ac3fae1ff6755cb6db53683adb661634f67557942dea4facebec0fee4b"
[[package]]
name = "walkdir"
version = "2.1.4"
version = "2.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d71d857dc86794ca4c280d616f7da00d2dbfd8cd788846559a6813e6aa4b54ee"
dependencies = [
"same-file 1.0.2 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi 0.3.5 (registry+https://github.com/rust-lang/crates.io-index)",
"same-file",
"winapi-util",
]
[[package]]
name = "winapi"
version = "0.3.5"
version = "0.3.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5c839a674fcd7a98952e593242ea400abe93992746761e38641405d28b00f419"
dependencies = [
"winapi-i686-pc-windows-gnu 0.4.0 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi-x86_64-pc-windows-gnu 0.4.0 (registry+https://github.com/rust-lang/crates.io-index)",
"winapi-i686-pc-windows-gnu",
"winapi-x86_64-pc-windows-gnu",
]
[[package]]
name = "winapi-i686-pc-windows-gnu"
version = "0.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ac3b87c63620426dd9b991e5ce0329eff545bccbbb34f3be09ff6fb6ab51b7b6"
[[package]]
name = "winapi-util"
version = "0.1.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f29e6f9198ba0d26b4c9f07dbe6f9ed633e1f3d5b8b414090084349e46a52596"
dependencies = [
"winapi",
]
[[package]]
name = "winapi-x86_64-pc-windows-gnu"
version = "0.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
[[package]]
name = "wincolor"
version = "1.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
dependencies = [
"winapi 0.3.5 (registry+https://github.com/rust-lang/crates.io-index)",
]
[metadata]
"checksum aho-corasick 0.6.6 (registry+https://github.com/rust-lang/crates.io-index)" = "c1c6d463cbe7ed28720b5b489e7c083eeb8f90d08be2a0d6bb9e1ffea9ce1afa"
"checksum ansi_term 0.11.0 (registry+https://github.com/rust-lang/crates.io-index)" = "ee49baf6cb617b853aa8d93bf420db2383fab46d314482ca2803b40d5fde979b"
"checksum atty 0.2.11 (registry+https://github.com/rust-lang/crates.io-index)" = "9a7d5b8723950951411ee34d271d99dddcc2035a16ab25310ea2c8cfd4369652"
"checksum bitflags 1.0.3 (registry+https://github.com/rust-lang/crates.io-index)" = "d0c54bb8f454c567f21197eefcdbf5679d0bd99f2ddbe52e84c77061952e6789"
"checksum bytecount 0.3.1 (registry+https://github.com/rust-lang/crates.io-index)" = "882585cd7ec84e902472df34a5e01891202db3bf62614e1f0afe459c1afcf744"
"checksum cfg-if 0.1.4 (registry+https://github.com/rust-lang/crates.io-index)" = "efe5c877e17a9c717a0bf3613b2709f723202c4e4675cc8f12926ded29bcb17e"
"checksum clap 2.32.0 (registry+https://github.com/rust-lang/crates.io-index)" = "b957d88f4b6a63b9d70d5f454ac8011819c6efa7727858f458ab71c756ce2d3e"
"checksum crossbeam 0.3.2 (registry+https://github.com/rust-lang/crates.io-index)" = "24ce9782d4d5c53674646a6a4c1863a21a8fc0cb649b3c94dfc16e45071dea19"
"checksum encoding_rs 0.8.4 (registry+https://github.com/rust-lang/crates.io-index)" = "88a1b66a0d28af4b03a8c8278c6dcb90e6e600d89c14500a9e7a02e64b9ee3ac"
"checksum encoding_rs_io 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)" = "ad0ffe753ba194ef1bc070e8d61edaadb1536c05e364fc9178ca6cbde10922c4"
"checksum fnv 1.0.6 (registry+https://github.com/rust-lang/crates.io-index)" = "2fad85553e09a6f881f739c29f0b00b0f01357c743266d478b68951ce23285f3"
"checksum fuchsia-zircon 0.3.3 (registry+https://github.com/rust-lang/crates.io-index)" = "2e9763c69ebaae630ba35f74888db465e49e259ba1bc0eda7d06f4a067615d82"
"checksum fuchsia-zircon-sys 0.3.3 (registry+https://github.com/rust-lang/crates.io-index)" = "3dcaa9ae7725d12cdb85b3ad99a434db70b468c09ded17e012d86b5c1010f7a7"
"checksum glob 0.2.11 (registry+https://github.com/rust-lang/crates.io-index)" = "8be18de09a56b60ed0edf84bc9df007e30040691af7acd1c41874faac5895bfb"
"checksum lazy_static 1.0.2 (registry+https://github.com/rust-lang/crates.io-index)" = "fb497c35d362b6a331cfd94956a07fc2c78a4604cdbee844a81170386b996dd3"
"checksum libc 0.2.42 (registry+https://github.com/rust-lang/crates.io-index)" = "b685088df2b950fccadf07a7187c8ef846a959c142338a48f9dc0b94517eb5f1"
"checksum log 0.4.3 (registry+https://github.com/rust-lang/crates.io-index)" = "61bd98ae7f7b754bc53dca7d44b604f733c6bba044ea6f41bc8d89272d8161d2"
"checksum memchr 2.0.1 (registry+https://github.com/rust-lang/crates.io-index)" = "796fba70e76612589ed2ce7f45282f5af869e0fdd7cc6199fa1aa1f1d591ba9d"
"checksum memmap 0.6.2 (registry+https://github.com/rust-lang/crates.io-index)" = "e2ffa2c986de11a9df78620c01eeaaf27d94d3ff02bf81bfcca953102dd0c6ff"
"checksum num_cpus 1.8.0 (registry+https://github.com/rust-lang/crates.io-index)" = "c51a3322e4bca9d212ad9a158a02abc6934d005490c054a2778df73a70aa0a30"
"checksum rand 0.4.2 (registry+https://github.com/rust-lang/crates.io-index)" = "eba5f8cb59cc50ed56be8880a5c7b496bfd9bd26394e176bc67884094145c2c5"
"checksum redox_syscall 0.1.40 (registry+https://github.com/rust-lang/crates.io-index)" = "c214e91d3ecf43e9a4e41e578973adeb14b474f2bee858742d127af75a0112b1"
"checksum redox_termios 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)" = "7e891cfe48e9100a70a3b6eb652fef28920c117d366339687bd5576160db0f76"
"checksum regex 1.0.2 (registry+https://github.com/rust-lang/crates.io-index)" = "5bbbea44c5490a1e84357ff28b7d518b4619a159fed5d25f6c1de2d19cc42814"
"checksum regex-syntax 0.6.2 (registry+https://github.com/rust-lang/crates.io-index)" = "747ba3b235651f6e2f67dfa8bcdcd073ddb7c243cb21c442fc12395dfcac212d"
"checksum remove_dir_all 0.5.1 (registry+https://github.com/rust-lang/crates.io-index)" = "3488ba1b9a2084d38645c4c08276a1752dcbf2c7130d74f1569681ad5d2799c5"
"checksum same-file 1.0.2 (registry+https://github.com/rust-lang/crates.io-index)" = "cfb6eded0b06a0b512c8ddbcf04089138c9b4362c2f696f3c3d76039d68f3637"
"checksum simd 0.2.2 (registry+https://github.com/rust-lang/crates.io-index)" = "ed3686dd9418ebcc3a26a0c0ae56deab0681e53fe899af91f5bbcee667ebffb1"
"checksum strsim 0.7.0 (registry+https://github.com/rust-lang/crates.io-index)" = "bb4f380125926a99e52bc279241539c018323fab05ad6368b56f93d9369ff550"
"checksum tempdir 0.3.7 (registry+https://github.com/rust-lang/crates.io-index)" = "15f2b5fb00ccdf689e0149d1b1b3c03fead81c2b37735d812fa8bddbbf41b6d8"
"checksum termcolor 1.0.1 (registry+https://github.com/rust-lang/crates.io-index)" = "722426c4a0539da2c4ffd9b419d90ad540b4cff4a053be9069c908d4d07e2836"
"checksum termion 1.5.1 (registry+https://github.com/rust-lang/crates.io-index)" = "689a3bdfaab439fd92bc87df5c4c78417d3cbe537487274e9b0b2dce76e92096"
"checksum textwrap 0.10.0 (registry+https://github.com/rust-lang/crates.io-index)" = "307686869c93e71f94da64286f9a9524c0f308a9e1c87a583de8e9c9039ad3f6"
"checksum thread_local 0.3.5 (registry+https://github.com/rust-lang/crates.io-index)" = "279ef31c19ededf577bfd12dfae728040a21f635b06a24cd670ff510edd38963"
"checksum ucd-util 0.1.1 (registry+https://github.com/rust-lang/crates.io-index)" = "fd2be2d6639d0f8fe6cdda291ad456e23629558d466e2789d2c3e9892bda285d"
"checksum unicode-width 0.1.5 (registry+https://github.com/rust-lang/crates.io-index)" = "882386231c45df4700b275c7ff55b6f3698780a650026380e72dabe76fa46526"
"checksum unreachable 1.0.0 (registry+https://github.com/rust-lang/crates.io-index)" = "382810877fe448991dfc7f0dd6e3ae5d58088fd0ea5e35189655f84e6814fa56"
"checksum utf8-ranges 1.0.0 (registry+https://github.com/rust-lang/crates.io-index)" = "662fab6525a98beff2921d7f61a39e7d59e0b425ebc7d0d9e66d316e55124122"
"checksum void 1.0.2 (registry+https://github.com/rust-lang/crates.io-index)" = "6a02e4885ed3bc0f2de90ea6dd45ebcbb66dacffe03547fadbb0eeae2770887d"
"checksum walkdir 2.1.4 (registry+https://github.com/rust-lang/crates.io-index)" = "63636bd0eb3d00ccb8b9036381b526efac53caf112b7783b730ab3f8e44da369"
"checksum winapi 0.3.5 (registry+https://github.com/rust-lang/crates.io-index)" = "773ef9dcc5f24b7d850d0ff101e542ff24c3b090a9768e03ff889fdef41f00fd"
"checksum winapi-i686-pc-windows-gnu 0.4.0 (registry+https://github.com/rust-lang/crates.io-index)" = "ac3b87c63620426dd9b991e5ce0329eff545bccbbb34f3be09ff6fb6ab51b7b6"
"checksum winapi-x86_64-pc-windows-gnu 0.4.0 (registry+https://github.com/rust-lang/crates.io-index)" = "712e227841d057c1ee1cd2fb22fa7e5a5461ae8e48fa2ca79ec42cfc1931183f"
"checksum wincolor 1.0.0 (registry+https://github.com/rust-lang/crates.io-index)" = "b9dc3aa9dcda98b5a16150c54619c1ead22e3d3a5d458778ae914be760aa981a"
checksum = "712e227841d057c1ee1cd2fb22fa7e5a5461ae8e48fa2ca79ec42cfc1931183f"

Cargo.toml

@@ -1,30 +1,34 @@
[package]
name = "ripgrep"
version = "0.8.1" #:version
version = "14.0.3" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
ripgrep is a line-oriented search tool that recursively searches your current
directory for a regex pattern while respecting your gitignore rules. ripgrep
has first class support on Windows, macOS and Linux
ripgrep is a line-oriented search tool that recursively searches the current
directory for a regex pattern while respecting gitignore rules. ripgrep has
first class support on Windows, macOS and Linux.
"""
documentation = "https://github.com/BurntSushi/ripgrep"
homepage = "https://github.com/BurntSushi/ripgrep"
repository = "https://github.com/BurntSushi/ripgrep"
readme = "README.md"
keywords = ["regex", "grep", "egrep", "search", "pattern"]
categories = ["command-line-utilities", "text-processing"]
license = "Unlicense OR MIT"
exclude = ["HomebrewFormula"]
exclude = [
"HomebrewFormula",
"/.github/",
"/ci/",
"/pkg/brew",
"/benchsuite/",
"/scripts/",
]
build = "build.rs"
autotests = false
[badges]
travis-ci = { repository = "BurntSushi/ripgrep" }
appveyor = { repository = "BurntSushi/ripgrep" }
edition = "2021"
rust-version = "1.72"
[[bin]]
bench = false
path = "src/main.rs"
path = "crates/core/main.rs"
name = "rg"
[[test]]
@@ -32,51 +36,86 @@ name = "integration"
path = "tests/tests.rs"
[workspace]
members = ["grep", "globset", "ignore"]
members = [
"crates/globset",
"crates/grep",
"crates/cli",
"crates/matcher",
"crates/pcre2",
"crates/printer",
"crates/regex",
"crates/searcher",
"crates/ignore",
]
[dependencies]
atty = "0.2.11"
bytecount = "0.3.1"
encoding_rs = "0.8"
encoding_rs_io = "0.1"
globset = { version = "0.4.0", path = "globset" }
grep = { version = "0.1.8", path = "grep" }
ignore = { version = "0.4.0", path = "ignore" }
lazy_static = "1"
libc = "0.2"
log = "0.4"
memchr = "2"
memmap = "0.6"
num_cpus = "1"
regex = "1"
same-file = "1"
termcolor = "1"
anyhow = "1.0.75"
bstr = "1.7.0"
grep = { version = "0.3.1", path = "crates/grep" }
ignore = { version = "0.4.21", path = "crates/ignore" }
lexopt = "0.3.0"
log = "0.4.5"
serde_json = "1.0.23"
termcolor = "1.1.0"
textwrap = { version = "0.16.0", default-features = false }
[dependencies.clap]
version = "2.29.4"
default-features = false
features = ["suggestions", "color"]
[target.'cfg(all(target_env = "musl", target_pointer_width = "64"))'.dependencies.jemallocator]
version = "0.5.0"
[target.'cfg(windows)'.dependencies.winapi]
version = "0.3"
features = ["std", "winnt"]
[build-dependencies]
lazy_static = "1"
[build-dependencies.clap]
version = "2.29.4"
default-features = false
features = ["suggestions", "color"]
[dev-dependencies]
serde = "1.0.77"
serde_derive = "1.0.77"
walkdir = "2"
[features]
avx-accel = [
"bytecount/avx-accel",
]
simd-accel = [
"bytecount/simd-accel",
"encoding_rs/simd-accel",
]
simd-accel = ["grep/simd-accel"]
pcre2 = ["grep/pcre2"]
[profile.release]
debug = true
debug = 1
[profile.release-lto]
inherits = "release"
opt-level = 3
debug = "none"
strip = "symbols"
debug-assertions = false
overflow-checks = false
lto = "fat"
panic = "abort"
incremental = false
codegen-units = 1
# This is the main way to strip binaries in the deb package created by
# 'cargo deb'. For other release binaries, we (currently) call 'strip'
# explicitly in the release process.
[profile.deb]
inherits = "release"
debug = false
[package.metadata.deb]
features = ["pcre2"]
section = "utils"
assets = [
["target/release/rg", "usr/bin/", "755"],
["COPYING", "usr/share/doc/ripgrep/", "644"],
["LICENSE-MIT", "usr/share/doc/ripgrep/", "644"],
["UNLICENSE", "usr/share/doc/ripgrep/", "644"],
["CHANGELOG.md", "usr/share/doc/ripgrep/CHANGELOG", "644"],
["README.md", "usr/share/doc/ripgrep/README", "644"],
["FAQ.md", "usr/share/doc/ripgrep/FAQ", "644"],
# The man page is automatically generated by ripgrep's build process, so
# this file isn't actually committed. Instead, to create a dpkg, either
# create a deployment/deb directory and copy the man page to it, or use the
# 'ci/build-deb' script.
["deployment/deb/rg.1", "usr/share/man/man1/rg.1", "644"],
# Similarly for shell completions.
["deployment/deb/rg.bash", "usr/share/bash-completion/completions/rg", "644"],
["deployment/deb/rg.fish", "usr/share/fish/vendor_completions.d/rg.fish", "644"],
["deployment/deb/_rg", "usr/share/zsh/vendor-completions/", "644"],
]
extended-description = """\
ripgrep (rg) recursively searches your current directory for a regex pattern.
By default, ripgrep will respect your .gitignore and automatically skip hidden
files/directories and binary files.
"""

470
FAQ.md

@@ -5,17 +5,19 @@
* [When is the next release?](#release)
* [Does ripgrep have a man page?](#manpage)
* [Does ripgrep have support for shell auto-completion?](#complete)
* [How do I use lookaround and/or backreferences?](#fancy)
* [How do I configure ripgrep's colors?](#colors)
* [How do I enable true colors on Windows?](#truecolors-windows)
* [How do I stop ripgrep from messing up colors when I kill it?](#stop-ripgrep)
* [How can I get results in a consistent order?](#order)
* [How do I search files that aren't UTF-8?](#encoding)
* [How do I search compressed files?](#compressed)
* [How do I search over multiple lines?](#multiline)
* [How do I use lookaround and/or backreferences?](#fancy)
* [How do I configure ripgrep's colors?](#colors)
* [How do I enable true colors on Windows?](#truecolors-windows)
* [How do I stop ripgrep from messing up colors when I kill it?](#stop-ripgrep)
* [Why does using a leading `/` on Windows fail?](#because-cygwin)
* [How do I get around the regex size limit?](#size-limit)
* [How do I make the `-f/--file` flag faster?](#dfa-size)
* [How do I make the output look like The Silver Searcher's output?](#silver-searcher-output)
* [Why does ripgrep get slower when I enabled PCRE2 regexes?](#pcre2-slow)
* [When I run `rg`, why does it execute some other command?](#rg-other-cmd)
* [How do I create an alias for ripgrep on Windows?](#rg-alias-windows)
* [How do I create a PowerShell profile?](#powershell-profile)
@@ -24,6 +26,7 @@
* [How is ripgrep licensed?](#license)
* [Can ripgrep replace grep?](#posix4ever)
* [What does the "rip" in ripgrep mean?](#intentcountsforsomething)
* [How can I donate to ripgrep or its maintainers?](#donations)
<h3 name="config">
@@ -49,26 +52,33 @@ ripgrep is a project whose contributors are volunteers. A release schedule
adds undue stress to said volunteers. Therefore, releases are made on a best
effort basis and no dates **will ever be given**.
One exception to this is high impact bugs. If a ripgrep release contains a
significant regression, then there will generally be a strong push to get a
patch release out with a fix.
An exception to this _can be_ high impact bugs. If a ripgrep release contains
a significant regression, then there will generally be a strong push to get a
patch release out with a fix. However, no promises are made.
<h3 name="manpage">
Does ripgrep have a man page?
</h3>
Yes! Whenever ripgrep is compiled on a system with `asciidoc` present, then a
man page is generated from ripgrep's argv parser. After compiling ripgrep, you
can find the man page like so from the root of the repository:
Yes. If you installed ripgrep through a package manager on a Unix system, then
it would have ideally been installed for you in the proper location. In which
case, `man rg` should just work.
Otherwise, you can ask ripgrep to generate the man page:
```
$ find ./target -name rg.1 -print0 | xargs -0 ls -t | head -n1
./target/debug/build/ripgrep-79899d0edd4129ca/out/rg.1
$ mkdir -p man/man1
$ rg --generate man > man/man1/rg.1
$ MANPATH="$PWD/man" man rg
```
Running `man -l ./target/debug/build/ripgrep-79899d0edd4129ca/out/rg.1` will
show the man page in your normal pager.
Or, if your version of `man` supports the `-l/--local-file` flag, then this
will suffice:
```
$ rg --generate man | man -l -
```
Note that the man page's documentation for options is equivalent to the output
shown in `rg --help`. To see more condensed documentation (one line per flag),
@@ -82,22 +92,42 @@ The man page is also included in all
Does ripgrep have support for shell auto-completion?
</h3>
Yes! Shell completions can be found in the
[same directory as the man page](#manpage)
after building ripgrep. Zsh completions are maintained separately and committed
to the repository in `complete/_rg`.
Yes! If you installed ripgrep through a package manager on a Unix system, then
the shell completion files included in the release archive should have been
installed for you automatically. If not, you can generate completions using
ripgrep's command line interface.
Shell completions are also included in all
[ripgrep binary releases](https://github.com/BurntSushi/ripgrep/releases).
For **bash**:
For **bash**, move `rg.bash` to
`$XDG_CONFIG_HOME/bash_completion` or `/etc/bash_completion.d/`.
```
$ dir="$XDG_CONFIG_HOME/bash_completion"
$ mkdir -p "$dir"
$ rg --generate complete-bash > "$dir/rg.bash"
```
For **fish**, move `rg.fish` to `$HOME/.config/fish/completions/`.
For **fish**:
For **zsh**, move `_rg` to one of your `$fpath` directories.
```
$ dir="$XDG_CONFIG_HOME/fish/completions"
$ mkdir -p "$dir"
$ rg --generate complete-fish > "$dir/rg.fish"
```
For **PowerShell**, add `. _rg.ps1` to your PowerShell
For **zsh**:
```
$ dir="$HOME/.zsh-complete"
$ mkdir -p "$dir"
$ rg --generate complete-zsh > "$dir/_rg"
```
For **PowerShell**, create the completions:
```
$ rg --generate complete-powershell > _rg.ps1
```
And then add `. _rg.ps1` to your PowerShell
[profile](https://technet.microsoft.com/en-us/library/bb613488(v=vs.85).aspx)
(note the leading period). If the `_rg.ps1` file is not on your `PATH`, do
`. /path/to/_rg.ps1` instead.
@@ -117,7 +147,7 @@ from run to run of ripgrep.
The only way to make the order of results consistent is to ask ripgrep to
sort the output. Currently, this will disable all parallelism. (On smaller
repositories, you might not notice much of a performance difference!) You
can achieve this with the `--sort-files` flag.
can achieve this with the `--sort path` flag.
There is more discussion on this topic here:
https://github.com/BurntSushi/ripgrep/issues/152
@@ -135,10 +165,10 @@ How do I search compressed files?
</h3>
ripgrep's `-z/--search-zip` flag will cause it to search compressed files
automatically. Currently, this supports gzip, bzip2, lzma, lz4 and xz only and
requires the corresponding `gzip`, `bzip2` and `xz` binaries to be installed on
your system. (That is, ripgrep does decompression by shelling out to another
process.)
automatically. Currently, this supports gzip, bzip2, xz, lzma, lz4, Brotli and
Zstd. Each of these requires the corresponding `gzip`, `bzip2`, `xz`,
`lz4`, `brotli` and `zstd` binaries to be installed on your system. (That is,
ripgrep does decompression by shelling out to another process.)
ripgrep currently does not search archive formats, so `*.tar.gz` files, for
example, are skipped.
@@ -148,22 +178,45 @@ example, are skipped.
How do I search over multiple lines?
</h3>
This isn't currently possible. ripgrep is fundamentally a line-oriented search
tool. With that said,
[multiline search is a planned opt-in feature](https://github.com/BurntSushi/ripgrep/issues/176).
The `-U/--multiline` flag enables ripgrep to report results that span over
multiple lines.
<h3 name="fancy">
How do I use lookaround and/or backreferences?
</h3>
This isn't currently possible. ripgrep uses finite automata to implement
regular expression search, and in turn, guarantees linear time searching on all
inputs. It is difficult to efficiently support lookaround and backreferences in
finite automata engines, so ripgrep does not provide these features.
ripgrep's default regex engine does not support lookaround or backreferences.
This is primarily because the default regex engine is implemented using finite
state machines in order to guarantee a linear worst case time complexity on all
inputs. Backreferences are not possible to implement in this paradigm, and
lookaround appears difficult to do efficiently.
If a production quality regular expression engine with these features is ever
written in Rust, then it is possible ripgrep will provide it as an opt-in
However, ripgrep optionally supports using PCRE2 as the regex engine instead of
the default one based on finite state machines. You can enable PCRE2 with the
`-P/--pcre2` flag. For example, in the root of the ripgrep repo, you can easily
find every run of 10 word characters that is immediately repeated:
```
$ rg -P '(\w{10})\1'
tests/misc.rs
483: cmd.arg("--max-filesize").arg("44444444444444444444");
globset/src/glob.rs
1206: matches!(match7, "a*a*a*a*a*a*a*a*a", "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa");
```
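To make the backreference concrete, here is a minimal sketch using Python's `re` module, which is a backtracking engine like PCRE2. It is only an illustration of what `(\w{10})\1` accepts, not how ripgrep or PCRE2 is implemented; the helper name `has_repeated_block` is hypothetical.

```python
import re

# Group 1 captures exactly 10 word characters; \1 then requires those same
# 10 characters to appear again immediately afterwards.
pattern = re.compile(r"(\w{10})\1")


def has_repeated_block(line: str) -> bool:
    """Return True if the line contains a 10-character block repeated twice."""
    return pattern.search(line) is not None


# Twenty 4s: "4444444444" followed by "4444444444".
print(has_repeated_block('cmd.arg("--max-filesize").arg("44444444444444444444");'))  # True
# A mirrored string is not a repeated block, so it does not match.
print(has_repeated_block("abcdefghijjihgfedcba"))  # False
```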
If your version of ripgrep doesn't support PCRE2, then you'll get an error
message when you try to use the `-P/--pcre2` flag:
```
$ rg -P '(\w{10})\1'
PCRE2 is not available in this build of ripgrep
```
Most of the releases distributed by the ripgrep project here on GitHub will
come bundled with PCRE2 enabled. If you installed ripgrep through a different
means (like your system's package manager), then please reach out to the
maintainer of that package to see whether it's possible to enable the PCRE2
feature.
@@ -181,7 +234,7 @@ The `--color` flag accepts one of the following possible values: `never`,
ripgrep to only enable colors when it is printing to a terminal. But if you
pipe ripgrep to a file or some other process, then it will suppress colors.
The --colors` flag is a bit more complicated. The general format is:
The `--colors` flag is a bit more complicated. The general format is:
```
--colors '{type}:{attribute}:{value}'
@@ -289,6 +342,26 @@ available
[here](https://github.com/BurntSushi/ripgrep/issues/281#issuecomment-269093893).
<h3 name="because-cygwin">
Why does using a leading `/` on Windows fail?
</h3>
If you're using cygwin on Windows and try to search for a pattern beginning
with a `/`, then it's possible that cygwin is mangling that pattern without
your knowledge. For example, if you tried running `rg /foo` in a cygwin shell
on Windows, then cygwin might mistakenly perform path translation on `/foo`,
which would result in `rg C:/msys64/foo` being searched instead.
You can fix this in one of three ways:
1. Stop using cygwin.
2. Escape the leading slash with an additional slash. e.g., `rg //foo`.
3. Temporarily disable path translation by setting `MSYS_NO_PATHCONV=1`. e.g.,
`MSYS_NO_PATHCONV=1 rg /foo`.
For more details, see https://github.com/BurntSushi/ripgrep/issues/1277
<h3 name="size-limit">
How do I get around the regex size limit?
</h3>
@@ -368,6 +441,301 @@ $ RIPGREP_CONFIG_PATH=$HOME/.config/ripgrep/rc rg foo
```
<h3 name="pcre2-slow">
Why does ripgrep get slower when I enable PCRE2 regexes?
</h3>
When you use the `--pcre2` (`-P` for short) flag, ripgrep will use the PCRE2
regex engine instead of the default. Both regex engines are quite fast,
but PCRE2 provides a number of additional features such as look-around and
backreferences that many enjoy using. This is largely because PCRE2 uses
a backtracking implementation whereas the default regex engine uses a finite
automaton based implementation. The former provides the ability to add lots of
bells and whistles over the latter, but the latter executes with worst case
linear time complexity.
With that out of the way, if you've used `-P` with ripgrep, you may have
noticed that it can be slower. The reasons for this are quite complex, largely
because the optimizations that ripgrep uses to implement fast search are
themselves complex.
The task ripgrep has before it is somewhat simple; all it needs to do is search
a file for occurrences of some pattern and then print the lines containing
those occurrences. The problem lies in what is considered a valid match and how
exactly we read the bytes from a file.
In terms of what is considered a valid match, remember that ripgrep will only
report matches spanning a single line by default. The problem here is that
some patterns can match across multiple lines, and ripgrep needs to prevent
that from happening. For example, `foo\sbar` will match `foo\nbar`. The most
obvious way to achieve this is to read the data from a file, and then apply
the pattern search to that data for each line. The problem with this approach
is that it can be quite slow; it would be much faster to let the pattern
search across as much data as possible. It's faster because it gets rid of the
overhead of finding the boundaries of every line, and also because it gets rid
of the overhead of starting and stopping the pattern search for every single
line. (This is operating under the general assumption that matching lines are
much rarer than non-matching lines.)
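The two strategies described above can be sketched roughly as follows. This is a simplified Python illustration rather than ripgrep's actual searcher; the function names are hypothetical, and the whole-buffer version assumes the pattern can never match a `\n`.

```python
import re


def search_line_by_line(pattern: str, data: str) -> list:
    """Slow strategy: find every line boundary and run the regex once per line."""
    rx = re.compile(pattern)
    return [line for line in data.split("\n") if rx.search(line)]


def search_whole_buffer(pattern: str, data: str) -> list:
    """Fast strategy: run the regex over the whole buffer and only look for
    line boundaries around an actual match.

    Assumption: the pattern cannot match across a newline.
    """
    rx = re.compile(pattern)
    lines = []
    pos = 0
    while pos <= len(data):
        m = rx.search(data, pos)
        if m is None:
            break
        # Expand the match to the enclosing line.
        start = data.rfind("\n", 0, m.start()) + 1
        end = data.find("\n", m.end())
        if end == -1:
            end = len(data)
        lines.append(data[start:end])
        # Resume after this line so each matching line is reported once.
        pos = end + 1
    return lines


data = "alpha\nfoo bar\nbeta foo\n"
print(search_line_by_line("foo", data))
print(search_whole_buffer("foo", data))
```

Both functions report the same lines here; the second one simply avoids splitting and restarting the search for the (usually far more numerous) lines that do not match.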
It turns out that we can use the faster approach by applying a very simple
restriction to the pattern: *statically prevent* the pattern from matching
through a `\n` character. Namely, when given a pattern like `foo\sbar`,
ripgrep will remove `\n` from the `\s` character class automatically. In some
cases, a simple removal is not so easy. For example, ripgrep will return an
error when your pattern includes a `\n` literal:
```
$ rg '\n'
the literal '"\n"' is not allowed in a regex
```
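The effect of removing `\n` from `\s` can be illustrated with a small Python sketch. The character class `[^\S\n]` is a hand-written stand-in for "whitespace except newline"; ripgrep performs the equivalent rewrite on the parsed pattern, which this example does not attempt to reproduce.

```python
import re

haystack = "foo\nbar\nfoo bar\n"

# \s includes \n, so this match starts on line 1 and ends on line 2.
spanning = re.search(r"foo\sbar", haystack)
print(repr(spanning.group()))

# [^\S\n] means "any whitespace except \n", so the match is forced to stay
# within a single line and lands on line 3 instead.
line_safe = re.search(r"foo[^\S\n]bar", haystack)
print(repr(line_safe.group()))
```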
So what does this have to do with PCRE2? Well, ripgrep's default regex engine
exposes APIs for doing syntactic analysis on the pattern in a way that makes
it quite easy to strip `\n` from the pattern (or otherwise detect it and report
an error if stripping isn't possible). PCRE2 seemingly does not provide a
similar API, so ripgrep does not do any stripping when PCRE2 is enabled. This
forces ripgrep to use the "slow" search strategy of searching each line
individually.
OK, so if enabling PCRE2 slows down the default method of searching because it
forces matches to be limited to a single line, then why is PCRE2 also sometimes
slower when performing multiline searches? Well, that's because there are
*multiple* reasons why using PCRE2 in ripgrep can be slower than the default
regex engine. This time, blame PCRE2's Unicode support, which ripgrep enables
by default. In particular, PCRE2 cannot simultaneously enable Unicode support
and search arbitrary data. That is, when PCRE2's Unicode support is enabled,
the data **must** be valid UTF-8 (to do otherwise is to invoke undefined
behavior). This is in contrast to ripgrep's default regex engine, which can
enable Unicode support and still search arbitrary data. ripgrep's default
regex engine simply won't match invalid UTF-8 for a pattern that can otherwise
only match valid UTF-8. Why doesn't PCRE2 do the same? This author isn't
familiar with its internals, so we can't comment on it here.
The bottom line here is that we can't enable PCRE2's Unicode support without
simultaneously incurring a performance penalty for ensuring that we are
searching valid UTF-8. In particular, ripgrep will transcode the contents
of each file to UTF-8 while replacing invalid UTF-8 data with the Unicode
replacement codepoint. ripgrep then disables PCRE2's own internal UTF-8
checking, since we've guaranteed the data we hand it will be valid UTF-8.
ripgrep takes this approach because if we do hand PCRE2 invalid
UTF-8, then it will report a match error if it comes across an invalid UTF-8
sequence. This is not good news for ripgrep, since it will stop it from
searching the rest of the file, and will also print potentially undesirable
error messages to users.
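As a rough illustration of the lossy transcoding step, the following Python sketch replaces invalid UTF-8 with the Unicode replacement codepoint (U+FFFD). ripgrep itself uses `encoding_rs` for this; the function name `to_valid_utf8` is hypothetical and Python's decoder is only standing in for it.

```python
def to_valid_utf8(raw: bytes) -> bytes:
    """Return bytes that are guaranteed to be valid UTF-8.

    Invalid sequences are replaced with U+FFFD, which encodes to EF BF BD.
    """
    return raw.decode("utf-8", errors="replace").encode("utf-8")


# 0xFF can never appear in valid UTF-8.
print(to_valid_utf8(b"foo\xffbar\n"))
# Already-valid input passes through unchanged.
print(to_valid_utf8("Шерлок\n".encode("utf-8")) == "Шерлок\n".encode("utf-8"))
```

The cost is that every byte of every file has to pass through this step before the regex engine sees it, even when the file was valid UTF-8 all along.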
All right, the above is a lot of information to swallow if you aren't already
familiar with ripgrep internals. Let's make this concrete with some examples.
First, let's get some data big enough to magnify the performance differences:
```
$ curl -O 'https://burntsushi.net/stuff/subtitles2016-sample.gz'
$ gzip -d subtitles2016-sample
$ md5sum subtitles2016-sample
e3cb796a20bbc602fbfd6bb43bda45f5 subtitles2016-sample
```
To search this data, we will use the pattern `^\w{42}$`, which contains exactly
one hit in the file and has no literals. Having no literals is important,
because it ensures that the regex engine won't use literal optimizations to
speed up the search. In other words, it lets us reason coherently about the
actual task that the regex engine is performing.
Let's now walk through a few examples in light of the information above. First,
let's consider the default search using ripgrep's default regex engine and
then the same search with PCRE2:
```
$ time rg '^\w{42}$' subtitles2016-sample
21225780:EverymajordevelopmentinthehistoryofAmerica
real 0m1.783s
user 0m1.731s
sys 0m0.051s
$ time rg -P '^\w{42}$' subtitles2016-sample
21225780:EverymajordevelopmentinthehistoryofAmerica
real 0m2.458s
user 0m2.419s
sys 0m0.038s
```
In this particular example, both pattern searches are using a Unicode aware
`\w` character class and both are counting lines in order to report line
numbers. The key difference here is that the first search will not search
line by line, but the second one will. We can observe which strategy ripgrep
uses by passing the `--trace` flag:
```
$ rg '^\w{42}$' subtitles2016-sample --trace
[... snip ...]
TRACE|grep_searcher::searcher|grep-searcher/src/searcher/mod.rs:622: Some("subtitles2016-sample"): searching via memory map
TRACE|grep_searcher::searcher|grep-searcher/src/searcher/mod.rs:712: slice reader: searching via slice-by-line strategy
TRACE|grep_searcher::searcher::core|grep-searcher/src/searcher/core.rs:61: searcher core: will use fast line searcher
[... snip ...]
$ rg -P '^\w{42}$' subtitles2016-sample --trace
[... snip ...]
TRACE|grep_searcher::searcher|grep-searcher/src/searcher/mod.rs:622: Some("subtitles2016-sample"): searching via memory map
TRACE|grep_searcher::searcher|grep-searcher/src/searcher/mod.rs:705: slice reader: needs transcoding, using generic reader
TRACE|grep_searcher::searcher|grep-searcher/src/searcher/mod.rs:685: generic reader: searching via roll buffer strategy
TRACE|grep_searcher::searcher::core|grep-searcher/src/searcher/core.rs:63: searcher core: will use slow line searcher
[... snip ...]
```
The first says it is using the "fast line searcher" whereas the latter says
it is using the "slow line searcher." The latter also shows that we are
decoding the contents of the file, which also impacts performance.
Interestingly, in this case, the pattern does not match a `\n` and the file
we're searching is valid UTF-8, so neither the slow line-by-line search
strategy nor the decoding are necessary. We could fix the former issue with
better PCRE2 introspection APIs. We can actually fix the latter issue with
ripgrep's `--no-encoding` flag, which prevents the automatic UTF-8 decoding,
but will enable PCRE2's own UTF-8 validity checking. Unfortunately, it's slower
in my build of ripgrep:
```
$ time rg -P '^\w{42}$' subtitles2016-sample --no-encoding
21225780:EverymajordevelopmentinthehistoryofAmerica
real 0m3.074s
user 0m3.021s
sys 0m0.051s
```
(Tip: use the `--trace` flag to verify that no decoding in ripgrep is
happening.)
A possible reason why PCRE2's UTF-8 checking is slower is that it might not
be as fast as the highly optimized UTF-8 checking routines found in the
[`encoding_rs`](https://github.com/hsivonen/encoding_rs) library, which is what
ripgrep uses for UTF-8 decoding. Moreover, my build of ripgrep enables
`encoding_rs`'s SIMD optimizations, which may be in play here.
Also, note that using the `--no-encoding` flag can cause PCRE2 to report
invalid UTF-8 errors, which causes ripgrep to stop searching the file:
```
$ cat invalid-utf8
foobar
$ xxd invalid-utf8
00000000: 666f 6fff 6261 720a foo.bar.
$ rg foo invalid-utf8
1:foobar
$ rg -P foo invalid-utf8
1:foo�bar
$ rg -P foo invalid-utf8 --no-encoding
invalid-utf8: PCRE2: error matching: UTF-8 error: illegal byte (0xfe or 0xff)
```
All right, so at this point, you might think that we could remove the penalty
for line-by-line searching by enabling multiline search. After all, our
particular pattern can't match across multiple lines anyway, so we'll still get
the results we want. Let's try it:
```
$ time rg -U '^\w{42}$' subtitles2016-sample
21225780:EverymajordevelopmentinthehistoryofAmerica
real 0m1.803s
user 0m1.748s
sys 0m0.054s
$ time rg -P -U '^\w{42}$' subtitles2016-sample
21225780:EverymajordevelopmentinthehistoryofAmerica
real 0m2.962s
user 0m2.246s
sys 0m0.713s
```
Search times remain the same with the default regex engine, but the PCRE2
search gets _slower_. What happened? The secrets can be revealed with the
`--trace` flag once again. In the former case, ripgrep actually detects that
the pattern can't match across multiple lines, and so will fall back to the
"fast line search" strategy as with our search without `-U`.
However, for PCRE2, things are much worse. Namely, since Unicode mode is still
enabled, ripgrep is still going to decode UTF-8 to ensure that it hands only
valid UTF-8 to PCRE2. Unfortunately, one key downside of multiline search is
that ripgrep cannot do it incrementally. Since matches can be arbitrarily long,
ripgrep actually needs the entire file in memory at once. Normally, we can use
a memory map for this, but because we need to UTF-8 decode the file before
searching it, ripgrep winds up reading the entire contents of the file on to
the heap before executing a search. Owch.
OK, so Unicode is killing us here. The file we're searching is _mostly_ ASCII,
so maybe we're OK with missing some data. (Try `rg '[\w--\p{ascii}]'` to see
non-ASCII word characters that an ASCII-only `\w` character class would miss.)
We can disable Unicode in both searches, but this is done differently depending
on the regex engine we use:
```
$ time rg '(?-u)^\w{42}$' subtitles2016-sample
21225780:EverymajordevelopmentinthehistoryofAmerica
real 0m1.714s
user 0m1.669s
sys 0m0.044s
$ time rg -P '^\w{42}$' subtitles2016-sample --no-pcre2-unicode
21225780:EverymajordevelopmentinthehistoryofAmerica
real 0m1.997s
user 0m1.958s
sys 0m0.037s
```
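What an ASCII-only `\w` gives up can be seen in a small Python sketch. Python's `re.ASCII` flag plays the role that `(?-u)` and `--no-pcre2-unicode` play in the commands above; this is an analogy in a different regex engine, not ripgrep's own behavior.

```python
import re

unicode_w = re.compile(r"\w+")           # Unicode-aware \w (Python's default for str)
ascii_w = re.compile(r"\w+", re.ASCII)   # \w restricted to [A-Za-z0-9_]

word = "naïve"
print(unicode_w.fullmatch(word) is not None)  # True
print(ascii_w.fullmatch(word) is not None)    # False: "ï" is not an ASCII word character

# Word characters that an ASCII-only \w would miss in a sample string.
sample = "naïve Шерлок x_1"
missed = [ch for ch in sample if unicode_w.fullmatch(ch) and not ch.isascii()]
print(missed)
```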
For the most part, ripgrep's default regex engine performs about the same.
PCRE2 does improve a little bit, and is now almost as fast as the default
regex engine. If you look at the output of `--trace`, you'll see that ripgrep
will no longer perform UTF-8 decoding, but it does still use the slow
line-by-line searcher.
At this point, we can combine all of our insights above: let's try to get off
of the slow line-by-line searcher by enabling multiline mode, and let's stop
UTF-8 decoding by disabling Unicode support:
```
$ time rg -U '(?-u)^\w{42}$' subtitles2016-sample
21225780:EverymajordevelopmentinthehistoryofAmerica
real 0m1.714s
user 0m1.655s
sys 0m0.058s
$ time rg -P -U '^\w{42}$' subtitles2016-sample --no-pcre2-unicode
21225780:EverymajordevelopmentinthehistoryofAmerica
real 0m1.121s
user 0m1.071s
sys 0m0.048s
```
Ah, there's PCRE2's JIT shining! ripgrep's default regex engine once again
remains about the same, but PCRE2 no longer needs to search line-by-line and it
no longer needs to do any kind of UTF-8 checks. This allows the file to get
memory mapped and passed right through PCRE2's JIT at impressive speeds. (As
a brief and interesting historical note, the configuration of "memory map +
multiline + no-Unicode" is exactly the configuration used by The Silver
Searcher. This analysis perhaps sheds some light on why that
configuration is useful!)
In summary, if you want PCRE2 to go as fast as possible and you don't care
about Unicode and you don't care about matches possibly spanning across
multiple lines, then enable multiline mode with `-U` and disable PCRE2's
Unicode support with the `--no-pcre2-unicode` flag.
Caveat emptor: This author is not a PCRE2 expert, so there may be APIs that can
improve performance that the author missed. Similarly, there may be alternative
designs for a searching tool that are more amenable to how PCRE2 works.
<h3 name="rg-other-cmd">
When I run <code>rg</code>, why does it execute some other command?
</h3>
@@ -503,7 +871,7 @@ rg foo --files-with-matches | xargs sed -i 's/foo/bar/g'
will replace all instances of 'foo' with 'bar' in the files in which
ripgrep finds the foo pattern. The `-i` flag to sed indicates that you are
editing files in place, and `s/foo/bar/g` says that you are performing a
**s**ubstitution of the pattren `foo` for `bar`, and that you are doing this
**s**ubstitution of the pattern `foo` for `bar`, and that you are doing this
substitution **g**lobally (all occurrences of the pattern in each file).
Note: the above command assumes that you are using GNU sed. If you are using
@@ -550,7 +918,7 @@ The reason why ripgrep is dual licensed this way is two-fold:
1. I, as ripgrep's author, would like to participate in a small bit of
ideological activism by promoting the Unlicense's goal: to disclaim
copyright monopoly interest.
2. I, as ripgrep's author, would like as many people to use rigprep as
2. I, as ripgrep's author, would like as many people to use ripgrep as
possible. Since the Unlicense is not a proven or well known license, ripgrep
is also offered under the MIT license, which is ubiquitous and accepted by
almost everyone.
@@ -615,8 +983,8 @@ Here are some cases where you might *not* want to use ripgrep. The same caveats
for the previous section apply.
* Are you writing portable shell scripts intended to work in a variety of
environments? Great, probably not a good idea to use ripgrep! ripgrep is has
nowhere near the ubquity of grep, so if you do use ripgrep, you might need
environments? Great, probably not a good idea to use ripgrep! ripgrep has
nowhere near the ubiquity of grep, so if you do use ripgrep, you might need
to futz with the installation process more than you would with grep.
* Do you care about POSIX compatibility? If so, then you can't use ripgrep
because it never was, isn't and never will be POSIX compatible.
@@ -662,3 +1030,17 @@ grep](#posix4ever),
ripgrep is neither actually a "grep killer" nor was it ever intended to be. It
certainly does eat into some of its use cases, but that's nothing that other
tools like ack or The Silver Searcher weren't already doing.
<h3 name="donations">
How can I donate to ripgrep or its maintainers?
</h3>
I welcome [sponsorship](https://github.com/sponsors/BurntSushi/).
Or if you'd prefer, donating to a charitable organization that you like would
also be most welcome. My favorites are:
* [The Internet Archive](https://archive.org/donate/)
* [Rails Girls](https://railsgirlssummerofcode.org/)
* [Wikipedia](https://wikimediafoundation.org/support/)

383
GUIDE.md

@@ -18,6 +18,8 @@ translatable to any command line shell environment.
* [Replacements](#replacements)
* [Configuration file](#configuration-file)
* [File encoding](#file-encoding)
* [Binary data](#binary-data)
* [Preprocessor](#preprocessor)
* [Common options](#common-options)
@@ -109,7 +111,7 @@ colors, you'll notice that `faster` will be highlighted instead of just the
It is beyond the scope of this guide to provide a full tutorial on regular
expressions, but ripgrep's specific syntax is documented here:
https://docs.rs/regex/0.2.5/regex/#syntax
https://docs.rs/regex/*/regex/#syntax
### Recursive search
@@ -175,16 +177,25 @@ After recursive search, ripgrep's most important feature is what it *doesn't*
search. By default, when you search a directory, ripgrep will ignore all of
the following:
1. Files and directories that match the rules in your `.gitignore` glob
pattern.
1. Files and directories that match glob patterns in these three categories:
1. `.gitignore` globs (including global and repo-specific globs). This
includes `.gitignore` files in parent directories that are part of the
same `git` repository. (Unless the `--no-require-git` flag is given.)
2. `.ignore` globs, which take precedence over all gitignore globs
when there's a conflict. This includes `.ignore` files in parent
directories.
3. `.rgignore` globs, which take precedence over all `.ignore` globs
when there's a conflict. This includes `.rgignore` files in parent
directories.
2. Hidden files and directories.
3. Binary files. (ripgrep considers any file with a `NUL` byte to be binary.)
4. Symbolic links aren't followed.
All of these things can be toggled using various flags provided by ripgrep:
1. You can disable `.gitignore` handling with the `--no-ignore` flag.
2. Hidden files and directories can be searched with the `--hidden` flag.
1. You can disable all ignore-related filtering with the `--no-ignore` flag.
2. Hidden files and directories can be searched with the `--hidden` (`-.` for
short) flag.
3. Binary files can be searched via the `--text` (`-a` for short) flag.
Be careful with this flag! Binary files may emit control characters to your
terminal, which might cause strange behavior.
@@ -227,7 +238,7 @@ with the following contents:
```
ripgrep treats `.ignore` files with higher precedence than `.gitignore` files
(and treats `.rgignore` files with higher precdence than `.ignore` files).
(and treats `.rgignore` files with higher precedence than `.ignore` files).
This means ripgrep will see the `!log/` whitelist rule first and search that
directory.
@@ -235,6 +246,11 @@ Like `.gitignore`, a `.ignore` file can be placed in any directory. Its rules
will be processed with respect to the directory it resides in, just like
`.gitignore`.
To process `.gitignore` and `.ignore` files case insensitively, use the flag
`--ignore-file-case-insensitive`. This is especially useful on case insensitive
file systems like those on Windows and macOS. Note though that this can come
with a significant performance penalty, and is therefore disabled by default.
For a more in depth description of how glob patterns in a `.gitignore` file
are interpreted, please see `man gitignore`.
@@ -370,7 +386,7 @@ make: *.mak, *.mk, GNUmakefile, Gnumakefile, Makefile, gnumakefile, makefile
By default, ripgrep comes with a bunch of pre-defined types. Generally, these
types correspond to well known public formats. But you can define your own
types as well. For example, perhaps you frequently search "web" files, which
consist of Javascript, HTML and CSS:
consist of JavaScript, HTML and CSS:
```
$ rg --type-add 'web:*.html' --type-add 'web:*.css' --type-add 'web:*.js' -tweb title
@@ -405,6 +421,21 @@ alias rg="rg --type-add 'web:*.{html,css,js}'"
or add `--type-add=web:*.{html,css,js}` to your ripgrep configuration file.
([Configuration files](#configuration-file) are covered in more detail later.)
#### The special `all` file type
A special option supported by the `--type` flag is `all`. `--type all` looks
for a match in any of the supported file types listed by `--type-list`,
including those added on the command line using `--type-add`. It's equivalent
to the command `rg --type agda --type asciidoc --type asm ...`, where `...`
stands for a list of `--type` flags for the rest of the types in `--type-list`.
As an example, let's suppose you have a shell script in your current directory,
`my-shell-script`, which includes a shell library, `my-shell-library.bash`.
Both `rg --type sh` and `rg --type all` would only search for matches in
`my-shell-library.bash`, not `my-shell-script`, because the globs matched
by the `sh` file type don't include files without an extension. On the
other hand, `rg --type-not all` would search `my-shell-script` but not
`my-shell-library.bash`.
### Replacements
@@ -520,9 +551,9 @@ config file. Once the environment variable is set, open the file and just type
in the flags you want set automatically. There are only two rules for
describing the format of the config file:
1. Every line is a shell argument, after trimming ASCII whitespace.
2. Lines starting with `#` (optionally preceded by any amount of
ASCII whitespace) are ignored.
1. Every line is a shell argument, after trimming whitespace.
2. Lines starting with `#` (optionally preceded by any amount of whitespace)
are ignored.
In particular, there is no escaping. Each line is given to ripgrep as a single
command line argument verbatim.
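The two rules can be sketched as a few lines of Python. This is a hypothetical re-implementation for illustration, not ripgrep's parser; in particular, skipping blank lines is an assumption that the rules above do not state explicitly.

```python
def parse_config(text: str) -> list:
    """Turn the contents of a ripgrep-style config file into an argument list."""
    args = []
    for line in text.splitlines():
        arg = line.strip()           # rule 1: trim surrounding whitespace
        if not arg:                  # assumption: blank lines carry no argument
            continue
        if arg.startswith("#"):      # rule 2: comment lines are ignored
            continue
        args.append(arg)             # no escaping: the line is used verbatim
    return args


example = """\
# Add my 'web' type.
--type-add
web:*.{html,css,js}*

  # An indented comment is still a comment.
--glob=!.git/*
"""
print(parse_config(example))
```

Because each line becomes exactly one argument, a flag and its value must either share a line using `=` or sit on two consecutive lines.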
@@ -532,19 +563,23 @@ formatting peculiarities:
```
$ cat $HOME/.ripgreprc
# Don't let ripgrep vomit really long lines to my terminal.
# Don't let ripgrep vomit really long lines to my terminal, and show a preview.
--max-columns=150
--max-columns-preview
# Add my 'web' type.
--type-add
web:*.{html,css,js}*
# Search hidden files / directories (e.g. dotfiles) by default
--hidden
# Using glob patterns to include/exclude files or folders
--glob=!git/*
--glob=!.git/*
# or
--glob
!git/*
!.git/*
# Set the colors.
--colors=line:none
@@ -580,7 +615,7 @@ override it.
If you're confused about what configuration file ripgrep is reading arguments
from, then running ripgrep with the `--debug` flag should help clarify things.
The debug output should note what config file is being loaded and the arugments
The debug output should note what config file is being loaded and the arguments
that have been read from the configuration.
Finally, if you want to make absolutely sure that ripgrep *isn't* reading a
@@ -598,13 +633,14 @@ topic, but we can try to summarize its relevancy to ripgrep:
* Files are generally just a bundle of bytes. There is no reliable way to know
their encoding.
* Either the encoding of the pattern must match the encoding of the files being
searched, or a form of transcoding must be performed converts either the
searched, or a form of transcoding must be performed that converts either the
pattern or the file to the same encoding as the other.
* ripgrep tends to work best on plain text files, and among plain text files,
the most popular encodings likely consist of ASCII, latin1 or UTF-8. As
a special exception, UTF-16 is prevalent in Windows environments.
In light of the above, here is how ripgrep behaves:
In light of the above, here is how ripgrep behaves when `--encoding auto` is
given, which is the default:
* All input is assumed to be ASCII compatible (which means every byte that
corresponds to an ASCII codepoint actually is an ASCII codepoint). This
@@ -620,12 +656,15 @@ In light of the above, here is how ripgrep behaves:
they correspond to a UTF-16 BOM, then ripgrep will transcode the contents of
the file from UTF-16 to UTF-8, and then execute the search on the transcoded
version of the file. (This incurs a performance penalty since transcoding
is slower than regex searching.)
is needed in addition to regex searching.) If the file contains invalid
UTF-16, then the Unicode replacement codepoint is substituted in place of
invalid code units.
* To handle other cases, ripgrep provides a `-E/--encoding` flag, which permits
you to specify an encoding from the
[Encoding Standard](https://encoding.spec.whatwg.org/#concept-encoding-get).
ripgrep will assume *all* files searched are the encoding specified and
will perform a transcoding step just like in the UTF-16 case described above.
ripgrep will assume *all* files searched are the encoding specified (unless
the file has a BOM) and will perform a transcoding step just like in the
UTF-16 case described above.
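The BOM sniffing and transcoding behavior described in this list can be approximated with the Python sketch below. It is a simplification under stated assumptions: only UTF-16 BOMs are handled, the function name `bytes_to_search` is hypothetical, and Python's codecs stand in for the `encoding_rs` library that ripgrep uses.

```python
import codecs

# BOMs to sniff for, paired with the codec used to decode the rest of the file.
UTF16_BOMS = (
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
)


def bytes_to_search(raw: bytes) -> bytes:
    """Return the bytes the regex engine would actually be run against."""
    for bom, codec in UTF16_BOMS:
        if raw.startswith(bom):
            # Transcode UTF-16 to UTF-8; invalid code units become U+FFFD.
            text = raw[len(bom):].decode(codec, errors="replace")
            return text.encode("utf-8")
    # No BOM: search the raw bytes as-is, without requiring valid UTF-8.
    return raw


utf16_file = codecs.BOM_UTF16_LE + "Шерлок".encode("utf-16-le")
print(bytes_to_search(utf16_file) == "Шерлок".encode("utf-8"))
print(bytes_to_search(b"plain \xff bytes"))
```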
By default, ripgrep will not require its input be valid UTF-8. That is, ripgrep
can and will search arbitrary bytes. The key here is that if you're searching
@@ -635,9 +674,26 @@ pattern won't find anything. With all that said, this mode of operation is
important, because it lets you find ASCII or UTF-8 *within* files that are
otherwise arbitrary bytes.
As a special case, the `-E/--encoding` flag supports the value `none`, which
will completely disable all encoding related logic, including BOM sniffing.
When `-E/--encoding` is set to `none`, ripgrep will search the raw bytes of
the underlying file with no transcoding step. For example, here's how you might
search the raw UTF-16 encoding of the string `Шерлок`:
```
$ rg '(?-u)\(\x045\x04@\x04;\x04>\x04:\x04' -E none -a some-utf16-file
```
Of course, that's just an example meant to show how one can drop down into
raw bytes. Namely, the simpler command works as you might expect automatically:
```
$ rg 'Шерлок' some-utf16-file
```
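The raw-byte pattern for `Шерлок` can be derived mechanically. The Python sketch below assumes the file is little-endian UTF-16, which is what the byte order in that pattern implies; it searches an in-memory byte string rather than invoking ripgrep.

```python
import codecs
import re

# Each Cyrillic letter here becomes two bytes: a printable ASCII byte
# followed by 0x04 (little-endian order).
needle = "Шерлок".encode("utf-16-le")
print(needle)  # b'(\x045\x04@\x04;\x04>\x04:\x04'

# The first byte is "(", which is a regex metacharacter and must be escaped,
# hence the leading "\(" in the raw-byte pattern.
raw_pattern = re.compile(re.escape(needle))

haystack = codecs.BOM_UTF16_LE + "Mr. Шерлок Холмс".encode("utf-16-le")
print(raw_pattern.search(haystack) is not None)  # True
```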
Finally, it is possible to disable ripgrep's Unicode support from within the
pattern regular expression. For example, let's say you wanted `.` to match any
byte rather than any Unicode codepoint. (You might want this while searching a
regular expression. For example, let's say you wanted `.` to match any byte
rather than any Unicode codepoint. (You might want this while searching a
binary file, since `.` by default will not match invalid UTF-8.) You could do
this by disabling Unicode via a regular expression flag:
@@ -654,6 +710,282 @@ $ rg '\w(?-u:\w)\w'
```
### Binary data
In addition to skipping hidden files and files in your `.gitignore` by default,
ripgrep also attempts to skip binary files. ripgrep does this by default
because binary files (like PDFs or images) are typically not things you want to
search when searching for regex matches. Moreover, if content in a binary file
did match, then it's possible for undesirable binary data to be printed to your
terminal and wreak havoc.
Unfortunately, unlike skipping hidden files and respecting your `.gitignore`
rules, a file cannot as easily be classified as binary. In order to figure out
whether a file is binary, the most effective heuristic that balances
correctness with performance is to simply look for `NUL` bytes. At that point,
the determination is simple: a file is considered "binary" if and only if it
contains a `NUL` byte somewhere in its contents.
The issue is that while most binary files will have a `NUL` byte toward the
beginning of their contents, this is not necessarily true. The `NUL` byte might
be the very last byte in a large file, but that file is still considered
binary. While this leads to a fair amount of complexity inside ripgrep's
implementation, it also results in some unintuitive user experiences.
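To make the heuristic concrete, here is a minimal sketch of a `NUL` byte check written with standard tools. It only illustrates the rule described above and is not how ripgrep implements it; the helper name `has_nul` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, not part of ripgrep: succeed if the file contains
# at least one NUL byte anywhere in its contents.
has_nul() {
    # Delete every byte except NUL, then count what is left.
    count=$(tr -cd '\000' < "$1" | wc -c | tr -d ' ')
    [ "$count" -gt 0 ]
}

tmp=$(mktemp -d)
printf 'plain text\n' > "$tmp/text"
# Mostly text, but the very last byte is NUL.
printf 'mostly text\000' > "$tmp/late-nul"

for f in "$tmp/text" "$tmp/late-nul"; do
    if has_nul "$f"; then
        echo "$(basename "$f"): binary"
    else
        echo "$(basename "$f"): text"
    fi
done
rm -rf "$tmp"
```

Note that the second file is classified as binary even though its only `NUL` byte is the final byte, which is the situation the paragraph above warns about.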
At a high level, ripgrep operates in three different modes with respect to
binary files:
1. The default mode is to attempt to remove binary files from a search
completely. This is meant to mirror how ripgrep removes hidden files and
files in your `.gitignore` automatically. That is, as soon as a file is
detected as binary, searching stops. If a match was already printed (because
it was detected long before a `NUL` byte), then ripgrep will print a warning
message indicating that the search stopped prematurely. This default mode
**only applies to files searched by ripgrep as a result of recursive
directory traversal**, which is consistent with ripgrep's other automatic
filtering. For example, `rg foo .file` will search `.file` even though it
is hidden. Similarly, `rg foo binary-file` will search `binary-file` in
"binary" mode automatically.
2. Binary mode is similar to the default mode, except it will not always
stop searching after it sees a `NUL` byte. Namely, in this mode, ripgrep
will continue searching a file that is known to be binary until the first
of two conditions is met: 1) the end of the file has been reached or 2) a
match is or has been seen. This means that in binary mode, if ripgrep
reports no matches, then there are no matches in the file. When a match does
occur, ripgrep prints a message similar to one it prints when in its default
mode indicating that the search has stopped prematurely. This mode can be
forcefully enabled for all files with the `--binary` flag. The purpose of
binary mode is to provide a way to discover matches in all files, but to
avoid having binary data dumped into your terminal.
3. Text mode completely disables all binary detection and searches all files
as if they were text. This is useful when searching a file that is
predominantly text but contains a `NUL` byte, or if you are specifically
trying to search binary data. This mode can be enabled with the `-a/--text`
flag. Note that when using this mode on very large binary files, it is
possible for ripgrep to use a lot of memory.
Unfortunately, there is one additional complexity in ripgrep that can make it
difficult to reason about binary files. That is, the way binary detection works
depends on the way that ripgrep searches your files. Specifically:
* When ripgrep uses memory maps, then binary detection is only performed on the
first few kilobytes of the file in addition to every matching line.
* When ripgrep doesn't use memory maps, then binary detection is performed on
all bytes searched.
This means that whether a file is detected as binary or not can change based
on the internal search strategy used by ripgrep. If you prefer to keep
ripgrep's binary file detection consistent, then you can disable memory maps
via the `--no-mmap` flag. (The cost will be a small performance regression when
searching very large files on some platforms.)
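The difference between the two strategies can be sketched with standard tools. The helpers below are hypothetical and are not ripgrep internals. The 8192 byte prefix is an arbitrary stand-in for "the first few kilobytes", and the sketch ignores the extra check on matching lines.

```shell
#!/bin/sh
# Hypothetical helpers, not ripgrep internals.

# nul_in_prefix FILE BYTES: succeed if a NUL occurs in the first BYTES bytes.
nul_in_prefix() {
    count=$(dd if="$1" bs="$2" count=1 2>/dev/null | tr -cd '\000' | wc -c | tr -d ' ')
    [ "$count" -gt 0 ]
}

# nul_anywhere FILE: succeed if a NUL occurs anywhere in the file.
nul_anywhere() {
    count=$(tr -cd '\000' < "$1" | wc -c | tr -d ' ')
    [ "$count" -gt 0 ]
}

tmp=$(mktemp -d)
# 19000 bytes of text followed by a single trailing NUL byte.
{
    awk 'BEGIN { for (i = 0; i < 1000; i++) print "some ordinary text" }'
    printf '\000'
} > "$tmp/late-nul"

if nul_in_prefix "$tmp/late-nul" 8192; then
    echo "prefix check: binary"
else
    echo "prefix check: text"
fi
if nul_anywhere "$tmp/late-nul"; then
    echo "full scan: binary"
else
    echo "full scan: text"
fi
rm -rf "$tmp"
```

The prefix check reports the file as text while the full scan reports it as binary, which mirrors how the same file can be treated differently depending on whether memory maps are used.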
### Preprocessor
In ripgrep, a preprocessor is any type of command that can be run to transform
the input of every file before ripgrep searches it. This makes it possible to
search virtually any kind of content that can be automatically converted to
text without having to teach ripgrep how to read said content.
One common example is searching PDFs. PDFs are first and foremost meant to be
displayed to users. But PDFs often have text streams in them that can be useful
to search. In our case, we want to search Bruce Watson's excellent
dissertation,
[Taxonomies and Toolkits of Regular Language Algorithms](https://burntsushi.net/stuff/1995-watson.pdf).
After downloading it, let's try searching it:
```
$ rg 'The Commentz-Walter algorithm' 1995-watson.pdf
$
```
Surely, a dissertation on regular language algorithms would mention
Commentz-Walter. Indeed it does, but our search isn't picking it up because
PDFs are a binary format, and the text shown in the PDF may not be encoded as
simple contiguous UTF-8. Namely, even passing the `-a/--text` flag to ripgrep
will not make our search work.
One way to fix this is to convert the PDF to plain text first. This won't work
well for all PDFs, but does great in a lot of cases. (Note that the tool we
use, `pdftotext`, is part of the [poppler](https://poppler.freedesktop.org)
PDF rendering library.)
```
$ pdftotext 1995-watson.pdf > 1995-watson.txt
$ rg 'The Commentz-Walter algorithm' 1995-watson.txt
316:The Commentz-Walter algorithms : : : : : : : : : : : : : : :
7165:4.4 The Commentz-Walter algorithms
10062:in input string S , we obtain the Boyer-Moore algorithm. The Commentz-Walter algorithm
17218:The Commentz-Walter algorithm (and its variants) displayed more interesting behaviour,
17249:Aho-Corasick algorithms are used extensively. The Commentz-Walter algorithms are used
17297: The Commentz-Walter algorithms (CW). In all versions of the CW algorithms, a common program skeleton is used with di erent shift functions. The CW algorithms are
```
But having to explicitly convert every file can be a pain, especially when you
have a directory full of PDF files. Instead, we can use ripgrep's preprocessor
feature to search the PDF. ripgrep's `--pre` flag works by taking a single
command name and then executing that command for every file that it searches.
ripgrep passes the file path as the first and only argument to the command and
also sends the contents of the file to stdin. So let's write a simple shell
script that wraps `pdftotext` in a way that conforms to this interface:
```
$ cat preprocess
#!/bin/sh
exec pdftotext - -
```
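Before handing a script to `--pre`, it can help to run it by hand using the same calling convention described above: the file path as the only argument and the file contents on stdin. The following is a small sketch of that idea. `run_pre` and the toy `upper` preprocessor are made up for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: invoke a preprocessor the way the text describes,
# with the file path as the only argument and the file contents on stdin.
run_pre() {
    "$1" "$2" < "$2"
}

tmp=$(mktemp -d)

# Toy preprocessor: ignores "$1" and upper-cases whatever arrives on stdin.
cat > "$tmp/upper" <<'EOF'
#!/bin/sh
exec tr 'a-z' 'A-Z'
EOF
chmod +x "$tmp/upper"

printf 'hello\n' > "$tmp/input"
run_pre "$tmp/upper" "$tmp/input"
# prints HELLO
rm -rf "$tmp"
```

With the real script, something like `run_pre ./preprocess 1995-watson.pdf | head` should print the start of the extracted text if `pdftotext` is installed.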
With `preprocess` in the same directory as `1995-watson.pdf`, we can now use it
to search the PDF:
```
$ rg --pre ./preprocess 'The Commentz-Walter algorithm' 1995-watson.pdf
316:The Commentz-Walter algorithms : : : : : : : : : : : : : : :
7165:4.4 The Commentz-Walter algorithms
10062:in input string S , we obtain the Boyer-Moore algorithm. The Commentz-Walter algorithm
17218:The Commentz-Walter algorithm (and its variants) displayed more interesting behaviour,
17249:Aho-Corasick algorithms are used extensively. The Commentz-Walter algorithms are used
17297: The Commentz-Walter algorithms (CW). In all versions of the CW algorithms, a common program skeleton is used with di erent shift functions. The CW algorithms are
```
Note that `preprocess` must be resolvable to a command that ripgrep can read.
The simplest way to do this is to put your preprocessor command in a directory
that is in your `PATH` (or equivalent), or otherwise use an absolute path.
As a bonus, this turns out to be quite a bit faster than other specialized PDF
grepping tools:
```
$ time rg --pre ./preprocess 'The Commentz-Walter algorithm' 1995-watson.pdf -c
6
real 0.697
user 0.684
sys 0.007
maxmem 16 MB
faults 0
$ time pdfgrep 'The Commentz-Walter algorithm' 1995-watson.pdf -c
6
real 1.336
user 1.310
sys 0.023
maxmem 16 MB
faults 0
```
If you wind up needing to search a lot of PDFs, then ripgrep's parallelism can
make the speed difference even greater.
#### A more robust preprocessor
One of the problems with the aforementioned preprocessor is that it will fail
if you try to search a file that isn't a PDF:
```
$ echo foo > not-a-pdf
$ rg --pre ./preprocess 'The Commentz-Walter algorithm' not-a-pdf
not-a-pdf: preprocessor command failed: '"./preprocess" "not-a-pdf"':
-------------------------------------------------------------------------------
Syntax Warning: May not be a PDF file (continuing anyway)
Syntax Error: Couldn't find trailer dictionary
Syntax Error: Couldn't find trailer dictionary
Syntax Error: Couldn't read xref table
```
To fix this, we can make our preprocessor script a bit more robust by only
running `pdftotext` when we think the input is a non-empty PDF:
```
$ cat preprocess
#!/bin/sh
case "$1" in
*.pdf)
# The -s flag ensures that the file is non-empty.
if [ -s "$1" ]; then
exec pdftotext - -
else
exec cat
fi
;;
*)
exec cat
;;
esac
```
We can even extend our preprocessor to search other kinds of files. We don't
always know the file type from the file name, so we can use the `file` utility
to "sniff" the type of the file based on its contents:
```
$ cat preprocess
#!/bin/sh
case "$1" in
*.pdf)
# The -s flag ensures that the file is non-empty.
if [ -s "$1" ]; then
exec pdftotext - -
else
exec cat
fi
;;
*)
case $(file "$1") in
*Zstandard*)
exec pzstd -cdq
;;
*)
exec cat
;;
esac
;;
esac
```
#### Reducing preprocessor overhead
There is one more problem with the above approach: it requires running a
preprocessor for every single file that ripgrep searches. If every file needs
a preprocessor, then this is OK. But if most don't, then this can substantially
slow down searches because of the overhead of launching new processes. You
can avoid this by telling ripgrep to only invoke the preprocessor when the file
path matches a glob. For example, consider the performance difference even when
searching a repository as small as ripgrep's:
```
$ time rg --pre pre-rg 'fn is_empty' -c
crates/globset/src/lib.rs:1
crates/matcher/src/lib.rs:2
crates/ignore/src/overrides.rs:1
crates/ignore/src/gitignore.rs:1
crates/ignore/src/types.rs:1
real 0.138
user 0.485
sys 0.209
maxmem 7 MB
faults 0
$ time rg --pre pre-rg --pre-glob '*.pdf' 'fn is_empty' -c
crates/globset/src/lib.rs:1
crates/ignore/src/types.rs:1
crates/ignore/src/gitignore.rs:1
crates/ignore/src/overrides.rs:1
crates/matcher/src/lib.rs:2
real 0.008
user 0.010
sys 0.002
maxmem 7 MB
faults 0
```
### Common options
ripgrep has a lot of flags. Too many to keep in your head at once. This section
@@ -668,6 +1000,8 @@ used options that will likely impact how you use ripgrep on a regular basis.
* `-S/--smart-case`: This is similar to `--ignore-case`, but disables itself
if the pattern contains any uppercase letters. Usually this flag is put into
an alias or a config file.
* `-F/--fixed-strings`: Disable regular expression matching and treat the pattern
as a literal string.
* `-w/--word-regexp`: Require that all matches of the pattern be surrounded
by word boundaries. That is, given `pattern`, the `--word-regexp` flag will
cause ripgrep to behave as if `pattern` were actually `\b(?:pattern)\b`.
@@ -675,10 +1009,11 @@ used options that will likely impact how you use ripgrep on a regular basis.
* `--files`: Print the files that ripgrep *would* search, but don't actually
search them.
* `-a/--text`: Search binary files as if they were plain text.
* `-z/--search-zip`: Search compressed files (gzip, bzip2, lzma, xz). This is
disabled by default.
* `-U/--multiline`: Permit matches to span multiple lines.
* `-z/--search-zip`: Search compressed files (gzip, bzip2, lzma, xz, lz4,
brotli, zstd). This is disabled by default.
* `-C/--context`: Show the lines surrounding a match.
* `--sort-files`: Force ripgrep to sort its output by file name. (This disables
* `--sort path`: Force ripgrep to sort its output by file name. (This disables
parallelism, so it might be slower.)
* `-L/--follow`: Follow symbolic links while recursively searching.
* `-M/--max-columns`: Limit the length of lines printed by ripgrep.


@@ -1,53 +0,0 @@
#### What version of ripgrep are you using?
Replace this text with the output of `rg --version`.
#### How did you install ripgrep?
If you installed ripgrep with snap and are getting strange file permission or
file not found errors, then please do not file a bug. Instead, use one of the
Github binary releases.
#### What operating system are you using ripgrep on?
Replace this text with your operating system and version.
#### Describe your question, feature request, or bug.
If a question, please describe the problem you're trying to solve and give
as much context as possible.
If a feature request, please describe the behavior you want and the motivation.
Please also provide an example of how ripgrep would be used if your feature
request were added.
If a bug, please see below.
#### If this is a bug, what are the steps to reproduce the behavior?
If possible, please include both your search patterns and the corpus on which
you are searching. Unless the bug is very obvious, then it is unlikely that it
will be fixed if the ripgrep maintainers cannot reproduce it.
If the corpus is too big and you cannot decrease its size, file the bug anyway
and the ripgrep maintainers will help figure out next steps.
#### If this is a bug, what is the actual behavior?
Show the command you ran and the actual output. Include the `--debug` flag in
your invocation of ripgrep.
If the output is large, put it in a gist: https://gist.github.com/
If the output is small, put it in code fences:
```
your
output
goes
here
```
#### If this is a bug, what is the expected behavior?
What do you think ripgrep should have done?

436
README.md

@@ -1,17 +1,18 @@
ripgrep (rg)
------------
ripgrep is a line-oriented search tool that recursively searches your current
directory for a regex pattern while respecting your gitignore rules. ripgrep
has first class support on Windows, macOS and Linux, with binary downloads
available for [every release](https://github.com/BurntSushi/ripgrep/releases).
ripgrep is similar to other popular search tools like The Silver Searcher,
ack and grep.
ripgrep is a line-oriented search tool that recursively searches the current
directory for a regex pattern. By default, ripgrep will respect gitignore rules
and automatically skip hidden files/directories and binary files. (To disable
all automatic filtering by default, use `rg -uuu`.) ripgrep has first class
support on Windows, macOS and Linux, with binary downloads available for [every
release](https://github.com/BurntSushi/ripgrep/releases). ripgrep is similar to
other popular search tools like The Silver Searcher, ack and grep.
[![Linux build status](https://travis-ci.org/BurntSushi/ripgrep.svg?branch=master)](https://travis-ci.org/BurntSushi/ripgrep)
[![Windows build status](https://ci.appveyor.com/api/projects/status/github/BurntSushi/ripgrep?svg=true)](https://ci.appveyor.com/project/BurntSushi/ripgrep)
[![Build status](https://github.com/BurntSushi/ripgrep/workflows/ci/badge.svg)](https://github.com/BurntSushi/ripgrep/actions)
[![Crates.io](https://img.shields.io/crates/v/ripgrep.svg)](https://crates.io/crates/ripgrep)
[![Packaging status](https://repology.org/badge/tiny-repos/ripgrep.svg)](https://repology.org/project/ripgrep/badges)
Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org).
Dual-licensed under MIT or the [UNLICENSE](https://unlicense.org).
### CHANGELOG
@@ -23,129 +24,174 @@ Please see the [CHANGELOG](CHANGELOG.md) for a release history.
* [Installation](#installation)
* [User Guide](GUIDE.md)
* [Frequently Asked Questions](FAQ.md)
* [Regex syntax](https://docs.rs/regex/0.2.5/regex/#syntax)
* [Regex syntax](https://docs.rs/regex/1/regex/#syntax)
* [Configuration files](GUIDE.md#configuration-file)
* [Shell completions](FAQ.md#complete)
* [Building](#building)
* [Translations](#translations)
### Screenshot of search results
[![A screenshot of a sample search with ripgrep](http://burntsushi.net/stuff/ripgrep1.png)](http://burntsushi.net/stuff/ripgrep1.png)
[![A screenshot of a sample search with ripgrep](https://burntsushi.net/stuff/ripgrep1.png)](https://burntsushi.net/stuff/ripgrep1.png)
### Quick examples comparing tools
This example searches the entire Linux kernel source tree (after running
`make defconfig && make -j8`) for `[A-Z]+_SUSPEND`, where all matches must be
words. Timings were collected on a system with an Intel i7-6900K 3.2 GHz, and
ripgrep was compiled with SIMD enabled.
This example searches the entire
[Linux kernel source tree](https://github.com/BurntSushi/linux)
(after running `make defconfig && make -j8`) for `[A-Z]+_SUSPEND`, where
all matches must be words. Timings were collected on a system with an Intel
i9-12900K 5.2 GHz.
Please remember that a single benchmark is never enough! See my
[blog post on ripgrep](http://blog.burntsushi.net/ripgrep/)
[blog post on ripgrep](https://blog.burntsushi.net/ripgrep/)
for a very detailed comparison with more benchmarks and analysis.
| Tool | Command | Line count | Time |
| ---- | ------- | ---------- | ---- |
| ripgrep (Unicode) | `rg -n -w '[A-Z]+_SUSPEND'` | 450 | **0.106s** |
| [git grep](https://www.kernel.org/pub/software/scm/git/docs/git-grep.html) | `LC_ALL=C git grep -E -n -w '[A-Z]+_SUSPEND'` | 450 | 0.553s |
| [The Silver Searcher](https://github.com/ggreer/the_silver_searcher) | `ag -w '[A-Z]+_SUSPEND'` | 450 | 0.589s |
| [git grep (Unicode)](https://www.kernel.org/pub/software/scm/git/docs/git-grep.html) | `LC_ALL=en_US.UTF-8 git grep -E -n -w '[A-Z]+_SUSPEND'` | 450 | 2.266s |
| [sift](https://github.com/svent/sift) | `sift --git -n -w '[A-Z]+_SUSPEND'` | 450 | 3.505s |
| [ack](https://github.com/petdance/ack2) | `ack -w '[A-Z]+_SUSPEND'` | 1878 | 6.823s |
| [The Platinum Searcher](https://github.com/monochromegane/the_platinum_searcher) | `pt -w -e '[A-Z]+_SUSPEND'` | 450 | 14.208s |
| ripgrep (Unicode) | `rg -n -w '[A-Z]+_SUSPEND'` | 536 | **0.082s** (1.00x) |
| [hypergrep](https://github.com/p-ranav/hypergrep) | `hgrep -n -w '[A-Z]+_SUSPEND'` | 536 | 0.167s (2.04x) |
| [git grep](https://www.kernel.org/pub/software/scm/git/docs/git-grep.html) | `git grep -P -n -w '[A-Z]+_SUSPEND'` | 536 | 0.273s (3.34x) |
| [The Silver Searcher](https://github.com/ggreer/the_silver_searcher) | `ag -w '[A-Z]+_SUSPEND'` | 534 | 0.443s (5.43x) |
| [ugrep](https://github.com/Genivia/ugrep) | `ugrep -r --ignore-files --no-hidden -I -w '[A-Z]+_SUSPEND'` | 536 | 0.639s (7.82x) |
| [git grep](https://www.kernel.org/pub/software/scm/git/docs/git-grep.html) | `LC_ALL=C git grep -E -n -w '[A-Z]+_SUSPEND'` | 536 | 0.727s (8.91x) |
| [git grep (Unicode)](https://www.kernel.org/pub/software/scm/git/docs/git-grep.html) | `LC_ALL=en_US.UTF-8 git grep -E -n -w '[A-Z]+_SUSPEND'` | 536 | 2.670s (32.70x) |
| [ack](https://github.com/beyondgrep/ack3) | `ack -w '[A-Z]+_SUSPEND'` | 2677 | 2.935s (35.94x) |
(Yes, `ack` [has](https://github.com/petdance/ack2/issues/445) a
[bug](https://github.com/petdance/ack2/issues/14).)
Here's another benchmark that disregards gitignore files and searches with a
whitelist instead. The corpus is the same as in the previous benchmark, and the
flags passed to each command ensure that they are doing equivalent work:
Here's another benchmark on the same corpus as above that disregards gitignore
files and searches with a whitelist instead. The flags passed to each command
ensure that they are doing equivalent work:
| Tool | Command | Line count | Time |
| ---- | ------- | ---------- | ---- |
| ripgrep | `rg -L -u -tc -n -w '[A-Z]+_SUSPEND'` | 404 | **0.079s** |
| [ucg](https://github.com/gvansickle/ucg) | `ucg --type=cc -w '[A-Z]+_SUSPEND'` | 390 | 0.163s |
| [GNU grep](https://www.gnu.org/software/grep/) | `egrep -R -n --include='*.c' --include='*.h' -w '[A-Z]+_SUSPEND'` | 404 | 0.611s |
| ripgrep | `rg -uuu -tc -n -w '[A-Z]+_SUSPEND'` | 447 | **0.063s** (1.00x) |
| [ugrep](https://github.com/Genivia/ugrep) | `ugrep -r -n --include='*.c' --include='*.h' -w '[A-Z]+_SUSPEND'` | 447 | 0.607s (9.62x) |
| [GNU grep](https://www.gnu.org/software/grep/) | `grep -E -r -n --include='*.c' --include='*.h' -w '[A-Z]+_SUSPEND'` | 447 | 0.674s (10.69x) |
(`ucg` [has slightly different behavior in the presence of symbolic links](https://github.com/gvansickle/ucg/issues/106).)
And finally, a straight-up comparison between ripgrep and GNU grep on a single
large file (~9.3GB,
[`OpenSubtitles2016.raw.en.gz`](http://opus.lingfil.uu.se/OpenSubtitles2016/mono/OpenSubtitles2016.raw.en.gz)):
Now we'll move to searching a single large file. Here is a straight-up
comparison between ripgrep, ugrep and GNU grep on a file cached in memory
(~13GB, [`OpenSubtitles.raw.en.gz`](http://opus.nlpl.eu/download.php?f=OpenSubtitles/v2018/mono/OpenSubtitles.raw.en.gz), decompressed):
| Tool | Command | Line count | Time |
| ---- | ------- | ---------- | ---- |
| ripgrep | `rg -w 'Sherlock [A-Z]\w+'` | 5268 | **2.108s** |
| [GNU grep](https://www.gnu.org/software/grep/) | `LC_ALL=C egrep -w 'Sherlock [A-Z]\w+'` | 5268 | 7.014s |
| ripgrep (Unicode) | `rg -w 'Sherlock [A-Z]\w+'` | 7882 | **1.042s** (1.00x) |
| [ugrep](https://github.com/Genivia/ugrep) | `ugrep -w 'Sherlock [A-Z]\w+'` | 7882 | 1.339s (1.28x) |
| [GNU grep (Unicode)](https://www.gnu.org/software/grep/) | `LC_ALL=en_US.UTF-8 egrep -w 'Sherlock [A-Z]\w+'` | 7882 | 6.577s (6.31x) |
In the above benchmark, passing the `-n` flag (for showing line numbers)
increases the times to `2.640s` for ripgrep and `10.277s` for GNU grep.
increases the times to `1.664s` for ripgrep and `9.484s` for GNU grep. ugrep
times are unaffected by the presence or absence of `-n`.
Beware of performance cliffs though:
| Tool | Command | Line count | Time |
| ---- | ------- | ---------- | ---- |
| ripgrep (Unicode) | `rg -w '[A-Z]\w+ Sherlock [A-Z]\w+'` | 485 | **1.053s** (1.00x) |
| [GNU grep (Unicode)](https://www.gnu.org/software/grep/) | `LC_ALL=en_US.UTF-8 grep -E -w '[A-Z]\w+ Sherlock [A-Z]\w+'` | 485 | 6.234s (5.92x) |
| [ugrep](https://github.com/Genivia/ugrep) | `ugrep -w '[A-Z]\w+ Sherlock [A-Z]\w+'` | 485 | 28.973s (27.51x) |
And performance can drop precipitously across the board when searching big
files for patterns without any opportunities for literal optimizations:
| Tool | Command | Line count | Time |
| ---- | ------- | ---------- | ---- |
| ripgrep | `rg '[A-Za-z]{30}'` | 6749 | **15.569s** (1.00x) |
| [ugrep](https://github.com/Genivia/ugrep) | `ugrep -E '[A-Za-z]{30}'` | 6749 | 21.857s (1.40x) |
| [GNU grep](https://www.gnu.org/software/grep/) | `LC_ALL=C grep -E '[A-Za-z]{30}'` | 6749 | 32.409s (2.08x) |
| [GNU grep (Unicode)](https://www.gnu.org/software/grep/) | `LC_ALL=en_US.UTF-8 grep -E '[A-Za-z]{30}'` | 6795 | 8m30s (32.74x) |
Finally, high match counts also tend to both tank performance and smooth
out the differences between tools (because performance is dominated by how
quickly one can handle a match and not the algorithm used to detect the match,
generally speaking):
| Tool | Command | Line count | Time |
| ---- | ------- | ---------- | ---- |
| ripgrep | `rg the` | 83499915 | **6.948s** (1.00x) |
| [ugrep](https://github.com/Genivia/ugrep) | `ugrep the` | 83499915 | 11.721s (1.69x) |
| [GNU grep](https://www.gnu.org/software/grep/) | `LC_ALL=C grep the` | 83499915 | 15.217s (2.19x) |
### Why should I use ripgrep?
* It can replace many use cases served by both The Silver Searcher and GNU grep
because it is generally faster than both. (See [the FAQ](FAQ.md#posix4ever)
for more details on whether ripgrep can truly replace grep.)
* Like The Silver Searcher, ripgrep defaults to recursive directory search
and won't search files ignored by your `.gitignore` files. It also ignores
hidden and binary files by default. ripgrep also implements full support
for `.gitignore`, whereas there are many bugs related to that functionality
in The Silver Searcher.
* ripgrep can search specific types of files. For example, `rg -tpy foo`
limits your search to Python files and `rg -Tjs foo` excludes Javascript
files from your search. ripgrep can be taught about new file types with
custom matching rules.
* It can replace many use cases served by other search tools
because it contains most of their features and is generally faster. (See
[the FAQ](FAQ.md#posix4ever) for more details on whether ripgrep can truly
replace grep.)
* Like other tools specialized to code search, ripgrep defaults to
[recursive search](GUIDE.md#recursive-search) and does [automatic
filtering](GUIDE.md#automatic-filtering). Namely, ripgrep won't search files
ignored by your `.gitignore`/`.ignore`/`.rgignore` files, it won't search
hidden files and it won't search binary files. Automatic filtering can be
disabled with `rg -uuu`.
* ripgrep can [search specific types of files](GUIDE.md#manual-filtering-file-types).
For example, `rg -tpy foo` limits your search to Python files and `rg -Tjs
foo` excludes JavaScript files from your search. ripgrep can be taught about
new file types with custom matching rules.
* ripgrep supports many features found in `grep`, such as showing the context
of search results, searching multiple patterns, highlighting matches with
color and full Unicode support. Unlike GNU grep, ripgrep stays fast while
supporting Unicode (which is always on).
* ripgrep supports searching files in text encodings other than UTF-8, such
as UTF-16, latin-1, GBK, EUC-JP, Shift_JIS and more. (Some support for
automatically detecting UTF-16 is provided. Other text encodings must be
explicitly specified with the `-E/--encoding` flag.)
* ripgrep supports searching files compressed in a common format (gzip, xz,
lzma, bzip2 or lz4) with the `-z/--search-zip` flag.
* ripgrep supports arbitrary input preprocessing filters which could be PDF
text extraction, less supported decompression, decrypting, automatic encoding
detection and so on.
* ripgrep has optional support for switching its regex engine to use PCRE2.
Among other things, this makes it possible to use look-around and
backreferences in your patterns, which are not supported in ripgrep's default
regex engine. PCRE2 support can be enabled with `-P/--pcre2` (use PCRE2
always) or `--auto-hybrid-regex` (use PCRE2 only if needed). An alternative
syntax is provided via the `--engine (default|pcre2|auto-hybrid)` option.
* ripgrep has [rudimentary support for replacements](GUIDE.md#replacements),
which permit rewriting output based on what was matched.
* ripgrep supports [searching files in text encodings](GUIDE.md#file-encoding)
other than UTF-8, such as UTF-16, latin-1, GBK, EUC-JP, Shift_JIS and more.
(Some support for automatically detecting UTF-16 is provided. Other text
encodings must be explicitly specified with the `-E/--encoding` flag.)
* ripgrep supports searching files compressed in a common format (brotli,
bzip2, gzip, lz4, lzma, xz, or zstandard) with the `-z/--search-zip` flag.
* ripgrep supports
[arbitrary input preprocessing filters](GUIDE.md#preprocessor)
which could be PDF text extraction, less supported decompression, decrypting,
automatic encoding detection and so on.
* ripgrep can be configured via a
[configuration file](GUIDE.md#configuration-file).
In other words, use ripgrep if you like speed, filtering by default, fewer
bugs, and Unicode support.
bugs and Unicode support.
### Why shouldn't I use ripgrep?
I'd like to try to convince you why you *shouldn't* use ripgrep. This should
give you a glimpse at some important downsides or missing features of
ripgrep.
Despite initially not wanting to add every feature under the sun to ripgrep,
over time, ripgrep has grown support for most features found in other file
searching tools. This includes searching for results spanning across multiple
lines, and opt-in support for PCRE2, which provides look-around and
backreference support.
* ripgrep uses a regex engine based on finite automata, so if you want fancy
regex features such as backreferences or lookaround, ripgrep won't provide
them to you. ripgrep does support lots of things though, including, but not
limited to: lazy quantification (e.g., `a+?`), repetitions (e.g., `a{2,5}`),
begin/end assertions (e.g., `^\w+$`), word boundaries (e.g., `\bfoo\b`), and
support for Unicode categories (e.g., `\p{Sc}` to match currency symbols or
`\p{Lu}` to match any uppercase letter). (Fancier regexes will never be
supported.)
* ripgrep doesn't have multiline search. (Will happen as an opt-in feature.)
At this point, the primary reasons not to use ripgrep probably consist of one
or more of the following:
In other words, if you like fancy regexes or multiline search, then ripgrep
may not quite meet your needs (yet).
* You need a portable and ubiquitous tool. While ripgrep works on Windows,
macOS and Linux, it is not ubiquitous and it does not conform to any
standard such as POSIX. The best tool for this job is good old grep.
* There still exists some other feature (or bug) not listed in this README that
you rely on that's in another tool that isn't in ripgrep.
* There is a performance edge case where ripgrep doesn't do well where another
tool does do well. (Please file a bug report!)
* It isn't possible to install ripgrep on your machine, or ripgrep isn't
available for your platform. (Please file a bug report!)
### Is it really faster than everything else?
Generally, yes. A large number of benchmarks with detailed analysis for each is
[available on my blog](http://blog.burntsushi.net/ripgrep/).
[available on my blog](https://blog.burntsushi.net/ripgrep/).
Summarizing, ripgrep is fast because:
* It is built on top of
[Rust's regex engine](https://github.com/rust-lang-nursery/regex).
[Rust's regex engine](https://github.com/rust-lang/regex).
Rust's regex engine uses finite automata, SIMD and aggressive literal
optimizations to make searching very fast.
optimizations to make searching very fast. (PCRE2 support can be opted into
with the `-P/--pcre2` flag.)
* Rust's regex library maintains performance with full Unicode support by
building UTF-8 decoding directly into its deterministic finite automaton
engine.
@@ -154,7 +200,7 @@ Summarizing, ripgrep is fast because:
latter is better for large directories. ripgrep chooses the best searching
strategy for you automatically.
* Applies your ignore patterns in `.gitignore` files using a
[`RegexSet`](https://docs.rs/regex/1.0.0/regex/struct.RegexSet.html).
[`RegexSet`](https://docs.rs/regex/1/regex/struct.RegexSet.html).
That means a single file path can be matched against multiple glob patterns
simultaneously.
* It uses a lock-free parallel recursive directory iterator, courtesy of
@@ -168,38 +214,28 @@ Andy Lester, author of [ack](https://beyondgrep.com/), has published an
excellent table comparing the features of ack, ag, git-grep, GNU grep and
ripgrep: https://beyondgrep.com/feature-comparison/
Note that ripgrep has grown a few significant new features recently that
are not yet present in Andy's table. This includes, but is not limited to,
configuration files, passthru, support for searching compressed files,
multiline search and opt-in fancy regex support via PCRE2.
### Installation
The binary name for ripgrep is `rg`.
**[Archives of precompiled binaries for ripgrep are available for Windows,
macOS and Linux.](https://github.com/BurntSushi/ripgrep/releases)** Users of
platforms not explicitly mentioned below are advised to download one of these
archives.
macOS and Linux.](https://github.com/BurntSushi/ripgrep/releases)** Linux and
Windows binaries are static executables. Users of platforms not explicitly
mentioned below are advised to download one of these archives.
Linux binaries are static executables. Windows binaries are available either as
built with MinGW (GNU) or with Microsoft Visual C++ (MSVC). When possible,
prefer MSVC over GNU, but you'll need to have the [Microsoft VC++ 2015
redistributable](https://www.microsoft.com/en-us/download/details.aspx?id=48145)
installed.
If you're a **macOS Homebrew** or a **Linuxbrew** user,
then you can install ripgrep either
from homebrew-core, (compiled with rust stable, no SIMD):
If you're a **macOS Homebrew** or a **Linuxbrew** user, then you can install
ripgrep from homebrew-core:
```
$ brew install ripgrep
```
or you can install a binary compiled with rust nightly (including SIMD and all
optimizations) by utilizing a custom tap:
```
$ brew tap burntsushi/ripgrep https://github.com/BurntSushi/ripgrep.git
$ brew install ripgrep-bin
```
If you're a **MacPorts** user, then you can install ripgrep from the
[official ports](https://www.macports.org/ports.php?by=name&substr=ripgrep):
@@ -207,52 +243,60 @@ If you're a **MacPorts** user, then you can install ripgrep from the
$ sudo port install ripgrep
```
If you're a **Windows Chocolatey** user, then you can install ripgrep from the [official repo](https://chocolatey.org/packages/ripgrep):
If you're a **Windows Chocolatey** user, then you can install ripgrep from the
[official repo](https://chocolatey.org/packages/ripgrep):
```
$ choco install ripgrep
```
If you're a **Windows Scoop** user, then you can install ripgrep from the [official bucket](https://github.com/lukesampson/scoop/blob/master/bucket/ripgrep.json):
If you're a **Windows Scoop** user, then you can install ripgrep from the
[official bucket](https://github.com/ScoopInstaller/Main/blob/master/bucket/ripgrep.json):
```
$ scoop install ripgrep
```
If you're a **Windows Winget** user, then you can install ripgrep from the
[winget-pkgs](https://github.com/microsoft/winget-pkgs/tree/master/manifests/b/BurntSushi/ripgrep)
repository:
```
$ winget install BurntSushi.ripgrep.MSVC
```
If you're an **Arch Linux** user, then you can install ripgrep from the official repos:
```
$ pacman -S ripgrep
$ sudo pacman -S ripgrep
```
If you're a **Gentoo** user, you can install ripgrep from the [official repo](https://packages.gentoo.org/packages/sys-apps/ripgrep):
If you're a **Gentoo** user, you can install ripgrep from the
[official repo](https://packages.gentoo.org/packages/sys-apps/ripgrep):
```
$ emerge sys-apps/ripgrep
$ sudo emerge sys-apps/ripgrep
```
If you're a **Fedora 27+** user, you can install ripgrep from official repositories.
If you're a **Fedora** user, you can install ripgrep from official
repositories.
```
$ sudo dnf install ripgrep
```
If you're a **Fedora 24+** user, you can install ripgrep from [copr](https://copr.fedorainfracloud.org/coprs/carlwgeorge/ripgrep/):
```
$ sudo dnf copr enable carlwgeorge/ripgrep
$ sudo dnf install ripgrep
```
If you're an **openSUSE Tumbleweed** user, you can install ripgrep from the [official repo](http://software.opensuse.org/package/ripgrep):
If you're an **openSUSE** user, ripgrep is included in **openSUSE Tumbleweed**
and **openSUSE Leap** since 15.1.
```
$ sudo zypper install ripgrep
```
If you're a **RHEL/CentOS 7** user, you can install ripgrep from [copr](https://copr.fedorainfracloud.org/coprs/carlwgeorge/ripgrep/):
If you're a **RHEL/CentOS 7/8** user, you can install ripgrep from
[copr](https://copr.fedorainfracloud.org/coprs/carlwgeorge/ripgrep/):
```
$ sudo yum install -y yum-utils
$ sudo yum-config-manager --add-repo=https://copr.fedorainfracloud.org/coprs/carlwgeorge/ripgrep/repo/epel-7/carlwgeorge-ripgrep-epel-7.repo
$ sudo yum install ripgrep
```
@@ -262,17 +306,38 @@ If you're a **Nix** user, you can install ripgrep from
```
$ nix-env --install ripgrep
$ # (Or using the attribute name, which is also ripgrep.)
```
If you're a **Guix** user, you can install ripgrep from the official
package collection:
```
$ guix install ripgrep
```
If you're a **Debian** user (or a user of a Debian derivative like **Ubuntu**),
then ripgrep can be installed using a binary `.deb` file provided in each
[ripgrep release](https://github.com/BurntSushi/ripgrep/releases). Note that
ripgrep is not in the official Debian or Ubuntu repositories.
[ripgrep release](https://github.com/BurntSushi/ripgrep/releases).
```
$ curl -LO https://github.com/BurntSushi/ripgrep/releases/download/0.8.1/ripgrep_0.8.1_amd64.deb
$ sudo dpkg -i ripgrep_0.8.1_amd64.deb
$ curl -LO https://github.com/BurntSushi/ripgrep/releases/download/13.0.0/ripgrep_13.0.0_amd64.deb
$ sudo dpkg -i ripgrep_13.0.0_amd64.deb
```
If you run Debian stable, ripgrep is [officially maintained by
Debian](https://tracker.debian.org/pkg/rust-ripgrep), although its version may
be older than the `deb` package available in the previous step.
```
$ sudo apt-get install ripgrep
```
If you're an **Ubuntu Cosmic (18.10)** (or newer) user, ripgrep is
[available](https://launchpad.net/ubuntu/+source/rust-ripgrep) using the same
packaging as Debian:
```
$ sudo apt-get install ripgrep
```
(N.B. Various snaps for ripgrep on Ubuntu are also available, but none of them
@@ -280,26 +345,58 @@ seem to work right and generate a number of very strange bug reports that I
don't know how to fix and don't have the time to fix. Therefore, it is no
longer a recommended installation option.)
If you're a **FreeBSD** user, then you can install ripgrep from the [official ports](https://www.freshports.org/textproc/ripgrep/):
If you're an **ALT** user, you can install ripgrep from the
[official repo](https://packages.altlinux.org/en/search?name=ripgrep):
```
# pkg install ripgrep
$ sudo apt-get install ripgrep
```
If you're an **OpenBSD** user, then you can install ripgrep from the [official ports](http://openports.se/textproc/ripgrep):
If you're a **FreeBSD** user, then you can install ripgrep from the
[official ports](https://www.freshports.org/textproc/ripgrep/):
```
$ sudo pkg install ripgrep
```
If you're an **OpenBSD** user, then you can install ripgrep from the
[official ports](https://openports.se/textproc/ripgrep):
```
$ doas pkg_add ripgrep
```
If you're a **NetBSD** user, then you can install ripgrep from [pkgsrc](http://pkgsrc.se/textproc/ripgrep):
If you're a **NetBSD** user, then you can install ripgrep from
[pkgsrc](https://pkgsrc.se/textproc/ripgrep):
```
# pkgin install ripgrep
$ sudo pkgin install ripgrep
```
If you're a **Haiku x86_64** user, then you can install ripgrep from the
[official ports](https://github.com/haikuports/haikuports/tree/master/sys-apps/ripgrep):
```
$ sudo pkgman install ripgrep
```
If you're a **Haiku x86_gcc2** user, then you can install ripgrep from the
same port as Haiku x86_64 using the x86 secondary architecture build:
```
$ sudo pkgman install ripgrep_x86
```
If you're a **Void Linux** user, then you can install ripgrep from the
[official repository](https://voidlinux.org/packages/?arch=x86_64&q=ripgrep):
```
$ sudo xbps-install -Syv ripgrep
```
If you're a **Rust programmer**, ripgrep can be installed with `cargo`.
* Note that the minimum supported version of Rust for ripgrep is **1.23.0**,
* Note that the minimum supported version of Rust for ripgrep is **1.72.0**,
although ripgrep may work with older versions.
* Note that the binary may be bigger than expected because it contains debug
symbols. This is intentional. To remove debug symbols and therefore reduce
@@ -309,18 +406,23 @@ If you're a **Rust programmer**, ripgrep can be installed with `cargo`.
$ cargo install ripgrep
```
When compiling with Rust 1.27 or newer, this will automatically enable SIMD
optimizations for search.
Alternatively, one can use [`cargo
binstall`](https://github.com/cargo-bins/cargo-binstall) to install a ripgrep
binary directly from GitHub:
ripgrep isn't currently in any other package repositories.
[I'd like to change that](https://github.com/BurntSushi/ripgrep/issues/10).
```
$ cargo binstall ripgrep
```
### Building
ripgrep is written in Rust, so you'll need to grab a
[Rust installation](https://www.rust-lang.org/) in order to compile it.
ripgrep compiles with Rust 1.23.0 (stable) or newer. Building is easy:
ripgrep compiles with Rust 1.72.0 (stable) or newer. In general, ripgrep tracks
the latest stable release of the Rust compiler.
To build ripgrep:
```
$ git clone https://github.com/BurntSushi/ripgrep
@@ -334,18 +436,47 @@ If you have a Rust nightly compiler and a recent Intel CPU, then you can enable
additional optional SIMD acceleration like so:
```
RUSTFLAGS="-C target-cpu=native" cargo build --release --features 'simd-accel avx-accel'
RUSTFLAGS="-C target-cpu=native" cargo build --release --features 'simd-accel'
```
If your machine doesn't support AVX instructions, then simply remove
`avx-accel` from the features list. Similarly for SIMD (which corresponds
roughly to SSE instructions).
The `simd-accel` feature enables SIMD support in certain ripgrep dependencies
(responsible for transcoding). They are not necessary to get SIMD optimizations
for search; those are enabled automatically. Hopefully, some day, the
`simd-accel` feature will similarly become unnecessary. **WARNING:** Currently,
enabling this option can increase compilation times dramatically.
The `simd-accel` and `avx-accel` features enable SIMD support in certain
ripgrep dependencies (responsible for counting lines and transcoding). They
are not necessary to get SIMD optimizations for search; those are enabled
automatically. Hopefully, some day, the `simd-accel` and `avx-accel` features
will similarly become unnecessary.
Finally, optional PCRE2 support can be built with ripgrep by enabling the
`pcre2` feature:
```
$ cargo build --release --features 'pcre2'
```
(Tip: use `--features 'pcre2 simd-accel'` to also include compile time SIMD
optimizations, which will only work with a nightly compiler.)
Enabling the PCRE2 feature works with a stable Rust compiler and will
attempt to automatically find and link with your system's PCRE2 library via
`pkg-config`. If one doesn't exist, then ripgrep will build PCRE2 from source
using your system's C compiler and then statically link it into the final
executable. Static linking can be forced even when there is an available PCRE2
system library by either building ripgrep with the MUSL target or by setting
`PCRE2_SYS_STATIC=1`.
ripgrep can be built with the MUSL target on Linux by first installing the MUSL
library on your system (consult your friendly neighborhood package manager).
Then you just need to add MUSL support to your Rust toolchain and rebuild
ripgrep, which yields a fully static executable:
```
$ rustup target add x86_64-unknown-linux-musl
$ cargo build --release --target x86_64-unknown-linux-musl
```
Applying the `--features` flag from above works as expected. If you want to
build a static executable with MUSL and with PCRE2, then you will need to have
`musl-gcc` installed, which might be in a separate package from the actual
MUSL library, depending on your Linux distribution.
### Running tests
@@ -358,3 +489,28 @@ $ cargo test --all
```
from the repository root.
### Related tools
* [delta](https://github.com/dandavison/delta) is a syntax highlighting
pager that supports the `rg --json` output format. So all you need to do to
make it work is `rg --json pattern | delta`. See [delta's manual section on
grep](https://dandavison.github.io/delta/grep.html) for more details.
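
For a rough idea of what a consumer of `rg --json` sees, here is a minimal Python sketch that reads the JSON Lines stream and prints each match. The sample input is hand-written for illustration and trimmed (real `end` and `summary` messages carry more fields), and `iter_matches` is a hypothetical helper name, not part of ripgrep or delta.

```python
import json

# Illustrative sample of `rg --json` output (one JSON object per line).
# Field names follow ripgrep's JSON message format; the values are made up
# and the messages are trimmed for brevity.
SAMPLE = "\n".join([
    json.dumps({"type": "begin", "data": {"path": {"text": "src/main.rs"}}}),
    json.dumps({"type": "match", "data": {
        "path": {"text": "src/main.rs"},
        "lines": {"text": "fn main() {\n"},
        "line_number": 1,
        "absolute_offset": 0,
        "submatches": [{"match": {"text": "main"}, "start": 3, "end": 7}],
    }}),
    json.dumps({"type": "end", "data": {"path": {"text": "src/main.rs"}}}),
])


def iter_matches(stream):
    """Yield (path, line_number, line) for each "match" message."""
    for raw in stream.splitlines():
        if not raw.strip():
            continue
        msg = json.loads(raw)
        if msg.get("type") != "match":
            continue
        data = msg["data"]
        # Paths and lines carry "text" when they are valid UTF-8; otherwise
        # ripgrep emits base64 under "bytes", which this sketch skips.
        path = data["path"].get("text")
        line = data["lines"].get("text")
        if path is None or line is None:
            continue
        yield path, data["line_number"], line.rstrip("\n")


for path, lineno, line in iter_matches(SAMPLE):
    print(f"{path}:{lineno}:{line}")
```

In practice the stream would come from something like `subprocess.run(['rg', '--json', pattern], capture_output=True, text=True).stdout` rather than a hard-coded sample.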
### Vulnerability reporting
For reporting a security vulnerability, please
[contact Andrew Gallant](https://blog.burntsushi.net/about/).
The contact page has my email address and PGP public key if you wish to send an
encrypted message.
### Translations
The following is a list of known translations of ripgrep's documentation. These
are unofficially maintained and may not be up to date.
* [Chinese](https://github.com/chinanf-boy/ripgrep-zh#%E6%9B%B4%E6%96%B0-)
* [Spanish](https://github.com/UltiRequiem/traducciones/tree/master/ripgrep)

RELEASE-CHECKLIST.md Normal file

@@ -0,0 +1,59 @@
# Release Checklist
* Ensure local `master` is up to date with respect to `origin/master`.
* Run `cargo update` and review dependency updates. Commit updated
`Cargo.lock`.
* Run `cargo outdated` and review semver incompatible updates. Unless there is
a strong motivation otherwise, review and update every dependency. Also
run `--aggressive`, but don't update to crates that are still in beta.
* Update date in `crates/core/flags/doc/template.rg.1`.
* Review changes for every crate in `crates` since the last ripgrep release.
If the set of changes is non-empty, issue a new release for that crate. Check
crates in the following order. After updating a crate, ensure minimal
versions are updated as appropriate in dependents. If an update is required,
run `cargo-up --no-push crates/{CRATE}/Cargo.toml`.
* crates/globset
* crates/ignore
* crates/cli
* crates/matcher
* crates/regex
* crates/pcre2
* crates/searcher
* crates/printer
* crates/grep (bump minimal versions as necessary)
* crates/core (do **not** bump version, but update dependencies as needed)
* Update the CHANGELOG as appropriate.
* Edit the `Cargo.toml` to set the new ripgrep version. Run
`cargo update -p ripgrep` so that the `Cargo.lock` is updated. Commit the
changes and create a new signed tag. Alternatively, use
`cargo-up --no-push --no-release Cargo.toml {VERSION}` to automate this.
* Run `cargo package` and ensure it succeeds.
* Push changes to GitHub, NOT including the tag. (But do not publish a new
version of ripgrep to crates.io yet.)
* Once CI for `master` finishes successfully, push the version tag. (Trying to
do this in one step seems to result in GitHub Actions not seeing the tag
push and thus not running the release workflow.)
* Wait for CI to finish creating the release. If the release build fails, then
delete the tag from GitHub, make fixes, re-tag, delete the release and push.
* Copy the relevant section of the CHANGELOG to the tagged release notes.
Include this blurb describing what ripgrep is:
> In case you haven't heard of it before, ripgrep is a line-oriented search
> tool that recursively searches the current directory for a regex pattern.
> By default, ripgrep will respect gitignore rules and automatically skip
> hidden files/directories and binary files.
* Run `git checkout {VERSION} && ci/build-and-publish-m2 {VERSION}` on a macOS
system with Apple silicon.
* Run `cargo publish`.
* Run `ci/sha256-releases {VERSION} >> pkg/brew/ripgrep-bin.rb`. Then edit
`pkg/brew/ripgrep-bin.rb` to update the version number and sha256 hashes.
Remove extraneous stuff added by `ci/sha256-releases`. Commit changes.
* Add TBD section to the top of the CHANGELOG:
```
TBD
===
Unreleased changes. Release notes have not yet been written.
```
Note that [`cargo-up` can be found in BurntSushi's dotfiles][dotfiles].
[dotfiles]: https://github.com/BurntSushi/dotfiles/blob/master/bin/cargo-up


@@ -1,84 +0,0 @@
# Inspired from https://github.com/habitat-sh/habitat/blob/master/appveyor.yml
cache:
- c:\cargo\registry
- c:\cargo\git
- c:\projects\ripgrep\target
init:
- mkdir c:\cargo
- mkdir c:\rustup
- SET PATH=c:\cargo\bin;%PATH%
clone_folder: c:\projects\ripgrep
environment:
CARGO_HOME: "c:\\cargo"
RUSTUP_HOME: "c:\\rustup"
CARGO_TARGET_DIR: "c:\\projects\\ripgrep\\target"
global:
PROJECT_NAME: ripgrep
RUST_BACKTRACE: full
matrix:
- TARGET: i686-pc-windows-gnu
CHANNEL: stable
- TARGET: i686-pc-windows-msvc
CHANNEL: stable
- TARGET: x86_64-pc-windows-gnu
CHANNEL: stable
- TARGET: x86_64-pc-windows-msvc
CHANNEL: stable
matrix:
fast_finish: true
# Install Rust and Cargo
# (Based on https://github.com/rust-lang/libc/blob/master/appveyor.yml)
install:
- curl -sSf -o rustup-init.exe https://win.rustup.rs/
- rustup-init.exe -y --default-host %TARGET% --no-modify-path
- if defined MSYS2_BITS set PATH=%PATH%;C:\msys64\mingw%MSYS2_BITS%\bin
- rustc -V
- cargo -V
# ???
build: false
# Equivalent to Travis' `script` phase
# TODO modify this phase as you see fit
test_script:
- cargo test --verbose --all
before_deploy:
# Generate artifacts for release
- cargo build --release
- mkdir staging
- copy target\release\rg.exe staging
- ps: copy target\release\build\ripgrep-*\out\_rg.ps1 staging
- cd staging
# release zipfile will look like 'rust-everywhere-v1.2.3-x86_64-pc-windows-msvc'
- 7z a ../%PROJECT_NAME%-%APPVEYOR_REPO_TAG_NAME%-%TARGET%.zip *
- appveyor PushArtifact ../%PROJECT_NAME%-%APPVEYOR_REPO_TAG_NAME%-%TARGET%.zip
deploy:
description: 'Automatically deployed release'
# All the zipped artifacts will be deployed
artifact: /.*\.zip/
auth_token:
secure: vv4vBCEosGlyQjaEC1+kraP2P6O4CQSa+Tw50oHWFTGcmuXxaWS0/yEXbxsIRLpw
provider: GitHub
# deploy when a new tag is pushed and only on the stable channel
on:
# channel to use to produce the release artifacts
# NOTE make sure you only release *once* per target
# TODO you may want to pick a different channel
CHANNEL: stable
appveyor_repo_tag: true
branches:
only:
- /\d+\.\d+\.\d+/
- master
# - appveyor
# - /\d+\.\d+\.\d+/
# except:
# - master


@@ -23,16 +23,16 @@ import time
# strategies used to increase the relevance of results returned.
SUBTITLES_DIR = 'subtitles'
SUBTITLES_EN_NAME = 'OpenSubtitles2016.raw.en'
SUBTITLES_EN_NAME_SAMPLE = 'OpenSubtitles2016.raw.sample.en'
SUBTITLES_EN_NAME = 'en.txt'
SUBTITLES_EN_NAME_SAMPLE = 'en.sample.txt'
SUBTITLES_EN_NAME_GZ = '%s.gz' % SUBTITLES_EN_NAME
SUBTITLES_EN_URL = 'http://opus.lingfil.uu.se/OpenSubtitles2016/mono/OpenSubtitles2016.raw.en.gz' # noqa
SUBTITLES_RU_NAME = 'OpenSubtitles2016.raw.ru'
SUBTITLES_EN_URL = 'https://object.pouta.csc.fi/OPUS-OpenSubtitles/v2016/mono/en.txt.gz' # noqa
SUBTITLES_RU_NAME = 'ru.txt'
SUBTITLES_RU_NAME_GZ = '%s.gz' % SUBTITLES_RU_NAME
SUBTITLES_RU_URL = 'http://opus.lingfil.uu.se/OpenSubtitles2016/mono/OpenSubtitles2016.raw.ru.gz' # noqa
SUBTITLES_RU_URL = 'https://object.pouta.csc.fi/OPUS-OpenSubtitles/v2016/mono/ru.txt.gz' # noqa
LINUX_DIR = 'linux'
LINUX_CLONE = 'git://github.com/BurntSushi/linux'
LINUX_CLONE = 'https://github.com/BurntSushi/linux'
# Grep takes locale settings from the environment. There is a *substantial*
# performance impact for enabling Unicode, so we need to handle this explicitly
@@ -55,8 +55,10 @@ def bench_linux_literal_default(suite_dir):
Benchmark the speed of a literal using *default* settings.
This is a purposefully unfair benchmark for use in performance
analysis, but it is pedagogically useful to demonstrate how
default behaviors differ.
analysis, but it is pedagogically useful to demonstrate how default
behaviors differ. For example, ugrep and grep don't do any smart
filtering by default, so they will invariably search more files
than ripgrep, ag or git grep.
'''
require(suite_dir, 'linux')
cwd = path.join(suite_dir, LINUX_DIR)
@@ -69,16 +71,11 @@ def bench_linux_literal_default(suite_dir):
return Benchmark(pattern=pat, commands=[
mkcmd('rg', ['rg', pat]),
mkcmd('ag', ['ag', pat]),
# ucg reports the exact same matches as ag and rg even though it
# doesn't read gitignore files. Instead, it has a file whitelist
# that happens to match up exactly with the gitignores for this search.
mkcmd('ucg', ['ucg', pat]),
# I guess setting LC_ALL=en_US.UTF-8 probably isn't necessarily the
# default, but I'd guess it to be on most desktop systems.
mkcmd('pt', ['pt', pat]),
# sift reports an extra line here for a binary file matched.
mkcmd('sift', ['sift', pat]),
mkcmd('git grep', ['git', 'grep', pat], env={'LC_ALL': 'en_US.UTF-8'}),
mkcmd('git grep', ['git', 'grep', pat], env=GREP_UNICODE),
mkcmd('ugrep', ['ugrep', '-r', pat, './']),
mkcmd('grep', ['grep', '-r', pat, './'], env=GREP_UNICODE),
])
@@ -100,16 +97,16 @@ def bench_linux_literal(suite_dir):
return Command(*args, **kwargs)
return Benchmark(pattern=pat, commands=[
mkcmd('rg (ignore)', ['rg', '-n', pat]),
mkcmd('rg (ignore) (mmap)', ['rg', '-n', '--mmap', pat]),
mkcmd('ag (ignore) (mmap)', ['ag', '-s', pat]),
mkcmd('pt (ignore)', ['pt', pat]),
mkcmd('sift (ignore)', SIFT + ['-n', '--git', pat]),
mkcmd('git grep (ignore)', [
mkcmd('rg', ['rg', '-n', pat]),
mkcmd('rg (mmap)', ['rg', '-n', '--mmap', pat]),
mkcmd('ag (mmap)', ['ag', '-s', pat]),
mkcmd('git grep', [
'git', 'grep', '-I', '-n', pat,
], env={'LC_ALL': 'C'}),
mkcmd('rg (whitelist)', ['rg', '-n', '--no-ignore', '-tall', pat]),
mkcmd('ucg (whitelist)', ['ucg', '--nosmart-case', pat]),
mkcmd('ugrep', [
'ugrep', '-r', '--ignore-files', '--no-hidden', '-I',
'-n', pat, './',
])
])
@@ -129,31 +126,26 @@ def bench_linux_literal_casei(suite_dir):
return Command(*args, **kwargs)
return Benchmark(pattern=pat, commands=[
mkcmd('rg (ignore)', ['rg', '-n', '-i', pat]),
mkcmd('rg (ignore) (mmap)', ['rg', '-n', '-i', '--mmap', pat]),
mkcmd('ag (ignore) (mmap)', ['ag', '-i', pat]),
mkcmd('pt (ignore)', ['pt', '-i', pat]),
mkcmd('sift (ignore)', SIFT + ['-n', '-i', '--git', pat]),
mkcmd('rg', ['rg', '-n', '-i', pat]),
mkcmd('rg (mmap)', ['rg', '-n', '-i', '--mmap', pat]),
mkcmd('ag (mmap)', ['ag', '-i', pat]),
# It'd technically be more appropriate to set LC_ALL=en_US.UTF-8 here,
# since that is certainly what ripgrep is doing, but this is for an
# ASCII literal, so we should give `git grep` all the opportunity to
# do its best.
mkcmd('git grep (ignore)', [
mkcmd('git grep', [
'git', 'grep', '-I', '-n', '-i', pat,
], env={'LC_ALL': 'C'}),
mkcmd('rg (whitelist)', [
'rg', '-n', '-i', '--no-ignore', '-tall', pat,
]),
mkcmd('ucg (whitelist)', ['ucg', '-i', pat]),
mkcmd('ugrep', [
'ugrep', '-r', '--ignore-files', '--no-hidden', '-I',
'-n', '-i', pat, './',
])
])
def bench_linux_re_literal_suffix(suite_dir):
'''
Benchmark the speed of a literal inside a regex.
This, for example, inhibits a prefix byte optimization used
inside of Go's regex engine (relevant for sift and pt).
'''
require(suite_dir, 'linux')
cwd = path.join(suite_dir, LINUX_DIR)
@@ -164,26 +156,23 @@ def bench_linux_re_literal_suffix(suite_dir):
return Command(*args, **kwargs)
return Benchmark(pattern=pat, commands=[
mkcmd('rg (ignore)', ['rg', '-n', pat]),
mkcmd('ag (ignore)', ['ag', '-s', pat]),
mkcmd('pt (ignore)', ['pt', '-e', pat]),
mkcmd('sift (ignore)', SIFT + ['-n', '--git', pat]),
mkcmd('rg', ['rg', '-n', pat]),
mkcmd('ag', ['ag', '-s', pat]),
mkcmd(
'git grep (ignore)',
'git grep',
['git', 'grep', '-E', '-I', '-n', pat],
env={'LC_ALL': 'C'},
),
mkcmd('rg (whitelist)', ['rg', '-n', '--no-ignore', '-tall', pat]),
mkcmd('ucg (whitelist)', ['ucg', '--nosmart-case', pat]),
mkcmd('ugrep', [
'ugrep', '-r', '--ignore-files', '--no-hidden', '-I',
'-n', pat, './',
])
])
def bench_linux_word(suite_dir):
'''
Benchmark use of the -w ("match word") flag in each tool.
sift has a lot of trouble with this because it forces it into Go's
regex engine by surrounding the pattern with \b assertions.
'''
require(suite_dir, 'linux')
cwd = path.join(suite_dir, LINUX_DIR)
@@ -194,28 +183,23 @@ def bench_linux_word(suite_dir):
return Command(*args, **kwargs)
return Benchmark(pattern=pat, commands=[
mkcmd('rg (ignore)', ['rg', '-n', '-w', pat]),
mkcmd('ag (ignore)', ['ag', '-s', '-w', pat]),
mkcmd('pt (ignore)', ['pt', '-w', pat]),
mkcmd('sift (ignore)', SIFT + ['-n', '-w', '--git', pat]),
mkcmd('rg', ['rg', '-n', '-w', pat]),
mkcmd('ag', ['ag', '-s', '-w', pat]),
mkcmd(
'git grep (ignore)',
'git grep',
['git', 'grep', '-E', '-I', '-n', '-w', pat],
env={'LC_ALL': 'C'},
),
mkcmd('rg (whitelist)', [
'rg', '-n', '-w', '--no-ignore', '-tall', pat,
]),
mkcmd('ucg (whitelist)', ['ucg', '--nosmart-case', '-w', pat]),
mkcmd('ugrep', [
'ugrep', '-r', '--ignore-files', '--no-hidden', '-I',
'-n', '-w', pat, './',
])
])
def bench_linux_unicode_greek(suite_dir):
'''
Benchmark matching of a Unicode category.
Only three tools (ripgrep, sift and pt) support this. We omit
pt because it is too slow.
'''
require(suite_dir, 'linux')
cwd = path.join(suite_dir, LINUX_DIR)
@@ -227,8 +211,10 @@ def bench_linux_unicode_greek(suite_dir):
return Benchmark(pattern=pat, commands=[
mkcmd('rg', ['rg', '-n', pat]),
mkcmd('pt', ['pt', '-e', pat]),
mkcmd('sift', SIFT + ['-n', '--git', pat]),
mkcmd('ugrep', [
'ugrep', '-r', '--ignore-files', '--no-hidden', '-I',
'-n', pat, './',
])
])
@@ -248,18 +234,20 @@ def bench_linux_unicode_greek_casei(suite_dir):
return Benchmark(pattern=pat, commands=[
mkcmd('rg', ['rg', '-n', '-i', pat]),
mkcmd('pt', ['pt', '-i', '-e', pat]),
mkcmd('sift', SIFT + ['-n', '-i', '--git', pat]),
mkcmd('ugrep', [
'ugrep', '-r', '--ignore-files', '--no-hidden', '-I',
'-n', '-i', pat, './',
])
])
def bench_linux_unicode_word(suite_dir):
'''
Benchmark Unicode aware \w character class.
Benchmark Unicode aware \\w character class.
Only ripgrep and git-grep (with LC_ALL=en_US.UTF-8) actually get
this right. Everything else uses the standard ASCII interpretation
of \w.
of \\w.
'''
require(suite_dir, 'linux')
cwd = path.join(suite_dir, LINUX_DIR)
@@ -270,26 +258,27 @@ def bench_linux_unicode_word(suite_dir):
return Command(*args, **kwargs)
return Benchmark(pattern=pat, commands=[
mkcmd('rg (ignore)', ['rg', '-n', pat]),
mkcmd('rg (ignore) (ASCII)', ['rg', '-n', '(?-u)' + pat]),
mkcmd('ag (ignore) (ASCII)', ['ag', '-s', pat]),
mkcmd('pt (ignore) (ASCII)', ['pt', '-e', pat]),
mkcmd('sift (ignore) (ASCII)', SIFT + ['-n', '--git', pat]),
mkcmd('rg', ['rg', '-n', pat]),
mkcmd('rg (ASCII)', ['rg', '-n', '(?-u)' + pat]),
mkcmd('ag (ASCII)', ['ag', '-s', pat]),
mkcmd(
'git grep (ignore)',
'git grep',
['git', 'grep', '-E', '-I', '-n', pat],
env={'LC_ALL': 'en_US.UTF-8'},
),
mkcmd(
'git grep (ignore) (ASCII)',
'git grep (ASCII)',
['git', 'grep', '-E', '-I', '-n', pat],
env={'LC_ALL': 'C'},
),
mkcmd('rg (whitelist)', ['rg', '-n', '--no-ignore', '-tall', pat]),
mkcmd('rg (whitelist) (ASCII)', [
'rg', '-n', '--no-ignore', '-tall', '(?-u)' + pat,
mkcmd('ugrep', [
'ugrep', '-r', '--ignore-files', '--no-hidden', '-I',
'-n', pat, './',
]),
mkcmd('ugrep (ASCII)', [
'ugrep', '-r', '--ignore-files', '--no-hidden', '-I',
'-n', '-U', pat, './',
]),
mkcmd('ucg (ASCII)', ['ucg', '--nosmart-case', pat]),
])
@@ -311,26 +300,27 @@ def bench_linux_no_literal(suite_dir):
return Command(*args, **kwargs)
return Benchmark(pattern=pat, commands=[
mkcmd('rg (ignore)', ['rg', '-n', pat]),
mkcmd('rg (ignore) (ASCII)', ['rg', '-n', '(?-u)' + pat]),
mkcmd('ag (ignore) (ASCII)', ['ag', '-s', pat]),
mkcmd('pt (ignore) (ASCII)', ['pt', '-e', pat]),
mkcmd('sift (ignore) (ASCII)', SIFT + ['-n', '--git', pat]),
mkcmd('rg', ['rg', '-n', pat]),
mkcmd('rg (ASCII)', ['rg', '-n', '(?-u)' + pat]),
mkcmd('ag (ASCII)', ['ag', '-s', pat]),
mkcmd(
'git grep (ignore)',
'git grep',
['git', 'grep', '-E', '-I', '-n', pat],
env={'LC_ALL': 'en_US.UTF-8'},
),
mkcmd(
'git grep (ignore) (ASCII)',
'git grep (ASCII)',
['git', 'grep', '-E', '-I', '-n', pat],
env={'LC_ALL': 'C'},
),
mkcmd('rg (whitelist)', ['rg', '-n', '--no-ignore', '-tall', pat]),
mkcmd('rg (whitelist) (ASCII)', [
'rg', '-n', '--no-ignore', '-tall', '(?-u)' + pat,
mkcmd('ugrep', [
'ugrep', '-r', '--ignore-files', '--no-hidden', '-I',
'-n', pat, './',
]),
mkcmd('ugrep (ASCII)', [
'ugrep', '-r', '--ignore-files', '--no-hidden', '-I',
'-n', '-U', pat, './',
]),
mkcmd('ucg (whitelist) (ASCII)', ['ucg', '--nosmart-case', pat]),
])
@@ -352,15 +342,17 @@ def bench_linux_alternates(suite_dir):
return Command(*args, **kwargs)
return Benchmark(pattern=pat, commands=[
mkcmd('rg (ignore)', ['rg', '-n', pat]),
mkcmd('ag (ignore)', ['ag', '-s', pat]),
mkcmd('rg', ['rg', '-n', pat]),
mkcmd('ag', ['ag', '-s', pat]),
mkcmd(
'git grep (ignore)',
'git grep',
['git', 'grep', '-E', '-I', '-n', pat],
env={'LC_ALL': 'C'},
),
mkcmd('rg (whitelist)', ['rg', '--no-ignore', '-n', pat]),
mkcmd('ucg (whitelist)', ['ucg', '--nosmart-case', pat]),
mkcmd('ugrep', [
'ugrep', '-r', '--ignore-files', '--no-hidden', '-I',
'-n', pat, './',
])
])
@@ -375,15 +367,17 @@ def bench_linux_alternates_casei(suite_dir):
return Command(*args, **kwargs)
return Benchmark(pattern=pat, commands=[
mkcmd('rg (ignore)', ['rg', '-n', '-i', pat]),
mkcmd('ag (ignore)', ['ag', '-i', pat]),
mkcmd('rg', ['rg', '-n', '-i', pat]),
mkcmd('ag', ['ag', '-i', pat]),
mkcmd(
'git grep (ignore)',
'git grep',
['git', 'grep', '-E', '-I', '-n', '-i', pat],
env={'LC_ALL': 'C'},
),
mkcmd('rg (whitelist)', ['rg', '--no-ignore', '-n', '-i', pat]),
mkcmd('ucg (whitelist)', ['ucg', '-i', pat]),
mkcmd('ugrep', [
'ugrep', '-r', '--ignore-files', '--no-hidden', '-I',
'-n', '-i', pat, './',
])
])
@@ -398,15 +392,11 @@ def bench_subtitles_en_literal(suite_dir):
return Benchmark(pattern=pat, commands=[
Command('rg', ['rg', pat, en]),
Command('rg (no mmap)', ['rg', '--no-mmap', pat, en]),
Command('pt', ['pt', '-N', pat, en]),
Command('sift', ['sift', pat, en]),
Command('grep', ['grep', '-a', pat, en], env=GREP_ASCII),
Command('grep', ['grep', pat, en], env=GREP_ASCII),
Command('rg (lines)', ['rg', '-n', pat, en]),
Command('ag (lines)', ['ag', '-s', pat, en]),
Command('ucg (lines)', ['ucg', '--nosmart-case', pat, en]),
Command('pt (lines)', ['pt', pat, en]),
Command('sift (lines)', ['sift', '-n', pat, en]),
Command('grep (lines)', ['grep', '-an', pat, en], env=GREP_ASCII),
Command('grep (lines)', ['grep', '-n', pat, en], env=GREP_ASCII),
Command('ugrep (lines)', ['ugrep', '-n', pat, en])
])
@@ -420,13 +410,11 @@ def bench_subtitles_en_literal_casei(suite_dir):
return Benchmark(pattern=pat, commands=[
Command('rg', ['rg', '-i', pat, en]),
Command('grep', ['grep', '-ai', pat, en], env=GREP_UNICODE),
Command('grep (ASCII)', [
'grep', '-E', '-ai', pat, en,
], env=GREP_ASCII),
Command('grep', ['grep', '-i', pat, en], env=GREP_UNICODE),
Command('grep (ASCII)', ['grep', '-E', '-i', pat, en], env=GREP_ASCII),
Command('rg (lines)', ['rg', '-n', '-i', pat, en]),
Command('ag (lines) (ASCII)', ['ag', '-i', pat, en]),
Command('ucg (lines) (ASCII)', ['ucg', '-i', pat, en]),
Command('ugrep (lines)', ['ugrep', '-n', '-i', pat, en])
])
@@ -443,12 +431,10 @@ def bench_subtitles_en_literal_word(suite_dir):
'rg', '-n', r'(?-u:\b)' + pat + r'(?-u:\b)', en,
]),
Command('ag (ASCII)', ['ag', '-sw', pat, en]),
Command('ucg (ASCII)', ['ucg', '--nosmart-case', pat, en]),
Command('grep (ASCII)', [
'grep', '-anw', pat, en,
], env=GREP_ASCII),
Command('grep (ASCII)', ['grep', '-nw', pat, en], env=GREP_ASCII),
Command('ugrep (ASCII)', ['ugrep', '-nw', pat, en]),
Command('rg', ['rg', '-nw', pat, en]),
Command('grep', ['grep', '-anw', pat, en], env=GREP_UNICODE),
Command('grep', ['grep', '-nw', pat, en], env=GREP_UNICODE),
])
@@ -469,14 +455,10 @@ def bench_subtitles_en_alternate(suite_dir):
return Benchmark(pattern=pat, commands=[
Command('rg (lines)', ['rg', '-n', pat, en]),
Command('ag (lines)', ['ag', '-s', pat, en]),
Command('ucg (lines)', ['ucg', '--nosmart-case', pat, en]),
Command('grep (lines)', [
'grep', '-E', '-an', pat, en,
], env=GREP_ASCII),
Command('grep (lines)', ['grep', '-E', '-n', pat, en], env=GREP_ASCII),
Command('ugrep (lines)', ['ugrep', '-n', pat, en]),
Command('rg', ['rg', pat, en]),
Command('grep', [
'grep', '-E', '-a', pat, en,
], env=GREP_ASCII),
Command('grep', ['grep', '-E', pat, en], env=GREP_ASCII),
])
@@ -496,12 +478,12 @@ def bench_subtitles_en_alternate_casei(suite_dir):
return Benchmark(pattern=pat, commands=[
Command('ag (ASCII)', ['ag', '-s', '-i', pat, en]),
Command('ucg (ASCII)', ['ucg', '-i', pat, en]),
Command('grep (ASCII)', [
'grep', '-E', '-ani', pat, en,
'grep', '-E', '-ni', pat, en,
], env=GREP_ASCII),
Command('ugrep (ASCII)', ['ugrep', '-n', '-i', pat, en]),
Command('rg', ['rg', '-n', '-i', pat, en]),
Command('grep', ['grep', '-E', '-ani', pat, en], env=GREP_UNICODE),
Command('grep', ['grep', '-E', '-ni', pat, en], env=GREP_UNICODE),
])
@@ -515,13 +497,12 @@ def bench_subtitles_en_surrounding_words(suite_dir):
return Benchmark(pattern=pat, commands=[
Command('rg', ['rg', '-n', pat, en]),
Command('grep', ['grep', '-E', '-an', pat, en], env=GREP_UNICODE),
Command('grep', ['grep', '-E', '-n', pat, en], env=GREP_UNICODE),
Command('ugrep', ['ugrep', '-n', pat, en]),
Command('rg (ASCII)', ['rg', '-n', '(?-u)' + pat, en]),
Command('ag (ASCII)', ['ag', '-s', pat, en]),
Command('ucg (ASCII)', ['ucg', '--nosmart-case', pat, en]),
Command('grep (ASCII)', [
'grep', '-E', '-an', pat, en,
], env=GREP_ASCII),
Command('grep (ASCII)', ['grep', '-E', '-n', pat, en], env=GREP_ASCII),
Command('ugrep (ASCII)', ['ugrep', '-n', '-U', pat, en])
])
@@ -540,12 +521,11 @@ def bench_subtitles_en_no_literal(suite_dir):
return Benchmark(pattern=pat, commands=[
Command('rg', ['rg', '-n', pat, en]),
Command('ugrep', ['ugrep', '-n', pat, en]),
Command('rg (ASCII)', ['rg', '-n', '(?-u)' + pat, en]),
Command('ag (ASCII)', ['ag', '-s', pat, en]),
Command('ucg (ASCII)', ['ucg', '--nosmart-case', pat, en]),
Command('grep (ASCII)', [
'grep', '-E', '-an', pat, en,
], env=GREP_ASCII),
Command('grep (ASCII)', ['grep', '-E', '-n', pat, en], env=GREP_ASCII),
Command('ugrep (ASCII)', ['ugrep', '-n', '-U', pat, en])
])
@@ -560,15 +540,15 @@ def bench_subtitles_ru_literal(suite_dir):
return Benchmark(pattern=pat, commands=[
Command('rg', ['rg', pat, ru]),
Command('rg (no mmap)', ['rg', '--no-mmap', pat, ru]),
Command('pt', ['pt', '-N', pat, ru]),
Command('sift', ['sift', pat, ru]),
Command('grep', ['grep', '-a', pat, ru], env=GREP_ASCII),
Command('grep', ['grep', pat, ru], env=GREP_ASCII),
Command('rg (lines)', ['rg', '-n', pat, ru]),
Command('ag (lines)', ['ag', '-s', pat, ru]),
Command('ucg (lines)', ['ucg', '--nosmart-case', pat, ru]),
Command('pt (lines)', ['pt', pat, ru]),
Command('sift (lines)', ['sift', '-n', pat, ru]),
Command('grep (lines)', ['grep', '-an', pat, ru], env=GREP_ASCII),
Command('grep (lines)', ['grep', '-n', pat, ru], env=GREP_ASCII),
# ugrep incorrectly identifies this corpus as binary, but it is
# entirely valid UTF-8. So we tell ugrep to always treat the corpus
# as text even though this technically gives it an edge over other
# tools. (It no longer needs to check for binary data.)
Command('ugrep (lines)', ['ugrep', '-a', '-n', pat, ru])
])
@@ -582,13 +562,12 @@ def bench_subtitles_ru_literal_casei(suite_dir):
return Benchmark(pattern=pat, commands=[
Command('rg', ['rg', '-i', pat, ru]),
Command('grep', ['grep', '-ai', pat, ru], env=GREP_UNICODE),
Command('grep (ASCII)', [
'grep', '-E', '-ai', pat, ru,
], env=GREP_ASCII),
Command('grep', ['grep', '-i', pat, ru], env=GREP_UNICODE),
Command('grep (ASCII)', ['grep', '-E', '-i', pat, ru], env=GREP_ASCII),
Command('rg (lines)', ['rg', '-n', '-i', pat, ru]),
Command('ag (lines) (ASCII)', ['ag', '-i', pat, ru]),
Command('ucg (lines) (ASCII)', ['ucg', '-i', pat, ru]),
# See bench_subtitles_ru_literal for why we use '-a' here.
Command('ugrep (lines) (ASCII)', ['ugrep', '-a', '-n', '-i', pat, ru])
])
@@ -602,15 +581,20 @@ def bench_subtitles_ru_literal_word(suite_dir):
return Benchmark(pattern=pat, commands=[
Command('rg (ASCII)', [
'rg', '-n', r'(?-u:\b)' + pat + r'(?-u:\b)', ru,
# You might think we'd use \b here for word boundaries, but both
# GNU grep and ripgrep implement -w with the formulation below.
# Since we can't use Unicode in a pattern and disable Unicode word
# boundaries, we just hand-jam this ourselves.
'rg', '-n', r'(?-u:^|\W)' + pat + r'(?-u:$|\W)', ru,
]),
Command('ag (ASCII)', ['ag', '-sw', pat, ru]),
Command('ucg (ASCII)', ['ucg', '--nosmart-case', pat, ru]),
Command('grep (ASCII)', [
'grep', '-anw', pat, ru,
'grep', '-nw', pat, ru,
], env=GREP_ASCII),
# See bench_subtitles_ru_literal for why we use '-a' here.
Command('ugrep (ASCII)', ['ugrep', '-anw', pat, ru]),
Command('rg', ['rg', '-nw', pat, ru]),
Command('grep', ['grep', '-anw', pat, ru], env=GREP_UNICODE),
Command('grep', ['grep', '-nw', pat, ru], env=GREP_UNICODE),
])
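
The comment in the hunk above contrasts `\b` with the hand-written `(^|\W)...($|\W)` form. A minimal sketch of where the two can disagree follows. It uses Python's `re` module with `re.ASCII` as a stand-in for `(?-u)`; that is an assumption, since Python's engine is not the one ripgrep or GNU grep uses. The helper names, sample pattern, and sample text are made up for illustration.

```python
import re


def matches_with_b(pat, text):
    # Word-boundary formulation: \b on both sides of the pattern.
    return re.search(r'\b' + pat + r'\b', text, re.ASCII) is not None


def matches_with_nonword(pat, text):
    # Hand-written formulation: line start or a non-word character before
    # the pattern, and line end or a non-word character after it.
    regex = r'(?:^|\W)' + pat + r'(?:$|\W)'
    return re.search(regex, text, re.ASCII) is not None


# For a pattern made only of word characters, both forms agree here.
plain = re.escape('foo')
print(matches_with_b(plain, 'a foo b'), matches_with_nonword(plain, 'a foo b'))

# They differ once the pattern starts with a non-word character: \b is
# satisfied between 'a' and '-', but 'a' is neither line start nor \W.
dashed = re.escape('-foo')
print(matches_with_b(dashed, 'a-foo bar'), matches_with_nonword(dashed, 'a-foo bar'))
```

Note that the second form consumes the neighbouring characters, which changes the match span but not which lines match.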
@@ -631,14 +615,11 @@ def bench_subtitles_ru_alternate(suite_dir):
return Benchmark(pattern=pat, commands=[
Command('rg (lines)', ['rg', '-n', pat, ru]),
Command('ag (lines)', ['ag', '-s', pat, ru]),
Command('ucg (lines)', ['ucg', '--nosmart-case', pat, ru]),
Command('grep (lines)', [
'grep', '-E', '-an', pat, ru,
], env=GREP_ASCII),
Command('grep (lines)', ['grep', '-E', '-n', pat, ru], env=GREP_ASCII),
# See bench_subtitles_ru_literal for why we use '-a' here.
Command('ugrep (lines)', ['ugrep', '-an', pat, ru]),
Command('rg', ['rg', pat, ru]),
Command('grep', [
'grep', '-E', '-a', pat, ru,
], env=GREP_ASCII),
Command('grep', ['grep', '-E', pat, ru], env=GREP_ASCII),
])
@@ -658,12 +639,13 @@ def bench_subtitles_ru_alternate_casei(suite_dir):
return Benchmark(pattern=pat, commands=[
Command('ag (ASCII)', ['ag', '-s', '-i', pat, ru]),
Command('ucg (ASCII)', ['ucg', '-i', pat, ru]),
Command('grep (ASCII)', [
'grep', '-E', '-ani', pat, ru,
'grep', '-E', '-ni', pat, ru,
], env=GREP_ASCII),
# See bench_subtitles_ru_literal for why we use '-a' here.
Command('ugrep (ASCII)', ['ugrep', '-ani', pat, ru]),
Command('rg', ['rg', '-n', '-i', pat, ru]),
Command('grep', ['grep', '-E', '-ani', pat, ru], env=GREP_UNICODE),
Command('grep', ['grep', '-E', '-ni', pat, ru], env=GREP_UNICODE),
])
@@ -677,12 +659,12 @@ def bench_subtitles_ru_surrounding_words(suite_dir):
return Benchmark(pattern=pat, commands=[
Command('rg', ['rg', '-n', pat, ru]),
Command('grep', ['grep', '-E', '-an', pat, ru], env=GREP_UNICODE),
Command('grep', ['grep', '-E', '-n', pat, ru], env=GREP_UNICODE),
Command('ugrep', ['ugrep', '-an', pat, ru]),
Command('ag (ASCII)', ['ag', '-s', pat, ru]),
Command('ucg (ASCII)', ['ucg', '--nosmart-case', pat, ru]),
Command('grep (ASCII)', [
'grep', '-E', '-an', pat, ru,
], env=GREP_ASCII),
Command('grep (ASCII)', ['grep', '-E', '-n', pat, ru], env=GREP_ASCII),
# See bench_subtitles_ru_literal for why we use '-a' here.
Command('ugrep (ASCII)', ['ugrep', '-a', '-n', '-U', pat, ru]),
])
@@ -701,12 +683,13 @@ def bench_subtitles_ru_no_literal(suite_dir):
return Benchmark(pattern=pat, commands=[
Command('rg', ['rg', '-n', pat, ru]),
# See bench_subtitles_ru_literal for why we use '-a' here.
Command('ugrep', ['ugrep', '-an', pat, ru]),
Command('rg (ASCII)', ['rg', '-n', '(?-u)' + pat, ru]),
Command('ag (ASCII)', ['ag', '-s', pat, ru]),
Command('ucg (ASCII)', ['ucg', '--nosmart-case', pat, ru]),
Command('grep (ASCII)', [
'grep', '-E', '-an', pat, ru,
], env=GREP_ASCII),
Command('grep (ASCII)', ['grep', '-E', '-n', pat, ru], env=GREP_ASCII),
# See bench_subtitles_ru_literal for why we use '-a' here.
Command('ugrep (ASCII)', ['ugrep', '-anU', pat, ru])
])
@@ -756,7 +739,7 @@ class Benchmark(object):
def __init__(self, name=None, pattern=None, commands=None,
warmup_count=1, count=3, line_count=True,
allow_missing_commands=False,
disabled_cmds=None):
disabled_cmds=None, order=0):
'''
Create a single benchmark.
@@ -792,6 +775,8 @@ class Benchmark(object):
will simply skip it.
:param list(str) disabled_cmds:
A list of commands to skip.
:param int order:
An integer indicating the sequence number of this benchmark.
'''
self.name = name
self.pattern = pattern
@@ -801,6 +786,7 @@ class Benchmark(object):
self.line_count = line_count
self.allow_missing_commands = allow_missing_commands
self.disabled_cmds = set(disabled_cmds or [])
self.order = order
def raise_if_missing(self):
'''
@@ -894,7 +880,7 @@ class Result(object):
'''
Create a new set of results, initially empty.
:param Benchmarl benchmark:
:param Benchmark benchmark:
The benchmark that produced these results.
'''
self.benchmark = benchmark
@@ -1088,7 +1074,7 @@ def download_subtitles_en(suite_dir):
# benchmarks finish in a reasonable time.
with open(path.join(subtitle_dir, en_path_sample), 'wb+') as f:
run_cmd(
['head', '-n', '32722372', en_path],
['head', '-n', '55000000', en_path],
cwd=subtitle_dir, stdout=f)
@@ -1163,19 +1149,22 @@ def collect_benchmarks(suite_dir, filter_pat=None,
requires corpora that are missing, then a log message is
emitted to stderr and it is not yielded.
'''
for fun in sorted(globals()):
if not fun.startswith('bench_'):
benchmarks = []
for global_name in globals():
if not global_name.startswith('bench_'):
continue
name = re.sub('^bench_', '', fun)
name = re.sub('^bench_', '', global_name)
if filter_pat is not None and not re.search(filter_pat, name):
continue
try:
benchmark = globals()[fun](suite_dir)
fun = globals()[global_name]
benchmark = fun(suite_dir)
benchmark.name = name
benchmark.warmup_count = warmup_iter
benchmark.count = bench_iter
benchmark.allow_missing_commands = allow_missing_commands
benchmark.disabled_cmds = disabled_cmds
benchmark.order = fun.__code__.co_firstlineno
benchmark.raise_if_missing()
except MissingDependencies as e:
eprint(
@@ -1190,7 +1179,8 @@ def collect_benchmarks(suite_dir, filter_pat=None,
'(run with --allow-missing to run incomplete benchmarks)'
eprint(fmt % (', '.join(e.missing_names), name))
continue
yield benchmark
benchmarks.append(benchmark)
return sorted(benchmarks, key=lambda b: b.order)
def main():
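
The `collect_benchmarks` change above stops sorting benchmark names alphabetically and instead orders benchmarks by where each `bench_` function is defined, using `fun.__code__.co_firstlineno`. A small sketch of that idea, with hypothetical `bench_zeta` and `bench_alpha` functions and a hypothetical `collect_in_definition_order` helper that takes an explicit namespace rather than reading `globals()`:

```python
def bench_zeta(suite_dir):
    # Hypothetical benchmark, defined first in the file.
    return 'zeta'


def bench_alpha(suite_dir):
    # Hypothetical benchmark, defined second in the file.
    return 'alpha'


def collect_in_definition_order(namespace):
    # Pair each bench_* function with the line number of its 'def', then
    # sort on that number so results follow source order, not name order.
    found = []
    for name, obj in namespace.items():
        if not name.startswith('bench_') or not callable(obj):
            continue
        found.append((obj.__code__.co_firstlineno, name))
    return [name for _, name in sorted(found)]


namespace = {'bench_alpha': bench_alpha, 'bench_zeta': bench_zeta}
print(collect_in_definition_order(namespace))
print(sorted(namespace))  # alphabetical order, for comparison
```

This relies on every benchmark function living in one source file. Line numbers from different files would not be comparable.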


@@ -0,0 +1,37 @@
This directory contains updated benchmarks as of 2020-10-14. They were captured
via the benchsuite script at `benchsuite/benchsuite` from the root of this
repository. The command that was run:

    $ ./benchsuite \
        --dir /tmp/benchsuite \
        --raw runs/2020-10-14-archlinux-frink/raw.csv \
        --warmup-iter 1 \
        --bench-iter 5

The versions of each tool are as follows:

    $ rg --version
    ripgrep 12.1.1 (rev def993bad1)
    -SIMD -AVX (compiled)
    +SIMD +AVX (runtime)

    $ grep -V
    grep (GNU grep) 3.4

    $ ag -V
    ag version 2.2.0
    Features:
      +jit +lzma +zlib

    $ git --version
    git version 2.28.0

    $ ugrep --version
    ugrep 3.0.2 x86_64-pc-linux-gnu +avx2 +pcre2_jit +zlib +bzip2 +lzma +lz4
    License BSD-3-Clause: <https://opensource.org/licenses/BSD-3-Clause>
    Written by Robert van Engelen and others: <https://github.com/Genivia/ugrep>

The version of ripgrep used was compiled from source on commit def993bad1:

    $ cargo build --release --features 'pcre2'
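
The `--raw` file named above holds one row per timed run. As a rough sketch of how such a file could be summarized, the snippet below averages `duration` per benchmark and command name. The column names and sample rows are copied from `raw.csv` in this change. The `summarize` helper is hypothetical and not part of the benchsuite script, and it assumes the file uses the quoting conventions that Python's `csv` module reads by default.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# A few rows copied from raw.csv; a real run would open the file instead.
SAMPLE = """\
benchmark,warmup_iter,iter,name,command,duration,lines,env
linux_literal_default,1,5,rg,rg PM_RESUME,0.12675833702087402,19,
linux_literal_default,1,5,rg,rg PM_RESUME,0.1196434497833252,19,
linux_literal_default,1,5,ugrep,ugrep -r PM_RESUME ./,0.13612771034240723,19,
linux_literal_default,1,5,ugrep,ugrep -r PM_RESUME ./,0.13677191734313965,19,
"""


def summarize(fileobj):
    # Group durations by (benchmark, command name) and average each group.
    durations = defaultdict(list)
    for row in csv.DictReader(fileobj):
        key = (row['benchmark'], row['name'])
        durations[key].append(float(row['duration']))
    return {key: mean(values) for key, values in durations.items()}


summary = summarize(io.StringIO(SAMPLE))
for (benchmark, name), average in sorted(summary.items()):
    print(f'{benchmark:<24} {name:<8} {average:.3f}s')
```

To run it against the full data, pass `open('raw.csv', newline='')` to `summarize` in place of the `io.StringIO` sample.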


@@ -0,0 +1,671 @@
benchmark,warmup_iter,iter,name,command,duration,lines,env
linux_literal_default,1,5,rg,rg PM_RESUME,0.12675833702087402,19,
linux_literal_default,1,5,rg,rg PM_RESUME,0.1196434497833252,19,
linux_literal_default,1,5,rg,rg PM_RESUME,0.12096214294433594,19,
linux_literal_default,1,5,rg,rg PM_RESUME,0.1257617473602295,19,
linux_literal_default,1,5,rg,rg PM_RESUME,0.12903356552124023,19,
linux_literal_default,1,5,ag,ag PM_RESUME,0.8575565814971924,19,
linux_literal_default,1,5,ag,ag PM_RESUME,0.9113664627075195,19,
linux_literal_default,1,5,ag,ag PM_RESUME,0.944256067276001,19,
linux_literal_default,1,5,ag,ag PM_RESUME,0.5309450626373291,19,
linux_literal_default,1,5,ag,ag PM_RESUME,0.6105470657348633,19,
linux_literal_default,1,5,git grep,git grep PM_RESUME,0.49039149284362793,19,LC_ALL=en_US.UTF-8
linux_literal_default,1,5,git grep,git grep PM_RESUME,0.48095154762268066,19,LC_ALL=en_US.UTF-8
linux_literal_default,1,5,git grep,git grep PM_RESUME,0.48927950859069824,19,LC_ALL=en_US.UTF-8
linux_literal_default,1,5,git grep,git grep PM_RESUME,0.47182321548461914,19,LC_ALL=en_US.UTF-8
linux_literal_default,1,5,git grep,git grep PM_RESUME,0.46923041343688965,19,LC_ALL=en_US.UTF-8
linux_literal_default,1,5,ugrep,ugrep -r PM_RESUME ./,0.13612771034240723,19,
linux_literal_default,1,5,ugrep,ugrep -r PM_RESUME ./,0.13677191734313965,19,
linux_literal_default,1,5,ugrep,ugrep -r PM_RESUME ./,0.13688087463378906,19,
linux_literal_default,1,5,ugrep,ugrep -r PM_RESUME ./,0.13218474388122559,19,
linux_literal_default,1,5,ugrep,ugrep -r PM_RESUME ./,0.13851046562194824,19,
linux_literal_default,1,5,grep,grep -r PM_RESUME ./,1.1436240673065186,19,LC_ALL=en_US.UTF-8
linux_literal_default,1,5,grep,grep -r PM_RESUME ./,1.1436970233917236,19,LC_ALL=en_US.UTF-8
linux_literal_default,1,5,grep,grep -r PM_RESUME ./,1.1542651653289795,19,LC_ALL=en_US.UTF-8
linux_literal_default,1,5,grep,grep -r PM_RESUME ./,1.14790940284729,19,LC_ALL=en_US.UTF-8
linux_literal_default,1,5,grep,grep -r PM_RESUME ./,1.1441664695739746,19,LC_ALL=en_US.UTF-8
linux_literal,1,5,rg,rg -n PM_RESUME,0.134232759475708,19,
linux_literal,1,5,rg,rg -n PM_RESUME,0.12477993965148926,19,
linux_literal,1,5,rg,rg -n PM_RESUME,0.11790871620178223,19,
linux_literal,1,5,rg,rg -n PM_RESUME,0.13471150398254395,19,
linux_literal,1,5,rg,rg -n PM_RESUME,0.13730239868164062,19,
linux_literal,1,5,rg (mmap),rg -n --mmap PM_RESUME,1.2953157424926758,19,
linux_literal,1,5,rg (mmap),rg -n --mmap PM_RESUME,1.3263885974884033,19,
linux_literal,1,5,rg (mmap),rg -n --mmap PM_RESUME,1.320932388305664,19,
linux_literal,1,5,rg (mmap),rg -n --mmap PM_RESUME,1.3446438312530518,19,
linux_literal,1,5,rg (mmap),rg -n --mmap PM_RESUME,1.3919141292572021,19,
linux_literal,1,5,ag (mmap),ag -s PM_RESUME,0.7901346683502197,19,
linux_literal,1,5,ag (mmap),ag -s PM_RESUME,0.9647164344787598,19,
linux_literal,1,5,ag (mmap),ag -s PM_RESUME,0.8800022602081299,19,
linux_literal,1,5,ag (mmap),ag -s PM_RESUME,0.9307558536529541,19,
linux_literal,1,5,ag (mmap),ag -s PM_RESUME,0.8346366882324219,19,
linux_literal,1,5,git grep,git grep -I -n PM_RESUME,0.4694955348968506,19,LC_ALL=C
linux_literal,1,5,git grep,git grep -I -n PM_RESUME,0.4620368480682373,19,LC_ALL=C
linux_literal,1,5,git grep,git grep -I -n PM_RESUME,0.4673285484313965,19,LC_ALL=C
linux_literal,1,5,git grep,git grep -I -n PM_RESUME,0.4570960998535156,19,LC_ALL=C
linux_literal,1,5,git grep,git grep -I -n PM_RESUME,0.4648761749267578,19,LC_ALL=C
linux_literal,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n PM_RESUME ./,0.3233473300933838,19,
linux_literal,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n PM_RESUME ./,0.3199331760406494,19,
linux_literal,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n PM_RESUME ./,0.29825615882873535,19,
linux_literal,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n PM_RESUME ./,0.3003232479095459,19,
linux_literal,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n PM_RESUME ./,0.30283141136169434,19,
linux_literal_casei,1,5,rg,rg -n -i PM_RESUME,0.1349015235900879,456,
linux_literal_casei,1,5,rg,rg -n -i PM_RESUME,0.1277780532836914,456,
linux_literal_casei,1,5,rg,rg -n -i PM_RESUME,0.1251516342163086,456,
linux_literal_casei,1,5,rg,rg -n -i PM_RESUME,0.12959671020507812,456,
linux_literal_casei,1,5,rg,rg -n -i PM_RESUME,0.1374528408050537,456,
linux_literal_casei,1,5,rg (mmap),rg -n -i --mmap PM_RESUME,1.3468265533447266,456,
linux_literal_casei,1,5,rg (mmap),rg -n -i --mmap PM_RESUME,1.3552894592285156,456,
linux_literal_casei,1,5,rg (mmap),rg -n -i --mmap PM_RESUME,1.3028552532196045,456,
linux_literal_casei,1,5,rg (mmap),rg -n -i --mmap PM_RESUME,1.336735725402832,456,
linux_literal_casei,1,5,rg (mmap),rg -n -i --mmap PM_RESUME,1.338634729385376,456,
linux_literal_casei,1,5,ag (mmap),ag -i PM_RESUME,0.5562450885772705,456,
linux_literal_casei,1,5,ag (mmap),ag -i PM_RESUME,0.7324790954589844,456,
linux_literal_casei,1,5,ag (mmap),ag -i PM_RESUME,0.8382794857025146,456,
linux_literal_casei,1,5,ag (mmap),ag -i PM_RESUME,0.5817627906799316,456,
linux_literal_casei,1,5,ag (mmap),ag -i PM_RESUME,0.5771033763885498,456,
linux_literal_casei,1,5,git grep,git grep -I -n -i PM_RESUME,0.48885059356689453,456,LC_ALL=C
linux_literal_casei,1,5,git grep,git grep -I -n -i PM_RESUME,0.4838893413543701,456,LC_ALL=C
linux_literal_casei,1,5,git grep,git grep -I -n -i PM_RESUME,0.48733997344970703,456,LC_ALL=C
linux_literal_casei,1,5,git grep,git grep -I -n -i PM_RESUME,0.4765594005584717,456,LC_ALL=C
linux_literal_casei,1,5,git grep,git grep -I -n -i PM_RESUME,0.47402334213256836,456,LC_ALL=C
linux_literal_casei,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n -i PM_RESUME ./,0.3075406551361084,456,
linux_literal_casei,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n -i PM_RESUME ./,0.2922379970550537,456,
linux_literal_casei,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n -i PM_RESUME ./,0.2901036739349365,456,
linux_literal_casei,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n -i PM_RESUME ./,0.2723674774169922,456,
linux_literal_casei,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n -i PM_RESUME ./,0.2762429714202881,456,
linux_re_literal_suffix,1,5,rg,rg -n [A-Z]+_RESUME,0.12853646278381348,1944,
linux_re_literal_suffix,1,5,rg,rg -n [A-Z]+_RESUME,0.1190040111541748,1944,
linux_re_literal_suffix,1,5,rg,rg -n [A-Z]+_RESUME,0.14054393768310547,1944,
linux_re_literal_suffix,1,5,rg,rg -n [A-Z]+_RESUME,0.12263894081115723,1944,
linux_re_literal_suffix,1,5,rg,rg -n [A-Z]+_RESUME,0.12101268768310547,1944,
linux_re_literal_suffix,1,5,ag,ag -s [A-Z]+_RESUME,0.9220716953277588,1944,
linux_re_literal_suffix,1,5,ag,ag -s [A-Z]+_RESUME,1.009810209274292,1944,
linux_re_literal_suffix,1,5,ag,ag -s [A-Z]+_RESUME,0.9654982089996338,1944,
linux_re_literal_suffix,1,5,ag,ag -s [A-Z]+_RESUME,1.2758586406707764,1944,
linux_re_literal_suffix,1,5,ag,ag -s [A-Z]+_RESUME,1.0480666160583496,1944,
linux_re_literal_suffix,1,5,git grep,git grep -E -I -n [A-Z]+_RESUME,1.1811027526855469,1944,LC_ALL=C
linux_re_literal_suffix,1,5,git grep,git grep -E -I -n [A-Z]+_RESUME,1.1824719905853271,1944,LC_ALL=C
linux_re_literal_suffix,1,5,git grep,git grep -E -I -n [A-Z]+_RESUME,1.2052066326141357,1944,LC_ALL=C
linux_re_literal_suffix,1,5,git grep,git grep -E -I -n [A-Z]+_RESUME,1.224193811416626,1944,LC_ALL=C
linux_re_literal_suffix,1,5,git grep,git grep -E -I -n [A-Z]+_RESUME,1.2896029949188232,1944,LC_ALL=C
linux_re_literal_suffix,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n [A-Z]+_RESUME ./,0.5580098628997803,1944,
linux_re_literal_suffix,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n [A-Z]+_RESUME ./,0.5409820079803467,1944,
linux_re_literal_suffix,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n [A-Z]+_RESUME ./,0.5436761379241943,1944,
linux_re_literal_suffix,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n [A-Z]+_RESUME ./,0.5317332744598389,1944,
linux_re_literal_suffix,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n [A-Z]+_RESUME ./,0.5662341117858887,1944,
linux_word,1,5,rg,rg -n -w PM_RESUME,0.13112211227416992,6,
linux_word,1,5,rg,rg -n -w PM_RESUME,0.13633346557617188,6,
linux_word,1,5,rg,rg -n -w PM_RESUME,0.1308743953704834,6,
linux_word,1,5,rg,rg -n -w PM_RESUME,0.13691973686218262,6,
linux_word,1,5,rg,rg -n -w PM_RESUME,0.1369326114654541,6,
linux_word,1,5,ag,ag -s -w PM_RESUME,0.5965347290039062,6,
linux_word,1,5,ag,ag -s -w PM_RESUME,0.8891518115997314,6,
linux_word,1,5,ag,ag -s -w PM_RESUME,0.5207972526550293,6,
linux_word,1,5,ag,ag -s -w PM_RESUME,0.5551142692565918,6,
linux_word,1,5,ag,ag -s -w PM_RESUME,0.5308854579925537,6,
linux_word,1,5,git grep,git grep -E -I -n -w PM_RESUME,0.45984363555908203,6,LC_ALL=C
linux_word,1,5,git grep,git grep -E -I -n -w PM_RESUME,0.47351694107055664,6,LC_ALL=C
linux_word,1,5,git grep,git grep -E -I -n -w PM_RESUME,0.5011758804321289,6,LC_ALL=C
linux_word,1,5,git grep,git grep -E -I -n -w PM_RESUME,0.45740509033203125,6,LC_ALL=C
linux_word,1,5,git grep,git grep -E -I -n -w PM_RESUME,0.46122002601623535,6,LC_ALL=C
linux_word,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n -w PM_RESUME ./,0.3174629211425781,6,
linux_word,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n -w PM_RESUME ./,0.32368993759155273,6,
linux_word,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n -w PM_RESUME ./,0.3131399154663086,6,
linux_word,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n -w PM_RESUME ./,0.2834908962249756,6,
linux_word,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n -w PM_RESUME ./,0.2899782657623291,6,
linux_unicode_greek,1,5,rg,rg -n \p{Greek},0.2624638080596924,105,
linux_unicode_greek,1,5,rg,rg -n \p{Greek},0.26248669624328613,105,
linux_unicode_greek,1,5,rg,rg -n \p{Greek},0.26514244079589844,105,
linux_unicode_greek,1,5,rg,rg -n \p{Greek},0.26303768157958984,105,
linux_unicode_greek,1,5,rg,rg -n \p{Greek},0.2612752914428711,105,
linux_unicode_greek,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n \p{Greek} ./,0.2842683792114258,105,
linux_unicode_greek,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n \p{Greek} ./,0.2718374729156494,105,
linux_unicode_greek,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n \p{Greek} ./,0.26900339126586914,105,
linux_unicode_greek,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n \p{Greek} ./,0.267728328704834,105,
linux_unicode_greek,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n \p{Greek} ./,0.27019381523132324,105,
linux_unicode_greek_casei,1,5,rg,rg -n -i \p{Greek},0.24460315704345703,225,
linux_unicode_greek_casei,1,5,rg,rg -n -i \p{Greek},0.2752077579498291,225,
linux_unicode_greek_casei,1,5,rg,rg -n -i \p{Greek},0.25118350982666016,225,
linux_unicode_greek_casei,1,5,rg,rg -n -i \p{Greek},0.2610158920288086,225,
linux_unicode_greek_casei,1,5,rg,rg -n -i \p{Greek},0.24675774574279785,225,
linux_unicode_greek_casei,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n -i \p{Greek} ./,0.26882410049438477,105,
linux_unicode_greek_casei,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n -i \p{Greek} ./,0.2770118713378906,105,
linux_unicode_greek_casei,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n -i \p{Greek} ./,0.2694118022918701,105,
linux_unicode_greek_casei,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n -i \p{Greek} ./,0.2690916061401367,105,
linux_unicode_greek_casei,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n -i \p{Greek} ./,0.2686276435852051,105,
linux_unicode_word,1,5,rg,rg -n \wAh,0.13727664947509766,229,
linux_unicode_word,1,5,rg,rg -n \wAh,0.1450798511505127,229,
linux_unicode_word,1,5,rg,rg -n \wAh,0.13819336891174316,229,
linux_unicode_word,1,5,rg,rg -n \wAh,0.1422877311706543,229,
linux_unicode_word,1,5,rg,rg -n \wAh,0.13657712936401367,229,
linux_unicode_word,1,5,rg (ASCII),rg -n (?-u)\wAh,0.1487271785736084,216,
linux_unicode_word,1,5,rg (ASCII),rg -n (?-u)\wAh,0.1459641456604004,216,
linux_unicode_word,1,5,rg (ASCII),rg -n (?-u)\wAh,0.13515281677246094,216,
linux_unicode_word,1,5,rg (ASCII),rg -n (?-u)\wAh,0.12724566459655762,216,
linux_unicode_word,1,5,rg (ASCII),rg -n (?-u)\wAh,0.13360023498535156,216,
linux_unicode_word,1,5,ag (ASCII),ag -s \wAh,1.2160453796386719,216,
linux_unicode_word,1,5,ag (ASCII),ag -s \wAh,1.230163335800171,216,
linux_unicode_word,1,5,ag (ASCII),ag -s \wAh,1.2649273872375488,216,
linux_unicode_word,1,5,ag (ASCII),ag -s \wAh,1.224984884262085,216,
linux_unicode_word,1,5,ag (ASCII),ag -s \wAh,1.4559555053710938,216,
linux_unicode_word,1,5,git grep,git grep -E -I -n \wAh,8.233768224716187,229,LC_ALL=en_US.UTF-8
linux_unicode_word,1,5,git grep,git grep -E -I -n \wAh,8.191053867340088,229,LC_ALL=en_US.UTF-8
linux_unicode_word,1,5,git grep,git grep -E -I -n \wAh,8.175920724868774,229,LC_ALL=en_US.UTF-8
linux_unicode_word,1,5,git grep,git grep -E -I -n \wAh,8.167959451675415,229,LC_ALL=en_US.UTF-8
linux_unicode_word,1,5,git grep,git grep -E -I -n \wAh,8.1710205078125,229,LC_ALL=en_US.UTF-8
linux_unicode_word,1,5,git grep (ASCII),git grep -E -I -n \wAh,2.3747494220733643,216,LC_ALL=C
linux_unicode_word,1,5,git grep (ASCII),git grep -E -I -n \wAh,2.3170926570892334,216,LC_ALL=C
linux_unicode_word,1,5,git grep (ASCII),git grep -E -I -n \wAh,2.3430888652801514,216,LC_ALL=C
linux_unicode_word,1,5,git grep (ASCII),git grep -E -I -n \wAh,2.3219168186187744,216,LC_ALL=C
linux_unicode_word,1,5,git grep (ASCII),git grep -E -I -n \wAh,2.3155832290649414,216,LC_ALL=C
linux_unicode_word,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n \wAh ./,0.2722008228302002,229,
linux_unicode_word,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n \wAh ./,0.27547430992126465,229,
linux_unicode_word,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n \wAh ./,0.2771613597869873,229,
linux_unicode_word,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n \wAh ./,0.27692317962646484,229,
linux_unicode_word,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n \wAh ./,0.27749085426330566,229,
linux_unicode_word,1,5,ugrep (ASCII),ugrep -r --ignore-files --no-hidden -I -n -U \wAh ./,0.2744929790496826,216,
linux_unicode_word,1,5,ugrep (ASCII),ugrep -r --ignore-files --no-hidden -I -n -U \wAh ./,0.2725999355316162,216,
linux_unicode_word,1,5,ugrep (ASCII),ugrep -r --ignore-files --no-hidden -I -n -U \wAh ./,0.27443718910217285,216,
linux_unicode_word,1,5,ugrep (ASCII),ugrep -r --ignore-files --no-hidden -I -n -U \wAh ./,0.2668039798736572,216,
linux_unicode_word,1,5,ugrep (ASCII),ugrep -r --ignore-files --no-hidden -I -n -U \wAh ./,0.27918338775634766,216,
linux_no_literal,1,5,rg,rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},0.38802123069763184,611,
linux_no_literal,1,5,rg,rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},0.40351152420043945,611,
linux_no_literal,1,5,rg,rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},0.40592288970947266,611,
linux_no_literal,1,5,rg,rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},0.40622901916503906,611,
linux_no_literal,1,5,rg,rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},0.40683722496032715,611,
linux_no_literal,1,5,rg (ASCII),rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},0.2553420066833496,610,
linux_no_literal,1,5,rg (ASCII),rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},0.2511327266693115,610,
linux_no_literal,1,5,rg (ASCII),rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},0.2530384063720703,610,
linux_no_literal,1,5,rg (ASCII),rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},0.2420644760131836,610,
linux_no_literal,1,5,rg (ASCII),rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},0.2691671848297119,610,
linux_no_literal,1,5,ag (ASCII),ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},0.9446702003479004,971,
linux_no_literal,1,5,ag (ASCII),ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},0.9380638599395752,971,
linux_no_literal,1,5,ag (ASCII),ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},0.9273786544799805,971,
linux_no_literal,1,5,ag (ASCII),ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},0.9271430969238281,971,
linux_no_literal,1,5,ag (ASCII),ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},0.9307007789611816,971,
linux_no_literal,1,5,git grep,git grep -E -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},14.531656265258789,611,LC_ALL=en_US.UTF-8
linux_no_literal,1,5,git grep,git grep -E -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},14.570266008377075,611,LC_ALL=en_US.UTF-8
linux_no_literal,1,5,git grep,git grep -E -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},14.51328158378601,611,LC_ALL=en_US.UTF-8
linux_no_literal,1,5,git grep,git grep -E -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},14.644389629364014,611,LC_ALL=en_US.UTF-8
linux_no_literal,1,5,git grep,git grep -E -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},14.694648027420044,611,LC_ALL=en_US.UTF-8
linux_no_literal,1,5,git grep (ASCII),git grep -E -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},3.164829730987549,610,LC_ALL=C
linux_no_literal,1,5,git grep (ASCII),git grep -E -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},3.2377045154571533,610,LC_ALL=C
linux_no_literal,1,5,git grep (ASCII),git grep -E -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},3.1798932552337646,610,LC_ALL=C
linux_no_literal,1,5,git grep (ASCII),git grep -E -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},3.142343044281006,610,LC_ALL=C
linux_no_literal,1,5,git grep (ASCII),git grep -E -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},3.185952663421631,610,LC_ALL=C
linux_no_literal,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} ./,6.241358041763306,973,
linux_no_literal,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} ./,6.213250637054443,973,
linux_no_literal,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} ./,6.242088079452515,973,
linux_no_literal,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} ./,6.126717567443848,973,
linux_no_literal,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} ./,6.15744948387146,973,
linux_no_literal,1,5,ugrep (ASCII),ugrep -r --ignore-files --no-hidden -I -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} ./,0.3647449016571045,972,
linux_no_literal,1,5,ugrep (ASCII),ugrep -r --ignore-files --no-hidden -I -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} ./,0.36277341842651367,972,
linux_no_literal,1,5,ugrep (ASCII),ugrep -r --ignore-files --no-hidden -I -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} ./,0.3670034408569336,972,
linux_no_literal,1,5,ugrep (ASCII),ugrep -r --ignore-files --no-hidden -I -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} ./,0.3563535213470459,972,
linux_no_literal,1,5,ugrep (ASCII),ugrep -r --ignore-files --no-hidden -I -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} ./,0.36490702629089355,972,
linux_alternates,1,5,rg,rg -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.14299488067626953,112,
linux_alternates,1,5,rg,rg -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.15548348426818848,112,
linux_alternates,1,5,rg,rg -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.14477276802062988,112,
linux_alternates,1,5,rg,rg -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.12926578521728516,112,
linux_alternates,1,5,rg,rg -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.13896560668945312,112,
linux_alternates,1,5,ag,ag -s ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.9893472194671631,112,
linux_alternates,1,5,ag,ag -s ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,1.016686201095581,112,
linux_alternates,1,5,ag,ag -s ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.9755496978759766,112,
linux_alternates,1,5,ag,ag -s ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.9718713760375977,112,
linux_alternates,1,5,ag,ag -s ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,1.0030465126037598,112,
linux_alternates,1,5,git grep,git grep -E -I -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.5737886428833008,112,LC_ALL=C
linux_alternates,1,5,git grep,git grep -E -I -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.562185525894165,112,LC_ALL=C
linux_alternates,1,5,git grep,git grep -E -I -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.5762710571289062,112,LC_ALL=C
linux_alternates,1,5,git grep,git grep -E -I -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.5561251640319824,112,LC_ALL=C
linux_alternates,1,5,git grep,git grep -E -I -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.5849525928497314,112,LC_ALL=C
linux_alternates,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT ./,0.3186032772064209,112,
linux_alternates,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT ./,0.2896738052368164,112,
linux_alternates,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT ./,0.28582000732421875,112,
linux_alternates,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT ./,0.2837677001953125,112,
linux_alternates,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT ./,0.27143406867980957,112,
linux_alternates_casei,1,5,rg,rg -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.21955585479736328,203,
linux_alternates_casei,1,5,rg,rg -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.22631502151489258,203,
linux_alternates_casei,1,5,rg,rg -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.23458337783813477,203,
linux_alternates_casei,1,5,rg,rg -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.21781086921691895,203,
linux_alternates_casei,1,5,rg,rg -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.231217622756958,203,
linux_alternates_casei,1,5,ag,ag -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.7170076370239258,203,
linux_alternates_casei,1,5,ag,ag -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.7032256126403809,203,
linux_alternates_casei,1,5,ag,ag -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.6868026256561279,203,
linux_alternates_casei,1,5,ag,ag -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.6965539455413818,203,
linux_alternates_casei,1,5,ag,ag -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.6966633796691895,203,
linux_alternates_casei,1,5,git grep,git grep -E -I -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.9774580001831055,203,LC_ALL=C
linux_alternates_casei,1,5,git grep,git grep -E -I -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.9654648303985596,203,LC_ALL=C
linux_alternates_casei,1,5,git grep,git grep -E -I -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.967714786529541,203,LC_ALL=C
linux_alternates_casei,1,5,git grep,git grep -E -I -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.9789888858795166,203,LC_ALL=C
linux_alternates_casei,1,5,git grep,git grep -E -I -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.9938976764678955,203,LC_ALL=C
linux_alternates_casei,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT ./,0.2825000286102295,203,
linux_alternates_casei,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT ./,0.27024054527282715,203,
linux_alternates_casei,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT ./,0.27353668212890625,203,
linux_alternates_casei,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT ./,0.27333736419677734,203,
linux_alternates_casei,1,5,ugrep,ugrep -r --ignore-files --no-hidden -I -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT ./,0.2730555534362793,203,
subtitles_en_literal,1,5,rg,rg Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.2259538173675537,830,
subtitles_en_literal,1,5,rg,rg Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.22034168243408203,830,
subtitles_en_literal,1,5,rg,rg Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.22986674308776855,830,
subtitles_en_literal,1,5,rg,rg Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.22815775871276855,830,
subtitles_en_literal,1,5,rg,rg Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.2238922119140625,830,
subtitles_en_literal,1,5,rg (no mmap),rg --no-mmap Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.36427783966064453,830,
subtitles_en_literal,1,5,rg (no mmap),rg --no-mmap Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.37499117851257324,830,
subtitles_en_literal,1,5,rg (no mmap),rg --no-mmap Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.36223769187927246,830,
subtitles_en_literal,1,5,rg (no mmap),rg --no-mmap Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.3646128177642822,830,
subtitles_en_literal,1,5,rg (no mmap),rg --no-mmap Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.36281347274780273,830,
subtitles_en_literal,1,5,grep,grep Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.8064453601837158,830,LC_ALL=C
subtitles_en_literal,1,5,grep,grep Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.8001935482025146,830,LC_ALL=C
subtitles_en_literal,1,5,grep,grep Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.8018591403961182,830,LC_ALL=C
subtitles_en_literal,1,5,grep,grep Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.7978458404541016,830,LC_ALL=C
subtitles_en_literal,1,5,grep,grep Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.7912843227386475,830,LC_ALL=C
subtitles_en_literal,1,5,rg (lines),rg -n Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.31099891662597656,830,
subtitles_en_literal,1,5,rg (lines),rg -n Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.3145768642425537,830,
subtitles_en_literal,1,5,rg (lines),rg -n Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.30507469177246094,830,
subtitles_en_literal,1,5,rg (lines),rg -n Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.3450126647949219,830,
subtitles_en_literal,1,5,rg (lines),rg -n Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.31091880798339844,830,
subtitles_en_literal,1,5,ag (lines),ag -s Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,2.5518174171447754,830,
subtitles_en_literal,1,5,ag (lines),ag -s Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,2.551568031311035,830,
subtitles_en_literal,1,5,ag (lines),ag -s Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,2.5306365489959717,830,
subtitles_en_literal,1,5,ag (lines),ag -s Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,2.537529468536377,830,
subtitles_en_literal,1,5,ag (lines),ag -s Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,2.5627124309539795,830,
subtitles_en_literal,1,5,grep (lines),grep -n Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,1.2934913635253906,830,LC_ALL=C
subtitles_en_literal,1,5,grep (lines),grep -n Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,1.2990975379943848,830,LC_ALL=C
subtitles_en_literal,1,5,grep (lines),grep -n Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,1.2942156791687012,830,LC_ALL=C
subtitles_en_literal,1,5,grep (lines),grep -n Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,1.2887969017028809,830,LC_ALL=C
subtitles_en_literal,1,5,grep (lines),grep -n Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,1.2922444343566895,830,LC_ALL=C
subtitles_en_literal,1,5,ugrep (lines),ugrep -n Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.3939177989959717,830,
subtitles_en_literal,1,5,ugrep (lines),ugrep -n Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.3916018009185791,830,
subtitles_en_literal,1,5,ugrep (lines),ugrep -n Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.40460968017578125,830,
subtitles_en_literal,1,5,ugrep (lines),ugrep -n Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.41738367080688477,830,
subtitles_en_literal,1,5,ugrep (lines),ugrep -n Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.41339826583862305,830,
subtitles_en_literal_casei,1,5,rg,rg -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.37847900390625,871,
subtitles_en_literal_casei,1,5,rg,rg -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.3692331314086914,871,
subtitles_en_literal_casei,1,5,rg,rg -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.40493106842041016,871,
subtitles_en_literal_casei,1,5,rg,rg -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.4074361324310303,871,
subtitles_en_literal_casei,1,5,rg,rg -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.4297189712524414,871,
subtitles_en_literal_casei,1,5,grep,grep -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,3.63842511177063,871,LC_ALL=en_US.UTF-8
subtitles_en_literal_casei,1,5,grep,grep -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,3.6366350650787354,871,LC_ALL=en_US.UTF-8
subtitles_en_literal_casei,1,5,grep,grep -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,3.6044440269470215,871,LC_ALL=en_US.UTF-8
subtitles_en_literal_casei,1,5,grep,grep -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,3.6123127937316895,871,LC_ALL=en_US.UTF-8
subtitles_en_literal_casei,1,5,grep,grep -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,3.6119742393493652,871,LC_ALL=en_US.UTF-8
subtitles_en_literal_casei,1,5,grep (ASCII),grep -E -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.917151689529419,871,LC_ALL=C
subtitles_en_literal_casei,1,5,grep (ASCII),grep -E -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.9379458427429199,871,LC_ALL=C
subtitles_en_literal_casei,1,5,grep (ASCII),grep -E -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.9703550338745117,871,LC_ALL=C
subtitles_en_literal_casei,1,5,grep (ASCII),grep -E -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.9309988021850586,871,LC_ALL=C
subtitles_en_literal_casei,1,5,grep (ASCII),grep -E -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.9328129291534424,871,LC_ALL=C
subtitles_en_literal_casei,1,5,rg (lines),rg -n -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.5196061134338379,871,
subtitles_en_literal_casei,1,5,rg (lines),rg -n -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.5225450992584229,871,
subtitles_en_literal_casei,1,5,rg (lines),rg -n -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.4856400489807129,871,
subtitles_en_literal_casei,1,5,rg (lines),rg -n -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.5204241275787354,871,
subtitles_en_literal_casei,1,5,rg (lines),rg -n -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.5224106311798096,871,
subtitles_en_literal_casei,1,5,ag (lines) (ASCII),ag -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,2.5935003757476807,871,
subtitles_en_literal_casei,1,5,ag (lines) (ASCII),ag -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,2.640918016433716,871,
subtitles_en_literal_casei,1,5,ag (lines) (ASCII),ag -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,2.602182626724243,871,
subtitles_en_literal_casei,1,5,ag (lines) (ASCII),ag -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,2.575654983520508,871,
subtitles_en_literal_casei,1,5,ag (lines) (ASCII),ag -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,2.5606820583343506,871,
subtitles_en_literal_casei,1,5,ugrep (lines),ugrep -n -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,1.0980546474456787,871,
subtitles_en_literal_casei,1,5,ugrep (lines),ugrep -n -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,1.095038652420044,871,
subtitles_en_literal_casei,1,5,ugrep (lines),ugrep -n -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,1.0974702835083008,871,
subtitles_en_literal_casei,1,5,ugrep (lines),ugrep -n -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,1.113879919052124,871,
subtitles_en_literal_casei,1,5,ugrep (lines),ugrep -n -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,1.1096961498260498,871,
subtitles_en_literal_word,1,5,rg (ASCII),rg -n (?-u:\b)Sherlock Holmes(?-u:\b) /tmp/benchsuite/subtitles/en.sample.txt,0.3175060749053955,830,
subtitles_en_literal_word,1,5,rg (ASCII),rg -n (?-u:\b)Sherlock Holmes(?-u:\b) /tmp/benchsuite/subtitles/en.sample.txt,0.321685791015625,830,
subtitles_en_literal_word,1,5,rg (ASCII),rg -n (?-u:\b)Sherlock Holmes(?-u:\b) /tmp/benchsuite/subtitles/en.sample.txt,0.30799293518066406,830,
subtitles_en_literal_word,1,5,rg (ASCII),rg -n (?-u:\b)Sherlock Holmes(?-u:\b) /tmp/benchsuite/subtitles/en.sample.txt,0.31140613555908203,830,
subtitles_en_literal_word,1,5,rg (ASCII),rg -n (?-u:\b)Sherlock Holmes(?-u:\b) /tmp/benchsuite/subtitles/en.sample.txt,0.32439208030700684,830,
subtitles_en_literal_word,1,5,ag (ASCII),ag -sw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,2.5530965328216553,830,
subtitles_en_literal_word,1,5,ag (ASCII),ag -sw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,2.5833561420440674,830,
subtitles_en_literal_word,1,5,ag (ASCII),ag -sw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,2.5765762329101562,830,
subtitles_en_literal_word,1,5,ag (ASCII),ag -sw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,2.610975742340088,830,
subtitles_en_literal_word,1,5,ag (ASCII),ag -sw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,2.5965471267700195,830,
subtitles_en_literal_word,1,5,grep (ASCII),grep -nw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,1.3212966918945312,830,LC_ALL=C
subtitles_en_literal_word,1,5,grep (ASCII),grep -nw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,1.311401128768921,830,LC_ALL=C
subtitles_en_literal_word,1,5,grep (ASCII),grep -nw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,1.298889398574829,830,LC_ALL=C
subtitles_en_literal_word,1,5,grep (ASCII),grep -nw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,1.316542148590088,830,LC_ALL=C
subtitles_en_literal_word,1,5,grep (ASCII),grep -nw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,1.3483500480651855,830,LC_ALL=C
subtitles_en_literal_word,1,5,ugrep (ASCII),ugrep -nw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.4127326011657715,830,
subtitles_en_literal_word,1,5,ugrep (ASCII),ugrep -nw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.4138009548187256,830,
subtitles_en_literal_word,1,5,ugrep (ASCII),ugrep -nw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.4203319549560547,830,
subtitles_en_literal_word,1,5,ugrep (ASCII),ugrep -nw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.4127979278564453,830,
subtitles_en_literal_word,1,5,ugrep (ASCII),ugrep -nw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.41126537322998047,830,
subtitles_en_literal_word,1,5,rg,rg -nw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.3251321315765381,830,
subtitles_en_literal_word,1,5,rg,rg -nw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.31773900985717773,830,
subtitles_en_literal_word,1,5,rg,rg -nw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.32987523078918457,830,
subtitles_en_literal_word,1,5,rg,rg -nw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.32228970527648926,830,
subtitles_en_literal_word,1,5,rg,rg -nw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,0.3207516670227051,830,
subtitles_en_literal_word,1,5,grep,grep -nw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,1.2946159839630127,830,LC_ALL=en_US.UTF-8
subtitles_en_literal_word,1,5,grep,grep -nw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,1.333972454071045,830,LC_ALL=en_US.UTF-8
subtitles_en_literal_word,1,5,grep,grep -nw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,1.3002500534057617,830,LC_ALL=en_US.UTF-8
subtitles_en_literal_word,1,5,grep,grep -nw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,1.347550630569458,830,LC_ALL=en_US.UTF-8
subtitles_en_literal_word,1,5,grep,grep -nw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt,1.306572675704956,830,LC_ALL=en_US.UTF-8
subtitles_en_alternate,1,5,rg (lines),rg -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,0.4178187847137451,1094,
subtitles_en_alternate,1,5,rg (lines),rg -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,0.44626832008361816,1094,
subtitles_en_alternate,1,5,rg (lines),rg -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,0.44959425926208496,1094,
subtitles_en_alternate,1,5,rg (lines),rg -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,0.38634324073791504,1094,
subtitles_en_alternate,1,5,rg (lines),rg -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,0.4460463523864746,1094,
subtitles_en_alternate,1,5,ag (lines),ag -s Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,3.6045682430267334,1094,
subtitles_en_alternate,1,5,ag (lines),ag -s Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,3.6191344261169434,1094,
subtitles_en_alternate,1,5,ag (lines),ag -s Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,3.579859972000122,1094,
subtitles_en_alternate,1,5,ag (lines),ag -s Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,3.6637580394744873,1094,
subtitles_en_alternate,1,5,ag (lines),ag -s Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,3.5728182792663574,1094,
subtitles_en_alternate,1,5,grep (lines),grep -E -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,3.323948621749878,1094,LC_ALL=C
subtitles_en_alternate,1,5,grep (lines),grep -E -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,3.3338429927825928,1094,LC_ALL=C
subtitles_en_alternate,1,5,grep (lines),grep -E -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,3.34714937210083,1094,LC_ALL=C
subtitles_en_alternate,1,5,grep (lines),grep -E -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,3.314117908477783,1094,LC_ALL=C
subtitles_en_alternate,1,5,grep (lines),grep -E -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,3.303710699081421,1094,LC_ALL=C
subtitles_en_alternate,1,5,ugrep (lines),ugrep -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,1.147033452987671,1094,
subtitles_en_alternate,1,5,ugrep (lines),ugrep -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,1.2054970264434814,1094,
subtitles_en_alternate,1,5,ugrep (lines),ugrep -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,1.0998892784118652,1094,
subtitles_en_alternate,1,5,ugrep (lines),ugrep -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,1.101989984512329,1094,
subtitles_en_alternate,1,5,ugrep (lines),ugrep -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,1.110612154006958,1094,
subtitles_en_alternate,1,5,rg,rg Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,0.29009222984313965,1094,
subtitles_en_alternate,1,5,rg,rg Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,0.29300451278686523,1094,
subtitles_en_alternate,1,5,rg,rg Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,0.3199915885925293,1094,
subtitles_en_alternate,1,5,rg,rg Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,0.3187263011932373,1094,
subtitles_en_alternate,1,5,rg,rg Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,0.30321288108825684,1094,
subtitles_en_alternate,1,5,grep,grep -E Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,2.813009738922119,1094,LC_ALL=C
subtitles_en_alternate,1,5,grep,grep -E Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,2.80930757522583,1094,LC_ALL=C
subtitles_en_alternate,1,5,grep,grep -E Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,2.814509153366089,1094,LC_ALL=C
subtitles_en_alternate,1,5,grep,grep -E Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,2.8390560150146484,1094,LC_ALL=C
subtitles_en_alternate,1,5,grep,grep -E Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,2.830871105194092,1094,LC_ALL=C
subtitles_en_alternate_casei,1,5,ag (ASCII),ag -s -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,6.166510343551636,1136,
subtitles_en_alternate_casei,1,5,ag (ASCII),ag -s -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,6.192304849624634,1136,
subtitles_en_alternate_casei,1,5,ag (ASCII),ag -s -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,6.185140132904053,1136,
subtitles_en_alternate_casei,1,5,ag (ASCII),ag -s -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,6.20132040977478,1136,
subtitles_en_alternate_casei,1,5,ag (ASCII),ag -s -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,6.159040451049805,1136,
subtitles_en_alternate_casei,1,5,grep (ASCII),grep -E -ni Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,5.523138999938965,1136,LC_ALL=C
subtitles_en_alternate_casei,1,5,grep (ASCII),grep -E -ni Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,5.512346267700195,1136,LC_ALL=C
subtitles_en_alternate_casei,1,5,grep (ASCII),grep -E -ni Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,5.562563896179199,1136,LC_ALL=C
subtitles_en_alternate_casei,1,5,grep (ASCII),grep -E -ni Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,5.533160448074341,1136,LC_ALL=C
subtitles_en_alternate_casei,1,5,grep (ASCII),grep -E -ni Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,5.504830837249756,1136,LC_ALL=C
subtitles_en_alternate_casei,1,5,ugrep (ASCII),ugrep -n -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,1.1120033264160156,1136,
subtitles_en_alternate_casei,1,5,ugrep (ASCII),ugrep -n -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,1.1150739192962646,1136,
subtitles_en_alternate_casei,1,5,ugrep (ASCII),ugrep -n -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,1.1018304824829102,1136,
subtitles_en_alternate_casei,1,5,ugrep (ASCII),ugrep -n -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,1.1106996536254883,1136,
subtitles_en_alternate_casei,1,5,ugrep (ASCII),ugrep -n -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,1.0994808673858643,1136,
subtitles_en_alternate_casei,1,5,rg,rg -n -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,0.8494291305541992,1136,
subtitles_en_alternate_casei,1,5,rg,rg -n -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,0.7878148555755615,1136,
subtitles_en_alternate_casei,1,5,rg,rg -n -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,0.8290884494781494,1136,
subtitles_en_alternate_casei,1,5,rg,rg -n -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,0.7409803867340088,1136,
subtitles_en_alternate_casei,1,5,rg,rg -n -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,0.7880558967590332,1136,
subtitles_en_alternate_casei,1,5,grep,grep -E -ni Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,5.5523765087127686,1136,LC_ALL=en_US.UTF-8
subtitles_en_alternate_casei,1,5,grep,grep -E -ni Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,5.527086019515991,1136,LC_ALL=en_US.UTF-8
subtitles_en_alternate_casei,1,5,grep,grep -E -ni Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,5.740911483764648,1136,LC_ALL=en_US.UTF-8
subtitles_en_alternate_casei,1,5,grep,grep -E -ni Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,5.520638465881348,1136,LC_ALL=en_US.UTF-8
subtitles_en_alternate_casei,1,5,grep,grep -E -ni Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt,5.52523398399353,1136,LC_ALL=en_US.UTF-8
subtitles_en_surrounding_words,1,5,rg,rg -n \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt,0.3353078365325928,483,
subtitles_en_surrounding_words,1,5,rg,rg -n \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt,0.3248591423034668,483,
subtitles_en_surrounding_words,1,5,rg,rg -n \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt,0.33918261528015137,483,
subtitles_en_surrounding_words,1,5,rg,rg -n \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt,0.33177971839904785,483,
subtitles_en_surrounding_words,1,5,rg,rg -n \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt,0.34472131729125977,483,
subtitles_en_surrounding_words,1,5,grep,grep -E -n \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt,1.7516274452209473,483,LC_ALL=en_US.UTF-8
subtitles_en_surrounding_words,1,5,grep,grep -E -n \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt,1.7489221096038818,483,LC_ALL=en_US.UTF-8
subtitles_en_surrounding_words,1,5,grep,grep -E -n \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt,1.7574889659881592,483,LC_ALL=en_US.UTF-8
subtitles_en_surrounding_words,1,5,grep,grep -E -n \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt,1.813244342803955,483,LC_ALL=en_US.UTF-8
subtitles_en_surrounding_words,1,5,grep,grep -E -n \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt,1.750051498413086,483,LC_ALL=en_US.UTF-8
subtitles_en_surrounding_words,1,5,ugrep,ugrep -n \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt,70.12419986724854,489,
subtitles_en_surrounding_words,1,5,ugrep,ugrep -n \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt,70.26925611495972,489,
subtitles_en_surrounding_words,1,5,ugrep,ugrep -n \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt,70.56865787506104,489,
subtitles_en_surrounding_words,1,5,ugrep,ugrep -n \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt,70.12933135032654,489,
subtitles_en_surrounding_words,1,5,ugrep,ugrep -n \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt,70.07925295829773,489,
subtitles_en_surrounding_words,1,5,rg (ASCII),rg -n (?-u)\w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt,0.3309454917907715,483,
subtitles_en_surrounding_words,1,5,rg (ASCII),rg -n (?-u)\w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt,0.33062124252319336,483,
subtitles_en_surrounding_words,1,5,rg (ASCII),rg -n (?-u)\w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt,0.3292708396911621,483,
subtitles_en_surrounding_words,1,5,rg (ASCII),rg -n (?-u)\w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt,0.3300509452819824,483,
subtitles_en_surrounding_words,1,5,rg (ASCII),rg -n (?-u)\w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt,0.3252389430999756,483,
subtitles_en_surrounding_words,1,5,ag (ASCII),ag -s \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt,7.372813701629639,489,
subtitles_en_surrounding_words,1,5,ag (ASCII),ag -s \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt,7.338848114013672,489,
subtitles_en_surrounding_words,1,5,ag (ASCII),ag -s \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt,7.739792108535767,489,
subtitles_en_surrounding_words,1,5,ag (ASCII),ag -s \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt,7.302056074142456,489,
subtitles_en_surrounding_words,1,5,ag (ASCII),ag -s \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt,7.334207057952881,489,
subtitles_en_surrounding_words,1,5,grep (ASCII),grep -E -n \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt,1.7617950439453125,483,LC_ALL=C
subtitles_en_surrounding_words,1,5,grep (ASCII),grep -E -n \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt,1.7765378952026367,483,LC_ALL=C
subtitles_en_surrounding_words,1,5,grep (ASCII),grep -E -n \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt,1.7456245422363281,483,LC_ALL=C
subtitles_en_surrounding_words,1,5,grep (ASCII),grep -E -n \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt,1.748713731765747,483,LC_ALL=C
subtitles_en_surrounding_words,1,5,grep (ASCII),grep -E -n \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt,1.7846882343292236,483,LC_ALL=C
subtitles_en_surrounding_words,1,5,ugrep (ASCII),ugrep -n -U \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt,31.14370322227478,489,
subtitles_en_surrounding_words,1,5,ugrep (ASCII),ugrep -n -U \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt,31.543628454208374,489,
subtitles_en_surrounding_words,1,5,ugrep (ASCII),ugrep -n -U \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt,31.133421182632446,489,
subtitles_en_surrounding_words,1,5,ugrep (ASCII),ugrep -n -U \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt,31.149214506149292,489,
subtitles_en_surrounding_words,1,5,ugrep (ASCII),ugrep -n -U \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt,31.180144548416138,489,
subtitles_en_no_literal,1,5,rg,rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt,2.9173591136932373,22,
subtitles_en_no_literal,1,5,rg,rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt,2.867539644241333,22,
subtitles_en_no_literal,1,5,rg,rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt,2.9047088623046875,22,
subtitles_en_no_literal,1,5,rg,rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt,2.9265778064727783,22,
subtitles_en_no_literal,1,5,rg,rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt,2.874317169189453,22,
subtitles_en_no_literal,1,5,ugrep,ugrep -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt,24.619744777679443,309,
subtitles_en_no_literal,1,5,ugrep,ugrep -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt,24.622087240219116,309,
subtitles_en_no_literal,1,5,ugrep,ugrep -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt,24.770710468292236,309,
subtitles_en_no_literal,1,5,ugrep,ugrep -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt,24.60181713104248,309,
subtitles_en_no_literal,1,5,ugrep,ugrep -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt,24.678969383239746,309,
subtitles_en_no_literal,1,5,rg (ASCII),rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt,2.676262140274048,22,
subtitles_en_no_literal,1,5,rg (ASCII),rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt,2.673837184906006,22,
subtitles_en_no_literal,1,5,rg (ASCII),rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt,2.667243003845215,22,
subtitles_en_no_literal,1,5,rg (ASCII),rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt,2.667970657348633,22,
subtitles_en_no_literal,1,5,rg (ASCII),rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt,2.6588196754455566,22,
subtitles_en_no_literal,1,5,ag (ASCII),ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt,10.786212682723999,302,
subtitles_en_no_literal,1,5,ag (ASCII),ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt,10.744041204452515,302,
subtitles_en_no_literal,1,5,ag (ASCII),ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt,10.74718165397644,302,
subtitles_en_no_literal,1,5,ag (ASCII),ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt,10.768681287765503,302,
subtitles_en_no_literal,1,5,ag (ASCII),ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt,10.772834777832031,302,
subtitles_en_no_literal,1,5,grep (ASCII),grep -E -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt,6.287469148635864,22,LC_ALL=C
subtitles_en_no_literal,1,5,grep (ASCII),grep -E -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt,6.243509769439697,22,LC_ALL=C
subtitles_en_no_literal,1,5,grep (ASCII),grep -E -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt,6.242478370666504,22,LC_ALL=C
subtitles_en_no_literal,1,5,grep (ASCII),grep -E -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt,6.2600791454315186,22,LC_ALL=C
subtitles_en_no_literal,1,5,grep (ASCII),grep -E -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt,6.2560741901397705,22,LC_ALL=C
subtitles_en_no_literal,1,5,ugrep (ASCII),ugrep -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt,4.670856237411499,302,
subtitles_en_no_literal,1,5,ugrep (ASCII),ugrep -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt,4.703561544418335,302,
subtitles_en_no_literal,1,5,ugrep (ASCII),ugrep -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt,4.675989627838135,302,
subtitles_en_no_literal,1,5,ugrep (ASCII),ugrep -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt,4.6688103675842285,302,
subtitles_en_no_literal,1,5,ugrep (ASCII),ugrep -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt,4.715432167053223,302,
subtitles_ru_literal,1,5,rg,rg Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.20440673828125,583,
subtitles_ru_literal,1,5,rg,rg Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.20561552047729492,583,
subtitles_ru_literal,1,5,rg,rg Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.2381761074066162,583,
subtitles_ru_literal,1,5,rg,rg Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.23102140426635742,583,
subtitles_ru_literal,1,5,rg,rg Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.19649791717529297,583,
subtitles_ru_literal,1,5,rg (no mmap),rg --no-mmap Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.3158297538757324,583,
subtitles_ru_literal,1,5,rg (no mmap),rg --no-mmap Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.3136112689971924,583,
subtitles_ru_literal,1,5,rg (no mmap),rg --no-mmap Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.32402992248535156,583,
subtitles_ru_literal,1,5,rg (no mmap),rg --no-mmap Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.3248250484466553,583,
subtitles_ru_literal,1,5,rg (no mmap),rg --no-mmap Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.3201103210449219,583,
subtitles_ru_literal,1,5,grep,grep Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.7790360450744629,583,LC_ALL=C
subtitles_ru_literal,1,5,grep,grep Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.7977695465087891,583,LC_ALL=C
subtitles_ru_literal,1,5,grep,grep Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.7397308349609375,583,LC_ALL=C
subtitles_ru_literal,1,5,grep,grep Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.7123947143554688,583,LC_ALL=C
subtitles_ru_literal,1,5,grep,grep Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.711977481842041,583,LC_ALL=C
subtitles_ru_literal,1,5,rg (lines),rg -n Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.27593088150024414,583,
subtitles_ru_literal,1,5,rg (lines),rg -n Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.2842848300933838,583,
subtitles_ru_literal,1,5,rg (lines),rg -n Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.28340864181518555,583,
subtitles_ru_literal,1,5,rg (lines),rg -n Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.28469133377075195,583,
subtitles_ru_literal,1,5,rg (lines),rg -n Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.27951884269714355,583,
subtitles_ru_literal,1,5,ag (lines),ag -s Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,2.7401182651519775,583,
subtitles_ru_literal,1,5,ag (lines),ag -s Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,2.658051013946533,583,
subtitles_ru_literal,1,5,ag (lines),ag -s Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,2.666799306869507,583,
subtitles_ru_literal,1,5,ag (lines),ag -s Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,2.7145025730133057,583,
subtitles_ru_literal,1,5,ag (lines),ag -s Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,2.7412168979644775,583,
subtitles_ru_literal,1,5,grep (lines),grep -n Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,1.0886235237121582,583,LC_ALL=C
subtitles_ru_literal,1,5,grep (lines),grep -n Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,1.0896506309509277,583,LC_ALL=C
subtitles_ru_literal,1,5,grep (lines),grep -n Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,1.1100494861602783,583,LC_ALL=C
subtitles_ru_literal,1,5,grep (lines),grep -n Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,1.088308334350586,583,LC_ALL=C
subtitles_ru_literal,1,5,grep (lines),grep -n Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,1.0891127586364746,583,LC_ALL=C
subtitles_ru_literal,1,5,ugrep (lines),ugrep -n Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,1.8426175117492676,583,
subtitles_ru_literal,1,5,ugrep (lines),ugrep -n Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,1.85064697265625,583,
subtitles_ru_literal,1,5,ugrep (lines),ugrep -n Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,1.8356082439422607,583,
subtitles_ru_literal,1,5,ugrep (lines),ugrep -n Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,1.8405826091766357,583,
subtitles_ru_literal,1,5,ugrep (lines),ugrep -n Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,1.83730149269104,583,
subtitles_ru_literal_casei,1,5,rg,rg -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.48739099502563477,604,
subtitles_ru_literal_casei,1,5,rg,rg -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.4823324680328369,604,
subtitles_ru_literal_casei,1,5,rg,rg -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.4832422733306885,604,
subtitles_ru_literal_casei,1,5,rg,rg -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.4812777042388916,604,
subtitles_ru_literal_casei,1,5,rg,rg -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.4854264259338379,604,
subtitles_ru_literal_casei,1,5,grep,grep -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,6.694453477859497,604,LC_ALL=en_US.UTF-8
subtitles_ru_literal_casei,1,5,grep,grep -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,6.759232044219971,604,LC_ALL=en_US.UTF-8
subtitles_ru_literal_casei,1,5,grep,grep -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,6.686243534088135,604,LC_ALL=en_US.UTF-8
subtitles_ru_literal_casei,1,5,grep,grep -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,6.7029454708099365,604,LC_ALL=en_US.UTF-8
subtitles_ru_literal_casei,1,5,grep,grep -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,6.699738264083862,604,LC_ALL=en_US.UTF-8
subtitles_ru_literal_casei,1,5,grep (ASCII),grep -E -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.7290260791778564,583,LC_ALL=C
subtitles_ru_literal_casei,1,5,grep (ASCII),grep -E -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.7400493621826172,583,LC_ALL=C
subtitles_ru_literal_casei,1,5,grep (ASCII),grep -E -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.7299001216888428,583,LC_ALL=C
subtitles_ru_literal_casei,1,5,grep (ASCII),grep -E -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.7308380603790283,583,LC_ALL=C
subtitles_ru_literal_casei,1,5,grep (ASCII),grep -E -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.7283904552459717,583,LC_ALL=C
subtitles_ru_literal_casei,1,5,rg (lines),rg -n -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.5711629390716553,604,
subtitles_ru_literal_casei,1,5,rg (lines),rg -n -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.574974536895752,604,
subtitles_ru_literal_casei,1,5,rg (lines),rg -n -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.5820963382720947,604,
subtitles_ru_literal_casei,1,5,rg (lines),rg -n -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.5438523292541504,604,
subtitles_ru_literal_casei,1,5,rg (lines),rg -n -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.5054161548614502,604,
subtitles_ru_literal_casei,1,5,ag (lines) (ASCII),ag -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.6135058403015137,,
subtitles_ru_literal_casei,1,5,ag (lines) (ASCII),ag -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.6051545143127441,,
subtitles_ru_literal_casei,1,5,ag (lines) (ASCII),ag -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.6032793521881104,,
subtitles_ru_literal_casei,1,5,ag (lines) (ASCII),ag -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.6097028255462646,,
subtitles_ru_literal_casei,1,5,ag (lines) (ASCII),ag -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.6850666999816895,,
subtitles_ru_literal_casei,1,5,ugrep (lines) (ASCII),ugrep -n -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,1.833592176437378,583,
subtitles_ru_literal_casei,1,5,ugrep (lines) (ASCII),ugrep -n -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,1.8357219696044922,583,
subtitles_ru_literal_casei,1,5,ugrep (lines) (ASCII),ugrep -n -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,1.8394358158111572,583,
subtitles_ru_literal_casei,1,5,ugrep (lines) (ASCII),ugrep -n -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,1.8334264755249023,583,
subtitles_ru_literal_casei,1,5,ugrep (lines) (ASCII),ugrep -n -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,1.8304622173309326,583,
subtitles_ru_literal_word,1,5,rg (ASCII),rg -n (?-u:^|\W)Шерлок Холмс(?-u:$|\W) /tmp/benchsuite/subtitles/ru.txt,0.2904787063598633,583,
subtitles_ru_literal_word,1,5,rg (ASCII),rg -n (?-u:^|\W)Шерлок Холмс(?-u:$|\W) /tmp/benchsuite/subtitles/ru.txt,0.2831101417541504,583,
subtitles_ru_literal_word,1,5,rg (ASCII),rg -n (?-u:^|\W)Шерлок Холмс(?-u:$|\W) /tmp/benchsuite/subtitles/ru.txt,0.2786984443664551,583,
subtitles_ru_literal_word,1,5,rg (ASCII),rg -n (?-u:^|\W)Шерлок Холмс(?-u:$|\W) /tmp/benchsuite/subtitles/ru.txt,0.28719663619995117,583,
subtitles_ru_literal_word,1,5,rg (ASCII),rg -n (?-u:^|\W)Шерлок Холмс(?-u:$|\W) /tmp/benchsuite/subtitles/ru.txt,0.27600622177124023,583,
subtitles_ru_literal_word,1,5,ag (ASCII),ag -sw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.6810102462768555,,
subtitles_ru_literal_word,1,5,ag (ASCII),ag -sw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.6855161190032959,,
subtitles_ru_literal_word,1,5,ag (ASCII),ag -sw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.6827929019927979,,
subtitles_ru_literal_word,1,5,ag (ASCII),ag -sw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.6587810516357422,,
subtitles_ru_literal_word,1,5,ag (ASCII),ag -sw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.6551673412322998,,
subtitles_ru_literal_word,1,5,grep (ASCII),grep -nw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,1.0948495864868164,583,LC_ALL=C
subtitles_ru_literal_word,1,5,grep (ASCII),grep -nw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,1.097151756286621,583,LC_ALL=C
subtitles_ru_literal_word,1,5,grep (ASCII),grep -nw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,1.1051688194274902,583,LC_ALL=C
subtitles_ru_literal_word,1,5,grep (ASCII),grep -nw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,1.1151607036590576,583,LC_ALL=C
subtitles_ru_literal_word,1,5,grep (ASCII),grep -nw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,1.1100919246673584,583,LC_ALL=C
subtitles_ru_literal_word,1,5,ugrep (ASCII),ugrep -nw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,1.84104585647583,,
subtitles_ru_literal_word,1,5,ugrep (ASCII),ugrep -nw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,1.9092209339141846,,
subtitles_ru_literal_word,1,5,ugrep (ASCII),ugrep -nw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,1.836583137512207,,
subtitles_ru_literal_word,1,5,ugrep (ASCII),ugrep -nw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,1.8941335678100586,,
subtitles_ru_literal_word,1,5,ugrep (ASCII),ugrep -nw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,1.8811957836151123,,
subtitles_ru_literal_word,1,5,rg,rg -nw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.2956504821777344,579,
subtitles_ru_literal_word,1,5,rg,rg -nw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.29023194313049316,579,
subtitles_ru_literal_word,1,5,rg,rg -nw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.3374972343444824,579,
subtitles_ru_literal_word,1,5,rg,rg -nw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.29686713218688965,579,
subtitles_ru_literal_word,1,5,rg,rg -nw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,0.29778003692626953,579,
subtitles_ru_literal_word,1,5,grep,grep -nw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,1.1042869091033936,579,LC_ALL=en_US.UTF-8
subtitles_ru_literal_word,1,5,grep,grep -nw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,1.1068925857543945,579,LC_ALL=en_US.UTF-8
subtitles_ru_literal_word,1,5,grep,grep -nw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,1.0973529815673828,579,LC_ALL=en_US.UTF-8
subtitles_ru_literal_word,1,5,grep,grep -nw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,1.0917479991912842,579,LC_ALL=en_US.UTF-8
subtitles_ru_literal_word,1,5,grep,grep -nw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt,1.0987188816070557,579,LC_ALL=en_US.UTF-8
subtitles_ru_alternate,1,5,rg (lines),rg -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,0.8945937156677246,691,
subtitles_ru_alternate,1,5,rg (lines),rg -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,0.8919808864593506,691,
subtitles_ru_alternate,1,5,rg (lines),rg -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,0.9041986465454102,691,
subtitles_ru_alternate,1,5,rg (lines),rg -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,0.8838107585906982,691,
subtitles_ru_alternate,1,5,rg (lines),rg -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,0.903540849685669,691,
subtitles_ru_alternate,1,5,ag (lines),ag -s Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,3.715298652648926,691,
subtitles_ru_alternate,1,5,ag (lines),ag -s Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,3.676830530166626,691,
subtitles_ru_alternate,1,5,ag (lines),ag -s Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,3.721431016921997,691,
subtitles_ru_alternate,1,5,ag (lines),ag -s Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,3.6990325450897217,691,
subtitles_ru_alternate,1,5,ag (lines),ag -s Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,3.764216184616089,691,
subtitles_ru_alternate,1,5,grep (lines),grep -E -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,8.519805669784546,691,LC_ALL=C
subtitles_ru_alternate,1,5,grep (lines),grep -E -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,8.40212869644165,691,LC_ALL=C
subtitles_ru_alternate,1,5,grep (lines),grep -E -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,8.381818294525146,691,LC_ALL=C
subtitles_ru_alternate,1,5,grep (lines),grep -E -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,8.386401176452637,691,LC_ALL=C
subtitles_ru_alternate,1,5,grep (lines),grep -E -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,8.425997257232666,691,LC_ALL=C
subtitles_ru_alternate,1,5,ugrep (lines),ugrep -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,13.259684801101685,691,
subtitles_ru_alternate,1,5,ugrep (lines),ugrep -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,13.236181735992432,691,
subtitles_ru_alternate,1,5,ugrep (lines),ugrep -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,13.340983629226685,691,
subtitles_ru_alternate,1,5,ugrep (lines),ugrep -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,13.21895980834961,691,
subtitles_ru_alternate,1,5,ugrep (lines),ugrep -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,13.194425106048584,691,
subtitles_ru_alternate,1,5,rg,rg Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,0.8262777328491211,691,
subtitles_ru_alternate,1,5,rg,rg Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,0.8343832492828369,691,
subtitles_ru_alternate,1,5,rg,rg Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,0.8675012588500977,691,
subtitles_ru_alternate,1,5,rg,rg Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,0.8584244251251221,691,
subtitles_ru_alternate,1,5,rg,rg Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,0.8777158260345459,691,
subtitles_ru_alternate,1,5,grep,grep -E Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,8.25586986541748,691,LC_ALL=C
subtitles_ru_alternate,1,5,grep,grep -E Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,8.007173538208008,691,LC_ALL=C
subtitles_ru_alternate,1,5,grep,grep -E Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,8.068726301193237,691,LC_ALL=C
subtitles_ru_alternate,1,5,grep,grep -E Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,8.010542631149292,691,LC_ALL=C
subtitles_ru_alternate,1,5,grep,grep -E Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,8.021028280258179,691,LC_ALL=C
subtitles_ru_alternate_casei,1,5,ag (ASCII),ag -s -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,3.7179486751556396,691,
subtitles_ru_alternate_casei,1,5,ag (ASCII),ag -s -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,3.682896375656128,691,
subtitles_ru_alternate_casei,1,5,ag (ASCII),ag -s -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,3.699859142303467,691,
subtitles_ru_alternate_casei,1,5,ag (ASCII),ag -s -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,3.662733316421509,691,
subtitles_ru_alternate_casei,1,5,ag (ASCII),ag -s -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,3.661060094833374,691,
subtitles_ru_alternate_casei,1,5,grep (ASCII),grep -E -ni Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,8.434819221496582,691,LC_ALL=C
subtitles_ru_alternate_casei,1,5,grep (ASCII),grep -E -ni Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,8.436205625534058,691,LC_ALL=C
subtitles_ru_alternate_casei,1,5,grep (ASCII),grep -E -ni Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,8.388120412826538,691,LC_ALL=C
subtitles_ru_alternate_casei,1,5,grep (ASCII),grep -E -ni Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,8.407799243927002,691,LC_ALL=C
subtitles_ru_alternate_casei,1,5,grep (ASCII),grep -E -ni Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,8.44464373588562,691,LC_ALL=C
subtitles_ru_alternate_casei,1,5,ugrep (ASCII),ugrep -n -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,13.216991662979126,691,
subtitles_ru_alternate_casei,1,5,ugrep (ASCII),ugrep -n -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,13.470320701599121,691,
subtitles_ru_alternate_casei,1,5,ugrep (ASCII),ugrep -n -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,13.21274471282959,691,
subtitles_ru_alternate_casei,1,5,ugrep (ASCII),ugrep -n -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,13.38324522972107,691,
subtitles_ru_alternate_casei,1,5,ugrep (ASCII),ugrep -n -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,13.3148832321167,691,
subtitles_ru_alternate_casei,1,5,rg,rg -n -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,6.205031156539917,735,
subtitles_ru_alternate_casei,1,5,rg,rg -n -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,6.1502509117126465,735,
subtitles_ru_alternate_casei,1,5,rg,rg -n -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,6.150696516036987,735,
subtitles_ru_alternate_casei,1,5,rg,rg -n -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,6.150148630142212,735,
subtitles_ru_alternate_casei,1,5,rg,rg -n -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,6.153124809265137,735,
subtitles_ru_alternate_casei,1,5,grep,grep -E -ni Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,7.477111339569092,735,LC_ALL=en_US.UTF-8
subtitles_ru_alternate_casei,1,5,grep,grep -E -ni Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,7.483617782592773,735,LC_ALL=en_US.UTF-8
subtitles_ru_alternate_casei,1,5,grep,grep -E -ni Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,7.502292156219482,735,LC_ALL=en_US.UTF-8
subtitles_ru_alternate_casei,1,5,grep,grep -E -ni Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,7.528963327407837,735,LC_ALL=en_US.UTF-8
subtitles_ru_alternate_casei,1,5,grep,grep -E -ni Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt,7.482379198074341,735,LC_ALL=en_US.UTF-8
subtitles_ru_surrounding_words,1,5,rg,rg -n \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt,0.3461883068084717,278,
subtitles_ru_surrounding_words,1,5,rg,rg -n \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt,0.30211687088012695,278,
subtitles_ru_surrounding_words,1,5,rg,rg -n \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt,0.30521416664123535,278,
subtitles_ru_surrounding_words,1,5,rg,rg -n \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt,0.2969543933868408,278,
subtitles_ru_surrounding_words,1,5,rg,rg -n \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt,0.3003671169281006,278,
subtitles_ru_surrounding_words,1,5,grep,grep -E -n \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt,1.4209251403808594,278,LC_ALL=en_US.UTF-8
subtitles_ru_surrounding_words,1,5,grep,grep -E -n \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt,1.4190807342529297,278,LC_ALL=en_US.UTF-8
subtitles_ru_surrounding_words,1,5,grep,grep -E -n \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt,1.4178283214569092,278,LC_ALL=en_US.UTF-8
subtitles_ru_surrounding_words,1,5,grep,grep -E -n \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt,1.4173235893249512,278,LC_ALL=en_US.UTF-8
subtitles_ru_surrounding_words,1,5,grep,grep -E -n \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt,1.4221296310424805,278,LC_ALL=en_US.UTF-8
subtitles_ru_surrounding_words,1,5,ugrep,ugrep -n \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt,70.6701226234436,326,
subtitles_ru_surrounding_words,1,5,ugrep,ugrep -n \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt,71.15788650512695,326,
subtitles_ru_surrounding_words,1,5,ugrep,ugrep -n \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt,71.07276272773743,326,
subtitles_ru_surrounding_words,1,5,ugrep,ugrep -n \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt,70.5626060962677,326,
subtitles_ru_surrounding_words,1,5,ugrep,ugrep -n \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt,70.54449439048767,326,
subtitles_ru_surrounding_words,1,5,ag (ASCII),ag -s \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt,1.868441104888916,,
subtitles_ru_surrounding_words,1,5,ag (ASCII),ag -s \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt,1.886382818222046,,
subtitles_ru_surrounding_words,1,5,ag (ASCII),ag -s \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt,1.8685986995697021,,
subtitles_ru_surrounding_words,1,5,ag (ASCII),ag -s \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt,1.8727426528930664,,
subtitles_ru_surrounding_words,1,5,ag (ASCII),ag -s \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt,1.8667800426483154,,
subtitles_ru_surrounding_words,1,5,grep (ASCII),grep -E -n \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt,1.3818490505218506,,LC_ALL=C
subtitles_ru_surrounding_words,1,5,grep (ASCII),grep -E -n \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt,1.3709721565246582,,LC_ALL=C
subtitles_ru_surrounding_words,1,5,grep (ASCII),grep -E -n \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt,1.3819043636322021,,LC_ALL=C
subtitles_ru_surrounding_words,1,5,grep (ASCII),grep -E -n \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt,1.460402488708496,,LC_ALL=C
subtitles_ru_surrounding_words,1,5,grep (ASCII),grep -E -n \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt,1.4097135066986084,,LC_ALL=C
subtitles_ru_surrounding_words,1,5,ugrep (ASCII),ugrep -n -U \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt,1.286102294921875,,
subtitles_ru_surrounding_words,1,5,ugrep (ASCII),ugrep -n -U \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt,1.2712647914886475,,
subtitles_ru_surrounding_words,1,5,ugrep (ASCII),ugrep -n -U \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt,1.2950100898742676,,
subtitles_ru_surrounding_words,1,5,ugrep (ASCII),ugrep -n -U \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt,1.264500617980957,,
subtitles_ru_surrounding_words,1,5,ugrep (ASCII),ugrep -n -U \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt,1.2877566814422607,,
subtitles_ru_no_literal,1,5,rg,rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt,3.1152236461639404,41,
subtitles_ru_no_literal,1,5,rg,rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt,3.1311423778533936,41,
subtitles_ru_no_literal,1,5,rg,rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt,3.0800061225891113,41,
subtitles_ru_no_literal,1,5,rg,rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt,3.070636510848999,41,
subtitles_ru_no_literal,1,5,rg,rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt,3.0940587520599365,41,
subtitles_ru_no_literal,1,5,ugrep,ugrep -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt,50.85447072982788,86,
subtitles_ru_no_literal,1,5,ugrep,ugrep -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt,50.832582235336304,86,
subtitles_ru_no_literal,1,5,ugrep,ugrep -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt,50.8755087852478,86,
subtitles_ru_no_literal,1,5,ugrep,ugrep -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt,50.79056358337402,86,
subtitles_ru_no_literal,1,5,ugrep,ugrep -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt,50.84795618057251,86,
subtitles_ru_no_literal,1,5,rg (ASCII),rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt,2.716826915740967,,
subtitles_ru_no_literal,1,5,rg (ASCII),rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt,2.7381114959716797,,
subtitles_ru_no_literal,1,5,rg (ASCII),rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt,2.7545180320739746,,
subtitles_ru_no_literal,1,5,rg (ASCII),rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt,2.7215416431427,,
subtitles_ru_no_literal,1,5,rg (ASCII),rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt,2.707784414291382,,
subtitles_ru_no_literal,1,5,ag (ASCII),ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt,1.9250116348266602,,
subtitles_ru_no_literal,1,5,ag (ASCII),ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt,1.8956947326660156,,
subtitles_ru_no_literal,1,5,ag (ASCII),ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt,1.8904175758361816,,
subtitles_ru_no_literal,1,5,ag (ASCII),ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt,1.8968868255615234,,
subtitles_ru_no_literal,1,5,ag (ASCII),ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt,1.900888204574585,,
subtitles_ru_no_literal,1,5,grep (ASCII),grep -E -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt,1.755054235458374,,LC_ALL=C
subtitles_ru_no_literal,1,5,grep (ASCII),grep -E -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt,1.7681376934051514,,LC_ALL=C
subtitles_ru_no_literal,1,5,grep (ASCII),grep -E -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt,1.7654614448547363,,LC_ALL=C
subtitles_ru_no_literal,1,5,grep (ASCII),grep -E -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt,1.75648832321167,,LC_ALL=C
subtitles_ru_no_literal,1,5,grep (ASCII),grep -E -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt,1.7456772327423096,,LC_ALL=C
subtitles_ru_no_literal,1,5,ugrep (ASCII),ugrep -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt,1.2170698642730713,,
subtitles_ru_no_literal,1,5,ugrep (ASCII),ugrep -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt,1.1907124519348145,,
subtitles_ru_no_literal,1,5,ugrep (ASCII),ugrep -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt,1.1722266674041748,,
subtitles_ru_no_literal,1,5,ugrep (ASCII),ugrep -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt,1.191617727279663,,
subtitles_ru_no_literal,1,5,ugrep (ASCII),ugrep -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt,1.1909863948822021,,
1 benchmark warmup_iter iter name command duration lines env
2 linux_literal_default 1 5 rg rg PM_RESUME 0.12675833702087402 19
3 linux_literal_default 1 5 rg rg PM_RESUME 0.1196434497833252 19
4 linux_literal_default 1 5 rg rg PM_RESUME 0.12096214294433594 19
5 linux_literal_default 1 5 rg rg PM_RESUME 0.1257617473602295 19
6 linux_literal_default 1 5 rg rg PM_RESUME 0.12903356552124023 19
7 linux_literal_default 1 5 ag ag PM_RESUME 0.8575565814971924 19
8 linux_literal_default 1 5 ag ag PM_RESUME 0.9113664627075195 19
9 linux_literal_default 1 5 ag ag PM_RESUME 0.944256067276001 19
10 linux_literal_default 1 5 ag ag PM_RESUME 0.5309450626373291 19
11 linux_literal_default 1 5 ag ag PM_RESUME 0.6105470657348633 19
12 linux_literal_default 1 5 git grep git grep PM_RESUME 0.49039149284362793 19 LC_ALL=en_US.UTF-8
13 linux_literal_default 1 5 git grep git grep PM_RESUME 0.48095154762268066 19 LC_ALL=en_US.UTF-8
14 linux_literal_default 1 5 git grep git grep PM_RESUME 0.48927950859069824 19 LC_ALL=en_US.UTF-8
15 linux_literal_default 1 5 git grep git grep PM_RESUME 0.47182321548461914 19 LC_ALL=en_US.UTF-8
16 linux_literal_default 1 5 git grep git grep PM_RESUME 0.46923041343688965 19 LC_ALL=en_US.UTF-8
17 linux_literal_default 1 5 ugrep ugrep -r PM_RESUME ./ 0.13612771034240723 19
18 linux_literal_default 1 5 ugrep ugrep -r PM_RESUME ./ 0.13677191734313965 19
19 linux_literal_default 1 5 ugrep ugrep -r PM_RESUME ./ 0.13688087463378906 19
20 linux_literal_default 1 5 ugrep ugrep -r PM_RESUME ./ 0.13218474388122559 19
21 linux_literal_default 1 5 ugrep ugrep -r PM_RESUME ./ 0.13851046562194824 19
22 linux_literal_default 1 5 grep grep -r PM_RESUME ./ 1.1436240673065186 19 LC_ALL=en_US.UTF-8
23 linux_literal_default 1 5 grep grep -r PM_RESUME ./ 1.1436970233917236 19 LC_ALL=en_US.UTF-8
24 linux_literal_default 1 5 grep grep -r PM_RESUME ./ 1.1542651653289795 19 LC_ALL=en_US.UTF-8
25 linux_literal_default 1 5 grep grep -r PM_RESUME ./ 1.14790940284729 19 LC_ALL=en_US.UTF-8
26 linux_literal_default 1 5 grep grep -r PM_RESUME ./ 1.1441664695739746 19 LC_ALL=en_US.UTF-8
27 linux_literal 1 5 rg rg -n PM_RESUME 0.134232759475708 19
28 linux_literal 1 5 rg rg -n PM_RESUME 0.12477993965148926 19
29 linux_literal 1 5 rg rg -n PM_RESUME 0.11790871620178223 19
30 linux_literal 1 5 rg rg -n PM_RESUME 0.13471150398254395 19
31 linux_literal 1 5 rg rg -n PM_RESUME 0.13730239868164062 19
32 linux_literal 1 5 rg (mmap) rg -n --mmap PM_RESUME 1.2953157424926758 19
33 linux_literal 1 5 rg (mmap) rg -n --mmap PM_RESUME 1.3263885974884033 19
34 linux_literal 1 5 rg (mmap) rg -n --mmap PM_RESUME 1.320932388305664 19
35 linux_literal 1 5 rg (mmap) rg -n --mmap PM_RESUME 1.3446438312530518 19
36 linux_literal 1 5 rg (mmap) rg -n --mmap PM_RESUME 1.3919141292572021 19
37 linux_literal 1 5 ag (mmap) ag -s PM_RESUME 0.7901346683502197 19
38 linux_literal 1 5 ag (mmap) ag -s PM_RESUME 0.9647164344787598 19
39 linux_literal 1 5 ag (mmap) ag -s PM_RESUME 0.8800022602081299 19
40 linux_literal 1 5 ag (mmap) ag -s PM_RESUME 0.9307558536529541 19
41 linux_literal 1 5 ag (mmap) ag -s PM_RESUME 0.8346366882324219 19
42 linux_literal 1 5 git grep git grep -I -n PM_RESUME 0.4694955348968506 19 LC_ALL=C
43 linux_literal 1 5 git grep git grep -I -n PM_RESUME 0.4620368480682373 19 LC_ALL=C
44 linux_literal 1 5 git grep git grep -I -n PM_RESUME 0.4673285484313965 19 LC_ALL=C
45 linux_literal 1 5 git grep git grep -I -n PM_RESUME 0.4570960998535156 19 LC_ALL=C
46 linux_literal 1 5 git grep git grep -I -n PM_RESUME 0.4648761749267578 19 LC_ALL=C
47 linux_literal 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n PM_RESUME ./ 0.3233473300933838 19
48 linux_literal 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n PM_RESUME ./ 0.3199331760406494 19
49 linux_literal 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n PM_RESUME ./ 0.29825615882873535 19
50 linux_literal 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n PM_RESUME ./ 0.3003232479095459 19
51 linux_literal 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n PM_RESUME ./ 0.30283141136169434 19
52 linux_literal_casei 1 5 rg rg -n -i PM_RESUME 0.1349015235900879 456
53 linux_literal_casei 1 5 rg rg -n -i PM_RESUME 0.1277780532836914 456
54 linux_literal_casei 1 5 rg rg -n -i PM_RESUME 0.1251516342163086 456
55 linux_literal_casei 1 5 rg rg -n -i PM_RESUME 0.12959671020507812 456
56 linux_literal_casei 1 5 rg rg -n -i PM_RESUME 0.1374528408050537 456
57 linux_literal_casei 1 5 rg (mmap) rg -n -i --mmap PM_RESUME 1.3468265533447266 456
58 linux_literal_casei 1 5 rg (mmap) rg -n -i --mmap PM_RESUME 1.3552894592285156 456
59 linux_literal_casei 1 5 rg (mmap) rg -n -i --mmap PM_RESUME 1.3028552532196045 456
60 linux_literal_casei 1 5 rg (mmap) rg -n -i --mmap PM_RESUME 1.336735725402832 456
61 linux_literal_casei 1 5 rg (mmap) rg -n -i --mmap PM_RESUME 1.338634729385376 456
62 linux_literal_casei 1 5 ag (mmap) ag -i PM_RESUME 0.5562450885772705 456
63 linux_literal_casei 1 5 ag (mmap) ag -i PM_RESUME 0.7324790954589844 456
64 linux_literal_casei 1 5 ag (mmap) ag -i PM_RESUME 0.8382794857025146 456
65 linux_literal_casei 1 5 ag (mmap) ag -i PM_RESUME 0.5817627906799316 456
66 linux_literal_casei 1 5 ag (mmap) ag -i PM_RESUME 0.5771033763885498 456
67 linux_literal_casei 1 5 git grep git grep -I -n -i PM_RESUME 0.48885059356689453 456 LC_ALL=C
68 linux_literal_casei 1 5 git grep git grep -I -n -i PM_RESUME 0.4838893413543701 456 LC_ALL=C
69 linux_literal_casei 1 5 git grep git grep -I -n -i PM_RESUME 0.48733997344970703 456 LC_ALL=C
70 linux_literal_casei 1 5 git grep git grep -I -n -i PM_RESUME 0.4765594005584717 456 LC_ALL=C
71 linux_literal_casei 1 5 git grep git grep -I -n -i PM_RESUME 0.47402334213256836 456 LC_ALL=C
72 linux_literal_casei 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n -i PM_RESUME ./ 0.3075406551361084 456
73 linux_literal_casei 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n -i PM_RESUME ./ 0.2922379970550537 456
74 linux_literal_casei 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n -i PM_RESUME ./ 0.2901036739349365 456
75 linux_literal_casei 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n -i PM_RESUME ./ 0.2723674774169922 456
76 linux_literal_casei 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n -i PM_RESUME ./ 0.2762429714202881 456
77 linux_re_literal_suffix 1 5 rg rg -n [A-Z]+_RESUME 0.12853646278381348 1944
78 linux_re_literal_suffix 1 5 rg rg -n [A-Z]+_RESUME 0.1190040111541748 1944
79 linux_re_literal_suffix 1 5 rg rg -n [A-Z]+_RESUME 0.14054393768310547 1944
80 linux_re_literal_suffix 1 5 rg rg -n [A-Z]+_RESUME 0.12263894081115723 1944
81 linux_re_literal_suffix 1 5 rg rg -n [A-Z]+_RESUME 0.12101268768310547 1944
82 linux_re_literal_suffix 1 5 ag ag -s [A-Z]+_RESUME 0.9220716953277588 1944
83 linux_re_literal_suffix 1 5 ag ag -s [A-Z]+_RESUME 1.009810209274292 1944
84 linux_re_literal_suffix 1 5 ag ag -s [A-Z]+_RESUME 0.9654982089996338 1944
85 linux_re_literal_suffix 1 5 ag ag -s [A-Z]+_RESUME 1.2758586406707764 1944
86 linux_re_literal_suffix 1 5 ag ag -s [A-Z]+_RESUME 1.0480666160583496 1944
87 linux_re_literal_suffix 1 5 git grep git grep -E -I -n [A-Z]+_RESUME 1.1811027526855469 1944 LC_ALL=C
88 linux_re_literal_suffix 1 5 git grep git grep -E -I -n [A-Z]+_RESUME 1.1824719905853271 1944 LC_ALL=C
89 linux_re_literal_suffix 1 5 git grep git grep -E -I -n [A-Z]+_RESUME 1.2052066326141357 1944 LC_ALL=C
90 linux_re_literal_suffix 1 5 git grep git grep -E -I -n [A-Z]+_RESUME 1.224193811416626 1944 LC_ALL=C
91 linux_re_literal_suffix 1 5 git grep git grep -E -I -n [A-Z]+_RESUME 1.2896029949188232 1944 LC_ALL=C
92 linux_re_literal_suffix 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n [A-Z]+_RESUME ./ 0.5580098628997803 1944
93 linux_re_literal_suffix 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n [A-Z]+_RESUME ./ 0.5409820079803467 1944
94 linux_re_literal_suffix 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n [A-Z]+_RESUME ./ 0.5436761379241943 1944
95 linux_re_literal_suffix 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n [A-Z]+_RESUME ./ 0.5317332744598389 1944
96 linux_re_literal_suffix 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n [A-Z]+_RESUME ./ 0.5662341117858887 1944
97 linux_word 1 5 rg rg -n -w PM_RESUME 0.13112211227416992 6
98 linux_word 1 5 rg rg -n -w PM_RESUME 0.13633346557617188 6
99 linux_word 1 5 rg rg -n -w PM_RESUME 0.1308743953704834 6
100 linux_word 1 5 rg rg -n -w PM_RESUME 0.13691973686218262 6
101 linux_word 1 5 rg rg -n -w PM_RESUME 0.1369326114654541 6
102 linux_word 1 5 ag ag -s -w PM_RESUME 0.5965347290039062 6
103 linux_word 1 5 ag ag -s -w PM_RESUME 0.8891518115997314 6
104 linux_word 1 5 ag ag -s -w PM_RESUME 0.5207972526550293 6
105 linux_word 1 5 ag ag -s -w PM_RESUME 0.5551142692565918 6
106 linux_word 1 5 ag ag -s -w PM_RESUME 0.5308854579925537 6
107 linux_word 1 5 git grep git grep -E -I -n -w PM_RESUME 0.45984363555908203 6 LC_ALL=C
108 linux_word 1 5 git grep git grep -E -I -n -w PM_RESUME 0.47351694107055664 6 LC_ALL=C
109 linux_word 1 5 git grep git grep -E -I -n -w PM_RESUME 0.5011758804321289 6 LC_ALL=C
110 linux_word 1 5 git grep git grep -E -I -n -w PM_RESUME 0.45740509033203125 6 LC_ALL=C
111 linux_word 1 5 git grep git grep -E -I -n -w PM_RESUME 0.46122002601623535 6 LC_ALL=C
112 linux_word 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n -w PM_RESUME ./ 0.3174629211425781 6
113 linux_word 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n -w PM_RESUME ./ 0.32368993759155273 6
114 linux_word 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n -w PM_RESUME ./ 0.3131399154663086 6
115 linux_word 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n -w PM_RESUME ./ 0.2834908962249756 6
116 linux_word 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n -w PM_RESUME ./ 0.2899782657623291 6
117 linux_unicode_greek 1 5 rg rg -n \p{Greek} 0.2624638080596924 105
118 linux_unicode_greek 1 5 rg rg -n \p{Greek} 0.26248669624328613 105
119 linux_unicode_greek 1 5 rg rg -n \p{Greek} 0.26514244079589844 105
120 linux_unicode_greek 1 5 rg rg -n \p{Greek} 0.26303768157958984 105
121 linux_unicode_greek 1 5 rg rg -n \p{Greek} 0.2612752914428711 105
122 linux_unicode_greek 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n \p{Greek} ./ 0.2842683792114258 105
123 linux_unicode_greek 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n \p{Greek} ./ 0.2718374729156494 105
124 linux_unicode_greek 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n \p{Greek} ./ 0.26900339126586914 105
125 linux_unicode_greek 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n \p{Greek} ./ 0.267728328704834 105
126 linux_unicode_greek 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n \p{Greek} ./ 0.27019381523132324 105
127 linux_unicode_greek_casei 1 5 rg rg -n -i \p{Greek} 0.24460315704345703 225
128 linux_unicode_greek_casei 1 5 rg rg -n -i \p{Greek} 0.2752077579498291 225
129 linux_unicode_greek_casei 1 5 rg rg -n -i \p{Greek} 0.25118350982666016 225
130 linux_unicode_greek_casei 1 5 rg rg -n -i \p{Greek} 0.2610158920288086 225
131 linux_unicode_greek_casei 1 5 rg rg -n -i \p{Greek} 0.24675774574279785 225
132 linux_unicode_greek_casei 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n -i \p{Greek} ./ 0.26882410049438477 105
133 linux_unicode_greek_casei 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n -i \p{Greek} ./ 0.2770118713378906 105
134 linux_unicode_greek_casei 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n -i \p{Greek} ./ 0.2694118022918701 105
135 linux_unicode_greek_casei 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n -i \p{Greek} ./ 0.2690916061401367 105
136 linux_unicode_greek_casei 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n -i \p{Greek} ./ 0.2686276435852051 105
137 linux_unicode_word 1 5 rg rg -n \wAh 0.13727664947509766 229
138 linux_unicode_word 1 5 rg rg -n \wAh 0.1450798511505127 229
139 linux_unicode_word 1 5 rg rg -n \wAh 0.13819336891174316 229
140 linux_unicode_word 1 5 rg rg -n \wAh 0.1422877311706543 229
141 linux_unicode_word 1 5 rg rg -n \wAh 0.13657712936401367 229
142 linux_unicode_word 1 5 rg (ASCII) rg -n (?-u)\wAh 0.1487271785736084 216
143 linux_unicode_word 1 5 rg (ASCII) rg -n (?-u)\wAh 0.1459641456604004 216
144 linux_unicode_word 1 5 rg (ASCII) rg -n (?-u)\wAh 0.13515281677246094 216
145 linux_unicode_word 1 5 rg (ASCII) rg -n (?-u)\wAh 0.12724566459655762 216
146 linux_unicode_word 1 5 rg (ASCII) rg -n (?-u)\wAh 0.13360023498535156 216
147 linux_unicode_word 1 5 ag (ASCII) ag -s \wAh 1.2160453796386719 216
148 linux_unicode_word 1 5 ag (ASCII) ag -s \wAh 1.230163335800171 216
149 linux_unicode_word 1 5 ag (ASCII) ag -s \wAh 1.2649273872375488 216
150 linux_unicode_word 1 5 ag (ASCII) ag -s \wAh 1.224984884262085 216
151 linux_unicode_word 1 5 ag (ASCII) ag -s \wAh 1.4559555053710938 216
152 linux_unicode_word 1 5 git grep git grep -E -I -n \wAh 8.233768224716187 229 LC_ALL=en_US.UTF-8
153 linux_unicode_word 1 5 git grep git grep -E -I -n \wAh 8.191053867340088 229 LC_ALL=en_US.UTF-8
154 linux_unicode_word 1 5 git grep git grep -E -I -n \wAh 8.175920724868774 229 LC_ALL=en_US.UTF-8
155 linux_unicode_word 1 5 git grep git grep -E -I -n \wAh 8.167959451675415 229 LC_ALL=en_US.UTF-8
156 linux_unicode_word 1 5 git grep git grep -E -I -n \wAh 8.1710205078125 229 LC_ALL=en_US.UTF-8
157 linux_unicode_word 1 5 git grep (ASCII) git grep -E -I -n \wAh 2.3747494220733643 216 LC_ALL=C
158 linux_unicode_word 1 5 git grep (ASCII) git grep -E -I -n \wAh 2.3170926570892334 216 LC_ALL=C
159 linux_unicode_word 1 5 git grep (ASCII) git grep -E -I -n \wAh 2.3430888652801514 216 LC_ALL=C
160 linux_unicode_word 1 5 git grep (ASCII) git grep -E -I -n \wAh 2.3219168186187744 216 LC_ALL=C
161 linux_unicode_word 1 5 git grep (ASCII) git grep -E -I -n \wAh 2.3155832290649414 216 LC_ALL=C
162 linux_unicode_word 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n \wAh ./ 0.2722008228302002 229
163 linux_unicode_word 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n \wAh ./ 0.27547430992126465 229
164 linux_unicode_word 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n \wAh ./ 0.2771613597869873 229
165 linux_unicode_word 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n \wAh ./ 0.27692317962646484 229
166 linux_unicode_word 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n \wAh ./ 0.27749085426330566 229
167 linux_unicode_word 1 5 ugrep (ASCII) ugrep -r --ignore-files --no-hidden -I -n -U \wAh ./ 0.2744929790496826 216
168 linux_unicode_word 1 5 ugrep (ASCII) ugrep -r --ignore-files --no-hidden -I -n -U \wAh ./ 0.2725999355316162 216
169 linux_unicode_word 1 5 ugrep (ASCII) ugrep -r --ignore-files --no-hidden -I -n -U \wAh ./ 0.27443718910217285 216
170 linux_unicode_word 1 5 ugrep (ASCII) ugrep -r --ignore-files --no-hidden -I -n -U \wAh ./ 0.2668039798736572 216
171 linux_unicode_word 1 5 ugrep (ASCII) ugrep -r --ignore-files --no-hidden -I -n -U \wAh ./ 0.27918338775634766 216
172 linux_no_literal 1 5 rg rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} 0.38802123069763184 611
173 linux_no_literal 1 5 rg rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} 0.40351152420043945 611
174 linux_no_literal 1 5 rg rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} 0.40592288970947266 611
175 linux_no_literal 1 5 rg rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} 0.40622901916503906 611
176 linux_no_literal 1 5 rg rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} 0.40683722496032715 611
177 linux_no_literal 1 5 rg (ASCII) rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} 0.2553420066833496 610
178 linux_no_literal 1 5 rg (ASCII) rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} 0.2511327266693115 610
179 linux_no_literal 1 5 rg (ASCII) rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} 0.2530384063720703 610
180 linux_no_literal 1 5 rg (ASCII) rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} 0.2420644760131836 610
181 linux_no_literal 1 5 rg (ASCII) rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} 0.2691671848297119 610
182 linux_no_literal 1 5 ag (ASCII) ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} 0.9446702003479004 971
183 linux_no_literal 1 5 ag (ASCII) ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} 0.9380638599395752 971
184 linux_no_literal 1 5 ag (ASCII) ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} 0.9273786544799805 971
185 linux_no_literal 1 5 ag (ASCII) ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} 0.9271430969238281 971
186 linux_no_literal 1 5 ag (ASCII) ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} 0.9307007789611816 971
187 linux_no_literal 1 5 git grep git grep -E -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} 14.531656265258789 611 LC_ALL=en_US.UTF-8
188 linux_no_literal 1 5 git grep git grep -E -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} 14.570266008377075 611 LC_ALL=en_US.UTF-8
189 linux_no_literal 1 5 git grep git grep -E -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} 14.51328158378601 611 LC_ALL=en_US.UTF-8
190 linux_no_literal 1 5 git grep git grep -E -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} 14.644389629364014 611 LC_ALL=en_US.UTF-8
191 linux_no_literal 1 5 git grep git grep -E -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} 14.694648027420044 611 LC_ALL=en_US.UTF-8
192 linux_no_literal 1 5 git grep (ASCII) git grep -E -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} 3.164829730987549 610 LC_ALL=C
193 linux_no_literal 1 5 git grep (ASCII) git grep -E -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} 3.2377045154571533 610 LC_ALL=C
194 linux_no_literal 1 5 git grep (ASCII) git grep -E -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} 3.1798932552337646 610 LC_ALL=C
195 linux_no_literal 1 5 git grep (ASCII) git grep -E -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} 3.142343044281006 610 LC_ALL=C
196 linux_no_literal 1 5 git grep (ASCII) git grep -E -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} 3.185952663421631 610 LC_ALL=C
197 linux_no_literal 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} ./ 6.241358041763306 973
198 linux_no_literal 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} ./ 6.213250637054443 973
199 linux_no_literal 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} ./ 6.242088079452515 973
200 linux_no_literal 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} ./ 6.126717567443848 973
201 linux_no_literal 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} ./ 6.15744948387146 973
202 linux_no_literal 1 5 ugrep (ASCII) ugrep -r --ignore-files --no-hidden -I -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} ./ 0.3647449016571045 972
203 linux_no_literal 1 5 ugrep (ASCII) ugrep -r --ignore-files --no-hidden -I -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} ./ 0.36277341842651367 972
204 linux_no_literal 1 5 ugrep (ASCII) ugrep -r --ignore-files --no-hidden -I -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} ./ 0.3670034408569336 972
205 linux_no_literal 1 5 ugrep (ASCII) ugrep -r --ignore-files --no-hidden -I -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} ./ 0.3563535213470459 972
206 linux_no_literal 1 5 ugrep (ASCII) ugrep -r --ignore-files --no-hidden -I -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} ./ 0.36490702629089355 972
207 linux_alternates 1 5 rg rg -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.14299488067626953 112
208 linux_alternates 1 5 rg rg -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.15548348426818848 112
209 linux_alternates 1 5 rg rg -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.14477276802062988 112
210 linux_alternates 1 5 rg rg -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.12926578521728516 112
211 linux_alternates 1 5 rg rg -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.13896560668945312 112
212 linux_alternates 1 5 ag ag -s ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.9893472194671631 112
213 linux_alternates 1 5 ag ag -s ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 1.016686201095581 112
214 linux_alternates 1 5 ag ag -s ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.9755496978759766 112
215 linux_alternates 1 5 ag ag -s ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.9718713760375977 112
216 linux_alternates 1 5 ag ag -s ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 1.0030465126037598 112
217 linux_alternates 1 5 git grep git grep -E -I -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.5737886428833008 112 LC_ALL=C
218 linux_alternates 1 5 git grep git grep -E -I -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.562185525894165 112 LC_ALL=C
219 linux_alternates 1 5 git grep git grep -E -I -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.5762710571289062 112 LC_ALL=C
220 linux_alternates 1 5 git grep git grep -E -I -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.5561251640319824 112 LC_ALL=C
221 linux_alternates 1 5 git grep git grep -E -I -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.5849525928497314 112 LC_ALL=C
222 linux_alternates 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT ./ 0.3186032772064209 112
223 linux_alternates 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT ./ 0.2896738052368164 112
224 linux_alternates 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT ./ 0.28582000732421875 112
225 linux_alternates 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT ./ 0.2837677001953125 112
226 linux_alternates 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT ./ 0.27143406867980957 112
227 linux_alternates_casei 1 5 rg rg -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.21955585479736328 203
228 linux_alternates_casei 1 5 rg rg -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.22631502151489258 203
229 linux_alternates_casei 1 5 rg rg -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.23458337783813477 203
230 linux_alternates_casei 1 5 rg rg -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.21781086921691895 203
231 linux_alternates_casei 1 5 rg rg -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.231217622756958 203
232 linux_alternates_casei 1 5 ag ag -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.7170076370239258 203
233 linux_alternates_casei 1 5 ag ag -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.7032256126403809 203
234 linux_alternates_casei 1 5 ag ag -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.6868026256561279 203
235 linux_alternates_casei 1 5 ag ag -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.6965539455413818 203
236 linux_alternates_casei 1 5 ag ag -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.6966633796691895 203
237 linux_alternates_casei 1 5 git grep git grep -E -I -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.9774580001831055 203 LC_ALL=C
238 linux_alternates_casei 1 5 git grep git grep -E -I -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.9654648303985596 203 LC_ALL=C
239 linux_alternates_casei 1 5 git grep git grep -E -I -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.967714786529541 203 LC_ALL=C
240 linux_alternates_casei 1 5 git grep git grep -E -I -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.9789888858795166 203 LC_ALL=C
241 linux_alternates_casei 1 5 git grep git grep -E -I -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.9938976764678955 203 LC_ALL=C
242 linux_alternates_casei 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT ./ 0.2825000286102295 203
243 linux_alternates_casei 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT ./ 0.27024054527282715 203
244 linux_alternates_casei 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT ./ 0.27353668212890625 203
245 linux_alternates_casei 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT ./ 0.27333736419677734 203
246 linux_alternates_casei 1 5 ugrep ugrep -r --ignore-files --no-hidden -I -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT ./ 0.2730555534362793 203
247 subtitles_en_literal 1 5 rg rg Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.2259538173675537 830
248 subtitles_en_literal 1 5 rg rg Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.22034168243408203 830
249 subtitles_en_literal 1 5 rg rg Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.22986674308776855 830
250 subtitles_en_literal 1 5 rg rg Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.22815775871276855 830
251 subtitles_en_literal 1 5 rg rg Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.2238922119140625 830
252 subtitles_en_literal 1 5 rg (no mmap) rg --no-mmap Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.36427783966064453 830
253 subtitles_en_literal 1 5 rg (no mmap) rg --no-mmap Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.37499117851257324 830
254 subtitles_en_literal 1 5 rg (no mmap) rg --no-mmap Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.36223769187927246 830
255 subtitles_en_literal 1 5 rg (no mmap) rg --no-mmap Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.3646128177642822 830
256 subtitles_en_literal 1 5 rg (no mmap) rg --no-mmap Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.36281347274780273 830
257 subtitles_en_literal 1 5 grep grep Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.8064453601837158 830 LC_ALL=C
258 subtitles_en_literal 1 5 grep grep Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.8001935482025146 830 LC_ALL=C
259 subtitles_en_literal 1 5 grep grep Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.8018591403961182 830 LC_ALL=C
260 subtitles_en_literal 1 5 grep grep Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.7978458404541016 830 LC_ALL=C
261 subtitles_en_literal 1 5 grep grep Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.7912843227386475 830 LC_ALL=C
262 subtitles_en_literal 1 5 rg (lines) rg -n Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.31099891662597656 830
263 subtitles_en_literal 1 5 rg (lines) rg -n Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.3145768642425537 830
264 subtitles_en_literal 1 5 rg (lines) rg -n Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.30507469177246094 830
265 subtitles_en_literal 1 5 rg (lines) rg -n Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.3450126647949219 830
266 subtitles_en_literal 1 5 rg (lines) rg -n Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.31091880798339844 830
267 subtitles_en_literal 1 5 ag (lines) ag -s Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 2.5518174171447754 830
268 subtitles_en_literal 1 5 ag (lines) ag -s Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 2.551568031311035 830
269 subtitles_en_literal 1 5 ag (lines) ag -s Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 2.5306365489959717 830
270 subtitles_en_literal 1 5 ag (lines) ag -s Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 2.537529468536377 830
271 subtitles_en_literal 1 5 ag (lines) ag -s Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 2.5627124309539795 830
272 subtitles_en_literal 1 5 grep (lines) grep -n Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 1.2934913635253906 830 LC_ALL=C
273 subtitles_en_literal 1 5 grep (lines) grep -n Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 1.2990975379943848 830 LC_ALL=C
274 subtitles_en_literal 1 5 grep (lines) grep -n Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 1.2942156791687012 830 LC_ALL=C
275 subtitles_en_literal 1 5 grep (lines) grep -n Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 1.2887969017028809 830 LC_ALL=C
276 subtitles_en_literal 1 5 grep (lines) grep -n Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 1.2922444343566895 830 LC_ALL=C
277 subtitles_en_literal 1 5 ugrep (lines) ugrep -n Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.3939177989959717 830
278 subtitles_en_literal 1 5 ugrep (lines) ugrep -n Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.3916018009185791 830
279 subtitles_en_literal 1 5 ugrep (lines) ugrep -n Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.40460968017578125 830
280 subtitles_en_literal 1 5 ugrep (lines) ugrep -n Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.41738367080688477 830
281 subtitles_en_literal 1 5 ugrep (lines) ugrep -n Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.41339826583862305 830
282 subtitles_en_literal_casei 1 5 rg rg -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.37847900390625 871
283 subtitles_en_literal_casei 1 5 rg rg -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.3692331314086914 871
284 subtitles_en_literal_casei 1 5 rg rg -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.40493106842041016 871
285 subtitles_en_literal_casei 1 5 rg rg -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.4074361324310303 871
286 subtitles_en_literal_casei 1 5 rg rg -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.4297189712524414 871
287 subtitles_en_literal_casei 1 5 grep grep -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 3.63842511177063 871 LC_ALL=en_US.UTF-8
288 subtitles_en_literal_casei 1 5 grep grep -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 3.6366350650787354 871 LC_ALL=en_US.UTF-8
289 subtitles_en_literal_casei 1 5 grep grep -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 3.6044440269470215 871 LC_ALL=en_US.UTF-8
290 subtitles_en_literal_casei 1 5 grep grep -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 3.6123127937316895 871 LC_ALL=en_US.UTF-8
291 subtitles_en_literal_casei 1 5 grep grep -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 3.6119742393493652 871 LC_ALL=en_US.UTF-8
292 subtitles_en_literal_casei 1 5 grep (ASCII) grep -E -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.917151689529419 871 LC_ALL=C
293 subtitles_en_literal_casei 1 5 grep (ASCII) grep -E -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.9379458427429199 871 LC_ALL=C
294 subtitles_en_literal_casei 1 5 grep (ASCII) grep -E -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.9703550338745117 871 LC_ALL=C
295 subtitles_en_literal_casei 1 5 grep (ASCII) grep -E -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.9309988021850586 871 LC_ALL=C
296 subtitles_en_literal_casei 1 5 grep (ASCII) grep -E -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.9328129291534424 871 LC_ALL=C
297 subtitles_en_literal_casei 1 5 rg (lines) rg -n -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.5196061134338379 871
298 subtitles_en_literal_casei 1 5 rg (lines) rg -n -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.5225450992584229 871
299 subtitles_en_literal_casei 1 5 rg (lines) rg -n -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.4856400489807129 871
300 subtitles_en_literal_casei 1 5 rg (lines) rg -n -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.5204241275787354 871
301 subtitles_en_literal_casei 1 5 rg (lines) rg -n -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.5224106311798096 871
302 subtitles_en_literal_casei 1 5 ag (lines) (ASCII) ag -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 2.5935003757476807 871
303 subtitles_en_literal_casei 1 5 ag (lines) (ASCII) ag -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 2.640918016433716 871
304 subtitles_en_literal_casei 1 5 ag (lines) (ASCII) ag -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 2.602182626724243 871
305 subtitles_en_literal_casei 1 5 ag (lines) (ASCII) ag -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 2.575654983520508 871
306 subtitles_en_literal_casei 1 5 ag (lines) (ASCII) ag -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 2.5606820583343506 871
307 subtitles_en_literal_casei 1 5 ugrep (lines) ugrep -n -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 1.0980546474456787 871
308 subtitles_en_literal_casei 1 5 ugrep (lines) ugrep -n -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 1.095038652420044 871
309 subtitles_en_literal_casei 1 5 ugrep (lines) ugrep -n -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 1.0974702835083008 871
310 subtitles_en_literal_casei 1 5 ugrep (lines) ugrep -n -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 1.113879919052124 871
311 subtitles_en_literal_casei 1 5 ugrep (lines) ugrep -n -i Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 1.1096961498260498 871
312 subtitles_en_literal_word 1 5 rg (ASCII) rg -n (?-u:\b)Sherlock Holmes(?-u:\b) /tmp/benchsuite/subtitles/en.sample.txt 0.3175060749053955 830
313 subtitles_en_literal_word 1 5 rg (ASCII) rg -n (?-u:\b)Sherlock Holmes(?-u:\b) /tmp/benchsuite/subtitles/en.sample.txt 0.321685791015625 830
314 subtitles_en_literal_word 1 5 rg (ASCII) rg -n (?-u:\b)Sherlock Holmes(?-u:\b) /tmp/benchsuite/subtitles/en.sample.txt 0.30799293518066406 830
315 subtitles_en_literal_word 1 5 rg (ASCII) rg -n (?-u:\b)Sherlock Holmes(?-u:\b) /tmp/benchsuite/subtitles/en.sample.txt 0.31140613555908203 830
316 subtitles_en_literal_word 1 5 rg (ASCII) rg -n (?-u:\b)Sherlock Holmes(?-u:\b) /tmp/benchsuite/subtitles/en.sample.txt 0.32439208030700684 830
317 subtitles_en_literal_word 1 5 ag (ASCII) ag -sw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 2.5530965328216553 830
318 subtitles_en_literal_word 1 5 ag (ASCII) ag -sw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 2.5833561420440674 830
319 subtitles_en_literal_word 1 5 ag (ASCII) ag -sw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 2.5765762329101562 830
320 subtitles_en_literal_word 1 5 ag (ASCII) ag -sw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 2.610975742340088 830
321 subtitles_en_literal_word 1 5 ag (ASCII) ag -sw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 2.5965471267700195 830
322 subtitles_en_literal_word 1 5 grep (ASCII) grep -nw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 1.3212966918945312 830 LC_ALL=C
323 subtitles_en_literal_word 1 5 grep (ASCII) grep -nw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 1.311401128768921 830 LC_ALL=C
324 subtitles_en_literal_word 1 5 grep (ASCII) grep -nw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 1.298889398574829 830 LC_ALL=C
325 subtitles_en_literal_word 1 5 grep (ASCII) grep -nw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 1.316542148590088 830 LC_ALL=C
326 subtitles_en_literal_word 1 5 grep (ASCII) grep -nw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 1.3483500480651855 830 LC_ALL=C
327 subtitles_en_literal_word 1 5 ugrep (ASCII) ugrep -nw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.4127326011657715 830
328 subtitles_en_literal_word 1 5 ugrep (ASCII) ugrep -nw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.4138009548187256 830
329 subtitles_en_literal_word 1 5 ugrep (ASCII) ugrep -nw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.4203319549560547 830
330 subtitles_en_literal_word 1 5 ugrep (ASCII) ugrep -nw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.4127979278564453 830
331 subtitles_en_literal_word 1 5 ugrep (ASCII) ugrep -nw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.41126537322998047 830
332 subtitles_en_literal_word 1 5 rg rg -nw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.3251321315765381 830
333 subtitles_en_literal_word 1 5 rg rg -nw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.31773900985717773 830
334 subtitles_en_literal_word 1 5 rg rg -nw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.32987523078918457 830
335 subtitles_en_literal_word 1 5 rg rg -nw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.32228970527648926 830
336 subtitles_en_literal_word 1 5 rg rg -nw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 0.3207516670227051 830
337 subtitles_en_literal_word 1 5 grep grep -nw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 1.2946159839630127 830 LC_ALL=en_US.UTF-8
338 subtitles_en_literal_word 1 5 grep grep -nw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 1.333972454071045 830 LC_ALL=en_US.UTF-8
339 subtitles_en_literal_word 1 5 grep grep -nw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 1.3002500534057617 830 LC_ALL=en_US.UTF-8
340 subtitles_en_literal_word 1 5 grep grep -nw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 1.347550630569458 830 LC_ALL=en_US.UTF-8
341 subtitles_en_literal_word 1 5 grep grep -nw Sherlock Holmes /tmp/benchsuite/subtitles/en.sample.txt 1.306572675704956 830 LC_ALL=en_US.UTF-8
342 subtitles_en_alternate 1 5 rg (lines) rg -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 0.4178187847137451 1094
343 subtitles_en_alternate 1 5 rg (lines) rg -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 0.44626832008361816 1094
344 subtitles_en_alternate 1 5 rg (lines) rg -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 0.44959425926208496 1094
345 subtitles_en_alternate 1 5 rg (lines) rg -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 0.38634324073791504 1094
346 subtitles_en_alternate 1 5 rg (lines) rg -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 0.4460463523864746 1094
347 subtitles_en_alternate 1 5 ag (lines) ag -s Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 3.6045682430267334 1094
348 subtitles_en_alternate 1 5 ag (lines) ag -s Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 3.6191344261169434 1094
349 subtitles_en_alternate 1 5 ag (lines) ag -s Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 3.579859972000122 1094
350 subtitles_en_alternate 1 5 ag (lines) ag -s Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 3.6637580394744873 1094
351 subtitles_en_alternate 1 5 ag (lines) ag -s Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 3.5728182792663574 1094
352 subtitles_en_alternate 1 5 grep (lines) grep -E -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 3.323948621749878 1094 LC_ALL=C
353 subtitles_en_alternate 1 5 grep (lines) grep -E -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 3.3338429927825928 1094 LC_ALL=C
354 subtitles_en_alternate 1 5 grep (lines) grep -E -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 3.34714937210083 1094 LC_ALL=C
355 subtitles_en_alternate 1 5 grep (lines) grep -E -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 3.314117908477783 1094 LC_ALL=C
356 subtitles_en_alternate 1 5 grep (lines) grep -E -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 3.303710699081421 1094 LC_ALL=C
357 subtitles_en_alternate 1 5 ugrep (lines) ugrep -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 1.147033452987671 1094
358 subtitles_en_alternate 1 5 ugrep (lines) ugrep -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 1.2054970264434814 1094
359 subtitles_en_alternate 1 5 ugrep (lines) ugrep -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 1.0998892784118652 1094
360 subtitles_en_alternate 1 5 ugrep (lines) ugrep -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 1.101989984512329 1094
361 subtitles_en_alternate 1 5 ugrep (lines) ugrep -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 1.110612154006958 1094
362 subtitles_en_alternate 1 5 rg rg Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 0.29009222984313965 1094
363 subtitles_en_alternate 1 5 rg rg Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 0.29300451278686523 1094
364 subtitles_en_alternate 1 5 rg rg Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 0.3199915885925293 1094
365 subtitles_en_alternate 1 5 rg rg Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 0.3187263011932373 1094
366 subtitles_en_alternate 1 5 rg rg Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 0.30321288108825684 1094
367 subtitles_en_alternate 1 5 grep grep -E Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 2.813009738922119 1094 LC_ALL=C
368 subtitles_en_alternate 1 5 grep grep -E Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 2.80930757522583 1094 LC_ALL=C
369 subtitles_en_alternate 1 5 grep grep -E Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 2.814509153366089 1094 LC_ALL=C
370 subtitles_en_alternate 1 5 grep grep -E Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 2.8390560150146484 1094 LC_ALL=C
371 subtitles_en_alternate 1 5 grep grep -E Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 2.830871105194092 1094 LC_ALL=C
372 subtitles_en_alternate_casei 1 5 ag (ASCII) ag -s -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 6.166510343551636 1136
373 subtitles_en_alternate_casei 1 5 ag (ASCII) ag -s -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 6.192304849624634 1136
374 subtitles_en_alternate_casei 1 5 ag (ASCII) ag -s -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 6.185140132904053 1136
375 subtitles_en_alternate_casei 1 5 ag (ASCII) ag -s -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 6.20132040977478 1136
376 subtitles_en_alternate_casei 1 5 ag (ASCII) ag -s -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 6.159040451049805 1136
377 subtitles_en_alternate_casei 1 5 grep (ASCII) grep -E -ni Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 5.523138999938965 1136 LC_ALL=C
378 subtitles_en_alternate_casei 1 5 grep (ASCII) grep -E -ni Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 5.512346267700195 1136 LC_ALL=C
379 subtitles_en_alternate_casei 1 5 grep (ASCII) grep -E -ni Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 5.562563896179199 1136 LC_ALL=C
380 subtitles_en_alternate_casei 1 5 grep (ASCII) grep -E -ni Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 5.533160448074341 1136 LC_ALL=C
381 subtitles_en_alternate_casei 1 5 grep (ASCII) grep -E -ni Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 5.504830837249756 1136 LC_ALL=C
382 subtitles_en_alternate_casei 1 5 ugrep (ASCII) ugrep -n -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 1.1120033264160156 1136
383 subtitles_en_alternate_casei 1 5 ugrep (ASCII) ugrep -n -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 1.1150739192962646 1136
384 subtitles_en_alternate_casei 1 5 ugrep (ASCII) ugrep -n -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 1.1018304824829102 1136
385 subtitles_en_alternate_casei 1 5 ugrep (ASCII) ugrep -n -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 1.1106996536254883 1136
386 subtitles_en_alternate_casei 1 5 ugrep (ASCII) ugrep -n -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 1.0994808673858643 1136
387 subtitles_en_alternate_casei 1 5 rg rg -n -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 0.8494291305541992 1136
388 subtitles_en_alternate_casei 1 5 rg rg -n -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 0.7878148555755615 1136
389 subtitles_en_alternate_casei 1 5 rg rg -n -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 0.8290884494781494 1136
390 subtitles_en_alternate_casei 1 5 rg rg -n -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 0.7409803867340088 1136
391 subtitles_en_alternate_casei 1 5 rg rg -n -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 0.7880558967590332 1136
392 subtitles_en_alternate_casei 1 5 grep grep -E -ni Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 5.5523765087127686 1136 LC_ALL=en_US.UTF-8
393 subtitles_en_alternate_casei 1 5 grep grep -E -ni Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 5.527086019515991 1136 LC_ALL=en_US.UTF-8
394 subtitles_en_alternate_casei 1 5 grep grep -E -ni Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 5.740911483764648 1136 LC_ALL=en_US.UTF-8
395 subtitles_en_alternate_casei 1 5 grep grep -E -ni Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 5.520638465881348 1136 LC_ALL=en_US.UTF-8
396 subtitles_en_alternate_casei 1 5 grep grep -E -ni Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /tmp/benchsuite/subtitles/en.sample.txt 5.52523398399353 1136 LC_ALL=en_US.UTF-8
397 subtitles_en_surrounding_words 1 5 rg rg -n \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt 0.3353078365325928 483
398 subtitles_en_surrounding_words 1 5 rg rg -n \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt 0.3248591423034668 483
399 subtitles_en_surrounding_words 1 5 rg rg -n \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt 0.33918261528015137 483
400 subtitles_en_surrounding_words 1 5 rg rg -n \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt 0.33177971839904785 483
401 subtitles_en_surrounding_words 1 5 rg rg -n \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt 0.34472131729125977 483
402 subtitles_en_surrounding_words 1 5 grep grep -E -n \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt 1.7516274452209473 483 LC_ALL=en_US.UTF-8
403 subtitles_en_surrounding_words 1 5 grep grep -E -n \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt 1.7489221096038818 483 LC_ALL=en_US.UTF-8
404 subtitles_en_surrounding_words 1 5 grep grep -E -n \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt 1.7574889659881592 483 LC_ALL=en_US.UTF-8
405 subtitles_en_surrounding_words 1 5 grep grep -E -n \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt 1.813244342803955 483 LC_ALL=en_US.UTF-8
406 subtitles_en_surrounding_words 1 5 grep grep -E -n \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt 1.750051498413086 483 LC_ALL=en_US.UTF-8
407 subtitles_en_surrounding_words 1 5 ugrep ugrep -n \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt 70.12419986724854 489
408 subtitles_en_surrounding_words 1 5 ugrep ugrep -n \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt 70.26925611495972 489
409 subtitles_en_surrounding_words 1 5 ugrep ugrep -n \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt 70.56865787506104 489
410 subtitles_en_surrounding_words 1 5 ugrep ugrep -n \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt 70.12933135032654 489
411 subtitles_en_surrounding_words 1 5 ugrep ugrep -n \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt 70.07925295829773 489
412 subtitles_en_surrounding_words 1 5 rg (ASCII) rg -n (?-u)\w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt 0.3309454917907715 483
413 subtitles_en_surrounding_words 1 5 rg (ASCII) rg -n (?-u)\w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt 0.33062124252319336 483
414 subtitles_en_surrounding_words 1 5 rg (ASCII) rg -n (?-u)\w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt 0.3292708396911621 483
415 subtitles_en_surrounding_words 1 5 rg (ASCII) rg -n (?-u)\w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt 0.3300509452819824 483
416 subtitles_en_surrounding_words 1 5 rg (ASCII) rg -n (?-u)\w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt 0.3252389430999756 483
417 subtitles_en_surrounding_words 1 5 ag (ASCII) ag -s \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt 7.372813701629639 489
418 subtitles_en_surrounding_words 1 5 ag (ASCII) ag -s \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt 7.338848114013672 489
419 subtitles_en_surrounding_words 1 5 ag (ASCII) ag -s \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt 7.739792108535767 489
420 subtitles_en_surrounding_words 1 5 ag (ASCII) ag -s \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt 7.302056074142456 489
421 subtitles_en_surrounding_words 1 5 ag (ASCII) ag -s \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt 7.334207057952881 489
422 subtitles_en_surrounding_words 1 5 grep (ASCII) grep -E -n \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt 1.7617950439453125 483 LC_ALL=C
423 subtitles_en_surrounding_words 1 5 grep (ASCII) grep -E -n \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt 1.7765378952026367 483 LC_ALL=C
424 subtitles_en_surrounding_words 1 5 grep (ASCII) grep -E -n \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt 1.7456245422363281 483 LC_ALL=C
425 subtitles_en_surrounding_words 1 5 grep (ASCII) grep -E -n \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt 1.748713731765747 483 LC_ALL=C
426 subtitles_en_surrounding_words 1 5 grep (ASCII) grep -E -n \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt 1.7846882343292236 483 LC_ALL=C
427 subtitles_en_surrounding_words 1 5 ugrep (ASCII) ugrep -n -U \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt 31.14370322227478 489
428 subtitles_en_surrounding_words 1 5 ugrep (ASCII) ugrep -n -U \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt 31.543628454208374 489
429 subtitles_en_surrounding_words 1 5 ugrep (ASCII) ugrep -n -U \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt 31.133421182632446 489
430 subtitles_en_surrounding_words 1 5 ugrep (ASCII) ugrep -n -U \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt 31.149214506149292 489
431 subtitles_en_surrounding_words 1 5 ugrep (ASCII) ugrep -n -U \w+\s+Holmes\s+\w+ /tmp/benchsuite/subtitles/en.sample.txt 31.180144548416138 489
432 subtitles_en_no_literal 1 5 rg rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt 2.9173591136932373 22
433 subtitles_en_no_literal 1 5 rg rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt 2.867539644241333 22
434 subtitles_en_no_literal 1 5 rg rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt 2.9047088623046875 22
435 subtitles_en_no_literal 1 5 rg rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt 2.9265778064727783 22
436 subtitles_en_no_literal 1 5 rg rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt 2.874317169189453 22
437 subtitles_en_no_literal 1 5 ugrep ugrep -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt 24.619744777679443 309
438 subtitles_en_no_literal 1 5 ugrep ugrep -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt 24.622087240219116 309
439 subtitles_en_no_literal 1 5 ugrep ugrep -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt 24.770710468292236 309
440 subtitles_en_no_literal 1 5 ugrep ugrep -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt 24.60181713104248 309
441 subtitles_en_no_literal 1 5 ugrep ugrep -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt 24.678969383239746 309
442 subtitles_en_no_literal 1 5 rg (ASCII) rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt 2.676262140274048 22
443 subtitles_en_no_literal 1 5 rg (ASCII) rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt 2.673837184906006 22
444 subtitles_en_no_literal 1 5 rg (ASCII) rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt 2.667243003845215 22
445 subtitles_en_no_literal 1 5 rg (ASCII) rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt 2.667970657348633 22
446 subtitles_en_no_literal 1 5 rg (ASCII) rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt 2.6588196754455566 22
447 subtitles_en_no_literal 1 5 ag (ASCII) ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt 10.786212682723999 302
448 subtitles_en_no_literal 1 5 ag (ASCII) ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt 10.744041204452515 302
449 subtitles_en_no_literal 1 5 ag (ASCII) ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt 10.74718165397644 302
450 subtitles_en_no_literal 1 5 ag (ASCII) ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt 10.768681287765503 302
451 subtitles_en_no_literal 1 5 ag (ASCII) ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt 10.772834777832031 302
452 subtitles_en_no_literal 1 5 grep (ASCII) grep -E -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt 6.287469148635864 22 LC_ALL=C
453 subtitles_en_no_literal 1 5 grep (ASCII) grep -E -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt 6.243509769439697 22 LC_ALL=C
454 subtitles_en_no_literal 1 5 grep (ASCII) grep -E -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt 6.242478370666504 22 LC_ALL=C
455 subtitles_en_no_literal 1 5 grep (ASCII) grep -E -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt 6.2600791454315186 22 LC_ALL=C
456 subtitles_en_no_literal 1 5 grep (ASCII) grep -E -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt 6.2560741901397705 22 LC_ALL=C
457 subtitles_en_no_literal 1 5 ugrep (ASCII) ugrep -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt 4.670856237411499 302
458 subtitles_en_no_literal 1 5 ugrep (ASCII) ugrep -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt 4.703561544418335 302
459 subtitles_en_no_literal 1 5 ugrep (ASCII) ugrep -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt 4.675989627838135 302
460 subtitles_en_no_literal 1 5 ugrep (ASCII) ugrep -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt 4.6688103675842285 302
461 subtitles_en_no_literal 1 5 ugrep (ASCII) ugrep -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/en.sample.txt 4.715432167053223 302
462 subtitles_ru_literal 1 5 rg rg Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.20440673828125 583
463 subtitles_ru_literal 1 5 rg rg Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.20561552047729492 583
464 subtitles_ru_literal 1 5 rg rg Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.2381761074066162 583
465 subtitles_ru_literal 1 5 rg rg Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.23102140426635742 583
466 subtitles_ru_literal 1 5 rg rg Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.19649791717529297 583
467 subtitles_ru_literal 1 5 rg (no mmap) rg --no-mmap Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.3158297538757324 583
468 subtitles_ru_literal 1 5 rg (no mmap) rg --no-mmap Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.3136112689971924 583
469 subtitles_ru_literal 1 5 rg (no mmap) rg --no-mmap Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.32402992248535156 583
470 subtitles_ru_literal 1 5 rg (no mmap) rg --no-mmap Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.3248250484466553 583
471 subtitles_ru_literal 1 5 rg (no mmap) rg --no-mmap Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.3201103210449219 583
472 subtitles_ru_literal 1 5 grep grep Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.7790360450744629 583 LC_ALL=C
473 subtitles_ru_literal 1 5 grep grep Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.7977695465087891 583 LC_ALL=C
474 subtitles_ru_literal 1 5 grep grep Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.7397308349609375 583 LC_ALL=C
475 subtitles_ru_literal 1 5 grep grep Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.7123947143554688 583 LC_ALL=C
476 subtitles_ru_literal 1 5 grep grep Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.711977481842041 583 LC_ALL=C
477 subtitles_ru_literal 1 5 rg (lines) rg -n Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.27593088150024414 583
478 subtitles_ru_literal 1 5 rg (lines) rg -n Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.2842848300933838 583
479 subtitles_ru_literal 1 5 rg (lines) rg -n Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.28340864181518555 583
480 subtitles_ru_literal 1 5 rg (lines) rg -n Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.28469133377075195 583
481 subtitles_ru_literal 1 5 rg (lines) rg -n Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.27951884269714355 583
482 subtitles_ru_literal 1 5 ag (lines) ag -s Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 2.7401182651519775 583
483 subtitles_ru_literal 1 5 ag (lines) ag -s Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 2.658051013946533 583
484 subtitles_ru_literal 1 5 ag (lines) ag -s Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 2.666799306869507 583
485 subtitles_ru_literal 1 5 ag (lines) ag -s Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 2.7145025730133057 583
486 subtitles_ru_literal 1 5 ag (lines) ag -s Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 2.7412168979644775 583
487 subtitles_ru_literal 1 5 grep (lines) grep -n Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 1.0886235237121582 583 LC_ALL=C
488 subtitles_ru_literal 1 5 grep (lines) grep -n Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 1.0896506309509277 583 LC_ALL=C
489 subtitles_ru_literal 1 5 grep (lines) grep -n Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 1.1100494861602783 583 LC_ALL=C
490 subtitles_ru_literal 1 5 grep (lines) grep -n Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 1.088308334350586 583 LC_ALL=C
491 subtitles_ru_literal 1 5 grep (lines) grep -n Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 1.0891127586364746 583 LC_ALL=C
492 subtitles_ru_literal 1 5 ugrep (lines) ugrep -n Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 1.8426175117492676 583
493 subtitles_ru_literal 1 5 ugrep (lines) ugrep -n Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 1.85064697265625 583
494 subtitles_ru_literal 1 5 ugrep (lines) ugrep -n Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 1.8356082439422607 583
495 subtitles_ru_literal 1 5 ugrep (lines) ugrep -n Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 1.8405826091766357 583
496 subtitles_ru_literal 1 5 ugrep (lines) ugrep -n Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 1.83730149269104 583
497 subtitles_ru_literal_casei 1 5 rg rg -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.48739099502563477 604
498 subtitles_ru_literal_casei 1 5 rg rg -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.4823324680328369 604
499 subtitles_ru_literal_casei 1 5 rg rg -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.4832422733306885 604
500 subtitles_ru_literal_casei 1 5 rg rg -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.4812777042388916 604
501 subtitles_ru_literal_casei 1 5 rg rg -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.4854264259338379 604
502 subtitles_ru_literal_casei 1 5 grep grep -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 6.694453477859497 604 LC_ALL=en_US.UTF-8
503 subtitles_ru_literal_casei 1 5 grep grep -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 6.759232044219971 604 LC_ALL=en_US.UTF-8
504 subtitles_ru_literal_casei 1 5 grep grep -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 6.686243534088135 604 LC_ALL=en_US.UTF-8
505 subtitles_ru_literal_casei 1 5 grep grep -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 6.7029454708099365 604 LC_ALL=en_US.UTF-8
506 subtitles_ru_literal_casei 1 5 grep grep -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 6.699738264083862 604 LC_ALL=en_US.UTF-8
507 subtitles_ru_literal_casei 1 5 grep (ASCII) grep -E -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.7290260791778564 583 LC_ALL=C
508 subtitles_ru_literal_casei 1 5 grep (ASCII) grep -E -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.7400493621826172 583 LC_ALL=C
509 subtitles_ru_literal_casei 1 5 grep (ASCII) grep -E -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.7299001216888428 583 LC_ALL=C
510 subtitles_ru_literal_casei 1 5 grep (ASCII) grep -E -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.7308380603790283 583 LC_ALL=C
511 subtitles_ru_literal_casei 1 5 grep (ASCII) grep -E -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.7283904552459717 583 LC_ALL=C
512 subtitles_ru_literal_casei 1 5 rg (lines) rg -n -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.5711629390716553 604
513 subtitles_ru_literal_casei 1 5 rg (lines) rg -n -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.574974536895752 604
514 subtitles_ru_literal_casei 1 5 rg (lines) rg -n -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.5820963382720947 604
515 subtitles_ru_literal_casei 1 5 rg (lines) rg -n -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.5438523292541504 604
516 subtitles_ru_literal_casei 1 5 rg (lines) rg -n -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.5054161548614502 604
517 subtitles_ru_literal_casei 1 5 ag (lines) (ASCII) ag -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.6135058403015137
518 subtitles_ru_literal_casei 1 5 ag (lines) (ASCII) ag -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.6051545143127441
519 subtitles_ru_literal_casei 1 5 ag (lines) (ASCII) ag -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.6032793521881104
520 subtitles_ru_literal_casei 1 5 ag (lines) (ASCII) ag -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.6097028255462646
521 subtitles_ru_literal_casei 1 5 ag (lines) (ASCII) ag -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.6850666999816895
522 subtitles_ru_literal_casei 1 5 ugrep (lines) (ASCII) ugrep -n -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 1.833592176437378 583
523 subtitles_ru_literal_casei 1 5 ugrep (lines) (ASCII) ugrep -n -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 1.8357219696044922 583
524 subtitles_ru_literal_casei 1 5 ugrep (lines) (ASCII) ugrep -n -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 1.8394358158111572 583
525 subtitles_ru_literal_casei 1 5 ugrep (lines) (ASCII) ugrep -n -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 1.8334264755249023 583
526 subtitles_ru_literal_casei 1 5 ugrep (lines) (ASCII) ugrep -n -i Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 1.8304622173309326 583
527 subtitles_ru_literal_word 1 5 rg (ASCII) rg -n (?-u:^|\W)Шерлок Холмс(?-u:$|\W) /tmp/benchsuite/subtitles/ru.txt 0.2904787063598633 583
528 subtitles_ru_literal_word 1 5 rg (ASCII) rg -n (?-u:^|\W)Шерлок Холмс(?-u:$|\W) /tmp/benchsuite/subtitles/ru.txt 0.2831101417541504 583
529 subtitles_ru_literal_word 1 5 rg (ASCII) rg -n (?-u:^|\W)Шерлок Холмс(?-u:$|\W) /tmp/benchsuite/subtitles/ru.txt 0.2786984443664551 583
530 subtitles_ru_literal_word 1 5 rg (ASCII) rg -n (?-u:^|\W)Шерлок Холмс(?-u:$|\W) /tmp/benchsuite/subtitles/ru.txt 0.28719663619995117 583
531 subtitles_ru_literal_word 1 5 rg (ASCII) rg -n (?-u:^|\W)Шерлок Холмс(?-u:$|\W) /tmp/benchsuite/subtitles/ru.txt 0.27600622177124023 583
532 subtitles_ru_literal_word 1 5 ag (ASCII) ag -sw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.6810102462768555
533 subtitles_ru_literal_word 1 5 ag (ASCII) ag -sw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.6855161190032959
534 subtitles_ru_literal_word 1 5 ag (ASCII) ag -sw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.6827929019927979
535 subtitles_ru_literal_word 1 5 ag (ASCII) ag -sw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.6587810516357422
536 subtitles_ru_literal_word 1 5 ag (ASCII) ag -sw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.6551673412322998
537 subtitles_ru_literal_word 1 5 grep (ASCII) grep -nw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 1.0948495864868164 583 LC_ALL=C
538 subtitles_ru_literal_word 1 5 grep (ASCII) grep -nw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 1.097151756286621 583 LC_ALL=C
539 subtitles_ru_literal_word 1 5 grep (ASCII) grep -nw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 1.1051688194274902 583 LC_ALL=C
540 subtitles_ru_literal_word 1 5 grep (ASCII) grep -nw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 1.1151607036590576 583 LC_ALL=C
541 subtitles_ru_literal_word 1 5 grep (ASCII) grep -nw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 1.1100919246673584 583 LC_ALL=C
542 subtitles_ru_literal_word 1 5 ugrep (ASCII) ugrep -nw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 1.84104585647583
543 subtitles_ru_literal_word 1 5 ugrep (ASCII) ugrep -nw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 1.9092209339141846
544 subtitles_ru_literal_word 1 5 ugrep (ASCII) ugrep -nw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 1.836583137512207
545 subtitles_ru_literal_word 1 5 ugrep (ASCII) ugrep -nw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 1.8941335678100586
546 subtitles_ru_literal_word 1 5 ugrep (ASCII) ugrep -nw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 1.8811957836151123
547 subtitles_ru_literal_word 1 5 rg rg -nw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.2956504821777344 579
548 subtitles_ru_literal_word 1 5 rg rg -nw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.29023194313049316 579
549 subtitles_ru_literal_word 1 5 rg rg -nw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.3374972343444824 579
550 subtitles_ru_literal_word 1 5 rg rg -nw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.29686713218688965 579
551 subtitles_ru_literal_word 1 5 rg rg -nw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 0.29778003692626953 579
552 subtitles_ru_literal_word 1 5 grep grep -nw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 1.1042869091033936 579 LC_ALL=en_US.UTF-8
553 subtitles_ru_literal_word 1 5 grep grep -nw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 1.1068925857543945 579 LC_ALL=en_US.UTF-8
554 subtitles_ru_literal_word 1 5 grep grep -nw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 1.0973529815673828 579 LC_ALL=en_US.UTF-8
555 subtitles_ru_literal_word 1 5 grep grep -nw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 1.0917479991912842 579 LC_ALL=en_US.UTF-8
556 subtitles_ru_literal_word 1 5 grep grep -nw Шерлок Холмс /tmp/benchsuite/subtitles/ru.txt 1.0987188816070557 579 LC_ALL=en_US.UTF-8
557 subtitles_ru_alternate 1 5 rg (lines) rg -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 0.8945937156677246 691
558 subtitles_ru_alternate 1 5 rg (lines) rg -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 0.8919808864593506 691
559 subtitles_ru_alternate 1 5 rg (lines) rg -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 0.9041986465454102 691
560 subtitles_ru_alternate 1 5 rg (lines) rg -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 0.8838107585906982 691
561 subtitles_ru_alternate 1 5 rg (lines) rg -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 0.903540849685669 691
562 subtitles_ru_alternate 1 5 ag (lines) ag -s Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 3.715298652648926 691
563 subtitles_ru_alternate 1 5 ag (lines) ag -s Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 3.676830530166626 691
564 subtitles_ru_alternate 1 5 ag (lines) ag -s Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 3.721431016921997 691
565 subtitles_ru_alternate 1 5 ag (lines) ag -s Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 3.6990325450897217 691
566 subtitles_ru_alternate 1 5 ag (lines) ag -s Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 3.764216184616089 691
567 subtitles_ru_alternate 1 5 grep (lines) grep -E -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 8.519805669784546 691 LC_ALL=C
568 subtitles_ru_alternate 1 5 grep (lines) grep -E -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 8.40212869644165 691 LC_ALL=C
569 subtitles_ru_alternate 1 5 grep (lines) grep -E -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 8.381818294525146 691 LC_ALL=C
570 subtitles_ru_alternate 1 5 grep (lines) grep -E -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 8.386401176452637 691 LC_ALL=C
571 subtitles_ru_alternate 1 5 grep (lines) grep -E -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 8.425997257232666 691 LC_ALL=C
572 subtitles_ru_alternate 1 5 ugrep (lines) ugrep -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 13.259684801101685 691
573 subtitles_ru_alternate 1 5 ugrep (lines) ugrep -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 13.236181735992432 691
574 subtitles_ru_alternate 1 5 ugrep (lines) ugrep -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 13.340983629226685 691
575 subtitles_ru_alternate 1 5 ugrep (lines) ugrep -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 13.21895980834961 691
576 subtitles_ru_alternate 1 5 ugrep (lines) ugrep -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 13.194425106048584 691
577 subtitles_ru_alternate 1 5 rg rg Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 0.8262777328491211 691
578 subtitles_ru_alternate 1 5 rg rg Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 0.8343832492828369 691
579 subtitles_ru_alternate 1 5 rg rg Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 0.8675012588500977 691
580 subtitles_ru_alternate 1 5 rg rg Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 0.8584244251251221 691
581 subtitles_ru_alternate 1 5 rg rg Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 0.8777158260345459 691
582 subtitles_ru_alternate 1 5 grep grep -E Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 8.25586986541748 691 LC_ALL=C
583 subtitles_ru_alternate 1 5 grep grep -E Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 8.007173538208008 691 LC_ALL=C
584 subtitles_ru_alternate 1 5 grep grep -E Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 8.068726301193237 691 LC_ALL=C
585 subtitles_ru_alternate 1 5 grep grep -E Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 8.010542631149292 691 LC_ALL=C
586 subtitles_ru_alternate 1 5 grep grep -E Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 8.021028280258179 691 LC_ALL=C
587 subtitles_ru_alternate_casei 1 5 ag (ASCII) ag -s -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 3.7179486751556396 691
588 subtitles_ru_alternate_casei 1 5 ag (ASCII) ag -s -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 3.682896375656128 691
589 subtitles_ru_alternate_casei 1 5 ag (ASCII) ag -s -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 3.699859142303467 691
590 subtitles_ru_alternate_casei 1 5 ag (ASCII) ag -s -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 3.662733316421509 691
591 subtitles_ru_alternate_casei 1 5 ag (ASCII) ag -s -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 3.661060094833374 691
592 subtitles_ru_alternate_casei 1 5 grep (ASCII) grep -E -ni Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 8.434819221496582 691 LC_ALL=C
593 subtitles_ru_alternate_casei 1 5 grep (ASCII) grep -E -ni Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 8.436205625534058 691 LC_ALL=C
594 subtitles_ru_alternate_casei 1 5 grep (ASCII) grep -E -ni Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 8.388120412826538 691 LC_ALL=C
595 subtitles_ru_alternate_casei 1 5 grep (ASCII) grep -E -ni Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 8.407799243927002 691 LC_ALL=C
596 subtitles_ru_alternate_casei 1 5 grep (ASCII) grep -E -ni Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 8.44464373588562 691 LC_ALL=C
597 subtitles_ru_alternate_casei 1 5 ugrep (ASCII) ugrep -n -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 13.216991662979126 691
598 subtitles_ru_alternate_casei 1 5 ugrep (ASCII) ugrep -n -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 13.470320701599121 691
599 subtitles_ru_alternate_casei 1 5 ugrep (ASCII) ugrep -n -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 13.21274471282959 691
600 subtitles_ru_alternate_casei 1 5 ugrep (ASCII) ugrep -n -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 13.38324522972107 691
601 subtitles_ru_alternate_casei 1 5 ugrep (ASCII) ugrep -n -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 13.3148832321167 691
602 subtitles_ru_alternate_casei 1 5 rg rg -n -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 6.205031156539917 735
603 subtitles_ru_alternate_casei 1 5 rg rg -n -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 6.1502509117126465 735
604 subtitles_ru_alternate_casei 1 5 rg rg -n -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 6.150696516036987 735
605 subtitles_ru_alternate_casei 1 5 rg rg -n -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 6.150148630142212 735
606 subtitles_ru_alternate_casei 1 5 rg rg -n -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 6.153124809265137 735
607 subtitles_ru_alternate_casei 1 5 grep grep -E -ni Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 7.477111339569092 735 LC_ALL=en_US.UTF-8
608 subtitles_ru_alternate_casei 1 5 grep grep -E -ni Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 7.483617782592773 735 LC_ALL=en_US.UTF-8
609 subtitles_ru_alternate_casei 1 5 grep grep -E -ni Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 7.502292156219482 735 LC_ALL=en_US.UTF-8
610 subtitles_ru_alternate_casei 1 5 grep grep -E -ni Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 7.528963327407837 735 LC_ALL=en_US.UTF-8
611 subtitles_ru_alternate_casei 1 5 grep grep -E -ni Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /tmp/benchsuite/subtitles/ru.txt 7.482379198074341 735 LC_ALL=en_US.UTF-8
612 subtitles_ru_surrounding_words 1 5 rg rg -n \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt 0.3461883068084717 278
613 subtitles_ru_surrounding_words 1 5 rg rg -n \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt 0.30211687088012695 278
614 subtitles_ru_surrounding_words 1 5 rg rg -n \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt 0.30521416664123535 278
615 subtitles_ru_surrounding_words 1 5 rg rg -n \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt 0.2969543933868408 278
616 subtitles_ru_surrounding_words 1 5 rg rg -n \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt 0.3003671169281006 278
617 subtitles_ru_surrounding_words 1 5 grep grep -E -n \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt 1.4209251403808594 278 LC_ALL=en_US.UTF-8
618 subtitles_ru_surrounding_words 1 5 grep grep -E -n \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt 1.4190807342529297 278 LC_ALL=en_US.UTF-8
619 subtitles_ru_surrounding_words 1 5 grep grep -E -n \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt 1.4178283214569092 278 LC_ALL=en_US.UTF-8
620 subtitles_ru_surrounding_words 1 5 grep grep -E -n \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt 1.4173235893249512 278 LC_ALL=en_US.UTF-8
621 subtitles_ru_surrounding_words 1 5 grep grep -E -n \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt 1.4221296310424805 278 LC_ALL=en_US.UTF-8
622 subtitles_ru_surrounding_words 1 5 ugrep ugrep -n \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt 70.6701226234436 326
623 subtitles_ru_surrounding_words 1 5 ugrep ugrep -n \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt 71.15788650512695 326
624 subtitles_ru_surrounding_words 1 5 ugrep ugrep -n \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt 71.07276272773743 326
625 subtitles_ru_surrounding_words 1 5 ugrep ugrep -n \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt 70.5626060962677 326
626 subtitles_ru_surrounding_words 1 5 ugrep ugrep -n \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt 70.54449439048767 326
627 subtitles_ru_surrounding_words 1 5 ag (ASCII) ag -s \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt 1.868441104888916
628 subtitles_ru_surrounding_words 1 5 ag (ASCII) ag -s \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt 1.886382818222046
629 subtitles_ru_surrounding_words 1 5 ag (ASCII) ag -s \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt 1.8685986995697021
630 subtitles_ru_surrounding_words 1 5 ag (ASCII) ag -s \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt 1.8727426528930664
631 subtitles_ru_surrounding_words 1 5 ag (ASCII) ag -s \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt 1.8667800426483154
632 subtitles_ru_surrounding_words 1 5 grep (ASCII) grep -E -n \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt 1.3818490505218506 LC_ALL=C
633 subtitles_ru_surrounding_words 1 5 grep (ASCII) grep -E -n \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt 1.3709721565246582 LC_ALL=C
634 subtitles_ru_surrounding_words 1 5 grep (ASCII) grep -E -n \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt 1.3819043636322021 LC_ALL=C
635 subtitles_ru_surrounding_words 1 5 grep (ASCII) grep -E -n \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt 1.460402488708496 LC_ALL=C
636 subtitles_ru_surrounding_words 1 5 grep (ASCII) grep -E -n \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt 1.4097135066986084 LC_ALL=C
637 subtitles_ru_surrounding_words 1 5 ugrep (ASCII) ugrep -n -U \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt 1.286102294921875
638 subtitles_ru_surrounding_words 1 5 ugrep (ASCII) ugrep -n -U \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt 1.2712647914886475
639 subtitles_ru_surrounding_words 1 5 ugrep (ASCII) ugrep -n -U \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt 1.2950100898742676
640 subtitles_ru_surrounding_words 1 5 ugrep (ASCII) ugrep -n -U \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt 1.264500617980957
641 subtitles_ru_surrounding_words 1 5 ugrep (ASCII) ugrep -n -U \w+\s+Холмс\s+\w+ /tmp/benchsuite/subtitles/ru.txt 1.2877566814422607
642 subtitles_ru_no_literal 1 5 rg rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt 3.1152236461639404 41
643 subtitles_ru_no_literal 1 5 rg rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt 3.1311423778533936 41
644 subtitles_ru_no_literal 1 5 rg rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt 3.0800061225891113 41
645 subtitles_ru_no_literal 1 5 rg rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt 3.070636510848999 41
646 subtitles_ru_no_literal 1 5 rg rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt 3.0940587520599365 41
647 subtitles_ru_no_literal 1 5 ugrep ugrep -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt 50.85447072982788 86
648 subtitles_ru_no_literal 1 5 ugrep ugrep -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt 50.832582235336304 86
649 subtitles_ru_no_literal 1 5 ugrep ugrep -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt 50.8755087852478 86
650 subtitles_ru_no_literal 1 5 ugrep ugrep -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt 50.79056358337402 86
651 subtitles_ru_no_literal 1 5 ugrep ugrep -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt 50.84795618057251 86
652 subtitles_ru_no_literal 1 5 rg (ASCII) rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt 2.716826915740967
653 subtitles_ru_no_literal 1 5 rg (ASCII) rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt 2.7381114959716797
654 subtitles_ru_no_literal 1 5 rg (ASCII) rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt 2.7545180320739746
655 subtitles_ru_no_literal 1 5 rg (ASCII) rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt 2.7215416431427
656 subtitles_ru_no_literal 1 5 rg (ASCII) rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt 2.707784414291382
657 subtitles_ru_no_literal 1 5 ag (ASCII) ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt 1.9250116348266602
658 subtitles_ru_no_literal 1 5 ag (ASCII) ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt 1.8956947326660156
659 subtitles_ru_no_literal 1 5 ag (ASCII) ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt 1.8904175758361816
660 subtitles_ru_no_literal 1 5 ag (ASCII) ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt 1.8968868255615234
661 subtitles_ru_no_literal 1 5 ag (ASCII) ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt 1.900888204574585
662 subtitles_ru_no_literal 1 5 grep (ASCII) grep -E -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt 1.755054235458374 LC_ALL=C
663 subtitles_ru_no_literal 1 5 grep (ASCII) grep -E -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt 1.7681376934051514 LC_ALL=C
664 subtitles_ru_no_literal 1 5 grep (ASCII) grep -E -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt 1.7654614448547363 LC_ALL=C
665 subtitles_ru_no_literal 1 5 grep (ASCII) grep -E -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt 1.75648832321167 LC_ALL=C
666 subtitles_ru_no_literal 1 5 grep (ASCII) grep -E -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt 1.7456772327423096 LC_ALL=C
667 subtitles_ru_no_literal 1 5 ugrep (ASCII) ugrep -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt 1.2170698642730713
668 subtitles_ru_no_literal 1 5 ugrep (ASCII) ugrep -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt 1.1907124519348145
669 subtitles_ru_no_literal 1 5 ugrep (ASCII) ugrep -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt 1.1722266674041748
670 subtitles_ru_no_literal 1 5 ugrep (ASCII) ugrep -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt 1.191617727279663
671 subtitles_ru_no_literal 1 5 ugrep (ASCII) ugrep -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /tmp/benchsuite/subtitles/ru.txt 1.1909863948822021

@@ -0,0 +1,208 @@
linux_literal_default (pattern: PM_RESUME)
------------------------------------------
rg* 0.124 +/- 0.004 (lines: 19)*
ag 0.771 +/- 0.187 (lines: 19)
git grep 0.480 +/- 0.010 (lines: 19)
ugrep 0.136 +/- 0.002 (lines: 19)
grep 1.147 +/- 0.005 (lines: 19)
linux_literal (pattern: PM_RESUME)
----------------------------------
rg* 0.130 +/- 0.008 (lines: 19)*
rg (mmap) 1.336 +/- 0.036 (lines: 19)
ag (mmap) 0.880 +/- 0.071 (lines: 19)
git grep 0.464 +/- 0.005 (lines: 19)
ugrep 0.309 +/- 0.012 (lines: 19)
linux_literal_casei (pattern: PM_RESUME)
----------------------------------------
rg* 0.131 +/- 0.005 (lines: 456)*
rg (mmap) 1.336 +/- 0.020 (lines: 456)
ag (mmap) 0.657 +/- 0.123 (lines: 456)
git grep 0.482 +/- 0.007 (lines: 456)
ugrep 0.288 +/- 0.014 (lines: 456)
linux_re_literal_suffix (pattern: [A-Z]+_RESUME)
------------------------------------------------
rg* 0.126 +/- 0.009 (lines: 1944)*
ag 1.044 +/- 0.138 (lines: 1944)
git grep 1.217 +/- 0.045 (lines: 1944)
ugrep 0.548 +/- 0.014 (lines: 1944)
linux_word (pattern: PM_RESUME)
-------------------------------
rg* 0.134 +/- 0.003 (lines: 6)*
ag 0.618 +/- 0.154 (lines: 6)
git grep 0.471 +/- 0.018 (lines: 6)
ugrep 0.306 +/- 0.018 (lines: 6)
linux_unicode_greek (pattern: \p{Greek})
----------------------------------------
rg* 0.263 +/- 0.001 (lines: 105)*
ugrep 0.273 +/- 0.007 (lines: 105)
linux_unicode_greek_casei (pattern: \p{Greek})
----------------------------------------------
rg* 0.256 +/- 0.013 (lines: 225)*
ugrep 0.271 +/- 0.004 (lines: 105)
linux_unicode_word (pattern: \wAh)
----------------------------------
rg 0.140 +/- 0.004 (lines: 229)
rg (ASCII)* 0.138 +/- 0.009 (lines: 216)*
ag (ASCII) 1.278 +/- 0.101 (lines: 216)
git grep 8.188 +/- 0.027 (lines: 229)
git grep (ASCII) 2.334 +/- 0.025 (lines: 216)
ugrep 0.276 +/- 0.002 (lines: 229)
ugrep (ASCII) 0.274 +/- 0.004 (lines: 216)
linux_no_literal (pattern: \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5})
-----------------------------------------------------------------
rg 0.402 +/- 0.008 (lines: 611)
rg (ASCII)* 0.254 +/- 0.010 (lines: 610)*
ag (ASCII) 0.934 +/- 0.008 (lines: 971)
git grep 14.591 +/- 0.077 (lines: 611)
git grep (ASCII) 3.182 +/- 0.035 (lines: 610)
ugrep 6.196 +/- 0.052 (lines: 973)
ugrep (ASCII) 0.363 +/- 0.004 (lines: 972)
linux_alternates (pattern: ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT)
-------------------------------------------------------------------------
rg* 0.142 +/- 0.010 (lines: 112)*
ag 0.991 +/- 0.019 (lines: 112)
git grep 0.571 +/- 0.011 (lines: 112)
ugrep 0.290 +/- 0.017 (lines: 112)
linux_alternates_casei (pattern: ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT)
-------------------------------------------------------------------------------
rg* 0.226 +/- 0.007 (lines: 203)*
ag 0.700 +/- 0.011 (lines: 203)
git grep 0.977 +/- 0.011 (lines: 203)
ugrep 0.275 +/- 0.005 (lines: 203)
subtitles_en_literal (pattern: Sherlock Holmes)
-----------------------------------------------
rg* 0.226 +/- 0.004 (lines: 830)*
rg (no mmap) 0.366 +/- 0.005 (lines: 830)
grep 0.800 +/- 0.006 (lines: 830)
rg (lines) 0.317 +/- 0.016 (lines: 830)
ag (lines) 2.547 +/- 0.013 (lines: 830)
grep (lines) 1.294 +/- 0.004 (lines: 830)
ugrep (lines) 0.404 +/- 0.011 (lines: 830)
subtitles_en_literal_casei (pattern: Sherlock Holmes)
-----------------------------------------------------
rg* 0.398 +/- 0.024 (lines: 871)*
grep 3.621 +/- 0.016 (lines: 871)
grep (ASCII) 0.938 +/- 0.020 (lines: 871)
rg (lines) 0.514 +/- 0.016 (lines: 871)
ag (lines) (ASCII) 2.595 +/- 0.030 (lines: 871)
ugrep (lines) 1.103 +/- 0.008 (lines: 871)
subtitles_en_literal_word (pattern: Sherlock Holmes)
----------------------------------------------------
rg (ASCII)* 0.317 +/- 0.007 (lines: 830)*
ag (ASCII) 2.584 +/- 0.022 (lines: 830)
grep (ASCII) 1.319 +/- 0.018 (lines: 830)
ugrep (ASCII) 0.414 +/- 0.004 (lines: 830)
rg 0.323 +/- 0.005 (lines: 830)
grep 1.317 +/- 0.023 (lines: 830)
subtitles_en_alternate (pattern: Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty)
---------------------------------------------------------------------------------------------------------------
rg (lines) 0.429 +/- 0.027 (lines: 1094)
ag (lines) 3.608 +/- 0.036 (lines: 1094)
grep (lines) 3.325 +/- 0.017 (lines: 1094)
ugrep (lines) 1.133 +/- 0.045 (lines: 1094)
rg* 0.305 +/- 0.014 (lines: 1094)*
grep 2.821 +/- 0.013 (lines: 1094)
subtitles_en_alternate_casei (pattern: Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty)
---------------------------------------------------------------------------------------------------------------------
ag (ASCII) 6.181 +/- 0.018 (lines: 1136)
grep (ASCII) 5.527 +/- 0.022 (lines: 1136)
ugrep (ASCII) 1.108 +/- 0.007 (lines: 1136)
rg* 0.799 +/- 0.042 (lines: 1136)*
grep 5.573 +/- 0.095 (lines: 1136)
subtitles_en_surrounding_words (pattern: \w+\s+Holmes\s+\w+)
------------------------------------------------------------
rg* 0.335 +/- 0.008 (lines: 483)
grep 1.764 +/- 0.028 (lines: 483)
ugrep 70.234 +/- 0.200 (lines: 489)
rg (ASCII) 0.329 +/- 0.002 (lines: 483)*
ag (ASCII) 7.418 +/- 0.182 (lines: 489)
grep (ASCII) 1.763 +/- 0.017 (lines: 483)
ugrep (ASCII) 31.230 +/- 0.176 (lines: 489)
subtitles_en_no_literal (pattern: \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5})
----------------------------------------------------------------------------------------
rg 2.898 +/- 0.026 (lines: 22)
ugrep 24.659 +/- 0.069 (lines: 309)
rg (ASCII)* 2.669 +/- 0.007 (lines: 22)*
ag (ASCII) 10.764 +/- 0.018 (lines: 302)
grep (ASCII) 6.258 +/- 0.018 (lines: 22)
ugrep (ASCII) 4.687 +/- 0.021 (lines: 302)
subtitles_ru_literal (pattern: Шерлок Холмс)
--------------------------------------------
rg* 0.215 +/- 0.018 (lines: 583)*
rg (no mmap) 0.320 +/- 0.005 (lines: 583)
grep 0.748 +/- 0.039 (lines: 583)
rg (lines) 0.282 +/- 0.004 (lines: 583)
ag (lines) 2.704 +/- 0.040 (lines: 583)
grep (lines) 1.093 +/- 0.009 (lines: 583)
ugrep (lines) 1.841 +/- 0.006 (lines: 583)
subtitles_ru_literal_casei (pattern: Шерлок Холмс)
--------------------------------------------------
rg* 0.484 +/- 0.002 (lines: 604)*
grep 6.709 +/- 0.029 (lines: 604)
grep (ASCII) 0.732 +/- 0.005 (lines: 583)
rg (lines) 0.556 +/- 0.032 (lines: 604)
ag (lines) (ASCII) 0.623 +/- 0.035 (lines: 0)
ugrep (lines) (ASCII) 1.835 +/- 0.003 (lines: 583)
subtitles_ru_literal_word (pattern: Шерлок Холмс)
-------------------------------------------------
rg (ASCII)* 0.283 +/- 0.006 (lines: 583)*
ag (ASCII) 0.673 +/- 0.014 (lines: 0)
grep (ASCII) 1.104 +/- 0.009 (lines: 583)
ugrep (ASCII) 1.872 +/- 0.032 (lines: 0)
rg 0.304 +/- 0.019 (lines: 579)
grep 1.100 +/- 0.006 (lines: 579)
subtitles_ru_alternate (pattern: Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти)
-----------------------------------------------------------------------------------------------------------
rg (lines) 0.896 +/- 0.009 (lines: 691)
ag (lines) 3.715 +/- 0.032 (lines: 691)
grep (lines) 8.423 +/- 0.057 (lines: 691)
ugrep (lines) 13.250 +/- 0.056 (lines: 691)
rg* 0.853 +/- 0.022 (lines: 691)*
grep 8.073 +/- 0.105 (lines: 691)
subtitles_ru_alternate_casei (pattern: Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти)
-----------------------------------------------------------------------------------------------------------------
ag (ASCII)* 3.685 +/- 0.024 (lines: 691)*
grep (ASCII) 8.422 +/- 0.024 (lines: 691)
ugrep (ASCII) 13.320 +/- 0.110 (lines: 691)
rg 6.162 +/- 0.024 (lines: 735)
grep 7.495 +/- 0.021 (lines: 735)
subtitles_ru_surrounding_words (pattern: \w+\s+Холмс\s+\w+)
-----------------------------------------------------------
rg* 0.310 +/- 0.020 (lines: 278)*
grep 1.419 +/- 0.002 (lines: 278)
ugrep 70.802 +/- 0.292 (lines: 326)
ag (ASCII) 1.873 +/- 0.008 (lines: 0)
grep (ASCII) 1.401 +/- 0.036 (lines: 0)
ugrep (ASCII) 1.281 +/- 0.013 (lines: 0)
subtitles_ru_no_literal (pattern: \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5})
----------------------------------------------------------------------------------------
rg 3.098 +/- 0.025 (lines: 41)
ugrep 50.840 +/- 0.032 (lines: 86)
rg (ASCII) 2.728 +/- 0.019 (lines: 0)
ag (ASCII) 1.902 +/- 0.014 (lines: 0)
grep (ASCII) 1.758 +/- 0.009 (lines: 0)
ugrep (ASCII)* 1.193 +/- 0.016 (lines: 0)*

@@ -0,0 +1,38 @@
This directory contains updated benchmarks as of 2022-12-16. They were captured
via the benchsuite script at `benchsuite/benchsuite` from the root of this
repository. The command that was run:

    $ ./benchsuite \
        --dir /dev/shm/benchsuite \
        --raw runs/2022-12-16-archlinux-duff/raw.csv \
        | tee runs/2022-12-16-archlinux-duff/summary

The versions of each tool are as follows:

    $ rg --version
    ripgrep 13.0.0 (rev 87c4a2b4b1)
    -SIMD -AVX (compiled)
    +SIMD +AVX (runtime)

    $ grep -V
    grep (GNU grep) 3.8

    $ ag -V
    ag version 2.2.0
    Features:
    +jit +lzma +zlib

    $ git --version
    git version 2.39.0

    $ ugrep --version
    ugrep 3.9.2 x86_64-pc-linux-gnu +avx2 +pcre2jit +zlib +bzip2 +lzma +lz4 +zstd
    License BSD-3-Clause: <https://opensource.org/licenses/BSD-3-Clause>
    Written by Robert van Engelen and others: <https://github.com/Genivia/ugrep>

The version of ripgrep used was compiled from source on commit 7f23cd63:

    $ cargo build --release --features 'pcre2'

This was run on a machine with an Intel i9-12900K with 128GB of memory.
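
The raw samples written by `--raw` can be reduced to per-tool figures of the kind shown in the summary (`mean +/- deviation (lines: N)`). The following is a minimal sketch, not the benchsuite's own code. It assumes the CSV header `benchmark,warmup_iter,iter,name,command,duration,lines,env` and assumes the deviation is the sample standard deviation. The function name `summarize` and the `SAMPLE` constant are hypothetical, and the sample rows are copied from the `linux_word` benchmark in `raw.csv`.

```python
import csv
import io
import statistics
from collections import defaultdict

# A few rows copied from raw.csv (benchmark: linux_word), header included.
SAMPLE = """benchmark,warmup_iter,iter,name,command,duration,lines,env
linux_word,1,3,rg,rg -n -w PM_RESUME,0.08639907836914062,9,
linux_word,1,3,rg,rg -n -w PM_RESUME,0.08583569526672363,9,
linux_word,1,3,rg,rg -n -w PM_RESUME,0.08414363861083984,9,
linux_word,1,3,ag,ag -s -w PM_RESUME,0.2853865623474121,9,
linux_word,1,3,ag,ag -s -w PM_RESUME,0.2871377468109131,9,
linux_word,1,3,ag,ag -s -w PM_RESUME,0.28753662109375,9,
"""


def summarize(raw_csv_text):
    """Return {(benchmark, name): (mean, stdev, lines)} for the raw samples.

    Assumption: the deviation is the sample standard deviation. The
    'lines' field is kept as the string found in the first row of each
    group and may be empty when a tool reported no line count.
    """
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(raw_csv_text)):
        groups[(row["benchmark"], row["name"])].append(row)

    summary = {}
    for key, rows in groups.items():
        durations = [float(r["duration"]) for r in rows]
        mean = statistics.mean(durations)
        # stdev() needs at least two samples; fall back to 0.0 otherwise.
        stdev = statistics.stdev(durations) if len(durations) > 1 else 0.0
        summary[key] = (mean, stdev, rows[0]["lines"])
    return summary


if __name__ == "__main__":
    for (benchmark, name), (mean, stdev, lines) in summarize(SAMPLE).items():
        print(f"{benchmark}: {name} {mean:.3f} +/- {stdev:.3f} (lines: {lines})")
```

To process a full run, read the file and pass its text to the function, for example `summarize(open("raw.csv", encoding="utf-8").read())`.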

@@ -0,0 +1,400 @@
benchmark,warmup_iter,iter,name,command,duration,lines,env
linux_literal_default,1,3,rg,rg PM_RESUME,0.08678817749023438,39,
linux_literal_default,1,3,rg,rg PM_RESUME,0.08307123184204102,39,
linux_literal_default,1,3,rg,rg PM_RESUME,0.08347964286804199,39,
linux_literal_default,1,3,ag,ag PM_RESUME,0.2955434322357178,39,
linux_literal_default,1,3,ag,ag PM_RESUME,0.2954287528991699,39,
linux_literal_default,1,3,ag,ag PM_RESUME,0.2938194274902344,39,
linux_literal_default,1,3,git grep,git grep PM_RESUME,0.23198556900024414,39,LC_ALL=en_US.UTF-8
linux_literal_default,1,3,git grep,git grep PM_RESUME,0.22356963157653809,39,LC_ALL=en_US.UTF-8
linux_literal_default,1,3,git grep,git grep PM_RESUME,0.2189793586730957,39,LC_ALL=en_US.UTF-8
linux_literal_default,1,3,ugrep,ugrep -r PM_RESUME ./,0.10710000991821289,39,
linux_literal_default,1,3,ugrep,ugrep -r PM_RESUME ./,0.10364222526550293,39,
linux_literal_default,1,3,ugrep,ugrep -r PM_RESUME ./,0.1052248477935791,39,
linux_literal_default,1,3,grep,grep -r PM_RESUME ./,0.9994468688964844,39,LC_ALL=en_US.UTF-8
linux_literal_default,1,3,grep,grep -r PM_RESUME ./,0.9939279556274414,39,LC_ALL=en_US.UTF-8
linux_literal_default,1,3,grep,grep -r PM_RESUME ./,0.9957931041717529,39,LC_ALL=en_US.UTF-8
linux_literal,1,3,rg,rg -n PM_RESUME,0.08603358268737793,39,
linux_literal,1,3,rg,rg -n PM_RESUME,0.0837090015411377,39,
linux_literal,1,3,rg,rg -n PM_RESUME,0.08435535430908203,39,
linux_literal,1,3,rg (mmap),rg -n --mmap PM_RESUME,0.3215503692626953,39,
linux_literal,1,3,rg (mmap),rg -n --mmap PM_RESUME,0.32426929473876953,39,
linux_literal,1,3,rg (mmap),rg -n --mmap PM_RESUME,0.3215982913970947,39,
linux_literal,1,3,ag (mmap),ag -s PM_RESUME,0.2894856929779053,39,
linux_literal,1,3,ag (mmap),ag -s PM_RESUME,0.2892603874206543,39,
linux_literal,1,3,ag (mmap),ag -s PM_RESUME,0.29217028617858887,39,
linux_literal,1,3,git grep,git grep -I -n PM_RESUME,0.206068754196167,39,LC_ALL=C
linux_literal,1,3,git grep,git grep -I -n PM_RESUME,0.2218036651611328,39,LC_ALL=C
linux_literal,1,3,git grep,git grep -I -n PM_RESUME,0.20590710639953613,39,LC_ALL=C
linux_literal,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n PM_RESUME ./,0.18692874908447266,39,
linux_literal,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n PM_RESUME ./,0.19518327713012695,39,
linux_literal,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n PM_RESUME ./,0.18577361106872559,39,
linux_literal_casei,1,3,rg,rg -n -i PM_RESUME,0.08709383010864258,536,
linux_literal_casei,1,3,rg,rg -n -i PM_RESUME,0.08861064910888672,536,
linux_literal_casei,1,3,rg,rg -n -i PM_RESUME,0.08769798278808594,536,
linux_literal_casei,1,3,rg (mmap),rg -n -i --mmap PM_RESUME,0.3218965530395508,536,
linux_literal_casei,1,3,rg (mmap),rg -n -i --mmap PM_RESUME,0.30869364738464355,536,
linux_literal_casei,1,3,rg (mmap),rg -n -i --mmap PM_RESUME,0.31044936180114746,536,
linux_literal_casei,1,3,ag (mmap),ag -i PM_RESUME,0.2989068031311035,536,
linux_literal_casei,1,3,ag (mmap),ag -i PM_RESUME,0.2996039390563965,536,
linux_literal_casei,1,3,ag (mmap),ag -i PM_RESUME,0.29817700386047363,536,
linux_literal_casei,1,3,git grep,git grep -I -n -i PM_RESUME,0.2122786045074463,536,LC_ALL=C
linux_literal_casei,1,3,git grep,git grep -I -n -i PM_RESUME,0.20763754844665527,536,LC_ALL=C
linux_literal_casei,1,3,git grep,git grep -I -n -i PM_RESUME,0.220794677734375,536,LC_ALL=C
linux_literal_casei,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n -i PM_RESUME ./,0.17305850982666016,536,
linux_literal_casei,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n -i PM_RESUME ./,0.1745915412902832,536,
linux_literal_casei,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n -i PM_RESUME ./,0.17526865005493164,536,
linux_re_literal_suffix,1,3,rg,rg -n [A-Z]+_RESUME,0.08527851104736328,2160,
linux_re_literal_suffix,1,3,rg,rg -n [A-Z]+_RESUME,0.08487534523010254,2160,
linux_re_literal_suffix,1,3,rg,rg -n [A-Z]+_RESUME,0.0848684310913086,2160,
linux_re_literal_suffix,1,3,ag,ag -s [A-Z]+_RESUME,0.37945985794067383,2160,
linux_re_literal_suffix,1,3,ag,ag -s [A-Z]+_RESUME,0.36303210258483887,2160,
linux_re_literal_suffix,1,3,ag,ag -s [A-Z]+_RESUME,0.36359691619873047,2160,
linux_re_literal_suffix,1,3,git grep,git grep -E -I -n [A-Z]+_RESUME,0.9589834213256836,2160,LC_ALL=C
linux_re_literal_suffix,1,3,git grep,git grep -E -I -n [A-Z]+_RESUME,0.9206984043121338,2160,LC_ALL=C
linux_re_literal_suffix,1,3,git grep,git grep -E -I -n [A-Z]+_RESUME,0.8642933368682861,2160,LC_ALL=C
linux_re_literal_suffix,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n [A-Z]+_RESUME ./,0.40503501892089844,2160,
linux_re_literal_suffix,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n [A-Z]+_RESUME ./,0.4531714916229248,2160,
linux_re_literal_suffix,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n [A-Z]+_RESUME ./,0.4397866725921631,2160,
linux_word,1,3,rg,rg -n -w PM_RESUME,0.08639907836914062,9,
linux_word,1,3,rg,rg -n -w PM_RESUME,0.08583569526672363,9,
linux_word,1,3,rg,rg -n -w PM_RESUME,0.08414363861083984,9,
linux_word,1,3,ag,ag -s -w PM_RESUME,0.2853865623474121,9,
linux_word,1,3,ag,ag -s -w PM_RESUME,0.2871377468109131,9,
linux_word,1,3,ag,ag -s -w PM_RESUME,0.28753662109375,9,
linux_word,1,3,git grep,git grep -E -I -n -w PM_RESUME,0.20428204536437988,9,LC_ALL=C
linux_word,1,3,git grep,git grep -E -I -n -w PM_RESUME,0.20490717887878418,9,LC_ALL=C
linux_word,1,3,git grep,git grep -E -I -n -w PM_RESUME,0.20840072631835938,9,LC_ALL=C
linux_word,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n -w PM_RESUME ./,0.18790841102600098,9,
linux_word,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n -w PM_RESUME ./,0.18659543991088867,9,
linux_word,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n -w PM_RESUME ./,0.19104933738708496,9,
linux_unicode_greek,1,3,rg,rg -n \p{Greek},0.19976496696472168,105,
linux_unicode_greek,1,3,rg,rg -n \p{Greek},0.20618367195129395,105,
linux_unicode_greek,1,3,rg,rg -n \p{Greek},0.19702935218811035,105,
linux_unicode_greek,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n \p{Greek} ./,0.17758727073669434,105,
linux_unicode_greek,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n \p{Greek} ./,0.17793798446655273,105,
linux_unicode_greek,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n \p{Greek} ./,0.1872577667236328,105,
linux_unicode_greek_casei,1,3,rg,rg -n -i \p{Greek},0.19808244705200195,245,
linux_unicode_greek_casei,1,3,rg,rg -n -i \p{Greek},0.1979837417602539,245,
linux_unicode_greek_casei,1,3,rg,rg -n -i \p{Greek},0.1984400749206543,245,
linux_unicode_greek_casei,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n -i \p{Greek} ./,0.1819148063659668,105,
linux_unicode_greek_casei,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n -i \p{Greek} ./,0.17530512809753418,105,
linux_unicode_greek_casei,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n -i \p{Greek} ./,0.17999005317687988,105,
linux_unicode_word,1,3,rg,rg -n \wAh,0.08527827262878418,247,
linux_unicode_word,1,3,rg,rg -n \wAh,0.08541679382324219,247,
linux_unicode_word,1,3,rg,rg -n \wAh,0.08553218841552734,247,
linux_unicode_word,1,3,rg (ASCII),rg -n (?-u)\wAh,0.08484745025634766,233,
linux_unicode_word,1,3,rg (ASCII),rg -n (?-u)\wAh,0.08466482162475586,233,
linux_unicode_word,1,3,rg (ASCII),rg -n (?-u)\wAh,0.08487439155578613,233,
linux_unicode_word,1,3,ag (ASCII),ag -s \wAh,0.3061795234680176,233,
linux_unicode_word,1,3,ag (ASCII),ag -s \wAh,0.2993617057800293,233,
linux_unicode_word,1,3,ag (ASCII),ag -s \wAh,0.29722046852111816,233,
linux_unicode_word,1,3,git grep,git grep -E -I -n \wAh,4.257144451141357,247,LC_ALL=en_US.UTF-8
linux_unicode_word,1,3,git grep,git grep -E -I -n \wAh,3.852163076400757,247,LC_ALL=en_US.UTF-8
linux_unicode_word,1,3,git grep,git grep -E -I -n \wAh,3.8293941020965576,247,LC_ALL=en_US.UTF-8
linux_unicode_word,1,3,git grep (ASCII),git grep -E -I -n \wAh,1.647632122039795,233,LC_ALL=C
linux_unicode_word,1,3,git grep (ASCII),git grep -E -I -n \wAh,1.6269629001617432,233,LC_ALL=C
linux_unicode_word,1,3,git grep (ASCII),git grep -E -I -n \wAh,1.5847914218902588,233,LC_ALL=C
linux_unicode_word,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n \wAh ./,0.1802208423614502,247,
linux_unicode_word,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n \wAh ./,0.17564702033996582,247,
linux_unicode_word,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n \wAh ./,0.1746981143951416,247,
linux_unicode_word,1,3,ugrep (ASCII),ugrep -r --ignore-files --no-hidden -I -n -U \wAh ./,0.1799161434173584,233,
linux_unicode_word,1,3,ugrep (ASCII),ugrep -r --ignore-files --no-hidden -I -n -U \wAh ./,0.18733000755310059,233,
linux_unicode_word,1,3,ugrep (ASCII),ugrep -r --ignore-files --no-hidden -I -n -U \wAh ./,0.18859529495239258,233,
linux_no_literal,1,3,rg,rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},0.26203155517578125,721,
linux_no_literal,1,3,rg,rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},0.2615540027618408,721,
linux_no_literal,1,3,rg,rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},0.2730247974395752,721,
linux_no_literal,1,3,rg (ASCII),rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},0.19902300834655762,720,
linux_no_literal,1,3,rg (ASCII),rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},0.20034146308898926,720,
linux_no_literal,1,3,rg (ASCII),rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},0.20192813873291016,720,
linux_no_literal,1,3,ag (ASCII),ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},0.8269081115722656,1134,
linux_no_literal,1,3,ag (ASCII),ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},0.8393104076385498,1134,
linux_no_literal,1,3,ag (ASCII),ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},0.8293666839599609,1134,
linux_no_literal,1,3,git grep,git grep -E -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},7.334395408630371,721,LC_ALL=en_US.UTF-8
linux_no_literal,1,3,git grep,git grep -E -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},7.338796854019165,721,LC_ALL=en_US.UTF-8
linux_no_literal,1,3,git grep,git grep -E -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},7.36545991897583,721,LC_ALL=en_US.UTF-8
linux_no_literal,1,3,git grep (ASCII),git grep -E -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},2.1588926315307617,720,LC_ALL=C
linux_no_literal,1,3,git grep (ASCII),git grep -E -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},2.132209062576294,720,LC_ALL=C
linux_no_literal,1,3,git grep (ASCII),git grep -E -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},2.1407439708709717,720,LC_ALL=C
linux_no_literal,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} ./,3.410162925720215,723,
linux_no_literal,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} ./,3.405057668685913,723,
linux_no_literal,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} ./,3.3945884704589844,723,
linux_no_literal,1,3,ugrep (ASCII),ugrep -r --ignore-files --no-hidden -I -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} ./,0.23865604400634766,722,
linux_no_literal,1,3,ugrep (ASCII),ugrep -r --ignore-files --no-hidden -I -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} ./,0.23371148109436035,722,
linux_no_literal,1,3,ugrep (ASCII),ugrep -r --ignore-files --no-hidden -I -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} ./,0.2343149185180664,722,
linux_alternates,1,3,rg,rg -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.08691263198852539,140,
linux_alternates,1,3,rg,rg -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.08707070350646973,140,
linux_alternates,1,3,rg,rg -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.08713960647583008,140,
linux_alternates,1,3,ag,ag -s ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.32947278022766113,140,
linux_alternates,1,3,ag,ag -s ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.33203840255737305,140,
linux_alternates,1,3,ag,ag -s ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.3292670249938965,140,
linux_alternates,1,3,git grep,git grep -E -I -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.4576725959777832,140,LC_ALL=C
linux_alternates,1,3,git grep,git grep -E -I -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.41936421394348145,140,LC_ALL=C
linux_alternates,1,3,git grep,git grep -E -I -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.3639688491821289,140,LC_ALL=C
linux_alternates,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT ./,0.17806458473205566,140,
linux_alternates,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT ./,0.18224716186523438,140,
linux_alternates,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT ./,0.17795038223266602,140,
linux_alternates_casei,1,3,rg,rg -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.12421393394470215,241,
linux_alternates_casei,1,3,rg,rg -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.12235784530639648,241,
linux_alternates_casei,1,3,rg,rg -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.12151455879211426,241,
linux_alternates_casei,1,3,ag,ag -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.529585599899292,241,
linux_alternates_casei,1,3,ag,ag -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.5305526256561279,241,
linux_alternates_casei,1,3,ag,ag -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.5311264991760254,241,
linux_alternates_casei,1,3,git grep,git grep -E -I -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.7589735984802246,241,LC_ALL=C
linux_alternates_casei,1,3,git grep,git grep -E -I -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.7852108478546143,241,LC_ALL=C
linux_alternates_casei,1,3,git grep,git grep -E -I -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.8308050632476807,241,LC_ALL=C
linux_alternates_casei,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT ./,0.17955923080444336,241,
linux_alternates_casei,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT ./,0.1745290756225586,241,
linux_alternates_casei,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT ./,0.1773686408996582,241,
subtitles_en_literal,1,3,rg,rg Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,0.1213979721069336,830,
subtitles_en_literal,1,3,rg,rg Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,0.1213991641998291,830,
subtitles_en_literal,1,3,rg,rg Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,0.12620782852172852,830,
subtitles_en_literal,1,3,rg (no mmap),rg --no-mmap Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,0.18207263946533203,830,
subtitles_en_literal,1,3,rg (no mmap),rg --no-mmap Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,0.17281484603881836,830,
subtitles_en_literal,1,3,rg (no mmap),rg --no-mmap Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,0.17368507385253906,830,
subtitles_en_literal,1,3,grep,grep Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,0.560560941696167,830,LC_ALL=C
subtitles_en_literal,1,3,grep,grep Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,0.563499927520752,830,LC_ALL=C
subtitles_en_literal,1,3,grep,grep Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,0.5916609764099121,830,LC_ALL=C
subtitles_en_literal,1,3,rg (lines),rg -n Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,0.19600844383239746,830,
subtitles_en_literal,1,3,rg (lines),rg -n Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,0.18436980247497559,830,
subtitles_en_literal,1,3,rg (lines),rg -n Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,0.18594050407409668,830,
subtitles_en_literal,1,3,ag (lines),ag -s Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,1.871025562286377,830,
subtitles_en_literal,1,3,ag (lines),ag -s Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,1.8636960983276367,830,
subtitles_en_literal,1,3,ag (lines),ag -s Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,1.8680994510650635,830,
subtitles_en_literal,1,3,grep (lines),grep -n Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,0.9978001117706299,830,LC_ALL=C
subtitles_en_literal,1,3,grep (lines),grep -n Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,0.9385361671447754,830,LC_ALL=C
subtitles_en_literal,1,3,grep (lines),grep -n Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,1.0036489963531494,830,LC_ALL=C
subtitles_en_literal,1,3,ugrep (lines),ugrep -n Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,0.18918490409851074,830,
subtitles_en_literal,1,3,ugrep (lines),ugrep -n Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,0.1769108772277832,830,
subtitles_en_literal,1,3,ugrep (lines),ugrep -n Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,0.18808293342590332,830,
subtitles_en_literal_casei,1,3,rg,rg -i Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,0.21876287460327148,871,
subtitles_en_literal_casei,1,3,rg,rg -i Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,0.2044692039489746,871,
subtitles_en_literal_casei,1,3,rg,rg -i Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,0.2184743881225586,871,
subtitles_en_literal_casei,1,3,grep,grep -i Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,2.224027156829834,871,LC_ALL=en_US.UTF-8
subtitles_en_literal_casei,1,3,grep,grep -i Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,2.223188877105713,871,LC_ALL=en_US.UTF-8
subtitles_en_literal_casei,1,3,grep,grep -i Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,2.223966598510742,871,LC_ALL=en_US.UTF-8
subtitles_en_literal_casei,1,3,grep (ASCII),grep -E -i Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,0.671149492263794,871,LC_ALL=C
subtitles_en_literal_casei,1,3,grep (ASCII),grep -E -i Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,0.6705749034881592,871,LC_ALL=C
subtitles_en_literal_casei,1,3,grep (ASCII),grep -E -i Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,0.6700258255004883,871,LC_ALL=C
subtitles_en_literal_casei,1,3,rg (lines),rg -n -i Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,0.2624058723449707,871,
subtitles_en_literal_casei,1,3,rg (lines),rg -n -i Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,0.25513339042663574,871,
subtitles_en_literal_casei,1,3,rg (lines),rg -n -i Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,0.26088857650756836,871,
subtitles_en_literal_casei,1,3,ag (lines) (ASCII),ag -i Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,1.9144322872161865,871,
subtitles_en_literal_casei,1,3,ag (lines) (ASCII),ag -i Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,1.866628885269165,871,
subtitles_en_literal_casei,1,3,ag (lines) (ASCII),ag -i Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,1.9098389148712158,871,
subtitles_en_literal_casei,1,3,ugrep (lines),ugrep -n -i Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,0.7860472202301025,871,
subtitles_en_literal_casei,1,3,ugrep (lines),ugrep -n -i Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,0.7858343124389648,871,
subtitles_en_literal_casei,1,3,ugrep (lines),ugrep -n -i Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,0.782252311706543,871,
subtitles_en_literal_word,1,3,rg (ASCII),rg -n (?-u:\b)Sherlock Holmes(?-u:\b) /dev/shm/benchsuite/subtitles/en.sample.txt,0.18424677848815918,830,
subtitles_en_literal_word,1,3,rg (ASCII),rg -n (?-u:\b)Sherlock Holmes(?-u:\b) /dev/shm/benchsuite/subtitles/en.sample.txt,0.19610810279846191,830,
subtitles_en_literal_word,1,3,rg (ASCII),rg -n (?-u:\b)Sherlock Holmes(?-u:\b) /dev/shm/benchsuite/subtitles/en.sample.txt,0.18711471557617188,830,
subtitles_en_literal_word,1,3,ag (ASCII),ag -sw Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,1.8301315307617188,830,
subtitles_en_literal_word,1,3,ag (ASCII),ag -sw Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,1.8689801692962646,830,
subtitles_en_literal_word,1,3,ag (ASCII),ag -sw Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,1.8279321193695068,830,
subtitles_en_literal_word,1,3,grep (ASCII),grep -nw Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,1.0036842823028564,830,LC_ALL=C
subtitles_en_literal_word,1,3,grep (ASCII),grep -nw Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,1.002833604812622,830,LC_ALL=C
subtitles_en_literal_word,1,3,grep (ASCII),grep -nw Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,0.9236147403717041,830,LC_ALL=C
subtitles_en_literal_word,1,3,ugrep (ASCII),ugrep -nw Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,0.17717313766479492,830,
subtitles_en_literal_word,1,3,ugrep (ASCII),ugrep -nw Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,0.18994617462158203,830,
subtitles_en_literal_word,1,3,ugrep (ASCII),ugrep -nw Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,0.17972850799560547,830,
subtitles_en_literal_word,1,3,rg,rg -nw Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,0.18804550170898438,830,
subtitles_en_literal_word,1,3,rg,rg -nw Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,0.18867778778076172,830,
subtitles_en_literal_word,1,3,rg,rg -nw Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,0.19913530349731445,830,
subtitles_en_literal_word,1,3,grep,grep -nw Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,1.0044364929199219,830,LC_ALL=en_US.UTF-8
subtitles_en_literal_word,1,3,grep,grep -nw Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,1.0040032863616943,830,LC_ALL=en_US.UTF-8
subtitles_en_literal_word,1,3,grep,grep -nw Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt,0.9627983570098877,830,LC_ALL=en_US.UTF-8
subtitles_en_alternate,1,3,rg (lines),rg -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt,0.24848055839538574,1094,
subtitles_en_alternate,1,3,rg (lines),rg -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt,0.24738383293151855,1094,
subtitles_en_alternate,1,3,rg (lines),rg -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt,0.24789118766784668,1094,
subtitles_en_alternate,1,3,ag (lines),ag -s Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt,2.668708562850952,1094,
subtitles_en_alternate,1,3,ag (lines),ag -s Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt,2.57511305809021,1094,
subtitles_en_alternate,1,3,ag (lines),ag -s Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt,2.6714110374450684,1094,
subtitles_en_alternate,1,3,grep (lines),grep -E -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt,2.0586187839508057,1094,LC_ALL=C
subtitles_en_alternate,1,3,grep (lines),grep -E -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt,2.0227150917053223,1094,LC_ALL=C
subtitles_en_alternate,1,3,grep (lines),grep -E -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt,2.075378179550171,1094,LC_ALL=C
subtitles_en_alternate,1,3,ugrep (lines),ugrep -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt,0.7863781452178955,1094,
subtitles_en_alternate,1,3,ugrep (lines),ugrep -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt,0.7874250411987305,1094,
subtitles_en_alternate,1,3,ugrep (lines),ugrep -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt,0.7867889404296875,1094,
subtitles_en_alternate,1,3,rg,rg Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt,0.18195557594299316,1094,
subtitles_en_alternate,1,3,rg,rg Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt,0.18239641189575195,1094,
subtitles_en_alternate,1,3,rg,rg Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt,0.1625690460205078,1094,
subtitles_en_alternate,1,3,grep,grep -E Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt,1.6601614952087402,1094,LC_ALL=C
subtitles_en_alternate,1,3,grep,grep -E Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt,1.6617567539215088,1094,LC_ALL=C
subtitles_en_alternate,1,3,grep,grep -E Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt,1.6584677696228027,1094,LC_ALL=C
subtitles_en_alternate_casei,1,3,ag (ASCII),ag -s -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt,4.0028722286224365,1136,
subtitles_en_alternate_casei,1,3,ag (ASCII),ag -s -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt,3.991217851638794,1136,
subtitles_en_alternate_casei,1,3,ag (ASCII),ag -s -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt,4.00272274017334,1136,
subtitles_en_alternate_casei,1,3,grep (ASCII),grep -E -ni Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt,3.549154758453369,1136,LC_ALL=C
subtitles_en_alternate_casei,1,3,grep (ASCII),grep -E -ni Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt,3.5468921661376953,1136,LC_ALL=C
subtitles_en_alternate_casei,1,3,grep (ASCII),grep -E -ni Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt,3.5873491764068604,1136,LC_ALL=C
subtitles_en_alternate_casei,1,3,ugrep (ASCII),ugrep -n -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt,0.7872169017791748,1136,
subtitles_en_alternate_casei,1,3,ugrep (ASCII),ugrep -n -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt,0.784674882888794,1136,
subtitles_en_alternate_casei,1,3,ugrep (ASCII),ugrep -n -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt,0.7882401943206787,1136,
subtitles_en_alternate_casei,1,3,rg,rg -n -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt,0.4785435199737549,1136,
subtitles_en_alternate_casei,1,3,rg,rg -n -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt,0.4940922260284424,1136,
subtitles_en_alternate_casei,1,3,rg,rg -n -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt,0.4774627685546875,1136,
subtitles_en_alternate_casei,1,3,grep,grep -E -ni Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt,3.5677175521850586,1136,LC_ALL=en_US.UTF-8
subtitles_en_alternate_casei,1,3,grep,grep -E -ni Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt,3.603273391723633,1136,LC_ALL=en_US.UTF-8
subtitles_en_alternate_casei,1,3,grep,grep -E -ni Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt,3.5834741592407227,1136,LC_ALL=en_US.UTF-8
subtitles_ru_surrounding_words,1,3,rg,rg -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt,0.20238041877746582,278,
subtitles_ru_surrounding_words,1,3,rg,rg -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt,0.2031264305114746,278,
subtitles_ru_surrounding_words,1,3,rg,rg -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt,0.20475172996520996,278,
subtitles_ru_surrounding_words,1,3,grep,grep -E -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt,1.0288453102111816,278,LC_ALL=en_US.UTF-8
subtitles_ru_surrounding_words,1,3,grep,grep -E -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt,1.044802188873291,278,LC_ALL=en_US.UTF-8
subtitles_ru_surrounding_words,1,3,grep,grep -E -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt,1.0432109832763672,278,LC_ALL=en_US.UTF-8
subtitles_ru_surrounding_words,1,3,ugrep,ugrep -an \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt,43.00765633583069,278,
subtitles_ru_surrounding_words,1,3,ugrep,ugrep -an \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt,42.832849740982056,278,
subtitles_ru_surrounding_words,1,3,ugrep,ugrep -an \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt,42.915205240249634,278,
subtitles_ru_surrounding_words,1,3,ag (ASCII),ag -s \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt,1.083683967590332,,
subtitles_ru_surrounding_words,1,3,ag (ASCII),ag -s \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt,1.0841526985168457,,
subtitles_ru_surrounding_words,1,3,ag (ASCII),ag -s \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt,1.0850934982299805,,
subtitles_ru_surrounding_words,1,3,grep (ASCII),grep -E -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt,1.0116353034973145,,LC_ALL=C
subtitles_ru_surrounding_words,1,3,grep (ASCII),grep -E -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt,0.9868073463439941,,LC_ALL=C
subtitles_ru_surrounding_words,1,3,grep (ASCII),grep -E -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt,1.0224814414978027,,LC_ALL=C
subtitles_ru_surrounding_words,1,3,ugrep (ASCII),ugrep -a -n -U \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt,0.8892502784729004,,
subtitles_ru_surrounding_words,1,3,ugrep (ASCII),ugrep -a -n -U \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt,0.8910088539123535,,
subtitles_ru_surrounding_words,1,3,ugrep (ASCII),ugrep -a -n -U \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt,0.8897674083709717,,
subtitles_en_no_literal,1,3,rg,rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/en.sample.txt,2.11850643157959,22,
subtitles_en_no_literal,1,3,rg,rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/en.sample.txt,2.1359670162200928,22,
subtitles_en_no_literal,1,3,rg,rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/en.sample.txt,2.103114128112793,22,
subtitles_en_no_literal,1,3,ugrep,ugrep -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/en.sample.txt,13.050881385803223,22,
subtitles_en_no_literal,1,3,ugrep,ugrep -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/en.sample.txt,13.050772190093994,22,
subtitles_en_no_literal,1,3,ugrep,ugrep -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/en.sample.txt,13.05719804763794,22,
subtitles_en_no_literal,1,3,rg (ASCII),rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/en.sample.txt,1.9961926937103271,22,
subtitles_en_no_literal,1,3,rg (ASCII),rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/en.sample.txt,2.019721508026123,22,
subtitles_en_no_literal,1,3,rg (ASCII),rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/en.sample.txt,1.9965126514434814,22,
subtitles_en_no_literal,1,3,ag (ASCII),ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/en.sample.txt,6.849602222442627,302,
subtitles_en_no_literal,1,3,ag (ASCII),ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/en.sample.txt,6.813834190368652,302,
subtitles_en_no_literal,1,3,ag (ASCII),ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/en.sample.txt,6.8263633251190186,302,
subtitles_en_no_literal,1,3,grep (ASCII),grep -E -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/en.sample.txt,4.42924165725708,22,LC_ALL=C
subtitles_en_no_literal,1,3,grep (ASCII),grep -E -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/en.sample.txt,4.378557205200195,22,LC_ALL=C
subtitles_en_no_literal,1,3,grep (ASCII),grep -E -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/en.sample.txt,4.376646518707275,22,LC_ALL=C
subtitles_en_no_literal,1,3,ugrep (ASCII),ugrep -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/en.sample.txt,3.5110037326812744,22,
subtitles_en_no_literal,1,3,ugrep (ASCII),ugrep -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/en.sample.txt,3.5137360095977783,22,
subtitles_en_no_literal,1,3,ugrep (ASCII),ugrep -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/en.sample.txt,3.5051844120025635,22,
subtitles_ru_literal,1,3,rg,rg Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.13207745552062988,583,
subtitles_ru_literal,1,3,rg,rg Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.13084721565246582,583,
subtitles_ru_literal,1,3,rg,rg Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.13469862937927246,583,
subtitles_ru_literal,1,3,rg (no mmap),rg --no-mmap Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.18022370338439941,583,
subtitles_ru_literal,1,3,rg (no mmap),rg --no-mmap Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.1801767349243164,583,
subtitles_ru_literal,1,3,rg (no mmap),rg --no-mmap Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.17995166778564453,583,
subtitles_ru_literal,1,3,grep,grep Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.5151040554046631,583,LC_ALL=C
subtitles_ru_literal,1,3,grep,grep Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.5154542922973633,583,LC_ALL=C
subtitles_ru_literal,1,3,grep,grep Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.49927639961242676,583,LC_ALL=C
subtitles_ru_literal,1,3,rg (lines),rg -n Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.19464492797851562,583,
subtitles_ru_literal,1,3,rg (lines),rg -n Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.18920588493347168,583,
subtitles_ru_literal,1,3,rg (lines),rg -n Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.19465351104736328,583,
subtitles_ru_literal,1,3,ag (lines),ag -s Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,1.9595966339111328,583,
subtitles_ru_literal,1,3,ag (lines),ag -s Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,2.0014493465423584,583,
subtitles_ru_literal,1,3,ag (lines),ag -s Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,1.9567768573760986,583,
subtitles_ru_literal,1,3,grep (lines),grep -n Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.8119180202484131,583,LC_ALL=C
subtitles_ru_literal,1,3,grep (lines),grep -n Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.8111097812652588,583,LC_ALL=C
subtitles_ru_literal,1,3,grep (lines),grep -n Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.8006868362426758,583,LC_ALL=C
subtitles_ru_literal,1,3,ugrep (lines),ugrep -a -n Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.70003342628479,583,
subtitles_ru_literal,1,3,ugrep (lines),ugrep -a -n Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.650275468826294,583,
subtitles_ru_literal,1,3,ugrep (lines),ugrep -a -n Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.689772367477417,583,
subtitles_ru_literal_casei,1,3,rg,rg -i Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.267578125,604,
subtitles_ru_literal_casei,1,3,rg,rg -i Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.2665982246398926,604,
subtitles_ru_literal_casei,1,3,rg,rg -i Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.26861572265625,604,
subtitles_ru_literal_casei,1,3,grep,grep -i Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,4.764627456665039,604,LC_ALL=en_US.UTF-8
subtitles_ru_literal_casei,1,3,grep,grep -i Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,4.767015695571899,604,LC_ALL=en_US.UTF-8
subtitles_ru_literal_casei,1,3,grep,grep -i Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,4.7688889503479,604,LC_ALL=en_US.UTF-8
subtitles_ru_literal_casei,1,3,grep (ASCII),grep -E -i Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.5046737194061279,583,LC_ALL=C
subtitles_ru_literal_casei,1,3,grep (ASCII),grep -E -i Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.5139875411987305,583,LC_ALL=C
subtitles_ru_literal_casei,1,3,grep (ASCII),grep -E -i Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.4993159770965576,583,LC_ALL=C
subtitles_ru_literal_casei,1,3,rg (lines),rg -n -i Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.33438658714294434,604,
subtitles_ru_literal_casei,1,3,rg (lines),rg -n -i Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.3398289680480957,604,
subtitles_ru_literal_casei,1,3,rg (lines),rg -n -i Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.3298227787017822,604,
subtitles_ru_literal_casei,1,3,ag (lines) (ASCII),ag -i Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.4468214511871338,,
subtitles_ru_literal_casei,1,3,ag (lines) (ASCII),ag -i Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.44559574127197266,,
subtitles_ru_literal_casei,1,3,ag (lines) (ASCII),ag -i Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.47882938385009766,,
subtitles_ru_literal_casei,1,3,ugrep (lines) (ASCII),ugrep -a -n -i Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.7039575576782227,583,
subtitles_ru_literal_casei,1,3,ugrep (lines) (ASCII),ugrep -a -n -i Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.6490752696990967,583,
subtitles_ru_literal_casei,1,3,ugrep (lines) (ASCII),ugrep -a -n -i Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.8081104755401611,583,
subtitles_ru_literal_word,1,3,rg (ASCII),rg -n (?-u:^|\W)Шерлок Холмс(?-u:$|\W) /dev/shm/benchsuite/subtitles/ru.txt,0.20162224769592285,583,
subtitles_ru_literal_word,1,3,rg (ASCII),rg -n (?-u:^|\W)Шерлок Холмс(?-u:$|\W) /dev/shm/benchsuite/subtitles/ru.txt,0.18215250968933105,583,
subtitles_ru_literal_word,1,3,rg (ASCII),rg -n (?-u:^|\W)Шерлок Холмс(?-u:$|\W) /dev/shm/benchsuite/subtitles/ru.txt,0.20087671279907227,583,
subtitles_ru_literal_word,1,3,ag (ASCII),ag -sw Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.48624587059020996,,
subtitles_ru_literal_word,1,3,ag (ASCII),ag -sw Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.5212516784667969,,
subtitles_ru_literal_word,1,3,ag (ASCII),ag -sw Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.520557165145874,,
subtitles_ru_literal_word,1,3,grep (ASCII),grep -nw Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.8108196258544922,583,LC_ALL=C
subtitles_ru_literal_word,1,3,grep (ASCII),grep -nw Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.8121066093444824,583,LC_ALL=C
subtitles_ru_literal_word,1,3,grep (ASCII),grep -nw Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.7784581184387207,583,LC_ALL=C
subtitles_ru_literal_word,1,3,ugrep (ASCII),ugrep -anw Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.7469344139099121,583,
subtitles_ru_literal_word,1,3,ugrep (ASCII),ugrep -anw Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.6838233470916748,583,
subtitles_ru_literal_word,1,3,ugrep (ASCII),ugrep -anw Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.6921679973602295,583,
subtitles_ru_literal_word,1,3,rg,rg -nw Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.19918251037597656,579,
subtitles_ru_literal_word,1,3,rg,rg -nw Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.2046656608581543,579,
subtitles_ru_literal_word,1,3,rg,rg -nw Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.1984848976135254,579,
subtitles_ru_literal_word,1,3,grep,grep -nw Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.794173002243042,579,LC_ALL=en_US.UTF-8
subtitles_ru_literal_word,1,3,grep,grep -nw Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.7715346813201904,579,LC_ALL=en_US.UTF-8
subtitles_ru_literal_word,1,3,grep,grep -nw Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt,0.8116705417633057,579,LC_ALL=en_US.UTF-8
subtitles_ru_alternate,1,3,rg (lines),rg -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt,0.6730976104736328,691,
subtitles_ru_alternate,1,3,rg (lines),rg -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt,0.7020411491394043,691,
subtitles_ru_alternate,1,3,rg (lines),rg -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt,0.6693949699401855,691,
subtitles_ru_alternate,1,3,ag (lines),ag -s Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt,2.7100515365600586,691,
subtitles_ru_alternate,1,3,ag (lines),ag -s Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt,2.7458419799804688,691,
subtitles_ru_alternate,1,3,ag (lines),ag -s Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt,2.7115116119384766,691,
subtitles_ru_alternate,1,3,grep (lines),grep -E -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt,5.703738451004028,691,LC_ALL=C
subtitles_ru_alternate,1,3,grep (lines),grep -E -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt,5.715883731842041,691,LC_ALL=C
subtitles_ru_alternate,1,3,grep (lines),grep -E -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt,5.712724924087524,691,LC_ALL=C
subtitles_ru_alternate,1,3,ugrep (lines),ugrep -an Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt,8.276995420455933,691,
subtitles_ru_alternate,1,3,ugrep (lines),ugrep -an Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt,8.304608345031738,691,
subtitles_ru_alternate,1,3,ugrep (lines),ugrep -an Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt,8.322760820388794,691,
subtitles_ru_alternate,1,3,rg,rg Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt,0.6119842529296875,691,
subtitles_ru_alternate,1,3,rg,rg Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt,0.6368775367736816,691,
subtitles_ru_alternate,1,3,rg,rg Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt,0.6258070468902588,691,
subtitles_ru_alternate,1,3,grep,grep -E Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt,5.4300291538238525,691,LC_ALL=C
subtitles_ru_alternate,1,3,grep,grep -E Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt,5.418199300765991,691,LC_ALL=C
subtitles_ru_alternate,1,3,grep,grep -E Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt,5.425868511199951,691,LC_ALL=C
subtitles_ru_alternate_casei,1,3,ag (ASCII),ag -s -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt,2.7216460704803467,691,
subtitles_ru_alternate_casei,1,3,ag (ASCII),ag -s -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt,2.7108607292175293,691,
subtitles_ru_alternate_casei,1,3,ag (ASCII),ag -s -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt,2.747138500213623,691,
subtitles_ru_alternate_casei,1,3,grep (ASCII),grep -E -ni Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt,5.711230039596558,691,LC_ALL=C
subtitles_ru_alternate_casei,1,3,grep (ASCII),grep -E -ni Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt,5.709407329559326,691,LC_ALL=C
subtitles_ru_alternate_casei,1,3,grep (ASCII),grep -E -ni Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt,5.714034557342529,691,LC_ALL=C
subtitles_ru_alternate_casei,1,3,ugrep (ASCII),ugrep -ani Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt,8.305904626846313,691,
subtitles_ru_alternate_casei,1,3,ugrep (ASCII),ugrep -ani Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt,8.307406187057495,691,
subtitles_ru_alternate_casei,1,3,ugrep (ASCII),ugrep -ani Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt,8.288233995437622,691,
subtitles_ru_alternate_casei,1,3,rg,rg -n -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt,3.673624277114868,735,
subtitles_ru_alternate_casei,1,3,rg,rg -n -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt,3.6759188175201416,735,
subtitles_ru_alternate_casei,1,3,rg,rg -n -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt,3.66877818107605,735,
subtitles_ru_alternate_casei,1,3,grep,grep -E -ni Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt,5.366282224655151,735,LC_ALL=en_US.UTF-8
subtitles_ru_alternate_casei,1,3,grep,grep -E -ni Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt,5.370524883270264,735,LC_ALL=en_US.UTF-8
subtitles_ru_alternate_casei,1,3,grep,grep -E -ni Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt,5.342163324356079,735,LC_ALL=en_US.UTF-8
subtitles_ru_surrounding_words,1,3,rg,rg -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt,0.20331382751464844,278,
subtitles_ru_surrounding_words,1,3,rg,rg -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt,0.2034592628479004,278,
subtitles_ru_surrounding_words,1,3,rg,rg -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt,0.20407724380493164,278,
subtitles_ru_surrounding_words,1,3,grep,grep -E -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt,1.0436389446258545,278,LC_ALL=en_US.UTF-8
subtitles_ru_surrounding_words,1,3,grep,grep -E -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt,1.0388383865356445,278,LC_ALL=en_US.UTF-8
subtitles_ru_surrounding_words,1,3,grep,grep -E -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt,1.0446207523345947,278,LC_ALL=en_US.UTF-8
subtitles_ru_surrounding_words,1,3,ugrep,ugrep -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt,0.29245424270629883,1,
subtitles_ru_surrounding_words,1,3,ugrep,ugrep -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt,0.29168128967285156,1,
subtitles_ru_surrounding_words,1,3,ugrep,ugrep -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt,0.29593825340270996,1,
subtitles_ru_surrounding_words,1,3,ag (ASCII),ag -s \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt,1.085604190826416,,
subtitles_ru_surrounding_words,1,3,ag (ASCII),ag -s \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt,1.083526372909546,,
subtitles_ru_surrounding_words,1,3,ag (ASCII),ag -s \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt,1.1223819255828857,,
subtitles_ru_surrounding_words,1,3,grep (ASCII),grep -E -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt,0.9905192852020264,,LC_ALL=C
subtitles_ru_surrounding_words,1,3,grep (ASCII),grep -E -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt,1.0222513675689697,,LC_ALL=C
subtitles_ru_surrounding_words,1,3,grep (ASCII),grep -E -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt,1.0216262340545654,,LC_ALL=C
subtitles_ru_surrounding_words,1,3,ugrep (ASCII),ugrep -a -n -U \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt,0.8875806331634521,,
subtitles_ru_surrounding_words,1,3,ugrep (ASCII),ugrep -a -n -U \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt,0.8861405849456787,,
subtitles_ru_surrounding_words,1,3,ugrep (ASCII),ugrep -a -n -U \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt,0.8898241519927979,,
subtitles_ru_no_literal,1,3,rg,rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/ru.txt,2.237398147583008,41,
subtitles_ru_no_literal,1,3,rg,rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/ru.txt,2.253706693649292,41,
subtitles_ru_no_literal,1,3,rg,rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/ru.txt,2.2161178588867188,41,
subtitles_ru_no_literal,1,3,ugrep,ugrep -an \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/ru.txt,28.85959553718567,41,
subtitles_ru_no_literal,1,3,ugrep,ugrep -an \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/ru.txt,28.666419982910156,41,
subtitles_ru_no_literal,1,3,ugrep,ugrep -an \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/ru.txt,28.90555214881897,41,
subtitles_ru_no_literal,1,3,rg (ASCII),rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/ru.txt,2.051813840866089,,
subtitles_ru_no_literal,1,3,rg (ASCII),rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/ru.txt,2.026675224304199,,
subtitles_ru_no_literal,1,3,rg (ASCII),rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/ru.txt,2.027498245239258,,
subtitles_ru_no_literal,1,3,ag (ASCII),ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/ru.txt,1.0998010635375977,,
subtitles_ru_no_literal,1,3,ag (ASCII),ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/ru.txt,1.0900018215179443,,
subtitles_ru_no_literal,1,3,ag (ASCII),ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/ru.txt,1.0901548862457275,,
subtitles_ru_no_literal,1,3,grep (ASCII),grep -E -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/ru.txt,1.0691263675689697,,LC_ALL=C
subtitles_ru_no_literal,1,3,grep (ASCII),grep -E -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/ru.txt,1.0875153541564941,,LC_ALL=C
subtitles_ru_no_literal,1,3,grep (ASCII),grep -E -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/ru.txt,1.0997354984283447,,LC_ALL=C
subtitles_ru_no_literal,1,3,ugrep (ASCII),ugrep -anU \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/ru.txt,0.8329172134399414,,
subtitles_ru_no_literal,1,3,ugrep (ASCII),ugrep -anU \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/ru.txt,0.8292679786682129,,
subtitles_ru_no_literal,1,3,ugrep (ASCII),ugrep -anU \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/ru.txt,0.8326950073242188,,
benchmark,warmup_iter,iter,name,command,duration,lines,env
linux_literal_default,1,3,rg,rg PM_RESUME,0.08678817749023438,39,
linux_literal_default,1,3,rg,rg PM_RESUME,0.08307123184204102,39,
linux_literal_default,1,3,rg,rg PM_RESUME,0.08347964286804199,39,
linux_literal_default,1,3,ag,ag PM_RESUME,0.2955434322357178,39,
linux_literal_default,1,3,ag,ag PM_RESUME,0.2954287528991699,39,
linux_literal_default,1,3,ag,ag PM_RESUME,0.2938194274902344,39,
linux_literal_default,1,3,git grep,git grep PM_RESUME,0.23198556900024414,39,LC_ALL=en_US.UTF-8
linux_literal_default,1,3,git grep,git grep PM_RESUME,0.22356963157653809,39,LC_ALL=en_US.UTF-8
linux_literal_default,1,3,git grep,git grep PM_RESUME,0.2189793586730957,39,LC_ALL=en_US.UTF-8
linux_literal_default,1,3,ugrep,ugrep -r PM_RESUME ./,0.10710000991821289,39,
linux_literal_default,1,3,ugrep,ugrep -r PM_RESUME ./,0.10364222526550293,39,
linux_literal_default,1,3,ugrep,ugrep -r PM_RESUME ./,0.1052248477935791,39,
linux_literal_default,1,3,grep,grep -r PM_RESUME ./,0.9994468688964844,39,LC_ALL=en_US.UTF-8
linux_literal_default,1,3,grep,grep -r PM_RESUME ./,0.9939279556274414,39,LC_ALL=en_US.UTF-8
linux_literal_default,1,3,grep,grep -r PM_RESUME ./,0.9957931041717529,39,LC_ALL=en_US.UTF-8
linux_literal,1,3,rg,rg -n PM_RESUME,0.08603358268737793,39,
linux_literal,1,3,rg,rg -n PM_RESUME,0.0837090015411377,39,
linux_literal,1,3,rg,rg -n PM_RESUME,0.08435535430908203,39,
linux_literal,1,3,rg (mmap),rg -n --mmap PM_RESUME,0.3215503692626953,39,
linux_literal,1,3,rg (mmap),rg -n --mmap PM_RESUME,0.32426929473876953,39,
linux_literal,1,3,rg (mmap),rg -n --mmap PM_RESUME,0.3215982913970947,39,
linux_literal,1,3,ag (mmap),ag -s PM_RESUME,0.2894856929779053,39,
linux_literal,1,3,ag (mmap),ag -s PM_RESUME,0.2892603874206543,39,
linux_literal,1,3,ag (mmap),ag -s PM_RESUME,0.29217028617858887,39,
linux_literal,1,3,git grep,git grep -I -n PM_RESUME,0.206068754196167,39,LC_ALL=C
linux_literal,1,3,git grep,git grep -I -n PM_RESUME,0.2218036651611328,39,LC_ALL=C
linux_literal,1,3,git grep,git grep -I -n PM_RESUME,0.20590710639953613,39,LC_ALL=C
linux_literal,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n PM_RESUME ./,0.18692874908447266,39,
linux_literal,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n PM_RESUME ./,0.19518327713012695,39,
linux_literal,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n PM_RESUME ./,0.18577361106872559,39,
linux_literal_casei,1,3,rg,rg -n -i PM_RESUME,0.08709383010864258,536,
linux_literal_casei,1,3,rg,rg -n -i PM_RESUME,0.08861064910888672,536,
linux_literal_casei,1,3,rg,rg -n -i PM_RESUME,0.08769798278808594,536,
linux_literal_casei,1,3,rg (mmap),rg -n -i --mmap PM_RESUME,0.3218965530395508,536,
linux_literal_casei,1,3,rg (mmap),rg -n -i --mmap PM_RESUME,0.30869364738464355,536,
linux_literal_casei,1,3,rg (mmap),rg -n -i --mmap PM_RESUME,0.31044936180114746,536,
linux_literal_casei,1,3,ag (mmap),ag -i PM_RESUME,0.2989068031311035,536,
linux_literal_casei,1,3,ag (mmap),ag -i PM_RESUME,0.2996039390563965,536,
linux_literal_casei,1,3,ag (mmap),ag -i PM_RESUME,0.29817700386047363,536,
linux_literal_casei,1,3,git grep,git grep -I -n -i PM_RESUME,0.2122786045074463,536,LC_ALL=C
linux_literal_casei,1,3,git grep,git grep -I -n -i PM_RESUME,0.20763754844665527,536,LC_ALL=C
linux_literal_casei,1,3,git grep,git grep -I -n -i PM_RESUME,0.220794677734375,536,LC_ALL=C
linux_literal_casei,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n -i PM_RESUME ./,0.17305850982666016,536,
linux_literal_casei,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n -i PM_RESUME ./,0.1745915412902832,536,
linux_literal_casei,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n -i PM_RESUME ./,0.17526865005493164,536,
linux_re_literal_suffix,1,3,rg,rg -n [A-Z]+_RESUME,0.08527851104736328,2160,
linux_re_literal_suffix,1,3,rg,rg -n [A-Z]+_RESUME,0.08487534523010254,2160,
linux_re_literal_suffix,1,3,rg,rg -n [A-Z]+_RESUME,0.0848684310913086,2160,
linux_re_literal_suffix,1,3,ag,ag -s [A-Z]+_RESUME,0.37945985794067383,2160,
linux_re_literal_suffix,1,3,ag,ag -s [A-Z]+_RESUME,0.36303210258483887,2160,
linux_re_literal_suffix,1,3,ag,ag -s [A-Z]+_RESUME,0.36359691619873047,2160,
linux_re_literal_suffix,1,3,git grep,git grep -E -I -n [A-Z]+_RESUME,0.9589834213256836,2160,LC_ALL=C
linux_re_literal_suffix,1,3,git grep,git grep -E -I -n [A-Z]+_RESUME,0.9206984043121338,2160,LC_ALL=C
linux_re_literal_suffix,1,3,git grep,git grep -E -I -n [A-Z]+_RESUME,0.8642933368682861,2160,LC_ALL=C
linux_re_literal_suffix,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n [A-Z]+_RESUME ./,0.40503501892089844,2160,
linux_re_literal_suffix,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n [A-Z]+_RESUME ./,0.4531714916229248,2160,
linux_re_literal_suffix,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n [A-Z]+_RESUME ./,0.4397866725921631,2160,
linux_word,1,3,rg,rg -n -w PM_RESUME,0.08639907836914062,9,
linux_word,1,3,rg,rg -n -w PM_RESUME,0.08583569526672363,9,
linux_word,1,3,rg,rg -n -w PM_RESUME,0.08414363861083984,9,
linux_word,1,3,ag,ag -s -w PM_RESUME,0.2853865623474121,9,
linux_word,1,3,ag,ag -s -w PM_RESUME,0.2871377468109131,9,
linux_word,1,3,ag,ag -s -w PM_RESUME,0.28753662109375,9,
linux_word,1,3,git grep,git grep -E -I -n -w PM_RESUME,0.20428204536437988,9,LC_ALL=C
linux_word,1,3,git grep,git grep -E -I -n -w PM_RESUME,0.20490717887878418,9,LC_ALL=C
linux_word,1,3,git grep,git grep -E -I -n -w PM_RESUME,0.20840072631835938,9,LC_ALL=C
linux_word,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n -w PM_RESUME ./,0.18790841102600098,9,
linux_word,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n -w PM_RESUME ./,0.18659543991088867,9,
linux_word,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n -w PM_RESUME ./,0.19104933738708496,9,
linux_unicode_greek,1,3,rg,rg -n \p{Greek},0.19976496696472168,105,
linux_unicode_greek,1,3,rg,rg -n \p{Greek},0.20618367195129395,105,
linux_unicode_greek,1,3,rg,rg -n \p{Greek},0.19702935218811035,105,
linux_unicode_greek,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n \p{Greek} ./,0.17758727073669434,105,
linux_unicode_greek,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n \p{Greek} ./,0.17793798446655273,105,
linux_unicode_greek,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n \p{Greek} ./,0.1872577667236328,105,
linux_unicode_greek_casei,1,3,rg,rg -n -i \p{Greek},0.19808244705200195,245,
linux_unicode_greek_casei,1,3,rg,rg -n -i \p{Greek},0.1979837417602539,245,
linux_unicode_greek_casei,1,3,rg,rg -n -i \p{Greek},0.1984400749206543,245,
linux_unicode_greek_casei,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n -i \p{Greek} ./,0.1819148063659668,105,
linux_unicode_greek_casei,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n -i \p{Greek} ./,0.17530512809753418,105,
linux_unicode_greek_casei,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n -i \p{Greek} ./,0.17999005317687988,105,
linux_unicode_word,1,3,rg,rg -n \wAh,0.08527827262878418,247,
linux_unicode_word,1,3,rg,rg -n \wAh,0.08541679382324219,247,
linux_unicode_word,1,3,rg,rg -n \wAh,0.08553218841552734,247,
linux_unicode_word,1,3,rg (ASCII),rg -n (?-u)\wAh,0.08484745025634766,233,
linux_unicode_word,1,3,rg (ASCII),rg -n (?-u)\wAh,0.08466482162475586,233,
linux_unicode_word,1,3,rg (ASCII),rg -n (?-u)\wAh,0.08487439155578613,233,
linux_unicode_word,1,3,ag (ASCII),ag -s \wAh,0.3061795234680176,233,
linux_unicode_word,1,3,ag (ASCII),ag -s \wAh,0.2993617057800293,233,
linux_unicode_word,1,3,ag (ASCII),ag -s \wAh,0.29722046852111816,233,
linux_unicode_word,1,3,git grep,git grep -E -I -n \wAh,4.257144451141357,247,LC_ALL=en_US.UTF-8
linux_unicode_word,1,3,git grep,git grep -E -I -n \wAh,3.852163076400757,247,LC_ALL=en_US.UTF-8
linux_unicode_word,1,3,git grep,git grep -E -I -n \wAh,3.8293941020965576,247,LC_ALL=en_US.UTF-8
linux_unicode_word,1,3,git grep (ASCII),git grep -E -I -n \wAh,1.647632122039795,233,LC_ALL=C
linux_unicode_word,1,3,git grep (ASCII),git grep -E -I -n \wAh,1.6269629001617432,233,LC_ALL=C
linux_unicode_word,1,3,git grep (ASCII),git grep -E -I -n \wAh,1.5847914218902588,233,LC_ALL=C
linux_unicode_word,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n \wAh ./,0.1802208423614502,247,
linux_unicode_word,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n \wAh ./,0.17564702033996582,247,
linux_unicode_word,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n \wAh ./,0.1746981143951416,247,
linux_unicode_word,1,3,ugrep (ASCII),ugrep -r --ignore-files --no-hidden -I -n -U \wAh ./,0.1799161434173584,233,
linux_unicode_word,1,3,ugrep (ASCII),ugrep -r --ignore-files --no-hidden -I -n -U \wAh ./,0.18733000755310059,233,
linux_unicode_word,1,3,ugrep (ASCII),ugrep -r --ignore-files --no-hidden -I -n -U \wAh ./,0.18859529495239258,233,
linux_no_literal,1,3,rg,rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},0.26203155517578125,721,
linux_no_literal,1,3,rg,rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},0.2615540027618408,721,
linux_no_literal,1,3,rg,rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},0.2730247974395752,721,
linux_no_literal,1,3,rg (ASCII),rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},0.19902300834655762,720,
linux_no_literal,1,3,rg (ASCII),rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},0.20034146308898926,720,
linux_no_literal,1,3,rg (ASCII),rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},0.20192813873291016,720,
linux_no_literal,1,3,ag (ASCII),ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},0.8269081115722656,1134,
linux_no_literal,1,3,ag (ASCII),ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},0.8393104076385498,1134,
linux_no_literal,1,3,ag (ASCII),ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},0.8293666839599609,1134,
linux_no_literal,1,3,git grep,git grep -E -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},7.334395408630371,721,LC_ALL=en_US.UTF-8
linux_no_literal,1,3,git grep,git grep -E -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},7.338796854019165,721,LC_ALL=en_US.UTF-8
linux_no_literal,1,3,git grep,git grep -E -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},7.36545991897583,721,LC_ALL=en_US.UTF-8
linux_no_literal,1,3,git grep (ASCII),git grep -E -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},2.1588926315307617,720,LC_ALL=C
linux_no_literal,1,3,git grep (ASCII),git grep -E -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},2.132209062576294,720,LC_ALL=C
linux_no_literal,1,3,git grep (ASCII),git grep -E -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5},2.1407439708709717,720,LC_ALL=C
linux_no_literal,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} ./,3.410162925720215,723,
linux_no_literal,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} ./,3.405057668685913,723,
linux_no_literal,1,3,ugrep,ugrep -r --ignore-files --no-hidden -I -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} ./,3.3945884704589844,723,
linux_no_literal,1,3,ugrep (ASCII),ugrep -r --ignore-files --no-hidden -I -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} ./,0.23865604400634766,722,
linux_no_literal,1,3,ugrep (ASCII),ugrep -r --ignore-files --no-hidden -I -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} ./,0.23371148109436035,722,
linux_no_literal,1,3,ugrep (ASCII),ugrep -r --ignore-files --no-hidden -I -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} ./,0.2343149185180664,722,
linux_alternates,1,3,rg,rg -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.08691263198852539,140,
linux_alternates,1,3,rg,rg -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT,0.08707070350646973,140,
127 linux_alternates 1 3 rg rg -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.08713960647583008 140
128 linux_alternates 1 3 ag ag -s ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.32947278022766113 140
129 linux_alternates 1 3 ag ag -s ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.33203840255737305 140
130 linux_alternates 1 3 ag ag -s ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.3292670249938965 140
131 linux_alternates 1 3 git grep git grep -E -I -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.4576725959777832 140 LC_ALL=C
132 linux_alternates 1 3 git grep git grep -E -I -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.41936421394348145 140 LC_ALL=C
133 linux_alternates 1 3 git grep git grep -E -I -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.3639688491821289 140 LC_ALL=C
134 linux_alternates 1 3 ugrep ugrep -r --ignore-files --no-hidden -I -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT ./ 0.17806458473205566 140
135 linux_alternates 1 3 ugrep ugrep -r --ignore-files --no-hidden -I -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT ./ 0.18224716186523438 140
136 linux_alternates 1 3 ugrep ugrep -r --ignore-files --no-hidden -I -n ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT ./ 0.17795038223266602 140
137 linux_alternates_casei 1 3 rg rg -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.12421393394470215 241
138 linux_alternates_casei 1 3 rg rg -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.12235784530639648 241
139 linux_alternates_casei 1 3 rg rg -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.12151455879211426 241
140 linux_alternates_casei 1 3 ag ag -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.529585599899292 241
141 linux_alternates_casei 1 3 ag ag -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.5305526256561279 241
142 linux_alternates_casei 1 3 ag ag -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.5311264991760254 241
143 linux_alternates_casei 1 3 git grep git grep -E -I -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.7589735984802246 241 LC_ALL=C
144 linux_alternates_casei 1 3 git grep git grep -E -I -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.7852108478546143 241 LC_ALL=C
145 linux_alternates_casei 1 3 git grep git grep -E -I -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT 0.8308050632476807 241 LC_ALL=C
146 linux_alternates_casei 1 3 ugrep ugrep -r --ignore-files --no-hidden -I -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT ./ 0.17955923080444336 241
147 linux_alternates_casei 1 3 ugrep ugrep -r --ignore-files --no-hidden -I -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT ./ 0.1745290756225586 241
148 linux_alternates_casei 1 3 ugrep ugrep -r --ignore-files --no-hidden -I -n -i ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT ./ 0.1773686408996582 241
149 subtitles_en_literal 1 3 rg rg Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 0.1213979721069336 830
150 subtitles_en_literal 1 3 rg rg Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 0.1213991641998291 830
151 subtitles_en_literal 1 3 rg rg Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 0.12620782852172852 830
152 subtitles_en_literal 1 3 rg (no mmap) rg --no-mmap Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 0.18207263946533203 830
153 subtitles_en_literal 1 3 rg (no mmap) rg --no-mmap Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 0.17281484603881836 830
154 subtitles_en_literal 1 3 rg (no mmap) rg --no-mmap Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 0.17368507385253906 830
155 subtitles_en_literal 1 3 grep grep Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 0.560560941696167 830 LC_ALL=C
156 subtitles_en_literal 1 3 grep grep Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 0.563499927520752 830 LC_ALL=C
157 subtitles_en_literal 1 3 grep grep Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 0.5916609764099121 830 LC_ALL=C
158 subtitles_en_literal 1 3 rg (lines) rg -n Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 0.19600844383239746 830
159 subtitles_en_literal 1 3 rg (lines) rg -n Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 0.18436980247497559 830
160 subtitles_en_literal 1 3 rg (lines) rg -n Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 0.18594050407409668 830
161 subtitles_en_literal 1 3 ag (lines) ag -s Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 1.871025562286377 830
162 subtitles_en_literal 1 3 ag (lines) ag -s Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 1.8636960983276367 830
163 subtitles_en_literal 1 3 ag (lines) ag -s Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 1.8680994510650635 830
164 subtitles_en_literal 1 3 grep (lines) grep -n Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 0.9978001117706299 830 LC_ALL=C
165 subtitles_en_literal 1 3 grep (lines) grep -n Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 0.9385361671447754 830 LC_ALL=C
166 subtitles_en_literal 1 3 grep (lines) grep -n Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 1.0036489963531494 830 LC_ALL=C
167 subtitles_en_literal 1 3 ugrep (lines) ugrep -n Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 0.18918490409851074 830
168 subtitles_en_literal 1 3 ugrep (lines) ugrep -n Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 0.1769108772277832 830
169 subtitles_en_literal 1 3 ugrep (lines) ugrep -n Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 0.18808293342590332 830
170 subtitles_en_literal_casei 1 3 rg rg -i Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 0.21876287460327148 871
171 subtitles_en_literal_casei 1 3 rg rg -i Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 0.2044692039489746 871
172 subtitles_en_literal_casei 1 3 rg rg -i Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 0.2184743881225586 871
173 subtitles_en_literal_casei 1 3 grep grep -i Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 2.224027156829834 871 LC_ALL=en_US.UTF-8
174 subtitles_en_literal_casei 1 3 grep grep -i Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 2.223188877105713 871 LC_ALL=en_US.UTF-8
175 subtitles_en_literal_casei 1 3 grep grep -i Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 2.223966598510742 871 LC_ALL=en_US.UTF-8
176 subtitles_en_literal_casei 1 3 grep (ASCII) grep -E -i Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 0.671149492263794 871 LC_ALL=C
177 subtitles_en_literal_casei 1 3 grep (ASCII) grep -E -i Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 0.6705749034881592 871 LC_ALL=C
178 subtitles_en_literal_casei 1 3 grep (ASCII) grep -E -i Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 0.6700258255004883 871 LC_ALL=C
179 subtitles_en_literal_casei 1 3 rg (lines) rg -n -i Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 0.2624058723449707 871
180 subtitles_en_literal_casei 1 3 rg (lines) rg -n -i Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 0.25513339042663574 871
181 subtitles_en_literal_casei 1 3 rg (lines) rg -n -i Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 0.26088857650756836 871
182 subtitles_en_literal_casei 1 3 ag (lines) (ASCII) ag -i Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 1.9144322872161865 871
183 subtitles_en_literal_casei 1 3 ag (lines) (ASCII) ag -i Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 1.866628885269165 871
184 subtitles_en_literal_casei 1 3 ag (lines) (ASCII) ag -i Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 1.9098389148712158 871
185 subtitles_en_literal_casei 1 3 ugrep (lines) ugrep -n -i Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 0.7860472202301025 871
186 subtitles_en_literal_casei 1 3 ugrep (lines) ugrep -n -i Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 0.7858343124389648 871
187 subtitles_en_literal_casei 1 3 ugrep (lines) ugrep -n -i Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 0.782252311706543 871
188 subtitles_en_literal_word 1 3 rg (ASCII) rg -n (?-u:\b)Sherlock Holmes(?-u:\b) /dev/shm/benchsuite/subtitles/en.sample.txt 0.18424677848815918 830
189 subtitles_en_literal_word 1 3 rg (ASCII) rg -n (?-u:\b)Sherlock Holmes(?-u:\b) /dev/shm/benchsuite/subtitles/en.sample.txt 0.19610810279846191 830
190 subtitles_en_literal_word 1 3 rg (ASCII) rg -n (?-u:\b)Sherlock Holmes(?-u:\b) /dev/shm/benchsuite/subtitles/en.sample.txt 0.18711471557617188 830
191 subtitles_en_literal_word 1 3 ag (ASCII) ag -sw Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 1.8301315307617188 830
192 subtitles_en_literal_word 1 3 ag (ASCII) ag -sw Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 1.8689801692962646 830
193 subtitles_en_literal_word 1 3 ag (ASCII) ag -sw Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 1.8279321193695068 830
194 subtitles_en_literal_word 1 3 grep (ASCII) grep -nw Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 1.0036842823028564 830 LC_ALL=C
195 subtitles_en_literal_word 1 3 grep (ASCII) grep -nw Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 1.002833604812622 830 LC_ALL=C
196 subtitles_en_literal_word 1 3 grep (ASCII) grep -nw Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 0.9236147403717041 830 LC_ALL=C
197 subtitles_en_literal_word 1 3 ugrep (ASCII) ugrep -nw Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 0.17717313766479492 830
198 subtitles_en_literal_word 1 3 ugrep (ASCII) ugrep -nw Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 0.18994617462158203 830
199 subtitles_en_literal_word 1 3 ugrep (ASCII) ugrep -nw Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 0.17972850799560547 830
200 subtitles_en_literal_word 1 3 rg rg -nw Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 0.18804550170898438 830
201 subtitles_en_literal_word 1 3 rg rg -nw Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 0.18867778778076172 830
202 subtitles_en_literal_word 1 3 rg rg -nw Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 0.19913530349731445 830
203 subtitles_en_literal_word 1 3 grep grep -nw Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 1.0044364929199219 830 LC_ALL=en_US.UTF-8
204 subtitles_en_literal_word 1 3 grep grep -nw Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 1.0040032863616943 830 LC_ALL=en_US.UTF-8
205 subtitles_en_literal_word 1 3 grep grep -nw Sherlock Holmes /dev/shm/benchsuite/subtitles/en.sample.txt 0.9627983570098877 830 LC_ALL=en_US.UTF-8
206 subtitles_en_alternate 1 3 rg (lines) rg -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt 0.24848055839538574 1094
207 subtitles_en_alternate 1 3 rg (lines) rg -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt 0.24738383293151855 1094
208 subtitles_en_alternate 1 3 rg (lines) rg -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt 0.24789118766784668 1094
209 subtitles_en_alternate 1 3 ag (lines) ag -s Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt 2.668708562850952 1094
210 subtitles_en_alternate 1 3 ag (lines) ag -s Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt 2.57511305809021 1094
211 subtitles_en_alternate 1 3 ag (lines) ag -s Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt 2.6714110374450684 1094
212 subtitles_en_alternate 1 3 grep (lines) grep -E -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt 2.0586187839508057 1094 LC_ALL=C
213 subtitles_en_alternate 1 3 grep (lines) grep -E -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt 2.0227150917053223 1094 LC_ALL=C
214 subtitles_en_alternate 1 3 grep (lines) grep -E -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt 2.075378179550171 1094 LC_ALL=C
215 subtitles_en_alternate 1 3 ugrep (lines) ugrep -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt 0.7863781452178955 1094
216 subtitles_en_alternate 1 3 ugrep (lines) ugrep -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt 0.7874250411987305 1094
217 subtitles_en_alternate 1 3 ugrep (lines) ugrep -n Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt 0.7867889404296875 1094
218 subtitles_en_alternate 1 3 rg rg Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt 0.18195557594299316 1094
219 subtitles_en_alternate 1 3 rg rg Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt 0.18239641189575195 1094
220 subtitles_en_alternate 1 3 rg rg Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt 0.1625690460205078 1094
221 subtitles_en_alternate 1 3 grep grep -E Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt 1.6601614952087402 1094 LC_ALL=C
222 subtitles_en_alternate 1 3 grep grep -E Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt 1.6617567539215088 1094 LC_ALL=C
223 subtitles_en_alternate 1 3 grep grep -E Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt 1.6584677696228027 1094 LC_ALL=C
224 subtitles_en_alternate_casei 1 3 ag (ASCII) ag -s -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt 4.0028722286224365 1136
225 subtitles_en_alternate_casei 1 3 ag (ASCII) ag -s -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt 3.991217851638794 1136
226 subtitles_en_alternate_casei 1 3 ag (ASCII) ag -s -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt 4.00272274017334 1136
227 subtitles_en_alternate_casei 1 3 grep (ASCII) grep -E -ni Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt 3.549154758453369 1136 LC_ALL=C
228 subtitles_en_alternate_casei 1 3 grep (ASCII) grep -E -ni Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt 3.5468921661376953 1136 LC_ALL=C
229 subtitles_en_alternate_casei 1 3 grep (ASCII) grep -E -ni Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt 3.5873491764068604 1136 LC_ALL=C
230 subtitles_en_alternate_casei 1 3 ugrep (ASCII) ugrep -n -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt 0.7872169017791748 1136
231 subtitles_en_alternate_casei 1 3 ugrep (ASCII) ugrep -n -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt 0.784674882888794 1136
232 subtitles_en_alternate_casei 1 3 ugrep (ASCII) ugrep -n -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt 0.7882401943206787 1136
233 subtitles_en_alternate_casei 1 3 rg rg -n -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt 0.4785435199737549 1136
234 subtitles_en_alternate_casei 1 3 rg rg -n -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt 0.4940922260284424 1136
235 subtitles_en_alternate_casei 1 3 rg rg -n -i Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt 0.4774627685546875 1136
236 subtitles_en_alternate_casei 1 3 grep grep -E -ni Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt 3.5677175521850586 1136 LC_ALL=en_US.UTF-8
237 subtitles_en_alternate_casei 1 3 grep grep -E -ni Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt 3.603273391723633 1136 LC_ALL=en_US.UTF-8
238 subtitles_en_alternate_casei 1 3 grep grep -E -ni Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty /dev/shm/benchsuite/subtitles/en.sample.txt 3.5834741592407227 1136 LC_ALL=en_US.UTF-8
239 subtitles_ru_surrounding_words 1 3 rg rg -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt 0.20238041877746582 278
240 subtitles_ru_surrounding_words 1 3 rg rg -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt 0.2031264305114746 278
241 subtitles_ru_surrounding_words 1 3 rg rg -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt 0.20475172996520996 278
242 subtitles_ru_surrounding_words 1 3 grep grep -E -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt 1.0288453102111816 278 LC_ALL=en_US.UTF-8
243 subtitles_ru_surrounding_words 1 3 grep grep -E -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt 1.044802188873291 278 LC_ALL=en_US.UTF-8
244 subtitles_ru_surrounding_words 1 3 grep grep -E -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt 1.0432109832763672 278 LC_ALL=en_US.UTF-8
245 subtitles_ru_surrounding_words 1 3 ugrep ugrep -an \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt 43.00765633583069 278
246 subtitles_ru_surrounding_words 1 3 ugrep ugrep -an \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt 42.832849740982056 278
247 subtitles_ru_surrounding_words 1 3 ugrep ugrep -an \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt 42.915205240249634 278
248 subtitles_ru_surrounding_words 1 3 ag (ASCII) ag -s \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt 1.083683967590332
249 subtitles_ru_surrounding_words 1 3 ag (ASCII) ag -s \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt 1.0841526985168457
250 subtitles_ru_surrounding_words 1 3 ag (ASCII) ag -s \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt 1.0850934982299805
251 subtitles_ru_surrounding_words 1 3 grep (ASCII) grep -E -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt 1.0116353034973145 LC_ALL=C
252 subtitles_ru_surrounding_words 1 3 grep (ASCII) grep -E -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt 0.9868073463439941 LC_ALL=C
253 subtitles_ru_surrounding_words 1 3 grep (ASCII) grep -E -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt 1.0224814414978027 LC_ALL=C
254 subtitles_ru_surrounding_words 1 3 ugrep (ASCII) ugrep -a -n -U \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt 0.8892502784729004
255 subtitles_ru_surrounding_words 1 3 ugrep (ASCII) ugrep -a -n -U \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt 0.8910088539123535
256 subtitles_ru_surrounding_words 1 3 ugrep (ASCII) ugrep -a -n -U \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt 0.8897674083709717
257 subtitles_en_no_literal 1 3 rg rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/en.sample.txt 2.11850643157959 22
258 subtitles_en_no_literal 1 3 rg rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/en.sample.txt 2.1359670162200928 22
259 subtitles_en_no_literal 1 3 rg rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/en.sample.txt 2.103114128112793 22
260 subtitles_en_no_literal 1 3 ugrep ugrep -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/en.sample.txt 13.050881385803223 22
261 subtitles_en_no_literal 1 3 ugrep ugrep -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/en.sample.txt 13.050772190093994 22
262 subtitles_en_no_literal 1 3 ugrep ugrep -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/en.sample.txt 13.05719804763794 22
263 subtitles_en_no_literal 1 3 rg (ASCII) rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/en.sample.txt 1.9961926937103271 22
264 subtitles_en_no_literal 1 3 rg (ASCII) rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/en.sample.txt 2.019721508026123 22
265 subtitles_en_no_literal 1 3 rg (ASCII) rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/en.sample.txt 1.9965126514434814 22
266 subtitles_en_no_literal 1 3 ag (ASCII) ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/en.sample.txt 6.849602222442627 302
267 subtitles_en_no_literal 1 3 ag (ASCII) ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/en.sample.txt 6.813834190368652 302
268 subtitles_en_no_literal 1 3 ag (ASCII) ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/en.sample.txt 6.8263633251190186 302
269 subtitles_en_no_literal 1 3 grep (ASCII) grep -E -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/en.sample.txt 4.42924165725708 22 LC_ALL=C
270 subtitles_en_no_literal 1 3 grep (ASCII) grep -E -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/en.sample.txt 4.378557205200195 22 LC_ALL=C
271 subtitles_en_no_literal 1 3 grep (ASCII) grep -E -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/en.sample.txt 4.376646518707275 22 LC_ALL=C
272 subtitles_en_no_literal 1 3 ugrep (ASCII) ugrep -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/en.sample.txt 3.5110037326812744 22
273 subtitles_en_no_literal 1 3 ugrep (ASCII) ugrep -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/en.sample.txt 3.5137360095977783 22
274 subtitles_en_no_literal 1 3 ugrep (ASCII) ugrep -n -U \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/en.sample.txt 3.5051844120025635 22
275 subtitles_ru_literal 1 3 rg rg Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.13207745552062988 583
276 subtitles_ru_literal 1 3 rg rg Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.13084721565246582 583
277 subtitles_ru_literal 1 3 rg rg Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.13469862937927246 583
278 subtitles_ru_literal 1 3 rg (no mmap) rg --no-mmap Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.18022370338439941 583
279 subtitles_ru_literal 1 3 rg (no mmap) rg --no-mmap Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.1801767349243164 583
280 subtitles_ru_literal 1 3 rg (no mmap) rg --no-mmap Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.17995166778564453 583
281 subtitles_ru_literal 1 3 grep grep Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.5151040554046631 583 LC_ALL=C
282 subtitles_ru_literal 1 3 grep grep Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.5154542922973633 583 LC_ALL=C
283 subtitles_ru_literal 1 3 grep grep Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.49927639961242676 583 LC_ALL=C
284 subtitles_ru_literal 1 3 rg (lines) rg -n Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.19464492797851562 583
285 subtitles_ru_literal 1 3 rg (lines) rg -n Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.18920588493347168 583
286 subtitles_ru_literal 1 3 rg (lines) rg -n Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.19465351104736328 583
287 subtitles_ru_literal 1 3 ag (lines) ag -s Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 1.9595966339111328 583
288 subtitles_ru_literal 1 3 ag (lines) ag -s Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 2.0014493465423584 583
289 subtitles_ru_literal 1 3 ag (lines) ag -s Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 1.9567768573760986 583
290 subtitles_ru_literal 1 3 grep (lines) grep -n Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.8119180202484131 583 LC_ALL=C
291 subtitles_ru_literal 1 3 grep (lines) grep -n Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.8111097812652588 583 LC_ALL=C
292 subtitles_ru_literal 1 3 grep (lines) grep -n Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.8006868362426758 583 LC_ALL=C
293 subtitles_ru_literal 1 3 ugrep (lines) ugrep -a -n Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.70003342628479 583
294 subtitles_ru_literal 1 3 ugrep (lines) ugrep -a -n Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.650275468826294 583
295 subtitles_ru_literal 1 3 ugrep (lines) ugrep -a -n Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.689772367477417 583
296 subtitles_ru_literal_casei 1 3 rg rg -i Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.267578125 604
297 subtitles_ru_literal_casei 1 3 rg rg -i Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.2665982246398926 604
298 subtitles_ru_literal_casei 1 3 rg rg -i Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.26861572265625 604
299 subtitles_ru_literal_casei 1 3 grep grep -i Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 4.764627456665039 604 LC_ALL=en_US.UTF-8
300 subtitles_ru_literal_casei 1 3 grep grep -i Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 4.767015695571899 604 LC_ALL=en_US.UTF-8
301 subtitles_ru_literal_casei 1 3 grep grep -i Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 4.7688889503479 604 LC_ALL=en_US.UTF-8
302 subtitles_ru_literal_casei 1 3 grep (ASCII) grep -E -i Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.5046737194061279 583 LC_ALL=C
303 subtitles_ru_literal_casei 1 3 grep (ASCII) grep -E -i Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.5139875411987305 583 LC_ALL=C
304 subtitles_ru_literal_casei 1 3 grep (ASCII) grep -E -i Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.4993159770965576 583 LC_ALL=C
305 subtitles_ru_literal_casei 1 3 rg (lines) rg -n -i Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.33438658714294434 604
306 subtitles_ru_literal_casei 1 3 rg (lines) rg -n -i Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.3398289680480957 604
307 subtitles_ru_literal_casei 1 3 rg (lines) rg -n -i Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.3298227787017822 604
308 subtitles_ru_literal_casei 1 3 ag (lines) (ASCII) ag -i Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.4468214511871338
309 subtitles_ru_literal_casei 1 3 ag (lines) (ASCII) ag -i Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.44559574127197266
310 subtitles_ru_literal_casei 1 3 ag (lines) (ASCII) ag -i Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.47882938385009766
311 subtitles_ru_literal_casei 1 3 ugrep (lines) (ASCII) ugrep -a -n -i Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.7039575576782227 583
312 subtitles_ru_literal_casei 1 3 ugrep (lines) (ASCII) ugrep -a -n -i Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.6490752696990967 583
313 subtitles_ru_literal_casei 1 3 ugrep (lines) (ASCII) ugrep -a -n -i Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.8081104755401611 583
314 subtitles_ru_literal_word 1 3 rg (ASCII) rg -n (?-u:^|\W)Шерлок Холмс(?-u:$|\W) /dev/shm/benchsuite/subtitles/ru.txt 0.20162224769592285 583
315 subtitles_ru_literal_word 1 3 rg (ASCII) rg -n (?-u:^|\W)Шерлок Холмс(?-u:$|\W) /dev/shm/benchsuite/subtitles/ru.txt 0.18215250968933105 583
316 subtitles_ru_literal_word 1 3 rg (ASCII) rg -n (?-u:^|\W)Шерлок Холмс(?-u:$|\W) /dev/shm/benchsuite/subtitles/ru.txt 0.20087671279907227 583
317 subtitles_ru_literal_word 1 3 ag (ASCII) ag -sw Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.48624587059020996
318 subtitles_ru_literal_word 1 3 ag (ASCII) ag -sw Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.5212516784667969
319 subtitles_ru_literal_word 1 3 ag (ASCII) ag -sw Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.520557165145874
320 subtitles_ru_literal_word 1 3 grep (ASCII) grep -nw Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.8108196258544922 583 LC_ALL=C
321 subtitles_ru_literal_word 1 3 grep (ASCII) grep -nw Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.8121066093444824 583 LC_ALL=C
322 subtitles_ru_literal_word 1 3 grep (ASCII) grep -nw Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.7784581184387207 583 LC_ALL=C
323 subtitles_ru_literal_word 1 3 ugrep (ASCII) ugrep -anw Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.7469344139099121 583
324 subtitles_ru_literal_word 1 3 ugrep (ASCII) ugrep -anw Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.6838233470916748 583
325 subtitles_ru_literal_word 1 3 ugrep (ASCII) ugrep -anw Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.6921679973602295 583
326 subtitles_ru_literal_word 1 3 rg rg -nw Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.19918251037597656 579
327 subtitles_ru_literal_word 1 3 rg rg -nw Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.2046656608581543 579
328 subtitles_ru_literal_word 1 3 rg rg -nw Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.1984848976135254 579
329 subtitles_ru_literal_word 1 3 grep grep -nw Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.794173002243042 579 LC_ALL=en_US.UTF-8
330 subtitles_ru_literal_word 1 3 grep grep -nw Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.7715346813201904 579 LC_ALL=en_US.UTF-8
331 subtitles_ru_literal_word 1 3 grep grep -nw Шерлок Холмс /dev/shm/benchsuite/subtitles/ru.txt 0.8116705417633057 579 LC_ALL=en_US.UTF-8
332 subtitles_ru_alternate 1 3 rg (lines) rg -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt 0.6730976104736328 691
333 subtitles_ru_alternate 1 3 rg (lines) rg -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt 0.7020411491394043 691
334 subtitles_ru_alternate 1 3 rg (lines) rg -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt 0.6693949699401855 691
335 subtitles_ru_alternate 1 3 ag (lines) ag -s Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt 2.7100515365600586 691
336 subtitles_ru_alternate 1 3 ag (lines) ag -s Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt 2.7458419799804688 691
337 subtitles_ru_alternate 1 3 ag (lines) ag -s Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt 2.7115116119384766 691
338 subtitles_ru_alternate 1 3 grep (lines) grep -E -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt 5.703738451004028 691 LC_ALL=C
339 subtitles_ru_alternate 1 3 grep (lines) grep -E -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt 5.715883731842041 691 LC_ALL=C
340 subtitles_ru_alternate 1 3 grep (lines) grep -E -n Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt 5.712724924087524 691 LC_ALL=C
341 subtitles_ru_alternate 1 3 ugrep (lines) ugrep -an Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt 8.276995420455933 691
342 subtitles_ru_alternate 1 3 ugrep (lines) ugrep -an Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt 8.304608345031738 691
343 subtitles_ru_alternate 1 3 ugrep (lines) ugrep -an Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt 8.322760820388794 691
344 subtitles_ru_alternate 1 3 rg rg Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt 0.6119842529296875 691
345 subtitles_ru_alternate 1 3 rg rg Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt 0.6368775367736816 691
346 subtitles_ru_alternate 1 3 rg rg Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt 0.6258070468902588 691
347 subtitles_ru_alternate 1 3 grep grep -E Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt 5.4300291538238525 691 LC_ALL=C
348 subtitles_ru_alternate 1 3 grep grep -E Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt 5.418199300765991 691 LC_ALL=C
349 subtitles_ru_alternate 1 3 grep grep -E Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt 5.425868511199951 691 LC_ALL=C
350 subtitles_ru_alternate_casei 1 3 ag (ASCII) ag -s -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt 2.7216460704803467 691
351 subtitles_ru_alternate_casei 1 3 ag (ASCII) ag -s -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt 2.7108607292175293 691
352 subtitles_ru_alternate_casei 1 3 ag (ASCII) ag -s -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt 2.747138500213623 691
353 subtitles_ru_alternate_casei 1 3 grep (ASCII) grep -E -ni Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt 5.711230039596558 691 LC_ALL=C
354 subtitles_ru_alternate_casei 1 3 grep (ASCII) grep -E -ni Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt 5.709407329559326 691 LC_ALL=C
355 subtitles_ru_alternate_casei 1 3 grep (ASCII) grep -E -ni Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt 5.714034557342529 691 LC_ALL=C
356 subtitles_ru_alternate_casei 1 3 ugrep (ASCII) ugrep -ani Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt 8.305904626846313 691
357 subtitles_ru_alternate_casei 1 3 ugrep (ASCII) ugrep -ani Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt 8.307406187057495 691
358 subtitles_ru_alternate_casei 1 3 ugrep (ASCII) ugrep -ani Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt 8.288233995437622 691
359 subtitles_ru_alternate_casei 1 3 rg rg -n -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt 3.673624277114868 735
360 subtitles_ru_alternate_casei 1 3 rg rg -n -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt 3.6759188175201416 735
361 subtitles_ru_alternate_casei 1 3 rg rg -n -i Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt 3.66877818107605 735
362 subtitles_ru_alternate_casei 1 3 grep grep -E -ni Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt 5.366282224655151 735 LC_ALL=en_US.UTF-8
363 subtitles_ru_alternate_casei 1 3 grep grep -E -ni Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt 5.370524883270264 735 LC_ALL=en_US.UTF-8
364 subtitles_ru_alternate_casei 1 3 grep grep -E -ni Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти /dev/shm/benchsuite/subtitles/ru.txt 5.342163324356079 735 LC_ALL=en_US.UTF-8
365 subtitles_ru_surrounding_words 1 3 rg rg -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt 0.20331382751464844 278
366 subtitles_ru_surrounding_words 1 3 rg rg -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt 0.2034592628479004 278
367 subtitles_ru_surrounding_words 1 3 rg rg -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt 0.20407724380493164 278
368 subtitles_ru_surrounding_words 1 3 grep grep -E -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt 1.0436389446258545 278 LC_ALL=en_US.UTF-8
369 subtitles_ru_surrounding_words 1 3 grep grep -E -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt 1.0388383865356445 278 LC_ALL=en_US.UTF-8
370 subtitles_ru_surrounding_words 1 3 grep grep -E -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt 1.0446207523345947 278 LC_ALL=en_US.UTF-8
371 subtitles_ru_surrounding_words 1 3 ugrep ugrep -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt 0.29245424270629883 1
372 subtitles_ru_surrounding_words 1 3 ugrep ugrep -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt 0.29168128967285156 1
373 subtitles_ru_surrounding_words 1 3 ugrep ugrep -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt 0.29593825340270996 1
374 subtitles_ru_surrounding_words 1 3 ag (ASCII) ag -s \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt 1.085604190826416
375 subtitles_ru_surrounding_words 1 3 ag (ASCII) ag -s \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt 1.083526372909546
376 subtitles_ru_surrounding_words 1 3 ag (ASCII) ag -s \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt 1.1223819255828857
377 subtitles_ru_surrounding_words 1 3 grep (ASCII) grep -E -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt 0.9905192852020264 LC_ALL=C
378 subtitles_ru_surrounding_words 1 3 grep (ASCII) grep -E -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt 1.0222513675689697 LC_ALL=C
379 subtitles_ru_surrounding_words 1 3 grep (ASCII) grep -E -n \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt 1.0216262340545654 LC_ALL=C
380 subtitles_ru_surrounding_words 1 3 ugrep (ASCII) ugrep -a -n -U \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt 0.8875806331634521
381 subtitles_ru_surrounding_words 1 3 ugrep (ASCII) ugrep -a -n -U \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt 0.8861405849456787
382 subtitles_ru_surrounding_words 1 3 ugrep (ASCII) ugrep -a -n -U \w+\s+Холмс\s+\w+ /dev/shm/benchsuite/subtitles/ru.txt 0.8898241519927979
383 subtitles_ru_no_literal 1 3 rg rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/ru.txt 2.237398147583008 41
384 subtitles_ru_no_literal 1 3 rg rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/ru.txt 2.253706693649292 41
385 subtitles_ru_no_literal 1 3 rg rg -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/ru.txt 2.2161178588867188 41
386 subtitles_ru_no_literal 1 3 ugrep ugrep -an \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/ru.txt 28.85959553718567 41
387 subtitles_ru_no_literal 1 3 ugrep ugrep -an \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/ru.txt 28.666419982910156 41
388 subtitles_ru_no_literal 1 3 ugrep ugrep -an \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/ru.txt 28.90555214881897 41
389 subtitles_ru_no_literal 1 3 rg (ASCII) rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/ru.txt 2.051813840866089
390 subtitles_ru_no_literal 1 3 rg (ASCII) rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/ru.txt 2.026675224304199
391 subtitles_ru_no_literal 1 3 rg (ASCII) rg -n (?-u)\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/ru.txt 2.027498245239258
392 subtitles_ru_no_literal 1 3 ag (ASCII) ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/ru.txt 1.0998010635375977
393 subtitles_ru_no_literal 1 3 ag (ASCII) ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/ru.txt 1.0900018215179443
394 subtitles_ru_no_literal 1 3 ag (ASCII) ag -s \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/ru.txt 1.0901548862457275
395 subtitles_ru_no_literal 1 3 grep (ASCII) grep -E -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/ru.txt 1.0691263675689697 LC_ALL=C
396 subtitles_ru_no_literal 1 3 grep (ASCII) grep -E -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/ru.txt 1.0875153541564941 LC_ALL=C
397 subtitles_ru_no_literal 1 3 grep (ASCII) grep -E -n \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/ru.txt 1.0997354984283447 LC_ALL=C
398 subtitles_ru_no_literal 1 3 ugrep (ASCII) ugrep -anU \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/ru.txt 0.8329172134399414
399 subtitles_ru_no_literal 1 3 ugrep (ASCII) ugrep -anU \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/ru.txt 0.8292679786682129
400 subtitles_ru_no_literal 1 3 ugrep (ASCII) ugrep -anU \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5} /dev/shm/benchsuite/subtitles/ru.txt 0.8326950073242188


@@ -0,0 +1,208 @@
linux_literal_default (pattern: PM_RESUME)
------------------------------------------
rg* 0.084 +/- 0.002 (lines: 39)*
ag 0.295 +/- 0.001 (lines: 39)
git grep 0.225 +/- 0.007 (lines: 39)
ugrep 0.105 +/- 0.002 (lines: 39)
grep 0.996 +/- 0.003 (lines: 39)
linux_literal (pattern: PM_RESUME)
----------------------------------
rg* 0.085 +/- 0.001 (lines: 39)*
rg (mmap) 0.322 +/- 0.002 (lines: 39)
ag (mmap) 0.290 +/- 0.002 (lines: 39)
git grep 0.211 +/- 0.009 (lines: 39)
ugrep 0.189 +/- 0.005 (lines: 39)
linux_literal_casei (pattern: PM_RESUME)
----------------------------------------
rg* 0.088 +/- 0.001 (lines: 536)*
rg (mmap) 0.314 +/- 0.007 (lines: 536)
ag (mmap) 0.299 +/- 0.001 (lines: 536)
git grep 0.214 +/- 0.007 (lines: 536)
ugrep 0.174 +/- 0.001 (lines: 536)
linux_re_literal_suffix (pattern: [A-Z]+_RESUME)
------------------------------------------------
rg* 0.085 +/- 0.000 (lines: 2160)*
ag 0.369 +/- 0.009 (lines: 2160)
git grep 0.915 +/- 0.048 (lines: 2160)
ugrep 0.433 +/- 0.025 (lines: 2160)
linux_word (pattern: PM_RESUME)
-------------------------------
rg* 0.085 +/- 0.001 (lines: 9)*
ag 0.287 +/- 0.001 (lines: 9)
git grep 0.206 +/- 0.002 (lines: 9)
ugrep 0.189 +/- 0.002 (lines: 9)
linux_unicode_greek (pattern: \p{Greek})
----------------------------------------
rg 0.201 +/- 0.005 (lines: 105)
ugrep* 0.181 +/- 0.005 (lines: 105)*
linux_unicode_greek_casei (pattern: \p{Greek})
----------------------------------------------
rg 0.198 +/- 0.000 (lines: 245)
ugrep* 0.179 +/- 0.003 (lines: 105)*
linux_unicode_word (pattern: \wAh)
----------------------------------
rg 0.085 +/- 0.000 (lines: 247)
rg (ASCII)* 0.085 +/- 0.000 (lines: 233)*
ag (ASCII) 0.301 +/- 0.005 (lines: 233)
git grep 3.980 +/- 0.241 (lines: 247)
git grep (ASCII) 1.620 +/- 0.032 (lines: 233)
ugrep 0.177 +/- 0.003 (lines: 247)
ugrep (ASCII) 0.185 +/- 0.005 (lines: 233)
linux_no_literal (pattern: \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5})
-----------------------------------------------------------------
rg 0.266 +/- 0.006 (lines: 721)
rg (ASCII)* 0.200 +/- 0.001 (lines: 720)*
ag (ASCII) 0.832 +/- 0.007 (lines: 1134)
git grep 7.346 +/- 0.017 (lines: 721)
git grep (ASCII) 2.144 +/- 0.014 (lines: 720)
ugrep 3.403 +/- 0.008 (lines: 723)
ugrep (ASCII) 0.236 +/- 0.003 (lines: 722)
linux_alternates (pattern: ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT)
-------------------------------------------------------------------------
rg* 0.087 +/- 0.000 (lines: 140)*
ag 0.330 +/- 0.002 (lines: 140)
git grep 0.414 +/- 0.047 (lines: 140)
ugrep 0.179 +/- 0.002 (lines: 140)
linux_alternates_casei (pattern: ERR_SYS|PME_TURN_OFF|LINK_REQ_RST|CFG_BME_EVT)
-------------------------------------------------------------------------------
rg* 0.123 +/- 0.001 (lines: 241)*
ag 0.530 +/- 0.001 (lines: 241)
git grep 0.792 +/- 0.036 (lines: 241)
ugrep 0.177 +/- 0.003 (lines: 241)
subtitles_en_literal (pattern: Sherlock Holmes)
-----------------------------------------------
rg* 0.123 +/- 0.003 (lines: 830)*
rg (no mmap) 0.176 +/- 0.005 (lines: 830)
grep 0.572 +/- 0.017 (lines: 830)
rg (lines) 0.189 +/- 0.006 (lines: 830)
ag (lines) 1.868 +/- 0.004 (lines: 830)
grep (lines) 0.980 +/- 0.036 (lines: 830)
ugrep (lines) 0.185 +/- 0.007 (lines: 830)
subtitles_en_literal_casei (pattern: Sherlock Holmes)
-----------------------------------------------------
rg* 0.214 +/- 0.008 (lines: 871)*
grep 2.224 +/- 0.000 (lines: 871)
grep (ASCII) 0.671 +/- 0.001 (lines: 871)
rg (lines) 0.259 +/- 0.004 (lines: 871)
ag (lines) (ASCII) 1.897 +/- 0.026 (lines: 871)
ugrep (lines) 0.785 +/- 0.002 (lines: 871)
subtitles_en_literal_word (pattern: Sherlock Holmes)
----------------------------------------------------
rg (ASCII) 0.189 +/- 0.006 (lines: 830)
ag (ASCII) 1.842 +/- 0.023 (lines: 830)
grep (ASCII) 0.977 +/- 0.046 (lines: 830)
ugrep (ASCII)* 0.182 +/- 0.007 (lines: 830)*
rg 0.192 +/- 0.006 (lines: 830)
grep 0.990 +/- 0.024 (lines: 830)
subtitles_en_alternate (pattern: Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty)
---------------------------------------------------------------------------------------------------------------
rg (lines) 0.248 +/- 0.001 (lines: 1094)
ag (lines) 2.638 +/- 0.055 (lines: 1094)
grep (lines) 2.052 +/- 0.027 (lines: 1094)
ugrep (lines) 0.787 +/- 0.001 (lines: 1094)
rg* 0.176 +/- 0.011 (lines: 1094)*
grep 1.660 +/- 0.002 (lines: 1094)
subtitles_en_alternate_casei (pattern: Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty)
---------------------------------------------------------------------------------------------------------------------
ag (ASCII) 3.999 +/- 0.007 (lines: 1136)
grep (ASCII) 3.561 +/- 0.023 (lines: 1136)
ugrep (ASCII) 0.787 +/- 0.002 (lines: 1136)
rg* 0.483 +/- 0.009 (lines: 1136)*
grep 3.585 +/- 0.018 (lines: 1136)
subtitles_en_surrounding_words (pattern: \w+\s+Holmes\s+\w+)
------------------------------------------------------------
rg 0.200 +/- 0.001 (lines: 483)
grep 1.303 +/- 0.040 (lines: 483)
ugrep 43.220 +/- 0.047 (lines: 483)
rg (ASCII)* 0.197 +/- 0.000 (lines: 483)*
ag (ASCII) 5.223 +/- 0.056 (lines: 489)
grep (ASCII) 1.316 +/- 0.043 (lines: 483)
ugrep (ASCII) 17.647 +/- 0.219 (lines: 483)
subtitles_en_no_literal (pattern: \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5})
----------------------------------------------------------------------------------------
rg 2.119 +/- 0.016 (lines: 22)
ugrep 13.053 +/- 0.004 (lines: 22)
rg (ASCII)* 2.004 +/- 0.013 (lines: 22)*
ag (ASCII) 6.830 +/- 0.018 (lines: 302)
grep (ASCII) 4.395 +/- 0.030 (lines: 22)
ugrep (ASCII) 3.510 +/- 0.004 (lines: 22)
subtitles_ru_literal (pattern: Шерлок Холмс)
--------------------------------------------
rg* 0.133 +/- 0.002 (lines: 583)*
rg (no mmap) 0.180 +/- 0.000 (lines: 583)
grep 0.510 +/- 0.009 (lines: 583)
rg (lines) 0.193 +/- 0.003 (lines: 583)
ag (lines) 1.973 +/- 0.025 (lines: 583)
grep (lines) 0.808 +/- 0.006 (lines: 583)
ugrep (lines) 0.680 +/- 0.026 (lines: 583)
subtitles_ru_literal_casei (pattern: Шерлок Холмс)
--------------------------------------------------
rg* 0.268 +/- 0.001 (lines: 604)*
grep 4.767 +/- 0.002 (lines: 604)
grep (ASCII) 0.506 +/- 0.007 (lines: 583)
rg (lines) 0.335 +/- 0.005 (lines: 604)
ag (lines) (ASCII) 0.457 +/- 0.019 (lines: 0)
ugrep (lines) (ASCII) 0.720 +/- 0.081 (lines: 583)
subtitles_ru_literal_word (pattern: Шерлок Холмс)
-------------------------------------------------
rg (ASCII)* 0.195 +/- 0.011 (lines: 583)*
ag (ASCII) 0.509 +/- 0.020 (lines: 0)
grep (ASCII) 0.800 +/- 0.019 (lines: 583)
ugrep (ASCII) 0.708 +/- 0.034 (lines: 583)
rg 0.201 +/- 0.003 (lines: 579)
grep 0.792 +/- 0.020 (lines: 579)
subtitles_ru_alternate (pattern: Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти)
-----------------------------------------------------------------------------------------------------------
rg (lines) 0.682 +/- 0.018 (lines: 691)
ag (lines) 2.722 +/- 0.020 (lines: 691)
grep (lines) 5.711 +/- 0.006 (lines: 691)
ugrep (lines) 8.301 +/- 0.023 (lines: 691)
rg* 0.625 +/- 0.012 (lines: 691)*
grep 5.425 +/- 0.006 (lines: 691)
subtitles_ru_alternate_casei (pattern: Шерлок Холмс|Джон Уотсон|Ирен Адлер|инспектор Лестрейд|профессор Мориарти)
-----------------------------------------------------------------------------------------------------------------
ag (ASCII)* 2.727 +/- 0.019 (lines: 691)*
grep (ASCII) 5.712 +/- 0.002 (lines: 691)
ugrep (ASCII) 8.301 +/- 0.011 (lines: 691)
rg 3.673 +/- 0.004 (lines: 735)
grep 5.360 +/- 0.015 (lines: 735)
subtitles_ru_surrounding_words (pattern: \w+\s+Холмс\s+\w+)
-----------------------------------------------------------
rg* 0.203 +/- 0.001 (lines: 278)*
grep 1.039 +/- 0.009 (lines: 278)
ugrep 42.919 +/- 0.087 (lines: 278)
ag (ASCII) 1.084 +/- 0.001 (lines: 0)
grep (ASCII) 1.007 +/- 0.018 (lines: 0)
ugrep (ASCII) 0.890 +/- 0.001 (lines: 0)
subtitles_ru_no_literal (pattern: \w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5}\s+\w{5})
----------------------------------------------------------------------------------------
rg 2.236 +/- 0.019 (lines: 41)
ugrep 28.811 +/- 0.127 (lines: 41)
rg (ASCII) 2.035 +/- 0.014 (lines: 0)
ag (ASCII) 1.093 +/- 0.006 (lines: 0)
grep (ASCII) 1.085 +/- 0.015 (lines: 0)
ugrep (ASCII)* 0.832 +/- 0.002 (lines: 0)*
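
Each summary row above seems to be the mean of the three raw timings followed by their sample standard deviation (the n - 1 form). That reading is an inference from the numbers, not something the benchmark files state. A minimal sketch of the arithmetic, with hypothetical helper names `mean`, `sample_stddev` and `summarize`:

```rust
/// Arithmetic mean of a non-empty slice of timings (in seconds).
fn mean(samples: &[f64]) -> f64 {
    samples.iter().sum::<f64>() / samples.len() as f64
}

/// Sample standard deviation (divides by n - 1); needs at least two samples.
/// Assumption: this is the deviation the summary reports.
fn sample_stddev(samples: &[f64]) -> f64 {
    let m = mean(samples);
    let sum_sq: f64 = samples.iter().map(|x| (x - m).powi(2)).sum();
    (sum_sq / (samples.len() as f64 - 1.0)).sqrt()
}

/// Render one cell in the same "mean +/- deviation" shape as the summary.
fn summarize(samples: &[f64]) -> String {
    format!("{:.3} +/- {:.3}", mean(samples), sample_stddev(samples))
}
```

Feeding in the three `rg (lines)` timings for `subtitles_ru_alternate` (0.6730976104736328, 0.7020411491394043, 0.6693949699401855) gives `0.682 +/- 0.018`, which matches that row of the summary.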

build.rs

@@ -1,184 +1,46 @@
#[macro_use]
extern crate clap;
#[macro_use]
extern crate lazy_static;
use std::env;
use std::fs::{self, File};
use std::io::{self, Read, Write};
use std::path::Path;
use std::process;
use clap::Shell;
use app::{RGArg, RGArgKind};
#[allow(dead_code)]
#[path = "src/app.rs"]
mod app;
fn main() {
// OUT_DIR is set by Cargo and it's where any additional build artifacts
// are written.
let outdir = match env::var_os("OUT_DIR") {
Some(outdir) => outdir,
None => {
eprintln!(
"OUT_DIR environment variable not defined. \
Please file a bug: \
https://github.com/BurntSushi/ripgrep/issues/new");
process::exit(1);
}
};
fs::create_dir_all(&outdir).unwrap();
let stamp_path = Path::new(&outdir).join("ripgrep-stamp");
if let Err(err) = File::create(&stamp_path) {
panic!("failed to write {}: {}", stamp_path.display(), err);
}
if let Err(err) = generate_man_page(&outdir) {
eprintln!("failed to generate man page: {}", err);
}
// Use clap to build completion files.
let mut app = app::app();
app.gen_completions("rg", Shell::Bash, &outdir);
app.gen_completions("rg", Shell::Fish, &outdir);
app.gen_completions("rg", Shell::PowerShell, &outdir);
// Note that we do not use clap's support for zsh. Instead, zsh completions
// are manually maintained in `complete/_rg`.
// Make the current git hash available to the build.
if let Some(rev) = git_revision_hash() {
println!("cargo:rustc-env=RIPGREP_BUILD_GIT_HASH={}", rev);
}
set_git_revision_hash();
set_windows_exe_options();
}
fn git_revision_hash() -> Option<String> {
let result = process::Command::new("git")
.args(&["rev-parse", "--short=10", "HEAD"])
.output();
result.ok().and_then(|output| {
let v = String::from_utf8_lossy(&output.stdout).trim().to_string();
if v.is_empty() {
None
} else {
Some(v)
}
})
/// Embed a Windows manifest and set some linker options.
///
/// The main reason for this is to enable long path support on Windows. This
/// still, I believe, requires enabling long path support in the registry. But
/// if that's enabled, then this will let ripgrep use C:\... style paths that
/// are longer than 260 characters.
fn set_windows_exe_options() {
static MANIFEST: &str = "pkg/windows/Manifest.xml";
let Ok(target_os) = std::env::var("CARGO_CFG_TARGET_OS") else { return };
let Ok(target_env) = std::env::var("CARGO_CFG_TARGET_ENV") else { return };
if !(target_os == "windows" && target_env == "msvc") {
return;
}
let Ok(mut manifest) = std::env::current_dir() else { return };
manifest.push(MANIFEST);
let Some(manifest) = manifest.to_str() else { return };
println!("cargo:rerun-if-changed={}", MANIFEST);
// Embed the Windows application manifest file.
println!("cargo:rustc-link-arg-bin=rg=/MANIFEST:EMBED");
println!("cargo:rustc-link-arg-bin=rg=/MANIFESTINPUT:{manifest}");
// Turn linker warnings into errors. Helps debugging, otherwise the
// warnings get squashed (I believe).
println!("cargo:rustc-link-arg-bin=rg=/WX");
}
fn generate_man_page<P: AsRef<Path>>(outdir: P) -> io::Result<()> {
// If asciidoc isn't installed, then don't do anything.
if let Err(err) = process::Command::new("a2x").output() {
eprintln!("Could not run 'a2x' binary, skipping man page generation.");
eprintln!("Error from running 'a2x': {}", err);
return Ok(());
/// Make the current git hash available to the build as the environment
/// variable `RIPGREP_BUILD_GIT_HASH`.
fn set_git_revision_hash() {
use std::process::Command;
let args = &["rev-parse", "--short=10", "HEAD"];
let Ok(output) = Command::new("git").args(args).output() else { return };
let rev = String::from_utf8_lossy(&output.stdout).trim().to_string();
if rev.is_empty() {
return;
}
// 1. Read asciidoc template.
// 2. Interpolate template with auto-generated docs.
// 3. Save interpolation to disk.
// 4. Use a2x (part of asciidoc) to convert to man page.
let outdir = outdir.as_ref();
let cwd = env::current_dir()?;
let tpl_path = cwd.join("doc").join("rg.1.txt.tpl");
let txt_path = outdir.join("rg.1.txt");
let mut tpl = String::new();
File::open(&tpl_path)?.read_to_string(&mut tpl)?;
tpl = tpl.replace("{OPTIONS}", &formatted_options()?);
let githash = git_revision_hash();
let githash = githash.as_ref().map(|x| &**x);
tpl = tpl.replace("{VERSION}", &app::long_version(githash));
File::create(&txt_path)?.write_all(tpl.as_bytes())?;
let result = process::Command::new("a2x")
.arg("--no-xmllint")
.arg("--doctype").arg("manpage")
.arg("--format").arg("manpage")
.arg(&txt_path)
.spawn()?
.wait()?;
if !result.success() {
let msg = format!("'a2x' failed with exit code {:?}", result.code());
return Err(ioerr(msg));
}
Ok(())
}
fn formatted_options() -> io::Result<String> {
let mut args = app::all_args_and_flags();
args.sort_by(|x1, x2| x1.name.cmp(&x2.name));
let mut formatted = vec![];
for arg in args {
if arg.hidden {
continue;
}
// ripgrep only has two positional arguments, and probably will only
// ever have two positional arguments, so we just hardcode them into
// the template.
if let app::RGArgKind::Positional{..} = arg.kind {
continue;
}
formatted.push(formatted_arg(&arg)?);
}
Ok(formatted.join("\n\n"))
}
fn formatted_arg(arg: &RGArg) -> io::Result<String> {
match arg.kind {
RGArgKind::Positional{..} => panic!("unexpected positional argument"),
RGArgKind::Switch { long, short, multiple } => {
let mut out = vec![];
let mut header = format!("--{}", long);
if let Some(short) = short {
header = format!("-{}, {}", short, header);
}
if multiple {
header = format!("*{}* ...::", header);
} else {
header = format!("*{}*::", header);
}
writeln!(out, "{}", header)?;
writeln!(out, "{}", formatted_doc_txt(arg)?)?;
Ok(String::from_utf8(out).unwrap())
}
RGArgKind::Flag { long, short, value_name, multiple, .. } => {
let mut out = vec![];
let mut header = format!("--{}", long);
if let Some(short) = short {
header = format!("-{}, {}", short, header);
}
if multiple {
header = format!("*{}* _{}_ ...::", header, value_name);
} else {
header = format!("*{}* _{}_::", header, value_name);
}
writeln!(out, "{}", header)?;
writeln!(out, "{}", formatted_doc_txt(arg)?)?;
Ok(String::from_utf8(out).unwrap())
}
}
}
fn formatted_doc_txt(arg: &RGArg) -> io::Result<String> {
let paragraphs: Vec<&str> = arg.doc_long.split("\n\n").collect();
if paragraphs.is_empty() {
return Err(ioerr(format!("missing docs for --{}", arg.name)));
}
let first = format!(" {}", paragraphs[0].replace("\n", "\n "));
if paragraphs.len() == 1 {
return Ok(first);
}
Ok(format!("{}\n+\n{}", first, paragraphs[1..].join("\n+\n")))
}
fn ioerr(msg: String) -> io::Error {
io::Error::new(io::ErrorKind::Other, msg)
println!("cargo:rustc-env=RIPGREP_BUILD_GIT_HASH={}", rev);
}
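
The new `set_git_revision_hash` only emits `cargo:rustc-env=RIPGREP_BUILD_GIT_HASH=...`, so the hash has to be read back at compile time on the consuming side. A minimal sketch of that side, assuming the crate reads it with `option_env!`. The helper names `build_git_hash` and `version_string`, and the `(rev ...)` layout, are illustrative and not taken from this diff:

```rust
/// Read the hash that the build script exported, if any. `option_env!`
/// is evaluated at compile time and yields `None` when the variable was
/// not set (for example, when `git` was unavailable during the build).
fn build_git_hash() -> Option<&'static str> {
    option_env!("RIPGREP_BUILD_GIT_HASH")
}

/// Combine a package version with an optional revision hash.
/// The "(rev ...)" format is a hypothetical choice for illustration.
fn version_string(pkg_version: &str, git_hash: Option<&str>) -> String {
    match git_hash {
        Some(hash) => format!("{pkg_version} (rev {hash})"),
        None => pkg_version.to_string(),
    }
}
```

Because the build script returns early when `git` fails or prints nothing, the `None` branch is the normal path for builds from a source tarball rather than an error case.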


@@ -1,55 +0,0 @@
#!/bin/bash
# package the build artifacts
set -ex
. "$(dirname $0)/utils.sh"
# Generate artifacts for release
mk_artifacts() {
cargo build --target "$TARGET" --release
}
mk_tarball() {
# When cross-compiling, use the right `strip` tool on the binary.
local gcc_prefix="$(gcc_prefix)"
# Create a temporary dir that contains our staging area.
# $tmpdir/$name is what eventually ends up as the deployed archive.
local tmpdir="$(mktemp -d)"
local name="${PROJECT_NAME}-${TRAVIS_TAG}-${TARGET}"
local staging="$tmpdir/$name"
mkdir -p "$staging"/{complete,doc}
# The deployment directory is where the final archive will reside.
# This path is known by the .travis.yml configuration.
local out_dir="$(pwd)/deployment"
mkdir -p "$out_dir"
# Find the correct (most recent) Cargo "out" directory. The out directory
# contains shell completion files and the man page.
local cargo_out_dir="$(cargo_out_dir "target/$TARGET")"
# Copy the ripgrep binary and strip it.
cp "target/$TARGET/release/rg" "$staging/rg"
"${gcc_prefix}strip" "$staging/rg"
# Copy the licenses and README.
cp {README.md,UNLICENSE,COPYING,LICENSE-MIT} "$staging/"
# Copy documentation and man page.
cp {CHANGELOG.md,FAQ.md,GUIDE.md} "$staging/doc/"
if command -V a2x 2>&1 > /dev/null; then
# The man page should only exist if we have asciidoc installed.
cp "$cargo_out_dir/rg.1" "$staging/doc/"
fi
# Copy shell completion files.
cp "$cargo_out_dir"/{rg.bash,rg.fish,_rg.ps1} "$staging/complete/"
cp complete/_rg "$staging/complete/"
(cd "$tmpdir" && tar czf "$out_dir/$name.tar.gz" "$name")
rm -rf "$tmpdir"
}
main() {
mk_artifacts
mk_tarball
}
main

ci/build-and-publish-m2 Executable file

@@ -0,0 +1,43 @@
#!/bin/bash
# This script builds a ripgrep release for the aarch64-apple-darwin target.
# At time of writing (2023-11-21), GitHub Actions does not offer free Apple silicon
# runners. Since I have somewhat recently acquired an M2 mac mini, I just use
# this script to build the release tarball and upload it with `gh`.
#
# Once GitHub Actions has proper support for Apple silicon, we should add it
# to our release workflow and drop this script.
set -e
version="$1"
if [ -z "$version" ]; then
echo "missing version" >&2
echo "Usage: $(basename "$0") <version>" >&2
exit 1
fi
if ! grep -q "version = \"$version\"" Cargo.toml; then
echo "version does not match Cargo.toml" >&2
exit 1
fi
target=aarch64-apple-darwin
cargo build --release --features pcre2 --target $target
BIN=target/$target/release/rg
NAME=ripgrep-$version-$target
ARCHIVE="deployment/m2/$NAME"
mkdir -p "$ARCHIVE"/{complete,doc}
cp "$BIN" "$ARCHIVE"/
strip "$ARCHIVE/rg"
cp {README.md,COPYING,UNLICENSE,LICENSE-MIT} "$ARCHIVE"/
cp {CHANGELOG.md,FAQ.md,GUIDE.md} "$ARCHIVE"/doc/
"$BIN" --generate complete-bash > "$ARCHIVE/complete/rg.bash"
"$BIN" --generate complete-fish > "$ARCHIVE/complete/rg.fish"
"$BIN" --generate complete-powershell > "$ARCHIVE/complete/_rg.ps1"
"$BIN" --generate complete-zsh > "$ARCHIVE/complete/_rg"
"$BIN" --generate man > "$ARCHIVE/doc/rg.1"
tar c -C deployment/m2 -z -f "$ARCHIVE.tar.gz" "$NAME"
shasum -a 256 "$ARCHIVE.tar.gz" > "$ARCHIVE.tar.gz.sha256"
gh release upload "$version" "$ARCHIVE.tar.gz" "$ARCHIVE.tar.gz.sha256"
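
The version guard in this script passes if any line of `Cargo.toml` contains `version = "<version>"`, which would also include a dependency's version line. A minimal sketch of that check next to a stricter variant, written in Rust for illustration. Both function names are hypothetical, and the stricter variant assumes the key is written exactly as `version = "..."` on its own line under `[package]`:

```rust
/// Mirror of the script's guard: true if any line contains
/// `version = "<version>"`. Unlike `grep`, this matches the version
/// literally rather than as a regex.
fn manifest_mentions_version(manifest: &str, version: &str) -> bool {
    let needle = format!("version = \"{version}\"");
    manifest.lines().any(|line| line.contains(&needle))
}

/// Stricter variant: only accept the `version` key inside `[package]`.
fn package_version_matches(manifest: &str, version: &str) -> bool {
    let needle = format!("version = \"{version}\"");
    let mut in_package = false;
    for line in manifest.lines() {
        let line = line.trim();
        if line.starts_with('[') {
            // Track which TOML table we are currently in.
            in_package = line == "[package]";
        } else if in_package && line == needle {
            return true;
        }
    }
    false
}
```

For a one-off release script the looser check is probably enough, since a mistyped version is unlikely to match a dependency's version by accident.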


@@ -1,61 +0,0 @@
#!/bin/bash
# install stuff needed for the `script` phase
# Where rustup gets installed.
export PATH="$PATH:$HOME/.cargo/bin"
set -ex
. "$(dirname $0)/utils.sh"
install_rustup() {
curl https://sh.rustup.rs -sSf \
| sh -s -- -y --default-toolchain="$TRAVIS_RUST_VERSION"
rustc -V
cargo -V
}
install_targets() {
if [ $(host) != "$TARGET" ]; then
rustup target add $TARGET
fi
}
install_osx_dependencies() {
if ! is_osx; then
return
fi
brew install asciidoc docbook-xsl
}
configure_cargo() {
local prefix=$(gcc_prefix)
if [ -n "${prefix}" ]; then
local gcc_suffix=
if [ -n "$GCC_VERSION" ]; then
gcc_suffix="-$GCC_VERSION"
fi
local gcc="${prefix}gcc${gcc_suffix}"
# information about the cross compiler
"${gcc}" -v
# tell cargo which linker to use for cross compilation
mkdir -p .cargo
cat >>.cargo/config <<EOF
[target.$TARGET]
linker = "${gcc}"
EOF
fi
}
main() {
install_osx_dependencies
install_rustup
install_targets
configure_cargo
}
main


@@ -1,46 +0,0 @@
#!/bin/bash
# build, test and generate docs in this phase
set -ex
. "$(dirname $0)/utils.sh"
main() {
# Test a normal debug build.
cargo build --target "$TARGET" --verbose --all
# Show the output of the most recent build.rs stderr.
set +x
stderr="$(find "target/$TARGET/debug" -name stderr -print0 | xargs -0 ls -t | head -n1)"
if [ -s "$stderr" ]; then
echo "===== $stderr ====="
cat "$stderr"
echo "====="
fi
set -x
# sanity check the file type
file target/"$TARGET"/debug/rg
# Check that we've generated man page and other shell completions.
outdir="$(cargo_out_dir "target/$TARGET/debug")"
file "$outdir/rg.bash"
file "$outdir/rg.fish"
file "$outdir/_rg.ps1"
file "$outdir/rg.1"
# Apparently tests don't work on arm, so just bail now. I guess we provide
# ARM releases on a best effort basis?
if is_arm; then
return 0
fi
# Test that zsh completions are in sync with ripgrep's actual args.
"$(dirname "${0}")/test_complete.sh"
# Run tests for ripgrep and all sub-crates.
cargo test --target "$TARGET" --verbose --all
}
main


@@ -18,8 +18,8 @@ get_comp_args() {
main() {
local diff
-local rg="${0:a:h}/../target/${TARGET:-}/release/rg"
-local _rg="${0:a:h}/../complete/_rg"
+local rg="${0:a:h}/../${TARGET_DIR:-target}/release/rg"
+local _rg="${0:a:h}/../crates/core/flags/complete/rg.zsh"
local -a help_args comp_args
[[ -e $rg ]] || rg=${rg/%\/release\/rg/\/debug\/rg}
@@ -39,12 +39,14 @@ main() {
print -rl - 'Comparing options:' "-$rg" "+$_rg"
# 'Parse' options out of the `--help` output. To prevent false positives we
-# only look at lines where the first non-white-space character is `-`
+# only look at lines where the first non-white-space character is `-`, or
+# where a long option starting with certain letters (see `_rg`) is found.
+# Occasionally we may have to handle some manually, however
help_args=( ${(f)"$(
$rg --help |
-$rg -- '^\s*-' |
-$rg -io -- '[\t ,](-[a-z0-9]|--[a-z0-9-]+)\b' |
-tr -d '\t ,' |
+$rg -i -- '^\s+--?[a-z0-9.]|--[a-z]' |
+$rg -ior '$1' -- $'[\t /\"\'`.,](-[a-z0-9.]|--[a-z0-9-]+)(,|\\b)' |
+$rg -v -- --print0 | # False positives
sort -u
)"} )
@@ -58,8 +60,6 @@ main() {
comp_args=( ${comp_args%%-[:[]*} ) # Strip everything after -optname-
comp_args=( ${comp_args%%[:+=[]*} ) # Strip everything after other optspecs
comp_args=( ${comp_args##[^-]*} ) # Remove non-options
-# This probably isn't necessary, but we should ensure the same order
-comp_args=( ${(f)"$( print -rl - $comp_args | sort -u )"} )
(( $#help_args )) || {

14
ci/ubuntu-install-packages Executable file

@@ -0,0 +1,14 @@
#!/bin/sh
# This script gets run in weird environments that have been stripped of just
# about every inessential thing. In order to keep this script versatile, we
# install 'sudo' when it is missing and then use it like normal. When 'sudo'
# is missing, we assume we're already root. (Otherwise we couldn't do much of
# anything anyway.)
if ! command -V sudo; then
apt-get update
apt-get install -y --no-install-recommends sudo
fi
sudo apt-get update
sudo apt-get install -y --no-install-recommends \
zsh xz-utils liblz4-tool musl-tools brotli zstd


@@ -55,10 +55,10 @@ gcc_prefix() {
esac
}
-is_ssse3_target() {
-case "$(architecture)" in
-amd64) return 0 ;;
-*) return 1 ;;
+is_musl() {
+case "$TARGET" in
+*-musl) return 0 ;;
+*) return 1 ;;
esac
}
@@ -69,6 +69,13 @@ is_x86() {
esac
}
+is_x86_64() {
+case "$(architecture)" in
+amd64) return 0 ;;
+*) return 1 ;;
+esac
+}
is_arm() {
case "$(architecture)" in
armhf) return 0 ;;
@@ -89,3 +96,12 @@ is_osx() {
*) return 1 ;;
esac
}
+builder() {
+if is_musl && is_x86_64; then
+cargo install cross
+echo "cross"
+else
+echo "cargo"
+fi
+}


@@ -1,377 +0,0 @@
#compdef rg
##
# zsh completion function for ripgrep
#
# Run ci/test_complete.sh after building to ensure that the options supported by
# this function stay in synch with the `rg` binary.
#
# @see http://zsh.sourceforge.net/Doc/Release/Completion-System.html
# @see https://github.com/zsh-users/zsh/blob/master/Etc/completion-style-guide
#
# Originally based on code from the zsh-users project — see copyright notice
# below.
_rg() {
local curcontext=$curcontext no='!' descr ret=1
local -a context line state state_descr args tmp suf
local -A opt_args
# ripgrep has many options which negate the effect of a more common one — for
# example, `--no-column` to negate `--column`, and `--messages` to negate
# `--no-messages`. There are so many of these, and they're so infrequently
# used, that some users will probably find it irritating if they're completed
# indiscriminately, so let's not do that unless either the current prefix
# matches one of those negation options or the user has the `complete-all`
# style set. Note that this prefix check has to be updated manually to account
# for all of the potential negation options listed below!
if
# (--[imn]* => --ignore*, --messages, --no-*)
[[ $PREFIX$SUFFIX == --[imn]* ]] ||
zstyle -t ":complete:$curcontext:*" complete-all
then
no=
fi
# We make heavy use of argument groups here to prevent the option specs from
# growing unwieldy. These aren't supported in zsh <5.4, though, so we'll strip
# them out below if necessary. This makes the exclusions inaccurate on those
# older versions, but oh well — it's not that big a deal
args=(
+ '(exclusive)' # Misc. fully exclusive options
'(: * -)'{-h,--help}'[display help information]'
'(: * -)'{-V,--version}'[display version information]'
+ '(case)' # Case-sensitivity options
{-i,--ignore-case}'[search case-insensitively]'
{-s,--case-sensitive}'[search case-sensitively]'
{-S,--smart-case}'[search case-insensitively if pattern is all lowercase]'
+ '(context-a)' # Context (after) options
'(context-c)'{-A+,--after-context=}'[specify lines to show after each match]:number of lines'
+ '(context-b)' # Context (before) options
'(context-c)'{-B+,--before-context=}'[specify lines to show before each match]:number of lines'
+ '(context-c)' # Context (combined) options
'(context-a context-b)'{-C+,--context=}'[specify lines to show before and after each match]:number of lines'
+ '(column)' # Column options
'--column[show column numbers for matches]'
$no"--no-column[don't show column numbers for matches]"
+ '(count)' # Counting options
'(passthru)'{-c,--count}'[only show count of matching lines for each file]'
'(passthru)--count-matches[only show count of individual matches for each file]'
+ file # File-input options
'*'{-f+,--file=}'[specify file containing patterns to search for]: :_files'
+ '(file-match)' # Files with/without match options
'(stats)'{-l,--files-with-matches}'[only show names of files with matches]'
'(stats)--files-without-match[only show names of files without matches]'
+ '(file-name)' # File-name options
{-H,--with-filename}'[show file name for matches]'
"--no-filename[don't show file name for matches]"
+ '(fixed)' # Fixed-string options
{-F,--fixed-strings}'[treat pattern as literal string instead of regular expression]'
$no"--no-fixed-strings[don't treat pattern as literal string]"
+ '(follow)' # Symlink-following options
{-L,--follow}'[follow symlinks]'
$no"--no-follow[don't follow symlinks]"
+ glob # File-glob options
'*'{-g+,--glob=}'[include/exclude files matching specified glob]:glob'
'*--iglob=[include/exclude files matching specified case-insensitive glob]:glob'
+ '(heading)' # Heading options
'(pretty-vimgrep)--heading[show matches grouped by file name]'
"(pretty-vimgrep)--no-heading[don't show matches grouped by file name]"
+ '(hidden)' # Hidden-file options
'--hidden[search hidden files and directories]'
$no"--no-hidden[don't search hidden files and directories]"
+ '(ignore)' # Ignore-file options
"(--no-ignore-global --no-ignore-parent --no-ignore-vcs)--no-ignore[don't respect ignore files]"
$no'(--ignore-global --ignore-parent --ignore-vcs)--ignore[respect ignore files]'
+ '(ignore-global)' # Global ignore-file options
"--no-ignore-global[don't respect global ignore files]"
$no'--ignore-global[respect global ignore files]'
+ '(ignore-parent)' # Parent ignore-file options
"--no-ignore-parent[don't respect ignore files in parent directories]"
$no'--ignore-parent[respect ignore files in parent directories]'
+ '(ignore-vcs)' # VCS ignore-file options
"--no-ignore-vcs[don't respect version control ignore files]"
$no'--ignore-vcs[respect version control ignore files]'
+ '(line)' # Line-number options
{-n,--line-number}'[show line numbers for matches]'
{-N,--no-line-number}"[don't show line numbers for matches]"
+ '(max-depth)' # Directory-depth options
'--max-depth=[specify max number of directories to descend]:number of directories'
'!--maxdepth=:number of directories'
+ '(messages)' # Error-message options
'(--no-ignore-messages)--no-messages[suppress some error messages]'
$no"--messages[don't suppress error messages affected by --no-messages]"
+ '(messages-ignore)' # Ignore-error message options
"--no-ignore-messages[don't show ignore-file parse error messages]"
$no'--ignore-messages[show ignore-file parse error messages]'
+ '(mmap)' # mmap options
'--mmap[search using memory maps when possible]'
"--no-mmap[don't search using memory maps]"
+ '(only)' # Only-match options
'(passthru replace)'{-o,--only-matching}'[show only matching part of each line]'
+ '(passthru)' # Pass-through options
'(--vimgrep count only replace)--passthru[show both matching and non-matching lines]'
'!(--vimgrep count only replace)--passthrough'
+ '(pre)' # Preprocessing options
'(-z --search-zip)--pre=[specify preprocessor utility]:preprocessor utility:_command_names -e'
$no'--no-pre[disable preprocessor utility]'
+ '(pretty-vimgrep)' # Pretty/vimgrep display options
'(heading)'{-p,--pretty}'[alias for --color=always --heading -n]'
'(heading passthru)--vimgrep[show results in vim-compatible format]'
+ regexp # Explicit pattern options
'(1 file)*'{-e+,--regexp=}'[specify pattern]:pattern'
+ '(replace)' # Replacement options
'(count only passthru)'{-r+,--replace=}'[specify string used to replace matches]:replace string'
+ '(sort)' # File-sorting options
'(threads)--sort-files[sort results by file path (disables parallelism)]'
$no"--no-sort-files[don't sort results by file path]"
+ stats # Statistics options
'(--files file-match)--stats[show search statistics]'
+ '(text)' # Binary-search options
{-a,--text}'[search binary files as if they were text]'
$no"--no-text[don't search binary files as if they were text]"
+ '(threads)' # Thread-count options
'(--sort-files)'{-j+,--threads=}'[specify approximate number of threads to use]:number of threads'
+ type # Type options
'*'{-t+,--type=}'[only search files matching specified type]: :_rg_types'
'*--type-add=[add new glob for specified file type]: :->typespec'
'*--type-clear=[clear globs previously defined for specified file type]: :_rg_types'
# This should actually be exclusive with everything but other type options
'(: *)--type-list[show all supported file types and their associated globs]'
'*'{-T+,--type-not=}"[don't search files matching specified file type]: :_rg_types"
+ '(word-line)' # Whole-word/line match options
{-w,--word-regexp}'[only show matches surrounded by word boundaries]'
{-x,--line-regexp}'[only show matches surrounded by line boundaries]'
+ '(zip)' # Compression options
'(--pre)'{-z,--search-zip}'[search in compressed files]'
$no"--no-search-zip[don't search in compressed files]"
+ misc # Other options — no need to separate these at the moment
'(-b --byte-offset)'{-b,--byte-offset}'[show 0-based byte offset for each matching line]'
'--color=[specify when to use colors in output]:when:((
never\:"never use colors"
auto\:"use colors or not based on stdout, TERM, etc."
always\:"always use colors"
ansi\:"always use ANSI colors (even on Windows)"
))'
'*--colors=[specify color and style settings]: :->colorspec'
'--context-separator=[specify string used to separate non-continuous context lines in output]:separator'
'--debug[show debug messages]'
'--dfa-size-limit=[specify upper size limit of generated DFA]:DFA size (bytes)'
'(-E --encoding)'{-E+,--encoding=}'[specify text encoding of files to search]: :_rg_encodings'
"(1 stats)--files[show each file that would be searched (but don't search)]"
'*--ignore-file=[specify additional ignore file]:ignore file:_files'
'(-v --invert-match)'{-v,--invert-match}'[invert matching]'
'(-M --max-columns)'{-M+,--max-columns=}'[specify max length of lines to print]:number of bytes'
'(-m --max-count)'{-m+,--max-count=}'[specify max number of matches per file]:number of matches'
'--max-filesize=[specify size above which files should be ignored]:file size (bytes)'
"--no-config[don't load configuration files]"
'(-0 --null)'{-0,--null}'[print NUL byte after file names]'
'--path-separator=[specify path separator to use when printing file names]:separator'
'(-q --quiet)'{-q,--quiet}'[suppress normal output]'
'--regex-size-limit=[specify upper size limit of compiled regex]:regex size (bytes)'
'*'{-u,--unrestricted}'[reduce level of "smart" searching]'
+ operand # Operands
'(--files --type-list file regexp)1: :_guard "^-*" pattern'
'(--type-list)*: :_files'
)
# This is used with test_complete.sh to verify that there are no options
# listed in the help output that aren't also defined here
[[ $_RG_COMPLETE_LIST_ARGS == (1|t*|y*) ]] && {
print -rl - $args
return 0
}
# Strip out argument groups where unsupported (see above)
[[ $ZSH_VERSION == (4|5.<0-3>)(.*)# ]] &&
args=( ${(@)args:#(#i)(+|[a-z0-9][a-z0-9_-]#|\([a-z0-9][a-z0-9_-]#\))} )
_arguments -C -s -S : $args && ret=0
case $state in
colorspec)
if [[ ${IPREFIX#--*=}$PREFIX == [^:]# ]]; then
suf=( -qS: )
tmp=(
'column:specify coloring for column numbers'
'line:specify coloring for line numbers'
'match:specify coloring for match text'
'path:specify coloring for file names'
)
descr='color/style type'
elif [[ ${IPREFIX#--*=}$PREFIX == (column|line|match|path):[^:]# ]]; then
suf=( -qS: )
tmp=(
'none:clear color/style for type'
'bg:specify background color'
'fg:specify foreground color'
'style:specify text style'
)
descr='color/style attribute'
elif [[ ${IPREFIX#--*=}$PREFIX == [^:]##:(bg|fg):[^:]# ]]; then
tmp=( black blue green red cyan magenta yellow white )
descr='color name or r,g,b'
elif [[ ${IPREFIX#--*=}$PREFIX == [^:]##:style:[^:]# ]]; then
tmp=( {,no}bold {,no}intense {,no}underline )
descr='style name'
else
_message -e colorspec 'no more arguments'
fi
(( $#tmp )) && {
compset -P '*:'
_describe -t colorspec $descr tmp $suf && ret=0
}
;;
typespec)
if compset -P '[^:]##:include:'; then
_sequence -s , _rg_types && ret=0
# @todo This bit in particular could be better, but it's a little
# complex, and attempting to solve it seems to run us up against a crash
# bug — zsh # 40362
elif compset -P '[^:]##:'; then
_message 'glob or include directive' && ret=1
elif [[ ! -prefix *:* ]]; then
_rg_types -qS : && ret=0
fi
;;
esac
return ret
}
# Complete encodings
_rg_encodings() {
local -a expl
local -aU _encodings
# This is impossible to read, but these encodings rarely if ever change, so it
# probably doesn't matter. They are derived from the list given here:
# https://encoding.spec.whatwg.org/#concept-encoding-get
_encodings=(
{{,us-}ascii,arabic,chinese,cyrillic,greek{,8},hebrew,korean}
logical visual mac {,cs}macintosh x-mac-{cyrillic,roman,ukrainian}
866 ibm{819,866} csibm866
big5{,-hkscs} {cn-,cs}big5 x-x-big5
cp{819,866,125{0..8}} x-cp125{0..8}
csiso2022{jp,kr} csiso8859{6,8}{e,i}
csisolatin{{1..6},9} csisolatin{arabic,cyrillic,greek,hebrew}
ecma-{114,118} asmo-708 elot_928 sun_eu_greek
euc-{jp,kr} x-euc-jp cseuckr cseucpkdfmtjapanese
{,x-}gbk csiso58gb231280 gb18030 {,cs}gb2312 gb_2312{,-80} hz-gb-2312
iso-2022-{cn,cn-ext,jp,kr}
iso8859{,-}{{1..11},13,14,15}
iso-8859-{{1..11},{6,8}-{e,i},13,14,15,16} iso_8859-{{1..9},15}
iso_8859-{1,2,6,7}:1987 iso_8859-{3,4,5,8}:1988 iso_8859-9:1989
iso-ir-{58,100,101,109,110,126,127,138,144,148,149,157}
koi{,8,8-r,8-ru,8-u,8_r} cskoi8r
ks_c_5601-{1987,1989} ksc{,_}5691 csksc56011987
latin{1..6} l{{1..6},9}
shift{-,_}jis csshiftjis {,x-}sjis ms_kanji ms932
utf{,-}8 utf-16{,be,le} unicode-1-1-utf-8
windows-{31j,874,949,125{0..8}} dos-874 tis-620 ansi_x3.4-1968
x-user-defined auto
)
_wanted encodings expl encoding compadd -a "$@" - _encodings
}
# Complete file types
_rg_types() {
local -a expl
local -aU _types
_types=( ${(@)${(f)"$( _call_program types rg --type-list )"}%%:*} )
_wanted types expl 'file type' compadd -a "$@" - _types
}
_rg "$@"
# ------------------------------------------------------------------------------
# Copyright (c) 2011 Github zsh-users - http://github.com/zsh-users
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are met:
# * Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# * Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
# * Neither the name of the zsh-users nor the
# names of its contributors may be used to endorse or promote products
# derived from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
# WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
# DISCLAIMED. IN NO EVENT SHALL ZSH-USERS BE LIABLE FOR ANY
# DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
# (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
# LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
# ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
# SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
# ------------------------------------------------------------------------------
# Description
# -----------
#
# Completion script for ripgrep
#
# ------------------------------------------------------------------------------
# Authors
# -------
#
# * arcizan <ghostrevery@gmail.com>
# * MaskRay <i@maskray.me>
#
# ------------------------------------------------------------------------------
# Local Variables:
# mode: shell-script
# coding: utf-8-unix
# indent-tabs-mode: nil
# sh-indentation: 2
# sh-basic-offset: 2
# End:
# vim: ft=zsh sw=2 ts=2 et

26
crates/cli/Cargo.toml Normal file

@@ -0,0 +1,26 @@
[package]
name = "grep-cli"
version = "0.1.10" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
Utilities for search oriented command line applications.
"""
documentation = "https://docs.rs/grep-cli"
homepage = "https://github.com/BurntSushi/ripgrep/tree/master/crates/cli"
repository = "https://github.com/BurntSushi/ripgrep/tree/master/crates/cli"
readme = "README.md"
keywords = ["regex", "grep", "cli", "utility", "util"]
license = "Unlicense OR MIT"
edition = "2021"
[dependencies]
bstr = { version = "1.6.2", features = ["std"] }
globset = { version = "0.4.14", path = "../globset" }
log = "0.4.20"
termcolor = "1.3.0"
[target.'cfg(windows)'.dependencies.winapi-util]
version = "0.1.6"
[target.'cfg(unix)'.dependencies.libc]
version = "0.2.148"

31
crates/cli/README.md Normal file

@@ -0,0 +1,31 @@
grep-cli
--------
A utility library that provides common routines desired in search oriented
command line applications. This includes, but is not limited to, parsing hex
escapes, detecting whether stdin is readable and more. To the extent possible,
this crate strives for compatibility across Windows, macOS and Linux.
[![Build status](https://github.com/BurntSushi/ripgrep/workflows/ci/badge.svg)](https://github.com/BurntSushi/ripgrep/actions)
[![](https://img.shields.io/crates/v/grep-cli.svg)](https://crates.io/crates/grep-cli)
Dual-licensed under MIT or the [UNLICENSE](https://unlicense.org/).
### Documentation
[https://docs.rs/grep-cli](https://docs.rs/grep-cli)
**NOTE:** You probably don't want to use this crate directly. Instead, you
should prefer the facade defined in the
[`grep`](https://docs.rs/grep)
crate.
### Usage
Add this to your `Cargo.toml`:
```toml
[dependencies]
grep-cli = "0.1"
```
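
The description above mentions parsing hex escapes. As a rough illustration of what that means, here is a minimal, standard-library-only sketch. It is not this crate's implementation: the function name `unescape_hex` is hypothetical, and the choice to leave malformed escapes untouched is an assumption made for the example.

```rust
/// Hypothetical helper (not part of grep-cli): expands `\xHH` escapes in `s`
/// into raw bytes and copies every other byte through unchanged.
fn unescape_hex(s: &str) -> Vec<u8> {
    let bytes = s.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        // A full escape needs four bytes: `\`, `x` and two hex digits.
        if i + 3 < bytes.len() && bytes[i] == b'\\' && bytes[i + 1] == b'x' {
            let hi = (bytes[i + 2] as char).to_digit(16);
            let lo = (bytes[i + 3] as char).to_digit(16);
            if let (Some(hi), Some(lo)) = (hi, lo) {
                out.push((hi * 16 + lo) as u8);
                i += 4;
                continue;
            }
        }
        // Not a well-formed escape: keep the byte as-is (an assumption of
        // this sketch, not a statement about grep-cli's behavior).
        out.push(bytes[i]);
        i += 1;
    }
    out
}

fn main() {
    // A pattern typed on the command line can only contain text, so escapes
    // are a way to spell arbitrary bytes such as NUL or 0xFF.
    let raw = unescape_hex(r"foo\x00bar\xFF");
    println!("{:?}", raw);
}
```

A command line tool would typically run a step like this over a user-supplied pattern before handing the resulting bytes to its matcher.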


@@ -0,0 +1,530 @@
use std::{
ffi::{OsStr, OsString},
fs::File,
io,
path::{Path, PathBuf},
process::Command,
};
use globset::{Glob, GlobSet, GlobSetBuilder};
use crate::process::{CommandError, CommandReader, CommandReaderBuilder};
/// A builder for a matcher that determines which files get decompressed.
#[derive(Clone, Debug)]
pub struct DecompressionMatcherBuilder {
/// The commands for each matching glob.
commands: Vec<DecompressionCommand>,
/// Whether to include the default matching rules.
defaults: bool,
}
/// A representation of a single command for decompressing data
/// out-of-process.
#[derive(Clone, Debug)]
struct DecompressionCommand {
/// The glob that matches this command.
glob: String,
/// The command or binary name.
bin: PathBuf,
/// The arguments to invoke with the command.
args: Vec<OsString>,
}
impl Default for DecompressionMatcherBuilder {
fn default() -> DecompressionMatcherBuilder {
DecompressionMatcherBuilder::new()
}
}
impl DecompressionMatcherBuilder {
/// Create a new builder for configuring a decompression matcher.
pub fn new() -> DecompressionMatcherBuilder {
DecompressionMatcherBuilder { commands: vec![], defaults: true }
}
/// Build a matcher for determining how to decompress files.
///
/// If there was a problem compiling the matcher, then an error is
/// returned.
pub fn build(&self) -> Result<DecompressionMatcher, CommandError> {
let defaults = if !self.defaults {
vec![]
} else {
default_decompression_commands()
};
let mut glob_builder = GlobSetBuilder::new();
let mut commands = vec![];
for decomp_cmd in defaults.iter().chain(&self.commands) {
let glob = Glob::new(&decomp_cmd.glob).map_err(|err| {
CommandError::io(io::Error::new(io::ErrorKind::Other, err))
})?;
glob_builder.add(glob);
commands.push(decomp_cmd.clone());
}
let globs = glob_builder.build().map_err(|err| {
CommandError::io(io::Error::new(io::ErrorKind::Other, err))
})?;
Ok(DecompressionMatcher { globs, commands })
}
/// When enabled, the default matching rules will be compiled into this
/// matcher before any other associations. When disabled, only the
/// rules explicitly given to this builder will be used.
///
/// This is enabled by default.
pub fn defaults(&mut self, yes: bool) -> &mut DecompressionMatcherBuilder {
self.defaults = yes;
self
}
/// Associates a glob with a command to decompress files matching the glob.
///
/// If multiple globs match the same file, then the most recently added
/// glob takes precedence.
///
/// The syntax for the glob is documented in the
/// [`globset` crate](https://docs.rs/globset/#syntax).
///
/// The `program` given is resolved with respect to `PATH` and turned
/// into an absolute path internally before being executed by the current
/// platform. Notably, on Windows, this avoids a security problem where
/// passing a relative path to `CreateProcess` will automatically search
/// the current directory for a matching program. If the program could
/// not be resolved, then it is silently ignored and the association is
/// dropped. For this reason, callers should prefer `try_associate`.
pub fn associate<P, I, A>(
&mut self,
glob: &str,
program: P,
args: I,
) -> &mut DecompressionMatcherBuilder
where
P: AsRef<OsStr>,
I: IntoIterator<Item = A>,
A: AsRef<OsStr>,
{
let _ = self.try_associate(glob, program, args);
self
}
/// Associates a glob with a command to decompress files matching the glob.
///
/// If multiple globs match the same file, then the most recently added
/// glob takes precedence.
///
/// The syntax for the glob is documented in the
/// [`globset` crate](https://docs.rs/globset/#syntax).
///
/// The `program` given is resolved with respect to `PATH` and turned
/// into an absolute path internally before being executed by the current
/// platform. Notably, on Windows, this avoids a security problem where
/// passing a relative path to `CreateProcess` will automatically search
/// the current directory for a matching program. If the program could not
/// be resolved, then an error is returned.
pub fn try_associate<P, I, A>(
&mut self,
glob: &str,
program: P,
args: I,
) -> Result<&mut DecompressionMatcherBuilder, CommandError>
where
P: AsRef<OsStr>,
I: IntoIterator<Item = A>,
A: AsRef<OsStr>,
{
let glob = glob.to_string();
let bin = try_resolve_binary(Path::new(program.as_ref()))?;
let args =
args.into_iter().map(|a| a.as_ref().to_os_string()).collect();
self.commands.push(DecompressionCommand { glob, bin, args });
Ok(self)
}
}
/// A matcher for determining how to decompress files.
#[derive(Clone, Debug)]
pub struct DecompressionMatcher {
/// The set of globs to match. Each glob has a corresponding entry in
/// `commands`. When a glob matches, the corresponding command should be
/// used to perform out-of-process decompression.
globs: GlobSet,
/// The commands for each matching glob.
commands: Vec<DecompressionCommand>,
}
impl Default for DecompressionMatcher {
fn default() -> DecompressionMatcher {
DecompressionMatcher::new()
}
}
impl DecompressionMatcher {
/// Create a new matcher with default rules.
///
/// To add more matching rules, build a matcher with
/// [`DecompressionMatcherBuilder`].
pub fn new() -> DecompressionMatcher {
DecompressionMatcherBuilder::new()
.build()
.expect("built-in matching rules should always compile")
}
/// Return a pre-built command based on the given file path that can
/// decompress its contents. If no such decompressor is known, then this
/// returns `None`.
///
/// If there are multiple possible commands matching the given path, then
/// the command added last takes precedence.
pub fn command<P: AsRef<Path>>(&self, path: P) -> Option<Command> {
for i in self.globs.matches(path).into_iter().rev() {
let decomp_cmd = &self.commands[i];
let mut cmd = Command::new(&decomp_cmd.bin);
cmd.args(&decomp_cmd.args);
return Some(cmd);
}
None
}
/// Returns true if and only if the given file path has at least one
/// matching command to perform decompression on.
pub fn has_command<P: AsRef<Path>>(&self, path: P) -> bool {
self.globs.is_match(path)
}
}
/// Configures and builds a streaming reader for decompressing data.
#[derive(Clone, Debug, Default)]
pub struct DecompressionReaderBuilder {
matcher: DecompressionMatcher,
command_builder: CommandReaderBuilder,
}
impl DecompressionReaderBuilder {
/// Create a new builder with the default configuration.
pub fn new() -> DecompressionReaderBuilder {
DecompressionReaderBuilder::default()
}
/// Build a new streaming reader for decompressing data.
///
/// If decompression is done out-of-process and if there was a problem
/// spawning the process, then its error is logged at the debug level and a
/// passthru reader is returned that does no decompression. This behavior
/// typically occurs when the given file path matches a decompression
/// command, but is executing in an environment where the decompression
/// command is not available.
///
/// If the given file path could not be matched with a decompression
/// strategy, then a passthru reader is returned that does no
/// decompression.
pub fn build<P: AsRef<Path>>(
&self,
path: P,
) -> Result<DecompressionReader, CommandError> {
let path = path.as_ref();
let Some(mut cmd) = self.matcher.command(path) else {
return DecompressionReader::new_passthru(path);
};
cmd.arg(path);
match self.command_builder.build(&mut cmd) {
Ok(cmd_reader) => Ok(DecompressionReader { rdr: Ok(cmd_reader) }),
Err(err) => {
log::debug!(
"{}: error spawning command '{:?}': {} \
(falling back to uncompressed reader)",
path.display(),
cmd,
err,
);
DecompressionReader::new_passthru(path)
}
}
}
/// Set the matcher to use to look up the decompression command for each
/// file path.
///
/// A set of sensible rules is enabled by default. Setting this will
/// completely replace the current rules.
pub fn matcher(
&mut self,
matcher: DecompressionMatcher,
) -> &mut DecompressionReaderBuilder {
self.matcher = matcher;
self
}
/// Get the underlying matcher currently used by this builder.
pub fn get_matcher(&self) -> &DecompressionMatcher {
&self.matcher
}
/// When enabled, the reader will asynchronously read the contents of the
/// command's stderr output. When disabled, stderr is only read after the
/// stdout stream has been exhausted (or if the process quits with an error
/// code).
///
/// Note that when enabled, this may require launching an additional
/// thread in order to read stderr. This is done so that the process being
/// executed is never blocked from writing to stdout or stderr. If this is
/// disabled, then it is possible for the process to fill up the stderr
/// buffer and deadlock.
///
/// This is enabled by default.
pub fn async_stderr(
&mut self,
yes: bool,
) -> &mut DecompressionReaderBuilder {
self.command_builder.async_stderr(yes);
self
}
}
/// A streaming reader for decompressing the contents of a file.
///
/// The purpose of this reader is to provide a seamless way to decompress the
/// contents of a file using existing tools in the current environment. This is
/// meant to be an alternative to using decompression libraries in favor of the
/// simplicity and portability of using external commands such as `gzip` and
/// `xz`. This does impose the overhead of spawning a process, so other means
/// for performing decompression should be sought if this overhead isn't
/// acceptable.
///
/// A decompression reader comes with a default set of matching rules that are
/// meant to associate file paths with the corresponding command to use to
/// decompress them. For example, a glob like `*.gz` matches gzip compressed
/// files with the command `gzip -d -c`. If a file path does not match any
/// existing rules, or if it matches a rule whose command does not exist in the
/// current environment, then the decompression reader passes through the
/// contents of the underlying file without doing any decompression.
///
/// The default matching rules are probably good enough for most cases, and if
/// they require revision, pull requests are welcome. In cases where they must
/// be changed or extended, they can be customized through the use of
/// [`DecompressionMatcherBuilder`] and [`DecompressionReaderBuilder`].
///
/// By default, this reader will asynchronously read the process's stderr.
/// This prevents subtle deadlocking bugs for noisy processes that write a lot
/// to stderr. Currently, the entire contents of stderr is read onto the heap.
///
/// # Example
///
/// This example shows how to read the decompressed contents of a file without
/// needing to explicitly choose the decompression command to run.
///
/// Note that if you need to decompress multiple files, it is better to use
/// `DecompressionReaderBuilder`, which will amortize the cost of compiling the
/// matcher.
///
/// ```no_run
/// use std::io::Read;
///
/// use grep_cli::DecompressionReader;
///
/// let mut rdr = DecompressionReader::new("/usr/share/man/man1/ls.1.gz")?;
/// let mut contents = vec![];
/// rdr.read_to_end(&mut contents)?;
/// # Ok::<(), Box<dyn std::error::Error>>(())
/// ```
#[derive(Debug)]
pub struct DecompressionReader {
rdr: Result<CommandReader, File>,
}
impl DecompressionReader {
/// Build a new streaming reader for decompressing data.
///
/// If decompression is done out-of-process and if there was a problem
/// spawning the process, then its error is returned.
///
/// If the given file path could not be matched with a decompression
/// strategy, then a passthru reader is returned that does no
/// decompression.
///
/// This uses the default matching rules for determining how to decompress
/// the given file. To change those matching rules, use
/// [`DecompressionReaderBuilder`] and [`DecompressionMatcherBuilder`].
///
/// When creating readers for many paths, it is better to use the builder
/// since it will amortize the cost of constructing the matcher.
pub fn new<P: AsRef<Path>>(
path: P,
) -> Result<DecompressionReader, CommandError> {
DecompressionReaderBuilder::new().build(path)
}
/// Creates a new "passthru" decompression reader that reads from the file
/// corresponding to the given path without doing decompression and without
/// executing another process.
fn new_passthru(path: &Path) -> Result<DecompressionReader, CommandError> {
let file = File::open(path)?;
Ok(DecompressionReader { rdr: Err(file) })
}
/// Closes this reader, freeing any resources used by its underlying child
/// process, if one was used. If the child process exits with a nonzero
/// exit code, the returned Err value will include its stderr.
///
/// `close` is idempotent, meaning it can be safely called multiple times.
/// The first call closes the CommandReader and any subsequent calls do
/// nothing.
///
/// This method should be called after partially reading a file to prevent
/// resource leakage. However, there is no need to call `close` explicitly
/// if your code always calls `read` to EOF, as `read` takes care of
/// calling `close` in this case.
///
/// `close` is also called in `drop` as a last line of defense against
/// resource leakage. Any error from the child process is then printed as a
/// warning to stderr. This can be avoided by explicitly calling `close`
/// before the CommandReader is dropped.
pub fn close(&mut self) -> io::Result<()> {
match self.rdr {
Ok(ref mut rdr) => rdr.close(),
Err(_) => Ok(()),
}
}
}
impl io::Read for DecompressionReader {
fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
match self.rdr {
Ok(ref mut rdr) => rdr.read(buf),
Err(ref mut rdr) => rdr.read(buf),
}
}
}
/// Resolves a program name to a path by searching for the program in
/// `PATH`.
///
/// If the program could not be resolved, then an error is returned.
///
/// The purpose of doing this instead of passing the path to the program
/// directly to Command::new is that Command::new will hand relative paths
/// to CreateProcess on Windows, which will implicitly search the current
/// working directory for the executable. This could be undesirable for
/// security reasons. e.g., running ripgrep with the -z/--search-zip flag on an
/// untrusted directory tree could result in arbitrary programs executing on
/// Windows.
///
/// Note that this could still return a relative path if PATH contains a
/// relative path. We permit this since it is assumed that the user has set
/// this explicitly, and thus, desires this behavior.
///
/// On non-Windows, this is a no-op.
pub fn resolve_binary<P: AsRef<Path>>(
prog: P,
) -> Result<PathBuf, CommandError> {
if !cfg!(windows) {
return Ok(prog.as_ref().to_path_buf());
}
try_resolve_binary(prog)
}
/// Resolves a program name to a path by searching for the program in
/// `PATH`.
///
/// If the program could not be resolved, then an error is returned.
///
/// The purpose of doing this instead of passing the path to the program
/// directly to Command::new is that Command::new will hand relative paths
/// to CreateProcess on Windows, which will implicitly search the current
/// working directory for the executable. This could be undesirable for
/// security reasons. e.g., running ripgrep with the -z/--search-zip flag on an
/// untrusted directory tree could result in arbitrary programs executing on
/// Windows.
///
/// Note that this could still return a relative path if PATH contains a
/// relative path. We permit this since it is assumed that the user has set
/// this explicitly, and thus, desires this behavior.
///
/// If the path is already an absolute path, then it is returned as is
/// without searching `PATH`.
fn try_resolve_binary<P: AsRef<Path>>(
prog: P,
) -> Result<PathBuf, CommandError> {
use std::env;
fn is_exe(path: &Path) -> bool {
let Ok(md) = path.metadata() else { return false };
!md.is_dir()
}
let prog = prog.as_ref();
if prog.is_absolute() {
return Ok(prog.to_path_buf());
}
let Some(syspaths) = env::var_os("PATH") else {
let msg = "system PATH environment variable not found";
return Err(CommandError::io(io::Error::new(
io::ErrorKind::Other,
msg,
)));
};
for syspath in env::split_paths(&syspaths) {
if syspath.as_os_str().is_empty() {
continue;
}
let abs_prog = syspath.join(prog);
if is_exe(&abs_prog) {
return Ok(abs_prog.to_path_buf());
}
if abs_prog.extension().is_none() {
for extension in ["com", "exe"] {
let abs_prog = abs_prog.with_extension(extension);
if is_exe(&abs_prog) {
return Ok(abs_prog.to_path_buf());
}
}
}
}
let msg = format!("{}: could not find executable in PATH", prog.display());
return Err(CommandError::io(io::Error::new(io::ErrorKind::Other, msg)));
}
fn default_decompression_commands() -> Vec<DecompressionCommand> {
const ARGS_GZIP: &[&str] = &["gzip", "-d", "-c"];
const ARGS_BZIP: &[&str] = &["bzip2", "-d", "-c"];
const ARGS_XZ: &[&str] = &["xz", "-d", "-c"];
const ARGS_LZ4: &[&str] = &["lz4", "-d", "-c"];
const ARGS_LZMA: &[&str] = &["xz", "--format=lzma", "-d", "-c"];
const ARGS_BROTLI: &[&str] = &["brotli", "-d", "-c"];
const ARGS_ZSTD: &[&str] = &["zstd", "-q", "-d", "-c"];
const ARGS_UNCOMPRESS: &[&str] = &["uncompress", "-c"];
fn add(glob: &str, args: &[&str], cmds: &mut Vec<DecompressionCommand>) {
let bin = match resolve_binary(Path::new(args[0])) {
Ok(bin) => bin,
Err(err) => {
log::debug!("{}", err);
return;
}
};
cmds.push(DecompressionCommand {
glob: glob.to_string(),
bin,
args: args
.iter()
.skip(1)
.map(|s| OsStr::new(s).to_os_string())
.collect(),
});
}
let mut cmds = vec![];
add("*.gz", ARGS_GZIP, &mut cmds);
add("*.tgz", ARGS_GZIP, &mut cmds);
add("*.bz2", ARGS_BZIP, &mut cmds);
add("*.tbz2", ARGS_BZIP, &mut cmds);
add("*.xz", ARGS_XZ, &mut cmds);
add("*.txz", ARGS_XZ, &mut cmds);
add("*.lz4", ARGS_LZ4, &mut cmds);
add("*.lzma", ARGS_LZMA, &mut cmds);
add("*.br", ARGS_BROTLI, &mut cmds);
add("*.zst", ARGS_ZSTD, &mut cmds);
add("*.zstd", ARGS_ZSTD, &mut cmds);
add("*.Z", ARGS_UNCOMPRESS, &mut cmds);
cmds
}

159
crates/cli/src/escape.rs Normal file

@@ -0,0 +1,159 @@
use std::ffi::OsStr;
use bstr::{ByteSlice, ByteVec};
/// Escapes arbitrary bytes into a human readable string.
///
/// This converts `\t`, `\r` and `\n` into their escaped forms. It also
/// converts the non-printable subset of ASCII in addition to invalid UTF-8
/// bytes to hexadecimal escape sequences. Everything else is left as is.
///
/// The dual of this routine is [`unescape`].
///
/// # Example
///
/// This example shows how to convert a byte string that contains a `\n` and
/// invalid UTF-8 bytes into a `String`.
///
/// Pay special attention to the use of raw strings. That is, `r"\n"` is
/// equivalent to `"\\n"`.
///
/// ```
/// use grep_cli::escape;
///
/// assert_eq!(r"foo\nbar\xFFbaz", escape(b"foo\nbar\xFFbaz"));
/// ```
pub fn escape(bytes: &[u8]) -> String {
bytes.escape_bytes().to_string()
}
/// Escapes an OS string into a human readable string.
///
/// This is like [`escape`], but accepts an OS string.
pub fn escape_os(string: &OsStr) -> String {
escape(Vec::from_os_str_lossy(string).as_bytes())
}
/// Unescapes a string.
///
/// It supports a limited set of escape sequences:
///
/// * `\t`, `\r` and `\n` are mapped to their corresponding ASCII bytes.
/// * `\xZZ` hexadecimal escapes are mapped to their byte.
///
/// Everything else is left as is, including non-hexadecimal escapes like
/// `\xGG`.
///
/// This is useful when it is desirable for a command line argument to be
/// capable of specifying arbitrary bytes or otherwise make it easier to
/// specify non-printable characters.
///
/// The dual of this routine is [`escape`].
///
/// # Example
///
/// This example shows how to convert an escaped string (which is valid UTF-8)
/// into a corresponding sequence of bytes. Each escape sequence is mapped to
/// its bytes, which may include invalid UTF-8.
///
/// Pay special attention to the use of raw strings. That is, `r"\n"` is
/// equivalent to `"\\n"`.
///
/// ```
/// use grep_cli::unescape;
///
/// assert_eq!(&b"foo\nbar\xFFbaz"[..], &*unescape(r"foo\nbar\xFFbaz"));
/// ```
pub fn unescape(s: &str) -> Vec<u8> {
Vec::unescape_bytes(s)
}
/// Unescapes an OS string.
///
/// This is like [`unescape`], but accepts an OS string.
///
/// Note that this first lossily decodes the given OS string as UTF-8. That
/// is, an escaped string (the thing given) should be valid UTF-8.
pub fn unescape_os(string: &OsStr) -> Vec<u8> {
unescape(&string.to_string_lossy())
}
#[cfg(test)]
mod tests {
use super::{escape, unescape};
fn b(bytes: &'static [u8]) -> Vec<u8> {
bytes.to_vec()
}
#[test]
fn empty() {
assert_eq!(b(b""), unescape(r""));
assert_eq!(r"", escape(b""));
}
#[test]
fn backslash() {
assert_eq!(b(b"\\"), unescape(r"\\"));
assert_eq!(r"\\", escape(b"\\"));
}
#[test]
fn nul() {
assert_eq!(b(b"\x00"), unescape(r"\x00"));
assert_eq!(b(b"\x00"), unescape(r"\0"));
assert_eq!(r"\0", escape(b"\x00"));
}
#[test]
fn nl() {
assert_eq!(b(b"\n"), unescape(r"\n"));
assert_eq!(r"\n", escape(b"\n"));
}
#[test]
fn tab() {
assert_eq!(b(b"\t"), unescape(r"\t"));
assert_eq!(r"\t", escape(b"\t"));
}
#[test]
fn carriage() {
assert_eq!(b(b"\r"), unescape(r"\r"));
assert_eq!(r"\r", escape(b"\r"));
}
#[test]
fn nothing_simple() {
assert_eq!(b(b"\\a"), unescape(r"\a"));
assert_eq!(b(b"\\a"), unescape(r"\\a"));
assert_eq!(r"\\a", escape(b"\\a"));
}
#[test]
fn nothing_hex0() {
assert_eq!(b(b"\\x"), unescape(r"\x"));
assert_eq!(b(b"\\x"), unescape(r"\\x"));
assert_eq!(r"\\x", escape(b"\\x"));
}
#[test]
fn nothing_hex1() {
assert_eq!(b(b"\\xz"), unescape(r"\xz"));
assert_eq!(b(b"\\xz"), unescape(r"\\xz"));
assert_eq!(r"\\xz", escape(b"\\xz"));
}
#[test]
fn nothing_hex2() {
assert_eq!(b(b"\\xzz"), unescape(r"\xzz"));
assert_eq!(b(b"\\xzz"), unescape(r"\\xzz"));
assert_eq!(r"\\xzz", escape(b"\\xzz"));
}
#[test]
fn invalid_utf8() {
assert_eq!(r"\xFF", escape(b"\xFF"));
assert_eq!(r"a\xFFb", escape(b"a\xFFb"));
}
}


@@ -0,0 +1,85 @@
use std::{ffi::OsString, io};
/// Returns the hostname of the current system.
///
/// It is unusual, although technically possible, for this routine to return
/// an error. It is difficult to list out the error conditions, but one such
/// possibility is platform support.
///
/// # Platform specific behavior
///
/// On Windows, this currently uses the "physical DNS hostname" computer name.
/// This may change in the future.
///
/// On Unix, this returns the result of the `gethostname` function from the
/// `libc` linked into the program.
pub fn hostname() -> io::Result<OsString> {
#[cfg(windows)]
{
use winapi_util::sysinfo::{get_computer_name, ComputerNameKind};
get_computer_name(ComputerNameKind::PhysicalDnsHostname)
}
#[cfg(unix)]
{
gethostname()
}
#[cfg(not(any(windows, unix)))]
{
Err(io::Error::new(
io::ErrorKind::Other,
"hostname could not be found on unsupported platform",
))
}
}
#[cfg(unix)]
fn gethostname() -> io::Result<OsString> {
use std::os::unix::ffi::OsStringExt;
// SAFETY: There don't appear to be any safety requirements for calling
// sysconf.
let limit = unsafe { libc::sysconf(libc::_SC_HOST_NAME_MAX) };
if limit == -1 {
// It is in theory possible for sysconf to return -1 for a limit but
// *not* set errno, in which case, io::Error::last_os_error is
// indeterminate. But untangling that is super annoying because std
// doesn't expose any unix-specific APIs for inspecting the errno. (We
// could do it ourselves, but it just doesn't seem worth doing?)
return Err(io::Error::last_os_error());
}
let Ok(maxlen) = usize::try_from(limit) else {
let msg = format!("host name max limit ({}) overflowed usize", limit);
return Err(io::Error::new(io::ErrorKind::Other, msg));
};
// maxlen here includes the NUL terminator.
let mut buf = vec![0; maxlen];
// SAFETY: The pointer we give is valid as it is derived directly from a
// Vec. Similarly, `maxlen` is the length of our Vec, and is thus valid
// to write to.
let rc = unsafe {
libc::gethostname(buf.as_mut_ptr().cast::<libc::c_char>(), maxlen)
};
if rc == -1 {
return Err(io::Error::last_os_error());
}
// POSIX says that if the hostname is bigger than `maxlen`, then it may
// write a truncated name back that is not necessarily NUL terminated (wtf,
// lol). So if we can't find a NUL terminator, then just give up.
let Some(zeropos) = buf.iter().position(|&b| b == 0) else {
let msg = "could not find NUL terminator in hostname";
return Err(io::Error::new(io::ErrorKind::Other, msg));
};
buf.truncate(zeropos);
buf.shrink_to_fit();
Ok(OsString::from_vec(buf))
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn print_hostname() {
println!("{:?}", hostname().unwrap());
}
}

149
crates/cli/src/human.rs Normal file

@@ -0,0 +1,149 @@
/// An error that occurs when parsing a human readable size description.
///
/// This error provides an end user friendly message describing why the
/// description couldn't be parsed and what the expected format is.
#[derive(Clone, Debug, Eq, PartialEq)]
pub struct ParseSizeError {
original: String,
kind: ParseSizeErrorKind,
}
#[derive(Clone, Debug, Eq, PartialEq)]
enum ParseSizeErrorKind {
InvalidFormat,
InvalidInt(std::num::ParseIntError),
Overflow,
}
impl ParseSizeError {
fn format(original: &str) -> ParseSizeError {
ParseSizeError {
original: original.to_string(),
kind: ParseSizeErrorKind::InvalidFormat,
}
}
fn int(original: &str, err: std::num::ParseIntError) -> ParseSizeError {
ParseSizeError {
original: original.to_string(),
kind: ParseSizeErrorKind::InvalidInt(err),
}
}
fn overflow(original: &str) -> ParseSizeError {
ParseSizeError {
original: original.to_string(),
kind: ParseSizeErrorKind::Overflow,
}
}
}
impl std::error::Error for ParseSizeError {}
impl std::fmt::Display for ParseSizeError {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
use self::ParseSizeErrorKind::*;
match self.kind {
InvalidFormat => write!(
f,
"invalid format for size '{}', which should be a non-empty \
sequence of digits followed by an optional 'K', 'M' or 'G' \
suffix",
self.original
),
InvalidInt(ref err) => write!(
f,
"invalid integer found in size '{}': {}",
self.original, err
),
Overflow => write!(f, "size too big in '{}'", self.original),
}
}
}
impl From<ParseSizeError> for std::io::Error {
fn from(size_err: ParseSizeError) -> std::io::Error {
std::io::Error::new(std::io::ErrorKind::Other, size_err)
}
}
/// Parse a human readable size like `2M` into a corresponding number of bytes.
///
/// Supported size suffixes are `K` (for kilobyte), `M` (for megabyte) and `G`
/// (for gigabyte). If a size suffix is missing, then the size is interpreted
/// as bytes. If the size is too big to fit into a `u64`, then this returns an
/// error.
///
/// Additional suffixes may be added over time.
pub fn parse_human_readable_size(size: &str) -> Result<u64, ParseSizeError> {
let digits_end =
size.as_bytes().iter().take_while(|&b| b.is_ascii_digit()).count();
let digits = &size[..digits_end];
if digits.is_empty() {
return Err(ParseSizeError::format(size));
}
let value =
digits.parse::<u64>().map_err(|e| ParseSizeError::int(size, e))?;
let suffix = &size[digits_end..];
if suffix.is_empty() {
return Ok(value);
}
let bytes = match suffix {
"K" => value.checked_mul(1 << 10),
"M" => value.checked_mul(1 << 20),
"G" => value.checked_mul(1 << 30),
_ => return Err(ParseSizeError::format(size)),
};
bytes.ok_or_else(|| ParseSizeError::overflow(size))
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn suffix_none() {
let x = parse_human_readable_size("123").unwrap();
assert_eq!(123, x);
}
#[test]
fn suffix_k() {
let x = parse_human_readable_size("123K").unwrap();
assert_eq!(123 * (1 << 10), x);
}
#[test]
fn suffix_m() {
let x = parse_human_readable_size("123M").unwrap();
assert_eq!(123 * (1 << 20), x);
}
#[test]
fn suffix_g() {
let x = parse_human_readable_size("123G").unwrap();
assert_eq!(123 * (1 << 30), x);
}
#[test]
fn invalid_empty() {
assert!(parse_human_readable_size("").is_err());
}
#[test]
fn invalid_non_digit() {
assert!(parse_human_readable_size("a").is_err());
}
#[test]
fn invalid_overflow() {
assert!(parse_human_readable_size("9999999999999999G").is_err());
}
#[test]
fn invalid_suffix() {
assert!(parse_human_readable_size("123T").is_err());
}
}

246
crates/cli/src/lib.rs Normal file

@@ -0,0 +1,246 @@
/*!
This crate provides common routines used in command line applications, with a
focus on routines useful for search oriented applications. As a utility
library, there is no central type or function. However, a key focus of this
crate is to improve failure modes and provide user friendly error messages
when things go wrong.
To the best extent possible, everything in this crate works on Windows, macOS
and Linux.
# Standard I/O
[`is_readable_stdin`] determines whether stdin can be usefully read from. It
is useful when writing an application that changes behavior based on whether
the application was invoked with data on stdin. For example, `rg foo` might
recursively search the current working directory for occurrences of `foo`, but
`rg foo < file` might only search the contents of `file`.
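
As a rough illustration of the decision described above, the sketch below
picks an input source from an optional explicit argument and a "stdin looks
readable" flag. The `Input` enum and `choose_input` function are hypothetical
names used only for this example and are not part of this crate. In a real
program the flag would plausibly come from `is_readable_stdin()`.

```rust
/// Where a search should read its input from (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
enum Input {
    Stdin,
    CurrentDir,
    Path(String),
}

/// Hypothetical helper: an explicit argument always wins, and the stdin
/// heuristic is only consulted when no argument was given.
fn choose_input(explicit: Option<&str>, stdin_readable: bool) -> Input {
    match explicit {
        // `-` is the conventional way to ask for stdin explicitly.
        Some("-") => Input::Stdin,
        Some(path) => Input::Path(path.to_string()),
        // No argument: fall back to the heuristic.
        None if stdin_readable => Input::Stdin,
        None => Input::CurrentDir,
    }
}
```

Keeping the heuristic behind an explicit override means a wrong guess about
stdin can always be corrected by the end user.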
# Coloring and buffering
The [`stdout`], [`stdout_buffered_block`] and [`stdout_buffered_line`] routines
are alternative constructors for [`StandardStream`]. A `StandardStream`
implements `termcolor::WriteColor`, which provides a way to emit colors to
terminals. Its key use is the encapsulation of buffering style. Namely,
`stdout` will return a line buffered `StandardStream` if and only if
stdout is connected to a tty, and will otherwise return a block buffered
`StandardStream`. Line buffering is important for use with a tty because it
typically decreases the latency at which the end user sees output. Block
buffering is used otherwise because it is faster, and redirecting stdout to a
file typically doesn't benefit from the decreased latency that line buffering
provides.
The `stdout_buffered_block` and `stdout_buffered_line` routines can be used to
explicitly set the buffering strategy regardless of whether stdout is connected
to a tty or not.
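
The buffering choice can be approximated with the standard library alone. The
sketch below is not how this crate is implemented and does not handle colors.
`stdout_writer` and `auto_stdout_writer` are hypothetical names that only
illustrate choosing line buffering for a tty and block buffering otherwise.

```rust
use std::io::{self, BufWriter, IsTerminal, LineWriter, Write};

/// Hypothetical helper: return a line buffered or block buffered writer
/// for stdout, depending on the flag.
fn stdout_writer(line_buffered: bool) -> Box<dyn Write> {
    let stdout = io::stdout();
    if line_buffered {
        // Flushes on every newline, which keeps latency low on a tty.
        Box::new(LineWriter::new(stdout))
    } else {
        // Flushes only when the buffer fills up, which is usually faster
        // when output is redirected to a file or a pipe.
        Box::new(BufWriter::new(stdout))
    }
}

/// Hypothetical helper: line buffered if and only if stdout is a tty.
fn auto_stdout_writer() -> Box<dyn Write> {
    stdout_writer(io::stdout().is_terminal())
}
```

Callers of the block buffered variant should flush before exiting so that
buffered output is not lost.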
# Escaping
The [`escape`](crate::escape()), [`escape_os`], [`unescape`] and
[`unescape_os`] routines provide a user friendly way of dealing with UTF-8
encoded strings that can express arbitrary bytes. For example, you might want
to accept a string containing arbitrary bytes as a command line argument, but
most interactive shells make such strings difficult to type. Instead, we can
ask users to use escape sequences.
For example, `a\xFFz` is itself a valid UTF-8 string corresponding to the
following bytes:
```ignore
[b'a', b'\\', b'x', b'F', b'F', b'z']
```
However, we can
interpret `\xFF` as an escape sequence with the `unescape`/`unescape_os`
routines, which will yield
```ignore
[b'a', b'\xFF', b'z']
```
instead. For example:
```
use grep_cli::unescape;
// Note the use of a raw string!
assert_eq!(vec![b'a', b'\xFF', b'z'], unescape(r"a\xFFz"));
```
The `escape`/`escape_os` routines provide the reverse transformation, which
makes it easy to show user friendly error messages involving arbitrary bytes.
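
To make the unescaping rules concrete, here is a std-only approximation. It is
a sketch under assumptions, not this crate's implementation (which delegates
to `bstr`), and it may differ in edge cases. `unescape_sketch` is a
hypothetical name, and its handling of `\0` and `\\` is an assumption about
the intended behavior.

```rust
/// Hypothetical approximation of the unescaping rules: `\n`, `\r`, `\t`,
/// `\0`, `\\` and `\xHH` are decoded; everything else is copied as is.
fn unescape_sketch(s: &str) -> Vec<u8> {
    let bytes = s.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        // Bytes that do not start an escape (including a trailing
        // backslash) are copied through unchanged.
        if bytes[i] != b'\\' || i + 1 == bytes.len() {
            out.push(bytes[i]);
            i += 1;
            continue;
        }
        let simple = match bytes[i + 1] {
            b'n' => Some(b'\n'),
            b'r' => Some(b'\r'),
            b't' => Some(b'\t'),
            b'0' => Some(0),
            b'\\' => Some(b'\\'),
            _ => None,
        };
        if let Some(byte) = simple {
            out.push(byte);
            i += 2;
            continue;
        }
        // `\xHH` needs exactly two hexadecimal digits after the `x`.
        let hex = if bytes[i + 1] == b'x' {
            bytes
                .get(i + 2..i + 4)
                .filter(|h| h.iter().all(|b| b.is_ascii_hexdigit()))
        } else {
            None
        };
        match hex {
            Some(h) => {
                let hi = (h[0] as char).to_digit(16).unwrap() as u8;
                let lo = (h[1] as char).to_digit(16).unwrap() as u8;
                out.push((hi << 4) | lo);
                i += 4;
            }
            // Not a recognized escape: keep the backslash literally.
            None => {
                out.push(b'\\');
                i += 1;
            }
        }
    }
    out
}
```

Note that the output is a `Vec<u8>` and not a `String`, since a decoded
`\xFF` is not valid UTF-8 on its own.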
# Building patterns
Typically, regular expression patterns must be valid UTF-8. However, command
line arguments aren't guaranteed to be valid UTF-8. Unfortunately, the standard
library's UTF-8 conversion functions from `OsStr`s do not provide good error
messages. However, the [`pattern_from_bytes`] and [`pattern_from_os`] routines do,
including reporting exactly where the first invalid UTF-8 byte is seen.
Additionally, it can be useful to read patterns from a file while reporting
good error messages that include line numbers. The [`patterns_from_path`],
[`patterns_from_reader`] and [`patterns_from_stdin`] routines do just that. If
any pattern is found that is invalid UTF-8, then the error includes the file
path (if available) along with the line number and the byte offset at which the
first invalid UTF-8 byte was observed.
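
The idea of attaching a line number and a byte offset to a UTF-8 failure can
be sketched with only the standard library. This is an illustration rather
than this crate's implementation: `patterns_from_reader_sketch` is a
hypothetical name, and the message format and the trimming of a trailing `\r`
are assumptions made for the example.

```rust
use std::io::{self, BufRead};

/// Hypothetical helper: read one pattern per line and report the line
/// number and byte offset of the first invalid UTF-8 byte, if any.
fn patterns_from_reader_sketch<R: io::Read>(rdr: R) -> io::Result<Vec<String>> {
    let mut patterns = vec![];
    for (i, line) in io::BufReader::new(rdr).split(b'\n').enumerate() {
        let mut line = line?;
        // Assumption: treat `\r\n` as a line terminator too.
        if line.last() == Some(&b'\r') {
            line.pop();
        }
        match String::from_utf8(line) {
            Ok(pattern) => patterns.push(pattern),
            Err(err) => {
                // `valid_up_to` is the offset of the first invalid byte.
                let msg = format!(
                    "{}: invalid UTF-8 at byte offset {}",
                    i + 1,
                    err.utf8_error().valid_up_to(),
                );
                return Err(io::Error::new(io::ErrorKind::InvalidData, msg));
            }
        }
    }
    Ok(patterns)
}
```

For the input `b"ok\nabc\xFFxyz\n"`, the error message produced by this sketch
points at line 2 and byte offset 3.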
# Read process output
Sometimes a command line application needs to execute other processes and
read their stdout in a streaming fashion. The [`CommandReader`] provides this
functionality with an explicit goal of improving failure modes. In particular,
if the process exits with an error code, then stderr is read and converted into
a normal Rust error to show to end users. This makes the underlying failure
modes explicit and gives more information to end users for debugging the
problem.
As a special case, [`DecompressionReader`] provides a way to decompress
arbitrary files by matching their file extensions up with corresponding
decompression programs (such as `gzip` and `xz`). This is useful as a means of
performing simplistic decompression in a portable manner without binding to
specific compression libraries. This does come with some overhead though, so
if you need to decompress lots of small files, this may not be an appropriate
convenience to use.
Each reader has a corresponding builder for additional configuration, such as
whether to read stderr asynchronously in order to avoid deadlock (which is
enabled by default).
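
The failure mode handling described above can be approximated with the
standard library. The sketch below is not this crate's implementation and is
not streaming (it collects all of stdout). `read_command_stdout` is a
hypothetical name. It only shows the two ideas: draining stderr on a separate
thread so the child is never blocked, and turning a non-zero exit status into
an error that carries stderr.

```rust
use std::io::{self, Read};
use std::process::{Command, Stdio};
use std::thread;

/// Hypothetical helper: run a command, return its stdout, and convert a
/// failing exit status into an error containing the command's stderr.
fn read_command_stdout(cmd: &mut Command) -> io::Result<Vec<u8>> {
    let mut child =
        cmd.stdout(Stdio::piped()).stderr(Stdio::piped()).spawn()?;

    // Drain stderr concurrently so that a chatty child cannot fill the
    // stderr pipe and deadlock while we are still reading stdout.
    let mut stderr = child.stderr.take().expect("stderr was piped");
    let stderr_thread = thread::spawn(move || {
        let mut buf = Vec::new();
        stderr.read_to_end(&mut buf).map(|_| buf)
    });

    let mut out = Vec::new();
    child.stdout.take().expect("stdout was piped").read_to_end(&mut out)?;

    let status = child.wait()?;
    let err_bytes = stderr_thread.join().expect("stderr thread panicked")?;
    if !status.success() {
        let msg = String::from_utf8_lossy(&err_bytes).trim().to_string();
        let msg =
            if msg.is_empty() { "<stderr is empty>".to_string() } else { msg };
        return Err(io::Error::new(io::ErrorKind::Other, msg));
    }
    Ok(out)
}
```

A production version would also need to decide what to do with the child and
the stderr thread when reading stdout fails midway, which this sketch ignores.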
# Miscellaneous parsing
The [`parse_human_readable_size`] routine parses strings like `2M` and converts
them to the corresponding number of bytes (`2 * 1<<20` in this case). If an
invalid size is found, then a good error message is crafted that typically
tells the user how to fix the problem.
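
The suffix arithmetic can be shown in isolation. The sketch below is
illustrative and is not this crate's parser: `size_in_bytes` is a hypothetical
name, and it takes an already parsed number and a suffix instead of a single
string. Each suffix is a power of 1024, and overflow is reported instead of
wrapping.

```rust
/// Hypothetical helper: scale `value` by the given size suffix, returning
/// `None` for an unknown suffix or if the result does not fit in a `u64`.
fn size_in_bytes(value: u64, suffix: &str) -> Option<u64> {
    let shift = match suffix {
        "" => 0,   // plain bytes
        "K" => 10, // 1 << 10 = 1024
        "M" => 20, // 1 << 20 = 1024 * 1024
        "G" => 30, // 1 << 30 = 1024 * 1024 * 1024
        _ => return None,
    };
    // checked_mul returns None on overflow instead of wrapping around.
    value.checked_mul(1u64 << shift)
}
```

For example, `size_in_bytes(2, "M")` yields `Some(2097152)`, which matches the
`2M` case mentioned above.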
*/
#![deny(missing_docs)]
mod decompress;
mod escape;
mod hostname;
mod human;
mod pattern;
mod process;
mod wtr;
pub use crate::{
decompress::{
resolve_binary, DecompressionMatcher, DecompressionMatcherBuilder,
DecompressionReader, DecompressionReaderBuilder,
},
escape::{escape, escape_os, unescape, unescape_os},
hostname::hostname,
human::{parse_human_readable_size, ParseSizeError},
pattern::{
pattern_from_bytes, pattern_from_os, patterns_from_path,
patterns_from_reader, patterns_from_stdin, InvalidPatternError,
},
process::{CommandError, CommandReader, CommandReaderBuilder},
wtr::{
stdout, stdout_buffered_block, stdout_buffered_line, StandardStream,
},
};
/// Returns true if and only if stdin is believed to be readable.
///
/// When stdin is readable, command line programs may choose to behave
/// differently than when stdin is not readable. For example, `command foo`
/// might search the current directory for occurrences of `foo` whereas
/// `command foo < some-file` or `cat some-file | command foo` might instead
/// only search stdin for occurrences of `foo`.
///
/// Note that this isn't perfect and essentially corresponds to a heuristic.
/// When things are unclear (such as if an error occurs during introspection to
/// determine whether stdin is readable), this prefers to return `false`. That
/// means it's possible for an end user to pipe something into your program and
/// have this return `false` and thus potentially lead to ignoring the user's
/// stdin data. While not ideal, this is perhaps better than falsely assuming
/// stdin is readable, which would result in blocking forever on reading stdin.
/// Regardless, commands should always provide explicit fallbacks to override
/// behavior. For example, `rg foo -` will explicitly search stdin and `rg foo
/// ./` will explicitly search the current working directory.
pub fn is_readable_stdin() -> bool {
use std::io::IsTerminal;
#[cfg(unix)]
fn imp() -> bool {
use std::{
fs::File,
os::{fd::AsFd, unix::fs::FileTypeExt},
};
let stdin = std::io::stdin();
let Ok(fd) = stdin.as_fd().try_clone_to_owned() else { return false };
let file = File::from(fd);
let Ok(md) = file.metadata() else { return false };
let ft = md.file_type();
ft.is_file() || ft.is_fifo() || ft.is_socket()
}
#[cfg(windows)]
fn imp() -> bool {
winapi_util::file::typ(winapi_util::HandleRef::stdin())
.map(|t| t.is_disk() || t.is_pipe())
.unwrap_or(false)
}
#[cfg(not(any(unix, windows)))]
fn imp() -> bool {
false
}
!std::io::stdin().is_terminal() && imp()
}
/// Returns true if and only if stdin is believed to be connected to a tty
/// or a console.
///
/// Note that this is now just a wrapper around
/// [`std::io::IsTerminal`](https://doc.rust-lang.org/std/io/trait.IsTerminal.html).
/// Callers should prefer using the `IsTerminal` trait directly. This routine
/// is deprecated and will be removed in the next semver incompatible release.
#[deprecated(since = "0.1.10", note = "use std::io::IsTerminal instead")]
pub fn is_tty_stdin() -> bool {
use std::io::IsTerminal;
std::io::stdin().is_terminal()
}
/// Returns true if and only if stdout is believed to be connected to a tty
/// or a console.
///
/// This is useful for when you want your command line program to produce
/// different output depending on whether it's printing directly to a user's
/// terminal or whether it's being redirected somewhere else. For example,
/// implementations of `ls` will often show one item per line when stdout is
/// redirected, but will condense output when printing to a tty.
///
/// Note that this is now just a wrapper around
/// [`std::io::IsTerminal`](https://doc.rust-lang.org/std/io/trait.IsTerminal.html).
/// Callers should prefer using the `IsTerminal` trait directly. This routine
/// is deprecated and will be removed in the next semver incompatible release.
#[deprecated(since = "0.1.10", note = "use std::io::IsTerminal instead")]
pub fn is_tty_stdout() -> bool {
use std::io::IsTerminal;
std::io::stdout().is_terminal()
}
/// Returns true if and only if stderr is believed to be connected to a tty
/// or a console.
///
/// Note that this is now just a wrapper around
/// [`std::io::IsTerminal`](https://doc.rust-lang.org/std/io/trait.IsTerminal.html).
/// Callers should prefer using the `IsTerminal` trait directly. This routine
/// is deprecated and will be removed in the next semver incompatible release.
#[deprecated(since = "0.1.10", note = "use std::io::IsTerminal instead")]
pub fn is_tty_stderr() -> bool {
use std::io::IsTerminal;
std::io::stderr().is_terminal()
}

181
crates/cli/src/pattern.rs Normal file

@@ -0,0 +1,181 @@
use std::{ffi::OsStr, io, path::Path};
use bstr::io::BufReadExt;
use crate::escape::{escape, escape_os};
/// An error that occurs when a pattern could not be converted to valid UTF-8.
///
/// The purpose of this error is to give a more targeted failure mode for
/// patterns written by end users that are not valid UTF-8.
#[derive(Clone, Debug, Eq, PartialEq)]
pub struct InvalidPatternError {
original: String,
valid_up_to: usize,
}
impl InvalidPatternError {
/// Returns the index in the given string up to which valid UTF-8 was
/// verified.
pub fn valid_up_to(&self) -> usize {
self.valid_up_to
}
}
impl std::error::Error for InvalidPatternError {}
impl std::fmt::Display for InvalidPatternError {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
write!(
f,
"found invalid UTF-8 in pattern at byte offset {}: {} \
(disable Unicode mode and use hex escape sequences to match \
arbitrary bytes in a pattern, e.g., '(?-u)\\xFF')",
self.valid_up_to, self.original,
)
}
}
impl From<InvalidPatternError> for io::Error {
fn from(paterr: InvalidPatternError) -> io::Error {
io::Error::new(io::ErrorKind::Other, paterr)
}
}
/// Convert an OS string into a regular expression pattern.
///
/// This conversion fails if the given pattern is not valid UTF-8, in which
/// case, a targeted error with more information about where the invalid UTF-8
/// occurs is given. The error also suggests the use of hex escape sequences,
/// which are supported by many regex engines.
pub fn pattern_from_os(pattern: &OsStr) -> Result<&str, InvalidPatternError> {
pattern.to_str().ok_or_else(|| {
let valid_up_to = pattern
.to_string_lossy()
.find('\u{FFFD}')
.expect("a Unicode replacement codepoint for invalid UTF-8");
InvalidPatternError { original: escape_os(pattern), valid_up_to }
})
}
/// Convert arbitrary bytes into a regular expression pattern.
///
/// This conversion fails if the given pattern is not valid UTF-8, in which
/// case, a targeted error with more information about where the invalid UTF-8
/// occurs is given. The error also suggests the use of hex escape sequences,
/// which are supported by many regex engines.
pub fn pattern_from_bytes(
pattern: &[u8],
) -> Result<&str, InvalidPatternError> {
std::str::from_utf8(pattern).map_err(|err| InvalidPatternError {
original: escape(pattern),
valid_up_to: err.valid_up_to(),
})
}
/// Read patterns from a file path, one per line.
///
/// If there was a problem reading or if any of the patterns contain invalid
/// UTF-8, then an error is returned. If there was a problem with a specific
/// pattern, then the error message will include the line number and the file
/// path.
pub fn patterns_from_path<P: AsRef<Path>>(path: P) -> io::Result<Vec<String>> {
let path = path.as_ref();
let file = std::fs::File::open(path).map_err(|err| {
io::Error::new(
io::ErrorKind::Other,
format!("{}: {}", path.display(), err),
)
})?;
patterns_from_reader(file).map_err(|err| {
io::Error::new(
io::ErrorKind::Other,
format!("{}:{}", path.display(), err),
)
})
}
/// Read patterns from stdin, one per line.
///
/// If there was a problem reading or if any of the patterns contain invalid
/// UTF-8, then an error is returned. If there was a problem with a specific
/// pattern, then the error message will include the line number and the fact
/// that it came from stdin.
pub fn patterns_from_stdin() -> io::Result<Vec<String>> {
let stdin = io::stdin();
let locked = stdin.lock();
patterns_from_reader(locked).map_err(|err| {
io::Error::new(io::ErrorKind::Other, format!("<stdin>:{}", err))
})
}
/// Read patterns from any reader, one per line.
///
/// If there was a problem reading or if any of the patterns contain invalid
/// UTF-8, then an error is returned. If there was a problem with a specific
/// pattern, then the error message will include the line number.
///
/// Note that this routine uses its own internal buffer, so the caller should
/// not provide their own buffered reader if possible.
///
/// # Example
///
/// This shows how to parse patterns, one per line.
///
/// ```
/// use grep_cli::patterns_from_reader;
///
/// let patterns = "\
/// foo
/// bar\\s+foo
/// [a-z]{3}
/// ";
///
/// assert_eq!(patterns_from_reader(patterns.as_bytes())?, vec![
/// r"foo",
/// r"bar\s+foo",
/// r"[a-z]{3}",
/// ]);
/// # Ok::<(), Box<dyn std::error::Error>>(())
/// ```
pub fn patterns_from_reader<R: io::Read>(rdr: R) -> io::Result<Vec<String>> {
let mut patterns = vec![];
let mut line_number = 0;
io::BufReader::new(rdr).for_byte_line(|line| {
line_number += 1;
match pattern_from_bytes(line) {
Ok(pattern) => {
patterns.push(pattern.to_string());
Ok(true)
}
Err(err) => Err(io::Error::new(
io::ErrorKind::Other,
format!("{}: {}", line_number, err),
)),
}
})?;
Ok(patterns)
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn bytes() {
let pat = b"abc\xFFxyz";
let err = pattern_from_bytes(pat).unwrap_err();
assert_eq!(3, err.valid_up_to());
}
#[test]
#[cfg(unix)]
fn os() {
use std::ffi::OsStr;
use std::os::unix::ffi::OsStrExt;
let pat = OsStr::from_bytes(b"abc\xFFxyz");
let err = pattern_from_os(pat).unwrap_err();
assert_eq!(3, err.valid_up_to());
}
}

316
crates/cli/src/process.rs Normal file

@@ -0,0 +1,316 @@
use std::{
io::{self, Read},
process,
};
/// An error that can occur while running a command and reading its output.
///
/// This error can be seamlessly converted to an `io::Error` via a `From`
/// implementation.
#[derive(Debug)]
pub struct CommandError {
kind: CommandErrorKind,
}
#[derive(Debug)]
enum CommandErrorKind {
Io(io::Error),
Stderr(Vec<u8>),
}
impl CommandError {
/// Create an error from an I/O error.
pub(crate) fn io(ioerr: io::Error) -> CommandError {
CommandError { kind: CommandErrorKind::Io(ioerr) }
}
/// Create an error from the contents of stderr (which may be empty).
pub(crate) fn stderr(bytes: Vec<u8>) -> CommandError {
CommandError { kind: CommandErrorKind::Stderr(bytes) }
}
/// Returns true if and only if this error has empty data from stderr.
pub(crate) fn is_empty(&self) -> bool {
match self.kind {
CommandErrorKind::Stderr(ref bytes) => bytes.is_empty(),
_ => false,
}
}
}
impl std::error::Error for CommandError {}
impl std::fmt::Display for CommandError {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self.kind {
CommandErrorKind::Io(ref e) => e.fmt(f),
CommandErrorKind::Stderr(ref bytes) => {
let msg = String::from_utf8_lossy(bytes);
if msg.trim().is_empty() {
write!(f, "<stderr is empty>")
} else {
let div = "-".repeat(79);
write!(
f,
"\n{div}\n{msg}\n{div}",
div = div,
msg = msg.trim()
)
}
}
}
}
}
impl From<io::Error> for CommandError {
fn from(ioerr: io::Error) -> CommandError {
CommandError { kind: CommandErrorKind::Io(ioerr) }
}
}
impl From<CommandError> for io::Error {
fn from(cmderr: CommandError) -> io::Error {
match cmderr.kind {
CommandErrorKind::Io(ioerr) => ioerr,
CommandErrorKind::Stderr(_) => {
io::Error::new(io::ErrorKind::Other, cmderr)
}
}
}
}
/// Configures and builds a streaming reader for process output.
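#[derive(Clone, Debug)]
pub struct CommandReaderBuilder {
    async_stderr: bool,
}
impl Default for CommandReaderBuilder {
    // The docs promise asynchronous stderr reads by default.
    fn default() -> Self { CommandReaderBuilder { async_stderr: true } }
}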
impl CommandReaderBuilder {
/// Create a new builder with the default configuration.
pub fn new() -> CommandReaderBuilder {
CommandReaderBuilder::default()
}
/// Build a new streaming reader for the given command's output.
///
/// The caller should set everything that's required on the given command
/// before building a reader, such as its arguments, environment and
/// current working directory. Settings such as the stdout and stderr (but
/// not stdin) pipes will be overridden so that they can be controlled by
/// the reader.
///
/// If there was a problem spawning the given command, then its error is
/// returned.
pub fn build(
&self,
command: &mut process::Command,
) -> Result<CommandReader, CommandError> {
let mut child = command
.stdout(process::Stdio::piped())
.stderr(process::Stdio::piped())
.spawn()?;
let stderr = if self.async_stderr {
StderrReader::r#async(child.stderr.take().unwrap())
} else {
StderrReader::sync(child.stderr.take().unwrap())
};
Ok(CommandReader { child, stderr, eof: false })
}
/// When enabled, the reader will asynchronously read the contents of the
/// command's stderr output. When disabled, stderr is only read after the
/// stdout stream has been exhausted (or if the process quits with an error
/// code).
///
/// Note that when enabled, this may require launching an additional
/// thread in order to read stderr. This is done so that the process being
/// executed is never blocked from writing to stdout or stderr. If this is
/// disabled, then it is possible for the process to fill up the stderr
/// buffer and deadlock.
///
/// This is enabled by default.
pub fn async_stderr(&mut self, yes: bool) -> &mut CommandReaderBuilder {
self.async_stderr = yes;
self
}
}
/// A streaming reader for a command's output.
///
/// The purpose of this reader is to provide an easy way to execute processes
/// whose stdout is read in a streaming way while also making the processes'
/// stderr available when the process fails with an exit code. This makes it
/// possible to execute processes while surfacing the underlying failure mode
/// in the case of an error.
///
/// Moreover, by default, this reader will asynchronously read the process's
/// stderr. This prevents subtle deadlocking bugs for noisy processes that
/// write a lot to stderr. Currently, the entire contents of stderr are read
/// onto the heap.
///
/// # Example
///
/// This example shows how to invoke `gzip` to decompress the contents of a
/// file. If the `gzip` command reports a failing exit status, then its stderr
/// is returned as an error.
///
/// ```no_run
/// use std::{io::Read, process::Command};
///
/// use grep_cli::CommandReader;
///
/// let mut cmd = Command::new("gzip");
/// cmd.arg("-d").arg("-c").arg("/usr/share/man/man1/ls.1.gz");
///
/// let mut rdr = CommandReader::new(&mut cmd)?;
/// let mut contents = vec![];
/// rdr.read_to_end(&mut contents)?;
/// # Ok::<(), Box<dyn std::error::Error>>(())
/// ```
#[derive(Debug)]
pub struct CommandReader {
child: process::Child,
stderr: StderrReader,
/// This is set to true once 'read' returns zero bytes. When this isn't
/// set and we close the reader, then we anticipate a pipe error when
/// reaping the child process and silence it.
eof: bool,
}
impl CommandReader {
/// Create a new streaming reader for the given command using the default
/// configuration.
///
/// The caller should set everything that's required on the given command
/// before building a reader, such as its arguments, environment and
/// current working directory. Settings such as the stdout and stderr (but
/// not stdin) pipes will be overridden so that they can be controlled by
/// the reader.
///
/// If there was a problem spawning the given command, then its error is
/// returned.
///
/// If the caller requires additional configuration for the reader
/// returned, then use [`CommandReaderBuilder`].
pub fn new(
cmd: &mut process::Command,
) -> Result<CommandReader, CommandError> {
CommandReaderBuilder::new().build(cmd)
}
/// Closes the CommandReader, freeing any resources used by its underlying
/// child process. If the child process exits with a nonzero exit code, the
/// returned Err value will include its stderr.
///
/// `close` is idempotent, meaning it can be safely called multiple times.
/// The first call closes the CommandReader and any subsequent calls do
/// nothing.
///
/// This method should be called after partially reading a file to prevent
/// resource leakage. However, there is no need to call `close` explicitly
/// if your code always calls `read` to EOF, as `read` takes care of
/// calling `close` in this case.
///
/// `close` is also called in `drop` as a last line of defense against
/// resource leakage. Any error from the child process is then printed as a
/// warning to stderr. This can be avoided by explicitly calling `close`
/// before the CommandReader is dropped.
pub fn close(&mut self) -> io::Result<()> {
// Dropping stdout closes the underlying file descriptor, which should
// cause a well-behaved child process to exit. If child.stdout is None
// we assume that close() has already been called and do nothing.
let stdout = match self.child.stdout.take() {
None => return Ok(()),
Some(stdout) => stdout,
};
drop(stdout);
if self.child.wait()?.success() {
Ok(())
} else {
let err = self.stderr.read_to_end();
// In the specific case where we haven't consumed the full data
// from the child process, then closing stdout above results in
// a pipe signal being thrown in most cases. But I don't think
// there is any reliable and portable way of detecting it. Instead,
// if we know we haven't hit EOF (so we anticipate a broken pipe
// error) and if stderr otherwise doesn't have anything on it, then
// we assume total success.
if !self.eof && err.is_empty() {
return Ok(());
}
Err(io::Error::from(err))
}
}
}
impl Drop for CommandReader {
fn drop(&mut self) {
if let Err(error) = self.close() {
log::warn!("{}", error);
}
}
}
impl io::Read for CommandReader {
fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
let stdout = match self.child.stdout {
None => return Ok(0),
Some(ref mut stdout) => stdout,
};
let nread = stdout.read(buf)?;
if nread == 0 {
self.eof = true;
self.close().map(|_| 0)
} else {
Ok(nread)
}
}
}
/// A reader that encapsulates the asynchronous or synchronous reading of
/// stderr.
#[derive(Debug)]
enum StderrReader {
Async(Option<std::thread::JoinHandle<CommandError>>),
Sync(process::ChildStderr),
}
impl StderrReader {
/// Create a reader for stderr that reads contents asynchronously.
fn r#async(mut stderr: process::ChildStderr) -> StderrReader {
let handle =
std::thread::spawn(move || stderr_to_command_error(&mut stderr));
StderrReader::Async(Some(handle))
}
/// Create a reader for stderr that reads contents synchronously.
fn sync(stderr: process::ChildStderr) -> StderrReader {
StderrReader::Sync(stderr)
}
/// Consumes all of stderr on to the heap and returns it as an error.
///
/// If there was a problem reading stderr itself, then this returns an I/O
/// command error.
fn read_to_end(&mut self) -> CommandError {
match *self {
StderrReader::Async(ref mut handle) => {
let handle = handle
.take()
.expect("read_to_end cannot be called more than once");
handle.join().expect("stderr reading thread does not panic")
}
StderrReader::Sync(ref mut stderr) => {
stderr_to_command_error(stderr)
}
}
}
}
fn stderr_to_command_error(stderr: &mut process::ChildStderr) -> CommandError {
    let mut bytes = vec![];
    match stderr.read_to_end(&mut bytes) {
        Ok(_) => CommandError::stderr(bytes),
        Err(err) => CommandError::io(err),
    }
}
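
The deadlock concern raised in the `async_stderr` docs can be illustrated with the standard library alone. What follows is a minimal sketch rather than the crate's implementation: `drain_in_background` and `run_capture` are hypothetical names, and the `sh` invocation in `main` is only an assumed example command that may not exist on every platform.

```rust
use std::io::{self, Read};
use std::process::{Command, Stdio};
use std::thread::{self, JoinHandle};

/// Read `rdr` to the end on a background thread and hand back the bytes.
/// (Hypothetical helper; generic so it also works on in-memory readers.)
fn drain_in_background<R>(mut rdr: R) -> JoinHandle<io::Result<Vec<u8>>>
where
    R: Read + Send + 'static,
{
    thread::spawn(move || {
        let mut bytes = Vec::new();
        rdr.read_to_end(&mut bytes)?;
        Ok(bytes)
    })
}

/// Run `cmd` and return its stdout. If the exit status is a failure, the
/// child's stderr (lossily decoded and trimmed) is returned as the error.
fn run_capture(cmd: &mut Command) -> io::Result<Vec<u8>> {
    let mut child =
        cmd.stdout(Stdio::piped()).stderr(Stdio::piped()).spawn()?;
    // Start draining stderr before touching stdout, so a child that writes
    // a lot to stderr never blocks on a full pipe while we read stdout.
    let stderr =
        drain_in_background(child.stderr.take().expect("stderr is piped"));
    let mut stdout = Vec::new();
    child
        .stdout
        .take()
        .expect("stdout is piped")
        .read_to_end(&mut stdout)?;
    let status = child.wait()?;
    let errbytes = stderr.join().expect("stderr thread does not panic")?;
    if status.success() {
        Ok(stdout)
    } else {
        let msg = String::from_utf8_lossy(&errbytes).trim().to_string();
        Err(io::Error::new(io::ErrorKind::Other, msg))
    }
}

fn main() {
    // Assumed example command: prints to both streams, then fails.
    let mut cmd = Command::new("sh");
    cmd.arg("-c").arg("echo out; echo oops >&2; exit 1");
    match run_capture(&mut cmd) {
        Ok(out) => println!("stdout: {}", String::from_utf8_lossy(&out)),
        Err(err) => println!("command failed: {err}"),
    }
}
```

Unlike `CommandReader`, this sketch buffers all of stdout instead of streaming it, which keeps the example short at the cost of memory for large outputs.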

149
crates/cli/src/wtr.rs Normal file

@@ -0,0 +1,149 @@
use std::io::{self, IsTerminal};
use termcolor::{self, HyperlinkSpec};
/// A writer that supports coloring with either line or block buffering.
#[derive(Debug)]
pub struct StandardStream(StandardStreamKind);
/// Returns a possibly buffered writer to stdout for the given color choice.
///
/// The writer returned is either line buffered or block buffered. The decision
/// between these two is made automatically based on whether a tty is attached
/// to stdout or not. If a tty is attached, then line buffering is used.
/// Otherwise, block buffering is used. In general, block buffering is more
/// efficient, but may increase the time it takes for the end user to see the
/// first bits of output.
///
/// If you need more fine-grained control over the buffering mode, then use
/// one of `stdout_buffered_line` or `stdout_buffered_block`.
///
/// The color choice given is passed along to the underlying writer. To
/// completely disable colors in all cases, use `ColorChoice::Never`.
pub fn stdout(color_choice: termcolor::ColorChoice) -> StandardStream {
    if std::io::stdout().is_terminal() {
        stdout_buffered_line(color_choice)
    } else {
        stdout_buffered_block(color_choice)
    }
}
/// Returns a line buffered writer to stdout for the given color choice.
///
/// This writer is useful when printing results directly to a tty such that
/// users see output as soon as it's written. The downside of this approach
/// is that it can be slower, especially when there is a lot of output.
///
/// You might consider using [`stdout`] instead, which chooses the buffering
/// strategy automatically based on whether stdout is connected to a tty.
pub fn stdout_buffered_line(
color_choice: termcolor::ColorChoice,
) -> StandardStream {
let out = termcolor::StandardStream::stdout(color_choice);
StandardStream(StandardStreamKind::LineBuffered(out))
}
/// Returns a block buffered writer to stdout for the given color choice.
///
/// This writer is useful when printing results to a file since it amortizes
/// the cost of writing data. The downside of this approach is that it can
/// increase the latency of display output when writing to a tty.
///
/// You might consider using [`stdout`] instead, which chooses the buffering
/// strategy automatically based on whether stdout is connected to a tty.
pub fn stdout_buffered_block(
color_choice: termcolor::ColorChoice,
) -> StandardStream {
let out = termcolor::BufferedStandardStream::stdout(color_choice);
StandardStream(StandardStreamKind::BlockBuffered(out))
}
#[derive(Debug)]
enum StandardStreamKind {
LineBuffered(termcolor::StandardStream),
BlockBuffered(termcolor::BufferedStandardStream),
}
impl io::Write for StandardStream {
#[inline]
fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
use self::StandardStreamKind::*;
match self.0 {
LineBuffered(ref mut w) => w.write(buf),
BlockBuffered(ref mut w) => w.write(buf),
}
}
#[inline]
fn flush(&mut self) -> io::Result<()> {
use self::StandardStreamKind::*;
match self.0 {
LineBuffered(ref mut w) => w.flush(),
BlockBuffered(ref mut w) => w.flush(),
}
}
}
impl termcolor::WriteColor for StandardStream {
#[inline]
fn supports_color(&self) -> bool {
use self::StandardStreamKind::*;
match self.0 {
LineBuffered(ref w) => w.supports_color(),
BlockBuffered(ref w) => w.supports_color(),
}
}
#[inline]
fn supports_hyperlinks(&self) -> bool {
use self::StandardStreamKind::*;
match self.0 {
LineBuffered(ref w) => w.supports_hyperlinks(),
BlockBuffered(ref w) => w.supports_hyperlinks(),
}
}
#[inline]
fn set_color(&mut self, spec: &termcolor::ColorSpec) -> io::Result<()> {
use self::StandardStreamKind::*;
match self.0 {
LineBuffered(ref mut w) => w.set_color(spec),
BlockBuffered(ref mut w) => w.set_color(spec),
}
}
#[inline]
fn set_hyperlink(&mut self, link: &HyperlinkSpec) -> io::Result<()> {
use self::StandardStreamKind::*;
match self.0 {
LineBuffered(ref mut w) => w.set_hyperlink(link),
BlockBuffered(ref mut w) => w.set_hyperlink(link),
}
}
#[inline]
fn reset(&mut self) -> io::Result<()> {
use self::StandardStreamKind::*;
match self.0 {
LineBuffered(ref mut w) => w.reset(),
BlockBuffered(ref mut w) => w.reset(),
}
}
#[inline]
fn is_synchronous(&self) -> bool {
use self::StandardStreamKind::*;
match self.0 {
LineBuffered(ref w) => w.is_synchronous(),
BlockBuffered(ref w) => w.is_synchronous(),
}
}
}
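
The line-versus-block buffering trade-off described in the `stdout` docs can be reproduced without `termcolor`. The sketch below is an illustration under assumptions, not this module's code: the `Buffered` type is hypothetical, and it swaps in `std::io::LineWriter` and `std::io::BufWriter` for the `termcolor` stream types.

```rust
use std::io::{self, BufWriter, IsTerminal, LineWriter, Write};

/// A writer that is either line buffered or block buffered.
/// (Hypothetical stand-in for the `StandardStream` wrapper.)
enum Buffered<W: Write> {
    Line(LineWriter<W>),
    Block(BufWriter<W>),
}

impl<W: Write> Buffered<W> {
    /// Pick line buffering when attached to a tty, block buffering otherwise.
    fn new(wtr: W, is_tty: bool) -> Buffered<W> {
        if is_tty {
            Buffered::Line(LineWriter::new(wtr))
        } else {
            Buffered::Block(BufWriter::new(wtr))
        }
    }

    /// Borrow the underlying writer, e.g. to inspect what has reached it.
    fn get_ref(&self) -> &W {
        match self {
            Buffered::Line(w) => w.get_ref(),
            Buffered::Block(w) => w.get_ref(),
        }
    }
}

impl<W: Write> Write for Buffered<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        match self {
            Buffered::Line(w) => w.write(buf),
            Buffered::Block(w) => w.write(buf),
        }
    }

    fn flush(&mut self) -> io::Result<()> {
        match self {
            Buffered::Line(w) => w.flush(),
            Buffered::Block(w) => w.flush(),
        }
    }
}

fn main() -> io::Result<()> {
    let stdout = io::stdout();
    let is_tty = stdout.is_terminal();
    let mut out = Buffered::new(stdout, is_tty);
    writeln!(out, "buffering chosen for tty = {is_tty}")?;
    out.flush()
}
```

Passing `is_tty` as a parameter, instead of probing inside `new`, keeps the choice testable with an in-memory writer such as `Vec<u8>`.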

15
crates/core/README.md Normal file

@@ -0,0 +1,15 @@
ripgrep core
------------
This is the core ripgrep crate. In particular, `main.rs` is where the `main`
function lives.

Most of ripgrep core consists of two things:

* The definition of the CLI interface, including docs for every flag.
* Glue code that brings the `grep-matcher`, `grep-regex`, `grep-searcher` and
  `grep-printer` crates together to actually execute the search.

Currently, there are no plans to make ripgrep core available as an independent
library. However, much of the heavy lifting of ripgrep is done via its
constituent crates, which can be reused independent of ripgrep. Unfortunately,
there is no guide or tutorial to teach folks how to do this yet.


@@ -0,0 +1,107 @@
/*!
Provides completions for ripgrep's CLI for the bash shell.
*/
use crate::flags::defs::FLAGS;
const TEMPLATE_FULL: &'static str = "
_rg() {
local i cur prev opts cmds
COMPREPLY=()
cur=\"${COMP_WORDS[COMP_CWORD]}\"
prev=\"${COMP_WORDS[COMP_CWORD-1]}\"
cmd=\"\"
opts=\"\"
for i in ${COMP_WORDS[@]}; do
case \"${i}\" in
rg)
cmd=\"rg\"
;;
*)
;;
esac
done
case \"${cmd}\" in
rg)
opts=\"!OPTS!\"
if [[ ${cur} == -* || ${COMP_CWORD} -eq 1 ]] ; then
COMPREPLY=($(compgen -W \"${opts}\" -- \"${cur}\"))
return 0
fi
case \"${prev}\" in
!CASES!
esac
COMPREPLY=($(compgen -W \"${opts}\" -- \"${cur}\"))
return 0
;;
esac
}
complete -F _rg -o bashdefault -o default rg
";
const TEMPLATE_CASE: &'static str = "
!FLAG!)
COMPREPLY=($(compgen -f \"${cur}\"))
return 0
;;
";
const TEMPLATE_CASE_CHOICES: &'static str = "
!FLAG!)
COMPREPLY=($(compgen -W \"!CHOICES!\" -- \"${cur}\"))
return 0
;;
";
/// Generate completions for Bash.
///
/// Note that these completions are based on what was produced for ripgrep <=13
/// using Clap 2.x. Improvements on this are welcome.
pub(crate) fn generate() -> String {
let mut opts = String::new();
for flag in FLAGS.iter() {
opts.push_str("--");
opts.push_str(flag.name_long());
opts.push(' ');
if let Some(short) = flag.name_short() {
opts.push('-');
opts.push(char::from(short));
opts.push(' ');
}
if let Some(name) = flag.name_negated() {
opts.push_str("--");
opts.push_str(name);
opts.push(' ');
}
}
opts.push_str("<PATTERN> <PATH>...");
let mut cases = String::new();
for flag in FLAGS.iter() {
let template = if !flag.doc_choices().is_empty() {
let choices = flag.doc_choices().join(" ");
TEMPLATE_CASE_CHOICES.trim_end().replace("!CHOICES!", &choices)
} else {
TEMPLATE_CASE.trim_end().to_string()
};
let name = format!("--{}", flag.name_long());
cases.push_str(&template.replace("!FLAG!", &name));
if let Some(short) = flag.name_short() {
let name = format!("-{}", char::from(short));
cases.push_str(&template.replace("!FLAG!", &name));
}
if let Some(negated) = flag.name_negated() {
let name = format!("--{negated}");
cases.push_str(&template.replace("!FLAG!", &name));
}
}
TEMPLATE_FULL
.replace("!OPTS!", &opts)
.replace("!CASES!", &cases)
.trim_start()
.to_string()
}
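
The placeholder-substitution approach used by `generate` can be tried in isolation. The following is a pared-down sketch under assumptions: the `Flag` struct, `TEMPLATE`, `opts` and `render` are hypothetical stand-ins for the real flag definitions and templates, and the trailing `<PATTERN> <PATH>...` words are left out.

```rust
/// A hypothetical, minimal stand-in for one flag definition.
struct Flag {
    long: &'static str,
    short: Option<u8>,
    negated: Option<&'static str>,
}

/// A tiny template with a single placeholder, in the style of `!OPTS!`.
const TEMPLATE: &str = "opts=\"!OPTS!\"";

/// Build the space-separated word list: long name, then the short name
/// (if any), then the negated name (if any), for each flag in order.
fn opts(flags: &[Flag]) -> String {
    let mut words = Vec::new();
    for flag in flags {
        words.push(format!("--{}", flag.long));
        if let Some(short) = flag.short {
            words.push(format!("-{}", char::from(short)));
        }
        if let Some(negated) = flag.negated {
            words.push(format!("--{}", negated));
        }
    }
    words.join(" ")
}

/// Substitute the generated word list into the template.
fn render(flags: &[Flag]) -> String {
    TEMPLATE.replace("!OPTS!", &opts(flags))
}

fn main() {
    let flags = [
        Flag { long: "column", short: None, negated: Some("no-column") },
        Flag { long: "hidden", short: Some(b'.'), negated: Some("no-hidden") },
    ];
    println!("{}", render(&flags));
}
```

Joining a `Vec` of words avoids the trailing space that pushing `' '` after every name would leave behind.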


@@ -0,0 +1,29 @@
# This is impossible to read, but these encodings rarely if ever change, so
# it probably does not matter. They are derived from the list given here:
# https://encoding.spec.whatwg.org/#concept-encoding-get
#
# The globbing here works in both fish and zsh (though they expand it in
# different orders). It may work in other shells too.
{{,us-}ascii,arabic,chinese,cyrillic,greek{,8},hebrew,korean}
logical visual mac {,cs}macintosh x-mac-{cyrillic,roman,ukrainian}
866 ibm{819,866} csibm866
big5{,-hkscs} {cn-,cs}big5 x-x-big5
cp{819,866,125{0,1,2,3,4,5,6,7,8}} x-cp125{0,1,2,3,4,5,6,7,8}
csiso2022{jp,kr} csiso8859{6,8}{e,i}
csisolatin{1,2,3,4,5,6,9} csisolatin{arabic,cyrillic,greek,hebrew}
ecma-{114,118} asmo-708 elot_928 sun_eu_greek
euc-{jp,kr} x-euc-jp cseuckr cseucpkdfmtjapanese
{,x-}gbk csiso58gb231280 gb18030 {,cs}gb2312 gb_2312{,-80} hz-gb-2312
iso-2022-{cn,cn-ext,jp,kr}
iso8859{,-}{1,2,3,4,5,6,7,8,9,10,11,13,14,15}
iso-8859-{1,2,3,4,5,6,7,8,9,10,11,{6,8}-{e,i},13,14,15,16} iso_8859-{1,2,3,4,5,6,7,8,9,15}
iso_8859-{1,2,6,7}:1987 iso_8859-{3,4,5,8}:1988 iso_8859-9:1989
iso-ir-{58,100,101,109,110,126,127,138,144,148,149,157}
koi{,8,8-r,8-ru,8-u,8_r} cskoi8r
ks_c_5601-{1987,1989} ksc{,_}5601 csksc56011987
latin{1,2,3,4,5,6} l{1,2,3,4,5,6,9}
shift{-,_}jis csshiftjis {,x-}sjis ms_kanji ms932
utf{,-}8 utf-16{,be,le} unicode-1-1-utf-8
windows-{31j,874,949,125{0,1,2,3,4,5,6,7,8}} dos-874 tis-620 ansi_x3.4-1968
x-user-defined auto none
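
To see which labels a compact entry in this list stands for, the brace syntax can be expanded by hand or with a small program. The sketch below is a simplified expander written for illustration only: `expand` is a hypothetical helper, it emits alternatives left to right, and it does not model shell edge cases such as single-element braces or quoting.

```rust
/// Expand the first top-level `{a,b,...}` group in `word`, then recurse so
/// that nested and subsequent groups are expanded as well.
fn expand(word: &str) -> Vec<String> {
    let open = match word.find('{') {
        Some(i) => i,
        None => return vec![word.to_string()],
    };
    // Locate the matching `}` and record the commas that belong to this
    // group (depth 1), ignoring commas inside nested groups.
    let mut depth = 0usize;
    let mut close = None;
    let mut commas = Vec::new();
    for (offset, ch) in word[open..].char_indices() {
        let i = open + offset;
        match ch {
            '{' => depth += 1,
            '}' => {
                depth -= 1;
                if depth == 0 {
                    close = Some(i);
                    break;
                }
            }
            ',' if depth == 1 => commas.push(i),
            _ => {}
        }
    }
    // An unbalanced `{` is treated as literal text.
    let close = match close {
        Some(i) => i,
        None => return vec![word.to_string()],
    };
    let (prefix, suffix) = (&word[..open], &word[close + 1..]);
    let mut out = Vec::new();
    let mut start = open + 1;
    for end in commas.into_iter().chain(std::iter::once(close)) {
        let alt = &word[start..end];
        out.extend(expand(&format!("{prefix}{alt}{suffix}")));
        start = end + 1;
    }
    out
}

fn main() {
    for word in "utf{,-}8 utf-16{,be,le}".split_whitespace() {
        println!("{word} => {}", expand(word).join(" "));
    }
}
```

Because shells differ in the order they emit alternatives, only the set of labels produced here should be relied on, not their order.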


@@ -0,0 +1,68 @@
/*!
Provides completions for ripgrep's CLI for the fish shell.
*/
use crate::flags::{defs::FLAGS, CompletionType};
const TEMPLATE: &'static str = "complete -c rg !SHORT! -l !LONG! -d '!DOC!'";
const TEMPLATE_NEGATED: &'static str =
"complete -c rg -l !NEGATED! -n '__fish_contains_opt !SHORT! !LONG!' -d '!DOC!'\n";
/// Generate completions for Fish.
pub(crate) fn generate() -> String {
let mut out = String::new();
for flag in FLAGS.iter() {
let short = match flag.name_short() {
None => "".to_string(),
Some(byte) => format!("-s {}", char::from(byte)),
};
let long = flag.name_long();
let doc = flag.doc_short().replace("'", "\\'");
let mut completion = TEMPLATE
.replace("!SHORT!", &short)
.replace("!LONG!", &long)
.replace("!DOC!", &doc);
match flag.completion_type() {
CompletionType::Filename => {
completion.push_str(" -r -F");
}
CompletionType::Executable => {
completion.push_str(" -r -f -a '(__fish_complete_command)'");
}
CompletionType::Filetype => {
completion.push_str(
" -r -f -a '(rg --type-list | string replace : \\t)'",
);
}
CompletionType::Encoding => {
completion.push_str(" -r -f -a '");
completion.push_str(super::ENCODINGS);
completion.push_str("'");
}
CompletionType::Other if !flag.doc_choices().is_empty() => {
completion.push_str(" -r -f -a '");
completion.push_str(&flag.doc_choices().join(" "));
completion.push_str("'");
}
CompletionType::Other if !flag.is_switch() => {
completion.push_str(" -r -f");
}
CompletionType::Other => (),
}
completion.push('\n');
out.push_str(&completion);
if let Some(negated) = flag.name_negated() {
out.push_str(
&TEMPLATE_NEGATED
.replace("!NEGATED!", &negated)
.replace("!SHORT!", &short)
.replace("!LONG!", &long)
.replace("!DOC!", &doc),
);
}
}
out
}
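
The quoting step in `generate` matters because each description ends up inside a single-quoted fish string. A minimal sketch of that step follows; `fish_quote` and `complete_line` are hypothetical helpers, the flag names and descriptions in `main` are made up for illustration, and escaping backslashes is an addition that the code above does not perform.

```rust
/// Quote `text` as a fish single-quoted string. Inside single quotes, fish
/// only treats `\'` and `\\` as escapes, so those are the two characters
/// handled here. Backslashes are escaped first so that the backslash added
/// for each quote is not escaped a second time.
fn fish_quote(text: &str) -> String {
    format!("'{}'", text.replace('\\', "\\\\").replace('\'', "\\'"))
}

/// Build one `complete` line for `rg`, omitting `-s` when there is no
/// short name.
fn complete_line(short: Option<char>, long: &str, doc: &str) -> String {
    let mut line = String::from("complete -c rg");
    if let Some(ch) = short {
        line.push_str(&format!(" -s {ch}"));
    }
    line.push_str(&format!(" -l {long} -d {}", fish_quote(doc)));
    line
}

fn main() {
    // Illustrative flags and descriptions, not ripgrep's actual doc strings.
    println!("{}", complete_line(Some('i'), "ignore-case", "Search case insensitively."));
    println!("{}", complete_line(None, "no-column", "Don't show columns."));
}
```

Building the line piece by piece avoids the double space that a fixed template leaves when a flag has no short name.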


@@ -0,0 +1,10 @@
/*!
Modules for generating completions for various shells.
*/
static ENCODINGS: &'static str = include_str!("encodings.sh");
pub(super) mod bash;
pub(super) mod fish;
pub(super) mod powershell;
pub(super) mod zsh;


@@ -0,0 +1,86 @@
/*!
Provides completions for ripgrep's CLI for PowerShell.
*/
use crate::flags::defs::FLAGS;
const TEMPLATE: &'static str = "
using namespace System.Management.Automation
using namespace System.Management.Automation.Language
Register-ArgumentCompleter -Native -CommandName 'rg' -ScriptBlock {
param($wordToComplete, $commandAst, $cursorPosition)
$commandElements = $commandAst.CommandElements
$command = @(
'rg'
for ($i = 1; $i -lt $commandElements.Count; $i++) {
$element = $commandElements[$i]
if ($element -isnot [StringConstantExpressionAst] -or
$element.StringConstantType -ne [StringConstantType]::BareWord -or
$element.Value.StartsWith('-')) {
break
}
$element.Value
}) -join ';'
$completions = @(switch ($command) {
'rg' {
!FLAGS!
}
})
$completions.Where{ $_.CompletionText -like \"$wordToComplete*\" } |
Sort-Object -Property ListItemText
}
";
const TEMPLATE_FLAG: &'static str =
"[CompletionResult]::new('!DASH_NAME!', '!NAME!', [CompletionResultType]::ParameterName, '!DOC!')";
/// Generate completions for PowerShell.
///
/// Note that these completions are based on what was produced for ripgrep <=13
/// using Clap 2.x. Improvements on this are welcome.
pub(crate) fn generate() -> String {
let mut flags = String::new();
for (i, flag) in FLAGS.iter().enumerate() {
let doc = flag.doc_short().replace("'", "''");
let dash_name = format!("--{}", flag.name_long());
let name = flag.name_long();
if i > 0 {
flags.push('\n');
}
flags.push_str(" ");
flags.push_str(
&TEMPLATE_FLAG
.replace("!DASH_NAME!", &dash_name)
.replace("!NAME!", &name)
.replace("!DOC!", &doc),
);
if let Some(byte) = flag.name_short() {
let dash_name = format!("-{}", char::from(byte));
let name = char::from(byte).to_string();
flags.push_str("\n ");
flags.push_str(
&TEMPLATE_FLAG
.replace("!DASH_NAME!", &dash_name)
.replace("!NAME!", &name)
.replace("!DOC!", &doc),
);
}
if let Some(negated) = flag.name_negated() {
let dash_name = format!("--{}", negated);
flags.push_str("\n ");
flags.push_str(
&TEMPLATE_FLAG
.replace("!DASH_NAME!", &dash_name)
.replace("!NAME!", &negated)
.replace("!DOC!", &doc),
);
}
}
TEMPLATE.trim_start().replace("!FLAGS!", &flags)
}
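
PowerShell escapes a single quote inside a single-quoted string by doubling it, which is why `generate` replaces `'` with `''` in each description. The sketch below isolates that rule; `ps_single_quote` and `completion_result` are hypothetical helpers, and the flag and description in `main` are made up for illustration.

```rust
/// Quote `text` as a PowerShell single-quoted string literal by doubling
/// every embedded single quote.
fn ps_single_quote(text: &str) -> String {
    format!("'{}'", text.replace('\'', "''"))
}

/// Build one `[CompletionResult]::new(...)` expression for a flag.
fn completion_result(dash_name: &str, name: &str, doc: &str) -> String {
    format!(
        "[CompletionResult]::new({}, {}, [CompletionResultType]::ParameterName, {})",
        ps_single_quote(dash_name),
        ps_single_quote(name),
        ps_single_quote(doc),
    )
}

fn main() {
    // Illustrative values, not ripgrep's actual doc strings.
    println!(
        "{}",
        completion_result("--no-column", "no-column", "Don't show column numbers.")
    );
}
```

Quoting every field through one helper means a name containing a quote would be handled the same way as a description.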


@@ -0,0 +1,637 @@
#compdef rg
##
# zsh completion function for ripgrep
#
# Run ci/test-complete after building to ensure that the options supported by
# this function stay in synch with the `rg` binary.
#
# For convenience, a completion reference guide is included at the bottom of
# this file.
#
# Originally based on code from the zsh-users project — see copyright notice
# below.
_rg() {
local curcontext=$curcontext no='!' descr ret=1
local -a context line state state_descr args tmp suf
local -A opt_args
# ripgrep has many options which negate the effect of a more common one — for
# example, `--no-column` to negate `--column`, and `--messages` to negate
# `--no-messages`. There are so many of these, and they're so infrequently
# used, that some users will probably find it irritating if they're completed
# indiscriminately, so let's not do that unless either the current prefix
# matches one of those negation options or the user has the `complete-all`
# style set. Note that this prefix check has to be updated manually to account
# for all of the potential negation options listed below!
if
# We also want to list all of these options during testing
[[ $_RG_COMPLETE_LIST_ARGS == (1|t*|y*) ]] ||
# (--[imnp]* => --ignore*, --messages, --no-*, --pcre2-unicode)
[[ $PREFIX$SUFFIX == --[imnp]* ]] ||
zstyle -t ":completion:${curcontext}:" complete-all
then
no=
fi
# We make heavy use of argument groups here to prevent the option specs from
# growing unwieldy. These aren't supported in zsh <5.4, though, so we'll strip
# them out below if necessary. This makes the exclusions inaccurate on those
# older versions, but oh well — it's not that big a deal
args=(
+ '(exclusive)' # Misc. fully exclusive options
'(: * -)'{-h,--help}'[display help information]'
'(: * -)'{-V,--version}'[display version information]'
'(: * -)'--pcre2-version'[print the version of PCRE2 used by ripgrep, if available]'
+ '(buffered)' # buffering options
'--line-buffered[force line buffering]'
$no"--no-line-buffered[don't force line buffering]"
'--block-buffered[force block buffering]'
$no"--no-block-buffered[don't force block buffering]"
+ '(case)' # Case-sensitivity options
{-i,--ignore-case}'[search case-insensitively]'
{-s,--case-sensitive}'[search case-sensitively]'
{-S,--smart-case}'[search case-insensitively if pattern is all lowercase]'
+ '(context-a)' # Context (after) options
'(context-c)'{-A+,--after-context=}'[specify lines to show after each match]:number of lines'
+ '(context-b)' # Context (before) options
'(context-c)'{-B+,--before-context=}'[specify lines to show before each match]:number of lines'
+ '(context-c)' # Context (combined) options
'(context-a context-b)'{-C+,--context=}'[specify lines to show before and after each match]:number of lines'
+ '(column)' # Column options
'--column[show column numbers for matches]'
$no"--no-column[don't show column numbers for matches]"
+ '(count)' # Counting options
{-c,--count}'[only show count of matching lines for each file]'
'--count-matches[only show count of individual matches for each file]'
'--include-zero[include files with zero matches in summary]'
$no"--no-include-zero[don't include files with zero matches in summary]"
+ '(encoding)' # Encoding options
{-E+,--encoding=}'[specify text encoding of files to search]: :_rg_encodings'
$no'--no-encoding[use default text encoding]'
+ '(engine)' # Engine choice options
'--engine=[select which regex engine to use]:when:((
default\:"use default engine"
pcre2\:"identical to --pcre2"
auto\:"identical to --auto-hybrid-regex"
))'
+ file # File-input options
'(1)*'{-f+,--file=}'[specify file containing patterns to search for]: :_files'
+ '(file-match)' # Files with/without match options
'(stats)'{-l,--files-with-matches}'[only show names of files with matches]'
'(stats)--files-without-match[only show names of files without matches]'
+ '(file-name)' # File-name options
{-H,--with-filename}'[show file name for matches]'
{-I,--no-filename}"[don't show file name for matches]"
+ '(file-system)' # File system options
"--one-file-system[don't descend into directories on other file systems]"
$no'--no-one-file-system[descend into directories on other file systems]'
+ '(fixed)' # Fixed-string options
{-F,--fixed-strings}'[treat pattern as literal string instead of regular expression]'
$no"--no-fixed-strings[don't treat pattern as literal string]"
+ '(follow)' # Symlink-following options
{-L,--follow}'[follow symlinks]'
$no"--no-follow[don't follow symlinks]"
+ '(generate)' # Options for generating ancillary data
'--generate=[generate man page or completion scripts]:when:((
man\:"man page"
complete-bash\:"shell completions for bash"
complete-zsh\:"shell completions for zsh"
complete-fish\:"shell completions for fish"
complete-powershell\:"shell completions for PowerShell"
))'
+ glob # File-glob options
'*'{-g+,--glob=}'[include/exclude files matching specified glob]:glob'
'*--iglob=[include/exclude files matching specified case-insensitive glob]:glob'
+ '(glob-case-insensitive)' # File-glob case sensitivity options
'--glob-case-insensitive[treat -g/--glob patterns case insensitively]'
$no'--no-glob-case-insensitive[treat -g/--glob patterns case sensitively]'
+ '(heading)' # Heading options
'(pretty-vimgrep)--heading[show matches grouped by file name]'
"(pretty-vimgrep)--no-heading[don't show matches grouped by file name]"
+ '(hidden)' # Hidden-file options
{-.,--hidden}'[search hidden files and directories]'
$no"--no-hidden[don't search hidden files and directories]"
+ '(hybrid)' # hybrid regex options
'--auto-hybrid-regex[DEPRECATED: dynamically use PCRE2 if necessary]'
$no"--no-auto-hybrid-regex[DEPRECATED: don't dynamically use PCRE2 if necessary]"
+ '(ignore)' # Ignore-file options
"(--no-ignore-global --no-ignore-parent --no-ignore-vcs --no-ignore-dot)--no-ignore[don't respect ignore files]"
$no'(--ignore-global --ignore-parent --ignore-vcs --ignore-dot)--ignore[respect ignore files]'
+ '(ignore-file-case-insensitive)' # Ignore-file case sensitivity options
'--ignore-file-case-insensitive[process ignore files case insensitively]'
$no'--no-ignore-file-case-insensitive[process ignore files case sensitively]'
+ '(ignore-exclude)' # Local exclude (ignore)-file options
"--no-ignore-exclude[don't respect local exclude (ignore) files]"
$no'--ignore-exclude[respect local exclude (ignore) files]'
+ '(ignore-global)' # Global ignore-file options
"--no-ignore-global[don't respect global ignore files]"
$no'--ignore-global[respect global ignore files]'
+ '(ignore-parent)' # Parent ignore-file options
"--no-ignore-parent[don't respect ignore files in parent directories]"
$no'--ignore-parent[respect ignore files in parent directories]'
+ '(ignore-vcs)' # VCS ignore-file options
"--no-ignore-vcs[don't respect version control ignore files]"
$no'--ignore-vcs[respect version control ignore files]'
+ '(require-git)' # git specific settings
"--no-require-git[don't require git repository to respect gitignore rules]"
$no'--require-git[require git repository to respect gitignore rules]'
+ '(ignore-dot)' # .ignore options
"--no-ignore-dot[don't respect .ignore files]"
$no'--ignore-dot[respect .ignore files]'
+ '(ignore-files)' # custom global ignore file options
"--no-ignore-files[don't respect --ignore-file flags]"
$no'--ignore-files[respect --ignore-file files]'
+ '(json)' # JSON options
'--json[output results in JSON Lines format]'
$no"--no-json[don't output results in JSON Lines format]"
+ '(line-number)' # Line-number options
{-n,--line-number}'[show line numbers for matches]'
{-N,--no-line-number}"[don't show line numbers for matches]"
+ '(line-terminator)' # Line-terminator options
'--crlf[use CRLF as line terminator]'
$no"--no-crlf[don't use CRLF as line terminator]"
'(text)--null-data[use NUL as line terminator]'
+ '(max-columns-preview)' # max column preview options
'--max-columns-preview[show preview for long lines (with -M)]'
$no"--no-max-columns-preview[don't show preview for long lines (with -M)]"
+ '(max-depth)' # Directory-depth options
{-d,--max-depth}'[specify max number of directories to descend]:number of directories'
'--maxdepth=[alias for --max-depth]:number of directories'
'!--maxdepth=:number of directories'
+ '(messages)' # Error-message options
'(--no-ignore-messages)--no-messages[suppress some error messages]'
$no"--messages[don't suppress error messages affected by --no-messages]"
+ '(messages-ignore)' # Ignore-error message options
"--no-ignore-messages[don't show ignore-file parse error messages]"
$no'--ignore-messages[show ignore-file parse error messages]'
+ '(mmap)' # mmap options
'--mmap[search using memory maps when possible]'
"--no-mmap[don't search using memory maps]"
+ '(multiline)' # Multiline options
{-U,--multiline}'[permit matching across multiple lines]'
$no'(multiline-dotall)--no-multiline[restrict matches to at most one line each]'
+ '(multiline-dotall)' # Multiline DOTALL options
'(--no-multiline)--multiline-dotall[allow "." to match newline (with -U)]'
$no"(--no-multiline)--no-multiline-dotall[don't allow \".\" to match newline (with -U)]"
+ '(only)' # Only-match options
{-o,--only-matching}'[show only matching part of each line]'
+ '(passthru)' # Pass-through options
'(--vimgrep)--passthru[show both matching and non-matching lines]'
'(--vimgrep)--passthrough[alias for --passthru]'
+ '(pcre2)' # PCRE2 options
{-P,--pcre2}'[enable matching with PCRE2]'
$no'(pcre2-unicode)--no-pcre2[disable matching with PCRE2]'
+ '(pcre2-unicode)' # PCRE2 Unicode options
$no'(--no-pcre2 --no-pcre2-unicode)--pcre2-unicode[DEPRECATED: enable PCRE2 Unicode mode (with -P)]'
'(--no-pcre2 --pcre2-unicode)--no-pcre2-unicode[DEPRECATED: disable PCRE2 Unicode mode (with -P)]'
+ '(pre)' # Preprocessing options
'(-z --search-zip)--pre=[specify preprocessor utility]:preprocessor utility:_command_names -e'
$no'--no-pre[disable preprocessor utility]'
+ pre-glob # Preprocessing glob options
'*--pre-glob[include/exclude files for preprocessing with --pre]'
+ '(pretty-vimgrep)' # Pretty/vimgrep display options
'(heading)'{-p,--pretty}'[alias for --color=always --heading -n]'
'(heading passthru)--vimgrep[show results in vim-compatible format]'
+ regexp # Explicit pattern options
'(1 file)*'{-e+,--regexp=}'[specify pattern]:pattern'
+ '(replace)' # Replacement options
{-r+,--replace=}'[specify string used to replace matches]:replace string'
+ '(sort)' # File-sorting options
'(threads)--sort=[sort results in ascending order (disables parallelism)]:sort method:((
none\:"no sorting"
path\:"sort by file path"
modified\:"sort by last modified time"
accessed\:"sort by last accessed time"
created\:"sort by creation time"
))'
'(threads)--sortr=[sort results in descending order (disables parallelism)]:sort method:((
none\:"no sorting"
path\:"sort by file path"
modified\:"sort by last modified time"
accessed\:"sort by last accessed time"
created\:"sort by creation time"
))'
'(threads)--sort-files[DEPRECATED: sort results by file path (disables parallelism)]'
$no"--no-sort-files[DEPRECATED: do not sort results]"
+ '(stats)' # Statistics options
'(--files file-match)--stats[show search statistics]'
$no"--no-stats[don't show search statistics]"
+ '(text)' # Binary-search options
{-a,--text}'[search binary files as if they were text]'
"--binary[search binary files, don't print binary data]"
$no"--no-binary[don't search binary files]"
$no"(--null-data)--no-text[don't search binary files as if they were text]"
+ '(threads)' # Thread-count options
'(sort)'{-j+,--threads=}'[specify approximate number of threads to use]:number of threads'
+ '(trim)' # Trim options
'--trim[trim any ASCII whitespace prefix from each line]'
$no"--no-trim[don't trim ASCII whitespace prefix from each line]"
+ type # Type options
'*'{-t+,--type=}'[only search files matching specified type]: :_rg_types'
'*--type-add=[add new glob for specified file type]: :->typespec'
'*--type-clear=[clear globs previously defined for specified file type]: :_rg_types'
# This should actually be exclusive with everything but other type options
'(: *)--type-list[show all supported file types and their associated globs]'
'*'{-T+,--type-not=}"[don't search files matching specified file type]: :_rg_types"
+ '(word-line)' # Whole-word/line match options
{-w,--word-regexp}'[only show matches surrounded by word boundaries]'
{-x,--line-regexp}'[only show matches surrounded by line boundaries]'
+ '(unicode)' # Unicode options
$no'--unicode[enable Unicode mode]'
'--no-unicode[disable Unicode mode]'
+ '(zip)' # Compression options
'(--pre)'{-z,--search-zip}'[search in compressed files]'
$no"--no-search-zip[don't search in compressed files]"
+ misc # Other options — no need to separate these at the moment
'(-b --byte-offset)'{-b,--byte-offset}'[show 0-based byte offset for each matching line]'
$no"--no-byte-offset[don't show byte offsets for each matching line]"
'--color=[specify when to use colors in output]:when:((
never\:"never use colors"
auto\:"use colors or not based on stdout, TERM, etc."
always\:"always use colors"
ansi\:"always use ANSI colors (even on Windows)"
))'
'*--colors=[specify color and style settings]: :->colorspec'
'--context-separator=[specify string used to separate non-continuous context lines in output]:separator'
$no"--no-context-separator[don't print context separators]"
'--debug[show debug messages]'
'--field-context-separator[set string to delimit fields in context lines]'
'--field-match-separator[set string to delimit fields in matching lines]'
'--hostname-bin=[executable for getting system hostname]:hostname executable:_command_names -e'
'--hyperlink-format=[specify pattern for hyperlinks]:pattern'
'--trace[show more verbose debug messages]'
'--dfa-size-limit=[specify upper size limit of generated DFA]:DFA size (bytes)'
"(1 stats)--files[show each file that would be searched (but don't search)]"
'*--ignore-file=[specify additional ignore file]:ignore file:_files'
'(-v --invert-match)'{-v,--invert-match}'[invert matching]'
$no"--no-invert-match[do not invert matching]"
'(-M --max-columns)'{-M+,--max-columns=}'[specify max length of lines to print]:number of bytes'
'(-m --max-count)'{-m+,--max-count=}'[specify max number of matches per file]:number of matches'
'--max-filesize=[specify size above which files should be ignored]:file size (bytes)'
"--no-config[don't load configuration files]"
'(-0 --null)'{-0,--null}'[print NUL byte after file names]'
'--path-separator=[specify path separator to use when printing file names]:separator'
'(-q --quiet)'{-q,--quiet}'[suppress normal output]'
'--regex-size-limit=[specify upper size limit of compiled regex]:regex size (bytes)'
'*'{-u,--unrestricted}'[reduce level of "smart" searching]'
'--stop-on-nonmatch[stop on first non-matching line after a matching one]'
+ operand # Operands
'(--files --type-list file regexp)1: :_guard "^-*" pattern'
'(--type-list)*: :_files'
)
# This is used with test-complete to verify that there are no options
# listed in the help output that aren't also defined here
[[ $_RG_COMPLETE_LIST_ARGS == (1|t*|y*) ]] && {
print -rl - $args
return 0
}
# Strip out argument groups where unsupported (see above)
[[ $ZSH_VERSION == (4|5.<0-3>)(.*)# ]] &&
args=( ${(@)args:#(#i)(+|[a-z0-9][a-z0-9_-]#|\([a-z0-9][a-z0-9_-]#\))} )
_arguments -C -s -S : $args && ret=0
case $state in
colorspec)
if [[ ${IPREFIX#--*=}$PREFIX == [^:]# ]]; then
suf=( -qS: )
tmp=(
'column:specify coloring for column numbers'
'line:specify coloring for line numbers'
'match:specify coloring for match text'
'path:specify coloring for file names'
)
descr='color/style type'
elif [[ ${IPREFIX#--*=}$PREFIX == (column|line|match|path):[^:]# ]]; then
suf=( -qS: )
tmp=(
'none:clear color/style for type'
'bg:specify background color'
'fg:specify foreground color'
'style:specify text style'
)
descr='color/style attribute'
elif [[ ${IPREFIX#--*=}$PREFIX == [^:]##:(bg|fg):[^:]# ]]; then
tmp=( black blue green red cyan magenta yellow white )
descr='color name or r,g,b'
elif [[ ${IPREFIX#--*=}$PREFIX == [^:]##:style:[^:]# ]]; then
tmp=( {,no}bold {,no}intense {,no}underline )
descr='style name'
else
_message -e colorspec 'no more arguments'
fi
(( $#tmp )) && {
compset -P '*:'
_describe -t colorspec $descr tmp $suf && ret=0
}
;;
typespec)
if compset -P '[^:]##:include:'; then
_sequence -s , _rg_types && ret=0
# @todo This bit in particular could be better, but it's a little
# complex, and attempting to solve it seems to run us up against a crash
# bug — zsh # 40362
elif compset -P '[^:]##:'; then
_message 'glob or include directive' && ret=1
elif [[ ! -prefix *:* ]]; then
_rg_types -qS : && ret=0
fi
;;
esac
return ret
}
# Complete encodings
_rg_encodings() {
local -a expl
local -aU _encodings
_encodings=(
!ENCODINGS!
)
_wanted encodings expl encoding compadd -a "$@" - _encodings
}
# Complete file types
_rg_types() {
local -a expl
local -aU _types
_types=( ${(@)${(f)"$( _call_program types $words[1] --type-list )"}//:[[:space:]]##/:} )
if zstyle -t ":completion:${curcontext}:types" extra-verbose; then
_describe -t types 'file type' _types
else
_wanted types expl 'file type' compadd "$@" - ${(@)_types%%:*}
fi
}
_rg "$@"
################################################################################
# ZSH COMPLETION REFERENCE
#
# For the convenience of developers who aren't especially familiar with zsh
# completion functions, a brief reference guide follows. This is in no way
# comprehensive; it covers just enough of the basic structure, syntax, and
# conventions to help someone make simple changes like adding new options. For
# more complete documentation regarding zsh completion functions, please see the
# following:
#
# * http://zsh.sourceforge.net/Doc/Release/Completion-System.html
# * https://github.com/zsh-users/zsh/blob/master/Etc/completion-style-guide
#
# OVERVIEW
#
# Most zsh completion functions are defined in terms of `_arguments`, which is a
# shell function that takes a series of argument specifications. The specs for
# `rg` are stored in an array, which is common for more complex functions; the
# elements of the array are passed to `_arguments` on invocation.
#
# ARGUMENT-SPECIFICATION SYNTAX
#
# The following is a contrived example of the argument specs for a simple tool:
#
# '(: * -)'{-h,--help}'[display help information]'
# '(-q -v --quiet --verbose)'{-q,--quiet}'[decrease output verbosity]'
# '!(-q -v --quiet --verbose)--silent'
# '(-q -v --quiet --verbose)'{-v,--verbose}'[increase output verbosity]'
# '--color=[specify when to use colors]:when:(always never auto)'
# '*:example file:_files'
#
# Although there may appear to be six specs here, there are actually nine; we
# use brace expansion to combine specs for options that go by multiple names,
# like `-q` and `--quiet`. This is customary, and ties in with the fact that zsh
# merges completion possibilities together when they have the same description.
#
# The first line defines the option `-h`/`--help`. With most tools, it isn't
# useful to complete anything after `--help` because it effectively overrides
# all others; the `(: * -)` at the beginning of the spec tells zsh not to
# complete any other operands (`:` and `*`) or options (`-`) after this one has
# been used. The `[...]` at the end associates a description with `-h`/`--help`;
# as mentioned, zsh will see the identical descriptions and merge these options
# together when offering completion possibilities.
#
# The next line defines `-q`/`--quiet`. Here we don't want to suppress further
# completions entirely, but we don't want to offer `-q` if `--quiet` has been
# given (since they do the same thing), nor do we want to offer `-v` (since it
# doesn't make sense to be quiet and verbose at the same time). We don't need to
# tell zsh not to offer `--quiet` a second time, since that's the default
# behaviour, but since this line expands to two specs describing `-q` *and*
# `--quiet` we do need to explicitly list all of them here.
#
# The next line defines a hidden option `--silent` — maybe it's a deprecated
# synonym for `--quiet`. The leading `!` indicates that zsh shouldn't offer this
# option during completion. The benefit of providing a spec for an option that
# shouldn't be completed is that, if someone *does* use it, we can correctly
# suppress completion of other options afterwards.
#
# The next line defines `-v`/`--verbose`; this works just like `-q`/`--quiet`.
#
# The next line defines `--color`. In this example, `--color` doesn't have a
# corresponding short option, so we don't need to use brace expansion. Further,
# there are no other options it's exclusive with (just itself), so we don't need
# to define those at the beginning. However, it does take a mandatory argument.
# The `=` at the end of `--color=` indicates that the argument may appear either
# like `--color always` or like `--color=always`; this is how most GNU-style
# command-line tools work. The corresponding short option would normally use `+`
# — for example, `-c+` would allow either `-c always` or `-calways`. For this
# option, the arguments are known ahead of time, so we can simply list them in
# parentheses at the end (`when` is used as the description for the argument).
#
# The last line defines an operand (a non-option argument). In this example, the
# operand can be used any number of times (the leading `*`), and it should be a
# file path, so we tell zsh to call the `_files` function to complete it. The
# `example file` in the middle is the description to use for this operand; we
# could use a space instead to accept the default provided by `_files`.
#
# GROUPING ARGUMENT SPECIFICATIONS
#
# Newer versions of zsh support grouping argument specs together. All specs
# following a `+` and then a group name are considered to be members of the
# named group. Grouping is useful mostly for organisational purposes; it makes
# the relationship between different options more obvious, and makes it easier
# to specify exclusions.
#
# We could rewrite our example above using grouping as follows:
#
# '(: * -)'{-h,--help}'[display help information]'
# '--color=[specify when to use colors]:when:(always never auto)'
# '*:example file:_files'
# + '(verbosity)'
# {-q,--quiet}'[decrease output verbosity]'
# '!--silent'
# {-v,--verbose}'[increase output verbosity]'
#
# Here we take advantage of a useful feature of spec grouping — when the group
# name is surrounded by parentheses, as in `(verbosity)`, it tells zsh that all
# of the options in that group are exclusive with each other. As a result, we
# don't need to manually list out the exclusions at the beginning of each
# option.
#
# Groups can also be referred to by name in other argument specs; for example:
#
# '(xyz)--aaa' '*: :_files'
# + xyz --xxx --yyy --zzz
#
# Here we use the group name `xyz` to tell zsh that `--xxx`, `--yyy`, and
# `--zzz` are not to be completed after `--aaa`. This makes the exclusion list
# much more compact and reusable.
#
# CONVENTIONS
#
# zsh completion functions generally adhere to the following conventions:
#
# * Use two spaces for indentation
# * Combine specs for options with different names using brace expansion
# * In combined specs, list the short option first (as in `{-a,--text}`)
# * Use `+` or `=` as described above for options that take arguments
# * Provide a description for all options, option-arguments, and operands
# * Capitalise/punctuate argument descriptions as phrases, not complete
# sentences — 'display help information', never 'Display help information.'
# (but still capitalise acronyms and proper names)
# * Write argument descriptions as verb phrases — 'display x', 'enable y',
# 'use z'
# * Word descriptions to make it clear when an option expects an argument;
# usually this is done with the word 'specify', as in 'specify x' or
# 'use specified x'
# * Write argument descriptions as tersely as possible — for example, articles
# like 'a' and 'the' should be omitted unless it would be confusing
#
# Other conventions currently used by this function:
#
# * Order argument specs alphabetically by group name, then option name
# * Group options that are directly related, mutually exclusive, or frequently
# referenced by other argument specs
# * Use only characters in the set [a-z0-9_-] in group names
# * Order exclusion lists as follows: short options, long options, groups
# * Use American English in descriptions
# * Use 'don't' in descriptions instead of 'do not'
# * Word descriptions for related options as similarly as possible. For example,
# `--foo[enable foo]` and `--no-foo[disable foo]`, or `--foo[use foo]` and
# `--no-foo[don't use foo]`
# * Word descriptions to make it clear when an option only makes sense with
# another option, usually by adding '(with -x)' to the end
# * Don't quote strings or variables unnecessarily. When quotes are required,
# prefer single-quotes to double-quotes
# * Prefix option specs with `$no` when the option serves only to negate the
# behaviour of another option that must be provided explicitly by the user.
# This prevents rarely used options from cluttering up the completion menu
################################################################################
# ------------------------------------------------------------------------------
# Copyright (c) 2011 Github zsh-users - http://github.com/zsh-users
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are met:
# * Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# * Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
# * Neither the name of the zsh-users nor the
# names of its contributors may be used to endorse or promote products
# derived from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
# WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
# DISCLAIMED. IN NO EVENT SHALL ZSH-USERS BE LIABLE FOR ANY
# DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
# (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
# LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
# ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
# SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
# ------------------------------------------------------------------------------
# Description
# -----------
#
# Completion script for ripgrep
#
# ------------------------------------------------------------------------------
# Authors
# -------
#
# * arcizan <ghostrevery@gmail.com>
# * MaskRay <i@maskray.me>
#
# ------------------------------------------------------------------------------
# Local Variables:
# mode: shell-script
# coding: utf-8-unix
# indent-tabs-mode: nil
# sh-indentation: 2
# sh-basic-offset: 2
# End:
# vim: ft=zsh sw=2 ts=2 et


@@ -0,0 +1,23 @@
/*!
Provides completions for ripgrep's CLI for the zsh shell.
Unlike the completions for other shells (at time of writing), zsh's
completions for ripgrep are maintained by hand. This is because:
1. They are lovingly written by an expert in such things.
2. They are much higher in quality than the ones below that are auto-generated.
Namely, the zsh completions take application level context about flag
compatibility into account.
3. There is a CI script that fails if a new flag is added to ripgrep that
isn't included in the zsh completions.
4. There is a wealth of documentation in the zsh script explaining how it
works and how it can be extended.
In principle, I'd be open to maintaining any completion script by hand so
long as it meets criteria 3 and 4 above.
*/
/// Generate completions for zsh.
pub(crate) fn generate() -> String {
include_str!("rg.zsh").replace("!ENCODINGS!", super::ENCODINGS.trim_end())
}
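
The placeholder substitution performed by `generate` can be illustrated in isolation. The following is a minimal sketch, not the crate's code: `TEMPLATE` and `ENCODINGS` are hypothetical stand-ins for the contents of `rg.zsh` and for `super::ENCODINGS`, whose real values are not shown in this diff.

```rust
/// Hypothetical stand-in for `super::ENCODINGS` (assumed to be a
/// newline-separated list that ends with a trailing newline).
const ENCODINGS: &str = "ascii\nutf-8\nutf-16le\n";

/// Hypothetical stand-in for the relevant part of `rg.zsh`.
const TEMPLATE: &str = "_encodings=(\n!ENCODINGS!\n)\n";

/// Splice the encoding list into the template. The trailing newline is
/// trimmed first so that no blank line appears before the closing paren.
fn generate() -> String {
    TEMPLATE.replace("!ENCODINGS!", ENCODINGS.trim_end())
}
```

Trimming only the end keeps the newlines between entries intact, so each encoding still lands on its own line inside the zsh array.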

170
crates/core/flags/config.rs Normal file

@@ -0,0 +1,170 @@
/*!
This module provides routines for reading ripgrep config "rc" files.
The primary output of these routines is a sequence of arguments, where each
argument corresponds precisely to one shell argument.
*/
use std::{
ffi::OsString,
path::{Path, PathBuf},
};
use bstr::{io::BufReadExt, ByteSlice};
/// Return a sequence of arguments derived from ripgrep rc configuration files.
pub fn args() -> Vec<OsString> {
let config_path = match std::env::var_os("RIPGREP_CONFIG_PATH") {
None => return vec![],
Some(config_path) => {
if config_path.is_empty() {
return vec![];
}
PathBuf::from(config_path)
}
};
let (args, errs) = match parse(&config_path) {
Ok((args, errs)) => (args, errs),
Err(err) => {
message!(
"failed to read the file specified in RIPGREP_CONFIG_PATH: {}",
err
);
return vec![];
}
};
if !errs.is_empty() {
for err in errs {
message!("{}:{}", config_path.display(), err);
}
}
log::debug!(
"{}: arguments loaded from config file: {:?}",
config_path.display(),
args
);
args
}
/// Parse a single ripgrep rc file from the given path.
///
/// On success, this returns a set of shell arguments, in order, that should
/// be prepended to the arguments given to ripgrep at the command line.
///
/// If the file could not be read, then an error is returned. If there was
/// a problem parsing one or more lines in the file, then errors are returned
/// for each line in addition to successfully parsed arguments.
fn parse<P: AsRef<Path>>(
path: P,
) -> anyhow::Result<(Vec<OsString>, Vec<anyhow::Error>)> {
let path = path.as_ref();
match std::fs::File::open(&path) {
Ok(file) => parse_reader(file),
Err(err) => anyhow::bail!("{}: {}", path.display(), err),
}
}
/// Parse a single ripgrep rc file from the given reader.
///
/// Callers should not provide a buffered reader, as this routine will use its
/// own buffer internally.
///
/// On success, this returns a set of shell arguments, in order, that should
/// be prepended to the arguments given to ripgrep at the command line.
///
/// If the reader could not be read, then an error is returned. If there was a
/// problem parsing one or more lines, then errors are returned for each line
/// in addition to successfully parsed arguments.
fn parse_reader<R: std::io::Read>(
rdr: R,
) -> anyhow::Result<(Vec<OsString>, Vec<anyhow::Error>)> {
let mut bufrdr = std::io::BufReader::new(rdr);
let (mut args, mut errs) = (vec![], vec![]);
let mut line_number = 0;
bufrdr.for_byte_line_with_terminator(|line| {
line_number += 1;
let line = line.trim();
if line.is_empty() || line[0] == b'#' {
return Ok(true);
}
match line.to_os_str() {
Ok(osstr) => {
args.push(osstr.to_os_string());
}
Err(err) => {
errs.push(anyhow::anyhow!("{line_number}: {err}"));
}
}
Ok(true)
})?;
Ok((args, errs))
}
#[cfg(test)]
mod tests {
use super::parse_reader;
use std::ffi::OsString;
#[test]
fn basic() {
let (args, errs) = parse_reader(
&b"\
# Test
--context=0
--smart-case
-u
# --bar
--foo
"[..],
)
.unwrap();
assert!(errs.is_empty());
let args: Vec<String> =
args.into_iter().map(|s| s.into_string().unwrap()).collect();
assert_eq!(args, vec!["--context=0", "--smart-case", "-u", "--foo",]);
}
// We test that we can handle invalid UTF-8 on Unix-like systems.
#[test]
#[cfg(unix)]
fn error() {
use std::os::unix::ffi::OsStringExt;
let (args, errs) = parse_reader(
&b"\
quux
foo\xFFbar
baz
"[..],
)
.unwrap();
assert!(errs.is_empty());
assert_eq!(
args,
vec![
OsString::from("quux"),
OsString::from_vec(b"foo\xFFbar".to_vec()),
OsString::from("baz"),
]
);
}
// ... but test that invalid UTF-8 fails on Windows.
#[test]
#[cfg(not(unix))]
fn error() {
let (args, errs) = parse_reader(
&b"\
quux
foo\xFFbar
baz
"[..],
)
.unwrap();
assert_eq!(errs.len(), 1);
assert_eq!(args, vec![OsString::from("quux"), OsString::from("baz"),]);
}
}
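
The line-handling rules in `parse_reader` (trim each line, skip blank lines and `#` comments, collect per-line errors alongside the arguments that did parse) can be sketched using only the standard library. This is a simplified illustration under stated assumptions, not the module's implementation: `parse_rc` and `trim_ascii_ws` are hypothetical names, only ASCII whitespace is trimmed, errors are plain strings rather than `anyhow::Error`, and invalid UTF-8 is always reported as an error (which matches the non-Unix test above, not the Unix one).

```rust
/// Trim ASCII whitespace from both ends of a byte slice.
/// (Hypothetical helper; the real code relies on `bstr` for trimming.)
fn trim_ascii_ws(mut bytes: &[u8]) -> &[u8] {
    while let [first, rest @ ..] = bytes {
        if !first.is_ascii_whitespace() {
            break;
        }
        bytes = rest;
    }
    while let [rest @ .., last] = bytes {
        if !last.is_ascii_whitespace() {
            break;
        }
        bytes = rest;
    }
    bytes
}

/// Parse rc-file contents into (arguments, per-line error messages).
/// Each error message is prefixed with its 1-based line number.
fn parse_rc(contents: &[u8]) -> (Vec<String>, Vec<String>) {
    let (mut args, mut errs) = (Vec::new(), Vec::new());
    for (i, line) in contents.split(|&b| b == b'\n').enumerate() {
        let line_number = i + 1;
        let line = trim_ascii_ws(line);
        // Blank lines and comment lines produce no argument.
        if line.is_empty() || line[0] == b'#' {
            continue;
        }
        match std::str::from_utf8(line) {
            Ok(arg) => args.push(arg.to_string()),
            Err(err) => errs.push(format!("{line_number}: {err}")),
        }
    }
    (args, errs)
}
```

A bad line does not abort parsing: the remaining lines are still turned into arguments, which is why the caller receives both vectors.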

7675
crates/core/flags/defs.rs Normal file

File diff suppressed because it is too large


@@ -0,0 +1,259 @@
/*!
Provides routines for generating ripgrep's "short" and "long" help
documentation.
The short version is used when the `-h` flag is given, while the long version
is used when the `--help` flag is given.
*/
use std::{collections::BTreeMap, fmt::Write};
use crate::flags::{defs::FLAGS, doc::version, Category, Flag};
const TEMPLATE_SHORT: &'static str = include_str!("template.short.help");
const TEMPLATE_LONG: &'static str = include_str!("template.long.help");
/// Wraps `std::write!` and asserts there is no failure.
///
/// We only write to `String` in this module.
macro_rules! write {
($($tt:tt)*) => { std::write!($($tt)*).unwrap(); }
}
/// Generate short documentation, i.e., for `-h`.
pub(crate) fn generate_short() -> String {
let mut cats: BTreeMap<Category, (Vec<String>, Vec<String>)> =
BTreeMap::new();
let (mut maxcol1, mut maxcol2) = (0, 0);
for flag in FLAGS.iter().copied() {
let columns =
cats.entry(flag.doc_category()).or_insert((vec![], vec![]));
let (col1, col2) = generate_short_flag(flag);
maxcol1 = maxcol1.max(col1.len());
maxcol2 = maxcol2.max(col2.len());
columns.0.push(col1);
columns.1.push(col2);
}
let mut out =
TEMPLATE_SHORT.replace("!!VERSION!!", &version::generate_digits());
for (cat, (col1, col2)) in cats.iter() {
let var = format!("!!{name}!!", name = cat.as_str());
let val = format_short_columns(col1, col2, maxcol1, maxcol2);
out = out.replace(&var, &val);
}
out
}
/// Generate short documentation for a single flag.
///
/// The first element corresponds to the flag name while the second element
/// corresponds to the documentation string.
fn generate_short_flag(flag: &dyn Flag) -> (String, String) {
let (mut col1, mut col2) = (String::new(), String::new());
// Some of the variable names are fine for longer form
// docs, but they make the succinct short help very noisy.
// So just shorten some of them.
let var = flag.doc_variable().map(|s| {
let mut s = s.to_string();
s = s.replace("SEPARATOR", "SEP");
s = s.replace("REPLACEMENT", "TEXT");
s = s.replace("NUM+SUFFIX?", "NUM");
s
});
// Generate the first column, the flag name.
if let Some(byte) = flag.name_short() {
let name = char::from(byte);
write!(col1, r"-{name}");
write!(col1, r", ");
}
write!(col1, r"--{name}", name = flag.name_long());
if let Some(var) = var.as_ref() {
write!(col1, r"={var}");
}
// And now the second column, with the description.
write!(col2, "{}", flag.doc_short());
(col1, col2)
}
/// Write two columns of documentation.
///
/// `maxcol1` should be the maximum length (in bytes) of the first column.
/// `_maxcol2` is the maximum length (in bytes) of the second column; it is
/// currently unused.
fn format_short_columns(
col1: &[String],
col2: &[String],
maxcol1: usize,
_maxcol2: usize,
) -> String {
assert_eq!(col1.len(), col2.len(), "columns must have equal length");
const PAD: usize = 2;
let mut out = String::new();
for (i, (c1, c2)) in col1.iter().zip(col2.iter()).enumerate() {
if i > 0 {
write!(out, "\n");
}
let pad = maxcol1 - c1.len() + PAD;
write!(out, " ");
write!(out, "{c1}");
write!(out, "{}", " ".repeat(pad));
write!(out, "{c2}");
}
out
}
/// Generate long documentation, i.e., for `--help`.
pub(crate) fn generate_long() -> String {
let mut cats = BTreeMap::new();
for flag in FLAGS.iter().copied() {
let mut cat = cats.entry(flag.doc_category()).or_insert(String::new());
if !cat.is_empty() {
write!(cat, "\n\n");
}
generate_long_flag(flag, &mut cat);
}
let mut out =
TEMPLATE_LONG.replace("!!VERSION!!", &version::generate_digits());
for (cat, value) in cats.iter() {
let var = format!("!!{name}!!", name = cat.as_str());
out = out.replace(&var, value);
}
out
}
/// Write generated documentation for `flag` to `out`.
fn generate_long_flag(flag: &dyn Flag, out: &mut String) {
if let Some(byte) = flag.name_short() {
let name = char::from(byte);
write!(out, r" -{name}");
if let Some(var) = flag.doc_variable() {
write!(out, r" {var}");
}
write!(out, r", ");
} else {
write!(out, r" ");
}
let name = flag.name_long();
write!(out, r"--{name}");
if let Some(var) = flag.doc_variable() {
write!(out, r"={var}");
}
write!(out, "\n");
let doc = flag.doc_long().trim();
let doc = super::render_custom_markup(doc, "flag", |name, out| {
let Some(flag) = crate::flags::parse::lookup(name) else {
unreachable!(r"found unrecognized \flag{{{name}}} in --help docs")
};
if let Some(name) = flag.name_short() {
write!(out, r"-{}/", char::from(name));
}
write!(out, r"--{}", flag.name_long());
});
let doc = super::render_custom_markup(&doc, "flag-negate", |name, out| {
let Some(flag) = crate::flags::parse::lookup(name) else {
unreachable!(
r"found unrecognized \flag-negate{{{name}}} in --help docs"
)
};
let Some(name) = flag.name_negated() else {
let long = flag.name_long();
unreachable!(
"found \\flag-negate{{{long}}} in --help docs but \
{long} does not have a negation"
);
};
write!(out, r"--{name}");
});
let mut cleaned = remove_roff(&doc);
if let Some(negated) = flag.name_negated() {
// Flags that can be negated that aren't switches, like
// --context-separator, are somewhat weird. Because of that, the docs
// for those flags should discuss the semantics of negation explicitly.
// But for switches, the behavior is always the same.
if flag.is_switch() {
write!(cleaned, "\n\nThis flag can be disabled with --{negated}.");
}
}
let indent = " ".repeat(8);
let wrapopts = textwrap::Options::new(71)
// Normally I'd be fine with breaking at hyphens, but ripgrep's docs
// include a lot of flag names, and they in turn contain hyphens.
// Breaking flag names across lines is not great.
.word_splitter(textwrap::WordSplitter::NoHyphenation);
for (i, paragraph) in cleaned.split("\n\n").enumerate() {
if i > 0 {
write!(out, "\n\n");
}
let mut new = paragraph.to_string();
if paragraph.lines().all(|line| line.starts_with(" ")) {
// Re-indent but don't refill so as to preserve line breaks
// in code/shell example snippets.
new = textwrap::indent(&new, &indent);
} else {
new = new.replace("\n", " ");
new = textwrap::refill(&new, &wrapopts);
new = textwrap::indent(&new, &indent);
}
write!(out, "{}", new.trim_end());
}
}
/// Removes roff syntax from `v` such that the result is approximately
/// readable as plain text.
///
/// This is basically a mishmash of heuristics based on the specific roff used
/// in the docs for the flags in this tool. If new kinds of roff are used in
/// the docs, then this may need to be updated to handle them.
fn remove_roff(v: &str) -> String {
let mut lines = vec![];
for line in v.trim().lines() {
assert!(!line.is_empty(), "roff should have no empty lines");
if line.starts_with(".") {
if line.starts_with(".IP ") {
let item_label = line
.split(" ")
.nth(1)
.expect("first argument to .IP")
.replace(r"\(bu", r"•")
.replace(r"\fB", "")
.replace(r"\fP", ":");
lines.push(format!("{item_label}"));
} else if line.starts_with(".IB ") || line.starts_with(".BI ") {
let pieces = line
.split_whitespace()
.skip(1)
.collect::<Vec<_>>()
.concat();
lines.push(format!("{pieces}"));
} else if line.starts_with(".sp")
|| line.starts_with(".PP")
|| line.starts_with(".TP")
{
lines.push("".to_string());
}
} else if line.starts_with(r"\fB") && line.ends_with(r"\fP") {
let line = line.replace(r"\fB", "").replace(r"\fP", "");
lines.push(format!("{line}:"));
} else {
lines.push(line.to_string());
}
}
// Squash multiple adjacent paragraph breaks into one.
lines.dedup_by(|l1, l2| l1.is_empty() && l2.is_empty());
lines
.join("\n")
.replace(r"\fB", "")
.replace(r"\fI", "")
.replace(r"\fP", "")
.replace(r"\-", "-")
.replace(r"\\", r"\")
}
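
The core idea behind `remove_roff` (treat spacing requests as paragraph breaks, squash repeated breaks, then strip font escapes) can be shown with a reduced sketch. This is an illustration under assumptions, not the function above: `roff_to_plain` is a hypothetical name, only `.sp` and `.PP` are recognized, and every other request line is simply dropped.

```rust
/// Convert a small subset of roff into plain text.
fn roff_to_plain(roff: &str) -> String {
    let mut lines: Vec<String> = Vec::new();
    for line in roff.trim().lines() {
        if line.starts_with(".sp") || line.starts_with(".PP") {
            // Spacing requests become an empty line (a paragraph break).
            lines.push(String::new());
        } else if line.starts_with('.') {
            // Any other request is dropped in this sketch.
        } else {
            lines.push(line.to_string());
        }
    }
    // Squash adjacent paragraph breaks into one.
    lines.dedup_by(|a, b| a.is_empty() && b.is_empty());
    lines
        .join("\n")
        .replace(r"\fB", "")
        .replace(r"\fI", "")
        .replace(r"\fP", "")
        .replace(r"\-", "-")
        .replace(r"\\", r"\")
}
```

Font escapes are stripped after the lines are joined, so they are removed wherever they occur within a line.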


@@ -0,0 +1,110 @@
/*!
Provides routines for generating ripgrep's man page in `roff` format.
*/
use std::{collections::BTreeMap, fmt::Write};
use crate::flags::{defs::FLAGS, doc::version, Flag};
const TEMPLATE: &'static str = include_str!("template.rg.1");
/// Wraps `std::write!` and asserts there is no failure.
///
/// We only write to `String` in this module.
macro_rules! write {
($($tt:tt)*) => { std::write!($($tt)*).unwrap(); }
}
/// Wraps `std::writeln!` and asserts there is no failure.
///
/// We only write to `String` in this module.
macro_rules! writeln {
($($tt:tt)*) => { std::writeln!($($tt)*).unwrap(); }
}
/// Returns a `roff` formatted string corresponding to ripgrep's entire man
/// page.
pub(crate) fn generate() -> String {
let mut cats = BTreeMap::new();
for flag in FLAGS.iter().copied() {
let mut cat = cats.entry(flag.doc_category()).or_insert(String::new());
if !cat.is_empty() {
writeln!(cat, ".sp");
}
generate_flag(flag, &mut cat);
}
let mut out = TEMPLATE.replace("!!VERSION!!", &version::generate_digits());
for (cat, value) in cats.iter() {
let var = format!("!!{name}!!", name = cat.as_str());
out = out.replace(&var, value);
}
out
}
/// Writes `roff` formatted documentation for `flag` to `out`.
fn generate_flag(flag: &'static dyn Flag, out: &mut String) {
if let Some(byte) = flag.name_short() {
let name = char::from(byte);
write!(out, r"\fB\-{name}\fP");
if let Some(var) = flag.doc_variable() {
write!(out, r" \fI{var}\fP");
}
write!(out, r", ");
}
let name = flag.name_long();
write!(out, r"\fB\-\-{name}\fP");
if let Some(var) = flag.doc_variable() {
write!(out, r"=\fI{var}\fP");
}
write!(out, "\n");
writeln!(out, ".RS 4");
let doc = flag.doc_long().trim();
// Convert \flag{foo} into something nicer.
let doc = super::render_custom_markup(doc, "flag", |name, out| {
let Some(flag) = crate::flags::parse::lookup(name) else {
unreachable!(r"found unrecognized \flag{{{name}}} in roff docs")
};
out.push_str(r"\fB");
if let Some(name) = flag.name_short() {
write!(out, r"\-{}/", char::from(name));
}
write!(out, r"\-\-{}", flag.name_long());
out.push_str(r"\fP");
});
// Convert \flag-negate{foo} into something nicer.
let doc = super::render_custom_markup(&doc, "flag-negate", |name, out| {
let Some(flag) = crate::flags::parse::lookup(name) else {
unreachable!(
r"found unrecognized \flag-negate{{{name}}} in roff docs"
)
};
let Some(name) = flag.name_negated() else {
let long = flag.name_long();
unreachable!(
"found \\flag-negate{{{long}}} in roff docs but \
{long} does not have a negation"
);
};
out.push_str(r"\fB");
write!(out, r"\-\-{name}");
out.push_str(r"\fP");
});
writeln!(out, "{doc}");
if let Some(negated) = flag.name_negated() {
// Flags that can be negated that aren't switches, like
// --context-separator, are somewhat weird. Because of that, the docs
// for those flags should discuss the semantics of negation explicitly.
// But for switches, the behavior is always the same.
if flag.is_switch() {
writeln!(out, ".sp");
writeln!(
out,
r"This flag can be disabled with \fB\-\-{negated}\fP."
);
}
}
writeln!(out, ".RE");
}
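
The first part of `generate_flag`, which builds the roff heading for a flag, can be exercised on its own. The sketch below is a simplified illustration: `roff_flag_header` is a hypothetical function that takes the short name, long name and variable name directly instead of a `Flag` trait object, and it omits the `.RS`/`.RE` indentation and the documentation body.

```rust
use std::fmt::Write;

/// Build the roff heading line for a flag, e.g. for `-m NUM, --max-count=NUM`.
/// Bold (`\fB`) marks flag names and italic (`\fI`) marks the variable.
fn roff_flag_header(short: Option<u8>, long: &str, var: Option<&str>) -> String {
    let mut out = String::new();
    if let Some(byte) = short {
        // Writing to a `String` cannot fail, so unwrapping is safe here.
        write!(out, r"\fB\-{}\fP", char::from(byte)).unwrap();
        if let Some(var) = var {
            write!(out, r" \fI{var}\fP").unwrap();
        }
        out.push_str(", ");
    }
    write!(out, r"\fB\-\-{long}\fP").unwrap();
    if let Some(var) = var {
        write!(out, r"=\fI{var}\fP").unwrap();
    }
    out
}
```

For a switch without a short name, such as `--debug` with no variable, only the bold long form is produced.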


@@ -0,0 +1,38 @@
/*!
Modules for generating documentation for ripgrep's flags.
*/
pub(crate) mod help;
pub(crate) mod man;
pub(crate) mod version;
/// Searches for `\tag{...}` occurrences in `doc` and calls `replacement` for
/// each such tag found.
///
/// The first argument given to `replacement` is the tag value, `...`. The
/// second argument is the buffer that accumulates the full replacement text.
///
/// Since this function is only intended to be used on doc strings written into
/// the program source code, callers should panic in `replacement` if there are
/// any errors or unexpected circumstances.
fn render_custom_markup(
mut doc: &str,
tag: &str,
mut replacement: impl FnMut(&str, &mut String),
) -> String {
let mut out = String::with_capacity(doc.len());
let tag_prefix = format!(r"\{tag}{{");
while let Some(offset) = doc.find(&tag_prefix) {
out.push_str(&doc[..offset]);
let start = offset + tag_prefix.len();
let Some(end) = doc[start..].find('}').map(|i| start + i) else {
unreachable!(r"found {tag_prefix} without closing }}");
};
let name = &doc[start..end];
replacement(name, &mut out);
doc = &doc[end + 1..];
}
out.push_str(doc);
out
}
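The tag-rewriting loop above can be tried in isolation. The following is a minimal sketch, not the crate's own test code: it restates the function in condensed form (using `expect` instead of `unreachable!`) and adds a hypothetical `render_demo` helper whose closure simply prints each tag value as a long flag.

```rust
/// Condensed restatement of the tag-rewriting loop, for illustration only.
fn render_custom_markup(
    mut doc: &str,
    tag: &str,
    mut replacement: impl FnMut(&str, &mut String),
) -> String {
    let mut out = String::with_capacity(doc.len());
    // For tag = "flag" this is the literal text `\flag{`.
    let tag_prefix = format!(r"\{tag}{{");
    while let Some(offset) = doc.find(&tag_prefix) {
        // Copy everything before the tag verbatim.
        out.push_str(&doc[..offset]);
        let start = offset + tag_prefix.len();
        let end = doc[start..]
            .find('}')
            .map(|i| start + i)
            .expect("tag without closing brace");
        // Let the caller decide what the tag value turns into.
        replacement(&doc[start..end], &mut out);
        // Continue scanning after the closing brace.
        doc = &doc[end + 1..];
    }
    out.push_str(doc);
    out
}

/// Hypothetical helper (not part of ripgrep): renders `\flag{name}` as
/// `--name` so the effect of the loop is easy to see.
fn render_demo(doc: &str) -> String {
    render_custom_markup(doc, "flag", |name, out| {
        out.push_str("--");
        out.push_str(name);
    })
}
```

Calling `render_demo(r"Use \flag{hidden} with \flag{follow}.")` yields `Use --hidden with --follow.`, and text without any tag passes through unchanged.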


@@ -0,0 +1,61 @@
ripgrep !!VERSION!!
Andrew Gallant <jamslam@gmail.com>
ripgrep (rg) recursively searches the current directory for lines matching
a regex pattern. By default, ripgrep will respect gitignore rules and
automatically skip hidden files/directories and binary files.
Use -h for short descriptions and --help for more details.
Project home page: https://github.com/BurntSushi/ripgrep
USAGE:
rg [OPTIONS] PATTERN [PATH ...]
rg [OPTIONS] -e PATTERN ... [PATH ...]
rg [OPTIONS] -f PATTERNFILE ... [PATH ...]
rg [OPTIONS] --files [PATH ...]
rg [OPTIONS] --type-list
command | rg [OPTIONS] PATTERN
rg [OPTIONS] --help
rg [OPTIONS] --version
POSITIONAL ARGUMENTS:
<PATTERN>
A regular expression used for searching. To match a pattern beginning
with a dash, use the -e/--regexp flag.
For example, to search for the literal '-foo', you can use this flag:
rg -e -foo
You can also use the special '--' delimiter to indicate that no more
flags will be provided. Namely, the following is equivalent to the
above:
rg -- -foo
<PATH>...
A file or directory to search. Directories are searched recursively.
File paths specified on the command line override glob and ignore
rules.
INPUT OPTIONS:
!!input!!
SEARCH OPTIONS:
!!search!!
FILTER OPTIONS:
!!filter!!
OUTPUT OPTIONS:
!!output!!
OUTPUT MODES:
!!output-modes!!
LOGGING OPTIONS:
!!logging!!
OTHER BEHAVIORS:
!!other-behaviors!!


@@ -0,0 +1,424 @@
.TH RG 1 2023-11-26 "!!VERSION!!" "User Commands"
.
.
.SH NAME
rg \- recursively search the current directory for lines matching a pattern
.
.
.SH SYNOPSIS
.\" I considered using GNU troff's .SY and .YS "synopsis" macros here, but it
.\" looks like they aren't portable. Specifically, they don't appear to be in
.\" BSD's mdoc used on macOS.
.sp
\fBrg\fP [\fIOPTIONS\fP] \fIPATTERN\fP [\fIPATH\fP...]
.sp
\fBrg\fP [\fIOPTIONS\fP] \fB\-e\fP \fIPATTERN\fP... [\fIPATH\fP...]
.sp
\fBrg\fP [\fIOPTIONS\fP] \fB\-f\fP \fIPATTERNFILE\fP... [\fIPATH\fP...]
.sp
\fBrg\fP [\fIOPTIONS\fP] \fB\-\-files\fP [\fIPATH\fP...]
.sp
\fBrg\fP [\fIOPTIONS\fP] \fB\-\-type\-list\fP
.sp
\fIcommand\fP | \fBrg\fP [\fIOPTIONS\fP] \fIPATTERN\fP
.sp
\fBrg\fP [\fIOPTIONS\fP] \fB\-\-help\fP
.sp
\fBrg\fP [\fIOPTIONS\fP] \fB\-\-version\fP
.
.
.SH DESCRIPTION
ripgrep (rg) recursively searches the current directory for a regex pattern.
By default, ripgrep will respect your \fB.gitignore\fP and automatically skip
hidden files/directories and binary files.
.sp
ripgrep's default regex engine uses finite automata and guarantees linear
time searching. Because of this, features like backreferences and arbitrary
look-around are not supported. However, if ripgrep is built with PCRE2,
then the \fB\-P/\-\-pcre2\fP flag can be used to enable backreferences and
look-around.
.sp
ripgrep supports configuration files. Set \fBRIPGREP_CONFIG_PATH\fP to a
configuration file. The file can specify one shell argument per line. Lines
starting with \fB#\fP are ignored. For more details, see \fBCONFIGURATION
FILES\fP below.
.sp
ripgrep will automatically detect if stdin exists and search stdin for a regex
pattern, e.g. \fBls | rg foo\fP. In some environments, stdin may exist when
it shouldn't. To turn off stdin detection, one can explicitly specify the
directory to search, e.g. \fBrg foo ./\fP.
.sp
Like other tools such as \fBls\fP, ripgrep will alter its output depending on
whether stdout is connected to a tty. By default, when printing to a tty, ripgrep
will enable colors, line numbers and a heading format that lists each matching
file path once instead of once per matching line.
.sp
Tip: to disable all smart filtering and make ripgrep behave a bit more like
classical grep, use \fBrg \-uuu\fP.
.
.
.SH REGEX SYNTAX
ripgrep uses Rust's regex engine by default, which documents its syntax:
\fIhttps://docs.rs/regex/1.*/regex/#syntax\fP
.sp
ripgrep uses byte-oriented regexes, which have some additional documentation:
\fIhttps://docs.rs/regex/1.*/regex/bytes/index.html#syntax\fP
.sp
To a first approximation, ripgrep uses Perl-like regexes without look-around or
backreferences. This makes them very similar to the "extended" (ERE) regular
expressions supported by \fBegrep\fP, but with a few additional features like
Unicode character classes.
.sp
If you're using ripgrep with the \fB\-P/\-\-pcre2\fP flag, then please consult
\fIhttps://www.pcre.org\fP or the PCRE2 man pages for documentation on the
supported syntax.
.
.
.SH POSITIONAL ARGUMENTS
.TP 12
\fIPATTERN\fP
A regular expression used for searching. To match a pattern beginning with a
dash, use the \fB\-e/\-\-regexp\fP option.
.TP 12
\fIPATH\fP
A file or directory to search. Directories are searched recursively. File paths
specified explicitly on the command line override glob and ignore rules.
.
.
.SH OPTIONS
This section documents all flags that ripgrep accepts. Flags are grouped into
categories below according to their function.
.sp
Note that many options can be turned on and off. In some cases, those flags are
not listed explicitly below. For example, the \fB\-\-column\fP flag (listed
below) enables column numbers in ripgrep's output, but the \fB\-\-no\-column\fP
flag (not listed below) disables them. The reverse can also exist. For example,
the \fB\-\-no\-ignore\fP flag (listed below) disables ripgrep's \fBgitignore\fP
logic, but the \fB\-\-ignore\fP flag (not listed below) enables it. These
flags are useful for overriding a ripgrep configuration file (or alias) on the
command line. Each flag's documentation notes whether an inverted flag exists.
In all cases, the flag specified last takes precedence.
.
.SS INPUT OPTIONS
!!input!!
.
.SS SEARCH OPTIONS
!!search!!
.
.SS FILTER OPTIONS
!!filter!!
.
.SS OUTPUT OPTIONS
!!output!!
.
.SS OUTPUT MODES
!!output-modes!!
.
.SS LOGGING OPTIONS
!!logging!!
.
.SS OTHER BEHAVIORS
!!other-behaviors!!
.
.
.SH EXIT STATUS
If ripgrep finds a match, then the exit status of the program is \fB0\fP.
If no match could be found, then the exit status is \fB1\fP. If an error
occurred, then the exit status is always \fB2\fP unless ripgrep was run with
the \fB\-q/\-\-quiet\fP flag and a match was found. In summary:
.sp
.IP \(bu 3n
\fB0\fP exit status occurs only when at least one match was found, and if
no error occurred, unless \fB\-q/\-\-quiet\fP was given.
.
.IP \(bu 3n
\fB1\fP exit status occurs only when no match was found and no error occurred.
.
.IP \(bu 3n
\fB2\fP exit status occurs when an error occurred. This is true for both
catastrophic errors (e.g., a regex syntax error) and for soft errors (e.g.,
unable to read a file).
.
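The three rules in this section can be written as one small function. This is a minimal sketch under stated assumptions: `RunOutcome` and `exit_status` are hypothetical names chosen for illustration and do not come from ripgrep's source.

```rust
/// Hypothetical summary of a finished run. The field names are
/// illustrative and are not taken from ripgrep's source.
#[derive(Clone, Copy, Debug)]
pub struct RunOutcome {
    /// At least one match was found.
    pub matched: bool,
    /// At least one error (fatal or soft) occurred.
    pub errored: bool,
    /// The -q/--quiet flag was given.
    pub quiet: bool,
}

/// Maps an outcome to the exit status described in EXIT STATUS.
pub fn exit_status(outcome: RunOutcome) -> i32 {
    if outcome.errored && !(outcome.quiet && outcome.matched) {
        // An error wins unless --quiet was given and a match was found.
        2
    } else if outcome.matched {
        0
    } else {
        1
    }
}
```

Under this reading, an error combined with `--quiet` still produces `2` when nothing matched; only a quiet run that found a match turns an error into `0`.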
.
.SH AUTOMATIC FILTERING
ripgrep does a fair bit of automatic filtering by default. This section
describes that filtering and how to control it.
.sp
\fBTIP\fP: To disable automatic filtering, use \fBrg \-uuu\fP.
.sp
ripgrep's automatic "smart" filtering is one of the most apparent
differentiating features between ripgrep and other tools like \fBgrep\fP. As
such, its behavior may be surprising to users that aren't expecting it.
.sp
ripgrep does four types of filtering automatically:
.sp
.
.IP 1. 3n
Files and directories that match ignore rules are not searched.
.IP 2. 3n
Hidden files and directories are not searched.
.IP 3. 3n
Binary files (files with a \fBNUL\fP byte) are not searched.
.IP 4. 3n
Symbolic links are not followed.
.PP
The first type of filtering is the most sophisticated. ripgrep will attempt to
respect your \fBgitignore\fP rules as faithfully as possible. In particular,
this includes the following:
.
.IP \(bu 3n
Any global rules, e.g., in \fB$HOME/.config/git/ignore\fP.
.
.IP \(bu 3n
Any rules in relevant \fB.gitignore\fP files. This includes \fB.gitignore\fP
files in parent directories that are part of the same \fBgit\fP repository.
(Unless \fB\-\-no\-require\-git\fP is given.)
.
.IP \(bu 3n
Any local rules, e.g., in \fB.git/info/exclude\fP.
.PP
In some cases, ripgrep and \fBgit\fP will not be in sync in terms
of which files are ignored. For example, a file that is ignored via
\fB.gitignore\fP but is tracked by \fBgit\fP would not be searched by ripgrep
even though \fBgit\fP tracks it. This is unlikely to ever be fixed. Instead,
you should either make sure your exclude rules match the files you track
precisely, or otherwise use \fBgit grep\fP for search.
.sp
Additional ignore rules can be provided outside of a \fBgit\fP context:
.
.IP \(bu 3n
Any rules in \fB.ignore\fP. ripgrep will also respect \fB.ignore\fP files in
parent directories.
.
.IP \(bu 3n
Any rules in \fB.rgignore\fP. ripgrep will also respect \fB.rgignore\fP files
in parent directories.
.
.IP \(bu 3n
Any rules in files specified with the \fB\-\-ignore\-file\fP flag.
.PP
The precedence of ignore rules is as follows, with later items overriding
earlier items:
.
.IP \(bu 3n
Files given by \fB\-\-ignore\-file\fP.
.
.IP \(bu 3n
Global gitignore rules, e.g., from \fB$HOME/.config/git/ignore\fP.
.
.IP \(bu 3n
Local rules from \fB.git/info/exclude\fP.
.
.IP \(bu 3n
Rules from \fB.gitignore\fP.
.
.IP \(bu 3n
Rules from \fB.ignore\fP.
.
.IP \(bu 3n
Rules from \fB.rgignore\fP.
.PP
So for example, if \fIfoo\fP were in a \fB.gitignore\fP and \fB!\fP\fIfoo\fP
were in an \fB.rgignore\fP, then \fIfoo\fP would not be ignored since
\fB.rgignore\fP takes precedence over \fB.gitignore\fP.
.sp
Each of the types of filtering can be configured via command line flags:
.
.IP \(bu 3n
There are several flags starting with \fB\-\-no\-ignore\fP that toggle which,
if any, ignore rules are respected. \fB\-\-no\-ignore\fP by itself will disable
all of them.
.
.IP \(bu 3n
\fB\-./\-\-hidden\fP will force ripgrep to search hidden files and directories.
.
.IP \(bu 3n
\fB\-\-binary\fP will force ripgrep to search binary files.
.
.IP \(bu 3n
\fB\-L/\-\-follow\fP will force ripgrep to follow symlinks.
.PP
As a special short hand, the \fB\-u\fP flag can be specified up to three times.
Each additional time incrementally decreases filtering:
.
.IP \(bu 3n
\fB\-u\fP is equivalent to \fB\-\-no\-ignore\fP.
.
.IP \(bu 3n
\fB\-uu\fP is equivalent to \fB\-\-no\-ignore \-\-hidden\fP.
.
.IP \(bu 3n
\fB\-uuu\fP is equivalent to \fB\-\-no\-ignore \-\-hidden \-\-binary\fP.
.PP
In particular, \fBrg \-uuu\fP should search exactly the same content as \fBgrep
\-r\fP.
.
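The \-u shorthand described above amounts to a lookup from a repeat count to long flags. The sketch below is illustrative only: `expand_unrestricted` is a hypothetical name, and treating counts above three the same as three is an assumption, since the text only says the flag can be given up to three times.

```rust
/// Expands a repeat count for `-u` into the equivalent long flags.
///
/// Hypothetical helper for illustration. Counts above three are treated
/// like three here, which is an assumption rather than documented behavior.
pub fn expand_unrestricted(count: usize) -> Vec<&'static str> {
    let mut flags = Vec::new();
    if count >= 1 {
        // -u: stop respecting ignore rules.
        flags.push("--no-ignore");
    }
    if count >= 2 {
        // -uu: also search hidden files and directories.
        flags.push("--hidden");
    }
    if count >= 3 {
        // -uuu: also search binary files.
        flags.push("--binary");
    }
    flags
}
```

For example, `expand_unrestricted(2)` returns `["--no-ignore", "--hidden"]`.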
.
.SH CONFIGURATION FILES
ripgrep supports reading configuration files that change ripgrep's default
behavior. The format of the configuration file is an "rc" style and is very
simple. It is defined by two rules:
.
.IP 1. 3n
Every line is a shell argument, after trimming whitespace.
.
.IP 2. 3n
Lines starting with \fB#\fP (optionally preceded by any amount of whitespace)
are ignored.
.PP
ripgrep will look for a single configuration file if and only if the
\fBRIPGREP_CONFIG_PATH\fP environment variable is set and is non-empty.
ripgrep will parse arguments from this file on startup and will behave as if
the arguments in this file were prepended to any explicit arguments given to
ripgrep on the command line. Note though that the \fBrg\fP command you run
must still be valid. That is, it must always contain at least one pattern at
the command line, even if the configuration file uses the \fB\-e/\-\-regexp\fP
flag.
.sp
For example, if your ripgreprc file contained a single line:
.sp
.EX
\-\-smart\-case
.EE
.sp
then the following command
.sp
.EX
RIPGREP_CONFIG_PATH=wherever/.ripgreprc rg foo
.EE
.sp
would behave identically to the following command:
.sp
.EX
rg \-\-smart\-case foo
.EE
.sp
Another example is adding types, like so:
.sp
.EX
\-\-type\-add
web:*.{html,css,js}*
.EE
.sp
The above would behave identically to the following command:
.sp
.EX
rg \-\-type\-add 'web:*.{html,css,js}*' foo
.EE
.sp
The same applies to using globs. This:
.sp
.EX
\-\-glob=!.git
.EE
.sp
or this:
.sp
.EX
\-\-glob
!.git
.EE
.sp
would behave identically to the following command:
.sp
.EX
rg \-\-glob '!.git' foo
.EE
.sp
The bottom line is that every shell argument needs to be on its own line. So
for example, a config file containing
.sp
.EX
\-j 4
.EE
.sp
is probably not doing what you intend. Instead, you want
.sp
.EX
\-j
4
.EE
.sp
or
.sp
.EX
\-j4
.EE
.sp
ripgrep also provides a flag, \fB\-\-no\-config\fP, that when present will
suppress any and all support for configuration. This includes any future
support for auto-loading configuration files from pre-determined paths.
.sp
Conflicts between configuration files and explicit arguments are handled
exactly like conflicts in the same command line invocation. That is, assuming
your config file contains only \fB\-\-smart\-case\fP, then this command:
.sp
.EX
RIPGREP_CONFIG_PATH=wherever/.ripgreprc rg foo \-\-case\-sensitive
.EE
.sp
is exactly equivalent to
.sp
.EX
rg \-\-smart\-case foo \-\-case\-sensitive
.EE
.sp
in which case, the \fB\-\-case\-sensitive\fP flag would override the
\fB\-\-smart\-case\fP flag.
.
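The two format rules and the "prepended to the command line" behavior in this section can be sketched in a few lines. This is a simplified illustration, not ripgrep's parser: `parse_config` and `effective_args` are hypothetical names, the sketch works on UTF-8 strings only, and skipping empty lines is an assumption the text above does not state.

```rust
/// Splits config file contents into arguments, one per line.
///
/// Hypothetical helper. Each line is trimmed, and lines starting with `#`
/// are dropped. Dropping empty lines is an assumption of this sketch.
pub fn parse_config(contents: &str) -> Vec<String> {
    contents
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .map(String::from)
        .collect()
}

/// Builds the argument list as if the config arguments had been typed
/// before the explicit command line arguments.
pub fn effective_args(config: &str, cli: &[&str]) -> Vec<String> {
    let mut args = parse_config(config);
    args.extend(cli.iter().map(|arg| arg.to_string()));
    args
}
```

With a config containing only `--smart-case`, `effective_args(config, &["foo", "--case-sensitive"])` gives `--smart-case foo --case-sensitive`, so the later flag wins. A line such as `-j 4` stays a single argument containing a space, which is why it does not behave like `-j4`.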
.
.SH SHELL COMPLETION
Shell completion files are included in the release tarball for Bash, Fish, Zsh
and PowerShell.
.sp
For \fBbash\fP, move \fBrg.bash\fP to \fB$XDG_CONFIG_HOME/bash_completion\fP or
\fB/etc/bash_completion.d/\fP.
.sp
For \fBfish\fP, move \fBrg.fish\fP to \fB$HOME/.config/fish/completions\fP.
.sp
For \fBzsh\fP, move \fB_rg\fP to one of your \fB$fpath\fP directories.
.
.
.SH CAVEATS
ripgrep may abort unexpectedly when using default settings if it searches a
file that is simultaneously truncated. This behavior can be avoided by passing
the \fB\-\-no\-mmap\fP flag which will forcefully disable the use of memory
maps in all cases.
.sp
ripgrep may use a large amount of memory depending on a few factors. Firstly,
if ripgrep uses parallelism for search (the default), then the entire
output for each individual file is buffered into memory in order to prevent
interleaving matches in the output. To avoid this, you can disable parallelism
with the \fB\-j1\fP flag. Secondly, ripgrep always needs to have at least a
single line in memory in order to execute a search. A file with a very long
line can thus cause ripgrep to use a lot of memory. Generally, this only occurs
when searching binary data with the \fB\-a/\-\-text\fP flag enabled. (When the
\fB\-a/\-\-text\fP flag isn't enabled, ripgrep will replace all NUL bytes with
line terminators, which typically prevents exorbitant memory usage.) Thirdly,
when ripgrep searches a large file using a memory map, the process will likely
report its resident memory usage as the size of the file. However, this does
not mean ripgrep actually needed to use that much heap memory; the operating
system will generally handle this for you.
.
.
.SH VERSION
!!VERSION!!
.
.
.SH HOMEPAGE
\fIhttps://github.com/BurntSushi/ripgrep\fP
.sp
Please report bugs and feature requests to the issue tracker. Please do your
best to provide a reproducible test case for bugs. This should include the
corpus being searched, the \fBrg\fP command, the actual output and the expected
output. Please also include the output of running the same \fBrg\fP command but
with the \fB\-\-debug\fP flag.
.sp
If you have questions that don't obviously fall into the "bug" or "feature
request" category, then they are welcome in the Discussions section of the
issue tracker: \fIhttps://github.com/BurntSushi/ripgrep/discussions\fP.
.
.
.SH AUTHORS
Andrew Gallant <\fIjamslam@gmail.com\fP>


@@ -0,0 +1,38 @@
ripgrep !!VERSION!!
Andrew Gallant <jamslam@gmail.com>
ripgrep (rg) recursively searches the current directory for lines matching
a regex pattern. By default, ripgrep will respect gitignore rules and
automatically skip hidden files/directories and binary files.
Use -h for short descriptions and --help for more details.
Project home page: https://github.com/BurntSushi/ripgrep
USAGE:
rg [OPTIONS] PATTERN [PATH ...]
POSITIONAL ARGUMENTS:
<PATTERN> A regular expression used for searching.
<PATH>... A file or directory to search.
INPUT OPTIONS:
!!input!!
SEARCH OPTIONS:
!!search!!
FILTER OPTIONS:
!!filter!!
OUTPUT OPTIONS:
!!output!!
OUTPUT MODES:
!!output-modes!!
LOGGING OPTIONS:
!!logging!!
OTHER BEHAVIORS:
!!other-behaviors!!


@@ -0,0 +1,180 @@
/*!
Provides routines for generating version strings.
Version strings can be just the digits, an overall short one-line description
or something more verbose that includes things like CPU target feature support.
*/
use std::fmt::Write;
/// Generates just the numerical part of the version of ripgrep.
///
/// This includes the git revision hash.
pub(crate) fn generate_digits() -> String {
let semver = option_env!("CARGO_PKG_VERSION").unwrap_or("N/A");
match option_env!("RIPGREP_BUILD_GIT_HASH") {
None => semver.to_string(),
Some(hash) => format!("{semver} (rev {hash})"),
}
}
/// Generates a short version string of the form `ripgrep x.y.z`.
pub(crate) fn generate_short() -> String {
let digits = generate_digits();
format!("ripgrep {digits}")
}
/// Generates a longer multi-line version string.
///
/// This includes not only the version of ripgrep but some other information
/// about its build. For example, SIMD support and PCRE2 support.
pub(crate) fn generate_long() -> String {
let (compile, runtime) = (compile_cpu_features(), runtime_cpu_features());
let mut out = String::new();
writeln!(out, "{}", generate_short()).unwrap();
writeln!(out).unwrap();
writeln!(out, "features:{}", features().join(",")).unwrap();
if !compile.is_empty() {
writeln!(out, "simd(compile):{}", compile.join(",")).unwrap();
}
if !runtime.is_empty() {
writeln!(out, "simd(runtime):{}", runtime.join(",")).unwrap();
}
let (pcre2_version, _) = generate_pcre2();
writeln!(out, "\n{pcre2_version}").unwrap();
out
}
/// Generates multi-line version string with PCRE2 information.
///
/// This also returns whether PCRE2 is actually available in this build of
/// ripgrep.
pub(crate) fn generate_pcre2() -> (String, bool) {
let mut out = String::new();
#[cfg(feature = "pcre2")]
{
use grep::pcre2;
let (major, minor) = pcre2::version();
write!(out, "PCRE2 {}.{} is available", major, minor).unwrap();
if cfg!(target_pointer_width = "64") && pcre2::is_jit_available() {
writeln!(out, " (JIT is available)").unwrap();
} else {
writeln!(out, " (JIT is unavailable)").unwrap();
}
(out, true)
}
#[cfg(not(feature = "pcre2"))]
{
writeln!(out, "PCRE2 is not available in this build of ripgrep.")
.unwrap();
(out, false)
}
}
/// Returns the relevant SIMD features supported by the CPU at runtime.
///
/// This is kind of a dirty violation of abstraction, since it assumes
/// knowledge about what specific SIMD features are being used by various
/// components.
fn runtime_cpu_features() -> Vec<String> {
#[cfg(target_arch = "x86_64")]
{
let mut features = vec![];
let sse2 = is_x86_feature_detected!("sse2");
features.push(format!("{sign}SSE2", sign = sign(sse2)));
let ssse3 = is_x86_feature_detected!("ssse3");
features.push(format!("{sign}SSSE3", sign = sign(ssse3)));
let avx2 = is_x86_feature_detected!("avx2");
features.push(format!("{sign}AVX2", sign = sign(avx2)));
features
}
#[cfg(target_arch = "aarch64")]
{
let mut features = vec![];
// memchr and aho-corasick only use NEON when it is available at
// compile time. This isn't strictly necessary, but NEON is supposed
// to be available for all aarch64 targets. If this isn't true, please
// file an issue at https://github.com/BurntSushi/memchr.
let neon = cfg!(target_feature = "neon");
features.push(format!("{sign}NEON", sign = sign(neon)));
features
}
#[cfg(not(any(target_arch = "x86_64", target_arch = "aarch64")))]
{
vec![]
}
}
/// Returns the SIMD features supported while compiling ripgrep.
///
/// In essence, any features listed here are required to run ripgrep correctly.
///
/// This is kind of a dirty violation of abstraction, since it assumes
/// knowledge about what specific SIMD features are being used by various
/// components.
///
/// An easy way to enable everything available on your current CPU is to
/// compile ripgrep with `RUSTFLAGS="-C target-cpu=native"`. But note that
/// the binary produced by this will not be portable.
fn compile_cpu_features() -> Vec<String> {
#[cfg(target_arch = "x86_64")]
{
let mut features = vec![];
let sse2 = cfg!(target_feature = "sse2");
features.push(format!("{sign}SSE2", sign = sign(sse2)));
let ssse3 = cfg!(target_feature = "ssse3");
features.push(format!("{sign}SSSE3", sign = sign(ssse3)));
let avx2 = cfg!(target_feature = "avx2");
features.push(format!("{sign}AVX2", sign = sign(avx2)));
features
}
#[cfg(target_arch = "aarch64")]
{
let mut features = vec![];
let neon = cfg!(target_feature = "neon");
features.push(format!("{sign}NEON", sign = sign(neon)));
features
}
#[cfg(not(any(target_arch = "x86_64", target_arch = "aarch64")))]
{
vec![]
}
}
/// Returns a list of "features" supported (or not) by this build of ripgrep.
fn features() -> Vec<String> {
let mut features = vec![];
let simd_accel = cfg!(feature = "simd-accel");
features.push(format!("{sign}simd-accel", sign = sign(simd_accel)));
let pcre2 = cfg!(feature = "pcre2");
features.push(format!("{sign}pcre2", sign = sign(pcre2)));
features
}
/// Returns `+` when `enabled` is `true` and `-` otherwise.
fn sign(enabled: bool) -> &'static str {
if enabled {
"+"
} else {
"-"
}
}
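The `features:` line of the long version output is a comma-joined list of signed names. A minimal sketch of that formatting, assuming a hypothetical `render_features` helper that takes the feature states as plain data instead of reading them from `cfg!`:

```rust
/// Returns `+` when `enabled` is `true` and `-` otherwise.
fn sign(enabled: bool) -> &'static str {
    if enabled {
        "+"
    } else {
        "-"
    }
}

/// Hypothetical helper: formats (name, enabled) pairs the way the
/// `features:` line is built, e.g. `features:+simd-accel,-pcre2`.
pub fn render_features(features: &[(&str, bool)]) -> String {
    let parts: Vec<String> = features
        .iter()
        .map(|&(name, enabled)| format!("{}{}", sign(enabled), name))
        .collect();
    format!("features:{}", parts.join(","))
}
```

For a build with `simd-accel` enabled and `pcre2` disabled, `render_features(&[("simd-accel", true), ("pcre2", false)])` produces `features:+simd-accel,-pcre2`.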

1462
crates/core/flags/hiargs.rs Normal file

File diff suppressed because it is too large


@@ -0,0 +1,758 @@
/*!
Provides the definition of low level arguments from CLI flags.
*/
use std::{
ffi::{OsStr, OsString},
path::PathBuf,
};
use {
bstr::{BString, ByteVec},
grep::printer::{HyperlinkFormat, UserColorSpec},
};
/// A collection of "low level" arguments.
///
/// The "low level" here is meant to constrain this type to be as close to the
/// actual CLI flags and arguments as possible. Namely, other than some
/// convenience types to help validate flag values and deal with overrides
/// between flags, these low level arguments do not contain any higher level
/// abstractions.
///
/// Another self-imposed constraint is that populating low level arguments
/// should not require anything other than validating what the user has
/// provided. For example, low level arguments should not contain a
/// `HyperlinkConfig`, since in order to get a full configuration, one needs to
/// discover the hostname of the current system (which might require running a
/// binary or a syscall).
///
/// Low level arguments are populated by the parser directly via the `update`
/// method on the corresponding implementation of the `Flag` trait.
#[derive(Debug, Default)]
pub(crate) struct LowArgs {
// Essential arguments.
pub(crate) special: Option<SpecialMode>,
pub(crate) mode: Mode,
pub(crate) positional: Vec<OsString>,
pub(crate) patterns: Vec<PatternSource>,
// Everything else, sorted lexicographically.
pub(crate) binary: BinaryMode,
pub(crate) boundary: Option<BoundaryMode>,
pub(crate) buffer: BufferMode,
pub(crate) byte_offset: bool,
pub(crate) case: CaseMode,
pub(crate) color: ColorChoice,
pub(crate) colors: Vec<UserColorSpec>,
pub(crate) column: Option<bool>,
pub(crate) context: ContextMode,
pub(crate) context_separator: ContextSeparator,
pub(crate) crlf: bool,
pub(crate) dfa_size_limit: Option<usize>,
pub(crate) encoding: EncodingMode,
pub(crate) engine: EngineChoice,
pub(crate) field_context_separator: FieldContextSeparator,
pub(crate) field_match_separator: FieldMatchSeparator,
pub(crate) fixed_strings: bool,
pub(crate) follow: bool,
pub(crate) glob_case_insensitive: bool,
pub(crate) globs: Vec<String>,
pub(crate) heading: Option<bool>,
pub(crate) hidden: bool,
pub(crate) hostname_bin: Option<PathBuf>,
pub(crate) hyperlink_format: HyperlinkFormat,
pub(crate) iglobs: Vec<String>,
pub(crate) ignore_file: Vec<PathBuf>,
pub(crate) ignore_file_case_insensitive: bool,
pub(crate) include_zero: bool,
pub(crate) invert_match: bool,
pub(crate) line_number: Option<bool>,
pub(crate) logging: Option<LoggingMode>,
pub(crate) max_columns: Option<u64>,
pub(crate) max_columns_preview: bool,
pub(crate) max_count: Option<u64>,
pub(crate) max_depth: Option<usize>,
pub(crate) max_filesize: Option<u64>,
pub(crate) mmap: MmapMode,
pub(crate) multiline: bool,
pub(crate) multiline_dotall: bool,
pub(crate) no_config: bool,
pub(crate) no_ignore_dot: bool,
pub(crate) no_ignore_exclude: bool,
pub(crate) no_ignore_files: bool,
pub(crate) no_ignore_global: bool,
pub(crate) no_ignore_messages: bool,
pub(crate) no_ignore_parent: bool,
pub(crate) no_ignore_vcs: bool,
pub(crate) no_messages: bool,
pub(crate) no_require_git: bool,
pub(crate) no_unicode: bool,
pub(crate) null: bool,
pub(crate) null_data: bool,
pub(crate) one_file_system: bool,
pub(crate) only_matching: bool,
pub(crate) path_separator: Option<u8>,
pub(crate) pre: Option<PathBuf>,
pub(crate) pre_glob: Vec<String>,
pub(crate) quiet: bool,
pub(crate) regex_size_limit: Option<usize>,
pub(crate) replace: Option<BString>,
pub(crate) search_zip: bool,
pub(crate) sort: Option<SortMode>,
pub(crate) stats: bool,
pub(crate) stop_on_nonmatch: bool,
pub(crate) threads: Option<usize>,
pub(crate) trim: bool,
pub(crate) type_changes: Vec<TypeChange>,
pub(crate) unrestricted: usize,
pub(crate) vimgrep: bool,
pub(crate) with_filename: Option<bool>,
}
/// A "special" mode that supersedes everything else.
///
/// When one of these modes is present, it overrides everything else and causes
/// ripgrep to short-circuit. In particular, we avoid converting low-level
/// argument types into higher level argument types that can fail for various
/// reasons related to the environment. (Parsing the low-level arguments can
/// fail too, but usually not in a way that can't be worked around by removing
/// the corresponding arguments from the CLI command.) This is overall a hedge
/// to ensure that version and help information are basically always available.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub(crate) enum SpecialMode {
/// Show a condensed version of "help" output. Generally speaking, this
/// shows each flag and an extremely terse description of that flag on
/// a single line. This corresponds to the `-h` flag.
HelpShort,
/// Shows a very verbose version of the "help" output. The docs for some
/// flags will be paragraphs long. This corresponds to the `--help` flag.
HelpLong,
/// Show condensed version information. e.g., `ripgrep x.y.z`.
VersionShort,
/// Show verbose version information. Includes "short" information as well
/// as features included in the build.
VersionLong,
/// Show PCRE2's version information, or an error if this version of
/// ripgrep wasn't compiled with PCRE2 support.
VersionPCRE2,
}
/// The overall mode that ripgrep should operate in.
///
/// If ripgrep were designed without the legacy of grep, these would probably
/// be sub-commands? Perhaps not, since they aren't as frequently used.
///
/// The point of putting these in one enum is that they are all mutually
/// exclusive and override one another.
///
/// Note that -h/--help and -V/--version are not included in this because
/// they always override everything else, regardless of where they appear
/// in the command line. They are treated as "special" modes that short-circuit
/// ripgrep's usual flow.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub(crate) enum Mode {
/// ripgrep will execute a search of some kind.
Search(SearchMode),
/// Show the files that *would* be searched, but don't actually search
/// them.
Files,
/// List all file type definitions configured, including the default file
/// types and any additional file types added to the command line.
Types,
/// Generate various things like the man page and completion files.
Generate(GenerateMode),
}
impl Default for Mode {
fn default() -> Mode {
Mode::Search(SearchMode::Standard)
}
}
impl Mode {
/// Update this mode to the new mode while implementing various override
/// semantics. For example, a search mode cannot override a non-search
/// mode.
pub(crate) fn update(&mut self, new: Mode) {
match *self {
// If we're in a search mode, then anything can override it.
Mode::Search(_) => *self = new,
_ => {
// Once we're in a non-search mode, other non-search modes
// can override it. But search modes cannot. So for example,
// `--files -l` will still be Mode::Files.
if !matches!(new, Mode::Search(_)) {
*self = new;
}
}
}
}
}
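The override semantics described in the comments (a search mode never replaces a non-search mode) can be exercised with a reduced model. The sketch below is an illustration under assumptions: the enums are trimmed to a few variants, and `resolve` is a hypothetical helper that folds a sequence of requested modes, starting from the default.

```rust
/// Reduced stand-in for the search modes; only two variants are kept.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub enum SearchMode {
    Standard,
    FilesWithMatches,
}

/// Reduced stand-in for the overall mode.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub enum Mode {
    Search(SearchMode),
    Files,
    Types,
}

impl Mode {
    /// Applies the documented rule: anything may override a search mode,
    /// but only a non-search mode may override a non-search mode.
    pub fn update(&mut self, new: Mode) {
        let self_is_search = matches!(*self, Mode::Search(_));
        let new_is_search = matches!(new, Mode::Search(_));
        if self_is_search || !new_is_search {
            *self = new;
        }
    }
}

/// Hypothetical helper: folds modes in command line order, starting from
/// the default standard search.
pub fn resolve(modes: &[Mode]) -> Mode {
    let mut mode = Mode::Search(SearchMode::Standard);
    for &m in modes {
        mode.update(m);
    }
    mode
}
```

Under this model, `--files -l` and `-l --files` both resolve to `Mode::Files`, while `--files --type-list` resolves to `Mode::Types` because the later non-search mode wins.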
/// The kind of search that ripgrep is going to perform.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub(crate) enum SearchMode {
/// The default standard mode of operation. ripgrep looks for matches and
/// prints them when found.
///
/// There is no specific flag for this mode since it's the default. But
/// some of the modes below, like JSON, have negation flags like --no-json
/// that let you revert back to this default mode.
Standard,
/// Show files containing at least one match.
FilesWithMatches,
/// Show files that don't contain any matches.
FilesWithoutMatch,
/// Show files containing at least one match and the number of matching
/// lines.
Count,
/// Show files containing at least one match and the total number of
/// matches.
CountMatches,
/// Print matches in a JSON lines format.
JSON,
}
/// The thing to generate via the --generate flag.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub(crate) enum GenerateMode {
/// Generate the raw roff used for the man page.
Man,
/// Completions for bash.
CompleteBash,
/// Completions for zsh.
CompleteZsh,
/// Completions for fish.
CompleteFish,
/// Completions for PowerShell.
CompletePowerShell,
}
/// Indicates how ripgrep should treat binary data.
#[derive(Debug, Eq, PartialEq)]
pub(crate) enum BinaryMode {
/// Automatically determine the binary mode to use. Essentially, when
/// a file is searched explicitly, then it will be searched using the
/// `SearchAndSuppress` strategy. Otherwise, it will be searched in a way
/// that attempts to skip binary files as much as possible. That is, once
/// a file is classified as binary, searching will immediately stop.
Auto,
/// Search files even when they have binary data, but if a match is found,
/// suppress it and emit a warning.
///
/// In this mode, `NUL` bytes are replaced with line terminators. This is
/// a heuristic meant to reduce heap memory usage, since true binary data
/// isn't line oriented. If one attempts to treat such data as line
/// oriented, then one may wind up with impractically large lines. For
/// example, many binary files contain very long runs of NUL bytes.
SearchAndSuppress,
/// Treat all files as if they were plain text. There's no skipping and no
/// replacement of `NUL` bytes with line terminators.
AsText,
}
impl Default for BinaryMode {
fn default() -> BinaryMode {
BinaryMode::Auto
}
}
/// Indicates what kind of boundary mode to use (line or word).
#[derive(Debug, Eq, PartialEq)]
pub(crate) enum BoundaryMode {
/// Only allow matches when surrounded by line boundaries.
Line,
/// Only allow matches when surrounded by word boundaries.
Word,
}
/// Indicates the buffer mode that ripgrep should use when printing output.
///
/// The default is `Auto`.
#[derive(Debug, Eq, PartialEq)]
pub(crate) enum BufferMode {
/// Select the buffer mode, 'line' or 'block', automatically based on
/// whether stdout is connected to a tty.
Auto,
/// Flush the output buffer whenever a line terminator is seen.
///
/// This is useful when one wants to see search results more immediately,
/// for example, with `tail -f`.
Line,
/// Flush the output buffer whenever it reaches some fixed size. The size
/// is usually big enough to hold many lines.
///
/// This is useful for maximum performance, particularly when printing
/// lots of results.
Block,
}
impl Default for BufferMode {
fn default() -> BufferMode {
BufferMode::Auto
}
}
/// Indicates the case mode for how to interpret all patterns given to ripgrep.
///
/// The default is `Sensitive`.
#[derive(Debug, Eq, PartialEq)]
pub(crate) enum CaseMode {
/// Patterns are matched case sensitively. i.e., `a` does not match `A`.
Sensitive,
/// Patterns are matched case insensitively. i.e., `a` does match `A`.
Insensitive,
/// Patterns are automatically matched case insensitively only when they
/// consist of all lowercase literal characters. For example, the pattern
/// `a` will match `A` but `A` will not match `a`.
Smart,
}
impl Default for CaseMode {
fn default() -> CaseMode {
CaseMode::Sensitive
}
}
/// Indicates whether ripgrep should include color/hyperlinks in its output.
///
/// The default is `Auto`.
#[derive(Debug, Eq, PartialEq)]
pub(crate) enum ColorChoice {
/// Color and hyperlinks will never be used.
Never,
/// Color and hyperlinks will be used only when stdout is connected to a
/// tty.
Auto,
/// Color will always be used.
Always,
/// Color will always be used and only ANSI escapes will be used.
///
/// This only makes sense in the context of legacy Windows console APIs.
/// At time of writing, ripgrep will try to use the legacy console APIs
/// if ANSI coloring isn't believed to be possible. This option will force
/// ripgrep to use ANSI coloring.
Ansi,
}
impl Default for ColorChoice {
fn default() -> ColorChoice {
ColorChoice::Auto
}
}
impl ColorChoice {
/// Convert this color choice to the corresponding termcolor type.
pub(crate) fn to_termcolor(&self) -> termcolor::ColorChoice {
match *self {
ColorChoice::Never => termcolor::ColorChoice::Never,
ColorChoice::Auto => termcolor::ColorChoice::Auto,
ColorChoice::Always => termcolor::ColorChoice::Always,
ColorChoice::Ansi => termcolor::ColorChoice::AlwaysAnsi,
}
}
}
/// Indicates the line context options ripgrep should use for output.
///
/// The default is no context at all.
#[derive(Debug, Eq, PartialEq)]
pub(crate) enum ContextMode {
/// All lines will be printed. That is, the context is unbounded.
Passthru,
/// Only show a certain number of lines before and after each match.
Limited(ContextModeLimited),
}
impl Default for ContextMode {
fn default() -> ContextMode {
ContextMode::Limited(ContextModeLimited::default())
}
}
impl ContextMode {
/// Set the "before" context.
///
/// If this was set to "passthru" context, then it is overridden in favor
/// of limited context with the given value for "before" and `0` for
/// "after."
pub(crate) fn set_before(&mut self, lines: usize) {
match *self {
ContextMode::Passthru => {
*self = ContextMode::Limited(ContextModeLimited {
before: Some(lines),
after: None,
both: None,
})
}
ContextMode::Limited(ContextModeLimited {
ref mut before,
..
}) => *before = Some(lines),
}
}
/// Set the "after" context.
///
/// If this was set to "passthru" context, then it is overridden in favor
/// of limited context with the given value for "after" and `0` for
/// "before."
pub(crate) fn set_after(&mut self, lines: usize) {
match *self {
ContextMode::Passthru => {
*self = ContextMode::Limited(ContextModeLimited {
before: None,
after: Some(lines),
both: None,
})
}
ContextMode::Limited(ContextModeLimited {
ref mut after, ..
}) => *after = Some(lines),
}
}
/// Set the "both" context.
///
/// If this was set to "passthru" context, then it is overridden in favor
/// of limited context with the given value for "both" and `None` for
/// "before" and "after".
pub(crate) fn set_both(&mut self, lines: usize) {
match *self {
ContextMode::Passthru => {
*self = ContextMode::Limited(ContextModeLimited {
before: None,
after: None,
both: Some(lines),
})
}
ContextMode::Limited(ContextModeLimited {
ref mut both, ..
}) => *both = Some(lines),
}
}
/// A convenience function for use in tests that returns the limited
/// context. If this mode isn't limited, then it panics.
#[cfg(test)]
pub(crate) fn get_limited(&self) -> (usize, usize) {
match *self {
ContextMode::Passthru => unreachable!("context mode is passthru"),
ContextMode::Limited(ref limited) => limited.get(),
}
}
}
/// A context mode for a finite number of lines.
///
/// Namely, this indicates that a specific number of lines (possibly zero)
/// should be shown before and/or after each matching line.
///
/// Note that there is a subtle difference between `Some(0)` and `None`. The
/// former occurs when `0` is given explicitly, whereas `None` is the default
/// value and occurs when no value is specified.
///
/// `both` is only set by the -C/--context flag. The reason why we don't just
/// set before = after = --context is because the before and after context
/// settings always take precedence over the -C/--context setting, regardless of
/// order. Thus, we need to keep track of them separately.
#[derive(Debug, Default, Eq, PartialEq)]
pub(crate) struct ContextModeLimited {
before: Option<usize>,
after: Option<usize>,
both: Option<usize>,
}
impl ContextModeLimited {
/// Returns the specific number of contextual lines that should be shown
/// around each match. This takes proper precedence into account, i.e.,
/// that `before` and `after` both partially override `both` in all cases.
///
/// By default, this returns `(0, 0)`.
pub(crate) fn get(&self) -> (usize, usize) {
let (mut before, mut after) =
self.both.map(|lines| (lines, lines)).unwrap_or((0, 0));
// --before and --after always override --context, regardless
// of where they appear relative to each other.
if let Some(lines) = self.before {
before = lines;
}
if let Some(lines) = self.after {
after = lines;
}
(before, after)
}
}
/// Represents the separator to use between non-contiguous sections of
/// contextual lines.
///
/// The default is `--`.
#[derive(Clone, Debug, Eq, PartialEq)]
pub(crate) struct ContextSeparator(Option<BString>);
impl Default for ContextSeparator {
fn default() -> ContextSeparator {
ContextSeparator(Some(BString::from("--")))
}
}
impl ContextSeparator {
/// Create a new context separator from the user provided argument. This
/// handles unescaping.
pub(crate) fn new(os: &OsStr) -> anyhow::Result<ContextSeparator> {
let Some(string) = os.to_str() else {
anyhow::bail!(
"separator must be valid UTF-8 (use escape sequences \
to provide a separator that is not valid UTF-8)"
)
};
Ok(ContextSeparator(Some(Vec::unescape_bytes(string).into())))
}
/// Creates a new separator that instructs the printer to disable contextual
/// separators entirely.
pub(crate) fn disabled() -> ContextSeparator {
ContextSeparator(None)
}
/// Return the raw bytes of this separator.
///
/// If context separators were disabled, then this returns `None`.
///
/// Note that this may return a `Some` variant with zero bytes.
pub(crate) fn into_bytes(self) -> Option<Vec<u8>> {
self.0.map(|sep| sep.into())
}
}
/// The encoding mode the searcher will use.
///
/// The default is `Auto`.
#[derive(Debug, Eq, PartialEq)]
pub(crate) enum EncodingMode {
/// Use only BOM sniffing to auto-detect an encoding.
Auto,
/// Use an explicit encoding forcefully, but let BOM sniffing override it.
Some(grep::searcher::Encoding),
/// Use no explicit encoding and disable all BOM sniffing. This will
/// always result in searching the raw bytes, regardless of their
/// true encoding.
Disabled,
}
impl Default for EncodingMode {
fn default() -> EncodingMode {
EncodingMode::Auto
}
}
/// The regex engine to use.
///
/// The default is `Default`.
#[derive(Debug, Eq, PartialEq)]
pub(crate) enum EngineChoice {
/// Uses the default regex engine: Rust's `regex` crate.
///
/// (Well, technically it uses `regex-automata`, but `regex-automata` is
/// the implementation of the `regex` crate.)
Default,
/// Dynamically select the right engine to use.
///
/// This works by trying to use the default engine, and if the pattern does
/// not compile, it switches over to the PCRE2 engine if it's available.
Auto,
/// Uses the PCRE2 regex engine if it's available.
PCRE2,
}
impl Default for EngineChoice {
fn default() -> EngineChoice {
EngineChoice::Default
}
}
/// The field context separator to use between metadata for each contextual
/// line.
///
/// The default is `-`.
#[derive(Clone, Debug, Eq, PartialEq)]
pub(crate) struct FieldContextSeparator(BString);
impl Default for FieldContextSeparator {
fn default() -> FieldContextSeparator {
FieldContextSeparator(BString::from("-"))
}
}
impl FieldContextSeparator {
/// Create a new separator from the given argument value provided by the
/// user. Unescaping is handled automatically.
pub(crate) fn new(os: &OsStr) -> anyhow::Result<FieldContextSeparator> {
let Some(string) = os.to_str() else {
anyhow::bail!(
"separator must be valid UTF-8 (use escape sequences \
to provide a separator that is not valid UTF-8)"
)
};
Ok(FieldContextSeparator(Vec::unescape_bytes(string).into()))
}
/// Return the raw bytes of this separator.
///
/// Note that this may return an empty `Vec`.
pub(crate) fn into_bytes(self) -> Vec<u8> {
self.0.into()
}
}
/// The field match separator to use between metadata for each matching
/// line.
///
/// The default is `:`.
#[derive(Clone, Debug, Eq, PartialEq)]
pub(crate) struct FieldMatchSeparator(BString);
impl Default for FieldMatchSeparator {
fn default() -> FieldMatchSeparator {
FieldMatchSeparator(BString::from(":"))
}
}
impl FieldMatchSeparator {
/// Create a new separator from the given argument value provided by the
/// user. Unescaping is handled automatically.
pub(crate) fn new(os: &OsStr) -> anyhow::Result<FieldMatchSeparator> {
let Some(string) = os.to_str() else {
anyhow::bail!(
"separator must be valid UTF-8 (use escape sequences \
to provide a separator that is not valid UTF-8)"
)
};
Ok(FieldMatchSeparator(Vec::unescape_bytes(string).into()))
}
/// Return the raw bytes of this separator.
///
/// Note that this may return an empty `Vec`.
pub(crate) fn into_bytes(self) -> Vec<u8> {
self.0.into()
}
}
/// The type of logging to do. `Debug` emits some details while `Trace` emits
/// much more.
#[derive(Debug, Eq, PartialEq)]
pub(crate) enum LoggingMode {
Debug,
Trace,
}
/// Indicates when to use memory maps.
///
/// The default is `Auto`.
#[derive(Debug, Eq, PartialEq)]
pub(crate) enum MmapMode {
/// This instructs ripgrep to use heuristics for selecting when to and not
/// to use memory maps for searching.
Auto,
/// This instructs ripgrep to always try memory maps when possible. (Memory
/// maps are not possible to use in all circumstances, for example, for
/// virtual files.)
AlwaysTryMmap,
/// Never use memory maps under any circumstances. This includes even
/// when multi-line search is enabled where ripgrep will read the entire
/// contents of a file onto the heap before searching it.
Never,
}
impl Default for MmapMode {
fn default() -> MmapMode {
MmapMode::Auto
}
}
/// Represents a source of patterns that ripgrep should search for.
///
/// The reason to unify these is so that we can retain the order of `-f/--file`
/// and `-e/--regexp` flags relative to one another.
#[derive(Debug, Eq, PartialEq)]
pub(crate) enum PatternSource {
/// Comes from the `-e/--regexp` flag.
Regexp(String),
/// Comes from the `-f/--file` flag.
File(PathBuf),
}
/// The sort criteria, if present.
#[derive(Debug, Eq, PartialEq)]
pub(crate) struct SortMode {
/// Whether to reverse the sort criteria (i.e., descending order).
pub(crate) reverse: bool,
/// The actual sorting criteria.
pub(crate) kind: SortModeKind,
}
/// The criteria to use for sorting.
#[derive(Debug, Eq, PartialEq)]
pub(crate) enum SortModeKind {
/// Sort by path.
Path,
/// Sort by last modified time.
LastModified,
/// Sort by last accessed time.
LastAccessed,
/// Sort by creation time.
Created,
}
impl SortMode {
/// Checks whether the selected sort mode is supported. If it isn't, an
/// error (hopefully explaining why) is returned.
pub(crate) fn supported(&self) -> anyhow::Result<()> {
match self.kind {
SortModeKind::Path => Ok(()),
SortModeKind::LastModified => {
let md = std::env::current_exe()
.and_then(|p| p.metadata())
.and_then(|md| md.modified());
let Err(err) = md else { return Ok(()) };
anyhow::bail!(
"sorting by last modified isn't supported: {err}"
);
}
SortModeKind::LastAccessed => {
let md = std::env::current_exe()
.and_then(|p| p.metadata())
.and_then(|md| md.accessed());
let Err(err) = md else { return Ok(()) };
anyhow::bail!(
"sorting by last accessed isn't supported: {err}"
);
}
SortModeKind::Created => {
let md = std::env::current_exe()
.and_then(|p| p.metadata())
.and_then(|md| md.created());
let Err(err) = md else { return Ok(()) };
anyhow::bail!(
"sorting by creation time isn't supported: {err}"
);
}
}
}
}
/// A single instance of either a change or a selection of one of ripgrep's
/// file types.
#[derive(Debug, Eq, PartialEq)]
pub(crate) enum TypeChange {
/// Clear the given type from ripgrep.
Clear { name: String },
/// Add the given type definition (name and glob) to ripgrep.
Add { def: String },
/// Select the given type for filtering.
Select { name: String },
/// Select the given type for filtering but negate it.
Negate { name: String },
}
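
The precedence rule documented on `ContextModeLimited::get` (that `before` and `after` always override `both`, and that `Some(0)` differs from `None`) can be illustrated in isolation. The following is a minimal sketch, not ripgrep's actual code: `Limited` is a hypothetical stand-in that mirrors the three fields above, and `resolve` is a helper introduced only for this illustration.

```rust
/// Hypothetical stand-in for `ContextModeLimited`; the fields mirror the
/// original struct, but this type exists only for illustration.
#[derive(Debug, Default)]
struct Limited {
    before: Option<usize>,
    after: Option<usize>,
    both: Option<usize>,
}

impl Limited {
    /// Resolve the effective (before, after) context line counts.
    fn get(&self) -> (usize, usize) {
        // Start from -C/--context if it was given, else no context at all.
        let (mut before, mut after) =
            self.both.map(|n| (n, n)).unwrap_or((0, 0));
        // An explicit before/after value wins, even when it is zero.
        if let Some(n) = self.before {
            before = n;
        }
        if let Some(n) = self.after {
            after = n;
        }
        (before, after)
    }
}

/// Hypothetical helper: build a `Limited` and resolve it in one step.
fn resolve(
    before: Option<usize>,
    after: Option<usize>,
    both: Option<usize>,
) -> (usize, usize) {
    Limited { before, after, both }.get()
}

fn main() {
    // Nothing given: no context.
    assert_eq!(resolve(None, None, None), (0, 0));
    // Only "both" given (as with -C5): applies to each side.
    assert_eq!(resolve(None, None, Some(5)), (5, 5));
    // "before" explicitly zero alongside "both": Some(0) overrides, None does not.
    assert_eq!(resolve(Some(0), None, Some(5)), (0, 5));
    // "after" given alongside "both": only the "after" side is overridden.
    assert_eq!(resolve(None, Some(2), Some(5)), (5, 2));
    println!("context precedence checks passed");
}
```

Keeping `both` as a separate field, rather than copying it into `before` and `after` when it is set, is what makes the result independent of the order in which the values were assigned.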

302
crates/core/flags/mod.rs Normal file

@@ -0,0 +1,302 @@
/*!
Defines ripgrep's command line interface.
This module deals with everything involving ripgrep's flags and positional
arguments. This includes generating shell completions, `--help` output and even
ripgrep's man page. It's also responsible for parsing and validating every
flag (including reading ripgrep's config file), and manages the contact points
between these flags and ripgrep's cast of supporting libraries. For example,
once [`HiArgs`] has been created, it knows how to create a multi threaded
recursive directory traverser.
*/
use std::{
ffi::OsString,
fmt::Debug,
panic::{RefUnwindSafe, UnwindSafe},
};
pub(crate) use crate::flags::{
complete::{
bash::generate as generate_complete_bash,
fish::generate as generate_complete_fish,
powershell::generate as generate_complete_powershell,
zsh::generate as generate_complete_zsh,
},
doc::{
help::{
generate_long as generate_help_long,
generate_short as generate_help_short,
},
man::generate as generate_man_page,
version::{
generate_long as generate_version_long,
generate_pcre2 as generate_version_pcre2,
generate_short as generate_version_short,
},
},
hiargs::HiArgs,
lowargs::{GenerateMode, Mode, SearchMode, SpecialMode},
parse::{parse, ParseResult},
};
mod complete;
mod config;
mod defs;
mod doc;
mod hiargs;
mod lowargs;
mod parse;
/// A trait that encapsulates the definition of an optional flag for ripgrep.
///
/// This trait is meant to be used via dynamic dispatch. Namely, the `defs`
/// module provides a single global slice of `&dyn Flag` values corresponding
/// to all of the flags in ripgrep.
///
/// ripgrep's required positional arguments are handled by the parser and by
/// the conversion from low-level arguments to high level arguments. Namely,
/// all of ripgrep's positional arguments are treated as file paths, except
/// in certain circumstances where the first argument is treated as a regex
/// pattern.
///
/// Note that each implementation of this trait requires a long flag name,
/// but can also optionally have a short version and even a negation flag.
/// For example, the `-E/--encoding` flag accepts a value, but it also has a
/// `--no-encoding` negation flag for reverting back to "automatic" encoding
/// detection. All three of `-E`, `--encoding` and `--no-encoding` are provided
/// by a single implementation of this trait.
///
/// ripgrep only supports flags that are switches or flags that accept a single
/// value. Flags that accept multiple values are an unsupported aberration.
trait Flag: Debug + Send + Sync + UnwindSafe + RefUnwindSafe + 'static {
/// Returns true if this flag is a switch. When a flag is a switch, the
/// CLI parser will not look for a value after the flag is seen.
fn is_switch(&self) -> bool;
/// A short single byte name for this flag. This returns `None` by default,
/// which signifies that the flag has no short name.
///
/// The byte returned must be an ASCII codepoint that is a `.` or is
/// alpha-numeric.
fn name_short(&self) -> Option<u8> {
None
}
/// Returns the long name of this flag. All flags must have a "long" name.
///
/// The long name must be at least 2 bytes, and all of its bytes must be
/// ASCII codepoints that are either `-` or alpha-numeric.
fn name_long(&self) -> &'static str;
/// Returns a list of aliases for this flag.
///
/// The aliases must follow the same rules as `Flag::name_long`.
///
/// By default, an empty slice is returned.
fn aliases(&self) -> &'static [&'static str] {
&[]
}
/// Returns a negated name for this flag. The negation of a flag is
/// intended to have the opposite meaning of a flag or to otherwise turn
/// something "off" or revert it to its default behavior.
///
/// Negated flags are not listed in their own section in the `-h/--help`
/// output or man page. Instead, they are automatically mentioned at the
/// end of the documentation section of the flag they negated.
///
/// The negated name must follow the same rules as `Flag::name_long`.
///
/// By default, a flag has no negation and this returns `None`.
fn name_negated(&self) -> Option<&'static str> {
None
}
/// Returns the variable name describing the type of value this flag
/// accepts. This should always be set for non-switch flags and never set
/// for switch flags.
///
/// For example, the `--max-count` flag has its variable name set to `NUM`.
///
/// The convention is to capitalize variable names.
///
/// By default this returns `None`.
fn doc_variable(&self) -> Option<&'static str> {
None
}
/// Returns the category of this flag.
///
/// Every flag must have a single category. Categories are used to organize
/// flags in the generated documentation.
fn doc_category(&self) -> Category;
/// A (very) short documentation string describing what this flag does.
///
/// This may sacrifice "proper English" in order to be as terse as
/// possible. Generally, we try to ensure that `rg -h` doesn't have any
/// lines that exceed 79 columns.
fn doc_short(&self) -> &'static str;
/// A (possibly very) longer documentation string describing in full
/// detail what this flag does. This should be in mandoc/mdoc format.
fn doc_long(&self) -> &'static str;
/// If this is a non-switch flag that accepts a small set of specific
/// values, then this should list them.
///
/// This returns an empty slice by default.
fn doc_choices(&self) -> &'static [&'static str] {
&[]
}
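/// Returns the kind of argument this flag accepts, for use when generating
/// shell completions.
///
/// By default, this returns `CompletionType::Other`.
fn completion_type(&self) -> CompletionType {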
CompletionType::Other
}
/// Given the parsed value (which might just be a switch), this should
/// update the state in `args` based on the value given for this flag.
///
/// This may update state for other flags as appropriate.
///
/// The `-V/--version` and `-h/--help` flags are treated specially in the
/// parser and should do nothing here.
///
/// By convention, implementations should generally not try to "do"
/// anything other than validate the value given. For example, the
/// implementation for `--hostname-bin` should not try to resolve the
/// hostname to use by running the binary provided. That should be saved
/// for a later step. This convention is used to ensure that getting the
/// low-level arguments is as reliable and quick as possible. It also
/// ensures that "doing something" occurs a minimal number of times. For
/// example, by avoiding trying to find the hostname here, we can do it
/// once later no matter how many times `--hostname-bin` is provided.
///
/// Implementations should not include the flag name in the error message
/// returned. The flag name is included automatically by the parser.
fn update(
&self,
value: FlagValue,
args: &mut crate::flags::lowargs::LowArgs,
) -> anyhow::Result<()>;
}
/// The category that a flag belongs to.
///
/// Categories are used to organize flags into "logical" groups in the
/// generated documentation.
#[derive(Clone, Copy, Debug, Eq, Hash, PartialEq, PartialOrd, Ord)]
enum Category {
/// Flags related to how ripgrep reads its input. Its "input" generally
/// consists of the patterns it is trying to match and the haystacks it is
/// trying to search.
Input,
/// Flags related to the operation of the search itself. For example,
/// whether case insensitive matching is enabled.
Search,
/// Flags related to how ripgrep filters haystacks. For example, whether
/// to respect gitignore files or not.
Filter,
/// Flags related to how ripgrep shows its search results. For example,
/// whether to show line numbers or not.
Output,
/// Flags related to changing ripgrep's output at a more fundamental level.
/// For example, flags like `--count` suppress printing of individual
/// lines, and instead just print the total count of matches for each file
/// searched.
OutputModes,
/// Flags related to logging behavior such as emitting non-fatal error
/// messages or printing search statistics.
Logging,
/// Other behaviors not related to ripgrep's core functionality. For
/// example, printing the file type globbing rules, or printing the list
/// of files ripgrep would search without actually searching them.
OtherBehaviors,
}
impl Category {
/// Returns a string representation of this category.
///
/// This string is the name of the variable used in various templates for
/// generated documentation. This name can be used for interpolation.
fn as_str(&self) -> &'static str {
match *self {
Category::Input => "input",
Category::Search => "search",
Category::Filter => "filter",
Category::Output => "output",
Category::OutputModes => "output-modes",
Category::Logging => "logging",
Category::OtherBehaviors => "other-behaviors",
}
}
}
/// The kind of argument a flag accepts, to be used for shell completions.
#[derive(Clone, Copy, Debug)]
enum CompletionType {
/// No special category. is_switch() and doc_choices() may apply.
Other,
/// A path to a file.
Filename,
/// A command in $PATH.
Executable,
/// The name of a file type, as used by e.g. --type.
Filetype,
/// The name of an encoding_rs encoding, as used by --encoding.
Encoding,
}
/// Represents a value parsed from the command line.
///
/// This doesn't include the corresponding flag, but values come in one of
/// two forms: a switch (on or off) or an arbitrary value.
///
/// Note that the CLI doesn't directly support negated switches. For example,
/// you can't do anything like `-n=false` or any of that nonsense. Instead,
/// the CLI parser knows about which flag names are negations and which aren't
/// (courtesy of the `Flag` trait). If a flag given is known as a negation,
/// then a `FlagValue::Switch(false)` value is passed into `Flag::update`.
#[derive(Debug)]
enum FlagValue {
/// A flag that is either on or off.
Switch(bool),
/// A flag that comes with an arbitrary user value.
Value(OsString),
}
impl FlagValue {
/// Return the yes or no value of this switch.
///
/// If this flag value is not a switch, then this panics.
///
/// This is useful when writing the implementation of `Flag::update`.
/// Namely, callers usually know whether a switch or a value is expected.
/// If a flag is something different, then it indicates a bug, and thus a
/// panic is acceptable.
fn unwrap_switch(self) -> bool {
match self {
FlagValue::Switch(yes) => yes,
FlagValue::Value(_) => {
unreachable!("got flag value but expected switch")
}
}
}
/// Return the user provided value of this flag.
///
/// If this flag is a switch, then this panics.
///
/// This is useful when writing the implementation of `Flag::update`.
/// Namely, callers usually know whether a switch or a value is expected.
/// If a flag is something different, then it indicates a bug, and thus a
/// panic is acceptable.
fn unwrap_value(self) -> OsString {
match self {
FlagValue::Switch(_) => {
unreachable!("got switch but expected flag value")
}
FlagValue::Value(v) => v,
}
}
}
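
The `FlagValue` documentation above says that a negated flag name always yields `FlagValue::Switch(false)`, while other flags yield either `Switch(true)` or a user-provided value. The sketch below shows that decision on its own. It is a simplified illustration under stated assumptions rather than the parser's actual code: `NameKind` and `flag_value` are hypothetical names, the error type is a plain `String`, and the "next argument" is passed in directly instead of being pulled from an argument lexer.

```rust
use std::ffi::OsString;

/// Local copy of the two-variant value type, for a self-contained example.
#[derive(Debug, PartialEq)]
enum FlagValue {
    Switch(bool),
    Value(OsString),
}

/// Hypothetical: whether the name seen on the command line was the flag's
/// regular name or its negation (e.g. `--no-encoding`).
#[derive(Clone, Copy, Debug)]
enum NameKind {
    Standard,
    Negated,
}

/// Hypothetical helper: decide which `FlagValue` to hand to a flag.
///
/// `is_switch` says whether the flag takes no value, and `next` is the
/// value that follows the flag on the command line, if any.
fn flag_value(
    kind: NameKind,
    is_switch: bool,
    next: Option<OsString>,
) -> Result<FlagValue, String> {
    match kind {
        // A negation is always a switch, even if the flag it negates
        // accepts a value.
        NameKind::Negated => Ok(FlagValue::Switch(false)),
        NameKind::Standard if is_switch => Ok(FlagValue::Switch(true)),
        NameKind::Standard => next
            .map(FlagValue::Value)
            .ok_or_else(|| "missing value for flag".to_string()),
    }
}

fn main() {
    // A plain switch turns on.
    let on = flag_value(NameKind::Standard, true, None);
    assert_eq!(on, Ok(FlagValue::Switch(true)));

    // A negation turns off and consumes no value, even for a value flag.
    let off = flag_value(NameKind::Negated, false, Some(OsString::from("x")));
    assert_eq!(off, Ok(FlagValue::Switch(false)));

    // A value flag takes the next argument, and fails without one.
    let val = flag_value(NameKind::Standard, false, Some(OsString::from("5")));
    assert_eq!(val, Ok(FlagValue::Value(OsString::from("5"))));
    assert!(flag_value(NameKind::Standard, false, None).is_err());
    println!("flag value checks passed");
}
```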

476
crates/core/flags/parse.rs Normal file

@@ -0,0 +1,476 @@
/*!
Parses command line arguments into a structured and typed representation.
*/
use std::{borrow::Cow, collections::BTreeSet, ffi::OsString};
use anyhow::Context;
use crate::flags::{
defs::FLAGS,
hiargs::HiArgs,
lowargs::{LoggingMode, LowArgs, SpecialMode},
Flag, FlagValue,
};
/// The result of parsing CLI arguments.
///
/// This is basically an `anyhow::Result<T>`, but with one extra variant that is
/// inhabited whenever ripgrep should execute a "special" mode. That is, when a
/// user provides the `-h/--help` or `-V/--version` flags.
///
/// This special variant exists to allow CLI parsing to short circuit as
/// quickly as is reasonable. For example, it lets CLI parsing avoid reading
/// ripgrep's configuration and converting low level arguments into a higher
/// level representation.
#[derive(Debug)]
pub(crate) enum ParseResult<T> {
Special(SpecialMode),
Ok(T),
Err(anyhow::Error),
}
impl<T> ParseResult<T> {
/// If this result is `Ok`, then apply `then` to it. Otherwise, return this
/// result unchanged.
fn and_then<U>(
self,
mut then: impl FnMut(T) -> ParseResult<U>,
) -> ParseResult<U> {
match self {
ParseResult::Special(mode) => ParseResult::Special(mode),
ParseResult::Ok(t) => then(t),
ParseResult::Err(err) => ParseResult::Err(err),
}
}
}
/// Parse CLI arguments and convert them to their high level representation.
pub(crate) fn parse() -> ParseResult<HiArgs> {
parse_low().and_then(|low| match HiArgs::from_low_args(low) {
Ok(hi) => ParseResult::Ok(hi),
Err(err) => ParseResult::Err(err),
})
}
/// Parse CLI arguments only into their low level representation.
///
/// This takes configuration into account. That is, it will try to read
/// `RIPGREP_CONFIG_PATH` and prepend any arguments found there to the
/// arguments passed to this process.
///
/// This will also set one-time global state flags, such as the log level and
/// whether messages should be printed.
fn parse_low() -> ParseResult<LowArgs> {
if let Err(err) = crate::logger::Logger::init() {
let err = anyhow::anyhow!("failed to initialize logger: {err}");
return ParseResult::Err(err);
}
let parser = Parser::new();
let mut low = LowArgs::default();
if let Err(err) = parser.parse(std::env::args_os().skip(1), &mut low) {
return ParseResult::Err(err);
}
// Even though we haven't parsed the config file yet (assuming it exists),
// we can still use the arguments given on the CLI to set up ripgrep's
// logging preferences. Even if the config file changes them in some way,
// it's really the best we can do. This way, for example, folks can pass
// `--trace` and see any messages logged during config file parsing.
set_log_levels(&low);
// Before we try to take configuration into account, we can bail early
// if a special mode was enabled. This is basically only for version and
// help output which shouldn't be impacted by extra configuration.
if let Some(special) = low.special.take() {
return ParseResult::Special(special);
}
// If the end user says no config, then respect it.
if low.no_config {
log::debug!("not reading config files because --no-config is present");
return ParseResult::Ok(low);
}
// Look for arguments from a config file. If we got nothing (whether the
// file is empty or RIPGREP_CONFIG_PATH wasn't set), then we don't need
// to re-parse.
let config_args = crate::flags::config::args();
if config_args.is_empty() {
log::debug!("no extra arguments found from configuration file");
return ParseResult::Ok(low);
}
// The final arguments are just the arguments from the CLI appended to
// the end of the config arguments.
let mut final_args = config_args;
final_args.extend(std::env::args_os().skip(1));
// Now do the CLI parsing dance again.
let mut low = LowArgs::default();
if let Err(err) = parser.parse(final_args.into_iter(), &mut low) {
return ParseResult::Err(err);
}
// Reset the message and logging levels, since they could have changed.
set_log_levels(&low);
ParseResult::Ok(low)
}
/// Sets global state flags that control logging based on low-level arguments.
fn set_log_levels(low: &LowArgs) {
crate::messages::set_messages(!low.no_messages);
crate::messages::set_ignore_messages(!low.no_ignore_messages);
match low.logging {
Some(LoggingMode::Trace) => {
log::set_max_level(log::LevelFilter::Trace)
}
Some(LoggingMode::Debug) => {
log::set_max_level(log::LevelFilter::Debug)
}
None => log::set_max_level(log::LevelFilter::Warn),
}
}
/// Parse the given sequence of CLI arguments into a low level typed set of
/// arguments.
///
/// This is exposed for testing that the correct low-level arguments are parsed
/// from a CLI. It just runs the parser once over the CLI arguments. It doesn't
/// set up logging or read from a config file.
///
/// This assumes the iterator given does *not* begin with the binary name.
#[cfg(test)]
pub(crate) fn parse_low_raw(
rawargs: impl IntoIterator<Item = impl Into<OsString>>,
) -> anyhow::Result<LowArgs> {
let mut args = LowArgs::default();
Parser::new().parse(rawargs, &mut args)?;
Ok(args)
}
/// Return the metadata for the flag of the given name.
pub(super) fn lookup(name: &str) -> Option<&'static dyn Flag> {
// N.B. Creating a new parser might look expensive, but it only builds
// the lookup trie exactly once. That is, we get a `&'static Parser` from
// `Parser::new()`.
match Parser::new().find_long(name) {
FlagLookup::Match(&FlagInfo { flag, .. }) => Some(flag),
_ => None,
}
}
/// A parser for turning a sequence of command line arguments into a more
/// strictly typed set of arguments.
#[derive(Debug)]
struct Parser {
/// A single map that contains all possible flag names. This includes
/// short and long names, aliases and negations. This maps those names to
/// indices into `info`.
map: FlagMap,
/// A map from IDs returned by the `map` to the corresponding flag
/// information.
info: Vec<FlagInfo>,
}
impl Parser {
/// Create a new parser.
///
/// This always creates the same parser and only does it once. Callers may
/// call this repeatedly, and the parser will only be built once.
fn new() -> &'static Parser {
use std::sync::OnceLock;
// Since a parser's state is immutable and completely determined by
// FLAGS, and since FLAGS is a constant, we can initialize it exactly
// once.
static P: OnceLock<Parser> = OnceLock::new();
P.get_or_init(|| {
let mut infos = vec![];
for &flag in FLAGS.iter() {
infos.push(FlagInfo {
flag,
name: Ok(flag.name_long()),
kind: FlagInfoKind::Standard,
});
for alias in flag.aliases() {
infos.push(FlagInfo {
flag,
name: Ok(alias),
kind: FlagInfoKind::Alias,
});
}
if let Some(byte) = flag.name_short() {
infos.push(FlagInfo {
flag,
name: Err(byte),
kind: FlagInfoKind::Standard,
});
}
if let Some(name) = flag.name_negated() {
infos.push(FlagInfo {
flag,
name: Ok(name),
kind: FlagInfoKind::Negated,
});
}
}
let map = FlagMap::new(&infos);
Parser { map, info: infos }
})
}
/// Parse the given CLI arguments into a low level representation.
///
/// The iterator given should *not* start with the binary name.
fn parse<I, O>(&self, rawargs: I, args: &mut LowArgs) -> anyhow::Result<()>
where
I: IntoIterator<Item = O>,
O: Into<OsString>,
{
let mut p = lexopt::Parser::from_args(rawargs);
while let Some(arg) = p.next().context("invalid CLI arguments")? {
let lookup = match arg {
lexopt::Arg::Value(value) => {
args.positional.push(value);
continue;
}
lexopt::Arg::Short(ch) if ch == 'h' => {
// Special case -h/--help since behavior is different
// based on whether short or long flag is given.
args.special = Some(SpecialMode::HelpShort);
continue;
}
lexopt::Arg::Short(ch) if ch == 'V' => {
// Special case -V/--version since behavior is different
// based on whether short or long flag is given.
args.special = Some(SpecialMode::VersionShort);
continue;
}
lexopt::Arg::Short(ch) => self.find_short(ch),
lexopt::Arg::Long(name) if name == "help" => {
// Special case -h/--help since behavior is different
// based on whether short or long flag is given.
args.special = Some(SpecialMode::HelpLong);
continue;
}
lexopt::Arg::Long(name) if name == "version" => {
// Special case -V/--version since behavior is different
// based on whether short or long flag is given.
args.special = Some(SpecialMode::VersionLong);
continue;
}
lexopt::Arg::Long(name) => self.find_long(name),
};
let mat = match lookup {
FlagLookup::Match(mat) => mat,
FlagLookup::UnrecognizedShort(name) => {
anyhow::bail!("unrecognized flag -{name}")
}
FlagLookup::UnrecognizedLong(name) => {
let mut msg = format!("unrecognized flag --{name}");
if let Some(suggest_msg) = suggest(&name) {
msg = format!("{msg}\n\n{suggest_msg}");
}
anyhow::bail!("{msg}")
}
};
let value = if matches!(mat.kind, FlagInfoKind::Negated) {
// Negated flags are always switches, even if the non-negated
// flag is not. For example, --context-separator accepts a
// value, but --no-context-separator does not.
FlagValue::Switch(false)
} else if mat.flag.is_switch() {
FlagValue::Switch(true)
} else {
FlagValue::Value(p.value().with_context(|| {
format!("missing value for flag {mat}")
})?)
};
mat.flag
.update(value, args)
.with_context(|| format!("error parsing flag {mat}"))?;
}
Ok(())
}
/// Look for a flag by its short name.
fn find_short(&self, ch: char) -> FlagLookup<'_> {
if !ch.is_ascii() {
return FlagLookup::UnrecognizedShort(ch);
}
let byte = u8::try_from(ch).unwrap();
let Some(index) = self.map.find(&[byte]) else {
return FlagLookup::UnrecognizedShort(ch);
};
FlagLookup::Match(&self.info[index])
}
/// Look for a flag by its long name.
///
/// This also works for aliases and negated names.
fn find_long(&self, name: &str) -> FlagLookup<'_> {
let Some(index) = self.map.find(name.as_bytes()) else {
return FlagLookup::UnrecognizedLong(name.to_string());
};
FlagLookup::Match(&self.info[index])
}
}
/// The result of looking up a flag name.
#[derive(Debug)]
enum FlagLookup<'a> {
/// Lookup found a match and the metadata for the flag is attached.
Match(&'a FlagInfo),
/// The given short name is unrecognized.
UnrecognizedShort(char),
/// The given long name is unrecognized.
UnrecognizedLong(String),
}
/// The info about a flag associated with a flag's ID in the flag map.
#[derive(Debug)]
struct FlagInfo {
/// The flag object and its associated metadata.
flag: &'static dyn Flag,
/// The actual name that is stored as a key in the flag map. When this
/// is a byte, it corresponds to a short single character ASCII flag. The
/// actual key that's in the flag map is just the single byte.
name: Result<&'static str, u8>,
/// The type of flag that is stored for the corresponding flag map
/// entry.
kind: FlagInfoKind,
}
/// The kind of flag that is being matched.
#[derive(Debug)]
enum FlagInfoKind {
/// A standard flag, e.g., --passthru.
Standard,
/// A negation of a standard flag, e.g., --no-multiline.
Negated,
/// An alias for a standard flag, e.g., --passthrough.
Alias,
}
impl std::fmt::Display for FlagInfo {
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
match self.name {
Ok(long) => write!(f, "--{long}"),
Err(short) => write!(f, "-{short}", short = char::from(short)),
}
}
}
/// A map from flag names (short, long, negated and aliases) to their ID.
///
/// Once an ID is known, it can be used to look up a flag's metadata in the
/// parser's internal state.
#[derive(Debug)]
struct FlagMap {
map: std::collections::HashMap<Vec<u8>, usize>,
}
impl FlagMap {
/// Create a new map of flags for the given flag information.
///
/// The index of each flag info corresponds to its ID.
fn new(infos: &[FlagInfo]) -> FlagMap {
let mut map = std::collections::HashMap::with_capacity(infos.len());
for (i, info) in infos.iter().enumerate() {
match info.name {
Ok(name) => {
assert_eq!(None, map.insert(name.as_bytes().to_vec(), i));
}
Err(byte) => {
assert_eq!(None, map.insert(vec![byte], i));
}
}
}
FlagMap { map }
}
/// Look for a match of `name` in the flag map.
///
/// This only returns a match if `name` is exactly equal to a known flag
/// name.
fn find(&self, name: &[u8]) -> Option<usize> {
self.map.get(name).copied()
}
}
/// Possibly return a message suggesting flags with names similar to the one
/// given.
///
/// The one given should be a flag given by the user (without the leading
/// dashes) that was unrecognized. This attempts to find existing flags that
/// are similar to the one given.
fn suggest(unrecognized: &str) -> Option<String> {
let similars = find_similar_names(unrecognized);
if similars.is_empty() {
return None;
}
let list = similars
.into_iter()
.map(|name| format!("--{name}"))
.collect::<Vec<String>>()
.join(", ");
Some(format!("similar flags that are available: {list}"))
}
/// Return a sequence of names similar to the unrecognized name given.
fn find_similar_names(unrecognized: &str) -> Vec<&'static str> {
// The jaccard similarity threshold at which we consider two flag names
// similar enough that it's worth suggesting it to the end user.
//
// This value was determined by some ad hoc experimentation. It might need
// further tweaking.
const THRESHOLD: f64 = 0.4;
let mut similar = vec![];
let bow_given = ngrams(unrecognized);
for &flag in FLAGS.iter() {
let name = flag.name_long();
let bow = ngrams(name);
if jaccard_index(&bow_given, &bow) >= THRESHOLD {
similar.push(name);
}
if let Some(name) = flag.name_negated() {
let bow = ngrams(name);
if jaccard_index(&bow_given, &bow) >= THRESHOLD {
similar.push(name);
}
}
for name in flag.aliases() {
let bow = ngrams(name);
if jaccard_index(&bow_given, &bow) >= THRESHOLD {
similar.push(name);
}
}
}
similar
}
/// A "bag of words" is a set of ngrams.
type BagOfWords<'a> = BTreeSet<Cow<'a, [u8]>>;
/// Returns the jaccard index (a measure of similarity) between sets of ngrams.
fn jaccard_index(ngrams1: &BagOfWords<'_>, ngrams2: &BagOfWords<'_>) -> f64 {
let union = u32::try_from(ngrams1.union(ngrams2).count())
.expect("fewer than u32::MAX flags");
let intersection = u32::try_from(ngrams1.intersection(ngrams2).count())
.expect("fewer than u32::MAX flags");
f64::from(intersection) / f64::from(union)
}
/// Returns all 3-grams in the slice given.
///
/// If the slice doesn't contain a 3-gram, then one is artificially created by
/// padding it out with a character that will never appear in a flag name.
fn ngrams(flag_name: &str) -> BagOfWords<'_> {
// We only allow ASCII flag names, so we can just use bytes.
let slice = flag_name.as_bytes();
let seq: Vec<Cow<[u8]>> = match slice.len() {
0 => vec![Cow::Owned(b"!!!".to_vec())],
1 => vec![Cow::Owned(vec![slice[0], b'!', b'!'])],
2 => vec![Cow::Owned(vec![slice[0], slice[1], b'!'])],
_ => slice.windows(3).map(Cow::Borrowed).collect(),
};
BTreeSet::from_iter(seq)
}
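The suggestion logic above boils down to comparing sets of 3-grams with the Jaccard index. The following is a minimal standalone sketch of that idea, not the code from this change: the names `trigrams` and `jaccard` are hypothetical, and it uses owned `Vec<u8>` values instead of `Cow` to keep the example short.

```rust
use std::collections::BTreeSet;

/// Hypothetical helper: the set of 3-byte windows in `name`. Names shorter
/// than three bytes are padded with `!`, which is assumed never to appear
/// in a flag name.
fn trigrams(name: &str) -> BTreeSet<Vec<u8>> {
    let bytes = name.as_bytes();
    match bytes.len() {
        0 => BTreeSet::from([b"!!!".to_vec()]),
        1 => BTreeSet::from([vec![bytes[0], b'!', b'!']]),
        2 => BTreeSet::from([vec![bytes[0], bytes[1], b'!']]),
        _ => bytes.windows(3).map(|w| w.to_vec()).collect(),
    }
}

/// Hypothetical helper: |A ∩ B| / |A ∪ B| for two trigram sets. Neither set
/// is ever empty because of the padding above, so the union is non-zero.
fn jaccard(a: &BTreeSet<Vec<u8>>, b: &BTreeSet<Vec<u8>>) -> f64 {
    let union = a.union(b).count();
    let intersection = a.intersection(b).count();
    intersection as f64 / union as f64
}

fn main() {
    // A typo that drops the last letter still shares most trigrams.
    let typo = trigrams("multilin");
    let real = trigrams("multiline");
    println!("similarity: {:.3}", jaccard(&typo, &real));
    // Unrelated names share nothing.
    println!("similarity: {:.3}", jaccard(&trigrams("foo"), &trigrams("bar")));
}
```

With a threshold of 0.4, as in `find_similar_names`, the first pair would be suggested and the second would not.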

160
crates/core/haystack.rs Normal file

@@ -0,0 +1,160 @@
/*!
Defines a builder for haystacks.
A "haystack" represents something we want to search. It encapsulates the logic
for whether a haystack ought to be searched or not, separate from the standard
ignore rules and other filtering logic.
Effectively, a haystack wraps a directory entry and adds some light application
level logic around it.
*/
use std::path::Path;
/// A builder for constructing things to search over.
#[derive(Clone, Debug)]
pub(crate) struct HaystackBuilder {
strip_dot_prefix: bool,
}
impl HaystackBuilder {
/// Return a new haystack builder with a default configuration.
pub(crate) fn new() -> HaystackBuilder {
HaystackBuilder { strip_dot_prefix: false }
}
/// Create a new haystack from a possibly missing directory entry.
///
/// If the directory entry isn't present, then the corresponding error is
/// logged if messages have been configured. Otherwise, if the directory
/// entry is deemed searchable, then it is returned as a haystack.
pub(crate) fn build_from_result(
&self,
result: Result<ignore::DirEntry, ignore::Error>,
) -> Option<Haystack> {
match result {
Ok(dent) => self.build(dent),
Err(err) => {
err_message!("{err}");
None
}
}
}
/// Create a new haystack using this builder's configuration.
///
/// If a directory entry could not be created or should otherwise not be
/// searched, then this returns `None` after emitting any relevant log
/// messages.
fn build(&self, dent: ignore::DirEntry) -> Option<Haystack> {
let hay = Haystack { dent, strip_dot_prefix: self.strip_dot_prefix };
if let Some(err) = hay.dent.error() {
ignore_message!("{err}");
}
// If this entry was explicitly provided by an end user, then we always
// want to search it.
if hay.is_explicit() {
return Some(hay);
}
// At this point, we only want to search something if it's explicitly a
// file. This omits symlinks. (If ripgrep was configured to follow
// symlinks, then they have already been followed by the directory
// traversal.)
if hay.is_file() {
return Some(hay);
}
// We got nothing. Emit a debug message, but only if this isn't a
// directory. Otherwise, emitting messages for directories is just
// noisy.
if !hay.is_dir() {
log::debug!(
"ignoring {}: failed to pass haystack filter: \
file type: {:?}, metadata: {:?}",
hay.dent.path().display(),
hay.dent.file_type(),
hay.dent.metadata()
);
}
None
}
/// When enabled, if the haystack's file path starts with `./` then it is
/// stripped.
///
/// This is useful when implicitly searching the current working directory.
pub(crate) fn strip_dot_prefix(
&mut self,
yes: bool,
) -> &mut HaystackBuilder {
self.strip_dot_prefix = yes;
self
}
}
/// A haystack is a thing we want to search.
///
/// Generally, a haystack is either a file or stdin.
#[derive(Clone, Debug)]
pub(crate) struct Haystack {
dent: ignore::DirEntry,
strip_dot_prefix: bool,
}
impl Haystack {
/// Return the file path corresponding to this haystack.
///
/// If this haystack corresponds to stdin, then a special `<stdin>` path
/// is returned instead.
pub(crate) fn path(&self) -> &Path {
if self.strip_dot_prefix && self.dent.path().starts_with("./") {
self.dent.path().strip_prefix("./").unwrap()
} else {
self.dent.path()
}
}
/// Returns true if and only if this entry corresponds to stdin.
pub(crate) fn is_stdin(&self) -> bool {
self.dent.is_stdin()
}
/// Returns true if and only if this entry corresponds to a haystack to
/// search that was explicitly supplied by an end user.
///
/// Generally, this corresponds to either stdin or an explicit file path
/// argument. e.g., in `rg foo some-file ./some-dir/`, `some-file` is
/// an explicit haystack, but, e.g., `./some-dir/some-other-file` is not.
///
/// However, note that ripgrep does not see through shell globbing. e.g.,
/// in `rg foo ./some-dir/*`, `./some-dir/some-other-file` will be treated
/// as an explicit haystack.
pub(crate) fn is_explicit(&self) -> bool {
// stdin is obvious. When an entry has a depth of 0, that means it
// was explicitly provided to our directory iterator, which means it
// was in turn explicitly provided by the end user. The !is_dir check
// means that we want to search files even if they're symlinks, again,
// because they were explicitly provided. (And we never want to try
// to search a directory.)
self.is_stdin() || (self.dent.depth() == 0 && !self.is_dir())
}
/// Returns true if and only if this haystack points to a directory after
/// following symbolic links.
fn is_dir(&self) -> bool {
let ft = match self.dent.file_type() {
None => return false,
Some(ft) => ft,
};
if ft.is_dir() {
return true;
}
// If this is a symlink, then we want to follow it to determine
// whether it's a directory or not.
self.dent.path_is_symlink() && self.dent.path().is_dir()
}
/// Returns true if and only if this haystack points to a file.
fn is_file(&self) -> bool {
self.dent.file_type().map_or(false, |ft| ft.is_file())
}
}
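The `./` stripping done by `Haystack::path` relies on `Path::strip_prefix` working on path components rather than raw bytes. A minimal sketch of just that behavior, assuming a hypothetical free function `display_path` in place of the method (it falls back to the original path instead of checking `starts_with` first):

```rust
use std::path::Path;

/// Hypothetical stand-in for `Haystack::path`: drop a leading `./` when
/// `strip_dot_prefix` is set, otherwise return the path unchanged.
fn display_path(path: &Path, strip_dot_prefix: bool) -> &Path {
    if strip_dot_prefix {
        // `strip_prefix` fails when the path does not start with `./`,
        // in which case the original path is kept.
        path.strip_prefix("./").unwrap_or(path)
    } else {
        path
    }
}

fn main() {
    let p = Path::new("./src/main.rs");
    println!("{}", display_path(p, true).display());
    println!("{}", display_path(p, false).display());
}
```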

72
crates/core/logger.rs Normal file

@@ -0,0 +1,72 @@
/*!
Defines a super simple logger that works with the `log` crate.
We don't do anything fancy. We just need basic log levels and the ability to
print to stderr. We therefore avoid bringing in extra dependencies just for
this functionality.
*/
use log::{self, Log};
/// The simplest possible logger that logs to stderr.
///
/// This logger does no filtering. Instead, it relies on the `log` crate's
/// filtering via its global max_level setting.
#[derive(Debug)]
pub(crate) struct Logger(());
/// A singleton used as the target for an implementation of the `Log` trait.
const LOGGER: &'static Logger = &Logger(());
impl Logger {
/// Create a new logger that logs to stderr and initialize it as the
/// global logger. If there was a problem setting the logger, then an
/// error is returned.
pub(crate) fn init() -> Result<(), log::SetLoggerError> {
log::set_logger(LOGGER)
}
}
impl Log for Logger {
fn enabled(&self, _: &log::Metadata<'_>) -> bool {
// We set the log level via log::set_max_level, so we don't need to
// implement filtering here.
true
}
fn log(&self, record: &log::Record<'_>) {
match (record.file(), record.line()) {
(Some(file), Some(line)) => {
eprintln_locked!(
"{}|{}|{}:{}: {}",
record.level(),
record.target(),
file,
line,
record.args()
);
}
(Some(file), None) => {
eprintln_locked!(
"{}|{}|{}: {}",
record.level(),
record.target(),
file,
record.args()
);
}
_ => {
eprintln_locked!(
"{}|{}: {}",
record.level(),
record.target(),
record.args()
);
}
}
}
fn flush(&self) {
// We use eprintln_locked! which is flushed on every call.
}
}

483
crates/core/main.rs Normal file

@@ -0,0 +1,483 @@
/*!
The main entry point into ripgrep.
*/
use std::{io::Write, process::ExitCode};
use ignore::WalkState;
use crate::flags::{HiArgs, SearchMode};
#[macro_use]
mod messages;
mod flags;
mod haystack;
mod logger;
mod search;
// Since Rust no longer uses jemalloc by default, ripgrep will, by default,
// use the system allocator. On Linux, this would normally be glibc's
// allocator, which is pretty good. In particular, ripgrep does not have a
// particularly allocation heavy workload, so there really isn't much
// difference (for ripgrep's purposes) between glibc's allocator and jemalloc.
//
// However, when ripgrep is built with musl, this means ripgrep will use musl's
// allocator, which appears to be substantially worse. (musl's goal is not to
// have the fastest version of everything. Its goal is to be small and amenable
// to static compilation.) Even though ripgrep isn't particularly allocation
// heavy, musl's allocator appears to slow down ripgrep quite a bit. Therefore,
// when building with musl, we use jemalloc.
//
// We don't unconditionally use jemalloc because it can be nice to use the
// system's default allocator by default. Moreover, jemalloc seems to increase
// compilation times by a bit.
//
// Moreover, we only do this on 64-bit systems since jemalloc doesn't support
// i686.
#[cfg(all(target_env = "musl", target_pointer_width = "64"))]
#[global_allocator]
static ALLOC: jemallocator::Jemalloc = jemallocator::Jemalloc;
/// Then, as it was, then again it will be.
fn main() -> ExitCode {
match run(flags::parse()) {
Ok(code) => code,
Err(err) => {
// Look for a broken pipe error. In this case, we generally want
// to exit "gracefully" with a success exit code. This matches
// existing Unix convention. We need to handle this explicitly
// since the Rust runtime doesn't ask for PIPE signals, and thus
// we get an I/O error instead. Traditional C Unix applications
// quit by getting a PIPE signal that they don't handle, and thus
// the unhandled signal causes the process to unceremoniously
// terminate.
for cause in err.chain() {
if let Some(ioerr) = cause.downcast_ref::<std::io::Error>() {
if ioerr.kind() == std::io::ErrorKind::BrokenPipe {
return ExitCode::from(0);
}
}
}
eprintln_locked!("{:#}", err);
ExitCode::from(2)
}
}
}
/// The main entry point for ripgrep.
///
/// The given parse result determines ripgrep's behavior. The parse
/// result should be the result of parsing CLI arguments in a low level
/// representation, and then followed by an attempt to convert them into a
/// higher level representation. The higher level representation has some nicer
/// abstractions, for example, instead of representing the `-g/--glob` flag
/// as a `Vec<String>` (as in the low level representation), the globs are
/// converted into a single matcher.
fn run(result: crate::flags::ParseResult<HiArgs>) -> anyhow::Result<ExitCode> {
use crate::flags::{Mode, ParseResult};
let args = match result {
ParseResult::Err(err) => return Err(err),
ParseResult::Special(mode) => return special(mode),
ParseResult::Ok(args) => args,
};
let matched = match args.mode() {
Mode::Search(_) if !args.matches_possible() => false,
Mode::Search(mode) if args.threads() == 1 => search(&args, mode)?,
Mode::Search(mode) => search_parallel(&args, mode)?,
Mode::Files if args.threads() == 1 => files(&args)?,
Mode::Files => files_parallel(&args)?,
Mode::Types => return types(&args),
Mode::Generate(mode) => return generate(mode),
};
Ok(if matched && (args.quiet() || !messages::errored()) {
ExitCode::from(0)
} else if messages::errored() {
ExitCode::from(2)
} else {
ExitCode::from(1)
})
}
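The exit status rule at the end of `run` can be read as a small truth table. The sketch below restates it as a pure function; `exit_status` is a hypothetical name, and plain `u8` values stand in for `ExitCode` so the result is easy to compare.

```rust
/// Hypothetical restatement of the exit status rule in `run`:
/// 0 = a match was found, 1 = no match, 2 = an error occurred.
fn exit_status(matched: bool, quiet: bool, errored: bool) -> u8 {
    if matched && (quiet || !errored) {
        // In quiet mode, a match wins even if a non-fatal error occurred.
        0
    } else if errored {
        2
    } else {
        1
    }
}

fn main() {
    for (matched, quiet, errored) in [
        (true, false, false),
        (true, false, true),
        (true, true, true),
        (false, false, false),
    ] {
        println!(
            "matched={matched} quiet={quiet} errored={errored} -> {}",
            exit_status(matched, quiet, errored)
        );
    }
}
```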
/// The top-level entry point for single-threaded search.
///
/// This recursively steps through the file list (current directory by default)
/// and searches each file sequentially.
fn search(args: &HiArgs, mode: SearchMode) -> anyhow::Result<bool> {
let started_at = std::time::Instant::now();
let haystack_builder = args.haystack_builder();
let unsorted = args
.walk_builder()?
.build()
.filter_map(|result| haystack_builder.build_from_result(result));
let haystacks = args.sort(unsorted);
let mut matched = false;
let mut searched = false;
let mut stats = args.stats();
let mut searcher = args.search_worker(
args.matcher()?,
args.searcher()?,
args.printer(mode, args.stdout()),
)?;
for haystack in haystacks {
searched = true;
let search_result = match searcher.search(&haystack) {
Ok(search_result) => search_result,
// A broken pipe means graceful termination.
Err(err) if err.kind() == std::io::ErrorKind::BrokenPipe => break,
Err(err) => {
err_message!("{}: {}", haystack.path().display(), err);
continue;
}
};
matched = matched || search_result.has_match();
if let Some(ref mut stats) = stats {
*stats += search_result.stats().unwrap();
}
if matched && args.quit_after_match() {
break;
}
}
if args.has_implicit_path() && !searched {
eprint_nothing_searched();
}
if let Some(ref stats) = stats {
let wtr = searcher.printer().get_mut();
let _ = print_stats(mode, stats, started_at, wtr);
}
Ok(matched)
}
/// The top-level entry point for multi-threaded search.
///
/// The parallelism is itself achieved by the recursive directory traversal.
/// All we need to do is feed it a worker for performing a search on each file.
///
/// Requesting a sorted output from ripgrep (such as with `--sort path`) will
/// automatically disable parallelism and hence sorting is not handled here.
fn search_parallel(args: &HiArgs, mode: SearchMode) -> anyhow::Result<bool> {
use std::sync::atomic::{AtomicBool, Ordering};
let started_at = std::time::Instant::now();
let haystack_builder = args.haystack_builder();
let bufwtr = args.buffer_writer();
let stats = args.stats().map(std::sync::Mutex::new);
let matched = AtomicBool::new(false);
let searched = AtomicBool::new(false);
let mut searcher = args.search_worker(
args.matcher()?,
args.searcher()?,
args.printer(mode, bufwtr.buffer()),
)?;
args.walk_builder()?.build_parallel().run(|| {
let bufwtr = &bufwtr;
let stats = &stats;
let matched = &matched;
let searched = &searched;
let haystack_builder = &haystack_builder;
let mut searcher = searcher.clone();
Box::new(move |result| {
let haystack = match haystack_builder.build_from_result(result) {
Some(haystack) => haystack,
None => return WalkState::Continue,
};
searched.store(true, Ordering::SeqCst);
searcher.printer().get_mut().clear();
let search_result = match searcher.search(&haystack) {
Ok(search_result) => search_result,
Err(err) => {
err_message!("{}: {}", haystack.path().display(), err);
return WalkState::Continue;
}
};
if search_result.has_match() {
matched.store(true, Ordering::SeqCst);
}
if let Some(ref locked_stats) = *stats {
let mut stats = locked_stats.lock().unwrap();
*stats += search_result.stats().unwrap();
}
if let Err(err) = bufwtr.print(searcher.printer().get_mut()) {
// A broken pipe means graceful termination.
if err.kind() == std::io::ErrorKind::BrokenPipe {
return WalkState::Quit;
}
// Otherwise, we continue on our merry way.
err_message!("{}: {}", haystack.path().display(), err);
}
if matched.load(Ordering::SeqCst) && args.quit_after_match() {
WalkState::Quit
} else {
WalkState::Continue
}
})
});
if args.has_implicit_path() && !searched.load(Ordering::SeqCst) {
eprint_nothing_searched();
}
if let Some(ref locked_stats) = stats {
let stats = locked_stats.lock().unwrap();
let mut wtr = searcher.printer().get_mut();
let _ = print_stats(mode, &stats, started_at, &mut wtr);
let _ = bufwtr.print(&mut wtr);
}
Ok(matched.load(Ordering::SeqCst))
}
/// The top-level entry point for file listing without searching.
///
/// This recursively steps through the file list (current directory by default)
/// and prints each path sequentially using a single thread.
fn files(args: &HiArgs) -> anyhow::Result<bool> {
let haystack_builder = args.haystack_builder();
let unsorted = args
.walk_builder()?
.build()
.filter_map(|result| haystack_builder.build_from_result(result));
let haystacks = args.sort(unsorted);
let mut matched = false;
let mut path_printer = args.path_printer_builder().build(args.stdout());
for haystack in haystacks {
matched = true;
if args.quit_after_match() {
break;
}
if let Err(err) = path_printer.write(haystack.path()) {
// A broken pipe means graceful termination.
if err.kind() == std::io::ErrorKind::BrokenPipe {
break;
}
// Otherwise, we have some other error that's preventing us from
// writing to stdout, so we should bubble it up.
return Err(err.into());
}
}
Ok(matched)
}
/// The top-level entry point for multi-threaded file listing without
/// searching.
///
/// This recursively steps through the file list (current directory by default)
/// using multiple threads and prints each path from a single printing thread.
///
/// Requesting a sorted output from ripgrep (such as with `--sort path`) will
/// automatically disable parallelism and hence sorting is not handled here.
fn files_parallel(args: &HiArgs) -> anyhow::Result<bool> {
use std::{
sync::{
atomic::{AtomicBool, Ordering},
mpsc,
},
thread,
};
let haystack_builder = args.haystack_builder();
let mut path_printer = args.path_printer_builder().build(args.stdout());
let matched = AtomicBool::new(false);
let (tx, rx) = mpsc::channel::<crate::haystack::Haystack>();
// We spawn a single printing thread to make sure we don't tear writes.
// We use a channel here under the presumption that it's probably faster
// than using a mutex in the worker threads below, but this has never been
// seriously litigated.
let print_thread = thread::spawn(move || -> std::io::Result<()> {
for haystack in rx.iter() {
path_printer.write(haystack.path())?;
}
Ok(())
});
args.walk_builder()?.build_parallel().run(|| {
let haystack_builder = &haystack_builder;
let matched = &matched;
let tx = tx.clone();
Box::new(move |result| {
let haystack = match haystack_builder.build_from_result(result) {
Some(haystack) => haystack,
None => return WalkState::Continue,
};
matched.store(true, Ordering::SeqCst);
if args.quit_after_match() {
WalkState::Quit
} else {
match tx.send(haystack) {
Ok(_) => WalkState::Continue,
Err(_) => WalkState::Quit,
}
}
})
});
drop(tx);
if let Err(err) = print_thread.join().unwrap() {
// A broken pipe means graceful termination, so fall through.
// Otherwise, something bad happened while writing to stdout, so bubble
// it up.
if err.kind() != std::io::ErrorKind::BrokenPipe {
return Err(err.into());
}
}
Ok(matched.load(Ordering::SeqCst))
}
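The pattern in `files_parallel` is many producers feeding one printing thread over a channel. A standalone sketch of that shape using only the standard library follows. It is an illustration under assumptions, not the code above: `collect_via_printer` is a hypothetical name, the workers are plain threads rather than the parallel directory walker, and the "printer" collects into a `Vec` so the result can be inspected.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical helper: each batch is handled by its own worker thread, and
/// a single consumer thread receives every path. The result is sorted only
/// because arrival order across workers is not deterministic.
fn collect_via_printer(batches: Vec<Vec<String>>) -> Vec<String> {
    let (tx, rx) = mpsc::channel::<String>();
    // The single consumer, standing in for the printing thread.
    let printer = thread::spawn(move || {
        let mut seen = Vec::new();
        for path in rx.iter() {
            seen.push(path);
        }
        seen
    });
    let workers: Vec<_> = batches
        .into_iter()
        .map(|batch| {
            let tx = tx.clone();
            thread::spawn(move || {
                for path in batch {
                    // A send error means the consumer is gone, so stop early.
                    if tx.send(path).is_err() {
                        break;
                    }
                }
            })
        })
        .collect();
    // Drop the original sender so the consumer's loop ends once all
    // workers have finished and dropped their clones.
    drop(tx);
    for worker in workers {
        worker.join().unwrap();
    }
    let mut seen = printer.join().unwrap();
    seen.sort();
    seen
}

fn main() {
    let paths = collect_via_printer(vec![
        vec!["b.rs".to_string()],
        vec!["a.rs".to_string(), "c.rs".to_string()],
    ]);
    println!("{paths:?}");
}
```

As in the original, dropping the last sender is what lets the consuming loop terminate. Without the `drop(tx)`, joining the consumer would block forever.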
/// The top-level entry point for `--type-list`.
fn types(args: &HiArgs) -> anyhow::Result<ExitCode> {
let mut count = 0;
let mut stdout = args.stdout();
for def in args.types().definitions() {
count += 1;
stdout.write_all(def.name().as_bytes())?;
stdout.write_all(b": ")?;
let mut first = true;
for glob in def.globs() {
if !first {
stdout.write_all(b", ")?;
}
stdout.write_all(glob.as_bytes())?;
first = false;
}
stdout.write_all(b"\n")?;
}
Ok(ExitCode::from(if count == 0 { 1 } else { 0 }))
}
/// Implements ripgrep's "generate" modes.
///
/// These modes correspond to generating some kind of ancillary data related
/// to ripgrep. At present, this includes ripgrep's man page (in roff format)
/// and supported shell completions.
fn generate(mode: crate::flags::GenerateMode) -> anyhow::Result<ExitCode> {
use crate::flags::GenerateMode;
let output = match mode {
GenerateMode::Man => flags::generate_man_page(),
GenerateMode::CompleteBash => flags::generate_complete_bash(),
GenerateMode::CompleteZsh => flags::generate_complete_zsh(),
GenerateMode::CompleteFish => flags::generate_complete_fish(),
GenerateMode::CompletePowerShell => {
flags::generate_complete_powershell()
}
};
writeln!(std::io::stdout(), "{}", output.trim_end())?;
Ok(ExitCode::from(0))
}
/// Implements ripgrep's "special" modes.
///
/// A special mode is one that generally short-circuits most (not all) of
/// ripgrep's initialization logic and skips right to this routine. The
/// special modes essentially consist of printing help and version output. The
/// idea behind the short circuiting is to ensure there is as little as possible
/// (within reason) that would prevent ripgrep from emitting help output.
///
/// For example, part of the initialization logic that is skipped (among
/// other things) is accessing the current working directory. If that fails,
/// ripgrep emits an error. We don't want to emit an error if it fails and
/// the user requested version or help information.
fn special(mode: crate::flags::SpecialMode) -> anyhow::Result<ExitCode> {
use crate::flags::SpecialMode;
let mut exit = ExitCode::from(0);
let output = match mode {
SpecialMode::HelpShort => flags::generate_help_short(),
SpecialMode::HelpLong => flags::generate_help_long(),
SpecialMode::VersionShort => flags::generate_version_short(),
SpecialMode::VersionLong => flags::generate_version_long(),
// --pcre2-version is a little special because it emits an error
// exit code if this build of ripgrep doesn't support PCRE2.
SpecialMode::VersionPCRE2 => {
let (output, available) = flags::generate_version_pcre2();
if !available {
exit = ExitCode::from(1);
}
output
}
};
writeln!(std::io::stdout(), "{}", output.trim_end())?;
Ok(exit)
}
/// Prints a heuristic error message when nothing is searched.
///
/// This can happen if an applicable ignore file has one or more rules that
/// are too broad and cause ripgrep to ignore everything.
///
/// We only show this error message when the user does *not* provide an
/// explicit path to search. This is because the message can otherwise be
/// noisy, e.g., when it is intended that there is nothing to search.
fn eprint_nothing_searched() {
err_message!(
"No files were searched, which means ripgrep probably \
applied a filter you didn't expect.\n\
Running with --debug will show why files are being skipped."
);
}
/// Prints the statistics given to the writer given.
///
/// The search mode given determines whether the stats should be printed in
/// a plain text format or in a JSON format.
///
/// The `started` time should be the time at which ripgrep started working.
///
/// If an error occurs while writing, then writing stops and the error is
/// returned. Note that callers should probably ignore this error, since
/// whether stats fail to print or not generally shouldn't cause ripgrep to
/// enter into an "error" state. And usually the only way for this to fail is
/// if writing to stdout itself fails.
fn print_stats<W: Write>(
mode: SearchMode,
stats: &grep::printer::Stats,
started: std::time::Instant,
mut wtr: W,
) -> std::io::Result<()> {
let elapsed = std::time::Instant::now().duration_since(started);
if matches!(mode, SearchMode::JSON) {
// We specifically match the format laid out by the JSON printer in
// the grep-printer crate. We simply "extend" it with the 'summary'
// message type.
serde_json::to_writer(
&mut wtr,
&serde_json::json!({
"type": "summary",
"data": {
"stats": stats,
"elapsed_total": {
"secs": elapsed.as_secs(),
"nanos": elapsed.subsec_nanos(),
"human": format!("{:0.6}s", elapsed.as_secs_f64()),
},
}
}),
)?;
write!(wtr, "\n")
} else {
write!(
wtr,
"
{matches} matches
{lines} matched lines
{searches_with_match} files contained matches
{searches} files searched
{bytes_printed} bytes printed
{bytes_searched} bytes searched
{search_time:0.6} seconds spent searching
{process_time:0.6} seconds
",
matches = stats.matches(),
lines = stats.matched_lines(),
searches_with_match = stats.searches_with_match(),
searches = stats.searches(),
bytes_printed = stats.bytes_printed(),
bytes_searched = stats.bytes_searched(),
search_time = stats.elapsed().as_secs_f64(),
process_time = elapsed.as_secs_f64(),
)
}
}

139
crates/core/messages.rs Normal file

@@ -0,0 +1,139 @@
/*!
This module defines some macros and some light shared mutable state.
This state is responsible for keeping track of whether we should emit certain
kinds of messages to the user (such as errors) that are distinct from the
standard "debug" or "trace" log messages. This state is specifically set at
startup time when CLI arguments are parsed and then never changed.
The other state tracked here is whether ripgrep experienced an error
condition. Aside from errors associated with invalid CLI arguments, ripgrep
generally does not abort when an error occurs (e.g., if reading a file failed).
But when an error does occur, it will alter ripgrep's exit status. Thus, when
an error message is emitted via `err_message`, then a global flag is toggled
indicating that at least one error occurred. When ripgrep exits, this flag is
consulted to determine what the exit status ought to be.
*/
use std::sync::atomic::{AtomicBool, Ordering};
/// When false, "messages" will not be printed.
static MESSAGES: AtomicBool = AtomicBool::new(false);
/// When false, "messages" related to ignore rules will not be printed.
static IGNORE_MESSAGES: AtomicBool = AtomicBool::new(false);
/// Flipped to true when an error message is printed.
static ERRORED: AtomicBool = AtomicBool::new(false);
/// Like eprintln, but locks stdout to prevent interleaving lines.
///
/// This locks stdout, not stderr, even though this prints to stderr. This
/// avoids the appearance of interleaving output when stdout and stderr both
/// correspond to a tty.
#[macro_export]
macro_rules! eprintln_locked {
($($tt:tt)*) => {{
{
use std::io::Write;
// This is a bit of an abstraction violation because we explicitly
// lock stdout before printing to stderr. This avoids interleaving
// lines within ripgrep because `search_parallel` uses `termcolor`,
// which accesses the same stdout lock when writing lines.
let stdout = std::io::stdout().lock();
let mut stderr = std::io::stderr().lock();
// We don't try to report any errors here. One plausible error we
// can get in some cases is a broken pipe error. And when that
// occurs, we should exit gracefully. Otherwise, just abort with
// an error code because there isn't much else we can do.
//
// See: https://github.com/BurntSushi/ripgrep/issues/1966
if let Err(err) = write!(stderr, "rg: ") {
if err.kind() == std::io::ErrorKind::BrokenPipe {
std::process::exit(0);
} else {
std::process::exit(2);
}
}
if let Err(err) = writeln!(stderr, $($tt)*) {
if err.kind() == std::io::ErrorKind::BrokenPipe {
std::process::exit(0);
} else {
std::process::exit(2);
}
}
drop(stdout);
}
}}
}
/// Emit a non-fatal error message, unless messages were disabled.
#[macro_export]
macro_rules! message {
($($tt:tt)*) => {
if crate::messages::messages() {
eprintln_locked!($($tt)*);
}
}
}
/// Like message, but sets ripgrep's "errored" flag, which controls the exit
/// status.
#[macro_export]
macro_rules! err_message {
($($tt:tt)*) => {
crate::messages::set_errored();
message!($($tt)*);
}
}
/// Emit a non-fatal ignore-related error message (like a parse error), unless
/// ignore-messages were disabled.
#[macro_export]
macro_rules! ignore_message {
($($tt:tt)*) => {
if crate::messages::messages() && crate::messages::ignore_messages() {
eprintln_locked!($($tt)*);
}
}
}
/// Returns true if and only if messages should be shown.
pub(crate) fn messages() -> bool {
MESSAGES.load(Ordering::SeqCst)
}
/// Set whether messages should be shown or not.
///
/// By default, they are not shown.
pub(crate) fn set_messages(yes: bool) {
MESSAGES.store(yes, Ordering::SeqCst)
}
/// Returns true if and only if "ignore" related messages should be shown.
pub(crate) fn ignore_messages() -> bool {
IGNORE_MESSAGES.load(Ordering::SeqCst)
}
/// Set whether "ignore" related messages should be shown or not.
///
/// By default, they are not shown.
///
/// Note that this is overridden if `messages` is disabled. Namely, if
/// `messages` is disabled, then "ignore" messages are never shown, regardless
/// of this setting.
pub(crate) fn set_ignore_messages(yes: bool) {
IGNORE_MESSAGES.store(yes, Ordering::SeqCst)
}
/// Returns true if and only if ripgrep came across a non-fatal error.
pub(crate) fn errored() -> bool {
ERRORED.load(Ordering::SeqCst)
}
/// Indicate that ripgrep has come across a non-fatal error.
///
/// Callers should not use this directly. Instead, it is called automatically
/// via the `err_message` macro.
pub(crate) fn set_errored() {
ERRORED.store(true, Ordering::SeqCst);
}
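
The pattern in this module (global atomic flags, a stderr write made while holding the stdout lock, and an "errored" flag that is set even when output is suppressed) can be sketched with the standard library alone. This is a simplified illustration, not ripgrep's code: it uses plain functions instead of macros, and `write_locked`, `exit_code_for` and the function form of `err_message` are hypothetical names. Rather than calling `std::process::exit`, the sketch returns the exit code so the behavior can be checked.

```rust
use std::io::Write;
use std::sync::atomic::{AtomicBool, Ordering};

// Hypothetical stand-ins for the module's flags. Messages are off by default.
static MESSAGES: AtomicBool = AtomicBool::new(false);
static ERRORED: AtomicBool = AtomicBool::new(false);

fn set_messages(yes: bool) {
    MESSAGES.store(yes, Ordering::SeqCst);
}

fn errored() -> bool {
    ERRORED.load(Ordering::SeqCst)
}

/// Map a write error to an exit code: a broken pipe is a graceful exit (0),
/// anything else is reported as 2.
fn exit_code_for(kind: std::io::ErrorKind) -> i32 {
    if kind == std::io::ErrorKind::BrokenPipe {
        0
    } else {
        2
    }
}

/// Write one "rg: "-prefixed line to stderr while holding the stdout lock,
/// so that lines written to stdout by other threads cannot interleave.
fn write_locked(msg: &str) -> Result<(), i32> {
    let stdout = std::io::stdout().lock();
    let mut stderr = std::io::stderr().lock();
    let result = writeln!(stderr, "rg: {msg}");
    drop(stdout);
    result.map_err(|err| exit_code_for(err.kind()))
}

/// Record that a non-fatal error happened, then print it if messages are on.
fn err_message(msg: &str) -> Result<(), i32> {
    ERRORED.store(true, Ordering::SeqCst);
    if MESSAGES.load(Ordering::SeqCst) {
        write_locked(msg)
    } else {
        Ok(())
    }
}
```

In this sketch, calling `err_message("some error")` before `set_messages(true)` prints nothing but still flips `errored()` to `true`, which mirrors how the real `err_message!` macro sets the flag before delegating to `message!`.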

447
crates/core/search.rs Normal file

@@ -0,0 +1,447 @@
/*!
Defines a very high level "search worker" abstraction.
A search worker manages the high level interaction points between the matcher
(i.e., which regex engine is used), the searcher (i.e., how data is actually
read and matched using the regex engine) and the printer. For example, the
search worker is where things like preprocessors or decompression happen.
*/
use std::{io, path::Path};
use {grep::matcher::Matcher, termcolor::WriteColor};
/// The configuration for the search worker.
///
/// Among a few other things, the configuration primarily controls the way we
/// show search results to users at a very high level.
#[derive(Clone, Debug)]
struct Config {
preprocessor: Option<std::path::PathBuf>,
preprocessor_globs: ignore::overrides::Override,
search_zip: bool,
binary_implicit: grep::searcher::BinaryDetection,
binary_explicit: grep::searcher::BinaryDetection,
}
impl Default for Config {
fn default() -> Config {
Config {
preprocessor: None,
preprocessor_globs: ignore::overrides::Override::empty(),
search_zip: false,
binary_implicit: grep::searcher::BinaryDetection::none(),
binary_explicit: grep::searcher::BinaryDetection::none(),
}
}
}
/// A builder for configuring and constructing a search worker.
#[derive(Clone, Debug)]
pub(crate) struct SearchWorkerBuilder {
config: Config,
command_builder: grep::cli::CommandReaderBuilder,
decomp_builder: grep::cli::DecompressionReaderBuilder,
}
impl Default for SearchWorkerBuilder {
fn default() -> SearchWorkerBuilder {
SearchWorkerBuilder::new()
}
}
impl SearchWorkerBuilder {
/// Create a new builder for configuring and constructing a search worker.
pub(crate) fn new() -> SearchWorkerBuilder {
let mut cmd_builder = grep::cli::CommandReaderBuilder::new();
cmd_builder.async_stderr(true);
let mut decomp_builder = grep::cli::DecompressionReaderBuilder::new();
decomp_builder.async_stderr(true);
SearchWorkerBuilder {
config: Config::default(),
command_builder: cmd_builder,
decomp_builder,
}
}
/// Create a new search worker using the given searcher, matcher and
/// printer.
pub(crate) fn build<W: WriteColor>(
&self,
matcher: PatternMatcher,
searcher: grep::searcher::Searcher,
printer: Printer<W>,
) -> SearchWorker<W> {
let config = self.config.clone();
let command_builder = self.command_builder.clone();
let decomp_builder = self.decomp_builder.clone();
SearchWorker {
config,
command_builder,
decomp_builder,
matcher,
searcher,
printer,
}
}
/// Set the path to a preprocessor command.
///
/// When this is set, instead of searching files directly, the given
/// command will be run with the file path as the first argument, and the
/// output of that command will be searched instead.
pub(crate) fn preprocessor(
&mut self,
cmd: Option<std::path::PathBuf>,
) -> anyhow::Result<&mut SearchWorkerBuilder> {
if let Some(ref prog) = cmd {
let bin = grep::cli::resolve_binary(prog)?;
self.config.preprocessor = Some(bin);
} else {
self.config.preprocessor = None;
}
Ok(self)
}
/// Set the globs for determining which files should be run through the
/// preprocessor. By default, with no globs and a preprocessor specified,
/// every file is run through the preprocessor.
pub(crate) fn preprocessor_globs(
&mut self,
globs: ignore::overrides::Override,
) -> &mut SearchWorkerBuilder {
self.config.preprocessor_globs = globs;
self
}
/// Enable the decompression and searching of common compressed files.
///
/// When enabled, if a particular file path is recognized as a compressed
/// file, then it is decompressed before searching.
///
/// Note that if a preprocessor command is set, then it overrides this
/// setting.
pub(crate) fn search_zip(
&mut self,
yes: bool,
) -> &mut SearchWorkerBuilder {
self.config.search_zip = yes;
self
}
/// Set the binary detection that should be used when searching files
/// found via a recursive directory search.
///
/// Generally, this binary detection may be
/// `grep::searcher::BinaryDetection::quit` if we want to skip binary files
/// completely.
///
/// By default, no binary detection is performed.
pub(crate) fn binary_detection_implicit(
&mut self,
detection: grep::searcher::BinaryDetection,
) -> &mut SearchWorkerBuilder {
self.config.binary_implicit = detection;
self
}
/// Set the binary detection that should be used when searching files
/// explicitly supplied by an end user.
///
/// Generally, this binary detection should NOT be
/// `grep::searcher::BinaryDetection::quit`, since we never want to
/// automatically filter files supplied by the end user.
///
/// By default, no binary detection is performed.
pub(crate) fn binary_detection_explicit(
&mut self,
detection: grep::searcher::BinaryDetection,
) -> &mut SearchWorkerBuilder {
self.config.binary_explicit = detection;
self
}
}
/// The result of executing a search.
///
/// Generally speaking, the "result" of a search is sent to a printer, which
/// writes results to an underlying writer such as stdout or a file. However,
/// every search also has some aggregate statistics or meta data that may be
/// useful to higher level routines.
#[derive(Clone, Debug, Default)]
pub(crate) struct SearchResult {
has_match: bool,
stats: Option<grep::printer::Stats>,
}
impl SearchResult {
/// Whether the search found a match or not.
pub(crate) fn has_match(&self) -> bool {
self.has_match
}
/// Return aggregate search statistics for a single search, if available.
///
/// It can be expensive to compute statistics, so these are only present
/// if explicitly enabled in the printer provided by the caller.
pub(crate) fn stats(&self) -> Option<&grep::printer::Stats> {
self.stats.as_ref()
}
}
/// The pattern matcher used by a search worker.
#[derive(Clone, Debug)]
pub(crate) enum PatternMatcher {
RustRegex(grep::regex::RegexMatcher),
#[cfg(feature = "pcre2")]
PCRE2(grep::pcre2::RegexMatcher),
}
/// The printer used by a search worker.
///
/// The `W` type parameter refers to the type of the underlying writer.
#[derive(Clone, Debug)]
pub(crate) enum Printer<W> {
/// Use the standard printer, which supports the classic grep-like format.
Standard(grep::printer::Standard<W>),
/// Use the summary printer, which supports aggregate displays of search
/// results.
Summary(grep::printer::Summary<W>),
/// A JSON printer, which emits results in the JSON Lines format.
JSON(grep::printer::JSON<W>),
}
impl<W: WriteColor> Printer<W> {
/// Return a mutable reference to the underlying printer's writer.
pub(crate) fn get_mut(&mut self) -> &mut W {
match *self {
Printer::Standard(ref mut p) => p.get_mut(),
Printer::Summary(ref mut p) => p.get_mut(),
Printer::JSON(ref mut p) => p.get_mut(),
}
}
}
/// A worker for executing searches.
///
/// It is intended for a single worker to execute many searches, and is
/// generally intended to be used from a single thread. When searching using
/// multiple threads, it is better to create a new worker for each thread.
#[derive(Clone, Debug)]
pub(crate) struct SearchWorker<W> {
config: Config,
command_builder: grep::cli::CommandReaderBuilder,
decomp_builder: grep::cli::DecompressionReaderBuilder,
matcher: PatternMatcher,
searcher: grep::searcher::Searcher,
printer: Printer<W>,
}
impl<W: WriteColor> SearchWorker<W> {
/// Execute a search over the given haystack.
pub(crate) fn search(
&mut self,
haystack: &crate::haystack::Haystack,
) -> io::Result<SearchResult> {
let bin = if haystack.is_explicit() {
self.config.binary_explicit.clone()
} else {
self.config.binary_implicit.clone()
};
let path = haystack.path();
log::trace!("{}: binary detection: {:?}", path.display(), bin);
self.searcher.set_binary_detection(bin);
if haystack.is_stdin() {
self.search_reader(path, &mut io::stdin().lock())
} else if self.should_preprocess(path) {
self.search_preprocessor(path)
} else if self.should_decompress(path) {
self.search_decompress(path)
} else {
self.search_path(path)
}
}
/// Return a mutable reference to the underlying printer.
pub(crate) fn printer(&mut self) -> &mut Printer<W> {
&mut self.printer
}
/// Returns true if and only if the given file path should be
/// decompressed before searching.
fn should_decompress(&self, path: &Path) -> bool {
if !self.config.search_zip {
return false;
}
self.decomp_builder.get_matcher().has_command(path)
}
/// Returns true if and only if the given file path should be run through
/// the preprocessor.
fn should_preprocess(&self, path: &Path) -> bool {
if !self.config.preprocessor.is_some() {
return false;
}
if self.config.preprocessor_globs.is_empty() {
return true;
}
!self.config.preprocessor_globs.matched(path, false).is_ignore()
}
/// Search the given file path by first asking the preprocessor for the
/// data to search instead of opening the path directly.
fn search_preprocessor(
&mut self,
path: &Path,
) -> io::Result<SearchResult> {
use std::{fs::File, process::Stdio};
let bin = self.config.preprocessor.as_ref().unwrap();
let mut cmd = std::process::Command::new(bin);
cmd.arg(path).stdin(Stdio::from(File::open(path)?));
let mut rdr = self.command_builder.build(&mut cmd).map_err(|err| {
io::Error::new(
io::ErrorKind::Other,
format!(
"preprocessor command could not start: '{:?}': {}",
cmd, err,
),
)
})?;
let result = self.search_reader(path, &mut rdr).map_err(|err| {
io::Error::new(
io::ErrorKind::Other,
format!("preprocessor command failed: '{:?}': {}", cmd, err),
)
});
let close_result = rdr.close();
let search_result = result?;
close_result?;
Ok(search_result)
}
/// Attempt to decompress the data at the given file path and search the
/// result. If the given file path isn't recognized as a compressed file,
/// then search it without doing any decompression.
fn search_decompress(&mut self, path: &Path) -> io::Result<SearchResult> {
let mut rdr = self.decomp_builder.build(path)?;
let result = self.search_reader(path, &mut rdr);
let close_result = rdr.close();
let search_result = result?;
close_result?;
Ok(search_result)
}
/// Search the contents of the given file path.
fn search_path(&mut self, path: &Path) -> io::Result<SearchResult> {
use self::PatternMatcher::*;
let (searcher, printer) = (&mut self.searcher, &mut self.printer);
match self.matcher {
RustRegex(ref m) => search_path(m, searcher, printer, path),
#[cfg(feature = "pcre2")]
PCRE2(ref m) => search_path(m, searcher, printer, path),
}
}
/// Executes a search on the given reader, which may or may not correspond
/// directly to the contents of the given file path. Instead, the reader
/// may actually cause something else to be searched (for example, when
/// a preprocessor is set or when decompression is enabled). In those
/// cases, the file path is used for visual purposes only.
///
/// Generally speaking, this method should only be used when there is no
/// other choice. Searching via `search_path` provides more opportunities
/// for optimizations (such as memory maps).
fn search_reader<R: io::Read>(
&mut self,
path: &Path,
rdr: &mut R,
) -> io::Result<SearchResult> {
use self::PatternMatcher::*;
let (searcher, printer) = (&mut self.searcher, &mut self.printer);
match self.matcher {
RustRegex(ref m) => search_reader(m, searcher, printer, path, rdr),
#[cfg(feature = "pcre2")]
PCRE2(ref m) => search_reader(m, searcher, printer, path, rdr),
}
}
}
/// Search the contents of the given file path using the given matcher,
/// searcher and printer.
fn search_path<M: Matcher, W: WriteColor>(
matcher: M,
searcher: &mut grep::searcher::Searcher,
printer: &mut Printer<W>,
path: &Path,
) -> io::Result<SearchResult> {
match *printer {
Printer::Standard(ref mut p) => {
let mut sink = p.sink_with_path(&matcher, path);
searcher.search_path(&matcher, path, &mut sink)?;
Ok(SearchResult {
has_match: sink.has_match(),
stats: sink.stats().map(|s| s.clone()),
})
}
Printer::Summary(ref mut p) => {
let mut sink = p.sink_with_path(&matcher, path);
searcher.search_path(&matcher, path, &mut sink)?;
Ok(SearchResult {
has_match: sink.has_match(),
stats: sink.stats().map(|s| s.clone()),
})
}
Printer::JSON(ref mut p) => {
let mut sink = p.sink_with_path(&matcher, path);
searcher.search_path(&matcher, path, &mut sink)?;
Ok(SearchResult {
has_match: sink.has_match(),
stats: Some(sink.stats().clone()),
})
}
}
}
/// Search the contents of the given reader using the given matcher, searcher
/// and printer.
fn search_reader<M: Matcher, R: io::Read, W: WriteColor>(
matcher: M,
searcher: &mut grep::searcher::Searcher,
printer: &mut Printer<W>,
path: &Path,
mut rdr: R,
) -> io::Result<SearchResult> {
match *printer {
Printer::Standard(ref mut p) => {
let mut sink = p.sink_with_path(&matcher, path);
searcher.search_reader(&matcher, &mut rdr, &mut sink)?;
Ok(SearchResult {
has_match: sink.has_match(),
stats: sink.stats().map(|s| s.clone()),
})
}
Printer::Summary(ref mut p) => {
let mut sink = p.sink_with_path(&matcher, path);
searcher.search_reader(&matcher, &mut rdr, &mut sink)?;
Ok(SearchResult {
has_match: sink.has_match(),
stats: sink.stats().map(|s| s.clone()),
})
}
Printer::JSON(ref mut p) => {
let mut sink = p.sink_with_path(&matcher, path);
searcher.search_reader(&matcher, &mut rdr, &mut sink)?;
Ok(SearchResult {
has_match: sink.has_match(),
stats: Some(sink.stats().clone()),
})
}
}
}
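
The order in which `SearchWorker::search` picks a strategy (stdin first, then the preprocessor, then decompression, then a plain path search) can be sketched in isolation. This is an illustrative model only: `Strategy`, `choose`, `has_ext` and `COMPRESSED_EXTS` are hypothetical names, and file extensions stand in for both the override globs and the decompression matcher used by the real code.

```rust
use std::path::Path;

#[derive(Debug, PartialEq, Eq)]
enum Strategy {
    Stdin,
    Preprocessor,
    Decompress,
    Path,
}

struct Config {
    has_preprocessor: bool,
    /// Stand-in for the preprocessor globs: extensions to preprocess.
    /// Empty means "every file goes through the preprocessor".
    preprocessor_exts: Vec<&'static str>,
    search_zip: bool,
}

/// Assumed list of compressed extensions, for illustration only.
const COMPRESSED_EXTS: &[&str] = &["gz", "xz", "zst"];

fn has_ext(path: &Path, exts: &[&str]) -> bool {
    path.extension()
        .and_then(|e| e.to_str())
        .map_or(false, |e| exts.contains(&e))
}

fn should_preprocess(cfg: &Config, path: &Path) -> bool {
    if !cfg.has_preprocessor {
        return false;
    }
    if cfg.preprocessor_exts.is_empty() {
        return true;
    }
    has_ext(path, &cfg.preprocessor_exts)
}

/// Mirrors the if/else chain in `search`: the first matching rule wins, so a
/// preprocessor takes precedence over decompression.
fn choose(cfg: &Config, is_stdin: bool, path: &Path) -> Strategy {
    if is_stdin {
        Strategy::Stdin
    } else if should_preprocess(cfg, path) {
        Strategy::Preprocessor
    } else if cfg.search_zip && has_ext(path, COMPRESSED_EXTS) {
        Strategy::Decompress
    } else {
        Strategy::Path
    }
}
```

With a preprocessor restricted to `pdf` and `search_zip` enabled, `choose` sends `doc.pdf` to the preprocessor and `log.gz` to decompression. With no restriction on the preprocessor, `log.gz` goes to the preprocessor instead, which matches the note on `search_zip` that a preprocessor overrides it.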

47
crates/globset/Cargo.toml Normal file

@@ -0,0 +1,47 @@
[package]
name = "globset"
version = "0.4.14" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
Cross platform single glob and glob set matching. Glob set matching is the
process of matching one or more glob patterns against a single candidate path
simultaneously, and returning all of the globs that matched.
"""
documentation = "https://docs.rs/globset"
homepage = "https://github.com/BurntSushi/ripgrep/tree/master/crates/globset"
repository = "https://github.com/BurntSushi/ripgrep/tree/master/crates/globset"
readme = "README.md"
keywords = ["regex", "glob", "multiple", "set", "pattern"]
license = "Unlicense OR MIT"
edition = "2021"
[lib]
name = "globset"
bench = false
[dependencies]
aho-corasick = "1.1.1"
bstr = { version = "1.6.2", default-features = false, features = ["std"] }
log = { version = "0.4.20", optional = true }
serde = { version = "1.0.188", optional = true }
[dependencies.regex-syntax]
version = "0.8.0"
default-features = false
features = ["std"]
[dependencies.regex-automata]
version = "0.4.0"
default-features = false
features = ["std", "perf", "syntax", "meta", "nfa", "hybrid"]
[dev-dependencies]
glob = "0.3.1"
serde_json = "1.0.107"
[features]
default = ["log"]
# DEPRECATED. It is a no-op. SIMD is done automatically through runtime
# dispatch.
simd-accel = []
serde1 = ["serde"]


@@ -4,11 +4,10 @@ Cross platform single glob and glob set matching. Glob set matching is the
process of matching one or more glob patterns against a single candidate path
simultaneously, and returning all of the globs that matched.
[![Linux build status](https://api.travis-ci.org/BurntSushi/ripgrep.png)](https://travis-ci.org/BurntSushi/ripgrep)
[![Windows build status](https://ci.appveyor.com/api/projects/status/github/BurntSushi/ripgrep?svg=true)](https://ci.appveyor.com/project/BurntSushi/ripgrep)
[![Build status](https://github.com/BurntSushi/ripgrep/workflows/ci/badge.svg)](https://github.com/BurntSushi/ripgrep/actions)
[![](https://img.shields.io/crates/v/globset.svg)](https://crates.io/crates/globset)
Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org).
Dual-licensed under MIT or the [UNLICENSE](https://unlicense.org/).
### Documentation
@@ -20,14 +19,12 @@ Add this to your `Cargo.toml`:
```toml
[dependencies]
globset = "0.3"
globset = "0.4"
```
and this to your crate root:
### Features
```rust
extern crate globset;
```
* `serde1`: Enables implementing Serde traits on the `Glob` type.
### Example: one glob
@@ -81,12 +78,12 @@ assert_eq!(set.matches("src/bar/baz/foo.rs"), vec![0, 2]);
This crate implements globs by converting them to regular expressions, and
executing them with the
[`regex`](https://github.com/rust-lang-nursery/regex)
[`regex`](https://github.com/rust-lang/regex)
crate.
For single glob matching, performance of this crate should be roughly on par
with the performance of the
[`glob`](https://github.com/rust-lang-nursery/glob)
[`glob`](https://github.com/rust-lang/glob)
crate. (`*_regex` correspond to benchmarks for this library while `*_glob`
correspond to benchmarks for the `glob` library.)
Optimizations in the `regex` crate may propel this library past `glob`,
@@ -111,7 +108,7 @@ test many_short_glob ... bench: 1,063 ns/iter (+/- 47)
test many_short_regex_set ... bench: 186 ns/iter (+/- 11)
```
### Comparison with the [`glob`](https://github.com/rust-lang-nursery/glob) crate
### Comparison with the [`glob`](https://github.com/rust-lang/glob) crate
* Supports alternate "or" globs, e.g., `*.{foo,bar}`.
* Can match non-UTF-8 file paths correctly.


@@ -4,16 +4,8 @@ tool itself, see the benchsuite directory.
*/
#![feature(test)]
extern crate glob;
extern crate globset;
#[macro_use]
extern crate lazy_static;
extern crate regex;
extern crate test;
use std::ffi::OsStr;
use std::path::Path;
use globset::{Candidate, Glob, GlobMatcher, GlobSet, GlobSetBuilder};
const EXT: &'static str = "some/a/bigger/path/to/the/crazy/needle.txt";

30
crates/globset/src/fnv.rs Normal file

@@ -0,0 +1,30 @@
/// A convenience alias for creating a hash map with an FNV hasher.
pub(crate) type HashMap<K, V> =
std::collections::HashMap<K, V, std::hash::BuildHasherDefault<Hasher>>;
/// A hasher that implements the Fowler-Noll-Vo (FNV) hash.
pub(crate) struct Hasher(u64);
impl Hasher {
const OFFSET_BASIS: u64 = 0xcbf29ce484222325;
const PRIME: u64 = 0x100000001b3;
}
impl Default for Hasher {
fn default() -> Hasher {
Hasher(Hasher::OFFSET_BASIS)
}
}
impl std::hash::Hasher for Hasher {
fn finish(&self) -> u64 {
self.0
}
fn write(&mut self, bytes: &[u8]) {
for &byte in bytes.iter() {
self.0 = self.0 ^ u64::from(byte);
self.0 = self.0.wrapping_mul(Hasher::PRIME);
}
}
}
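
As a standalone illustration of how a hasher like this plugs into `std::collections::HashMap`, the sketch below re-declares an equivalent type. Because each byte is XORed in before the multiply, this is the FNV-1a variant. The names `Fnv`, `FnvHashMap`, `fnv` and `count_extensions` are hypothetical and exist only for this example.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// A 64-bit FNV-1a hasher, equivalent in behavior to the one above.
struct Fnv(u64);

impl Default for Fnv {
    fn default() -> Fnv {
        Fnv(0xcbf29ce484222325) // 64-bit offset basis
    }
}

impl Hasher for Fnv {
    fn finish(&self) -> u64 {
        self.0
    }

    fn write(&mut self, bytes: &[u8]) {
        for &byte in bytes {
            self.0 ^= u64::from(byte);
            self.0 = self.0.wrapping_mul(0x100000001b3); // 64-bit FNV prime
        }
    }
}

/// A hash map keyed with the FNV hasher instead of the default SipHash.
type FnvHashMap<K, V> = HashMap<K, V, BuildHasherDefault<Fnv>>;

/// Hash raw bytes directly, bypassing the `Hash` trait (which would add
/// length prefixes or terminators for slices and strings).
fn fnv(bytes: &[u8]) -> u64 {
    let mut hasher = Fnv::default();
    hasher.write(bytes);
    hasher.finish()
}

/// Example use: count how often each extension occurs.
fn count_extensions(exts: &[&str]) -> FnvHashMap<String, usize> {
    let mut counts = FnvHashMap::default();
    for ext in exts {
        *counts.entry(ext.to_string()).or_insert(0) += 1;
    }
    counts
}
```

Hashing no bytes returns the offset basis unchanged, and `fnv(b"a")` yields `0xaf63dc4c8601ec8c`, the published FNV-1a 64-bit test vector for `"a"`. FNV is cheap for short keys such as file extensions, which is presumably why it is used here, but it offers no resistance to hash-flooding.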


@@ -1,14 +1,8 @@
use std::fmt;
use std::hash;
use std::iter;
use std::ops::{Deref, DerefMut};
use std::path::{Path, is_separator};
use std::str;
use std::path::{is_separator, Path};
use regex;
use regex::bytes::Regex;
use regex_automata::meta::Regex;
use {Candidate, Error, ErrorKind, new_regex};
use crate::{new_regex, Candidate, Error, ErrorKind};
/// Describes a matching strategy for a particular pattern.
///
@@ -18,7 +12,7 @@ use {Candidate, Error, ErrorKind, new_regex};
/// possible to test whether any of those patterns matches by looking up a
/// file path's extension in a hash table.
#[derive(Clone, Debug, Eq, PartialEq)]
pub enum MatchStrategy {
pub(crate) enum MatchStrategy {
/// A pattern matches if and only if the entire file path matches this
/// literal string.
Literal(String),
@@ -53,7 +47,7 @@ pub enum MatchStrategy {
impl MatchStrategy {
/// Returns a matching strategy for the given pattern.
pub fn new(pat: &Glob) -> MatchStrategy {
pub(crate) fn new(pat: &Glob) -> MatchStrategy {
if let Some(lit) = pat.basename_literal() {
MatchStrategy::BasenameLiteral(lit)
} else if let Some(lit) = pat.literal() {
@@ -63,7 +57,7 @@ impl MatchStrategy {
} else if let Some(prefix) = pat.prefix() {
MatchStrategy::Prefix(prefix)
} else if let Some((suffix, component)) = pat.suffix() {
MatchStrategy::Suffix { suffix: suffix, component: component }
MatchStrategy::Suffix { suffix, component }
} else if let Some(ext) = pat.required_ext() {
MatchStrategy::RequiredExtension(ext)
} else {
@@ -85,24 +79,32 @@ pub struct Glob {
}
impl PartialEq for Glob {
fn eq(&self, other: &Glob) -> bool {
self.glob == other.glob && self.opts == other.opts
}
fn eq(&self, other: &Glob) -> bool {
self.glob == other.glob && self.opts == other.opts
}
}
impl hash::Hash for Glob {
fn hash<H: hash::Hasher>(&self, state: &mut H) {
self.glob.hash(state);
self.opts.hash(state);
}
impl std::hash::Hash for Glob {
fn hash<H: std::hash::Hasher>(&self, state: &mut H) {
self.glob.hash(state);
self.opts.hash(state);
}
}
impl fmt::Display for Glob {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
impl std::fmt::Display for Glob {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
self.glob.fmt(f)
}
}
impl std::str::FromStr for Glob {
type Err = Error;
fn from_str(glob: &str) -> Result<Self, Self::Err> {
Self::new(glob)
}
}
/// A matcher for a single pattern.
#[derive(Clone, Debug)]
pub struct GlobMatcher {
@@ -119,9 +121,14 @@ impl GlobMatcher {
}
/// Tests whether the given path matches this pattern or not.
pub fn is_match_candidate(&self, path: &Candidate) -> bool {
pub fn is_match_candidate(&self, path: &Candidate<'_>) -> bool {
self.re.is_match(&path.path)
}
/// Returns the `Glob` used to compile this matcher.
pub fn glob(&self) -> &Glob {
&self.pat
}
}
/// A strategic matcher for a single pattern.
@@ -130,8 +137,6 @@ impl GlobMatcher {
struct GlobStrategic {
/// The match strategy to use.
strategy: MatchStrategy,
/// The underlying pattern.
pat: Glob,
/// The pattern, as a compiled regex.
re: Regex,
}
@@ -144,7 +149,7 @@ impl GlobStrategic {
}
/// Tests whether the given path matches this pattern or not.
fn is_match_candidate(&self, candidate: &Candidate) -> bool {
fn is_match_candidate(&self, candidate: &Candidate<'_>) -> bool {
let byte_path = &*candidate.path;
match self.strategy {
@@ -197,6 +202,9 @@ struct GlobOptions {
/// Whether or not to use `\` to escape special characters.
/// e.g., when enabled, `\*` will match a literal `*`.
backslash_escape: bool,
/// Whether or not an empty case in an alternate is kept instead of being
/// removed. e.g., when enabled, `{,a}` will match "" and "a".
empty_alternates: bool,
}
impl GlobOptions {
@@ -205,6 +213,7 @@ impl GlobOptions {
case_insensitive: false,
literal_separator: false,
backslash_escape: !is_separator('\\'),
empty_alternates: false,
}
}
}
@@ -212,13 +221,17 @@ impl GlobOptions {
#[derive(Clone, Debug, Default, Eq, PartialEq)]
struct Tokens(Vec<Token>);
impl Deref for Tokens {
impl std::ops::Deref for Tokens {
type Target = Vec<Token>;
fn deref(&self) -> &Vec<Token> { &self.0 }
fn deref(&self) -> &Vec<Token> {
&self.0
}
}
impl DerefMut for Tokens {
fn deref_mut(&mut self) -> &mut Vec<Token> { &mut self.0 }
impl std::ops::DerefMut for Tokens {
fn deref_mut(&mut self) -> &mut Vec<Token> {
&mut self.0
}
}
#[derive(Clone, Debug, Eq, PartialEq)]
@@ -229,10 +242,7 @@ enum Token {
RecursivePrefix,
RecursiveSuffix,
RecursiveZeroOrMore,
Class {
negated: bool,
ranges: Vec<(char, char)>,
},
Class { negated: bool, ranges: Vec<(char, char)> },
Alternates(Vec<Tokens>),
}
@@ -244,12 +254,9 @@ impl Glob {
/// Returns a matcher for this pattern.
pub fn compile_matcher(&self) -> GlobMatcher {
let re = new_regex(&self.re)
.expect("regex compilation shouldn't fail");
GlobMatcher {
pat: self.clone(),
re: re,
}
let re =
new_regex(&self.re).expect("regex compilation shouldn't fail");
GlobMatcher { pat: self.clone(), re }
}
/// Returns a strategic matcher.
@@ -260,13 +267,9 @@ impl Glob {
#[cfg(test)]
fn compile_strategic_matcher(&self) -> GlobStrategic {
let strategy = MatchStrategy::new(self);
let re = new_regex(&self.re)
.expect("regex compilation shouldn't fail");
GlobStrategic {
strategy: strategy,
pat: self.clone(),
re: re,
}
let re =
new_regex(&self.re).expect("regex compilation shouldn't fail");
GlobStrategic { strategy, re }
}
/// Returns the original glob pattern used to build this pattern.
@@ -302,10 +305,8 @@ impl Glob {
}
let mut lit = String::new();
for t in &*self.tokens {
match *t {
Token::Literal(c) => lit.push(c),
_ => return None,
}
let Token::Literal(c) = *t else { return None };
lit.push(c);
}
if lit.is_empty() {
None
@@ -325,13 +326,12 @@ impl Glob {
if self.opts.case_insensitive {
return None;
}
let start = match self.tokens.get(0) {
Some(&Token::RecursivePrefix) => 1,
Some(_) => 0,
_ => return None,
let start = match *self.tokens.get(0)? {
Token::RecursivePrefix => 1,
_ => 0,
};
match self.tokens.get(start) {
Some(&Token::ZeroOrMore) => {
match *self.tokens.get(start)? {
Token::ZeroOrMore => {
// If there was no recursive prefix, then we only permit
// `*` if `*` can match a `/`. For example, if `*` can't
// match `/`, then `*.c` doesn't match `foo/bar.c`.
@@ -341,8 +341,8 @@ impl Glob {
}
_ => return None,
}
match self.tokens.get(start + 1) {
Some(&Token::Literal('.')) => {}
match *self.tokens.get(start + 1)? {
Token::Literal('.') => {}
_ => return None,
}
let mut lit = ".".to_string();
@@ -360,7 +360,7 @@ impl Glob {
}
}
/// This is like `ext`, but returns an extension even if it isn't sufficent
/// This is like `ext`, but returns an extension even if it isn't sufficient
/// to imply a match. Namely, if an extension is returned, then it is
/// necessary but not sufficient for a match.
fn required_ext(&self) -> Option<String> {
@@ -396,8 +396,8 @@ impl Glob {
if self.opts.case_insensitive {
return None;
}
let end = match self.tokens.last() {
Some(&Token::ZeroOrMore) => {
let (end, need_sep) = match *self.tokens.last()? {
Token::ZeroOrMore => {
if self.opts.literal_separator {
// If a trailing `*` can't match a `/`, then we can't
// assume a match of the prefix corresponds to a match
@@ -407,16 +407,18 @@ impl Glob {
// literal prefix.
return None;
}
self.tokens.len() - 1
(self.tokens.len() - 1, false)
}
_ => self.tokens.len(),
Token::RecursiveSuffix => (self.tokens.len() - 1, true),
_ => (self.tokens.len(), false),
};
let mut lit = String::new();
for t in &self.tokens[0..end] {
match *t {
Token::Literal(c) => lit.push(c),
_ => return None,
}
let Token::Literal(c) = *t else { return None };
lit.push(c);
}
if need_sep {
lit.push('/');
}
if lit.is_empty() {
None
@@ -442,8 +444,8 @@ impl Glob {
return None;
}
let mut lit = String::new();
let (start, entire) = match self.tokens.get(0) {
Some(&Token::RecursivePrefix) => {
let (start, entire) = match *self.tokens.get(0)? {
Token::RecursivePrefix => {
// We only care if this follows a path component if the next
// token is a literal.
if let Some(&Token::Literal(_)) = self.tokens.get(1) {
@@ -455,8 +457,8 @@ impl Glob {
}
_ => (0, false),
};
let start = match self.tokens.get(start) {
Some(&Token::ZeroOrMore) => {
let start = match *self.tokens.get(start)? {
Token::ZeroOrMore => {
// If literal_separator is enabled, then a `*` can't
// necessarily match everything, so reporting a suffix match
// as a match of the pattern would be a false positive.
@@ -468,10 +470,8 @@ impl Glob {
_ => start,
};
for t in &self.tokens[start..] {
match *t {
Token::Literal(c) => lit.push(c),
_ => return None,
}
let Token::Literal(c) = *t else { return None };
lit.push(c);
}
if lit.is_empty() || lit == "/" {
None
@@ -495,8 +495,8 @@ impl Glob {
if self.opts.case_insensitive {
return None;
}
let start = match self.tokens.get(0) {
Some(&Token::RecursivePrefix) => 1,
let start = match *self.tokens.get(0)? {
Token::RecursivePrefix => 1,
_ => {
// With nothing to gobble up the parent portion of a path,
// we can't assume that matching on only the basename is
@@ -507,7 +507,7 @@ impl Glob {
if self.tokens[start..].is_empty() {
return None;
}
for t in &self.tokens[start..] {
for t in self.tokens[start..].iter() {
match *t {
Token::Literal('/') => return None,
Token::Literal(_) => {} // OK
@@ -524,7 +524,7 @@ impl Glob {
| Token::RecursiveZeroOrMore => {
return None;
}
Token::Class{..} | Token::Alternates(..) => {
Token::Class { .. } | Token::Alternates(..) => {
// We *could* be a little smarter here, but either one
// of these is going to prevent our literal optimizations
// anyway, so give up.
@@ -541,16 +541,11 @@ impl Glob {
/// The basic format of these patterns is `**/{literal}`, where `{literal}`
/// does not contain a path separator.
fn basename_literal(&self) -> Option<String> {
let tokens = match self.basename_tokens() {
None => return None,
Some(tokens) => tokens,
};
let tokens = self.basename_tokens()?;
let mut lit = String::new();
for t in tokens {
match *t {
Token::Literal(c) => lit.push(c),
_ => return None,
}
let Token::Literal(c) = *t else { return None };
lit.push(c);
}
Some(lit)
}
@@ -561,10 +556,7 @@ impl<'a> GlobBuilder<'a> {
///
/// The pattern is not compiled until `build` is called.
pub fn new(glob: &'a str) -> GlobBuilder<'a> {
GlobBuilder {
glob: glob,
opts: GlobOptions::default(),
}
GlobBuilder { glob, opts: GlobOptions::default() }
}
/// Parses and builds the pattern.
@@ -594,7 +586,7 @@ impl<'a> GlobBuilder<'a> {
glob: self.glob.to_string(),
re: tokens.to_regex_with(&self.opts),
opts: self.opts,
tokens: tokens,
tokens,
})
}
}
@@ -608,6 +600,8 @@ impl<'a> GlobBuilder<'a> {
}
/// Toggle whether a literal `/` is required to match a path separator.
///
/// By default this is false: `*` and `?` will match `/`.
pub fn literal_separator(&mut self, yes: bool) -> &mut GlobBuilder<'a> {
self.opts.literal_separator = yes;
self
@@ -625,6 +619,17 @@ impl<'a> GlobBuilder<'a> {
self.opts.backslash_escape = yes;
self
}
/// Toggle whether an empty pattern in a list of alternates is accepted.
///
/// For example, if this is set then the glob `foo{,.txt}` will match both
/// `foo` and `foo.txt`.
///
/// By default this is false.
pub fn empty_alternates(&mut self, yes: bool) -> &mut GlobBuilder<'a> {
self.opts.empty_alternates = yes;
self
}
}
impl Tokens {
@@ -656,7 +661,7 @@ impl Tokens {
tokens: &[Token],
re: &mut String,
) {
for tok in tokens {
for tok in tokens.iter() {
match *tok {
Token::Literal(c) => {
re.push_str(&char_to_escaped_literal(c));
@@ -679,7 +684,7 @@ impl Tokens {
re.push_str("(?:/?|.*/)");
}
Token::RecursiveSuffix => {
re.push_str("(?:/?|/.*)");
re.push_str("/.*");
}
Token::RecursiveZeroOrMore => {
re.push_str("(?:/|/.*/)");
@@ -706,7 +711,7 @@ impl Tokens {
for pat in patterns {
let mut altre = String::new();
self.tokens_to_regex(options, &pat, &mut altre);
if !altre.is_empty() {
if !altre.is_empty() || options.empty_alternates {
parts.push(altre);
}
}
@@ -714,7 +719,7 @@ impl Tokens {
// It is possible to have an empty set in which case the
// resulting alternation '()' would be an error.
if !parts.is_empty() {
re.push('(');
re.push_str("(?:");
re.push_str(&parts.join("|"));
re.push(')');
}
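
The effect of the hunks above on the generated regex (a recursive suffix now requires a separator, alternates become a non-capturing group, and empty alternates are kept when `empty_alternates` is set) can be sketched with a reduced token set. This is a simplified model, not the crate's implementation: `escape`, `lits` and `render` are hypothetical helpers, the escaping is a rough stand-in for `regex_syntax::escape`, and tokens such as `ZeroOrMore` and `Class` are left out.

```rust
enum Token {
    Literal(char),
    RecursivePrefix,
    RecursiveSuffix,
    RecursiveZeroOrMore,
    Alternates(Vec<Vec<Token>>),
}

/// Simplified escaping of regex meta characters (an assumption for this sketch).
fn escape(c: char) -> String {
    if "\\.+*?()|[]{}^$#&-~".contains(c) {
        format!("\\{c}")
    } else {
        c.to_string()
    }
}

fn tokens_to_regex(tokens: &[Token], empty_alternates: bool, re: &mut String) {
    for tok in tokens {
        match tok {
            Token::Literal(c) => re.push_str(&escape(*c)),
            Token::RecursivePrefix => re.push_str("(?:/?|.*/)"),
            // After the change, `foo/**` no longer matches `foo` itself.
            Token::RecursiveSuffix => re.push_str("/.*"),
            Token::RecursiveZeroOrMore => re.push_str("(?:/|/.*/)"),
            Token::Alternates(alts) => {
                let mut parts = Vec::new();
                for alt in alts {
                    let mut altre = String::new();
                    tokens_to_regex(alt, empty_alternates, &mut altre);
                    // Empty branches are dropped unless explicitly allowed.
                    if !altre.is_empty() || empty_alternates {
                        parts.push(altre);
                    }
                }
                // Skip the group entirely if nothing is left in it.
                if !parts.is_empty() {
                    re.push_str("(?:");
                    re.push_str(&parts.join("|"));
                    re.push(')');
                }
            }
        }
    }
}

/// Turn a string into a run of literal tokens.
fn lits(s: &str) -> Vec<Token> {
    s.chars().map(Token::Literal).collect()
}

fn render(tokens: &[Token], empty_alternates: bool) -> String {
    let mut re = String::new();
    tokens_to_regex(tokens, empty_alternates, &mut re);
    re
}
```

For the tokens of `foo{,.txt}`, `render` produces `foo(?:|\.txt)` with `empty_alternates` enabled and `foo(?:\.txt)` without it, so only the former can match a bare `foo`.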
@@ -736,7 +741,9 @@ fn bytes_to_escaped_literal(bs: &[u8]) -> String {
let mut s = String::with_capacity(bs.len());
for &b in bs {
if b <= 0x7F {
s.push_str(&regex::escape(&(b as char).to_string()));
s.push_str(&regex_syntax::escape(
char::from(b).encode_utf8(&mut [0; 4]),
));
} else {
s.push_str(&format!("\\x{:02x}", b));
}
@@ -747,7 +754,7 @@ fn bytes_to_escaped_literal(bs: &[u8]) -> String {
struct Parser<'a> {
glob: &'a str,
stack: Vec<Tokens>,
chars: iter::Peekable<str::Chars<'a>>,
chars: std::iter::Peekable<std::str::Chars<'a>>,
prev: Option<char>,
cur: Option<char>,
opts: &'a GlobOptions,
@@ -755,7 +762,7 @@ struct Parser<'a> {
impl<'a> Parser<'a> {
fn error(&self, kind: ErrorKind) -> Error {
Error { glob: Some(self.glob.to_string()), kind: kind }
Error { glob: Some(self.glob.to_string()), kind }
}
fn parse(&mut self) -> Result<(), Error> {
@@ -837,40 +844,63 @@ impl<'a> Parser<'a> {
fn parse_star(&mut self) -> Result<(), Error> {
let prev = self.prev;
if self.chars.peek() != Some(&'*') {
if self.peek() != Some('*') {
self.push_token(Token::ZeroOrMore)?;
return Ok(());
}
assert!(self.bump() == Some('*'));
if !self.have_tokens()? {
self.push_token(Token::RecursivePrefix)?;
let next = self.bump();
if !next.map(is_separator).unwrap_or(true) {
return Err(self.error(ErrorKind::InvalidRecursive));
if !self.peek().map_or(true, is_separator) {
self.push_token(Token::ZeroOrMore)?;
self.push_token(Token::ZeroOrMore)?;
} else {
self.push_token(Token::RecursivePrefix)?;
assert!(self.bump().map_or(true, is_separator));
}
return Ok(());
}
self.pop_token()?;
if !prev.map(is_separator).unwrap_or(false) {
if self.stack.len() <= 1
|| (prev != Some(',') && prev != Some('{')) {
return Err(self.error(ErrorKind::InvalidRecursive));
|| (prev != Some(',') && prev != Some('{'))
{
self.push_token(Token::ZeroOrMore)?;
self.push_token(Token::ZeroOrMore)?;
return Ok(());
}
}
match self.chars.peek() {
let is_suffix = match self.peek() {
None => {
assert!(self.bump().is_none());
self.push_token(Token::RecursiveSuffix)
true
}
Some(&',') | Some(&'}') if self.stack.len() >= 2 => {
self.push_token(Token::RecursiveSuffix)
}
Some(&c) if is_separator(c) => {
Some(',') | Some('}') if self.stack.len() >= 2 => true,
Some(c) if is_separator(c) => {
assert!(self.bump().map(is_separator).unwrap_or(false));
self.push_token(Token::RecursiveZeroOrMore)
false
}
_ => {
self.push_token(Token::ZeroOrMore)?;
self.push_token(Token::ZeroOrMore)?;
return Ok(());
}
};
match self.pop_token()? {
Token::RecursivePrefix => {
self.push_token(Token::RecursivePrefix)?;
}
Token::RecursiveSuffix => {
self.push_token(Token::RecursiveSuffix)?;
}
_ => {
if is_suffix {
self.push_token(Token::RecursiveSuffix)?;
} else {
self.push_token(Token::RecursiveZeroOrMore)?;
}
}
_ => Err(self.error(ErrorKind::InvalidRecursive)),
}
Ok(())
}
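The tokens pushed by `parse_star` are later turned into regex fragments. The following is a simplified, standalone sketch of that translation, not the crate's code: the `Token` enum is cut down to five variants, literal escaping is reduced to a small hard-coded set, `ZeroOrMore` is assumed to translate to `.*` (the case where wildcards may match path separators), and the special case for a lone `**` is inferred from the `re12` expectation in the tests rather than shown in this diff.

```rust
#[derive(Clone, Copy, Debug)]
enum Token {
    Literal(char),
    ZeroOrMore,
    RecursivePrefix,
    RecursiveSuffix,
    RecursiveZeroOrMore,
}

/// Translates a token sequence into an unanchored regex fragment.
fn tokens_to_regex(tokens: &[Token]) -> String {
    // Assumed special case: a glob that is only `**` matches everything.
    if let [Token::RecursivePrefix] = tokens {
        return ".*".to_string();
    }
    let mut re = String::new();
    for tok in tokens {
        match *tok {
            Token::Literal(c) => {
                // Simplified escaping, covering common regex metacharacters.
                if "\\.+*?()|[]{}^$".contains(c) {
                    re.push('\\');
                }
                re.push(c);
            }
            Token::ZeroOrMore => re.push_str(".*"),
            Token::RecursivePrefix => re.push_str("(?:/?|.*/)"),
            Token::RecursiveSuffix => re.push_str("/.*"),
            Token::RecursiveZeroOrMore => re.push_str("(?:/|/.*/)"),
        }
    }
    re
}

fn main() {
    use Token::*;
    // `a/**`: a literal followed by a recursive suffix.
    println!("{}", tokens_to_regex(&[Literal('a'), RecursiveSuffix]));
    // `a/**/b`: zero or more directories between `a` and `b`.
    println!(
        "{}",
        tokens_to_regex(&[Literal('a'), RecursiveZeroOrMore, Literal('b')])
    );
    // `a**b`: `**` away from a separator is two consecutive `*` tokens.
    println!(
        "{}",
        tokens_to_regex(&[Literal('a'), ZeroOrMore, ZeroOrMore, Literal('b')])
    );
}
```

The three fragments printed are `a/.*`, `a(?:/|/.*/)b` and `a.*.*b`, which line up with the `re19`, `re22` and `re30` expectations in the test module once the `^`/`$` anchors are added.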
fn parse_class(&mut self) -> Result<(), Error> {
@@ -934,7 +964,10 @@ impl<'a> Parser<'a> {
// invariant: in_range is only set when there is
// already at least one character seen.
add_to_last_range(
&self.glob, ranges.last_mut().unwrap(), c)?;
&self.glob,
ranges.last_mut().unwrap(),
c,
)?;
} else {
ranges.push((c, c));
}
@@ -948,10 +981,7 @@ impl<'a> Parser<'a> {
// it as a literal.
ranges.push(('-', '-'));
}
self.push_token(Token::Class {
negated: negated,
ranges: ranges,
})
self.push_token(Token::Class { negated, ranges })
}
fn bump(&mut self) -> Option<char> {
@@ -959,6 +989,10 @@ impl<'a> Parser<'a> {
self.cur = self.chars.next();
self.cur
}
fn peek(&mut self) -> Option<char> {
self.chars.peek().map(|&ch| ch)
}
}
#[cfg(test)]
@@ -976,15 +1010,16 @@ fn ends_with(needle: &[u8], haystack: &[u8]) -> bool {
#[cfg(test)]
mod tests {
use {GlobSetBuilder, ErrorKind};
use super::{Glob, GlobBuilder, Token};
use super::Token::*;
use super::{Glob, GlobBuilder, Token};
use crate::{ErrorKind, GlobSetBuilder};
#[derive(Clone, Copy, Debug, Default)]
struct Options {
casei: Option<bool>,
litsep: Option<bool>,
bsesc: Option<bool>,
ealtre: Option<bool>,
}
macro_rules! syntax {
@@ -994,7 +1029,7 @@ mod tests {
let pat = Glob::new($pat).unwrap();
assert_eq!($tokens, pat.tokens.0);
}
}
};
}
macro_rules! syntaxerr {
@@ -1004,7 +1039,7 @@ mod tests {
let err = Glob::new($pat).unwrap_err();
assert_eq!(&$err, err.kind());
}
}
};
}
macro_rules! toregex {
@@ -1024,6 +1059,9 @@ mod tests {
if let Some(bsesc) = $options.bsesc {
builder.backslash_escape(bsesc);
}
if let Some(ealtre) = $options.ealtre {
builder.empty_alternates(ealtre);
}
let pat = builder.build().unwrap();
assert_eq!(format!("(?-u){}", $re), pat.regex());
}
@@ -1047,6 +1085,9 @@ mod tests {
if let Some(bsesc) = $options.bsesc {
builder.backslash_escape(bsesc);
}
if let Some(ealtre) = $options.ealtre {
builder.empty_alternates(ealtre);
}
let pat = builder.build().unwrap();
let matcher = pat.compile_matcher();
let strategic = pat.compile_strategic_matcher();
@@ -1075,6 +1116,9 @@ mod tests {
if let Some(bsesc) = $options.bsesc {
builder.backslash_escape(bsesc);
}
if let Some(ealtre) = $options.ealtre {
builder.empty_alternates(ealtre);
}
let pat = builder.build().unwrap();
let matcher = pat.compile_matcher();
let strategic = pat.compile_strategic_matcher();
@@ -1086,7 +1130,9 @@ mod tests {
};
}
fn s(string: &str) -> String { string.to_string() }
fn s(string: &str) -> String {
string.to_string()
}
fn class(s: char, e: char) -> Token {
Class { negated: false, ranges: vec![(s, e)] }
@@ -1110,16 +1156,20 @@ mod tests {
syntax!(any2, "a?b", vec![Literal('a'), Any, Literal('b')]);
syntax!(seq1, "*", vec![ZeroOrMore]);
syntax!(seq2, "a*b", vec![Literal('a'), ZeroOrMore, Literal('b')]);
syntax!(seq3, "*a*b*", vec![
ZeroOrMore, Literal('a'), ZeroOrMore, Literal('b'), ZeroOrMore,
]);
syntax!(
seq3,
"*a*b*",
vec![ZeroOrMore, Literal('a'), ZeroOrMore, Literal('b'), ZeroOrMore,]
);
syntax!(rseq1, "**", vec![RecursivePrefix]);
syntax!(rseq2, "**/", vec![RecursivePrefix]);
syntax!(rseq3, "/**", vec![RecursiveSuffix]);
syntax!(rseq4, "/**/", vec![RecursiveZeroOrMore]);
syntax!(rseq5, "a/**/b", vec![
Literal('a'), RecursiveZeroOrMore, Literal('b'),
]);
syntax!(
rseq5,
"a/**/b",
vec![Literal('a'), RecursiveZeroOrMore, Literal('b'),]
);
syntax!(cls1, "[a]", vec![class('a', 'a')]);
syntax!(cls2, "[!a]", vec![classn('a', 'a')]);
syntax!(cls3, "[a-z]", vec![class('a', 'z')]);
@@ -1131,9 +1181,11 @@ mod tests {
syntax!(cls9, "[a-]", vec![rclass(&[('a', 'a'), ('-', '-')])]);
syntax!(cls10, "[-a-z]", vec![rclass(&[('-', '-'), ('a', 'z')])]);
syntax!(cls11, "[a-z-]", vec![rclass(&[('a', 'z'), ('-', '-')])]);
syntax!(cls12, "[-a-z-]", vec![
rclass(&[('-', '-'), ('a', 'z'), ('-', '-')]),
]);
syntax!(
cls12,
"[-a-z-]",
vec![rclass(&[('-', '-'), ('a', 'z'), ('-', '-')]),]
);
syntax!(cls13, "[]-z]", vec![class(']', 'z')]);
syntax!(cls14, "[--z]", vec![class('-', 'z')]);
syntax!(cls15, "[ --]", vec![class(' ', '-')]);
@@ -1144,13 +1196,6 @@ mod tests {
syntax!(cls20, "[^a]", vec![classn('a', 'a')]);
syntax!(cls21, "[^a-z]", vec![classn('a', 'z')]);
syntaxerr!(err_rseq1, "a**", ErrorKind::InvalidRecursive);
syntaxerr!(err_rseq2, "**a", ErrorKind::InvalidRecursive);
syntaxerr!(err_rseq3, "a**b", ErrorKind::InvalidRecursive);
syntaxerr!(err_rseq4, "***", ErrorKind::InvalidRecursive);
syntaxerr!(err_rseq5, "/a**", ErrorKind::InvalidRecursive);
syntaxerr!(err_rseq6, "/**a", ErrorKind::InvalidRecursive);
syntaxerr!(err_rseq7, "/a**b", ErrorKind::InvalidRecursive);
syntaxerr!(err_unclosed1, "[", ErrorKind::UnclosedClass);
syntaxerr!(err_unclosed2, "[]", ErrorKind::UnclosedClass);
syntaxerr!(err_unclosed3, "[!", ErrorKind::UnclosedClass);
@@ -1158,25 +1203,23 @@ mod tests {
syntaxerr!(err_range1, "[z-a]", ErrorKind::InvalidRange('z', 'a'));
syntaxerr!(err_range2, "[z--]", ErrorKind::InvalidRange('z', '-'));
const CASEI: Options = Options {
casei: Some(true),
litsep: None,
bsesc: None,
};
const SLASHLIT: Options = Options {
casei: None,
litsep: Some(true),
bsesc: None,
};
const CASEI: Options =
Options { casei: Some(true), litsep: None, bsesc: None, ealtre: None };
const SLASHLIT: Options =
Options { casei: None, litsep: Some(true), bsesc: None, ealtre: None };
const NOBSESC: Options = Options {
casei: None,
litsep: None,
bsesc: Some(false),
ealtre: None,
};
const BSESC: Options = Options {
const BSESC: Options =
Options { casei: None, litsep: None, bsesc: Some(true), ealtre: None };
const EALTRE: Options = Options {
casei: None,
litsep: None,
bsesc: Some(true),
ealtre: Some(true),
};
toregex!(re_casei, "a", "(?i)^a$", &CASEI);
@@ -1194,8 +1237,31 @@ mod tests {
toregex!(re8, "[*]", r"^[\*]$");
toregex!(re9, "[+]", r"^[\+]$");
toregex!(re10, "+", r"^\+$");
toregex!(re11, "**", r"^.*$");
toregex!(re12, "☃", r"^\xe2\x98\x83$");
toregex!(re11, "☃", r"^\xe2\x98\x83$");
toregex!(re12, "**", r"^.*$");
toregex!(re13, "**/", r"^.*$");
toregex!(re14, "**/*", r"^(?:/?|.*/).*$");
toregex!(re15, "**/**", r"^.*$");
toregex!(re16, "**/**/*", r"^(?:/?|.*/).*$");
toregex!(re17, "**/**/**", r"^.*$");
toregex!(re18, "**/**/**/*", r"^(?:/?|.*/).*$");
toregex!(re19, "a/**", r"^a/.*$");
toregex!(re20, "a/**/**", r"^a/.*$");
toregex!(re21, "a/**/**/**", r"^a/.*$");
toregex!(re22, "a/**/b", r"^a(?:/|/.*/)b$");
toregex!(re23, "a/**/**/b", r"^a(?:/|/.*/)b$");
toregex!(re24, "a/**/**/**/b", r"^a(?:/|/.*/)b$");
toregex!(re25, "**/b", r"^(?:/?|.*/)b$");
toregex!(re26, "**/**/b", r"^(?:/?|.*/)b$");
toregex!(re27, "**/**/**/b", r"^(?:/?|.*/)b$");
toregex!(re28, "a**", r"^a.*.*$");
toregex!(re29, "**a", r"^.*.*a$");
toregex!(re30, "a**b", r"^a.*.*b$");
toregex!(re31, "***", r"^.*.*.*$");
toregex!(re32, "/a**", r"^/a.*.*$");
toregex!(re33, "/**a", r"^/.*.*a$");
toregex!(re34, "/a**b", r"^/a.*.*b$");
toregex!(re35, "{a,b}", r"^(?:b|a)$");
matches!(match1, "a", "a");
matches!(match2, "a*b", "a_b");
@@ -1228,11 +1294,12 @@ mod tests {
matches!(matchrec18, "/**/test", "/test");
matches!(matchrec19, "**/.*", ".abc");
matches!(matchrec20, "**/.*", "abc/.abc");
matches!(matchrec21, ".*/**", ".abc");
matches!(matchrec21, "**/foo/bar", "foo/bar");
matches!(matchrec22, ".*/**", ".abc/abc");
matches!(matchrec23, "foo/**", "foo");
matches!(matchrec24, "**/foo/bar", "foo/bar");
matches!(matchrec25, "some/*/needle.txt", "some/one/needle.txt");
matches!(matchrec23, "test/**", "test/");
matches!(matchrec24, "test/**", "test/one");
matches!(matchrec25, "test/**", "test/one/two");
matches!(matchrec26, "some/*/needle.txt", "some/one/needle.txt");
matches!(matchrange1, "a[0-9]b", "a0b");
matches!(matchrange2, "a[0-9]b", "a9b");
@@ -1253,8 +1320,11 @@ mod tests {
matches!(matchpat4, "*hello.txt", "some\\path\\to\\hello.txt");
matches!(matchpat5, "*hello.txt", "/an/absolute/path/to/hello.txt");
matches!(matchpat6, "*some/path/to/hello.txt", "some/path/to/hello.txt");
matches!(matchpat7, "*some/path/to/hello.txt",
"a/bigger/some/path/to/hello.txt");
matches!(
matchpat7,
"*some/path/to/hello.txt",
"a/bigger/some/path/to/hello.txt"
);
matches!(matchescape, "_[[]_[]]_[?]_[*]_!_", "_[_]_?_*_!_");
@@ -1276,6 +1346,9 @@ mod tests {
matches!(matchalt11, "{*.foo,*.bar,*.wat}", "test.foo");
matches!(matchalt12, "{*.foo,*.bar,*.wat}", "test.bar");
matches!(matchalt13, "{*.foo,*.bar,*.wat}", "test.wat");
matches!(matchalt14, "foo{,.txt}", "foo.txt");
nmatches!(matchalt15, "foo{,.txt}", "foo");
matches!(matchalt16, "foo{,.txt}", "foo", EALTRE);
matches!(matchslash1, "abc/def", "abc/def", SLASHLIT);
#[cfg(unix)]
@@ -1317,28 +1390,46 @@ mod tests {
nmatches!(matchnot15, "[!-]", "-");
nmatches!(matchnot16, "*hello.txt", "hello.txt-and-then-some");
nmatches!(matchnot17, "*hello.txt", "goodbye.txt");
nmatches!(matchnot18, "*some/path/to/hello.txt",
"some/path/to/hello.txt-and-then-some");
nmatches!(matchnot19, "*some/path/to/hello.txt",
"some/other/path/to/hello.txt");
nmatches!(
matchnot18,
"*some/path/to/hello.txt",
"some/path/to/hello.txt-and-then-some"
);
nmatches!(
matchnot19,
"*some/path/to/hello.txt",
"some/other/path/to/hello.txt"
);
nmatches!(matchnot20, "a", "foo/a");
nmatches!(matchnot21, "./foo", "foo");
nmatches!(matchnot22, "**/foo", "foofoo");
nmatches!(matchnot23, "**/foo/bar", "foofoo/bar");
nmatches!(matchnot24, "/*.c", "mozilla-sha1/sha1.c");
nmatches!(matchnot25, "*.c", "mozilla-sha1/sha1.c", SLASHLIT);
nmatches!(matchnot26, "**/m4/ltoptions.m4",
"csharp/src/packages/repositories.config", SLASHLIT);
nmatches!(
matchnot26,
"**/m4/ltoptions.m4",
"csharp/src/packages/repositories.config",
SLASHLIT
);
nmatches!(matchnot27, "a[^0-9]b", "a0b");
nmatches!(matchnot28, "a[^0-9]b", "a9b");
nmatches!(matchnot29, "[^-]", "-");
nmatches!(matchnot30, "some/*/needle.txt", "some/needle.txt");
nmatches!(
matchrec31,
"some/*/needle.txt", "some/one/two/needle.txt", SLASHLIT);
"some/*/needle.txt",
"some/one/two/needle.txt",
SLASHLIT
);
nmatches!(
matchrec32,
"some/*/needle.txt", "some/one/two/three/needle.txt", SLASHLIT);
"some/*/needle.txt",
"some/one/two/three/needle.txt",
SLASHLIT
);
nmatches!(matchrec33, ".*/**", ".abc");
nmatches!(matchrec34, "foo/**", "foo");
macro_rules! extract {
($which:ident, $name:ident, $pat:expr, $expect:expr) => {
@@ -1357,6 +1448,9 @@ mod tests {
if let Some(bsesc) = $options.bsesc {
builder.backslash_escape(bsesc);
}
if let Some(ealtre) = $options.ealtre {
builder.empty_alternates(ealtre);
}
let pat = builder.build().unwrap();
assert_eq!($expect, pat.$which());
}
@@ -1400,19 +1494,27 @@ mod tests {
literal!(extract_lit7, "foo/bar", Some(s("foo/bar")));
literal!(extract_lit8, "**/foo/bar", None);
basetokens!(extract_basetoks1, "**/foo", Some(&*vec![
Literal('f'), Literal('o'), Literal('o'),
]));
basetokens!(
extract_basetoks1,
"**/foo",
Some(&*vec![Literal('f'), Literal('o'), Literal('o'),])
);
basetokens!(extract_basetoks2, "**/foo", None, CASEI);
basetokens!(extract_basetoks3, "**/foo", Some(&*vec![
Literal('f'), Literal('o'), Literal('o'),
]), SLASHLIT);
basetokens!(
extract_basetoks3,
"**/foo",
Some(&*vec![Literal('f'), Literal('o'), Literal('o'),]),
SLASHLIT
);
basetokens!(extract_basetoks4, "*foo", None, SLASHLIT);
basetokens!(extract_basetoks5, "*foo", None);
basetokens!(extract_basetoks6, "**/fo*o", None);
basetokens!(extract_basetoks7, "**/fo*o", Some(&*vec![
Literal('f'), Literal('o'), ZeroOrMore, Literal('o'),
]), SLASHLIT);
basetokens!(
extract_basetoks7,
"**/fo*o",
Some(&*vec![Literal('f'), Literal('o'), ZeroOrMore, Literal('o'),]),
SLASHLIT
);
ext!(extract_ext1, "**/*.rs", Some(s(".rs")));
ext!(extract_ext2, "**/*.rs.bak", None);
@@ -1435,7 +1537,7 @@ mod tests {
prefix!(extract_prefix1, "/foo", Some(s("/foo")));
prefix!(extract_prefix2, "/foo/*", Some(s("/foo/")));
prefix!(extract_prefix3, "**/foo", None);
prefix!(extract_prefix4, "foo/**", None);
prefix!(extract_prefix4, "foo/**", Some(s("foo/")));
suffix!(extract_suffix1, "**/foo/bar", Some((s("/foo/bar"), true)));
suffix!(extract_suffix2, "*/foo/bar", Some((s("/foo/bar"), false)));


@@ -5,11 +5,9 @@ Glob set matching is the process of matching one or more glob patterns against
a single candidate path simultaneously, and returning all of the globs that
matched. For example, given this set of globs:
```ignore
*.rs
src/lib.rs
src/**/foo.rs
```
* `*.rs`
* `src/lib.rs`
* `src/**/foo.rs`
and a path `src/bar/baz/foo.rs`, then the set would report the first and third
globs as matching.
@@ -19,7 +17,6 @@ globs as matching.
This example shows how to match a single glob against a single file path.
```
# fn example() -> Result<(), globset::Error> {
use globset::Glob;
let glob = Glob::new("*.rs")?.compile_matcher();
@@ -27,7 +24,7 @@ let glob = Glob::new("*.rs")?.compile_matcher();
assert!(glob.is_match("foo.rs"));
assert!(glob.is_match("foo/bar.rs"));
assert!(!glob.is_match("Cargo.toml"));
# Ok(()) } example().unwrap();
# Ok::<(), Box<dyn std::error::Error>>(())
```
# Example: configuring a glob matcher
@@ -36,7 +33,6 @@ This example shows how to use a `GlobBuilder` to configure aspects of match
semantics. In this example, we prevent wildcards from matching path separators.
```
# fn example() -> Result<(), globset::Error> {
use globset::GlobBuilder;
let glob = GlobBuilder::new("*.rs")
@@ -45,7 +41,7 @@ let glob = GlobBuilder::new("*.rs")
assert!(glob.is_match("foo.rs"));
assert!(!glob.is_match("foo/bar.rs")); // no longer matches
assert!(!glob.is_match("Cargo.toml"));
# Ok(()) } example().unwrap();
# Ok::<(), Box<dyn std::error::Error>>(())
```
# Example: match multiple globs at once
@@ -53,7 +49,6 @@ assert!(!glob.is_match("Cargo.toml"));
This example shows how to match multiple glob patterns at once.
```
# fn example() -> Result<(), globset::Error> {
use globset::{Glob, GlobSetBuilder};
let mut builder = GlobSetBuilder::new();
@@ -65,7 +60,7 @@ builder.add(Glob::new("src/**/foo.rs")?);
let set = builder.build()?;
assert_eq!(set.matches("src/bar/baz/foo.rs"), vec![0, 2]);
# Ok(()) } example().unwrap();
# Ok::<(), Box<dyn std::error::Error>>(())
```
# Syntax
@@ -103,34 +98,47 @@ or to enable case insensitive matching.
#![deny(missing_docs)]
extern crate aho_corasick;
extern crate fnv;
#[macro_use]
extern crate log;
extern crate memchr;
extern crate regex;
use std::borrow::Cow;
use std::collections::{BTreeMap, HashMap};
use std::error::Error as StdError;
use std::ffi::OsStr;
use std::fmt;
use std::hash;
use std::path::Path;
use std::str;
use aho_corasick::{Automaton, AcAutomaton, FullAcAutomaton};
use regex::bytes::{Regex, RegexBuilder, RegexSet};
use pathutil::{
file_name, file_name_ext, normalize_path, os_str_bytes, path_bytes,
use std::{
borrow::Cow,
panic::{RefUnwindSafe, UnwindSafe},
path::Path,
sync::Arc,
};
use glob::MatchStrategy;
pub use glob::{Glob, GlobBuilder, GlobMatcher};
use {
aho_corasick::AhoCorasick,
bstr::{ByteSlice, ByteVec, B},
regex_automata::{
meta::Regex,
util::pool::{Pool, PoolGuard},
PatternSet,
},
};
use crate::{
glob::MatchStrategy,
pathutil::{file_name, file_name_ext, normalize_path},
};
pub use crate::glob::{Glob, GlobBuilder, GlobMatcher};
mod fnv;
mod glob;
mod pathutil;
#[cfg(feature = "serde1")]
mod serde_impl;
#[cfg(feature = "log")]
macro_rules! debug {
($($token:tt)*) => (::log::debug!($($token)*);)
}
#[cfg(not(feature = "log"))]
macro_rules! debug {
($($token:tt)*) => {};
}
/// Represents an error that can occur when parsing a glob pattern.
#[derive(Clone, Debug, Eq, PartialEq)]
pub struct Error {
@@ -143,8 +151,13 @@ pub struct Error {
/// The kind of error that can occur when parsing a glob pattern.
#[derive(Clone, Debug, Eq, PartialEq)]
pub enum ErrorKind {
/// Occurs when a use of `**` is invalid. Namely, `**` can only appear
/// adjacent to a path separator, or the beginning/end of a glob.
/// **DEPRECATED**.
///
/// This error used to occur for consistency with git's glob specification,
/// but the specification now accepts all uses of `**`. When `**` does not
/// appear adjacent to a path separator or at the beginning/end of a glob,
/// it is now treated as two consecutive `*` patterns. As such, this error
/// is no longer used.
InvalidRecursive,
/// Occurs when a character class (e.g., `[abc]`) is not closed.
UnclosedClass,
@@ -172,7 +185,7 @@ pub enum ErrorKind {
__Nonexhaustive,
}
impl StdError for Error {
impl std::error::Error for Error {
fn description(&self) -> &str {
self.kind.description()
}
@@ -199,9 +212,7 @@ impl ErrorKind {
ErrorKind::UnclosedClass => {
"unclosed character class; missing ']'"
}
ErrorKind::InvalidRange(_, _) => {
"invalid character range"
}
ErrorKind::InvalidRange(_, _) => "invalid character range",
ErrorKind::UnopenedAlternates => {
"unopened alternate group; missing '{' \
(maybe escape '}' with '[}]'?)"
@@ -213,17 +224,15 @@ impl ErrorKind {
ErrorKind::NestedAlternates => {
"nested alternate groups are not allowed"
}
ErrorKind::DanglingEscape => {
"dangling '\\'"
}
ErrorKind::DanglingEscape => "dangling '\\'",
ErrorKind::Regex(ref err) => err,
ErrorKind::__Nonexhaustive => unreachable!(),
}
}
}
impl fmt::Display for Error {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
impl std::fmt::Display for Error {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self.glob {
None => self.kind.fmt(f),
Some(ref glob) => {
@@ -233,8 +242,8 @@ impl fmt::Display for Error {
}
}
impl fmt::Display for ErrorKind {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
impl std::fmt::Display for ErrorKind {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match *self {
ErrorKind::InvalidRecursive
| ErrorKind::UnclosedClass
@@ -242,9 +251,7 @@ impl fmt::Display for ErrorKind {
| ErrorKind::UnclosedAlternates
| ErrorKind::NestedAlternates
| ErrorKind::DanglingEscape
| ErrorKind::Regex(_) => {
write!(f, "{}", self.description())
}
| ErrorKind::Regex(_) => write!(f, "{}", self.description()),
ErrorKind::InvalidRange(s, e) => {
write!(f, "invalid range; '{}' > '{}'", s, e)
}
@@ -254,31 +261,40 @@ impl fmt::Display for ErrorKind {
}
fn new_regex(pat: &str) -> Result<Regex, Error> {
RegexBuilder::new(pat)
.dot_matches_new_line(true)
.size_limit(10 * (1 << 20))
.dfa_size_limit(10 * (1 << 20))
.build()
.map_err(|err| {
Error {
glob: Some(pat.to_string()),
kind: ErrorKind::Regex(err.to_string()),
}
})
let syntax = regex_automata::util::syntax::Config::new()
.utf8(false)
.dot_matches_new_line(true);
let config = Regex::config()
.utf8_empty(false)
.nfa_size_limit(Some(10 * (1 << 20)))
.hybrid_cache_capacity(10 * (1 << 20));
Regex::builder().syntax(syntax).configure(config).build(pat).map_err(
|err| Error {
glob: Some(pat.to_string()),
kind: ErrorKind::Regex(err.to_string()),
},
)
}
fn new_regex_set<I, S>(pats: I) -> Result<RegexSet, Error>
where S: AsRef<str>, I: IntoIterator<Item=S> {
RegexSet::new(pats).map_err(|err| {
Error {
fn new_regex_set(pats: Vec<String>) -> Result<Regex, Error> {
let syntax = regex_automata::util::syntax::Config::new()
.utf8(false)
.dot_matches_new_line(true);
let config = Regex::config()
.match_kind(regex_automata::MatchKind::All)
.utf8_empty(false)
.nfa_size_limit(Some(10 * (1 << 20)))
.hybrid_cache_capacity(10 * (1 << 20));
Regex::builder()
.syntax(syntax)
.configure(config)
.build_many(&pats)
.map_err(|err| Error {
glob: None,
kind: ErrorKind::Regex(err.to_string()),
}
})
})
}
type Fnv = hash::BuildHasherDefault<fnv::FnvHasher>;
/// GlobSet represents a group of globs that can be matched together in a
/// single pass.
#[derive(Clone, Debug)]
@@ -288,20 +304,28 @@ pub struct GlobSet {
}
impl GlobSet {
/// Create a new [`GlobSetBuilder`]. A `GlobSetBuilder` can be used to add
/// new patterns. Once all patterns have been added, `build` should be
/// called to produce a `GlobSet`, which can then be used for matching.
#[inline]
pub fn builder() -> GlobSetBuilder {
GlobSetBuilder::new()
}
/// Create an empty `GlobSet`. An empty set matches nothing.
#[inline]
pub fn empty() -> GlobSet {
GlobSet {
len: 0,
strats: vec![],
}
GlobSet { len: 0, strats: vec![] }
}
/// Returns true if this set is empty, and therefore matches nothing.
#[inline]
pub fn is_empty(&self) -> bool {
self.len == 0
}
/// Returns the number of globs in this set.
#[inline]
pub fn len(&self) -> usize {
self.len
}
@@ -315,7 +339,7 @@ impl GlobSet {
///
/// This takes a Candidate as input, which can be used to amortize the
/// cost of preparing a path for matching.
pub fn is_match_candidate(&self, path: &Candidate) -> bool {
pub fn is_match_candidate(&self, path: &Candidate<'_>) -> bool {
if self.is_empty() {
return false;
}
@@ -338,7 +362,7 @@ impl GlobSet {
///
/// This takes a Candidate as input, which can be used to amortize the
/// cost of preparing a path for matching.
pub fn matches_candidate(&self, path: &Candidate) -> Vec<usize> {
pub fn matches_candidate(&self, path: &Candidate<'_>) -> Vec<usize> {
let mut into = vec![];
if self.is_empty() {
return into;
@@ -350,7 +374,7 @@ impl GlobSet {
/// Adds the sequence number of every glob pattern that matches the given
/// path to the vec given.
///
/// `into` is is cleared before matching begins, and contains the set of
/// `into` is cleared before matching begins, and contains the set of
/// sequence numbers (in ascending order) after matching ends. If no globs
/// were matched, then `into` will be empty.
pub fn matches_into<P: AsRef<Path>>(
@@ -364,7 +388,7 @@ impl GlobSet {
/// Adds the sequence number of every glob pattern that matches the given
/// path to the vec given.
///
/// `into` is is cleared before matching begins, and contains the set of
/// `into` is cleared before matching begins, and contains the set of
/// sequence numbers (in ascending order) after matching ends. If no globs
/// were matched, then `into` will be empty.
///
@@ -372,7 +396,7 @@ impl GlobSet {
/// cost of preparing a path for matching.
pub fn matches_candidate_into(
&self,
path: &Candidate,
path: &Candidate<'_>,
into: &mut Vec<usize>,
) {
into.clear();
@@ -426,11 +450,17 @@ impl GlobSet {
}
}
}
debug!("built glob set; {} literals, {} basenames, {} extensions, \
debug!(
"built glob set; {} literals, {} basenames, {} extensions, \
{} prefixes, {} suffixes, {} required extensions, {} regexes",
lits.0.len(), base_lits.0.len(), exts.0.len(),
prefixes.literals.len(), suffixes.literals.len(),
required_exts.0.len(), regexes.literals.len());
lits.0.len(),
base_lits.0.len(),
exts.0.len(),
prefixes.literals.len(),
suffixes.literals.len(),
required_exts.0.len(),
regexes.literals.len()
);
Ok(GlobSet {
len: pats.len(),
strats: vec![
@@ -440,13 +470,21 @@ impl GlobSet {
GlobSetMatchStrategy::Suffix(suffixes.suffix()),
GlobSetMatchStrategy::Prefix(prefixes.prefix()),
GlobSetMatchStrategy::RequiredExtension(
required_exts.build()?),
required_exts.build()?,
),
GlobSetMatchStrategy::Regex(regexes.regex_set()?),
],
})
}
}
impl Default for GlobSet {
/// Create a default empty GlobSet.
fn default() -> Self {
GlobSet::empty()
}
}
/// GlobSetBuilder builds a group of patterns that can be used to
/// simultaneously match a file path.
#[derive(Clone, Debug)]
@@ -455,9 +493,9 @@ pub struct GlobSetBuilder {
}
impl GlobSetBuilder {
/// Create a new GlobSetBuilder. A GlobSetBuilder can be used to add new
/// Create a new `GlobSetBuilder`. A `GlobSetBuilder` can be used to add new
/// patterns. Once all patterns have been added, `build` should be called
/// to produce a `GlobSet`, which can then be used for matching.
/// to produce a [`GlobSet`], which can then be used for matching.
pub fn new() -> GlobSetBuilder {
GlobSetBuilder { pats: vec![] }
}
@@ -470,7 +508,6 @@ impl GlobSetBuilder {
}
/// Add a new pattern to this set.
#[allow(dead_code)]
pub fn add(&mut self, pat: Glob) -> &mut GlobSetBuilder {
self.pats.push(pat);
self
@@ -483,23 +520,30 @@ impl GlobSetBuilder {
/// Constructing candidates has a very small cost associated with it, so
/// callers may find it beneficial to amortize that cost when matching a single
/// path against multiple globs or sets of globs.
#[derive(Clone, Debug)]
#[derive(Clone)]
pub struct Candidate<'a> {
path: Cow<'a, [u8]>,
basename: Cow<'a, [u8]>,
ext: Cow<'a, [u8]>,
}
impl<'a> std::fmt::Debug for Candidate<'a> {
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
f.debug_struct("Candidate")
.field("path", &self.path.as_bstr())
.field("basename", &self.basename.as_bstr())
.field("ext", &self.ext.as_bstr())
.finish()
}
}
impl<'a> Candidate<'a> {
/// Create a new candidate for matching from the given path.
pub fn new<P: AsRef<Path> + ?Sized>(path: &'a P) -> Candidate<'a> {
let path = path.as_ref();
let basename = file_name(path).unwrap_or(OsStr::new(""));
Candidate {
path: normalize_path(path_bytes(path)),
basename: os_str_bytes(basename),
ext: file_name_ext(basename).unwrap_or(Cow::Borrowed(b"")),
}
let path = normalize_path(Vec::from_path_lossy(path.as_ref()));
let basename = file_name(&path).unwrap_or(Cow::Borrowed(B("")));
let ext = file_name_ext(&basename).unwrap_or(Cow::Borrowed(B("")));
Candidate { path, basename, ext }
}
fn path_prefix(&self, max: usize) -> &[u8] {
@@ -531,7 +575,7 @@ enum GlobSetMatchStrategy {
}
impl GlobSetMatchStrategy {
fn is_match(&self, candidate: &Candidate) -> bool {
fn is_match(&self, candidate: &Candidate<'_>) -> bool {
use self::GlobSetMatchStrategy::*;
match *self {
Literal(ref s) => s.is_match(candidate),
@@ -544,7 +588,11 @@ impl GlobSetMatchStrategy {
}
}
fn matches_into(&self, candidate: &Candidate, matches: &mut Vec<usize>) {
fn matches_into(
&self,
candidate: &Candidate<'_>,
matches: &mut Vec<usize>,
) {
use self::GlobSetMatchStrategy::*;
match *self {
Literal(ref s) => s.matches_into(candidate, matches),
@@ -559,84 +607,96 @@ impl GlobSetMatchStrategy {
}
#[derive(Clone, Debug)]
struct LiteralStrategy(BTreeMap<Vec<u8>, Vec<usize>>);
struct LiteralStrategy(fnv::HashMap<Vec<u8>, Vec<usize>>);
impl LiteralStrategy {
fn new() -> LiteralStrategy {
LiteralStrategy(BTreeMap::new())
LiteralStrategy(fnv::HashMap::default())
}
fn add(&mut self, global_index: usize, lit: String) {
self.0.entry(lit.into_bytes()).or_insert(vec![]).push(global_index);
}
fn is_match(&self, candidate: &Candidate) -> bool {
self.0.contains_key(&*candidate.path)
fn is_match(&self, candidate: &Candidate<'_>) -> bool {
self.0.contains_key(candidate.path.as_bytes())
}
#[inline(never)]
fn matches_into(&self, candidate: &Candidate, matches: &mut Vec<usize>) {
if let Some(hits) = self.0.get(&*candidate.path) {
fn matches_into(
&self,
candidate: &Candidate<'_>,
matches: &mut Vec<usize>,
) {
if let Some(hits) = self.0.get(candidate.path.as_bytes()) {
matches.extend(hits);
}
}
}
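The literal strategy amounts to a map from a glob's literal bytes to the indices of every glob that compiled to that literal. A rough standalone sketch, assuming the standard library's `HashMap` with its default hasher in place of the crate-local `fnv` module; the type name `LiteralIndex` and its `matches` method are illustrative and not part of globset:

```rust
use std::collections::HashMap;

/// Illustrative stand-in for a literal lookup table: literal bytes map to
/// the indices of all globs that are exactly that literal.
#[derive(Default)]
struct LiteralIndex(HashMap<Vec<u8>, Vec<usize>>);

impl LiteralIndex {
    /// Records that the glob at `global_index` is the literal `lit`.
    fn add(&mut self, global_index: usize, lit: &str) {
        self.0.entry(lit.as_bytes().to_vec()).or_default().push(global_index);
    }

    /// Returns the indices of all globs equal to `path`, if any.
    fn matches(&self, path: &[u8]) -> &[usize] {
        match self.0.get(path) {
            Some(hits) => hits,
            None => &[],
        }
    }
}

fn main() {
    let mut index = LiteralIndex::default();
    index.add(0, "Cargo.toml");
    index.add(1, "src/lib.rs");
    index.add(3, "Cargo.toml");
    // One hash lookup answers the question for every literal glob at once.
    println!("{:?}", index.matches(b"Cargo.toml"));
    println!("{:?}", index.matches(b"README.md"));
}
```

The point of the design is that the lookup cost does not grow with the number of literal globs in the set.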
#[derive(Clone, Debug)]
struct BasenameLiteralStrategy(BTreeMap<Vec<u8>, Vec<usize>>);
struct BasenameLiteralStrategy(fnv::HashMap<Vec<u8>, Vec<usize>>);
impl BasenameLiteralStrategy {
fn new() -> BasenameLiteralStrategy {
BasenameLiteralStrategy(BTreeMap::new())
BasenameLiteralStrategy(fnv::HashMap::default())
}
fn add(&mut self, global_index: usize, lit: String) {
self.0.entry(lit.into_bytes()).or_insert(vec![]).push(global_index);
}
fn is_match(&self, candidate: &Candidate) -> bool {
fn is_match(&self, candidate: &Candidate<'_>) -> bool {
if candidate.basename.is_empty() {
return false;
}
self.0.contains_key(&*candidate.basename)
self.0.contains_key(candidate.basename.as_bytes())
}
#[inline(never)]
fn matches_into(&self, candidate: &Candidate, matches: &mut Vec<usize>) {
fn matches_into(
&self,
candidate: &Candidate<'_>,
matches: &mut Vec<usize>,
) {
if candidate.basename.is_empty() {
return;
}
if let Some(hits) = self.0.get(&*candidate.basename) {
if let Some(hits) = self.0.get(candidate.basename.as_bytes()) {
matches.extend(hits);
}
}
}
#[derive(Clone, Debug)]
struct ExtensionStrategy(HashMap<Vec<u8>, Vec<usize>, Fnv>);
struct ExtensionStrategy(fnv::HashMap<Vec<u8>, Vec<usize>>);
impl ExtensionStrategy {
fn new() -> ExtensionStrategy {
ExtensionStrategy(HashMap::with_hasher(Fnv::default()))
ExtensionStrategy(fnv::HashMap::default())
}
fn add(&mut self, global_index: usize, ext: String) {
self.0.entry(ext.into_bytes()).or_insert(vec![]).push(global_index);
}
fn is_match(&self, candidate: &Candidate) -> bool {
fn is_match(&self, candidate: &Candidate<'_>) -> bool {
if candidate.ext.is_empty() {
return false;
}
self.0.contains_key(&*candidate.ext)
self.0.contains_key(candidate.ext.as_bytes())
}
#[inline(never)]
fn matches_into(&self, candidate: &Candidate, matches: &mut Vec<usize>) {
fn matches_into(
&self,
candidate: &Candidate<'_>,
matches: &mut Vec<usize>,
) {
if candidate.ext.is_empty() {
return;
}
if let Some(hits) = self.0.get(&*candidate.ext) {
if let Some(hits) = self.0.get(candidate.ext.as_bytes()) {
matches.extend(hits);
}
}
@@ -644,27 +704,31 @@ impl ExtensionStrategy {
#[derive(Clone, Debug)]
struct PrefixStrategy {
matcher: FullAcAutomaton<Vec<u8>>,
matcher: AhoCorasick,
map: Vec<usize>,
longest: usize,
}
impl PrefixStrategy {
fn is_match(&self, candidate: &Candidate) -> bool {
fn is_match(&self, candidate: &Candidate<'_>) -> bool {
let path = candidate.path_prefix(self.longest);
for m in self.matcher.find_overlapping(path) {
if m.start == 0 {
for m in self.matcher.find_overlapping_iter(path) {
if m.start() == 0 {
return true;
}
}
false
}
fn matches_into(&self, candidate: &Candidate, matches: &mut Vec<usize>) {
fn matches_into(
&self,
candidate: &Candidate<'_>,
matches: &mut Vec<usize>,
) {
let path = candidate.path_prefix(self.longest);
for m in self.matcher.find_overlapping(path) {
if m.start == 0 {
matches.push(self.map[m.pati]);
for m in self.matcher.find_overlapping_iter(path) {
if m.start() == 0 {
matches.push(self.map[m.pattern()]);
}
}
}
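The overlapping search with the `m.start() == 0` filter keeps exactly those literals that are a prefix of the candidate path. A small sketch of the same semantics using only the standard library, assuming a plain linear scan instead of an Aho-Corasick automaton; `prefix_matches` and its `(literal, global_index)` pairs are hypothetical names, and the result order here follows the input order rather than the automaton's match order:

```rust
/// Returns the global glob index of every literal that `path` starts with.
/// Each entry pairs a literal prefix with the index of the glob it came from.
fn prefix_matches(prefixes: &[(&[u8], usize)], path: &[u8]) -> Vec<usize> {
    prefixes
        .iter()
        // Same condition as "a match that starts at offset 0".
        .filter(|(lit, _)| path.starts_with(*lit))
        .map(|&(_, global_index)| global_index)
        .collect()
}

fn main() {
    let prefixes = [
        ("src/".as_bytes(), 0),
        ("src/bin/".as_bytes(), 2),
        ("tests/".as_bytes(), 5),
    ];
    // Both `src/` and `src/bin/` are prefixes, so two indices are reported.
    println!("{:?}", prefix_matches(&prefixes, b"src/bin/main.rs"));
    // `src/` occurs here, but not at offset 0, so nothing is reported.
    println!("{:?}", prefix_matches(&prefixes, b"vendor/src/lib.rs"));
}
```

Truncating the candidate to the length of the longest literal first, as `path_prefix(self.longest)` does, leaves the result unchanged because no literal can match beyond that length. The suffix strategy is the mirror image, with `ends_with` in place of `starts_with`.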
@@ -672,45 +736,49 @@ impl PrefixStrategy {
#[derive(Clone, Debug)]
struct SuffixStrategy {
matcher: FullAcAutomaton<Vec<u8>>,
matcher: AhoCorasick,
map: Vec<usize>,
longest: usize,
}
impl SuffixStrategy {
fn is_match(&self, candidate: &Candidate) -> bool {
fn is_match(&self, candidate: &Candidate<'_>) -> bool {
let path = candidate.path_suffix(self.longest);
for m in self.matcher.find_overlapping(path) {
if m.end == path.len() {
for m in self.matcher.find_overlapping_iter(path) {
if m.end() == path.len() {
return true;
}
}
false
}
fn matches_into(&self, candidate: &Candidate, matches: &mut Vec<usize>) {
fn matches_into(
&self,
candidate: &Candidate<'_>,
matches: &mut Vec<usize>,
) {
let path = candidate.path_suffix(self.longest);
for m in self.matcher.find_overlapping(path) {
if m.end == path.len() {
matches.push(self.map[m.pati]);
for m in self.matcher.find_overlapping_iter(path) {
if m.end() == path.len() {
matches.push(self.map[m.pattern()]);
}
}
}
}
#[derive(Clone, Debug)]
struct RequiredExtensionStrategy(HashMap<Vec<u8>, Vec<(usize, Regex)>, Fnv>);
struct RequiredExtensionStrategy(fnv::HashMap<Vec<u8>, Vec<(usize, Regex)>>);
impl RequiredExtensionStrategy {
fn is_match(&self, candidate: &Candidate) -> bool {
fn is_match(&self, candidate: &Candidate<'_>) -> bool {
if candidate.ext.is_empty() {
return false;
}
match self.0.get(&*candidate.ext) {
match self.0.get(candidate.ext.as_bytes()) {
None => false,
Some(regexes) => {
for &(_, ref re) in regexes {
if re.is_match(&*candidate.path) {
if re.is_match(candidate.path.as_bytes()) {
return true;
}
}
@@ -720,13 +788,17 @@ impl RequiredExtensionStrategy {
}
#[inline(never)]
fn matches_into(&self, candidate: &Candidate, matches: &mut Vec<usize>) {
fn matches_into(
&self,
candidate: &Candidate<'_>,
matches: &mut Vec<usize>,
) {
if candidate.ext.is_empty() {
return;
}
if let Some(regexes) = self.0.get(&*candidate.ext) {
if let Some(regexes) = self.0.get(candidate.ext.as_bytes()) {
for &(global_index, ref re) in regexes {
if re.is_match(&*candidate.path) {
if re.is_match(candidate.path.as_bytes()) {
matches.push(global_index);
}
}
@@ -736,19 +808,40 @@ impl RequiredExtensionStrategy {
#[derive(Clone, Debug)]
struct RegexSetStrategy {
matcher: RegexSet,
matcher: Regex,
map: Vec<usize>,
// We use a pool of PatternSets to hopefully avoid allocating a fresh one on
// each call.
//
// TODO: In the next semver breaking release, we should drop this pool and
// expose an opaque type that wraps PatternSet. Then callers can provide
// it to `matches_into` directly. Callers might still want to use a pool
// or similar to amortize allocation, but that matches the status quo and
// absolves us of needing to do it here.
patset: Arc<Pool<PatternSet, PatternSetPoolFn>>,
}
type PatternSetPoolFn =
Box<dyn Fn() -> PatternSet + Send + Sync + UnwindSafe + RefUnwindSafe>;
impl RegexSetStrategy {
fn is_match(&self, candidate: &Candidate) -> bool {
self.matcher.is_match(&*candidate.path)
fn is_match(&self, candidate: &Candidate<'_>) -> bool {
self.matcher.is_match(candidate.path.as_bytes())
}
fn matches_into(&self, candidate: &Candidate, matches: &mut Vec<usize>) {
for i in self.matcher.matches(&*candidate.path) {
fn matches_into(
&self,
candidate: &Candidate<'_>,
matches: &mut Vec<usize>,
) {
let input = regex_automata::Input::new(candidate.path.as_bytes());
let mut patset = self.patset.get();
patset.clear();
self.matcher.which_overlapping_matches(&input, &mut patset);
for i in patset.iter() {
matches.push(self.map[i]);
}
PoolGuard::put(patset);
}
}
@@ -761,11 +854,7 @@ struct MultiStrategyBuilder {
impl MultiStrategyBuilder {
fn new() -> MultiStrategyBuilder {
MultiStrategyBuilder {
literals: vec![],
map: vec![],
longest: 0,
}
MultiStrategyBuilder { literals: vec![], map: vec![], longest: 0 }
}
fn add(&mut self, global_index: usize, literal: String) {
@@ -777,39 +866,42 @@ impl MultiStrategyBuilder {
}
fn prefix(self) -> PrefixStrategy {
let it = self.literals.into_iter().map(|s| s.into_bytes());
PrefixStrategy {
matcher: AcAutomaton::new(it).into_full(),
matcher: AhoCorasick::new(&self.literals).unwrap(),
map: self.map,
longest: self.longest,
}
}
fn suffix(self) -> SuffixStrategy {
let it = self.literals.into_iter().map(|s| s.into_bytes());
SuffixStrategy {
matcher: AcAutomaton::new(it).into_full(),
matcher: AhoCorasick::new(&self.literals).unwrap(),
map: self.map,
longest: self.longest,
}
}
fn regex_set(self) -> Result<RegexSetStrategy, Error> {
let matcher = new_regex_set(self.literals)?;
let pattern_len = matcher.pattern_len();
let create: PatternSetPoolFn =
Box::new(move || PatternSet::new(pattern_len));
Ok(RegexSetStrategy {
matcher: new_regex_set(self.literals)?,
matcher,
map: self.map,
patset: Arc::new(Pool::new(create)),
})
}
}
#[derive(Clone, Debug)]
struct RequiredExtensionStrategyBuilder(
HashMap<Vec<u8>, Vec<(usize, String)>>,
fnv::HashMap<Vec<u8>, Vec<(usize, String)>>,
);
impl RequiredExtensionStrategyBuilder {
fn new() -> RequiredExtensionStrategyBuilder {
RequiredExtensionStrategyBuilder(HashMap::new())
RequiredExtensionStrategyBuilder(fnv::HashMap::default())
}
fn add(&mut self, global_index: usize, ext: String, regex: String) {
@@ -820,7 +912,7 @@ impl RequiredExtensionStrategyBuilder {
}
fn build(self) -> Result<RequiredExtensionStrategy, Error> {
let mut exts = HashMap::with_hasher(Fnv::default());
let mut exts = fnv::HashMap::default();
for (ext, regexes) in self.0.into_iter() {
exts.insert(ext.clone(), vec![]);
for (global_index, regex) in regexes {
@@ -832,10 +924,34 @@ impl RequiredExtensionStrategyBuilder {
}
}
/// Escape meta-characters within the given glob pattern.
///
/// The escaping works by surrounding meta-characters with brackets. For
/// example, `*` becomes `[*]`.
pub fn escape(s: &str) -> String {
let mut escaped = String::with_capacity(s.len());
for c in s.chars() {
match c {
// note that ! does not need escaping because it is only special
// inside brackets
'?' | '*' | '[' | ']' => {
escaped.push('[');
escaped.push(c);
escaped.push(']');
}
c => {
escaped.push(c);
}
}
}
escaped
}
#[cfg(test)]
mod tests {
use super::GlobSetBuilder;
use glob::Glob;
use crate::glob::Glob;
use super::{GlobSet, GlobSetBuilder};
#[test]
fn set_works() {
@@ -864,4 +980,43 @@ mod tests {
assert!(!set.is_match(""));
assert!(!set.is_match("a"));
}
#[test]
fn default_set_is_empty_works() {
let set: GlobSet = Default::default();
assert!(!set.is_match(""));
assert!(!set.is_match("a"));
}
#[test]
fn escape() {
use super::escape;
assert_eq!("foo", escape("foo"));
assert_eq!("foo[*]", escape("foo*"));
assert_eq!("[[][]]", escape("[]"));
assert_eq!("[*][?]", escape("*?"));
assert_eq!("src/[*][*]/[*].rs", escape("src/**/*.rs"));
assert_eq!("bar[[]ab[]]baz", escape("bar[ab]baz"));
assert_eq!("bar[[]!![]]!baz", escape("bar[!!]!baz"));
}
// This tests that regex matching doesn't "remember" the results of
// previous searches. That is, if any memory is reused from a previous
// search, then it should be cleared first.
#[test]
fn set_does_not_remember() {
let mut builder = GlobSetBuilder::new();
builder.add(Glob::new("*foo*").unwrap());
builder.add(Glob::new("*bar*").unwrap());
builder.add(Glob::new("*quux*").unwrap());
let set = builder.build().unwrap();
let matches = set.matches("ZfooZquuxZ");
assert_eq!(2, matches.len());
assert_eq!(0, matches[0]);
assert_eq!(2, matches[1]);
let matches = set.matches("nada");
assert_eq!(0, matches.len());
}
}
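The bracket escaping described in the doc comment for `escape` can be tried without the crate. The following is a minimal sketch that restates the same logic using only the standard library. The names `escape_glob` and `rust_files_under` are made up for this illustration and are not part of globset.

```rust
/// Standalone restatement of the escaping rule: each glob meta-character
/// (`?`, `*`, `[`, `]`) is wrapped in brackets so that it matches literally.
fn escape_glob(s: &str) -> String {
    let mut escaped = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            // `!` is left alone because it is only special inside brackets.
            '?' | '*' | '[' | ']' => {
                escaped.push('[');
                escaped.push(c);
                escaped.push(']');
            }
            c => escaped.push(c),
        }
    }
    escaped
}

/// Hypothetical helper: build a glob that treats `dir` as a literal
/// directory name and matches any `.rs` file directly inside it.
fn rust_files_under(dir: &str) -> String {
    format!("{}/*.rs", escape_glob(dir))
}
```

With this sketch, `rust_files_under("src[old]")` produces `src[[]old[]]/*.rs`: the directory name is matched literally, while the trailing `*` keeps its wildcard meaning because it was appended after escaping.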


@@ -0,0 +1,129 @@
use std::borrow::Cow;
use bstr::{ByteSlice, ByteVec};
/// The final component of the path, if it is a normal file.
///
/// If the path terminates in `.`, `..`, or consists solely of a root or
/// prefix, file_name will return None.
pub(crate) fn file_name<'a>(path: &Cow<'a, [u8]>) -> Option<Cow<'a, [u8]>> {
if path.last_byte().map_or(true, |b| b == b'.') {
return None;
}
let last_slash = path.rfind_byte(b'/').map(|i| i + 1).unwrap_or(0);
Some(match *path {
Cow::Borrowed(path) => Cow::Borrowed(&path[last_slash..]),
Cow::Owned(ref path) => {
let mut path = path.clone();
path.drain_bytes(..last_slash);
Cow::Owned(path)
}
})
}
/// Return a file extension given a path's file name.
///
/// Note that this does NOT match the semantics of std::path::Path::extension.
/// Namely, the extension includes the `.` and matching is otherwise more
/// liberal. Specifically, the extension is:
///
/// * None, if the file name given is empty;
/// * None, if there is no embedded `.`;
/// * Otherwise, the portion of the file name starting with the final `.`.
///
/// e.g., A file name of `.rs` has an extension `.rs`.
///
/// N.B. This is done to make certain glob match optimizations easier. Namely,
/// a pattern like `*.rs` is obviously trying to match files with a `rs`
/// extension, but it also matches files like `.rs`, which doesn't have an
/// extension according to std::path::Path::extension.
pub(crate) fn file_name_ext<'a>(
name: &Cow<'a, [u8]>,
) -> Option<Cow<'a, [u8]>> {
if name.is_empty() {
return None;
}
let last_dot_at = match name.rfind_byte(b'.') {
None => return None,
Some(i) => i,
};
Some(match *name {
Cow::Borrowed(name) => Cow::Borrowed(&name[last_dot_at..]),
Cow::Owned(ref name) => {
let mut name = name.clone();
name.drain_bytes(..last_dot_at);
Cow::Owned(name)
}
})
}
/// Normalizes a path to use `/` as a separator everywhere, even on platforms
/// that recognize other characters as separators.
#[cfg(unix)]
pub(crate) fn normalize_path(path: Cow<'_, [u8]>) -> Cow<'_, [u8]> {
// UNIX only uses /, so we're good.
path
}
/// Normalizes a path to use `/` as a separator everywhere, even on platforms
/// that recognize other characters as separators.
#[cfg(not(unix))]
pub(crate) fn normalize_path(mut path: Cow<[u8]>) -> Cow<[u8]> {
use std::path::is_separator;
for i in 0..path.len() {
if path[i] == b'/' || !is_separator(char::from(path[i])) {
continue;
}
path.to_mut()[i] = b'/';
}
path
}
#[cfg(test)]
mod tests {
use std::borrow::Cow;
use bstr::{ByteVec, B};
use super::{file_name_ext, normalize_path};
macro_rules! ext {
($name:ident, $file_name:expr, $ext:expr) => {
#[test]
fn $name() {
let bs = Vec::from($file_name);
let got = file_name_ext(&Cow::Owned(bs));
assert_eq!($ext.map(|s| Cow::Borrowed(B(s))), got);
}
};
}
ext!(ext1, "foo.rs", Some(".rs"));
ext!(ext2, ".rs", Some(".rs"));
ext!(ext3, "..rs", Some(".rs"));
ext!(ext4, "", None::<&str>);
ext!(ext5, "foo", None::<&str>);
macro_rules! normalize {
($name:ident, $path:expr, $expected:expr) => {
#[test]
fn $name() {
let bs = Vec::from_slice($path);
let got = normalize_path(Cow::Owned(bs));
assert_eq!($expected.to_vec(), got.into_owned());
}
};
}
normalize!(normal1, b"foo", b"foo");
normalize!(normal2, b"foo/bar", b"foo/bar");
#[cfg(unix)]
normalize!(normal3, b"foo\\bar", b"foo\\bar");
#[cfg(not(unix))]
normalize!(normal3, b"foo\\bar", b"foo/bar");
#[cfg(unix)]
normalize!(normal4, b"foo\\bar/baz", b"foo\\bar/baz");
#[cfg(not(unix))]
normalize!(normal4, b"foo\\bar/baz", b"foo/bar/baz");
}
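The doc comment on `file_name_ext` notes that its notion of an extension differs from `std::path::Path::extension`. A minimal sketch of that difference, using only the standard library, might look like the following. The functions `ext_with_dot` and `std_ext` are illustrative names and operate on plain slices rather than the `Cow<[u8]>` values used in the crate.

```rust
use std::path::Path;

/// Illustrative equivalent of the rule in the doc comment: the extension is
/// everything from the final `.` onward (dot included), or `None` if the
/// name is empty or contains no `.`.
fn ext_with_dot(name: &[u8]) -> Option<&[u8]> {
    if name.is_empty() {
        return None;
    }
    // Find the position of the last `.` byte, if any.
    let last_dot_at = name.iter().rposition(|&b| b == b'.')?;
    Some(&name[last_dot_at..])
}

/// The standard library's view of the same file name, for comparison.
fn std_ext(name: &str) -> Option<&str> {
    Path::new(name).extension().and_then(|e| e.to_str())
}
```

For `foo.rs`, `ext_with_dot` yields `.rs` while `std_ext` yields `rs`. For a file named `.rs`, `ext_with_dot` still yields `.rs`, whereas `std_ext` returns `None`. That second case is what lets a glob such as `*.rs` be handled by an extension lookup.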


@@ -0,0 +1,128 @@
use serde::{
de::{Error, SeqAccess, Visitor},
{Deserialize, Deserializer, Serialize, Serializer},
};
use crate::{Glob, GlobSet, GlobSetBuilder};
impl Serialize for Glob {
fn serialize<S: Serializer>(
&self,
serializer: S,
) -> Result<S::Ok, S::Error> {
serializer.serialize_str(self.glob())
}
}
struct GlobVisitor;
impl<'de> Visitor<'de> for GlobVisitor {
type Value = Glob;
fn expecting(
&self,
formatter: &mut std::fmt::Formatter,
) -> std::fmt::Result {
formatter.write_str("a glob pattern")
}
fn visit_str<E>(self, v: &str) -> Result<Self::Value, E>
where
E: Error,
{
Glob::new(v).map_err(serde::de::Error::custom)
}
}
impl<'de> Deserialize<'de> for Glob {
fn deserialize<D: Deserializer<'de>>(
deserializer: D,
) -> Result<Self, D::Error> {
deserializer.deserialize_str(GlobVisitor)
}
}
struct GlobSetVisitor;
impl<'de> Visitor<'de> for GlobSetVisitor {
type Value = GlobSet;
fn expecting(
&self,
formatter: &mut std::fmt::Formatter,
) -> std::fmt::Result {
formatter.write_str("an array of glob patterns")
}
fn visit_seq<A>(self, mut seq: A) -> Result<Self::Value, A::Error>
where
A: SeqAccess<'de>,
{
let mut builder = GlobSetBuilder::new();
while let Some(glob) = seq.next_element()? {
builder.add(glob);
}
builder.build().map_err(serde::de::Error::custom)
}
}
impl<'de> Deserialize<'de> for GlobSet {
fn deserialize<D: Deserializer<'de>>(
deserializer: D,
) -> Result<Self, D::Error> {
deserializer.deserialize_seq(GlobSetVisitor)
}
}
#[cfg(test)]
mod tests {
use std::collections::HashMap;
use crate::{Glob, GlobSet};
#[test]
fn glob_deserialize_borrowed() {
let string = r#"{"markdown": "*.md"}"#;
let map: HashMap<String, Glob> =
serde_json::from_str(&string).unwrap();
assert_eq!(map["markdown"], Glob::new("*.md").unwrap());
}
#[test]
fn glob_deserialize_owned() {
let string = r#"{"markdown": "*.md"}"#;
let v: serde_json::Value = serde_json::from_str(&string).unwrap();
let map: HashMap<String, Glob> = serde_json::from_value(v).unwrap();
assert_eq!(map["markdown"], Glob::new("*.md").unwrap());
}
#[test]
fn glob_deserialize_error() {
let string = r#"{"error": "["}"#;
let map = serde_json::from_str::<HashMap<String, Glob>>(&string);
assert!(map.is_err());
}
#[test]
fn glob_json_works() {
let test_glob = Glob::new("src/**/*.rs").unwrap();
let ser = serde_json::to_string(&test_glob).unwrap();
assert_eq!(ser, "\"src/**/*.rs\"");
let de: Glob = serde_json::from_str(&ser).unwrap();
assert_eq!(test_glob, de);
}
#[test]
fn glob_set_deserialize() {
let j = r#" ["src/**/*.rs", "README.md"] "#;
let set: GlobSet = serde_json::from_str(j).unwrap();
assert!(set.is_match("src/lib.rs"));
assert!(!set.is_match("Cargo.lock"));
}
}

33
crates/grep/Cargo.toml Normal file

@@ -0,0 +1,33 @@
[package]
name = "grep"
version = "0.3.1" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
Fast line oriented regex searching as a library.
"""
documentation = "https://docs.rs/grep"
homepage = "https://github.com/BurntSushi/ripgrep/tree/master/crates/grep"
repository = "https://github.com/BurntSushi/ripgrep/tree/master/crates/grep"
readme = "README.md"
keywords = ["regex", "grep", "egrep", "search", "pattern"]
license = "Unlicense OR MIT"
edition = "2021"
[dependencies]
grep-cli = { version = "0.1.10", path = "../cli" }
grep-matcher = { version = "0.1.7", path = "../matcher" }
grep-pcre2 = { version = "0.1.7", path = "../pcre2", optional = true }
grep-printer = { version = "0.2.1", path = "../printer" }
grep-regex = { version = "0.1.12", path = "../regex" }
grep-searcher = { version = "0.1.13", path = "../searcher" }
[dev-dependencies]
termcolor = "1.0.4"
walkdir = "2.2.7"
[features]
simd-accel = ["grep-searcher/simd-accel"]
pcre2 = ["grep-pcre2"]
# This feature is DEPRECATED. Runtime dispatch is used for SIMD now.
avx-accel = []

34
crates/grep/README.md Normal file
View File

@@ -0,0 +1,34 @@
grep
----
ripgrep, as a library.
[![Build status](https://github.com/BurntSushi/ripgrep/workflows/ci/badge.svg)](https://github.com/BurntSushi/ripgrep/actions)
[![](https://img.shields.io/crates/v/grep.svg)](https://crates.io/crates/grep)
Dual-licensed under MIT or the [UNLICENSE](https://unlicense.org/).
### Documentation
[https://docs.rs/grep](https://docs.rs/grep)
NOTE: This crate isn't ready for wide use yet. Ambitious individuals can
probably piece together the parts, but there is no high level documentation
describing how all of the pieces fit together.
### Usage
Add this to your `Cargo.toml`:
```toml
[dependencies]
grep = "0.3"
```
### Features
This crate provides a `pcre2` feature (disabled by default) which, when
enabled, re-exports the `grep-pcre2` crate as an alternative `Matcher`
implementation to the standard `grep-regex` implementation.


@@ -0,0 +1,69 @@
use std::{env, error::Error, ffi::OsString, io::IsTerminal, process};
use {
grep::{
cli,
printer::{ColorSpecs, StandardBuilder},
regex::RegexMatcher,
searcher::{BinaryDetection, SearcherBuilder},
},
termcolor::ColorChoice,
walkdir::WalkDir,
};
fn main() {
if let Err(err) = try_main() {
eprintln!("{}", err);
process::exit(1);
}
}
fn try_main() -> Result<(), Box<dyn Error>> {
let mut args: Vec<OsString> = env::args_os().collect();
if args.len() < 2 {
return Err("Usage: simplegrep <pattern> [<path> ...]".into());
}
if args.len() == 2 {
args.push(OsString::from("./"));
}
search(cli::pattern_from_os(&args[1])?, &args[2..])
}
fn search(pattern: &str, paths: &[OsString]) -> Result<(), Box<dyn Error>> {
let matcher = RegexMatcher::new_line_matcher(&pattern)?;
let mut searcher = SearcherBuilder::new()
.binary_detection(BinaryDetection::quit(b'\x00'))
.line_number(false)
.build();
let mut printer = StandardBuilder::new()
.color_specs(ColorSpecs::default_with_color())
.build(cli::stdout(if std::io::stdout().is_terminal() {
ColorChoice::Auto
} else {
ColorChoice::Never
}));
for path in paths {
for result in WalkDir::new(path) {
let dent = match result {
Ok(dent) => dent,
Err(err) => {
eprintln!("{}", err);
continue;
}
};
if !dent.file_type().is_file() {
continue;
}
let result = searcher.search_path(
&matcher,
dent.path(),
printer.sink_with_path(&matcher, dent.path()),
);
if let Err(err) = result {
eprintln!("{}: {}", dent.path().display(), err);
}
}
}
Ok(())
}

21
crates/grep/src/lib.rs Normal file

@@ -0,0 +1,21 @@
/*!
ripgrep, as a library.
This library is intended to provide a high level facade to the crates that
make up ripgrep's core searching routines. However, there is no high level
documentation available yet guiding users on how to fit all of the pieces
together.
Every public API item in the constituent crates is documented, but examples
are sparse.
A cookbook and a guide are planned.
*/
pub extern crate grep_cli as cli;
pub extern crate grep_matcher as matcher;
#[cfg(feature = "pcre2")]
pub extern crate grep_pcre2 as pcre2;
pub extern crate grep_printer as printer;
pub extern crate grep_regex as regex;
pub extern crate grep_searcher as searcher;

44
crates/ignore/Cargo.toml Normal file

@@ -0,0 +1,44 @@
[package]
name = "ignore"
version = "0.4.22" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
A fast library for efficiently matching ignore files such as `.gitignore`
against file paths.
"""
documentation = "https://docs.rs/ignore"
homepage = "https://github.com/BurntSushi/ripgrep/tree/master/crates/ignore"
repository = "https://github.com/BurntSushi/ripgrep/tree/master/crates/ignore"
readme = "README.md"
keywords = ["glob", "ignore", "gitignore", "pattern", "file"]
license = "Unlicense OR MIT"
edition = "2021"
[lib]
name = "ignore"
bench = false
[dependencies]
crossbeam-deque = "0.8.3"
globset = { version = "0.4.14", path = "../globset" }
log = "0.4.20"
memchr = "2.6.3"
same-file = "1.0.6"
walkdir = "2.4.0"
[dependencies.regex-automata]
version = "0.4.0"
default-features = false
features = ["std", "perf", "syntax", "meta", "nfa", "hybrid", "dfa-onepass"]
[target.'cfg(windows)'.dependencies.winapi-util]
version = "0.1.2"
[dev-dependencies]
bstr = { version = "1.6.2", default-features = false, features = ["std"] }
crossbeam-channel = "0.5.8"
[features]
# DEPRECATED. It is a no-op. SIMD is done automatically through runtime
# dispatch.
simd-accel = []

21
crates/ignore/LICENSE-MIT Normal file

@@ -0,0 +1,21 @@
The MIT License (MIT)
Copyright (c) 2015 Andrew Gallant
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.


@@ -4,11 +4,10 @@ The ignore crate provides a fast recursive directory iterator that respects
various filters such as globs, file types and `.gitignore` files. This crate
also provides lower level direct access to gitignore and file type matchers.
[![Linux build status](https://api.travis-ci.org/BurntSushi/ripgrep.png)](https://travis-ci.org/BurntSushi/ripgrep)
[![Windows build status](https://ci.appveyor.com/api/projects/status/github/BurntSushi/ripgrep?svg=true)](https://ci.appveyor.com/project/BurntSushi/ripgrep)
[![Build status](https://github.com/BurntSushi/ripgrep/workflows/ci/badge.svg)](https://github.com/BurntSushi/ripgrep/actions)
[![](https://img.shields.io/crates/v/ignore.svg)](https://crates.io/crates/ignore)
Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org).
Dual-licensed under MIT or the [UNLICENSE](https://unlicense.org/).
### Documentation
@@ -23,12 +22,6 @@ Add this to your `Cargo.toml`:
ignore = "0.4"
```
and this to your crate root:
```rust
extern crate ignore;
```
### Example
This example shows the most basic usage of this crate. This code will

24
crates/ignore/UNLICENSE Normal file

@@ -0,0 +1,24 @@
This is free and unencumbered software released into the public domain.
Anyone is free to copy, modify, publish, use, compile, sell, or
distribute this software, either in source code form or as a compiled
binary, for any purpose, commercial or non-commercial, and by any
means.
In jurisdictions that recognize copyright laws, the author or authors
of this software dedicate any and all copyright interest in the
software to the public domain. We make this dedication for the benefit
of the public at large and to the detriment of our heirs and
successors. We intend this dedication to be an overt act of
relinquishment in perpetuity of all present and future rights to this
software under copyright law.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR
OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
OTHER DEALINGS IN THE SOFTWARE.
For more information, please refer to <http://unlicense.org/>


@@ -0,0 +1,64 @@
use std::{env, io::Write, path::Path};
use {bstr::ByteVec, ignore::WalkBuilder, walkdir::WalkDir};
fn main() {
let mut path = env::args().nth(1).unwrap();
let mut parallel = false;
let mut simple = false;
let (tx, rx) = crossbeam_channel::bounded::<DirEntry>(100);
if path == "parallel" {
path = env::args().nth(2).unwrap();
parallel = true;
} else if path == "walkdir" {
path = env::args().nth(2).unwrap();
simple = true;
}
let stdout_thread = std::thread::spawn(move || {
let mut stdout = std::io::BufWriter::new(std::io::stdout());
for dent in rx {
stdout.write_all(&*Vec::from_path_lossy(dent.path())).unwrap();
stdout.write_all(b"\n").unwrap();
}
});
if parallel {
let walker = WalkBuilder::new(path).threads(6).build_parallel();
walker.run(|| {
let tx = tx.clone();
Box::new(move |result| {
use ignore::WalkState::*;
tx.send(DirEntry::Y(result.unwrap())).unwrap();
Continue
})
});
} else if simple {
let walker = WalkDir::new(path);
for result in walker {
tx.send(DirEntry::X(result.unwrap())).unwrap();
}
} else {
let walker = WalkBuilder::new(path).build();
for result in walker {
tx.send(DirEntry::Y(result.unwrap())).unwrap();
}
}
drop(tx);
stdout_thread.join().unwrap();
}
enum DirEntry {
X(walkdir::DirEntry),
Y(ignore::DirEntry),
}
impl DirEntry {
fn path(&self) -> &Path {
match *self {
DirEntry::X(ref x) => x.path(),
DirEntry::Y(ref y) => y.path(),
}
}
}


@@ -0,0 +1,348 @@
/// This list represents the default file types that ripgrep ships with. In
/// general, any file format is fair game, although it should generally be
/// limited to reasonably popular open formats. For other cases, you can add
/// types to each invocation of ripgrep with the '--type-add' flag.
///
/// If you would like to add or improve this list, please file a PR:
/// <https://github.com/BurntSushi/ripgrep>.
///
/// Please try to keep this list sorted lexicographically and wrapped to 79
/// columns (inclusive).
#[rustfmt::skip]
pub(crate) const DEFAULT_TYPES: &[(&[&str], &[&str])] = &[
(&["ada"], &["*.adb", "*.ads"]),
(&["agda"], &["*.agda", "*.lagda"]),
(&["aidl"], &["*.aidl"]),
(&["alire"], &["alire.toml"]),
(&["amake"], &["*.mk", "*.bp"]),
(&["asciidoc"], &["*.adoc", "*.asc", "*.asciidoc"]),
(&["asm"], &["*.asm", "*.s", "*.S"]),
(&["asp"], &[
"*.aspx", "*.aspx.cs", "*.aspx.vb", "*.ascx", "*.ascx.cs",
"*.ascx.vb", "*.asp"
]),
(&["ats"], &["*.ats", "*.dats", "*.sats", "*.hats"]),
(&["avro"], &["*.avdl", "*.avpr", "*.avsc"]),
(&["awk"], &["*.awk"]),
(&["bat", "batch"], &["*.bat"]),
(&["bazel"], &[
"*.bazel", "*.bzl", "*.BUILD", "*.bazelrc", "BUILD", "MODULE.bazel",
"WORKSPACE", "WORKSPACE.bazel",
]),
(&["bitbake"], &["*.bb", "*.bbappend", "*.bbclass", "*.conf", "*.inc"]),
(&["brotli"], &["*.br"]),
(&["buildstream"], &["*.bst"]),
(&["bzip2"], &["*.bz2", "*.tbz2"]),
(&["c"], &["*.[chH]", "*.[chH].in", "*.cats"]),
(&["cabal"], &["*.cabal"]),
(&["candid"], &["*.did"]),
(&["carp"], &["*.carp"]),
(&["cbor"], &["*.cbor"]),
(&["ceylon"], &["*.ceylon"]),
(&["clojure"], &["*.clj", "*.cljc", "*.cljs", "*.cljx"]),
(&["cmake"], &["*.cmake", "CMakeLists.txt"]),
(&["cmd"], &["*.bat", "*.cmd"]),
(&["cml"], &["*.cml"]),
(&["coffeescript"], &["*.coffee"]),
(&["config"], &["*.cfg", "*.conf", "*.config", "*.ini"]),
(&["coq"], &["*.v"]),
(&["cpp"], &[
"*.[ChH]", "*.cc", "*.[ch]pp", "*.[ch]xx", "*.hh", "*.inl",
"*.[ChH].in", "*.cc.in", "*.[ch]pp.in", "*.[ch]xx.in", "*.hh.in",
]),
(&["creole"], &["*.creole"]),
(&["crystal"], &["Projectfile", "*.cr", "*.ecr", "shard.yml"]),
(&["cs"], &["*.cs"]),
(&["csharp"], &["*.cs"]),
(&["cshtml"], &["*.cshtml"]),
(&["csproj"], &["*.csproj"]),
(&["css"], &["*.css", "*.scss"]),
(&["csv"], &["*.csv"]),
(&["cuda"], &["*.cu", "*.cuh"]),
(&["cython"], &["*.pyx", "*.pxi", "*.pxd"]),
(&["d"], &["*.d"]),
(&["dart"], &["*.dart"]),
(&["devicetree"], &["*.dts", "*.dtsi"]),
(&["dhall"], &["*.dhall"]),
(&["diff"], &["*.patch", "*.diff"]),
(&["dita"], &["*.dita", "*.ditamap", "*.ditaval"]),
(&["docker"], &["*Dockerfile*"]),
(&["dockercompose"], &["docker-compose.yml", "docker-compose.*.yml"]),
(&["dts"], &["*.dts", "*.dtsi"]),
(&["dvc"], &["Dvcfile", "*.dvc"]),
(&["ebuild"], &["*.ebuild", "*.eclass"]),
(&["edn"], &["*.edn"]),
(&["elisp"], &["*.el"]),
(&["elixir"], &["*.ex", "*.eex", "*.exs", "*.heex", "*.leex", "*.livemd"]),
(&["elm"], &["*.elm"]),
(&["erb"], &["*.erb"]),
(&["erlang"], &["*.erl", "*.hrl"]),
(&["fennel"], &["*.fnl"]),
(&["fidl"], &["*.fidl"]),
(&["fish"], &["*.fish"]),
(&["flatbuffers"], &["*.fbs"]),
(&["fortran"], &[
"*.f", "*.F", "*.f77", "*.F77", "*.pfo",
"*.f90", "*.F90", "*.f95", "*.F95",
]),
(&["fsharp"], &["*.fs", "*.fsx", "*.fsi"]),
(&["fut"], &["*.fut"]),
(&["gap"], &["*.g", "*.gap", "*.gi", "*.gd", "*.tst"]),
(&["gn"], &["*.gn", "*.gni"]),
(&["go"], &["*.go"]),
(&["gprbuild"], &["*.gpr"]),
(&["gradle"], &[
"*.gradle", "*.gradle.kts", "gradle.properties", "gradle-wrapper.*",
"gradlew", "gradlew.bat",
]),
(&["graphql"], &["*.graphql", "*.graphqls"]),
(&["groovy"], &["*.groovy", "*.gradle"]),
(&["gzip"], &["*.gz", "*.tgz"]),
(&["h"], &["*.h", "*.hh", "*.hpp"]),
(&["haml"], &["*.haml"]),
(&["hare"], &["*.ha"]),
(&["haskell"], &["*.hs", "*.lhs", "*.cpphs", "*.c2hs", "*.hsc"]),
(&["hbs"], &["*.hbs"]),
(&["hs"], &["*.hs", "*.lhs"]),
(&["html"], &["*.htm", "*.html", "*.ejs"]),
(&["hy"], &["*.hy"]),
(&["idris"], &["*.idr", "*.lidr"]),
(&["janet"], &["*.janet"]),
(&["java"], &["*.java", "*.jsp", "*.jspx", "*.properties"]),
(&["jinja"], &["*.j2", "*.jinja", "*.jinja2"]),
(&["jl"], &["*.jl"]),
(&["js"], &["*.js", "*.jsx", "*.vue", "*.cjs", "*.mjs"]),
(&["json"], &["*.json", "composer.lock", "*.sarif"]),
(&["jsonl"], &["*.jsonl"]),
(&["julia"], &["*.jl"]),
(&["jupyter"], &["*.ipynb", "*.jpynb"]),
(&["k"], &["*.k"]),
(&["kotlin"], &["*.kt", "*.kts"]),
(&["lean"], &["*.lean"]),
(&["less"], &["*.less"]),
(&["license"], &[
// General
"COPYING", "COPYING[.-]*",
"COPYRIGHT", "COPYRIGHT[.-]*",
"EULA", "EULA[.-]*",
"licen[cs]e", "licen[cs]e.*",
"LICEN[CS]E", "LICEN[CS]E[.-]*", "*[.-]LICEN[CS]E*",
"NOTICE", "NOTICE[.-]*",
"PATENTS", "PATENTS[.-]*",
"UNLICEN[CS]E", "UNLICEN[CS]E[.-]*",
// GPL (gpl.txt, etc.)
"agpl[.-]*",
"gpl[.-]*",
"lgpl[.-]*",
// Other license-specific (APACHE-2.0.txt, etc.)
"AGPL-*[0-9]*",
"APACHE-*[0-9]*",
"BSD-*[0-9]*",
"CC-BY-*",
"GFDL-*[0-9]*",
"GNU-*[0-9]*",
"GPL-*[0-9]*",
"LGPL-*[0-9]*",
"MIT-*[0-9]*",
"MPL-*[0-9]*",
"OFL-*[0-9]*",
]),
(&["lilypond"], &["*.ly", "*.ily"]),
(&["lisp"], &["*.el", "*.jl", "*.lisp", "*.lsp", "*.sc", "*.scm"]),
(&["lock"], &["*.lock", "package-lock.json"]),
(&["log"], &["*.log"]),
(&["lua"], &["*.lua"]),
(&["lz4"], &["*.lz4"]),
(&["lzma"], &["*.lzma"]),
(&["m4"], &["*.ac", "*.m4"]),
(&["make"], &[
"[Gg][Nn][Uu]makefile", "[Mm]akefile",
"[Gg][Nn][Uu]makefile.am", "[Mm]akefile.am",
"[Gg][Nn][Uu]makefile.in", "[Mm]akefile.in",
"*.mk", "*.mak"
]),
(&["mako"], &["*.mako", "*.mao"]),
(&["man"], &["*.[0-9lnpx]", "*.[0-9][cEFMmpSx]"]),
(&["markdown", "md"], &[
"*.markdown",
"*.md",
"*.mdown",
"*.mdwn",
"*.mkd",
"*.mkdn",
"*.mdx",
]),
(&["matlab"], &["*.m"]),
(&["meson"], &["meson.build", "meson_options.txt", "meson.options"]),
(&["minified"], &["*.min.html", "*.min.css", "*.min.js"]),
(&["mint"], &["*.mint"]),
(&["mk"], &["mkfile"]),
(&["ml"], &["*.ml"]),
(&["motoko"], &["*.mo"]),
(&["msbuild"], &[
"*.csproj", "*.fsproj", "*.vcxproj", "*.proj", "*.props", "*.targets",
"*.sln",
]),
(&["nim"], &["*.nim", "*.nimf", "*.nimble", "*.nims"]),
(&["nix"], &["*.nix"]),
(&["objc"], &["*.h", "*.m"]),
(&["objcpp"], &["*.h", "*.mm"]),
(&["ocaml"], &["*.ml", "*.mli", "*.mll", "*.mly"]),
(&["org"], &["*.org", "*.org_archive"]),
(&["pants"], &["BUILD"]),
(&["pascal"], &["*.pas", "*.dpr", "*.lpr", "*.pp", "*.inc"]),
(&["pdf"], &["*.pdf"]),
(&["perl"], &["*.perl", "*.pl", "*.PL", "*.plh", "*.plx", "*.pm", "*.t"]),
(&["php"], &[
// note that PHP 6 doesn't exist
// See: https://wiki.php.net/rfc/php6
"*.php", "*.php3", "*.php4", "*.php5", "*.php7", "*.php8",
"*.pht", "*.phtml"
]),
(&["po"], &["*.po"]),
(&["pod"], &["*.pod"]),
(&["postscript"], &["*.eps", "*.ps"]),
(&["prolog"], &["*.pl", "*.pro", "*.prolog", "*.P"]),
(&["protobuf"], &["*.proto"]),
(&["ps"], &["*.cdxml", "*.ps1", "*.ps1xml", "*.psd1", "*.psm1"]),
(&["puppet"], &["*.epp", "*.erb", "*.pp", "*.rb"]),
(&["purs"], &["*.purs"]),
(&["py", "python"], &["*.py", "*.pyi"]),
(&["qmake"], &["*.pro", "*.pri", "*.prf"]),
(&["qml"], &["*.qml"]),
(&["r"], &["*.R", "*.r", "*.Rmd", "*.Rnw"]),
(&["racket"], &["*.rkt"]),
(&["raku"], &[
"*.raku", "*.rakumod", "*.rakudoc", "*.rakutest",
"*.p6", "*.pl6", "*.pm6"
]),
(&["rdoc"], &["*.rdoc"]),
(&["readme"], &["README*", "*README"]),
(&["reasonml"], &["*.re", "*.rei"]),
(&["red"], &["*.r", "*.red", "*.reds"]),
(&["rescript"], &["*.res", "*.resi"]),
(&["robot"], &["*.robot"]),
(&["rst"], &["*.rst"]),
(&["ruby"], &[
// Idiomatic files
"config.ru", "Gemfile", ".irbrc", "Rakefile",
// Extensions
"*.gemspec", "*.rb", "*.rbw"
]),
(&["rust"], &["*.rs"]),
(&["sass"], &["*.sass", "*.scss"]),
(&["scala"], &["*.scala", "*.sbt"]),
(&["sh"], &[
// Portable/misc. init files
".login", ".logout", ".profile", "profile",
// bash-specific init files
".bash_login", "bash_login",
".bash_logout", "bash_logout",
".bash_profile", "bash_profile",
".bashrc", "bashrc", "*.bashrc",
// csh-specific init files
".cshrc", "*.cshrc",
// ksh-specific init files
".kshrc", "*.kshrc",
// tcsh-specific init files
".tcshrc",
// zsh-specific init files
".zshenv", "zshenv",
".zlogin", "zlogin",
".zlogout", "zlogout",
".zprofile", "zprofile",
".zshrc", "zshrc",
// Extensions
"*.bash", "*.csh", "*.ksh", "*.sh", "*.tcsh", "*.zsh",
]),
(&["slim"], &["*.skim", "*.slim", "*.slime"]),
(&["smarty"], &["*.tpl"]),
(&["sml"], &["*.sml", "*.sig"]),
(&["solidity"], &["*.sol"]),
(&["soy"], &["*.soy"]),
(&["spark"], &["*.spark"]),
(&["spec"], &["*.spec"]),
(&["sql"], &["*.sql", "*.psql"]),
(&["stylus"], &["*.styl"]),
(&["sv"], &["*.v", "*.vg", "*.sv", "*.svh", "*.h"]),
(&["svg"], &["*.svg"]),
(&["swift"], &["*.swift"]),
(&["swig"], &["*.def", "*.i"]),
(&["systemd"], &[
"*.automount", "*.conf", "*.device", "*.link", "*.mount", "*.path",
"*.scope", "*.service", "*.slice", "*.socket", "*.swap", "*.target",
"*.timer",
]),
(&["taskpaper"], &["*.taskpaper"]),
(&["tcl"], &["*.tcl"]),
(&["tex"], &["*.tex", "*.ltx", "*.cls", "*.sty", "*.bib", "*.dtx", "*.ins"]),
(&["texinfo"], &["*.texi"]),
(&["textile"], &["*.textile"]),
(&["tf"], &[
"*.tf", "*.auto.tfvars", "terraform.tfvars", "*.tf.json",
"*.auto.tfvars.json", "terraform.tfvars.json", "*.terraformrc",
"terraform.rc", "*.tfrc", "*.terraform.lock.hcl",
]),
(&["thrift"], &["*.thrift"]),
(&["toml"], &["*.toml", "Cargo.lock"]),
(&["ts", "typescript"], &["*.ts", "*.tsx", "*.cts", "*.mts"]),
(&["twig"], &["*.twig"]),
(&["txt"], &["*.txt"]),
(&["typoscript"], &["*.typoscript", "*.ts"]),
(&["usd"], &["*.usd", "*.usda", "*.usdc"]),
(&["v"], &["*.v", "*.vsh"]),
(&["vala"], &["*.vala"]),
(&["vb"], &["*.vb"]),
(&["vcl"], &["*.vcl"]),
(&["verilog"], &["*.v", "*.vh", "*.sv", "*.svh"]),
(&["vhdl"], &["*.vhd", "*.vhdl"]),
(&["vim"], &[
"*.vim", ".vimrc", ".gvimrc", "vimrc", "gvimrc", "_vimrc", "_gvimrc",
]),
(&["vimscript"], &[
"*.vim", ".vimrc", ".gvimrc", "vimrc", "gvimrc", "_vimrc", "_gvimrc",
]),
(&["webidl"], &["*.idl", "*.webidl", "*.widl"]),
(&["wiki"], &["*.mediawiki", "*.wiki"]),
(&["xml"], &[
"*.xml", "*.xml.dist", "*.dtd", "*.xsl", "*.xslt", "*.xsd", "*.xjb",
"*.rng", "*.sch", "*.xhtml",
]),
(&["xz"], &["*.xz", "*.txz"]),
(&["yacc"], &["*.y"]),
(&["yaml"], &["*.yaml", "*.yml"]),
(&["yang"], &["*.yang"]),
(&["z"], &["*.Z"]),
(&["zig"], &["*.zig"]),
(&["zsh"], &[
".zshenv", "zshenv",
".zlogin", "zlogin",
".zlogout", "zlogout",
".zprofile", "zprofile",
".zshrc", "zshrc",
"*.zsh",
]),
(&["zstd"], &["*.zst", "*.zstd"]),
];
#[cfg(test)]
mod tests {
use super::DEFAULT_TYPES;
#[test]
fn default_types_are_sorted() {
let mut names = DEFAULT_TYPES.iter().map(|(aliases, _)| aliases[0]);
let Some(mut previous_name) = names.next() else {
return;
};
for name in names {
assert!(
name > previous_name,
r#""{}" should be sorted before "{}" in `DEFAULT_TYPES`"#,
name,
previous_name
);
previous_name = name;
}
}
}
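
Note on the test above: it checks that the first alias of each entry is strictly greater than the first alias of the entry before it, which also rules out duplicate primary names. A minimal sketch of the same check on a small table follows. `SAMPLE_TYPES` and `first_unsorted_pair` are hypothetical names used only for illustration; they are not part of the crate.

```rust
/// Hypothetical stand-in for `DEFAULT_TYPES`: (aliases, globs) pairs.
const SAMPLE_TYPES: &[(&[&str], &[&str])] = &[
    (&["py", "python"], &["*.py", "*.pyi"]),
    (&["rust"], &["*.rs"]),
    (&["zig"], &["*.zig"]),
];

/// Returns the first adjacent pair of primary names that is not in strictly
/// ascending order, or `None` if the table is sorted.
fn first_unsorted_pair(
    types: &[(&[&'static str], &[&'static str])],
) -> Option<(&'static str, &'static str)> {
    types
        .windows(2)
        // Compare only the first alias of each entry, as the test does.
        .map(|w| (w[0].0[0], w[1].0[0]))
        .find(|(prev, next)| next <= prev)
}
```

Returning the offending pair instead of asserting makes it easy to report which entry needs to move.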

File diff suppressed because it is too large


@@ -7,20 +7,22 @@ Note that this module implements the specification as described in the
the `git` command line tool.
*/
use std::cell::RefCell;
use std::env;
use std::fs::File;
use std::io::{self, BufRead, Read};
use std::path::{Path, PathBuf};
use std::str;
use std::sync::Arc;
use std::{
fs::File,
io::{BufRead, BufReader, Read},
path::{Path, PathBuf},
sync::Arc,
};
use globset::{Candidate, GlobBuilder, GlobSet, GlobSetBuilder};
use regex::bytes::Regex;
use thread_local::ThreadLocal;
use {
globset::{Candidate, GlobBuilder, GlobSet, GlobSetBuilder},
regex_automata::util::pool::Pool,
};
use pathutil::{is_file_name, strip_prefix};
use {Error, Match, PartialErrorBuilder};
use crate::{
pathutil::{is_file_name, strip_prefix},
Error, Match, PartialErrorBuilder,
};
/// Glob represents a single glob in a gitignore file.
///
@@ -69,8 +71,7 @@ impl Glob {
/// Returns true if and only if this glob has a `**/` prefix.
fn has_doublestar_prefix(&self) -> bool {
self.actual.starts_with("**/")
|| (self.actual == "**" && self.is_only_dir)
self.actual.starts_with("**/") || self.actual == "**"
}
}
@@ -83,7 +84,7 @@ pub struct Gitignore {
globs: Vec<Glob>,
num_ignores: u64,
num_whitelists: u64,
matches: Option<Arc<ThreadLocal<RefCell<Vec<usize>>>>>,
matches: Option<Arc<Pool<Vec<usize>>>>,
}
impl Gitignore {
@@ -127,16 +128,7 @@ impl Gitignore {
/// `$XDG_CONFIG_HOME/git/ignore` is read. If `$XDG_CONFIG_HOME` is not
/// set or is empty, then `$HOME/.config/git/ignore` is used instead.
pub fn global() -> (Gitignore, Option<Error>) {
match gitconfig_excludes_path() {
None => (Gitignore::empty(), None),
Some(path) => {
if !path.is_file() {
(Gitignore::empty(), None)
} else {
Gitignore::new(path)
}
}
}
GitignoreBuilder::new("").build_global()
}
/// Creates a new empty gitignore matcher that never matches anything.
@@ -259,8 +251,7 @@ impl Gitignore {
return Match::None;
}
let path = path.as_ref();
let _matches = self.matches.as_ref().unwrap().get_default();
let mut matches = _matches.borrow_mut();
let mut matches = self.matches.as_ref().unwrap().get();
let candidate = Candidate::new(path);
self.set.matches_candidate_into(&candidate, &mut *matches);
for &i in matches.iter().rev() {
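
Note on the hunks above: the per-thread `ThreadLocal<RefCell<Vec<usize>>>` scratch buffer is replaced by a `Pool<Vec<usize>>` from which a buffer is borrowed for each match. The idea of reusing scratch buffers can be sketched with the standard library alone. This is an assumption-laden illustration, not how `regex_automata::util::pool::Pool` is implemented; `ScratchPool` and `with_scratch` are hypothetical names.

```rust
use std::sync::Mutex;

/// Hypothetical, simplified pool of reusable scratch buffers.
struct ScratchPool {
    stack: Mutex<Vec<Vec<usize>>>,
}

impl ScratchPool {
    fn new() -> Self {
        ScratchPool { stack: Mutex::new(Vec::new()) }
    }

    /// Borrows a cleared buffer for the duration of `f`, then returns it to
    /// the pool so that its allocation can be reused by a later caller.
    fn with_scratch<R>(&self, f: impl FnOnce(&mut Vec<usize>) -> R) -> R {
        // The lock is held only while popping, not while `f` runs.
        let mut buf = self.stack.lock().unwrap().pop().unwrap_or_default();
        buf.clear();
        let out = f(&mut buf);
        self.stack.lock().unwrap().push(buf);
        out
    }
}
```

A buffer is created only when the pool is empty, so a single-threaded caller keeps reusing one allocation across calls.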
@@ -342,23 +333,50 @@ impl GitignoreBuilder {
pub fn build(&self) -> Result<Gitignore, Error> {
let nignore = self.globs.iter().filter(|g| !g.is_whitelist()).count();
let nwhite = self.globs.iter().filter(|g| g.is_whitelist()).count();
let set =
self.builder.build().map_err(|err| {
Error::Glob {
glob: None,
err: err.to_string(),
}
})?;
let set = self
.builder
.build()
.map_err(|err| Error::Glob { glob: None, err: err.to_string() })?;
Ok(Gitignore {
set: set,
set,
root: self.root.clone(),
globs: self.globs.clone(),
num_ignores: nignore as u64,
num_whitelists: nwhite as u64,
matches: Some(Arc::new(ThreadLocal::default())),
matches: Some(Arc::new(Pool::new(|| vec![]))),
})
}
/// Build a global gitignore matcher using the configuration in this
/// builder.
///
/// This consumes ownership of the builder unlike `build` because it
/// must mutate the builder to add the global gitignore globs.
///
/// Note that this ignores the path given to this builder's constructor
/// and instead derives the path automatically from git's global
/// configuration.
pub fn build_global(mut self) -> (Gitignore, Option<Error>) {
match gitconfig_excludes_path() {
None => (Gitignore::empty(), None),
Some(path) => {
if !path.is_file() {
(Gitignore::empty(), None)
} else {
let mut errs = PartialErrorBuilder::default();
errs.maybe_push_ignore_io(self.add(path));
match self.build() {
Ok(gi) => (gi, errs.into_error_option()),
Err(err) => {
errs.push(err);
(Gitignore::empty(), errs.into_error_option())
}
}
}
}
}
}
/// Add each glob from the file path given.
///
/// The file given should be formatted as a `gitignore` file.
@@ -372,7 +390,7 @@ impl GitignoreBuilder {
Err(err) => return Some(Error::Io(err).with_path(path)),
Ok(file) => file,
};
let rdr = io::BufReader::new(file);
let rdr = BufReader::new(file);
let mut errs = PartialErrorBuilder::default();
for (i, line) in rdr.lines().enumerate() {
let lineno = (i + 1) as u64;
@@ -419,6 +437,8 @@ impl GitignoreBuilder {
from: Option<PathBuf>,
mut line: &str,
) -> Result<&mut GitignoreBuilder, Error> {
#![allow(deprecated)]
if line.starts_with("#") {
return Ok(self);
}
@@ -429,13 +449,12 @@ impl GitignoreBuilder {
return Ok(self);
}
let mut glob = Glob {
from: from,
from,
original: line.to_string(),
actual: String::new(),
is_whitelist: false,
is_only_dir: false,
};
let mut literal_separator = false;
let mut is_absolute = false;
if line.starts_with("\\!") || line.starts_with("\\#") {
line = &line[1..];
@@ -450,29 +469,26 @@ impl GitignoreBuilder {
// then the glob can only match the beginning of a path
// (relative to the location of gitignore). We achieve this by
// simply banning wildcards from matching /.
literal_separator = true;
line = &line[1..];
is_absolute = true;
}
}
// If it ends with a slash, then this should only match directories,
// but the slash should otherwise not be used while globbing.
if let Some((i, c)) = line.char_indices().rev().nth(0) {
if c == '/' {
glob.is_only_dir = true;
line = &line[..i];
if line.as_bytes().last() == Some(&b'/') {
glob.is_only_dir = true;
line = &line[..line.len() - 1];
// If the slash was escaped, then remove the escape.
// See: https://github.com/BurntSushi/ripgrep/issues/2236
if line.as_bytes().last() == Some(&b'\\') {
line = &line[..line.len() - 1];
}
}
// If there is a literal slash, then we note that so that globbing
// doesn't let wildcards match slashes.
glob.actual = line.to_string();
if is_absolute || line.chars().any(|c| c == '/') {
literal_separator = true;
}
// If there was a slash, then this is a glob that must match the entire
// path name. Otherwise, we should let it match anywhere, so use a **/
// prefix.
if !literal_separator {
// If there is a literal slash, then this is a glob that must match the
// entire path name. Otherwise, we should let it match anywhere, so use
// a **/ prefix.
if !is_absolute && !line.chars().any(|c| c == '/') {
// ... but only if we don't already have a **/ prefix.
if !glob.has_doublestar_prefix() {
glob.actual = format!("**/{}", glob.actual);
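
Note on the trailing-slash handling in the hunk above: the new code strips a final `/`, records that the glob matches directories only, and then drops a backslash that escaped that slash (issue #2236). A minimal standalone sketch of that step follows; `split_dir_suffix` is a hypothetical helper, not a function in the crate.

```rust
/// Returns the pattern without its trailing slash, plus whether the pattern
/// is restricted to directories.
fn split_dir_suffix(mut line: &str) -> (&str, bool) {
    let mut is_only_dir = false;
    if line.as_bytes().last() == Some(&b'/') {
        is_only_dir = true;
        // `/` is a single ASCII byte, so slicing one byte off is safe.
        line = &line[..line.len() - 1];
        // If the slash was escaped, remove the escape as well.
        if line.as_bytes().last() == Some(&b'\\') {
            line = &line[..line.len() - 1];
        }
    }
    (line, is_only_dir)
}
```

Working on bytes avoids the reverse `char_indices` walk used by the old code while still slicing on a valid character boundary.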
@@ -484,18 +500,15 @@ impl GitignoreBuilder {
if glob.actual.ends_with("/**") {
glob.actual = format!("{}/*", glob.actual);
}
let parsed =
GlobBuilder::new(&glob.actual)
.literal_separator(literal_separator)
.case_insensitive(self.case_insensitive)
.backslash_escape(true)
.build()
.map_err(|err| {
Error::Glob {
glob: Some(glob.original.clone()),
err: err.kind().to_string(),
}
})?;
let parsed = GlobBuilder::new(&glob.actual)
.literal_separator(true)
.case_insensitive(self.case_insensitive)
.backslash_escape(true)
.build()
.map_err(|err| Error::Glob {
glob: Some(glob.original.clone()),
err: err.kind().to_string(),
})?;
self.builder.add(parsed);
self.globs.push(glob);
Ok(self)
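
Note on the two hunks above: with `literal_separator(true)` now always set, the only remaining decision is how the pattern text is rewritten before it is compiled. The rewrite can be sketched as a pure function. This is a simplified illustration that deliberately ignores comments, `!` negation, escapes, and trailing slashes; `effective_glob` is a hypothetical name.

```rust
/// Computes the glob text that would be compiled for a gitignore line,
/// under the simplifications stated above.
fn effective_glob(line: &str) -> String {
    // A leading slash anchors the pattern to the gitignore's directory.
    let (is_absolute, line) = match line.strip_prefix('/') {
        Some(rest) => (true, rest),
        None => (false, line),
    };
    let mut actual = line.to_string();
    let has_doublestar_prefix = actual.starts_with("**/") || actual == "**";
    // Without any slash, the pattern may match at any depth.
    if !is_absolute && !line.contains('/') && !has_doublestar_prefix {
        actual = format!("**/{}", actual);
    }
    // A trailing `/**` should match everything inside, not the directory.
    if actual.ends_with("/**") {
        actual = format!("{}/*", actual);
    }
    actual
}
```

Under this sketch `s*.rs` becomes `**/s*.rs`, which is consistent with the `ig42` and `ignot19` tests added later in this diff: the wildcard can no longer cross a `/`, so `src/foo.rs` is not matched.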
@@ -503,12 +516,16 @@ impl GitignoreBuilder {
/// Toggle whether the globs should be matched case insensitively or not.
///
/// When this option is changed, only globs added after the change will be affected.
/// When this option is changed, only globs added after the change will be
/// affected.
///
/// This is disabled by default.
pub fn case_insensitive(
&mut self, yes: bool
&mut self,
yes: bool,
) -> Result<&mut GitignoreBuilder, Error> {
// TODO: This should not return a `Result`. Fix this in the next semver
// release.
self.case_insensitive = yes;
Ok(self)
}
@@ -517,8 +534,8 @@ impl GitignoreBuilder {
/// Return the file path of the current environment's global gitignore file.
///
/// Note that the file path returned may not exist.
fn gitconfig_excludes_path() -> Option<PathBuf> {
// git supports $HOME/.gitconfig and $XDG_CONFIG_DIR/git/config. Notably,
pub fn gitconfig_excludes_path() -> Option<PathBuf> {
// git supports $HOME/.gitconfig and $XDG_CONFIG_HOME/git/config. Notably,
// both can be active at the same time, where $HOME/.gitconfig takes
// precedence. So if $HOME/.gitconfig defines a `core.excludesFile`, then
// we're done.
@@ -542,22 +559,22 @@ fn gitconfig_home_contents() -> Option<Vec<u8>> {
};
let mut file = match File::open(home.join(".gitconfig")) {
Err(_) => return None,
Ok(file) => io::BufReader::new(file),
Ok(file) => BufReader::new(file),
};
let mut contents = vec![];
file.read_to_end(&mut contents).ok().map(|_| contents)
}
/// Returns the file contents of git's global config file, if one exists, in
/// the user's XDG_CONFIG_DIR directory.
/// the user's XDG_CONFIG_HOME directory.
fn gitconfig_xdg_contents() -> Option<Vec<u8>> {
let path = env::var_os("XDG_CONFIG_HOME")
let path = std::env::var_os("XDG_CONFIG_HOME")
.and_then(|x| if x.is_empty() { None } else { Some(PathBuf::from(x)) })
.or_else(|| home_dir().map(|p| p.join(".config")))
.map(|x| x.join("git/config"));
let mut file = match path.and_then(|p| File::open(p).ok()) {
None => return None,
Some(file) => io::BufReader::new(file),
Some(file) => BufReader::new(file),
};
let mut contents = vec![];
file.read_to_end(&mut contents).ok().map(|_| contents)
@@ -567,7 +584,7 @@ fn gitconfig_xdg_contents() -> Option<Vec<u8>> {
///
/// Specifically, this respects XDG_CONFIG_HOME.
fn excludes_file_default() -> Option<PathBuf> {
env::var_os("XDG_CONFIG_HOME")
std::env::var_os("XDG_CONFIG_HOME")
.and_then(|x| if x.is_empty() { None } else { Some(PathBuf::from(x)) })
.or_else(|| home_dir().map(|p| p.join(".config")))
.map(|x| x.join("git/ignore"))
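
Note on the fallback chain above: an unset or empty `XDG_CONFIG_HOME` falls back to `$HOME/.config`, and `git/ignore` is appended in either case. The same chain can be sketched with the environment passed in as arguments, which makes it testable without touching process state. `default_excludes_path` and its parameters are hypothetical.

```rust
use std::ffi::OsString;
use std::path::PathBuf;

/// Mirrors the lookup order of `excludes_file_default`, with the values of
/// `XDG_CONFIG_HOME` and the home directory supplied by the caller.
fn default_excludes_path(
    xdg_config_home: Option<OsString>,
    home: Option<PathBuf>,
) -> Option<PathBuf> {
    xdg_config_home
        // An empty variable is treated the same as an unset one.
        .and_then(|x| if x.is_empty() { None } else { Some(PathBuf::from(x)) })
        .or_else(|| home.map(|p| p.join(".config")))
        .map(|x| x.join("git/ignore"))
}
```

If neither value is available the function returns `None`, matching the `Option<PathBuf>` return type in the diff.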
@@ -576,19 +593,28 @@ fn excludes_file_default() -> Option<PathBuf> {
/// Extract git's `core.excludesfile` config setting from the raw file contents
/// given.
fn parse_excludes_file(data: &[u8]) -> Option<PathBuf> {
use std::sync::OnceLock;
use regex_automata::{meta::Regex, util::syntax};
// N.B. This is the lazy approach, and isn't technically correct, but
// probably works in more circumstances. I guess we would ideally have
// a full INI parser. Yuck.
lazy_static! {
static ref RE: Regex = Regex::new(
r"(?im)^\s*excludesfile\s*=\s*(.+)\s*$"
).unwrap();
};
let caps = match RE.captures(data) {
None => return None,
Some(caps) => caps,
};
str::from_utf8(&caps[1]).ok().map(|s| PathBuf::from(expand_tilde(s)))
static RE: OnceLock<Regex> = OnceLock::new();
let re = RE.get_or_init(|| {
Regex::builder()
.configure(Regex::config().utf8_empty(false))
.syntax(syntax::Config::new().utf8(false))
.build(r#"(?im-u)^\s*excludesfile\s*=\s*"?\s*(\S+?)\s*"?\s*$"#)
.unwrap()
});
// We don't care about amortizing allocs here I think. This should only
// be called ~once per traversal or so? (Although it's not guaranteed...)
let mut caps = re.create_captures();
re.captures(data, &mut caps);
let span = caps.get_group(1)?;
let candidate = &data[span];
std::str::from_utf8(candidate).ok().map(|s| PathBuf::from(expand_tilde(s)))
}
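
Note on the hunk above: the `lazy_static!` block is replaced by a function-local `static` of type `std::sync::OnceLock`, which runs its initializer at most once. The pattern can be sketched with the standard library only. The keyword list and the `INIT_CALLS` counter are hypothetical and exist only to make the once-only behavior observable; the real code stores a compiled regex instead.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Counts how often the initializer below actually runs.
static INIT_CALLS: AtomicUsize = AtomicUsize::new(0);

/// Returns a lazily built, process-wide value.
fn keywords() -> &'static Vec<&'static str> {
    static KEYWORDS: OnceLock<Vec<&'static str>> = OnceLock::new();
    KEYWORDS.get_or_init(|| {
        // Stand-in for expensive setup such as compiling a regex.
        INIT_CALLS.fetch_add(1, Ordering::SeqCst);
        vec!["excludesfile"]
    })
}
```

Keeping the `static` inside the function limits its visibility to the one place that needs it and removes the macro dependency.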
/// Expands ~ in file paths to the value of $HOME.
@@ -602,16 +628,16 @@ fn expand_tilde(path: &str) -> String {
/// Returns the location of the user's home directory.
fn home_dir() -> Option<PathBuf> {
// We're fine with using env::home_dir for now. Its bugs are, IMO, pretty
// minor corner cases. We should still probably eventually migrate to
// the `dirs` crate to get a proper implementation.
// We're fine with using std::env::home_dir for now. Its bugs are, IMO,
// pretty minor corner cases.
#![allow(deprecated)]
env::home_dir()
std::env::home_dir()
}
#[cfg(test)]
mod tests {
use std::path::Path;
use super::{Gitignore, GitignoreBuilder};
fn gi_from_str<P: AsRef<Path>>(root: P, s: &str) -> Gitignore {
@@ -689,6 +715,9 @@ mod tests {
ignored!(ig39, ROOT, "\\?", "?");
ignored!(ig40, ROOT, "\\*", "*");
ignored!(ig41, ROOT, "\\a", "a");
ignored!(ig42, ROOT, "s*.rs", "sfoo.rs");
ignored!(ig43, ROOT, "**", "foo.rs");
ignored!(ig44, ROOT, "**/**/*", "a/foo.rs");
not_ignored!(ignot1, ROOT, "amonths", "months");
not_ignored!(ignot2, ROOT, "monthsa", "months");
@@ -704,12 +733,16 @@ mod tests {
not_ignored!(ignot12, ROOT, "\n\n\n", "foo");
not_ignored!(ignot13, ROOT, "foo/**", "foo", true);
not_ignored!(
ignot14, "./third_party/protobuf", "m4/ltoptions.m4",
"./third_party/protobuf/csharp/src/packages/repositories.config");
ignot14,
"./third_party/protobuf",
"m4/ltoptions.m4",
"./third_party/protobuf/csharp/src/packages/repositories.config"
);
not_ignored!(ignot15, ROOT, "!/bar", "foo/bar");
not_ignored!(ignot16, ROOT, "*\n!**/", "foo", true);
not_ignored!(ignot17, ROOT, "src/*.rs", "src/grep/src/main.rs");
not_ignored!(ignot18, ROOT, "path1/*", "path2/path1/foo");
not_ignored!(ignot19, ROOT, "s*.rs", "src/foo.rs");
fn bytes(s: &str) -> Vec<u8> {
s.to_string().into_bytes()
@@ -739,6 +772,22 @@ mod tests {
assert!(super::parse_excludes_file(&data).is_none());
}
#[test]
fn parse_excludes_file4() {
let data = bytes("[core]\nexcludesFile = \"~/foo/bar\"");
let got = super::parse_excludes_file(&data);
assert_eq!(
path_string(got.unwrap()),
super::expand_tilde("~/foo/bar")
);
}
#[test]
fn parse_excludes_file5() {
let data = bytes("[core]\nexcludesFile = \" \"~/foo/bar \" \"");
assert!(super::parse_excludes_file(&data).is_none());
}
// See: https://github.com/BurntSushi/ripgrep/issues/106
#[test]
fn regression_106() {
@@ -748,9 +797,12 @@ mod tests {
#[test]
fn case_insensitive() {
let gi = GitignoreBuilder::new(ROOT)
.case_insensitive(true).unwrap()
.add_str(None, "*.html").unwrap()
.build().unwrap();
.case_insensitive(true)
.unwrap()
.add_str(None, "*.html")
.unwrap()
.build()
.unwrap();
assert!(gi.matched("foo.html", false).is_ignore());
assert!(gi.matched("foo.HTML", false).is_ignore());
assert!(!gi.matched("foo.htm", false).is_ignore());

Some files were not shown because too many files have changed in this diff