When searching in parallel with many more arguments than threads, the
first arguments are searched last -- unlike in the -j1 case.
This is unexpected for users who know about the parallel nature of rg
and think they can give the scheduler a hint by positioning larger
input files (L1, L2, ..) before smaller ones (█, ██). Instead, this can
result in sub-optimal thread usage and thus longer runtime (simplified
example with 2 threads):
T1: █ ██ █ █ █ █ ██ █ █ █ █ █ ██ ╠═════════════L1════════════╣
T2: █ █ ██ █ █ ██ █ █ █ ██ █ █ ╠═════L2════╣
                                       ┏━━━━┳━━━━┳━━━━┳━━━━┓
This is caused by assigning work to    ┃ T1 ┃ T2 ┃ T3 ┃ T4 ┃
per-thread stacks in a round-robin     ┡━━━━╇━━━━╇━━━━╇━━━━┩
manner, starting here                → │ L1 │ L2 │ L3 │ L4 │ ↵
                                       ├────┼────┼────┼────┤
                                       │ s5 │ s6 │ s7 │ s8 │ ↵
                                       ├────┼────┼────┼────┤
                                       ╷ .. ╷ .. ╷ .. ╷ .. ╷
                                       ├────┼────┼────┼────┤
                                       │ st │ su │ sv │ sw │ ↵
                                       ├────┼────┼────┼────┘
                                       │ sx │ sy │ sz │
                                       └────┴────┴────┘
and then processing them bottom-up:      ↥    ↥    ↥    ↥

                                       ╷ .. ╷ .. ╷ .. ╷ .. ╷
This patch reverses the input order    ├────┼────┼────┼────┤
so the two reversals cancel each other │ s7 │ s6 │ s5 │ L4 │ ↵
out. Now at least the first N          ├────┼────┼────┼────┘
arguments, N=number-of-threads, are    │ L3 │ L2 │ L1 │
processed before any others (then      └────┴────┴────┘
work-stealing may happen):
T1: ╠═════════════L1════════════╣ █ ██ █ █ █ █ █ █ ██
T2: ╠═════L2════╣ █ █ ██ █ █ ██ █ █ █ ██ █ █ ██ █ █ █
(With some more shuffling T1 could always be assigned L1 etc., but
that would mostly be for optics).
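
The effect of the two reversals can be reproduced with a small simulation. This is only a sketch under simplifying assumptions: `assign_round_robin` and `first_popped` are hypothetical helpers, not functions from the ripgrep code base, and work-stealing is ignored entirely.

```rust
/// Distribute `items` round-robin over `threads` stacks (hypothetical helper).
fn assign_round_robin<T>(items: impl IntoIterator<Item = T>, threads: usize) -> Vec<Vec<T>> {
    let mut stacks: Vec<Vec<T>> = (0..threads).map(|_| Vec::new()).collect();
    for (i, item) in items.into_iter().enumerate() {
        stacks[i % threads].push(item);
    }
    stacks
}

/// The item each thread would pop first, i.e. the top of its stack.
fn first_popped<T: Clone>(stacks: &[Vec<T>]) -> Vec<Option<T>> {
    stacks.iter().map(|s| s.last().cloned()).collect()
}

fn main() {
    // Two large inputs given first, followed by small ones.
    let args = vec!["L1", "L2", "s3", "s4", "s5", "s6", "s7"];

    // Arguments pushed in the given order: the large inputs end up at the
    // bottom of the stacks and are therefore searched last.
    let old = assign_round_robin(args.iter().copied(), 2);
    println!("in order: {:?}", first_popped(&old));

    // Arguments pushed in reverse order: the large inputs end up on top.
    let new = assign_round_robin(args.iter().rev().copied(), 2);
    println!("reversed: {:?}", first_popped(&new));
}
```

Which thread receives which of the first N arguments depends on the argument count modulo the thread count, which is the "more shuffling" caveat mentioned above.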
Closes #2849
The *BSD build systems make use of "Makefile.inc" a lot. Make the
"make" type recognize this file by default. And more generally,
`Makefile.*` seems to be a convention, so just generalize it.
Closes #2846
This makes it so the presence of `.jj` will cause ripgrep to treat it
as a VCS directory, just as if `.git` were present. This is useful for
ripgrep's default behavior when working with jj repositories that don't
have a `.git` but do have `.gitignore`. Namely, ripgrep requires the
presence of a VCS repository in order to respect `.gitignore`.
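
As a rough illustration of the idea (not the actual implementation in the `ignore` crate), a VCS root check that accepts either marker might look like the sketch below. `VCS_MARKERS` and `find_vcs_root` are hypothetical names, and the existence check is injected as a closure so the logic can be exercised without touching the filesystem.

```rust
use std::path::{Path, PathBuf};

/// Directory names treated as VCS markers in this sketch.
const VCS_MARKERS: &[&str] = &[".git", ".jj"];

/// Walk up from `start` and return the first ancestor that contains a marker.
/// `exists` decides whether a candidate marker path is present.
fn find_vcs_root(start: &Path, exists: impl Fn(&Path) -> bool) -> Option<PathBuf> {
    start
        .ancestors()
        .find(|dir| VCS_MARKERS.iter().any(|m| exists(&dir.join(m))))
        .map(Path::to_path_buf)
}

fn main() {
    let cwd = std::env::current_dir().expect("current directory is accessible");
    match find_vcs_root(&cwd, |p| p.exists()) {
        Some(root) => println!("VCS root: {}", root.display()),
        None => println!("no VCS root found; .gitignore would be ignored by default"),
    }
}
```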
We don't handle clone-specific exclude rules for jj repositories without
`.git` though. It doesn't seem to be settled yet where those can be
found[1].
Closes #2842
[1]: https://github.com/BurntSushi/ripgrep/pull/2842#discussion_r2020076722
The previous code deleted too many parts of the path when constructing
the absolute path, resulting in a shortened final path. This patch
creates the correct absolute path by only removing the necessary parts.
Fixes #829, Fixes #2731, Fixes #2747, Fixes #2778, Fixes #2836, Fixes #2933
Closes #2933
The fish completions now consult the configuration file, and not just
the current command line, to determine whether to suggest negation
options.
This doesn't cover all edge cases. For example the config file is
cached, and so changes may not take effect until the next shell
session. But the cases it doesn't cover are hopefully very rare.
Closes #2708
This feature causes nothing but problems and is frequently broken. The
only optimizations it was enabling were SIMD optimizations for
transcoding. In particular, for UTF-16 transcoding. This is performed by
the [`encoding_rs`](https://github.com/hsivonen/encoding_rs) crate,
which specifically uses unstable portable SIMD APIs instead of the
stable non-portable SIMD APIs.
SIMD optimizations that apply to search have long been making use of
stable APIs, and are automatically enabled when your target supports
them. This is, IMO, the correct user experience and one that
`encoding_rs` refuses to support. I'm done dealing with it, so
transcoding will only use scalar code until the SIMD optimizations in
`encoding_rs` work on stable. (This doesn't mean that `encoding_rs` has
to change. This could also be fixed by stabilizing `std::simd`.)
Fixes #2748
It looks like there is a reference cycle caused by the compiled
matchers (compiled HashMap holds ref to Ignore and Ignore holds ref
to HashMap). Using weak refs fixes issue #2690 in my test project.
Also confirmed via before and after when profiling the code, see the
attached screenshots in #2692.
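
The general pattern can be sketched as follows. This is a minimal illustration and not the actual types from the `ignore` crate: `Node`, `Cache`, `new_node` and `lookup` are hypothetical stand-ins. The cache stores `Weak` handles, so a node that holds the cache is not kept alive by it.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock, Weak};

/// Shared cache of compiled matchers, keyed by directory name.
/// Storing `Weak` here is what breaks the node -> cache -> node cycle.
type Cache = Arc<RwLock<HashMap<String, Weak<Node>>>>;

/// Hypothetical stand-in for a per-directory matcher that holds the cache.
struct Node {
    name: String,
    cache: Cache,
}

/// Create a node and register a weak handle to it in the cache.
fn new_node(name: &str, cache: &Cache) -> Arc<Node> {
    let node = Arc::new(Node { name: name.to_string(), cache: Arc::clone(cache) });
    cache.write().unwrap().insert(name.to_string(), Arc::downgrade(&node));
    node
}

/// Look up a node; returns `None` once every strong reference is gone.
fn lookup(cache: &Cache, name: &str) -> Option<Arc<Node>> {
    cache.read().unwrap().get(name).and_then(Weak::upgrade)
}

fn main() {
    let cache: Cache = Arc::new(RwLock::new(HashMap::new()));
    let node = new_node("src", &cache);
    println!("{} cached: {}", node.name, lookup(&node.cache, "src").is_some());

    // Dropping the last strong reference frees the node even though the
    // cache still has an entry for it.
    drop(node);
    println!("cached after drop: {}", lookup(&cache, "src").is_some());
}
```

Had the map stored `Arc<Node>` instead, the node and the cache would keep each other alive and neither would ever be freed.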
Fixes #2690
- Stop using `-n __fish_use_subcommand`. This had the effect of
  ignoring options if a positional argument has already been given, but
  that's not how ripgrep works.
- Only suggest negation options if the option they're negating is
  passed (e.g., only complete `--no-pcre2` if `--pcre2` is present). The
  zsh completions already do this.
- Take into account whether an option takes an argument. If an option
  is not a switch then it won't suggest further options until the
  argument is given, e.g. `-C<tab>` won't suggest options but `-i<tab>`
  will.
- Suggest correct arguments for options. We already completed a fixed
  set of choices where available, but now we go further:
  - Filenames are only suggested for options that take filenames.
  - `--pre` and `--hostname-bin` suggest binaries from `$PATH`.
  - `-t`/`--type`/&c use `--type-list` for suggestions, like in zsh,
    with a preview of the glob patterns.
  - `--encoding` uses a hardcoded list extracted from the zsh
    completions. This has been refactored into a separate file, and the
    range globs (`{1..5}`) replaced by comma globs (`{1,2,3,4,5}`) since
    those work in both shells. I verified that this produces the same
    list as before in zsh, and the same list in fish (albeit in a
    different order).
PR #2684
This is an embarrassing oversight. A `todo!()` actually made its way
into a release! Oof.
This was working in ripgrep 13, but I had redone some aspects of sorting
and this just got left undone.
Fixes #2664
As the FIXME comment says, ripgrep is not yet using the new line
terminator option in regex-automata exposed for exactly this purpose.
Because of that, line anchors like `(?m:^)` and `(?m:$)` will only match
`\n` as a line terminator. This means that when --null-data is used in
combination with --line-regexp, the anchors inserted by --line-regexp
will not match correctly. This is only a big deal in the "fast" path,
which requires the regex engine to deal with line terminators itself
correctly. The slow path strips line terminators regardless of what they
are, and so the line anchors can match (begin/end of haystack).
Fixes #2658
And also, negated options don't take arguments.
Specifically, the fish completion generator currently forgets to add
`-l` to negation options, leading to a list of these errors:
    complete: too many arguments
    ~/.config/fish/completions/rg.fish (line 146):
    complete -c rg -n '__fish_use_subcommand' no-sort-files -d '(DEPRECATED) Sort results by file path.'
                                              ^
    from sourcing file ~/.config/fish/completions/rg.fish
    (Type 'help complete' for related documentation)
To reproduce, run `fish -c 'rg --generate=complete-fish | source'`.
It also potentially suggests a list of choices for negation options,
even though those never take arguments. That case doesn't occur with
any of the current options but it's an easy fix.
Fixes #2659, Closes #2655
Basically, unless the -a/--text flag is given, it is almost always an
error to search for an explicit NUL byte because the binary detection
will prevent it from matching.
Fixes #1838
The --vimgrep flag has some severe footguns when using a pattern that
matches very frequently. We had already written some docs to warn about
that, but now we also include a suggestion to avoid exorbitant heap
usage.
Closes #2505
This adds info about whether PCRE2 is available or not to the output of
--version. Essentially, --version now subsumes --pcre2-version, although
we do retain the latter because it (usefully) emits an exit code based
on whether PCRE2 is available or not.
Closes #2645
Previously, we were applying the -M/--max-columns flag *before* trimming
prefix ASCII whitespace. But this doesn't make a whole lot of sense. We
should be trimming first, because the result of trimming is ultimately
what we'll be printing and that's what -M/--max-columns should be
applied to.
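
The corrected order of operations can be sketched like this. It is a simplified illustration rather than the printer's actual code: `printable` is a hypothetical helper, the limit is assumed to be counted in bytes, and an over-long line is simply reported as `None` here.

```rust
/// Trim leading ASCII whitespace first, then apply the column limit to
/// what remains. Returns `None` if the trimmed line still exceeds the limit.
fn printable(line: &[u8], max_columns: Option<usize>) -> Option<&[u8]> {
    // Index of the first non-whitespace byte, or the end if there is none.
    let start = line
        .iter()
        .position(|b| !b.is_ascii_whitespace())
        .unwrap_or(line.len());
    let trimmed = &line[start..];
    match max_columns {
        Some(max) if trimmed.len() > max => None,
        _ => Some(trimmed),
    }
}

fn main() {
    // Eleven bytes untrimmed, but only three after trimming, so a limit
    // of five no longer suppresses the line.
    let line = b"        foo";
    match printable(line, Some(5)) {
        Some(out) => println!("{}", String::from_utf8_lossy(out)),
        None => println!("[line omitted]"),
    }
}
```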
Fixes #2458
When one does not provide any paths to ripgrep to search, it has to
guess between searching stdin and the current working directory. It is
possible for this guess to be wrong, and having the heuristics and the
choice in the debug logs is useful for diagnosing this.
The failure mode here is still pretty bad because you need to know to
reach for the `--debug` flag in the first place. Namely, the typical
failure mode is that ripgrep tries to search stdin while the intent is
for it to search the current working directory, and thus likely blocks
forever waiting for data on stdin.
(Arguably this is a problem with the process architecture that invokes
ripgrep. It shouldn't give ripgrep an open stdin handle that isn't
closed.)
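
A heavily simplified sketch of such a guess, with the inputs and the outcome logged, is shown below. The real heuristics are more involved. `Target` and `guess_target` are hypothetical names, and the log wording is made up for illustration.

```rust
use std::io::IsTerminal;

/// What to search when no paths are given.
#[derive(Debug, PartialEq)]
enum Target {
    Stdin,
    CurrentDir,
}

/// Guess the search target from two observations about stdin and, when
/// `debug` is set, log both the observations and the decision.
fn guess_target(stdin_is_terminal: bool, stdin_is_readable: bool, debug: bool) -> Target {
    let target = if !stdin_is_terminal && stdin_is_readable {
        Target::Stdin
    } else {
        Target::CurrentDir
    };
    if debug {
        eprintln!(
            "DEBUG: stdin is terminal: {stdin_is_terminal}, \
             stdin is readable: {stdin_is_readable}, chose {target:?}"
        );
    }
    target
}

fn main() {
    let is_tty = std::io::stdin().is_terminal();
    // Assumption for this sketch: any non-terminal stdin counts as readable.
    let target = guess_target(is_tty, !is_tty, true);
    println!("{target:?}");
}
```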
Closes #2524
Previously, every worker would increment the shared num_pending count on
every new work item, and decrement it after finishing the item, leading to
lots of contention. Now, we only track the number of workers actively
running, so there is no contention except when workers go to sleep or
wake up.
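
The bookkeeping can be sketched as below. This is an illustration of the counting scheme only, not the walker's actual code: `ActiveWorkers` and its methods are hypothetical names, and the sleeping and waking machinery is left out.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Tracks how many workers are currently running. The counter is touched
/// only when a worker goes to sleep or wakes up, never per work item.
struct ActiveWorkers {
    active: AtomicUsize,
}

impl ActiveWorkers {
    /// All workers start out active.
    fn new(workers: usize) -> Self {
        ActiveWorkers { active: AtomicUsize::new(workers) }
    }

    /// Called when a worker runs out of work and is about to sleep.
    /// Returns true if it was the last active worker, meaning no one is
    /// left to produce more work.
    fn deactivate(&self) -> bool {
        self.active.fetch_sub(1, Ordering::AcqRel) == 1
    }

    /// Called when a sleeping worker picks up work again.
    fn activate(&self) {
        self.active.fetch_add(1, Ordering::AcqRel);
    }
}

fn main() {
    let workers = ActiveWorkers::new(2);
    // First worker runs dry while the second is still busy.
    println!("done after first sleep: {}", workers.deactivate());
    // It wakes up again after the second worker produced more work.
    workers.activate();
    // Both workers eventually run dry; the second to sleep sees `true`.
    workers.deactivate();
    println!("done after last sleep: {}", workers.deactivate());
}
```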
Closes #2642
This actually just kind of fell out of the migration off of Clap as a
result of treating `-p/--pretty` more rigorously as an alias for
`--line-number --heading --color always`.
Fixes #2381, Closes #2637