This fixes a bug only present on Windows that would permit someoen to
execute an arbitrary program if they crafted an appropriate directory
tree. Namely, if someone put an executable named 'xz.exe' in the root of
a directory tree and one ran 'rg -z foo' from the root of that tree,
then the 'xz.exe' executable in that tree would execute if there are any
'xz' files anywhere in the tree.
The root cause of this problem is that 'CreateProcess' on Windows will
implicitly look in the current working directory for an executable when
it is given a relative path to a program. Rust's standard library allows
this behavior to occur, so we work around it here. We work around it by
explicitly resolving programs like 'xz' via 'PATH'. That way, we only
ever pass an absolute path to 'CreateProcess', which avoids the implicit
behavior of checking the current working directory.
This fix doesn't apply to non-Windows systems as it is believed to only
impact Windows. In theory, the bug could apply on Unix if '.' is in
one's PATH, but at that point, you reap what you sow.
While the extent to which this is a security problem isn't clear, I
think users generally expect to be able to download or clone
repositories from the Internet and run ripgrep on them without fear of
anything too awful happening. Being able to execute an arbitrary program
probably violates that expectation. Therefore, CVE-2021-3013[1] was
created for this issue.
We apply the same logic to the --pre command, since the --pre command is
likely in a user's config file and it would be surprising for something
that the user is searching to modify which preprocessor command is used.
The --pre and -z/--search-zip flags are the only two ways that ripgrep
will invoke external programs, so this should cover any possible
exploitable cases of this bug.
[1] - https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-3013
It seems that PowerShell uses sockets instead of FIFOs to redirect the
output between commands. So add `is_socket` to our `is_readable_stdin`
check.
This seems unlikely to cause problems and it probably more generally
correct than what we had before. In theory, it could cause problems if
it produces false positives, in which case, ripgrep will try to read
stdin when it should search the current working directory. (And this
usually winds up manifesting as ripgrep blocking forever.) But, if the
stdin handle reports itself as a socket, then it seems like we should
read it.
Fixes#1741, Closes#1742
`ignore::Error` wraps `std::io::Error` with additional information
(as well as expose non-IO errors). For people wanting to inspect what
the error is, they have to recursively match the Enum. This provides
`io_error` and `into_io_error` helpers to do this for the user.
PR #1740
This comes up as a corner case where folks provide -e/--regexp in a
configuration file and then expect to be able to run 'rg' with no args.
However, ripgrep fails because it still expects at least one pattern
even though one was specified in the config file.
This occurs because ripgrep has to parse its CLI parameters before
reading the config file. (For log output settings and to handle the
--no-config flag.) This initial parse will fail if there are no patterns
specified.
The only way to solve this that I can see is to somehow relax the
requirements of the initial parse. But this is problematic because we
would still need to enforce those requirements in cases where we don't
do a second parse (when no config file is present).
All in all, this doesn't seem like a problem that is worth solving.
Closes#1730
This brings in all other semver updates.
This did require updating some tests, since bstr changed its debug
output for NUL bytes to be a bit more idiomatic.
This updates encoding_rs, crossbeam-utils and crossbeam-channel. This
serves two purposes. The encoding_rs update fixes a compilation failure
on the latest nightly. The crossbeam updates are good sense and to
reduce duplicate dependencies such as cfg-if. (Although, we note that
the log crate still pulls in cfg-if 0.1, so ripgrep has a duplicate
dependency there for now. But it's very small.)
Fixes#1721, Closes#1705
Bazel supports `BUILD.bazel` as well as `WORKSPACE.bazel`. In
addition, it is common to ship BUILD/WORKSPACE templates for
external repositories suffixed with .bazel for easier tool
recognition.
Co-authored-by: Brandon Adams <brandon.adams@imc.com>
PR #1716
Since the translation from a glob to a regex always
disables Unicode in the regex, it follows that we shouldn't
need regex's Unicode features enabled.
Now, ripgrep enables Unicode features in its regex
dependency and of course uses them, which will cause
globset to have it enabled in the ripgrep build as well. So
this doesn't actually change anything for ripgrep. But this
does slim thing downs for folks using globset independently
of ripgrep.
PR #1712
It's not quite clear why I added this originally. ripgrep doesn't have
its `-a` flag enabled. It's possible I tricked myself into adding it
because ripgrep's binary detection has evolved to be more like GNU
grep's nowadays.
In any case, using `-a` on data that is non-binary can only improve
performance because it removes the overhead for checking whether the
data is binary or not. So this was giving an artificial boost to GNU
grep.
None of these tools got particularly popular (except for pt briefly),
but they do not appear to be active projects nowadays. While ucg was
fast, sift and pt were ecscruiating slow in a number of cases that
required special care in the benchmarks.
This also fixes the ordering of benchmark output to reflect the ordering
in the source of the benchsuite script.
Since the English subtitle file actually changed its content, we tweak
the benchmark to use a slightly bigger sample that more closely matches
the file size of the Russian subtitle file.
Also, the BurntSushi/linux repo has been updated and I've confirmed that
it builds on my Linux machine.
Fixes#1257
This should bring a compilation time improvement when building static
buils of PCRE2 by enabling parallelism for C compilation.
Kudos to @JoshTriplett for the tip!