search: add support for searching compressed files

This commit adds opt-in support for searching compressed files during recursive search. This behavior is only enabled when the `-z/--search-zip` flag is passed to ripgrep. When enabled, a limited set of common compression formats is recognized by file extension, and a new process is spawned to perform the decompression. ripgrep then searches the stdout of that spawned process.

Closes #539
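
The approach described above (recognize the format by extension, spawn a decompression process, search its stdout) can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the function names `decompression_command`, `spawn_decompressor` and `search_compressed` are hypothetical, the extension-to-command table is an assumption, and the plain substring match stands in for ripgrep's actual regex searcher. It also assumes that `gzip`, `bzip2` and `xz` are installed and on `PATH`.

```rust
use std::io::{self, BufRead, BufReader};
use std::path::Path;
use std::process::{Child, Command, Stdio};

/// Hypothetical helper: map a file extension to a program and arguments that
/// write the decompressed bytes to stdout. The table is an assumption made
/// for illustration, not the commit's actual list.
fn decompression_command(path: &Path) -> Option<(&'static str, Vec<&'static str>)> {
    match path.extension()?.to_str()? {
        "gz" => Some(("gzip", vec!["-d", "-c"])),
        "bz2" => Some(("bzip2", vec!["-d", "-c"])),
        "xz" => Some(("xz", vec!["-d", "-c"])),
        "lzma" => Some(("xz", vec!["--format=lzma", "-d", "-c"])),
        // Unrecognized extension: the caller should search the file as-is.
        _ => None,
    }
}

/// Hypothetical helper: spawn the decompressor with its stdout piped back to
/// us. Returns `Ok(None)` when the extension is not recognized.
fn spawn_decompressor(path: &Path) -> io::Result<Option<Child>> {
    let (program, args) = match decompression_command(path) {
        Some(cmd) => cmd,
        None => return Ok(None),
    };
    let child = Command::new(program)
        .args(args)
        .arg(path)
        .stdin(Stdio::null())
        .stdout(Stdio::piped())
        .stderr(Stdio::null())
        .spawn()?;
    Ok(Some(child))
}

/// Hypothetical stand-in for the search step: collect every decompressed
/// line that contains `needle`.
fn search_compressed(path: &Path, needle: &str) -> io::Result<Vec<String>> {
    let mut child = match spawn_decompressor(path)? {
        Some(child) => child,
        None => return Ok(Vec::new()),
    };
    let stdout = child.stdout.take().expect("stdout was configured as piped");
    let mut matches = Vec::new();
    for line in BufReader::new(stdout).lines() {
        let line = line?;
        if line.contains(needle) {
            matches.push(line);
        }
    }
    // Reap the child process once its output has been consumed.
    child.wait()?;
    Ok(matches)
}
```

Delegating to an external process keeps decompression libraries out of the binary, at the cost of requiring the tools to be installed and paying a process spawn per compressed file.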
2025-07-31 20:21:59 -07:00 · 2018-01-07 21:35:58 +05:30
parent a8543f798d
commit f007f940c5
18 changed files with 373 additions and 24 deletions
--- a/README.md
+++ b/README.md
@@ -91,6 +91,8 @@ increases the times to `2.640s` for ripgrep and `10.277s` for GNU grep.
  as UTF-16, latin-1, GBK, EUC-JP, Shift_JIS and more. (Some support for
  automatically detecting UTF-16 is provided. Other text encodings must be
  specifically specified with the `-E/--encoding` flag.)
+* `ripgrep` supports searching files compressed in a common format (currently
+  gzip, xz, lzma or bzip2) with the `-z/--search-zip` flag.

 In other words, use `ripgrep` if you like speed, filtering by default, fewer
 bugs, and Unicode support.
@@ -109,12 +111,10 @@ give you a glimpse at some important downsides or missing features of
  support for Unicode categories (e.g., `\p{Sc}` to match currency symbols or
  `\p{Lu}` to match any uppercase letter). (Fancier regexes will never be
  supported.)
-* `ripgrep` doesn't yet support searching compressed files. (Likely to be
-  supported in the future.)
 * `ripgrep` doesn't have multiline search. (Unlikely to ever be supported.)

-In other words, if you like fancy regexes, searching compressed files or
-multiline search, then `ripgrep` may not quite meet your needs (yet).
+In other words, if you like fancy regexes or multiline search, then `ripgrep`
+may not quite meet your needs (yet).

 ### Feature comparison