Compare commits

...

95 Commits

Author SHA1 Message Date
Andrew Gallant
4b88e08f41 search: migrate to bstr
This is an initial attempt at migrating grep-searcher to use the new
bstr crate (not yet published).

This is mostly an improvement, although a significant problem is that
the grep-matcher crate controls the `Index` impls for the `Match` type,
which we use quite heavily. Thus, in order to impl `Index` for `BStr`,
we need to add bstr as a public dependency to grep-matcher. This is really
bad news because grep-matcher is supposed to be a lightweight core
crate that defines a matcher interface, which is itself intended to be a
public dependency. Thus, a semver bump on bstr will have very
undesirable ripple effects throughout ripgrep's library crates.

This would be something we could stomach if bstr was solid at 1.0 and
committed to avoiding breaking changes. But it's not there yet.
2019-01-20 12:32:09 -05:00
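
A minimal sketch of the coupling described above, with a stand-in `Match` type: the orphan rule only permits an `Index<Match>` impl for a foreign type in the crate that owns `Match`, so the analogous impl for bstr's `BStr` would have to live in grep-matcher, making bstr a public dependency.

```rust
use std::ops::Index;

// Stand-in for grep-matcher's Match type (byte offsets of a match).
#[derive(Clone, Copy, Debug)]
pub struct Match {
    start: usize,
    end: usize,
}

// Legal here because `Match` is local, even though `Index` and `[u8]` are
// foreign. Writing the same impl for bstr's `BStr` would force this crate
// to depend on bstr and expose it in its public API.
impl Index<Match> for [u8] {
    type Output = [u8];

    fn index(&self, m: Match) -> &[u8] {
        &self[m.start..m.end]
    }
}

fn main() {
    let haystack = b"hello world";
    let m = Match { start: 6, end: 11 };
    assert_eq!(&haystack[..][m], &b"world"[..]);
}
```
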
Andrew Gallant
7cbc535d70 edition: fix build.rs 2019-01-19 10:46:57 -05:00
Andrew Gallant
7a6a40bae1 edition: move core ripgrep to Rust 2018 2019-01-19 10:44:30 -05:00
Andrew Gallant
1e9ee2cc85 deps: update memmap 2019-01-19 10:44:30 -05:00
Andrew Gallant
968491f8e9 deps: update to bytecount 0.5
bytecount now uses runtime dispatch for enabling SIMD, which means we no
longer need the avx-accel features. We remove the feature from ripgrep
itself, since the next release will be a minor version bump, but leave it
as a no-op in the crates that previously used it.
2019-01-19 10:44:30 -05:00
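
For illustration, counting line terminators with bytecount 0.5 needs no compile-time feature at all; SIMD support is detected at runtime. A minimal sketch, assuming `bytecount = "0.5"` in Cargo.toml:

```rust
fn main() {
    let haystack = b"one\ntwo\nthree\n";
    // SIMD acceleration, when available, is chosen at runtime.
    let lines = bytecount::count(haystack, b'\n');
    assert_eq!(lines, 3);
}
```
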
Andrew Gallant
63b0f31a22 deps: update various dependencies
We also increase the MSRV to 1.32, the current stable release, which sets
the stage for migrating to Rust 2018.
2019-01-19 10:44:30 -05:00
P M
7ecee299a5 ignore/types: add QML
PR #1165
2019-01-18 06:48:47 -05:00
David Håsäther
dd396ff34e doc: fix typo
PR #1161
2019-01-14 06:50:30 -05:00
Andrew Gallant
fb0a82f3c3 grep-printer: add macro docs, redux 2019-01-11 09:18:09 -05:00
Andrew Gallant
dbc8ca9cc1 grep-searcher: add docs for assert_eq_printed
Looks like the deny(missing_docs) lint got a bit stronger.
2019-01-11 09:03:00 -05:00
Marco Hinz
c3db8db93d doc: fix typo 2019-01-05 11:18:05 -05:00
Andrew Gallant
17ef4c40f3 ignore-0.4.6 2018-12-30 08:46:09 -05:00
Andrew Gallant
a9e0477ea8 ignore: permit use of deprecated trim_right 2018-12-30 08:44:59 -05:00
Andrew Gallant
b3c5773266 deps: bump ignore 2018-12-30 08:43:18 -05:00
Andrew Gallant
118b950085 ignore-0.4.5 2018-12-15 08:44:10 -05:00
Andrew Gallant
b45b2f58ea deps: update most other dependencies
This commit is the result of doing:

  $ cargo update
  $ cargo update -p encoding_rs --precise 0.8.10

where the latter line prevents encoding_rs from updating to 0.8.11 (or
newer). In particular, the 0.8.11 release increased the minimum Rust
version to 1.29, whereas ripgrep 0.10.x is still on 1.28. We stay on an
older version for now until ripgrep is ready to move to 0.11.x.
2018-12-15 08:42:14 -05:00
Andrew Gallant
662a9bc73d deps: update to crossbeam-channel 0.3
This also requires corresponding updates to both rand and rand_core. Doing
an update of rand without doing an update of rand_core results in
compilation errors because two distinct versions of rand_core are included
in the build, and the traits they expose are distinct and incompatible.

We also switch over to using tempfile instead of tempdir, which drops the
last remaining thing keeping rand 0.4 in the build.

Fixes #1141, Fixes #1142
2018-12-15 08:40:04 -05:00
Andrew Gallant
401add0a99 deps: update regex and regex-syntax
This brings in some new Unicode properties, such as \p{Emoji}.

It is now also technically possible to construct a regex that recognizes
grapheme clusters.
2018-12-09 16:33:37 -05:00
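
A small illustration of the new property support, assuming `regex = "1"`:

```rust
use regex::Regex;

fn main() {
    // \p{Emoji} is a Unicode binary property understood by the updated
    // regex-syntax parser.
    let re = Regex::new(r"\p{Emoji}").unwrap();
    assert!(re.is_match("launch 🚀"));
    assert!(!re.is_match("launch"));
}
```
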
Simon Morgan
f81b72721b ignore/types: add ASP
PR #1134
2018-12-07 16:19:33 -05:00
Antony Lee
1d4fccaadc ignore/types: add postscript
Although postscript/encapsulated postscript is usually thought of as a
binary format, it's actually mostly ASCII, so ripgrep will not ignore
these files.

The situation is basically the same as for pdf, which is also already
present in the list of known filetypes.

PR #1118
2018-11-23 09:46:11 -05:00
Matteo Bertini
09e464e674 ignore/types: add more Cython file types
From the [Cython file types](https://cython.readthedocs.io/en/latest/src/userguide/language_basics.html?highlight=pxi#cython-file-types) paragraph on the official docs:

> There are three file types in Cython:
>    The implementation files, carrying a .py or .pyx suffix.
>    The definition files, carrying a .pxd suffix.
>    The include files, carrying a .pxi suffix.

PR #1113
2018-11-19 07:37:00 -05:00
Jon Parise
31adff6f3c ignore/types: add Apache Thrift
PR #1102
2018-11-07 07:42:13 -05:00
Andrew Gallant
b41e596327 doc: escape braces in AsciiDoc
This commit fixes a bug where AsciiDoc would drop any line containing a
'{foo}' because it interpreted it as an undefined attribute reference:

> Simple attribute references take the form {<name>}. If the attribute name
> is defined its text value is substituted otherwise the line containing the
> reference is dropped from the output.

See: https://www.methods.co.nz/asciidoc/chunked/ch30.html

We fix this by simply replacing all occurrences of '{' and '}' with
their escaped forms: '&#123;' and '&#125;'.

Fixes #1101
2018-11-06 06:57:16 -05:00
Andrew Gallant
fb62266620 deps: update encoding_rs
This commit bumps the version of encoding_rs to use the latest release.
This appears to fix a panic in UTF-16 decoding.

Fixes #1089
2018-10-22 06:50:35 -04:00
Dave Lee
acf226c39d ignore/types: add BUILD.bazel to bazel file type
PR #1074
2018-10-02 18:00:04 -04:00
Mathieu Bridon
8299625e48 ignore/types: add buildstream
BuildStream is a Free Software tool for building/integrating software stacks.: https://buildstream.gitlab.io/buildstream/

It uses recipes written in YAML, in files with the `.bst` extension.

PR #1071
2018-09-28 08:32:24 -04:00
Andrew Gallant
db256c87eb ripgrep: suggest -U/--multiline
When a "\n literal is not allowed" error is reported, ripgrep will now
suggest the use of the -U/--multiline flag, which enables matching
newlines.

Fixes #1055
2018-09-25 16:56:04 -04:00
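
A hedged sketch of what such a suggestion hook can look like; the function name and exact error text here are illustrative, not ripgrep's actual source:

```rust
fn suggest_multiline(err_msg: String) -> String {
    // If the regex engine rejected a literal line terminator, append a
    // hint pointing at -U/--multiline. (Hypothetical error-text check.)
    if err_msg.contains("the literal") && err_msg.contains("not allowed") {
        format!(
            "{}\n\nConsider enabling multiline mode with the -U/--multiline \
             flag, which permits matching line terminators.",
            err_msg
        )
    } else {
        err_msg
    }
}

fn main() {
    let msg = r#"the literal "\n" is not allowed in a regex"#.to_string();
    println!("{}", suggest_multiline(msg));
}
```
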
Andrew Gallant
ba533f390e grep-searcher: update to encoding_rs_io 0.1.3
This update includes a work-around for a presumed bug in encoding_rs
that causes a panic:
https://github.com/hsivonen/encoding_rs/issues/34

Specifically, to reproduce this in ripgrep, one can run the following:

    $ curl -LO https://cache.ruby-lang.org/pub/ruby/2.5/ruby-2.5.1.tar.gz
    $ tar xf ruby-2.5.1.tar.gz
    $ rg ZZZZZ ruby-2.5.1/test/rexml/data/t63-2.svg
    thread 'main' panicked at 'index out of bounds: the len is 1 but the index is 1'

Fixes #1052
2018-09-25 16:56:04 -04:00
Andrew Gallant
ba503eb677 grep-regex: fix inner literal detection
It seems the inner literal detector fails spectacularly in cases of
concatenations that involve groups. The issue here is that if the prefix
of a group inside a concatenation can match the empty string, then any
literals generated to that point in the concatenation need to be cut
such that they are never extended. The detector isn't really built to
handle this case, so we just conservatively cut literals whenever we
see a sub-group. This may make some regexes slower, but the inner
literal detector already misses plenty of cases.

Literal detection (including in the regex engine) is a key component
that needs to be completely rethought at some point.

Fixes #1064
2018-09-25 16:56:04 -04:00
Andrew Gallant
f72c2dfd90 readme: touch up README
Make the wording consistent.
2018-09-14 11:33:56 -04:00
Sylvestre Ledru
c0aa58b4f7 Ripgrep is also available in Ubuntu (from Cosmic) 2018-09-14 08:41:05 +02:00
ykgmfq
184ee4c328 deb: add section info
Put it in the same section as
https://packages.debian.org/stretch/grep

PR #1051
2018-09-13 08:17:24 -04:00
Gabe Berke-Williams
e82fbf2c46 doc: fix typo
"cretion" -> "creation"

PR #1045
2018-09-10 06:49:48 -04:00
Andrew Gallant
eb18da0450 pcre2: use jit_if_available
This will allow PCRE2 to fall back to non-JIT matching when running on
platforms without JIT support.

ref https://github.com/BurntSushi/rust-pcre2/issues/3
2018-09-08 17:12:14 -04:00
Andrew Gallant
0f7494216f readme: update dpkg version 2018-09-08 10:46:40 -04:00
Andrew Chin
442a278635 readme: fancy regexes are not supported by default
PR #1042
2018-09-07 17:43:24 -04:00
Andrew Gallant
7ebed3ace6 pkg: update brew tap to 0.10.0 2018-09-07 14:43:59 -04:00
Andrew Gallant
8a7db1a918 ci: tweak deployment conditions 2018-09-07 14:07:52 -04:00
Andrew Gallant
ce80d794c0 changelog: add release date 2018-09-07 14:00:23 -04:00
Andrew Gallant
c5d467a2ab ci: always force PCRE2 static builds for releases 2018-09-07 14:00:23 -04:00
Andrew Gallant
a62cd553c2 ci: clean up appveyor
Remove some outdated comments and unused config. Also, make the regex for
matching tags a bit more specific.
2018-09-07 14:00:22 -04:00
Andrew Gallant
ce5188335b ci: remove 'branch' condition for deployment
Travis docs[1] say this is ignored when 'tags' is used.

[1] - https://docs.travis-ci.com/user/deployment/#conditional-releases-with-on
2018-09-07 14:00:22 -04:00
Andrew Gallant
b7a456ae83 deb: add completions
This commit adds Bash, zsh and fish completions to the Debian binary
package.

Fixes #1032
2018-09-07 14:00:22 -04:00
Andrew Gallant
d14f0b37d6 deps: update versions for all crates
I don't think every change here is needed, but this ensures we're using
the latest version of every direct dependency.
2018-09-07 14:00:22 -04:00
Andrew Gallant
3ddc3c040f deps: minor updates 2018-09-07 13:03:01 -04:00
Andrew Gallant
eeaa42ecaf scripts: add copy-examples
This is a preliminary script to copy example code from a Markdown file
into a crate's example directory.

This is intended to be used for the upcoming libripgrep guide, but we
don't commit any examples yet.
2018-09-07 12:27:48 -04:00
Andrew Gallant
3797a2a5cb simplegrep: touch up 2018-09-07 12:24:50 -04:00
Andrew Gallant
0e2f8f7b47 grep: add clap and regex dev dependencies to grep
These are (or will be) used in grep's examples.
2018-09-07 12:06:05 -04:00
Andrew Gallant
3dd4b77dfb grep-searcher: add Box<...> impl for Sink
We initially did not have this impl because the first revision of the Sink
trait was much more complicated. In particular, each method was
parameterized over a Matcher. But not every Sink impl actually needs a
Matcher, and it is just as easy to borrow a Matcher explicitly, so the
added parameterization wasn't holding its own.

This does permit Sink implementations to be used as trait objects. One
key use case here is to reduce compile times, since there is quite a bit
of code inside grep-searcher that is parameterized on Sink. Unfortunately,
that code is *also* parameterized on Matcher, and the various printers in
grep-printer are also parameterized on Matcher, which means Sink trait
objects are necessary but not sufficient for a major reduction in compile
times. Unfortunately, the path to making Matcher object safe isn't quite
clear. Extension traits maybe? There's also stuff in the Serde ecosystem
that might help, but the type shenanigans can get pretty gnarly.
2018-09-07 12:06:05 -04:00
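
A heavily simplified sketch of the forwarding impl described above; the real Sink trait has more methods and richer context types:

```rust
trait Sink {
    fn matched(&mut self, line: &[u8]) -> bool;
}

// Forward the trait through Box so implementations can be used as trait
// objects (and so `Box<dyn Sink>` itself implements Sink).
impl<S: Sink + ?Sized> Sink for Box<S> {
    fn matched(&mut self, line: &[u8]) -> bool {
        (**self).matched(line)
    }
}

struct CountLines(u64);

impl Sink for CountLines {
    fn matched(&mut self, _line: &[u8]) -> bool {
        self.0 += 1;
        true
    }
}

fn main() {
    // A searcher written against `Box<dyn Sink>` no longer needs to be
    // monomorphized for every concrete Sink implementation.
    let mut sink: Box<dyn Sink> = Box::new(CountLines(0));
    sink.matched(b"hello\n");
}
```
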
Andrew Gallant
3b5cdea862 doc: minor touchups to API docs 2018-09-07 12:06:05 -04:00
Andrew Gallant
54b3e9eb10 grep-printer: delete unused code 2018-09-07 12:06:05 -04:00
Andrew Gallant
56e8864426 grep-matcher: add LineTerminator::is_suffix
This centralizes the logic for checking whether a line has a line
terminator or not.
2018-09-07 12:06:04 -04:00
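
A simplified sketch of the centralized check; grep-matcher's real LineTerminator also handles CRLF:

```rust
fn is_suffix(line: &[u8], line_term: u8) -> bool {
    // True if and only if the line ends with the given terminator byte.
    line.last() == Some(&line_term)
}

fn main() {
    assert!(is_suffix(b"foo\n", b'\n'));
    assert!(!is_suffix(b"foo", b'\n'));
}
```
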
Andrew Gallant
b8f619d16e readme: a few clarifications 2018-09-07 12:06:04 -04:00
Andrew Gallant
83dff33326 deps: update various deps 2018-09-04 23:29:22 -04:00
Andrew Gallant
003c3695f4 deps: update grep version 2018-09-04 23:29:05 -04:00
Andrew Gallant
10777c150d grep-0.2.1 2018-09-04 23:25:39 -04:00
Andrew Gallant
827179250b changelog: assign feature id 2018-09-04 23:24:22 -04:00
Andrew Gallant
fd22cd520b windows: fix unused warnings on Windows 2018-09-04 23:18:55 -04:00
Andrew Gallant
241bc8f8fc ripgrep: add --pre-glob flag
The --pre-glob flag is like the --glob flag, except it applies to filtering
files through the preprocessor instead of for search. This makes it
possible to apply the preprocessor to only a small subset of files, which
can greatly reduce the process overhead of using a preprocessor when
searching large directories.
2018-09-04 23:18:55 -04:00
Andrew Gallant
b6e30124e0 ripgrep: add --line-buffered and --block-buffered
These flags provide granular control over ripgrep's buffering strategy.
The --line-buffered flag can be genuinely useful in certain types of shell
pipelines. The --block-buffered flag has a murkier use case, but we add it
for completeness.
2018-09-04 23:18:55 -04:00
Andrew Gallant
4846d63539 grep-cli: introduce new grep-cli crate
This commit moves a lot of "utility" code from ripgrep core into
grep-cli. Any one of these things might not be worth creating a new
crate, but combining everything together results in a fair number of
convenience routines that make up a decent-sized crate.

There is potentially more we could move into the crate, but what remains
in ripgrep core almost entirely deals with the number of flags we
support.

In the course of moving things to the grep-cli crate, we clean up
a lot of gunk and improve failure modes in a number of cases. In
particular, we've fixed a bug where other processes could deadlock if
they write too much to stderr.

Fixes #990
2018-09-04 23:18:55 -04:00
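
A minimal sketch of the stderr fix mentioned above (names illustrative; grep-cli's real reader is more involved): drain the child's stderr on its own thread so a chatty child can never fill the stderr pipe and deadlock while we read its stdout.

```rust
use std::io::{self, Read};
use std::process::{Command, Stdio};
use std::thread;

fn run_and_capture(mut cmd: Command) -> io::Result<(Vec<u8>, Vec<u8>)> {
    let mut child = cmd.stdout(Stdio::piped()).stderr(Stdio::piped()).spawn()?;
    let mut stderr_pipe = child.stderr.take().unwrap();
    // Read stderr asynchronously so the child never blocks on a full pipe.
    let handle = thread::spawn(move || {
        let mut buf = Vec::new();
        let _ = stderr_pipe.read_to_end(&mut buf);
        buf
    });
    let mut stdout = Vec::new();
    child.stdout.take().unwrap().read_to_end(&mut stdout)?;
    child.wait()?;
    let stderr = handle.join().unwrap_or_default();
    Ok((stdout, stderr))
}

fn main() -> io::Result<()> {
    let mut cmd = Command::new("rustc");
    cmd.arg("--version");
    let (out, _err) = run_and_capture(cmd)?;
    println!("{}", String::from_utf8_lossy(&out));
    Ok(())
}
```
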
helloer
13c47530a6 ignore/types: add pascal type
PR #1036
2018-09-03 07:25:07 -04:00
Jakub Wilk
328f4369e6 doc: fix typos 2018-08-31 11:59:28 -04:00
Andrew Gallant
04518e32e7 deps: update other crates 2018-08-30 23:03:07 -04:00
Andrew Gallant
f2eaf5b977 deps: update termcolor for perf tweaks 2018-08-30 22:57:01 -04:00
Andrew Gallant
3edeeca6e9 changelog: fix typo 2018-08-29 18:46:34 -04:00
Andrew Gallant
c41b353009 changelog: update
This brings the changelog up to date with HEAD and rewords a few things.
2018-08-29 18:25:08 -04:00
Aaron Power
d18839f3dc ignore: add into_path for DirEntry (#1031)
This commit adds ignore::DirEntry::into_path to match
the corresponding method on walkdir::DirEntry.
2018-08-28 18:27:34 -04:00
Andrew Gallant
8f978a3cf7 doc: clarify and fix typo
Clarify that --byte-offset may be wrong if the source isn't being read
directly.

Also tweak the README a bit. And remove a damned Oxford comma.
2018-08-27 21:21:37 -04:00
Andrew Gallant
87b745454d ripgrep: use 'ignore' for skipping stdout
This removes ripgrep-specific code for filtering files that correspond to
stdout and instead uses the 'ignore' crate's functionality for doing the
same.
2018-08-27 21:18:53 -04:00
Andrew Gallant
e5bb750995 ignore: add 'stdout' skipping to the walker
This commit adds a new 'skip_stdout' option to the directory walker. When
enabled, it will skip yielding any directory entries that are believed to
correspond to stdout for the current process. This is useful for filtering
out 'results' in a command like 'grep -r foo > results' in order to avoid
an unbounded feedback mechanism.
2018-08-27 21:18:53 -04:00
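
A sketch of the detection itself using the same-file crate (`same-file = "1"`), which is roughly the mechanism involved:

```rust
use same_file::Handle;
use std::path::Path;

fn is_stdout(path: &Path) -> bool {
    // Conservatively return false if stdout can't be inspected.
    let stdout = match Handle::stdout() {
        Ok(h) => h,
        Err(_) => return false,
    };
    Handle::from_path(path).map(|h| h == stdout).unwrap_or(false)
}

fn main() {
    // With `rg foo > results`, the walker can skip the `results` file.
    println!("{}", is_stdout(Path::new("results")));
}
```
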
dana
d599f0b3c7 complete: don't complete bare pattern after -f 2018-08-27 07:56:40 -04:00
Andrew Gallant
40e310a9f9 ripgrep: add --sort and --sortr flags
These flags each accept one of five choices: none, path, modified,
accessed or created. The value indicates how the results are sorted.
For --sort, results are sorted in ascending order, whereas for --sortr,
results are sorted in descending order.

Closes #404
2018-08-26 18:42:25 -04:00
Andrew Gallant
510f15f4da ignore: add sort_by_file_path builder method
This permits callers to sort entries by their full file path, which makes
it easy to query for various file statistics.

It would have been better to provide a comparator on DirEntry itself,
similar to how walkdir does it, but this seems to require quite a bit of
work to make the types work out, assuming we want to continue to use
walkdir's sorting support (we do).
2018-08-26 18:42:25 -04:00
Andrew Gallant
f9ce7a84a8 ignore: add 'same_file_system' option
This commit adds a 'same_file_system' option to the walk builder. For
single threaded walking, it defers to the walkdir crate, which has the
same option. The bulk of this commit implements this flag for the parallel
walker. We add one very feeble test for this.

The parallel walker is now officially a complete mess.

Closes #321
2018-08-26 18:42:25 -04:00
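
A usage sketch exercising both of the walker options added in the two commits above, against the `ignore` crate's builder API:

```rust
use ignore::WalkBuilder;

fn main() {
    let walker = WalkBuilder::new("./")
        .same_file_system(true) // don't descend into other file systems
        .sort_by_file_path(|a, b| a.cmp(b)) // deterministic, path-sorted
        .build();
    for result in walker {
        if let Ok(entry) = result {
            println!("{}", entry.path().display());
        }
    }
}
```
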
Andrew Gallant
1b6089674e deps: more updates 2018-08-26 18:42:25 -04:00
Andrew Gallant
05a0389555 ripgrep: use winapi-util for stdin_is_readable 2018-08-25 00:30:15 -04:00
Andrew Gallant
16353bad6e deps: update various deps
This includes a new crate, winapi-util, that is now used in wincolor,
walkdir and same-file.
2018-08-25 00:19:40 -04:00
Tim Kilbourn
fe442de091 changelog: fix typo
Fuchsia is a pain to spell.

PR #1026
2018-08-23 13:17:27 -04:00
Andrew Gallant
1bb8b7170f doc: clarify use of SIMD features
You need a nightly compiler.

Ref #188
2018-08-23 09:56:37 -04:00
Andrew Gallant
55ed698a98 deps: update walkdir minimum version
We'll want to be using the new `same_file_system` option soon.
2018-08-23 09:54:45 -04:00
Andrew Gallant
f1e025873f deps: update dependencies
This includes an update to walkdir 2.2.2, which includes a
`same_file_system` option.
2018-08-22 20:50:24 -04:00
Andrew Gallant
033ad2b8e4 deps: update clap
Update clap to the latest version.

Also, drop the ansi_term dependency by disabling color output in clap's
error messages.
2018-08-21 23:10:34 -04:00
Andrew Gallant
098a8ee843 deps: various patch upgrades 2018-08-21 23:05:52 -04:00
Andrew Gallant
2f3dbf5fee ignore: fix false positive in path_is_symlink
This commit fixes a bug where the first path always reported itself as
a symlink via `path_is_symlink`.

Part of this fix includes updating walkdir to 2.2.1, which also includes
a corresponding bug fix.

Fixes #984
2018-08-21 23:05:52 -04:00
Andrew Gallant
5c80e4adb6 release: better support for binary Debian package
This commit beefs up the package metadata used by the 'cargo deb' tool to
produce a binary dpkg. In particular, we now include ripgrep's man page.

This commit includes a new script, 'ci/build_deb.sh', which will handle
the build process for a dpkg, which has become a bit more nuanced than
just running 'cargo deb'. We don't (yet) run this script in CI.

Fixes #842
2018-08-21 23:05:52 -04:00
Andrew Gallant
fcd1853031 doc: update ripgrep's description
This now mentions PCRE2 support.
2018-08-21 23:05:52 -04:00
Andrew Gallant
74a89be641 grep-printer: fix bug in printing truncated lines
When emitting color, the printer wasn't checking whether the line
exceeded the maximum allowed length.
2018-08-21 23:05:52 -04:00
Andrew Gallant
5b1ce8bdc2 tests: touch up tests on Windows
This fixes warnings and adds an additional invalid UTF-8 test that will
run on Windows.
2018-08-21 23:05:52 -04:00
Andrew Gallant
1529ce3341 ripgrep: remove workaround for std bug
This commit undoes a work-around for a bug in Rust's standard library
that prevented correct file type detection on Windows in OneDrive
directories. We remove the work-around because we are moving to a
latest-stable Rust version policy, and stable Rust has included this fix
for a while now.

ref #705, https://github.com/rust-lang/rust/issues/46484
2018-08-21 23:05:52 -04:00
Andrew Gallant
95a4f15916 ignore: clarify docs for DirEntry::error
Fixes #953
2018-08-21 23:05:52 -04:00
Andrew Gallant
0eef05142a ripgrep: move minimum version to Rust stable
This also updates some code to make use of our more liberal versioning
requirement, including the use of crossbeam-channel instead of the MsQueue
from the older and unmaintained crossbeam 0.3. This regrettably adds a
sizable number of dependencies; however, compile times seem mostly
unaffected.

Closes #1019
2018-08-21 23:05:52 -04:00
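
A minimal sketch of the kind of channel-based work distribution that crossbeam-channel enables, written against its current API; ripgrep's real worker setup is considerably more involved:

```rust
use crossbeam_channel::unbounded;
use std::thread;

fn main() {
    let (tx, rx) = unbounded::<String>();
    let workers: Vec<_> = (0..4)
        .map(|_| {
            let rx = rx.clone(); // receivers are cloneable (MPMC)
            thread::spawn(move || {
                // Each worker pulls paths until the channel is closed.
                while let Ok(path) = rx.recv() {
                    let _ = path; // a real worker would search `path` here
                }
            })
        })
        .collect();
    for p in ["a.txt", "b.txt", "c.txt"] {
        tx.send(p.to_string()).unwrap();
    }
    drop(tx); // close the channel so workers drain and exit
    for w in workers {
        w.join().unwrap();
    }
}
```
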
Andrew Gallant
edd6eb4e06 ripgrep: make --no-pcre2-unicode the canonical flag
Previously, we used --pcre2-unicode as the canonical flag despite the
fact that it is enabled by default, which is inconsistent with how we
handle other similar flags.

The reason why --pcre2-unicode was made the canonical flag was to make
it easier to discover since it would be sorted near the --pcre2 flag. To
solve that problem, we simply start a convention that lists related
flags in the docs.

Fixes #1022
2018-08-21 23:05:52 -04:00
Andrew Gallant
7ac9782970 doc: fix typo 2018-08-20 18:00:14 -04:00
Andrew Gallant
180054d7dc doc: caveats 2018-08-20 17:58:29 -04:00
79 changed files with 4161 additions and 1769 deletions


@@ -62,13 +62,13 @@ matrix:
# Minimum Rust supported channel. We enable these to make sure ripgrep
# continues to work on the advertised minimum Rust version.
- os: linux
rust: 1.23.0
rust: 1.32.0
env: TARGET=x86_64-unknown-linux-gnu
- os: linux
rust: 1.23.0
rust: 1.32.0
env: TARGET=x86_64-unknown-linux-musl
- os: linux
rust: 1.23.0
rust: 1.32.0
env: TARGET=arm-unknown-linux-gnueabihf GCC_VERSION=4.8
addons:
apt:
@@ -93,7 +93,7 @@ deploy:
skip_cleanup: true
on:
condition: $TRAVIS_RUST_VERSION = nightly
branch: master
branch: master # i guess we do need this after all?
tags: true
api_key:
secure: "IbSnsbGkxSydR/sozOf1/SRvHplzwRUHzcTjM7BKnr7GccL86gRPUrsrvD103KjQUGWIc1TnK1YTq5M0Onswg/ORDjqa1JEJPkPdPnVh9ipbF7M2De/7IlB4X4qXLKoApn8+bx2x/mfYXu4G+G1/2QdbaKK2yfXZKyjz0YFx+6CNrVCT2Nk8q7aHvOOzAL58vsG8iPDpupuhxlMDDn/UhyOWVInmPPQ0iJR1ZUJN8xJwXvKvBbfp3AhaBiAzkhXHNLgBR8QC5noWWMXnuVDMY3k4f3ic0V+p/qGUCN/nhptuceLxKFicMCYObSZeUzE5RAI0/OBW7l3z2iCoc+TbAnn+JrX/ObJCfzgAOXAU3tLaBFMiqQPGFKjKg1ltSYXomOFP/F7zALjpvFp4lYTBajRR+O3dqaxA9UQuRjw27vOeUpMcga4ZzL4VXFHzrxZKBHN//XIGjYAVhJ1NSSeGpeJV5/+jYzzWKfwSagRxQyVCzMooYFFXzn8Yxdm3PJlmp3GaAogNkdB9qKcrEvRINCelalzALPi0hD/HUDi8DD2PNTCLLMo6VSYtvc685Zbe+KgNzDV1YyTrRCUW6JotrS0r2ULLwnsh40hSB//nNv3XmwNmC/CmW5QAnIGj8cBMF4S2t6ohADIndojdAfNiptmaZOIT6owK7bWMgPMyopo="


@@ -1,5 +1,5 @@
0.10.0 (TBD)
============
0.10.0 (2018-09-07)
===================
This is a new minor version release of ripgrep that contains some major new
features, a huge number of bug fixes, and is the first release based on
libripgrep. The entirety of ripgrep's core search and printing code has been
@@ -10,24 +10,32 @@ format.
**BREAKING CHANGES**:
* The minimum version required to compile Rust has now changed to track the
latest stable version of Rust. Patch releases will continue to compile with
the same version of Rust as the previous patch release, but new minor
versions will use the current stable version of the Rust compile as its
minimum supported version.
* The match semantics of `-w/--word-regexp` have changed slightly. They used
to be `\b(?:<your pattern>)\b`, but now it's
`(?:^|\W)(?:<your pattern>)(?:$|\W)`.
See [#389](https://github.com/BurntSushi/ripgrep/issues/389) for more
details.
`(?:^|\W)(?:<your pattern>)(?:$|\W)`. This matches the behavior of GNU grep
and is believed to be closer to the intended semantics of the flag. See
[#389](https://github.com/BurntSushi/ripgrep/issues/389) for more details.
Feature enhancements:
* [FEATURE #162](https://github.com/BurntSushi/ripgrep/issues/162):
libripgrep is now a thing, composed of the following crates:
`grep`, `grep-matcher`, `grep-pcre2`, `grep-printer`, `grep-regex` and
`grep-searcher`.
libripgrep is now a thing. The primary crate is
[`grep`](https://docs.rs/grep).
* [FEATURE #176](https://github.com/BurntSushi/ripgrep/issues/176):
Add `-U/--multiline` flag that permits matching over multiple lines.
* [FEATURE #188](https://github.com/BurntSushi/ripgrep/issues/188):
Add `-P/--pcre2` flag that gives support for look-around and backreferences.
* [FEATURE #244](https://github.com/BurntSushi/ripgrep/issues/244):
Add `--json` flag that prints results in a JSON Lines format.
* [FEATURE #321](https://github.com/BurntSushi/ripgrep/issues/321):
Add `--one-file-system` flag to skip directories on different file systems.
* [FEATURE #404](https://github.com/BurntSushi/ripgrep/issues/404):
Add `--sort` and `--sortr` flag for more sorting. Deprecate `--sort-files`.
* [FEATURE #416](https://github.com/BurntSushi/ripgrep/issues/416):
Add `--crlf` flag to permit `$` to work with carriage returns on Windows.
* [FEATURE #917](https://github.com/BurntSushi/ripgrep/issues/917):
@@ -36,6 +44,10 @@ Feature enhancements:
Add `--null-data` flag, which makes ripgrep use NUL as a line terminator.
* [FEATURE #997](https://github.com/BurntSushi/ripgrep/issues/997):
The `--passthru` flag now works with the `--replace` flag.
* [FEATURE #1038-1](https://github.com/BurntSushi/ripgrep/issues/1038):
Add `--line-buffered` and `--block-buffered` for forcing a buffer strategy.
* [FEATURE #1038-2](https://github.com/BurntSushi/ripgrep/issues/1038):
Add `--pre-glob` for filtering files through the `--pre` flag.
Bug fixes:
@@ -53,14 +65,22 @@ Bug fixes:
Matching empty lines now works correctly in several corner cases.
* [BUG #764](https://github.com/BurntSushi/ripgrep/issues/764):
Color escape sequences now coalesce, which reduces output size.
* [BUG #842](https://github.com/BurntSushi/ripgrep/issues/842):
Add man page to binary Debian package.
* [BUG #922](https://github.com/BurntSushi/ripgrep/issues/922):
ripgrep is now more robust with respect to memory maps failing.
* [BUG #937](https://github.com/BurntSushi/ripgrep/issues/937):
Color escape sequences are no longer emitted for empty matches.
* [BUG #940](https://github.com/BurntSushi/ripgrep/issues/940):
Context from the `--passthru` flag should not impact process exit status.
* [BUG #984](https://github.com/BurntSushi/ripgrep/issues/984):
Fixes bug in `ignore` crate where first path was always treated as a symlink.
* [BUG #990](https://github.com/BurntSushi/ripgrep/issues/990):
Read stderr asynchronously when running a process.
* [BUG #1013](https://github.com/BurntSushi/ripgrep/issues/1013):
Add compile time and runtime CPU features to `--version` output.
* [BUG #1028](https://github.com/BurntSushi/ripgrep/pull/1028):
Don't complete bare pattern after `-f` in zsh.
0.9.0 (2018-08-03)
@@ -96,7 +116,7 @@ multi-line search support and a JSON output format.
Feature enhancements:
* Added or improved file type filtering for Android, Bazel, Fuschia, Haskell,
* Added or improved file type filtering for Android, Bazel, Fuchsia, Haskell,
Java and Puppet.
* [FEATURE #411](https://github.com/BurntSushi/ripgrep/issues/411):
Add a `--stats` flag, which emits aggregate statistics after search results.

Cargo.lock (generated, 670 lines changed; diff suppressed because it is too large)


@@ -1,6 +1,6 @@
[package]
name = "ripgrep"
version = "0.9.0" #:version
version = "0.10.0" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
ripgrep is a line-oriented search tool that recursively searches your current
@@ -17,6 +17,7 @@ license = "Unlicense OR MIT"
exclude = ["HomebrewFormula"]
build = "build.rs"
autotests = false
edition = "2018"
[badges]
travis-ci = { repository = "BurntSushi/ripgrep" }
@@ -35,6 +36,7 @@ path = "tests/tests.rs"
members = [
"globset",
"grep",
"grep-cli",
"grep-matcher",
"grep-pcre2",
"grep-printer",
@@ -44,43 +46,62 @@ members = [
]
[dependencies]
atty = "0.2.11"
globset = { version = "0.4.0", path = "globset" }
grep = { version = "0.2.0", path = "grep" }
ignore = { version = "0.4.0", path = "ignore" }
lazy_static = "1"
log = "0.4"
num_cpus = "1"
regex = "1"
same-file = "1"
serde_json = "1"
termcolor = "1"
grep = { version = "0.2.3", path = "grep" }
ignore = { version = "0.4.4", path = "ignore" }
lazy_static = "1.1.0"
log = "0.4.5"
num_cpus = "1.8.0"
regex = "1.0.5"
serde_json = "1.0.23"
termcolor = "1.0.3"
[dependencies.clap]
version = "2.29.4"
version = "2.32.0"
default-features = false
features = ["suggestions", "color"]
[target.'cfg(windows)'.dependencies.winapi]
version = "0.3"
features = ["std", "fileapi", "winnt"]
features = ["suggestions"]
[build-dependencies]
lazy_static = "1"
lazy_static = "1.1.0"
[build-dependencies.clap]
version = "2.29.4"
version = "2.32.0"
default-features = false
features = ["suggestions", "color"]
features = ["suggestions"]
[dev-dependencies]
serde = "1"
serde_derive = "1"
serde = "1.0.77"
serde_derive = "1.0.77"
[features]
avx-accel = ["grep/avx-accel"]
simd-accel = ["grep/simd-accel"]
pcre2 = ["grep/pcre2"]
[profile.release]
debug = 1
[package.metadata.deb]
features = ["pcre2"]
section = "utils"
assets = [
["target/release/rg", "usr/bin/", "755"],
["COPYING", "usr/share/doc/ripgrep/", "644"],
["LICENSE-MIT", "usr/share/doc/ripgrep/", "644"],
["UNLICENSE", "usr/share/doc/ripgrep/", "644"],
["CHANGELOG.md", "usr/share/doc/ripgrep/CHANGELOG", "644"],
["README.md", "usr/share/doc/ripgrep/README", "644"],
["FAQ.md", "usr/share/doc/ripgrep/FAQ", "644"],
# The man page is automatically generated by ripgrep's build process, so
# this file isn't actually commited. Instead, to create a dpkg, either
# create a deployment/deb directory and copy the man page to it, or use the
# 'ci/build_deb.sh' script.
["deployment/deb/rg.1", "usr/share/man/man1/rg.1", "644"],
# Similarly for shell completions.
["deployment/deb/rg.bash", "usr/share/bash-completion/completions/rg", "644"],
["deployment/deb/rg.fish", "usr/share/fish/completions/rg.fish", "644"],
["deployment/deb/_rg", "usr/share/zsh/vendor-completions/", "644"],
]
extended-description = """\
ripgrep (rg) recursively searches your current directory for a regex pattern.
By default, ripgrep will respect your .gitignore and automatically skip hidden
files/directories and binary files.
"""

FAQ.md (10 lines changed)

@@ -635,7 +635,7 @@ real 0m1.714s
user 0m1.669s
sys 0m0.044s
[andrew@Cheetah 2016] time rg -P '^\w{42}$' subtitles2016-sample --no-pcre2-unicode
$ time rg -P '^\w{42}$' subtitles2016-sample --no-pcre2-unicode
21225780:EverymajordevelopmentinthehistoryofAmerica
real 0m1.997s
@@ -675,14 +675,18 @@ no longer needs to do any kind of UTF-8 checks. This allows the file to get
memory mapped and passed right through PCRE2's JIT at impressive speeds. (As
a brief and interesting historical note, the configuration of "memory map +
multiline + no-Unicode" is exactly the configuration used by The Silver
Searcher. This analysis perhaps sheds some reasoning as to why it converged on
that specific setting!)
Searcher. This analysis perhaps sheds some reasoning as to why that
configuration is useful!)
In summary, if you want PCRE2 to go as fast as possible and you don't care
about Unicode and you don't care about matches possibly spanning across
multiple lines, then enable multiline mode with `-U` and disable PCRE2's
Unicode support with the `--no-pcre2-unicode` flag.
Caveat emptor: This author is not a PCRE2 expert, so there may be APIs that can
improve performance that the author missed. Similarly, there may be alternative
designs for a searching tool that are more amenable to how PCRE2 works.
<h3 name="rg-other-cmd">
When I run <code>rg</code>, why does it execute some other command?


@@ -227,7 +227,7 @@ with the following contents:
```
ripgrep treats `.ignore` files with higher precedence than `.gitignore` files
(and treats `.rgignore` files with higher precdence than `.ignore` files).
(and treats `.rgignore` files with higher precedence than `.ignore` files).
This means ripgrep will see the `!log/` whitelist rule first and search that
directory.
@@ -580,7 +580,7 @@ override it.
If you're confused about what configuration file ripgrep is reading arguments
from, then running ripgrep with the `--debug` flag should help clarify things.
The debug output should note what config file is being loaded and the arugments
The debug output should note what config file is being loaded and the arguments
that have been read from the configuration.
Finally, if you want to make absolutely sure that ripgrep *isn't* reading a


@@ -23,7 +23,7 @@ Please see the [CHANGELOG](CHANGELOG.md) for a release history.
* [Installation](#installation)
* [User Guide](GUIDE.md)
* [Frequently Asked Questions](FAQ.md)
* [Regex syntax](https://docs.rs/regex/0.2.5/regex/#syntax)
* [Regex syntax](https://docs.rs/regex/1/regex/#syntax)
* [Configuration files](GUIDE.md#configuration-file)
* [Shell completions](FAQ.md#complete)
* [Building](#building)
@@ -103,6 +103,10 @@ increases the times to `2.640s` for ripgrep and `10.277s` for GNU grep.
of search results, searching multiple patterns, highlighting matches with
color and full Unicode support. Unlike GNU grep, ripgrep stays fast while
supporting Unicode (which is always on).
* ripgrep has optional support for switching its regex engine to use PCRE2.
Among other things, this makes it possible to use look-around and
backreferences in your patterns, which are not supported in ripgrep's default
regex engine. PCRE2 support is enabled with `-P`.
* ripgrep supports searching files in text encodings other than UTF-8, such
as UTF-16, latin-1, GBK, EUC-JP, Shift_JIS and more. (Some support for
automatically detecting UTF-16 is provided. Other text encodings must be
@@ -114,7 +118,7 @@ increases the times to `2.640s` for ripgrep and `10.277s` for GNU grep.
detection and so on.
In other words, use ripgrep if you like speed, filtering by default, fewer
bugs, and Unicode support.
bugs and Unicode support.
### Why shouldn't I use ripgrep?
@@ -131,8 +135,8 @@ or more of the following:
* You need a portable and ubiquitous tool. While ripgrep works on Windows,
macOS and Linux, it is not ubiquitous and it does not conform to any
standard such as POSIX. The best tool for this job is good old grep.
* There still exists some other minor feature (or bug) found in another tool
that isn't in ripgrep.
* There still exists some other feature (or bug) not listed in this README that
you rely on that's in another tool that isn't in ripgrep.
* There is a performance edge case where ripgrep doesn't do well where another
tool does do well. (Please file a bug report!)
* ripgrep isn't possible to install on your machine or isn't available for your
@@ -159,7 +163,7 @@ Summarizing, ripgrep is fast because:
latter is better for large directories. ripgrep chooses the best searching
strategy for you automatically.
* Applies your ignore patterns in `.gitignore` files using a
[`RegexSet`](https://docs.rs/regex/1.0.0/regex/struct.RegexSet.html).
[`RegexSet`](https://docs.rs/regex/1/regex/struct.RegexSet.html).
That means a single file path can be matched against multiple glob patterns
simultaneously.
* It uses a lock-free parallel recursive directory iterator, courtesy of
@@ -284,20 +288,27 @@ $ # (Or using the attribute name, which is also ripgrep.)
If you're a **Debian** user (or a user of a Debian derivative like **Ubuntu**),
then ripgrep can be installed using a binary `.deb` file provided in each
[ripgrep release](https://github.com/BurntSushi/ripgrep/releases). Note that
ripgrep is not in the official Debian or Ubuntu repositories.
[ripgrep release](https://github.com/BurntSushi/ripgrep/releases).
```
$ curl -LO https://github.com/BurntSushi/ripgrep/releases/download/0.9.0/ripgrep_0.9.0_amd64.deb
$ sudo dpkg -i ripgrep_0.9.0_amd64.deb
$ curl -LO https://github.com/BurntSushi/ripgrep/releases/download/0.10.0/ripgrep_0.10.0_amd64.deb
$ sudo dpkg -i ripgrep_0.10.0_amd64.deb
```
If you run Debian Buster (currently Debian testing) or Debian sid, ripgrep is
If you run Debian Buster (currently Debian testing) or Debian sid, ripgrep is
[officially maintained by Debian](https://tracker.debian.org/pkg/rust-ripgrep).
```
$ sudo apt-get install ripgrep
```
If you're an **Ubuntu Cosmic (18.10)** (or newer) user, ripgrep is
[available](https://launchpad.net/ubuntu/+source/rust-ripgrep) using the same
packaging as Debian:
```
$ sudo apt-get install ripgrep
```
(N.B. Various snaps for ripgrep on Ubuntu are also available, but none of them
seem to work right and generate a number of very strange bug reports that I
don't know how to fix and don't have the time to fix. Therefore, it is no
@@ -326,7 +337,7 @@ If you're a **NetBSD** user, then you can install ripgrep from
If you're a **Rust programmer**, ripgrep can be installed with `cargo`.
* Note that the minimum supported version of Rust for ripgrep is **1.23.0**,
* Note that the minimum supported version of Rust for ripgrep is **1.28.0**,
although ripgrep may work with older versions.
* Note that the binary may be bigger than expected because it contains debug
symbols. This is intentional. To remove debug symbols and therefore reduce
@@ -347,7 +358,10 @@ ripgrep isn't currently in any other package repositories.
ripgrep is written in Rust, so you'll need to grab a
[Rust installation](https://www.rust-lang.org/) in order to compile it.
ripgrep compiles with Rust 1.23.0 (stable) or newer. Building is easy:
ripgrep compiles with Rust 1.28.0 (stable) or newer. In general, ripgrep tracks
the latest stable release of the Rust compiler.
To build ripgrep:
```
$ git clone https://github.com/BurntSushi/ripgrep
@@ -382,14 +396,15 @@ $ cargo build --release --features 'pcre2'
```
(Tip: use `--features 'pcre2 simd-accel avx-accel'` to also include compile
time SIMD optimizations.)
time SIMD optimizations, which will only work with a nightly compiler.)
Enabling the PCRE2 feature will attempt to automatically find and link with
your system's PCRE2 library via `pkg-config`. If one doesn't exist, then
ripgrep will build PCRE2 from source using your system's C compiler and then
statically link it into the final executable. Static linking can be forced even
when there is an available PCRE2 system library by either building ripgrep with
the MUSL target or by setting `PCRE2_SYS_STATIC=1`.
Enabling the PCRE2 feature works with a stable Rust compiler and will
attempt to automatically find and link with your system's PCRE2 library via
`pkg-config`. If one doesn't exist, then ripgrep will build PCRE2 from source
using your system's C compiler and then statically link it into the final
executable. Static linking can be forced even when there is an available PCRE2
system library by either building ripgrep with the MUSL target or by setting
`PCRE2_SYS_STATIC=1`.
ripgrep can be built with the MUSL target on Linux by first installing the MUSL
library on your system (consult your friendly neighborhood package manager).
@@ -401,7 +416,10 @@ $ rustup target add x86_64-unknown-linux-musl
$ cargo build --release --target x86_64-unknown-linux-musl
```
Applying the `--features` flag from above works as expected.
Applying the `--features` flag from above works as expected. If you want to
build a static executable with MUSL and with PCRE2, then you will need to have
`musl-gcc` installed, which might be in a separate package from the actual
MUSL library, depending on your Linux distribution.
### Running tests


@@ -45,11 +45,10 @@ install:
- rustc -V
- cargo -V
# ???
# Hack to work around a harmless warning in Appveyor builds?
build: false
# Equivalent to Travis' `script` phase
# TODO modify this phase as you see fit
test_script:
- cargo test --verbose --all --features pcre2
@@ -60,7 +59,7 @@ before_deploy:
- copy target\release\rg.exe staging
- ps: copy target\release\build\ripgrep-*\out\_rg.ps1 staging
- cd staging
# release zipfile will look like 'rust-everywhere-v1.2.3-x86_64-pc-windows-msvc'
# release zipfile will look like 'ripgrep-1.2.3-x86_64-pc-windows-msvc'
- 7z a ../%PROJECT_NAME%-%APPVEYOR_REPO_TAG_NAME%-%TARGET%.zip *
- appveyor PushArtifact ../%PROJECT_NAME%-%APPVEYOR_REPO_TAG_NAME%-%TARGET%.zip
@@ -73,9 +72,6 @@ deploy:
provider: GitHub
# deploy when a new tag is pushed and only on the stable channel
on:
# channel to use to produce the release artifacts
# NOTE make sure you only release *once* per target
# TODO you may want to pick a different channel
CHANNEL: stable
appveyor_repo_tag: true
@@ -83,7 +79,3 @@ branches:
only:
- /\d+\.\d+\.\d+/
- master
# - appveyor
# - /\d+\.\d+\.\d+/
# except:
# - master


@@ -1,10 +1,4 @@
#[macro_use]
extern crate clap;
#[macro_use]
extern crate lazy_static;
use std::env;
use std::ffi::OsString;
use std::fs::{self, File};
use std::io::{self, Read, Write};
use std::path::Path;
@@ -19,22 +13,6 @@ use app::{RGArg, RGArgKind};
mod app;
fn main() {
// If our version of Rust has runtime SIMD detection, then set a cfg so
// we know we can test for it. We use this when generating ripgrep's
// --version output.
let version = rustc_version();
let parsed = match Version::parse(&version) {
Ok(parsed) => parsed,
Err(err) => {
eprintln!("failed to parse `rustc --version`: {}", err);
return;
}
};
let minimum = Version { major: 1, minor: 27, patch: 0 };
if version.contains("nightly") || parsed >= minimum {
println!("cargo:rustc-cfg=ripgrep_runtime_cpu");
}
// OUT_DIR is set by Cargo and it's where any additional build artifacts
// are written.
let outdir = match env::var_os("OUT_DIR") {
@@ -185,7 +163,12 @@ fn formatted_arg(arg: &RGArg) -> io::Result<String> {
}
fn formatted_doc_txt(arg: &RGArg) -> io::Result<String> {
let paragraphs: Vec<&str> = arg.doc_long.split("\n\n").collect();
let paragraphs: Vec<String> = arg.doc_long
.replace("{", "&#123;")
.replace("}", r"&#125;")
.split("\n\n")
.map(|s| s.to_string())
.collect();
if paragraphs.is_empty() {
return Err(ioerr(format!("missing docs for --{}", arg.name)));
}
@@ -199,63 +182,3 @@ fn formatted_doc_txt(arg: &RGArg) -> io::Result<String> {
fn ioerr(msg: String) -> io::Error {
io::Error::new(io::ErrorKind::Other, msg)
}
fn rustc_version() -> String {
let rustc = env::var_os("RUSTC").unwrap_or(OsString::from("rustc"));
let output = process::Command::new(&rustc)
.arg("--version")
.output()
.unwrap()
.stdout;
String::from_utf8(output).unwrap()
}
#[derive(Clone, Copy, Debug, Eq, PartialEq, PartialOrd, Ord)]
struct Version {
major: u32,
minor: u32,
patch: u32,
}
impl Version {
fn parse(mut s: &str) -> Result<Version, String> {
if !s.starts_with("rustc ") {
return Err(format!("unrecognized version string: {}", s));
}
s = &s["rustc ".len()..];
let parts: Vec<&str> = s.split(".").collect();
if parts.len() < 3 {
return Err(format!("not enough version parts: {:?}", parts));
}
let mut num = String::new();
for c in parts[0].chars() {
if !c.is_digit(10) {
break;
}
num.push(c);
}
let major = num.parse::<u32>().map_err(|e| e.to_string())?;
num.clear();
for c in parts[1].chars() {
if !c.is_digit(10) {
break;
}
num.push(c);
}
let minor = num.parse::<u32>().map_err(|e| e.to_string())?;
num.clear();
for c in parts[2].chars() {
if !c.is_digit(10) {
break;
}
num.push(c);
}
let patch = num.parse::<u32>().map_err(|e| e.to_string())?;
Ok(Version { major, minor, patch })
}
}


@@ -11,7 +11,9 @@ mk_artifacts() {
if is_arm; then
cargo build --target "$TARGET" --release
else
cargo build --target "$TARGET" --release --features 'pcre2'
# Technically, MUSL builds will force PCRE2 to get statically compiled,
# but we also want PCRE2 statically build for macOS binaries.
PCRE2_SYS_STATIC=1 cargo build --target "$TARGET" --release --features 'pcre2'
fi
}

ci/build_deb.sh (new executable file, 43 lines)

@@ -0,0 +1,43 @@
#!/bin/bash
set -e
# This script builds a binary dpkg for Debian based distros. It does not
# currently run in CI, and is instead run manually and the resulting dpkg is
# uploaded to GitHub via the web UI.
#
# Note that this requires 'cargo deb', which can be installed with
# 'cargo install cargo-deb'.
#
# This should be run from the root of the ripgrep repo.
if ! command -V cargo-deb > /dev/null 2>&1; then
echo "cargo-deb command missing" >&2
exit 1
fi
# 'cargo deb' does not seem to provide a way to specify an asset that is
# created at build time, such as ripgrep's man page. To work around this,
# we force a debug build, copy out the man page (and shell completions)
# produced from that build, put it into a predictable location and then build
# the deb, which knows where to look.
DEPLOY_DIR=deployment/deb
mkdir -p "$DEPLOY_DIR"
cargo build
# Find and copy man page.
manpage="$(find ./target/debug -name rg.1 -print0 | xargs -0 ls -t | head -n1)"
cp "$manpage" "$DEPLOY_DIR/"
# Do the same for shell completions.
compbash="$(find ./target/debug -name rg.bash -print0 | xargs -0 ls -t | head -n1)"
cp "$compbash" "$DEPLOY_DIR/"
compfish="$(find ./target/debug -name rg.fish -print0 | xargs -0 ls -t | head -n1)"
cp "$compfish" "$DEPLOY_DIR/"
compzsh="complete/_rg"
cp "$compzsh" "$DEPLOY_DIR/"
# Since we're distributing the dpkg, we don't know whether the user will have
# PCRE2 installed, so just do a static build.
PCRE2_SYS_STATIC=1 cargo deb


@@ -44,6 +44,12 @@ _rg() {
'(: * -)'{-h,--help}'[display help information]'
'(: * -)'{-V,--version}'[display version information]'
+ '(buffered)' # buffering options
'--line-buffered[force line buffering]'
$no"--no-line-buffered[don't force line buffering]"
'--block-buffered[force block buffering]'
$no"--no-block-buffered[don't force block buffering]"
+ '(case)' # Case-sensitivity options
{-i,--ignore-case}'[search case-insensitively]'
{-s,--case-sensitive}'[search case-sensitively]'
@@ -71,7 +77,7 @@ _rg() {
$no'--no-encoding[use default text encoding]'
+ file # File-input options
'*'{-f+,--file=}'[specify file containing patterns to search for]: :_files'
'(1)*'{-f+,--file=}'[specify file containing patterns to search for]: :_files'
+ '(file-match)' # Files with/without match options
'(stats)'{-l,--files-with-matches}'[only show names of files with matches]'
@@ -81,6 +87,10 @@ _rg() {
{-H,--with-filename}'[show file name for matches]'
"--no-filename[don't show file name for matches]"
+ '(file-system)' # File system options
"--one-file-system[don't descend into directories on other file systems]"
$no'--no-one-file-system[descend into directories on other file systems]'
+ '(fixed)' # Fixed-string options
{-F,--fixed-strings}'[treat pattern as literal string instead of regular expression]'
$no"--no-fixed-strings[don't treat pattern as literal string]"
@@ -166,13 +176,16 @@ _rg() {
$no'(pcre2-unicode)--no-pcre2[disable matching with PCRE2]'
+ '(pcre2-unicode)' # PCRE2 Unicode options
$no'(--no-pcre2-unicode)--pcre2-unicode[enable PCRE2 Unicode mode (with -P)]'
'(--no-pcre2-unicode)--no-pcre2-unicode[disable PCRE2 Unicode mode (with -P)]'
$no'(--no-pcre2 --no-pcre2-unicode)--pcre2-unicode[enable PCRE2 Unicode mode (with -P)]'
'(--no-pcre2 --pcre2-unicode)--no-pcre2-unicode[disable PCRE2 Unicode mode (with -P)]'
+ '(pre)' # Preprocessing options
'(-z --search-zip)--pre=[specify preprocessor utility]:preprocessor utility:_command_names -e'
$no'--no-pre[disable preprocessor utility]'
+ pre-glob # Preprocessing glob options
'*--pre-glob[include/exclude files for preprocessing with --pre]'
+ '(pretty-vimgrep)' # Pretty/vimgrep display options
'(heading)'{-p,--pretty}'[alias for --color=always --heading -n]'
'(heading passthru)--vimgrep[show results in vim-compatible format]'
@@ -184,8 +197,21 @@ _rg() {
{-r+,--replace=}'[specify string used to replace matches]:replace string'
+ '(sort)' # File-sorting options
'(threads)--sort-files[sort results by file path (disables parallelism)]'
$no"--no-sort-files[don't sort results by file path]"
'(threads)--sort=[sort results in ascending order (disables parallelism)]:sort method:((
none\:"no sorting"
path\:"sort by file path"
modified\:"sort by last modified time"
accessed\:"sort by last accessed time"
created\:"sort by creation time"
))'
'(threads)--sortr=[sort results in descending order (disables parallelism)]:sort method:((
none\:"no sorting"
path\:"sort by file path"
modified\:"sort by last modified time"
accessed\:"sort by last accessed time"
created\:"sort by creation time"
))'
'!(threads)--sort-files[sort results by file path (disables parallelism)]'
+ '(stats)' # Statistics options
'(--files file-match)--stats[show search statistics]'
@@ -196,7 +222,7 @@ _rg() {
$no"(--null-data)--no-text[don't search binary files as if they were text]"
+ '(threads)' # Thread-count options
'(--sort-files)'{-j+,--threads=}'[specify approximate number of threads to use]:number of threads'
'(sort)'{-j+,--threads=}'[specify approximate number of threads to use]:number of threads'
+ '(trim)' # Trim options
'--trim[trim any ASCII whitespace prefix from each line]'


@@ -28,27 +28,37 @@ Synopsis
DESCRIPTION
-----------
ripgrep (rg) recursively searches your current directory for a regex pattern.
By default, ripgrep will respect your `.gitignore` and automatically skip
hidden files/directories and binary files.
By default, ripgrep will respect your .gitignore and automatically skip hidden
files/directories and binary files.
ripgrep's regex engine uses finite automata and guarantees linear time
searching. Because of this, features like backreferences and arbitrary
lookaround are not supported.
ripgrep's default regex engine uses finite automata and guarantees linear
time searching. Because of this, features like backreferences and arbitrary
look-around are not supported. However, if ripgrep is built with PCRE2, then
the --pcre2 flag can be used to enable backreferences and look-around.
ripgrep supports configuration files. Set RIPGREP_CONFIG_PATH to a
configuration file. The file can specify one shell argument per line. Lines
starting with '#' are ignored. For more details, see the man page or the
README.
REGEX SYNTAX
------------
ripgrep uses Rust's regex engine, which documents its syntax:
https://docs.rs/regex/0.2.5/regex/#syntax
ripgrep uses Rust's regex engine by default, which documents its syntax:
https://docs.rs/regex/*/regex/#syntax
ripgrep uses byte-oriented regexes, which has some additional documentation:
https://docs.rs/regex/0.2.5/regex/bytes/index.html#syntax
https://docs.rs/regex/*/regex/bytes/index.html#syntax
To a first approximation, ripgrep uses Perl-like regexes without look-around or
backreferences. This makes them very similar to the "extended" (ERE) regular
expressions supported by `egrep`, but with a few additional features like
Unicode character classes.
If you're using ripgrep with the --pcre2 flag, then please consult
https://www.pcre.org or the PCRE2 man pages for documentation on the supported
syntax.
POSITIONAL ARGUMENTS
--------------------
@@ -58,7 +68,7 @@ _PATTERN_::
_PATH_::
A file or directory to search. Directories are searched recursively. Paths
specified expicitly on the command line override glob and ignore rules.
specified explicitly on the command line override glob and ignore rules.
OPTIONS


@@ -1,6 +1,6 @@
[package]
name = "globset"
version = "0.4.1" #:version
version = "0.4.2" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
Cross platform single glob and glob set matching. Glob set matching is the
@@ -19,14 +19,14 @@ name = "globset"
bench = false
[dependencies]
aho-corasick = "0.6.0"
fnv = "1.0"
log = "0.4"
memchr = "2"
regex = "1"
aho-corasick = "0.6.8"
fnv = "1.0.6"
log = "0.4.5"
memchr = "2.1.0"
regex = "1.1.0"
[dev-dependencies]
glob = "0.2"
glob = "0.2.11"
[features]
simd-accel = []

grep-cli/Cargo.toml (new file, 25 lines)

@@ -0,0 +1,25 @@
[package]
name = "grep-cli"
version = "0.1.1" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
Utilities for search oriented command line applications.
"""
documentation = "https://docs.rs/grep-cli"
homepage = "https://github.com/BurntSushi/ripgrep"
repository = "https://github.com/BurntSushi/ripgrep"
readme = "README.md"
keywords = ["regex", "grep", "cli", "utility", "util"]
license = "Unlicense/MIT"
[dependencies]
atty = "0.2.11"
globset = { version = "0.4.2", path = "../globset" }
lazy_static = "1.1.0"
log = "0.4.5"
regex = "1.1"
same-file = "1.0.4"
termcolor = "1.0.4"
[target.'cfg(windows)'.dependencies.winapi-util]
version = "0.1.1"

grep-cli/LICENSE-MIT (new file, 21 lines)

@@ -0,0 +1,21 @@
The MIT License (MIT)
Copyright (c) 2015 Andrew Gallant
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

grep-cli/README.md (new file, 38 lines)

@@ -0,0 +1,38 @@
grep-cli
--------
A utility library that provides common routines desired in search oriented
command line applications. This includes, but is not limited to, parsing hex
escapes, detecting whether stdin is readable and more. To the extent possible,
this crate strives for compatibility across Windows, macOS and Linux.
[![Linux build status](https://api.travis-ci.org/BurntSushi/ripgrep.svg)](https://travis-ci.org/BurntSushi/ripgrep)
[![Windows build status](https://ci.appveyor.com/api/projects/status/github/BurntSushi/ripgrep?svg=true)](https://ci.appveyor.com/project/BurntSushi/ripgrep)
[![](https://img.shields.io/crates/v/grep-cli.svg)](https://crates.io/crates/grep-cli)
Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org).
### Documentation
[https://docs.rs/grep-cli](https://docs.rs/grep-cli)
**NOTE:** You probably don't want to use this crate directly. Instead, you
should prefer the facade defined in the
[`grep`](https://docs.rs/grep)
crate.
### Usage
Add this to your `Cargo.toml`:
```toml
[dependencies]
grep-cli = "0.1"
```
and this to your crate root:
```rust
extern crate grep_cli;
```

grep-cli/UNLICENSE (new file, 24 lines)

@@ -0,0 +1,24 @@
This is free and unencumbered software released into the public domain.
Anyone is free to copy, modify, publish, use, compile, sell, or
distribute this software, either in source code form or as a compiled
binary, for any purpose, commercial or non-commercial, and by any
means.
In jurisdictions that recognize copyright laws, the author or authors
of this software dedicate any and all copyright interest in the
software to the public domain. We make this dedication for the benefit
of the public at large and to the detriment of our heirs and
successors. We intend this dedication to be an overt act of
relinquishment in perpetuity of all present and future rights to this
software under copyright law.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR
OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
OTHER DEALINGS IN THE SOFTWARE.
For more information, please refer to <http://unlicense.org/>

grep-cli/src/decompress.rs (new file, 381 lines)

@@ -0,0 +1,381 @@
use std::ffi::{OsStr, OsString};
use std::fs::File;
use std::io;
use std::path::Path;
use std::process::Command;
use globset::{Glob, GlobSet, GlobSetBuilder};
use process::{CommandError, CommandReader, CommandReaderBuilder};
/// A builder for a matcher that determines which files get decompressed.
#[derive(Clone, Debug)]
pub struct DecompressionMatcherBuilder {
/// The commands for each matching glob.
commands: Vec<DecompressionCommand>,
/// Whether to include the default matching rules.
defaults: bool,
}
/// A representation of a single command for decompressing data
/// out-of-proccess.
#[derive(Clone, Debug)]
struct DecompressionCommand {
/// The glob that matches this command.
glob: String,
/// The command or binary name.
bin: OsString,
/// The arguments to invoke with the command.
args: Vec<OsString>,
}
impl Default for DecompressionMatcherBuilder {
fn default() -> DecompressionMatcherBuilder {
DecompressionMatcherBuilder::new()
}
}
impl DecompressionMatcherBuilder {
/// Create a new builder for configuring a decompression matcher.
pub fn new() -> DecompressionMatcherBuilder {
DecompressionMatcherBuilder {
commands: vec![],
defaults: true,
}
}
/// Build a matcher for determining how to decompress files.
///
/// If there was a problem compiling the matcher, then an error is
/// returned.
pub fn build(&self) -> Result<DecompressionMatcher, CommandError> {
let defaults =
if !self.defaults {
vec![]
} else {
default_decompression_commands()
};
let mut glob_builder = GlobSetBuilder::new();
let mut commands = vec![];
for decomp_cmd in defaults.iter().chain(&self.commands) {
let glob = Glob::new(&decomp_cmd.glob).map_err(|err| {
CommandError::io(io::Error::new(io::ErrorKind::Other, err))
})?;
glob_builder.add(glob);
commands.push(decomp_cmd.clone());
}
let globs = glob_builder.build().map_err(|err| {
CommandError::io(io::Error::new(io::ErrorKind::Other, err))
})?;
Ok(DecompressionMatcher { globs, commands })
}
/// When enabled, the default matching rules will be compiled into this
/// matcher before any other associations. When disabled, only the
/// rules explicitly given to this builder will be used.
///
/// This is enabled by default.
pub fn defaults(&mut self, yes: bool) -> &mut DecompressionMatcherBuilder {
self.defaults = yes;
self
}
/// Associates a glob with a command to decompress files matching the glob.
///
/// If multiple globs match the same file, then the most recently added
/// glob takes precedence.
///
/// The syntax for the glob is documented in the
/// [`globset` crate](https://docs.rs/globset/#syntax).
pub fn associate<P, I, A>(
&mut self,
glob: &str,
program: P,
args: I,
) -> &mut DecompressionMatcherBuilder
where P: AsRef<OsStr>,
I: IntoIterator<Item=A>,
A: AsRef<OsStr>,
{
let glob = glob.to_string();
let bin = program.as_ref().to_os_string();
let args = args
.into_iter()
.map(|a| a.as_ref().to_os_string())
.collect();
self.commands.push(DecompressionCommand { glob, bin, args });
self
}
}
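// A usage sketch (not part of the original source): layer a hypothetical
// `*.zst` association for `zstd` on top of the defaults. Since it is added
// last, it takes precedence over any default glob that also matches.
//
//     let mut builder = DecompressionMatcherBuilder::new();
//     builder.associate("*.zst", "zstd", &["-d", "-c"]);
//     let matcher = builder.build()?;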
/// A matcher for determining how to decompress files.
#[derive(Clone, Debug)]
pub struct DecompressionMatcher {
/// The set of globs to match. Each glob has a corresponding entry in
/// `commands`. When a glob matches, the corresponding command should be
/// used to perform out-of-process decompression.
globs: GlobSet,
/// The commands for each matching glob.
commands: Vec<DecompressionCommand>,
}
impl Default for DecompressionMatcher {
fn default() -> DecompressionMatcher {
DecompressionMatcher::new()
}
}
impl DecompressionMatcher {
/// Create a new matcher with default rules.
///
/// To add more matching rules, build a matcher with
/// [`DecompressionMatcherBuilder`](struct.DecompressionMatcherBuilder.html).
pub fn new() -> DecompressionMatcher {
DecompressionMatcherBuilder::new()
.build()
.expect("built-in matching rules should always compile")
}
/// Return a pre-built command based on the given file path that can
/// decompress its contents. If no such decompressor is known, then this
/// returns `None`.
///
/// If there are multiple possible commands matching the given path, then
/// the command added last takes precedence.
pub fn command<P: AsRef<Path>>(&self, path: P) -> Option<Command> {
for i in self.globs.matches(path).into_iter().rev() {
let decomp_cmd = &self.commands[i];
let mut cmd = Command::new(&decomp_cmd.bin);
cmd.args(&decomp_cmd.args);
return Some(cmd);
}
None
}
/// Returns true if and only if the given file path has at least one
/// matching command to perform decompression on.
pub fn has_command<P: AsRef<Path>>(&self, path: P) -> bool {
self.globs.is_match(path)
}
}
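// A lookup sketch (not part of the original source): the default rules map
// `*.gz` to `gzip -d -c`, so a matching path yields a ready-to-spawn command.
// Note that the caller must append the path itself, as the reader builder
// below does.
//
//     let matcher = DecompressionMatcher::new();
//     if let Some(mut cmd) = matcher.command("some-file.gz") {
//         cmd.arg("some-file.gz");
//         // spawn `cmd` and stream its stdout...
//     }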
/// Configures and builds a streaming reader for decompressing data.
#[derive(Clone, Debug, Default)]
pub struct DecompressionReaderBuilder {
matcher: DecompressionMatcher,
command_builder: CommandReaderBuilder,
}
impl DecompressionReaderBuilder {
/// Create a new builder with the default configuration.
pub fn new() -> DecompressionReaderBuilder {
DecompressionReaderBuilder::default()
}
/// Build a new streaming reader for decompressing data.
///
/// If decompression is done out-of-process and if there was a problem
/// spawning the process, then its error is logged at the debug level and a
/// passthru reader is returned that does no decompression. This behavior
/// typically occurs when the given file path matches a decompression
/// command, but that command is not available in the current environment.
///
/// If the given file path could not be matched with a decompression
/// strategy, then a passthru reader is returned that does no
/// decompression.
pub fn build<P: AsRef<Path>>(
&self,
path: P,
) -> Result<DecompressionReader, CommandError> {
let path = path.as_ref();
let mut cmd = match self.matcher.command(path) {
None => return DecompressionReader::new_passthru(path),
Some(cmd) => cmd,
};
cmd.arg(path);
match self.command_builder.build(&mut cmd) {
Ok(cmd_reader) => Ok(DecompressionReader { rdr: Ok(cmd_reader) }),
Err(err) => {
debug!(
"{}: error spawning command '{:?}': {} \
(falling back to uncompressed reader)",
path.display(),
cmd,
err,
);
DecompressionReader::new_passthru(path)
}
}
}
/// Set the matcher to use to look up the decompression command for each
/// file path.
///
/// A set of sensible rules is enabled by default. Setting this will
/// completely replace the current rules.
pub fn matcher(
&mut self,
matcher: DecompressionMatcher,
) -> &mut DecompressionReaderBuilder {
self.matcher = matcher;
self
}
/// Get the underlying matcher currently used by this builder.
pub fn get_matcher(&self) -> &DecompressionMatcher {
&self.matcher
}
/// When enabled, the reader will asynchronously read the contents of the
/// command's stderr output. When disabled, stderr is only read after the
/// stdout stream has been exhausted (or if the process quits with an error
/// code).
///
/// Note that when enabled, this may require launching an additional
/// thread in order to read stderr. This is done so that the process being
/// executed is never blocked from writing to stdout or stderr. If this is
/// disabled, then it is possible for the process to fill up the stderr
/// buffer and deadlock.
///
/// This is enabled by default.
pub fn async_stderr(
&mut self,
yes: bool,
) -> &mut DecompressionReaderBuilder {
self.command_builder.async_stderr(yes);
self
}
}
/// A streaming reader for decompressing the contents of a file.
///
/// The purpose of this reader is to provide a seamless way to decompress the
/// contents of a file using existing tools in the current environment. This is
/// meant to be an alternative to using decompression libraries in favor of the
/// simplicity and portability of using external commands such as `gzip` and
/// `xz`. This does impose the overhead of spawning a process, so other means
/// for performing decompression should be sought if this overhead isn't
/// acceptable.
///
/// A decompression reader comes with a default set of matching rules that are
/// meant to associate file paths with the corresponding command to use to
/// decompress them. For example, a glob like `*.gz` matches gzip compressed
/// files with the command `gzip -d -c`. If a file path does not match any
/// existing rules, or if it matches a rule whose command does not exist in the
/// current environment, then the decompression reader passes through the
/// contents of the underlying file without doing any decompression.
///
/// The default matching rules are probably good enough for most cases, and if
/// they require revision, pull requests are welcome. In cases where they must
/// be changed or extended, they can be customized through the use of
/// [`DecompressionMatcherBuilder`](struct.DecompressionMatcherBuilder.html)
/// and
/// [`DecompressionReaderBuilder`](struct.DecompressionReaderBuilder.html).
///
/// By default, this reader will asynchronously read the process's stderr.
/// This prevents subtle deadlocking bugs for noisy processes that write a lot
/// to stderr. Currently, the entire contents of stderr are read onto the heap.
///
/// # Example
///
/// This example shows how to read the decompressed contents of a file without
/// needing to explicitly choose the decompression command to run.
///
/// Note that if you need to decompress multiple files, it is better to use
/// `DecompressionReaderBuilder`, which will amortize the cost of compiling the
/// matcher.
///
/// ```no_run
/// use std::io::Read;
/// use std::process::Command;
/// use grep_cli::DecompressionReader;
///
/// # fn example() -> Result<(), Box<::std::error::Error>> {
/// let mut rdr = DecompressionReader::new("/usr/share/man/man1/ls.1.gz")?;
/// let mut contents = vec![];
/// rdr.read_to_end(&mut contents)?;
/// # Ok(()) }
/// ```
#[derive(Debug)]
pub struct DecompressionReader {
rdr: Result<CommandReader, File>,
}
impl DecompressionReader {
/// Build a new streaming reader for decompressing data.
///
/// If decompression is done out-of-process and if there was a problem
/// spawning the process, then its error is returned.
///
/// If the given file path could not be matched with a decompression
/// strategy, then a passthru reader is returned that does no
/// decompression.
///
/// This uses the default matching rules for determining how to decompress
/// the given file. To change those matching rules, use
/// [`DecompressionReaderBuilder`](struct.DecompressionReaderBuilder.html)
/// and
/// [`DecompressionMatcherBuilder`](struct.DecompressionMatcherBuilder.html).
///
/// When creating readers for many paths, it is better to use the builder
/// since it will amortize the cost of constructing the matcher.
pub fn new<P: AsRef<Path>>(
path: P,
) -> Result<DecompressionReader, CommandError> {
DecompressionReaderBuilder::new().build(path)
}
/// Creates a new "passthru" decompression reader that reads from the file
/// corresponding to the given path without doing decompression and without
/// executing another process.
fn new_passthru(path: &Path) -> Result<DecompressionReader, CommandError> {
let file = File::open(path)?;
Ok(DecompressionReader { rdr: Err(file) })
}
}
impl io::Read for DecompressionReader {
fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
match self.rdr {
Ok(ref mut rdr) => rdr.read(buf),
Err(ref mut rdr) => rdr.read(buf),
}
}
}
fn default_decompression_commands() -> Vec<DecompressionCommand> {
const ARGS_GZIP: &[&str] = &["gzip", "-d", "-c"];
const ARGS_BZIP: &[&str] = &["bzip2", "-d", "-c"];
const ARGS_XZ: &[&str] = &["xz", "-d", "-c"];
const ARGS_LZ4: &[&str] = &["lz4", "-d", "-c"];
const ARGS_LZMA: &[&str] = &["xz", "--format=lzma", "-d", "-c"];
fn cmd(glob: &str, args: &[&str]) -> DecompressionCommand {
DecompressionCommand {
glob: glob.to_string(),
bin: OsStr::new(&args[0]).to_os_string(),
args: args
.iter()
.skip(1)
.map(|s| OsStr::new(s).to_os_string())
.collect(),
}
}
vec![
cmd("*.gz", ARGS_GZIP),
cmd("*.tgz", ARGS_GZIP),
cmd("*.bz2", ARGS_BZIP),
cmd("*.tbz2", ARGS_BZIP),
cmd("*.xz", ARGS_XZ),
cmd("*.txz", ARGS_XZ),
cmd("*.lz4", ARGS_LZ4),
cmd("*.lzma", ARGS_LZMA),
]
}

grep-cli/src/escape.rs

@@ -0,0 +1,315 @@
use std::ffi::OsStr;
use std::str;
/// A single state in the state machine used by `unescape`.
#[derive(Clone, Copy, Eq, PartialEq)]
enum State {
/// The state after seeing a `\`.
Escape,
/// The state after seeing a `\x`.
HexFirst,
/// The state after seeing a `\x[0-9A-Fa-f]`.
HexSecond(char),
/// Default state.
Literal,
}
/// Escapes arbitrary bytes into a human readable string.
///
/// This converts `\t`, `\r` and `\n` into their escaped forms. It also
/// converts the non-printable subset of ASCII in addition to invalid UTF-8
/// bytes to hexadecimal escape sequences. Everything else is left as is.
///
/// The dual of this routine is [`unescape`](fn.unescape.html).
///
/// # Example
///
/// This example shows how to convert a byte string that contains a `\n` and
/// invalid UTF-8 bytes into a `String`.
///
/// Pay special attention to the use of raw strings. That is, `r"\n"` is
/// equivalent to `"\\n"`.
///
/// ```
/// use grep_cli::escape;
///
/// assert_eq!(r"foo\nbar\xFFbaz", escape(b"foo\nbar\xFFbaz"));
/// ```
pub fn escape(mut bytes: &[u8]) -> String {
let mut escaped = String::new();
while let Some(result) = decode_utf8(bytes) {
match result {
Ok(cp) => {
escape_char(cp, &mut escaped);
bytes = &bytes[cp.len_utf8()..];
}
Err(byte) => {
escape_byte(byte, &mut escaped);
bytes = &bytes[1..];
}
}
}
escaped
}
/// Escapes an OS string into a human readable string.
///
/// This is like [`escape`](fn.escape.html), but accepts an OS string.
pub fn escape_os(string: &OsStr) -> String {
#[cfg(unix)]
fn imp(string: &OsStr) -> String {
use std::os::unix::ffi::OsStrExt;
escape(string.as_bytes())
}
#[cfg(not(unix))]
fn imp(string: &OsStr) -> String {
escape(string.to_string_lossy().as_bytes())
}
imp(string)
}
/// Unescapes a string.
///
/// It supports a limited set of escape sequences:
///
/// * `\t`, `\r` and `\n` are mapped to their corresponding ASCII bytes.
/// * `\xZZ` hexadecimal escapes are mapped to their byte.
///
/// Everything else is left as is, including non-hexadecimal escapes like
/// `\xGG`.
///
/// This is useful when it is desirable for a command line argument to be
/// capable of specifying arbitrary bytes, or to otherwise make it easier to
/// specify non-printable characters.
///
/// The dual of this routine is [`escape`](fn.escape.html).
///
/// # Example
///
/// This example shows how to convert an escaped string (which is valid UTF-8)
/// into a corresponding sequence of bytes. Each escape sequence is mapped to
/// its bytes, which may include invalid UTF-8.
///
/// Pay special attention to the use of raw strings. That is, `r"\n"` is
/// equivalent to `"\\n"`.
///
/// ```
/// use grep_cli::unescape;
///
/// assert_eq!(&b"foo\nbar\xFFbaz"[..], &*unescape(r"foo\nbar\xFFbaz"));
/// ```
pub fn unescape(s: &str) -> Vec<u8> {
use self::State::*;
let mut bytes = vec![];
let mut state = Literal;
for c in s.chars() {
match state {
Escape => {
match c {
'\\' => { bytes.push(b'\\'); state = Literal; }
'n' => { bytes.push(b'\n'); state = Literal; }
'r' => { bytes.push(b'\r'); state = Literal; }
't' => { bytes.push(b'\t'); state = Literal; }
'x' => { state = HexFirst; }
c => {
bytes.extend(format!(r"\{}", c).into_bytes());
state = Literal;
}
}
}
HexFirst => {
match c {
'0'...'9' | 'A'...'F' | 'a'...'f' => {
state = HexSecond(c);
}
c => {
bytes.extend(format!(r"\x{}", c).into_bytes());
state = Literal;
}
}
}
HexSecond(first) => {
match c {
'0'...'9' | 'A'...'F' | 'a'...'f' => {
let ordinal = format!("{}{}", first, c);
let byte = u8::from_str_radix(&ordinal, 16).unwrap();
bytes.push(byte);
state = Literal;
}
c => {
let original = format!(r"\x{}{}", first, c);
bytes.extend(original.into_bytes());
state = Literal;
}
}
}
Literal => {
match c {
'\\' => { state = Escape; }
c => { bytes.extend(c.to_string().as_bytes()); }
}
}
}
}
match state {
Escape => bytes.push(b'\\'),
HexFirst => bytes.extend(b"\\x"),
HexSecond(c) => bytes.extend(format!("\\x{}", c).into_bytes()),
Literal => {}
}
bytes
}
/// Unescapes an OS string.
///
/// This is like [`unescape`](fn.unescape.html), but accepts an OS string.
///
/// Note that this first lossily decodes the given OS string as UTF-8. That
/// is, an escaped string (the thing given) should be valid UTF-8.
pub fn unescape_os(string: &OsStr) -> Vec<u8> {
unescape(&string.to_string_lossy())
}
/// Adds the given codepoint to the given string, escaping it if necessary.
fn escape_char(cp: char, into: &mut String) {
if cp.is_ascii() {
escape_byte(cp as u8, into);
} else {
into.push(cp);
}
}
/// Adds the given byte to the given string, escaping it if necessary.
fn escape_byte(byte: u8, into: &mut String) {
match byte {
0x21...0x5B | 0x5D...0x7D => into.push(byte as char),
b'\n' => into.push_str(r"\n"),
b'\r' => into.push_str(r"\r"),
b'\t' => into.push_str(r"\t"),
b'\\' => into.push_str(r"\\"),
_ => into.push_str(&format!(r"\x{:02X}", byte)),
}
}
/// Decodes the next UTF-8 encoded codepoint from the given byte slice.
///
/// If no valid encoding of a codepoint exists at the beginning of the given
/// byte slice, then the first byte is returned instead.
///
/// This returns `None` if and only if `bytes` is empty.
fn decode_utf8(bytes: &[u8]) -> Option<Result<char, u8>> {
if bytes.is_empty() {
return None;
}
let len = match utf8_len(bytes[0]) {
None => return Some(Err(bytes[0])),
Some(len) if len > bytes.len() => return Some(Err(bytes[0])),
Some(len) => len,
};
match str::from_utf8(&bytes[..len]) {
Ok(s) => Some(Ok(s.chars().next().unwrap())),
Err(_) => Some(Err(bytes[0])),
}
}
/// Given a UTF-8 leading byte, this returns the total number of code units
/// in the following encoded codepoint.
///
/// If the given byte is not a valid UTF-8 leading byte, then this returns
/// `None`.
fn utf8_len(byte: u8) -> Option<usize> {
if byte <= 0x7F {
Some(1)
} else if byte <= 0b110_11111 {
Some(2)
} else if byte <= 0b1110_1111 {
Some(3)
} else if byte <= 0b1111_0111 {
Some(4)
} else {
None
}
}
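// Expected values for a few leading bytes, mirroring the ranges above (a
// sketch, not part of the original source):
//
//     assert_eq!(Some(1), utf8_len(0x61)); // ASCII 'a'
//     assert_eq!(Some(2), utf8_len(0xC3)); // leading byte of a 2-byte sequence
//     assert_eq!(Some(3), utf8_len(0xE2)); // leading byte of a 3-byte sequence
//     assert_eq!(Some(4), utf8_len(0xF0)); // leading byte of a 4-byte sequence
//     assert_eq!(None, utf8_len(0xFF));    // never a valid leading byte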
#[cfg(test)]
mod tests {
use super::{escape, unescape};
fn b(bytes: &'static [u8]) -> Vec<u8> {
bytes.to_vec()
}
#[test]
fn empty() {
assert_eq!(b(b""), unescape(r""));
assert_eq!(r"", escape(b""));
}
#[test]
fn backslash() {
assert_eq!(b(b"\\"), unescape(r"\\"));
assert_eq!(r"\\", escape(b"\\"));
}
#[test]
fn nul() {
assert_eq!(b(b"\x00"), unescape(r"\x00"));
assert_eq!(r"\x00", escape(b"\x00"));
}
#[test]
fn nl() {
assert_eq!(b(b"\n"), unescape(r"\n"));
assert_eq!(r"\n", escape(b"\n"));
}
#[test]
fn tab() {
assert_eq!(b(b"\t"), unescape(r"\t"));
assert_eq!(r"\t", escape(b"\t"));
}
#[test]
fn carriage() {
assert_eq!(b(b"\r"), unescape(r"\r"));
assert_eq!(r"\r", escape(b"\r"));
}
#[test]
fn nothing_simple() {
assert_eq!(b(b"\\a"), unescape(r"\a"));
assert_eq!(b(b"\\a"), unescape(r"\\a"));
assert_eq!(r"\\a", escape(b"\\a"));
}
#[test]
fn nothing_hex0() {
assert_eq!(b(b"\\x"), unescape(r"\x"));
assert_eq!(b(b"\\x"), unescape(r"\\x"));
assert_eq!(r"\\x", escape(b"\\x"));
}
#[test]
fn nothing_hex1() {
assert_eq!(b(b"\\xz"), unescape(r"\xz"));
assert_eq!(b(b"\\xz"), unescape(r"\\xz"));
assert_eq!(r"\\xz", escape(b"\\xz"));
}
#[test]
fn nothing_hex2() {
assert_eq!(b(b"\\xzz"), unescape(r"\xzz"));
assert_eq!(b(b"\\xzz"), unescape(r"\\xzz"));
assert_eq!(r"\\xzz", escape(b"\\xzz"));
}
#[test]
fn invalid_utf8() {
assert_eq!(r"\xFF", escape(b"\xFF"));
assert_eq!(r"a\xFFb", escape(b"a\xFFb"));
}
}

grep-cli/src/human.rs

@@ -0,0 +1,171 @@
use std::error;
use std::fmt;
use std::io;
use std::num::ParseIntError;
use regex::Regex;
/// An error that occurs when parsing a human readable size description.
///
/// This error provides an end user friendly message describing why the
/// description couldn't be parsed and what the expected format is.
#[derive(Clone, Debug, Eq, PartialEq)]
pub struct ParseSizeError {
original: String,
kind: ParseSizeErrorKind,
}
#[derive(Clone, Debug, Eq, PartialEq)]
enum ParseSizeErrorKind {
InvalidFormat,
InvalidInt(ParseIntError),
Overflow,
}
impl ParseSizeError {
fn format(original: &str) -> ParseSizeError {
ParseSizeError {
original: original.to_string(),
kind: ParseSizeErrorKind::InvalidFormat,
}
}
fn int(original: &str, err: ParseIntError) -> ParseSizeError {
ParseSizeError {
original: original.to_string(),
kind: ParseSizeErrorKind::InvalidInt(err),
}
}
fn overflow(original: &str) -> ParseSizeError {
ParseSizeError {
original: original.to_string(),
kind: ParseSizeErrorKind::Overflow,
}
}
}
impl error::Error for ParseSizeError {
fn description(&self) -> &str { "invalid size" }
}
impl fmt::Display for ParseSizeError {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
use self::ParseSizeErrorKind::*;
match self.kind {
InvalidFormat => {
write!(
f,
"invalid format for size '{}', which should be a sequence \
of digits followed by an optional 'K', 'M' or 'G' \
suffix",
self.original
)
}
InvalidInt(ref err) => {
write!(
f,
"invalid integer found in size '{}': {}",
self.original,
err
)
}
Overflow => {
write!(f, "size too big in '{}'", self.original)
}
}
}
}
impl From<ParseSizeError> for io::Error {
fn from(size_err: ParseSizeError) -> io::Error {
io::Error::new(io::ErrorKind::Other, size_err)
}
}
/// Parse a human readable size like `2M` into a corresponding number of bytes.
///
/// Supported size suffixes are `K` (for kilobyte), `M` (for megabyte) and `G`
/// (for gigabyte). If a size suffix is missing, then the size is interpreted
/// as bytes. If the size is too big to fit into a `u64`, then this returns an
/// error.
///
/// Additional suffixes may be added over time.
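///
/// # Example
///
/// A small sketch mirroring the tests below:
///
/// ```
/// use grep_cli::parse_human_readable_size;
///
/// assert_eq!(123 * (1<<10), parse_human_readable_size("123K").unwrap());
/// ```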
pub fn parse_human_readable_size(size: &str) -> Result<u64, ParseSizeError> {
lazy_static! {
// Normally I'd just parse something this simple by hand to avoid the
// regex dep, but we bring in regex anyway for glob matching, so might
// as well use it.
static ref RE: Regex = Regex::new(r"^([0-9]+)([KMG])?$").unwrap();
}
let caps = match RE.captures(size) {
Some(caps) => caps,
None => return Err(ParseSizeError::format(size)),
};
let value: u64 = caps[1].parse().map_err(|err| {
ParseSizeError::int(size, err)
})?;
let suffix = match caps.get(2) {
None => return Ok(value),
Some(cap) => cap.as_str(),
};
let bytes = match suffix {
"K" => value.checked_mul(1<<10),
"M" => value.checked_mul(1<<20),
"G" => value.checked_mul(1<<30),
// Because if the regex matches this group, it must be [KMG].
_ => unreachable!(),
};
bytes.ok_or_else(|| ParseSizeError::overflow(size))
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn suffix_none() {
let x = parse_human_readable_size("123").unwrap();
assert_eq!(123, x);
}
#[test]
fn suffix_k() {
let x = parse_human_readable_size("123K").unwrap();
assert_eq!(123 * (1<<10), x);
}
#[test]
fn suffix_m() {
let x = parse_human_readable_size("123M").unwrap();
assert_eq!(123 * (1<<20), x);
}
#[test]
fn suffix_g() {
let x = parse_human_readable_size("123G").unwrap();
assert_eq!(123 * (1<<30), x);
}
#[test]
fn invalid_empty() {
assert!(parse_human_readable_size("").is_err());
}
#[test]
fn invalid_non_digit() {
assert!(parse_human_readable_size("a").is_err());
}
#[test]
fn invalid_overflow() {
assert!(parse_human_readable_size("9999999999999999G").is_err());
}
#[test]
fn invalid_suffix() {
assert!(parse_human_readable_size("123T").is_err());
}
}

grep-cli/src/lib.rs

@@ -0,0 +1,251 @@
/*!
This crate provides common routines used in command line applications, with a
focus on routines useful for search oriented applications. As a utility
library, there is no central type or function. However, a key focus of this
crate is to improve failure modes and provide user friendly error messages
when things go wrong.
To the greatest extent possible, everything in this crate works on Windows, macOS
and Linux.
# Standard I/O
The
[`is_readable_stdin`](fn.is_readable_stdin.html),
[`is_tty_stderr`](fn.is_tty_stderr.html),
[`is_tty_stdin`](fn.is_tty_stdin.html)
and
[`is_tty_stdout`](fn.is_tty_stdout.html)
routines query aspects of standard I/O. `is_readable_stdin` determines whether
stdin can be usefully read from, while the `tty` methods determine whether a
tty is attached to stdin/stdout/stderr.
`is_readable_stdin` is useful when writing an application that changes behavior
based on whether the application was invoked with data on stdin. For example,
`rg foo` might recursively search the current working directory for
occurrences of `foo`, but `rg foo < file` might only search the contents of
`file`.
The `tty` methods are useful for similar reasons. Namely, commands like `ls`
will change their output depending on whether they are printing to a terminal
or not. For example, `ls` shows a file on each line when stdout is redirected
to a file or a pipe, but condenses the output to show possibly many files on
each line when stdout is connected to a tty.
# Coloring and buffering
The
[`stdout`](fn.stdout.html),
[`stdout_buffered_block`](fn.stdout_buffered_block.html)
and
[`stdout_buffered_line`](fn.stdout_buffered_line.html)
routines are alternative constructors for
[`StandardStream`](struct.StandardStream.html).
A `StandardStream` implements `termcolor::WriteColor`, which provides a way
to emit colors to terminals. Its key use is the encapsulation of buffering
style. Namely, `stdout` will return a line buffered `StandardStream` if and
only if stdout is connected to a tty, and will otherwise return a block
buffered `StandardStream`. Line buffering is important for use with a tty
because it typically decreases the latency at which the end user sees output.
Block buffering is used otherwise because it is faster, and redirecting stdout
to a file typically doesn't benefit from the decreased latency that line
buffering provides.
The `stdout_buffered_block` and `stdout_buffered_line` routines can be used to
explicitly set the buffering strategy regardless of whether stdout is connected
to a tty or not.
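For example, a minimal sketch (this assumes a direct dependency on the
`termcolor` crate, whose `ColorChoice` the `stdout` constructor accepts):

```no_run
use grep_cli::stdout;
use termcolor::ColorChoice;

// Line buffered if stdout is connected to a tty, block buffered otherwise.
let mut wtr = stdout(ColorChoice::Auto);
```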
# Escaping
The
[`escape`](fn.escape.html),
[`escape_os`](fn.escape_os.html),
[`unescape`](fn.unescape.html)
and
[`unescape_os`](fn.unescape_os.html)
routines provide a user friendly way of dealing with UTF-8 encoded strings that
can express arbitrary bytes. For example, you might want to accept a string
containing arbitrary bytes as a command line argument, but most interactive
shells make such strings difficult to type. Instead, we can ask users to use
escape sequences.
For example, `a\xFFz` is itself a valid UTF-8 string corresponding to the
following bytes:
```ignore
[b'a', b'\\', b'x', b'F', b'F', b'z']
```
However, we can
interpret `\xFF` as an escape sequence with the `unescape`/`unescape_os`
routines, which will yield
```ignore
[b'a', b'\xFF', b'z']
```
instead. For example:
```
use grep_cli::unescape;
// Note the use of a raw string!
assert_eq!(vec![b'a', b'\xFF', b'z'], unescape(r"a\xFFz"));
```
The `escape`/`escape_os` routines provide the reverse transformation, which
makes it easy to show user friendly error messages involving arbitrary bytes.
# Building patterns
Typically, regular expression patterns must be valid UTF-8. However, command
line arguments aren't guaranteed to be valid UTF-8. Unfortunately, the
standard library's UTF-8 conversion functions from `OsStr`s do not provide
good error messages. However, the
[`pattern_from_bytes`](fn.pattern_from_bytes.html)
and
[`pattern_from_os`](fn.pattern_from_os.html)
routines do, including reporting exactly where the first invalid UTF-8 byte is seen.
Additionally, it can be useful to read patterns from a file while reporting
good error messages that include line numbers. The
[`patterns_from_path`](fn.patterns_from_path.html),
[`patterns_from_reader`](fn.patterns_from_reader.html)
and
[`patterns_from_stdin`](fn.patterns_from_stdin.html)
routines do just that. If any pattern is found that is invalid UTF-8, then the
error includes the file path (if available) along with the line number and the
byte offset at which the first invalid UTF-8 byte was observed.
# Read process output
Sometimes a command line application needs to execute other processes and read
their stdout in a streaming fashion. The
[`CommandReader`](struct.CommandReader.html)
provides this functionality with an explicit goal of improving failure modes.
In particular, if the process exits with an error code, then stderr is read
and converted into a normal Rust error to show to end users. This makes the
underlying failure modes explicit and gives more information to end users for
debugging the problem.
As a special case,
[`DecompressionReader`](struct.DecompressionReader.html)
provides a way to decompress arbitrary files by matching their file extensions
up with corresponding decompression programs (such as `gzip` and `xz`). This
is useful as a means of performing simplistic decompression in a portable
manner without binding to specific compression libraries. This does come with
some overhead though, so if you need to decompress lots of small files, this
may not be an appropriate convenience to use.
Each reader has a corresponding builder for additional configuration, such as
whether to read stderr asynchronously in order to avoid deadlock (which is
enabled by default).
# Miscellaneous parsing
The
[`parse_human_readable_size`](fn.parse_human_readable_size.html)
routine parses strings like `2M` and converts them to the corresponding number
of bytes (`2 * 1<<20` in this case). If an invalid size is found, then a good
error message is crafted that typically tells the user how to fix the problem.
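For example:

```
use grep_cli::parse_human_readable_size;

assert_eq!(2 * (1<<20), parse_human_readable_size("2M").unwrap());
```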
*/
#![deny(missing_docs)]
extern crate atty;
extern crate globset;
#[macro_use]
extern crate lazy_static;
#[macro_use]
extern crate log;
extern crate regex;
extern crate same_file;
extern crate termcolor;
#[cfg(windows)]
extern crate winapi_util;
mod decompress;
mod escape;
mod human;
mod pattern;
mod process;
mod wtr;
pub use decompress::{
DecompressionMatcher, DecompressionMatcherBuilder,
DecompressionReader, DecompressionReaderBuilder,
};
pub use escape::{escape, escape_os, unescape, unescape_os};
pub use human::{ParseSizeError, parse_human_readable_size};
pub use pattern::{
InvalidPatternError,
pattern_from_os, pattern_from_bytes,
patterns_from_path, patterns_from_reader, patterns_from_stdin,
};
pub use process::{CommandError, CommandReader, CommandReaderBuilder};
pub use wtr::{
StandardStream,
stdout, stdout_buffered_line, stdout_buffered_block,
};
/// Returns true if and only if stdin is believed to be readable.
///
/// When stdin is readable, command line programs may choose to behave
/// differently than when stdin is not readable. For example, `command foo`
might search the current directory for occurrences of `foo`, whereas
/// `command foo < some-file` or `cat some-file | command foo` might instead
/// only search stdin for occurrences of `foo`.
pub fn is_readable_stdin() -> bool {
#[cfg(unix)]
fn imp() -> bool {
use std::os::unix::fs::FileTypeExt;
use same_file::Handle;
let ft = match Handle::stdin().and_then(|h| h.as_file().metadata()) {
Err(_) => return false,
Ok(md) => md.file_type(),
};
ft.is_file() || ft.is_fifo()
}
#[cfg(windows)]
fn imp() -> bool {
use winapi_util as winutil;
winutil::file::typ(winutil::HandleRef::stdin())
.map(|t| t.is_disk() || t.is_pipe())
.unwrap_or(false)
}
!is_tty_stdin() && imp()
}
/// Returns true if and only if stdin is believed to be connected to a tty
/// or a console.
pub fn is_tty_stdin() -> bool {
atty::is(atty::Stream::Stdin)
}
/// Returns true if and only if stdout is believed to be connected to a tty
/// or a console.
///
/// This is useful for when you want your command line program to produce
/// different output depending on whether it's printing directly to a user's
/// terminal or whether it's being redirected somewhere else. For example,
/// implementations of `ls` will often show one item per line when stdout is
/// redirected, but will condense output when printing to a tty.
pub fn is_tty_stdout() -> bool {
atty::is(atty::Stream::Stdout)
}
/// Returns true if and only if stderr is believed to be connected to a tty
/// or a console.
pub fn is_tty_stderr() -> bool {
atty::is(atty::Stream::Stderr)
}

grep-cli/src/pattern.rs

@@ -0,0 +1,205 @@
use std::error;
use std::ffi::OsStr;
use std::fmt;
use std::fs::File;
use std::io::{self, BufRead};
use std::path::Path;
use std::str;
use escape::{escape, escape_os};
/// An error that occurs when a pattern could not be converted to valid UTF-8.
///
/// The purpose of this error is to give a more targeted failure mode for
/// patterns written by end users that are not valid UTF-8.
#[derive(Clone, Debug, Eq, PartialEq)]
pub struct InvalidPatternError {
original: String,
valid_up_to: usize,
}
impl InvalidPatternError {
/// Returns the index in the given string up to which valid UTF-8 was
/// verified.
pub fn valid_up_to(&self) -> usize {
self.valid_up_to
}
}
impl error::Error for InvalidPatternError {
fn description(&self) -> &str { "invalid pattern" }
}
impl fmt::Display for InvalidPatternError {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
write!(
f,
"found invalid UTF-8 in pattern at byte offset {} \
(use hex escape sequences to match arbitrary bytes \
in a pattern, e.g., \\xFF): '{}'",
self.valid_up_to,
self.original,
)
}
}
impl From<InvalidPatternError> for io::Error {
fn from(paterr: InvalidPatternError) -> io::Error {
io::Error::new(io::ErrorKind::Other, paterr)
}
}
/// Convert an OS string into a regular expression pattern.
///
/// This conversion fails if the given pattern is not valid UTF-8, in which
/// case, a targeted error with more information about where the invalid UTF-8
/// occurs is given. The error also suggests the use of hex escape sequences,
/// which are supported by many regex engines.
pub fn pattern_from_os(pattern: &OsStr) -> Result<&str, InvalidPatternError> {
pattern.to_str().ok_or_else(|| {
let valid_up_to = pattern
.to_string_lossy()
.find('\u{FFFD}')
.expect("a Unicode replacement codepoint for invalid UTF-8");
InvalidPatternError {
original: escape_os(pattern),
valid_up_to: valid_up_to,
}
})
}
/// Convert arbitrary bytes into a regular expression pattern.
///
/// This conversion fails if the given pattern is not valid UTF-8, in which
/// case, a targeted error with more information about where the invalid UTF-8
/// occurs is given. The error also suggests the use of hex escape sequences,
/// which are supported by many regex engines.
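///
/// # Example
///
/// A short sketch mirroring the tests below. The reported offset counts the
/// bytes up to the first invalid one:
///
/// ```
/// use grep_cli::pattern_from_bytes;
///
/// let err = pattern_from_bytes(b"abc\xFFxyz").unwrap_err();
/// assert_eq!(3, err.valid_up_to());
/// ```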
pub fn pattern_from_bytes(
pattern: &[u8],
) -> Result<&str, InvalidPatternError> {
str::from_utf8(pattern).map_err(|err| {
InvalidPatternError {
original: escape(pattern),
valid_up_to: err.valid_up_to(),
}
})
}
/// Read patterns from a file path, one per line.
///
/// If there was a problem reading or if any of the patterns contain invalid
/// UTF-8, then an error is returned. If there was a problem with a specific
/// pattern, then the error message will include the line number and the file
/// path.
pub fn patterns_from_path<P: AsRef<Path>>(path: P) -> io::Result<Vec<String>> {
let path = path.as_ref();
let file = File::open(path).map_err(|err| {
io::Error::new(
io::ErrorKind::Other,
format!("{}: {}", path.display(), err),
)
})?;
patterns_from_reader(file).map_err(|err| {
io::Error::new(
io::ErrorKind::Other,
format!("{}:{}", path.display(), err),
)
})
}
/// Read patterns from stdin, one per line.
///
/// If there was a problem reading or if any of the patterns contain invalid
/// UTF-8, then an error is returned. If there was a problem with a specific
/// pattern, then the error message will include the line number and the fact
/// that it came from stdin.
pub fn patterns_from_stdin() -> io::Result<Vec<String>> {
let stdin = io::stdin();
let locked = stdin.lock();
patterns_from_reader(locked).map_err(|err| {
io::Error::new(
io::ErrorKind::Other,
format!("<stdin>:{}", err),
)
})
}
/// Read patterns from any reader, one per line.
///
/// If there was a problem reading or if any of the patterns contain invalid
/// UTF-8, then an error is returned. If there was a problem with a specific
/// pattern, then the error message will include the line number.
///
/// Note that this routine uses its own internal buffer, so the caller should
/// avoid providing their own buffered reader where possible.
///
/// # Example
///
/// This shows how to parse patterns, one per line.
///
/// ```
/// use grep_cli::patterns_from_reader;
///
/// # fn example() -> Result<(), Box<::std::error::Error>> {
/// let patterns = "\
/// foo
/// bar\\s+foo
/// [a-z]{3}
/// ";
///
/// assert_eq!(patterns_from_reader(patterns.as_bytes())?, vec![
/// r"foo",
/// r"bar\s+foo",
/// r"[a-z]{3}",
/// ]);
/// # Ok(()) }
/// ```
pub fn patterns_from_reader<R: io::Read>(rdr: R) -> io::Result<Vec<String>> {
let mut patterns = vec![];
let mut bufrdr = io::BufReader::new(rdr);
let mut line = vec![];
let mut line_number = 0;
while {
line.clear();
line_number += 1;
bufrdr.read_until(b'\n', &mut line)? > 0
} {
line.pop().unwrap(); // remove trailing '\n'
if line.last() == Some(&b'\r') {
line.pop().unwrap();
}
match pattern_from_bytes(&line) {
Ok(pattern) => patterns.push(pattern.to_string()),
Err(err) => {
return Err(io::Error::new(
io::ErrorKind::Other,
format!("{}: {}", line_number, err),
));
}
}
}
Ok(patterns)
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn bytes() {
let pat = b"abc\xFFxyz";
let err = pattern_from_bytes(pat).unwrap_err();
assert_eq!(3, err.valid_up_to());
}
#[test]
#[cfg(unix)]
fn os() {
use std::os::unix::ffi::OsStrExt;
use std::ffi::OsStr;
let pat = OsStr::from_bytes(b"abc\xFFxyz");
let err = pattern_from_os(pat).unwrap_err();
assert_eq!(3, err.valid_up_to());
}
}

grep-cli/src/process.rs

@@ -0,0 +1,267 @@
use std::error;
use std::fmt;
use std::io::{self, Read};
use std::iter;
use std::process;
use std::thread::{self, JoinHandle};
/// An error that can occur while running a command and reading its output.
///
/// This error can be seamlessly converted to an `io::Error` via a `From`
/// implementation.
#[derive(Debug)]
pub struct CommandError {
kind: CommandErrorKind,
}
#[derive(Debug)]
enum CommandErrorKind {
Io(io::Error),
Stderr(Vec<u8>),
}
impl CommandError {
/// Create an error from an I/O error.
pub(crate) fn io(ioerr: io::Error) -> CommandError {
CommandError { kind: CommandErrorKind::Io(ioerr) }
}
/// Create an error from the contents of stderr (which may be empty).
pub(crate) fn stderr(bytes: Vec<u8>) -> CommandError {
CommandError { kind: CommandErrorKind::Stderr(bytes) }
}
}
impl error::Error for CommandError {
fn description(&self) -> &str { "command error" }
}
impl fmt::Display for CommandError {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
match self.kind {
CommandErrorKind::Io(ref e) => e.fmt(f),
CommandErrorKind::Stderr(ref bytes) => {
let msg = String::from_utf8_lossy(bytes);
if msg.trim().is_empty() {
write!(f, "<stderr is empty>")
} else {
let div = iter::repeat('-').take(79).collect::<String>();
write!(f, "\n{div}\n{msg}\n{div}", div=div, msg=msg.trim())
}
}
}
}
}
impl From<io::Error> for CommandError {
fn from(ioerr: io::Error) -> CommandError {
CommandError { kind: CommandErrorKind::Io(ioerr) }
}
}
impl From<CommandError> for io::Error {
fn from(cmderr: CommandError) -> io::Error {
match cmderr.kind {
CommandErrorKind::Io(ioerr) => ioerr,
CommandErrorKind::Stderr(_) => {
io::Error::new(io::ErrorKind::Other, cmderr)
}
}
}
}
/// Configures and builds a streaming reader for process output.
#[derive(Clone, Debug, Default)]
pub struct CommandReaderBuilder {
async_stderr: bool,
}
impl CommandReaderBuilder {
/// Create a new builder with the default configuration.
pub fn new() -> CommandReaderBuilder {
CommandReaderBuilder::default()
}
/// Build a new streaming reader for the given command's output.
///
/// The caller should set everything that's required on the given command
/// before building a reader, such as its arguments, environment and
/// current working directory. Settings such as the stdout and stderr (but
/// not stdin) pipes will be overridden so that they can be controlled by
/// the reader.
///
/// If there was a problem spawning the given command, then its error is
/// returned.
pub fn build(
&self,
command: &mut process::Command,
) -> Result<CommandReader, CommandError> {
let mut child = command
.stdout(process::Stdio::piped())
.stderr(process::Stdio::piped())
.spawn()?;
let stdout = child.stdout.take().unwrap();
let stderr =
if self.async_stderr {
StderrReader::async(child.stderr.take().unwrap())
} else {
StderrReader::sync(child.stderr.take().unwrap())
};
Ok(CommandReader {
child: child,
stdout: stdout,
stderr: stderr,
done: false,
})
}
/// When enabled, the reader will asynchronously read the contents of the
/// command's stderr output. When disabled, stderr is only read after the
/// stdout stream has been exhausted (or if the process quits with an error
/// code).
///
/// Note that when enabled, this may require launching an additional
/// thread in order to read stderr. This is done so that the process being
/// executed is never blocked from writing to stdout or stderr. If this is
/// disabled, then it is possible for the process to fill up the stderr
/// buffer and deadlock.
///
/// This is enabled by default.
pub fn async_stderr(&mut self, yes: bool) -> &mut CommandReaderBuilder {
self.async_stderr = yes;
self
}
}
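// A builder sketch (not part of the original source): opt out of the
// asynchronous stderr handling described above, accepting the deadlock
// caveat for processes known to write little to stderr.
//
//     let mut cmd = process::Command::new("gzip");
//     cmd.arg("-d").arg("-c").arg("some-file.gz");
//     let rdr = CommandReaderBuilder::new()
//         .async_stderr(false)
//         .build(&mut cmd)?;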
/// A streaming reader for a command's output.
///
/// The purpose of this reader is to provide an easy way to execute processes
/// whose stdout is read in a streaming way while also making the process's
/// stderr available when the process fails with an exit code. This makes it
/// possible to execute processes while surfacing the underlying failure mode
/// in the case of an error.
///
/// Moreover, by default, this reader will asynchronously read the process's
/// stderr. This prevents subtle deadlocking bugs for noisy processes that
/// write a lot to stderr. Currently, the entire contents of stderr are read
/// onto the heap.
///
/// # Example
///
/// This example shows how to invoke `gzip` to decompress the contents of a
/// file. If the `gzip` command reports a failing exit status, then its stderr
/// is returned as an error.
///
/// ```no_run
/// use std::io::Read;
/// use std::process::Command;
/// use grep_cli::CommandReader;
///
/// # fn example() -> Result<(), Box<::std::error::Error>> {
/// let mut cmd = Command::new("gzip");
/// cmd.arg("-d").arg("-c").arg("/usr/share/man/man1/ls.1.gz");
///
/// let mut rdr = CommandReader::new(&mut cmd)?;
/// let mut contents = vec![];
/// rdr.read_to_end(&mut contents)?;
/// # Ok(()) }
/// ```
#[derive(Debug)]
pub struct CommandReader {
child: process::Child,
stdout: process::ChildStdout,
stderr: StderrReader,
done: bool,
}
impl CommandReader {
/// Create a new streaming reader for the given command using the default
/// configuration.
///
/// The caller should set everything that's required on the given command
/// before building a reader, such as its arguments, environment and
/// current working directory. Settings such as the stdout and stderr (but
/// not stdin) pipes will be overridden so that they can be controlled by
/// the reader.
///
/// If there was a problem spawning the given command, then its error is
/// returned.
///
/// If the caller requires additional configuration for the reader
/// returned, then use
/// [`CommandReaderBuilder`](struct.CommandReaderBuilder.html).
pub fn new(
cmd: &mut process::Command,
) -> Result<CommandReader, CommandError> {
CommandReaderBuilder::new().build(cmd)
}
}
impl io::Read for CommandReader {
fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
if self.done {
return Ok(0);
}
let nread = self.stdout.read(buf)?;
if nread == 0 {
self.done = true;
// Reap the child now that we're done reading. If the command
// failed, report stderr as an error.
if !self.child.wait()?.success() {
return Err(io::Error::from(self.stderr.read_to_end()));
}
}
Ok(nread)
}
}
/// A reader that encapsulates the asynchronous or synchronous reading of
/// stderr.
#[derive(Debug)]
enum StderrReader {
Async(Option<JoinHandle<CommandError>>),
Sync(process::ChildStderr),
}
impl StderrReader {
/// Create a reader for stderr that reads contents asynchronously.
fn async(mut stderr: process::ChildStderr) -> StderrReader {
let handle = thread::spawn(move || {
stderr_to_command_error(&mut stderr)
});
StderrReader::Async(Some(handle))
}
/// Create a reader for stderr that reads contents synchronously.
fn sync(stderr: process::ChildStderr) -> StderrReader {
StderrReader::Sync(stderr)
}
/// Consumes all of stderr on to the heap and returns it as an error.
///
/// If there was a problem reading stderr itself, then this returns an I/O
/// command error.
fn read_to_end(&mut self) -> CommandError {
match *self {
StderrReader::Async(ref mut handle) => {
let handle = handle
.take()
.expect("read_to_end cannot be called more than once");
handle
.join()
.expect("stderr reading thread does not panic")
}
StderrReader::Sync(ref mut stderr) => {
stderr_to_command_error(stderr)
}
}
}
}
fn stderr_to_command_error(stderr: &mut process::ChildStderr) -> CommandError {
let mut bytes = vec![];
match stderr.read_to_end(&mut bytes) {
Ok(_) => CommandError::stderr(bytes),
Err(err) => CommandError::io(err),
}
}

grep-cli/src/wtr.rs

@@ -0,0 +1,133 @@
use std::io;
use termcolor;
use is_tty_stdout;
/// A writer that supports coloring with either line or block buffering.
pub struct StandardStream(StandardStreamKind);
/// Returns a possibly buffered writer to stdout for the given color choice.
///
/// The writer returned is either line buffered or block buffered. The decision
/// between these two is made automatically based on whether a tty is attached
/// to stdout or not. If a tty is attached, then line buffering is used.
/// Otherwise, block buffering is used. In general, block buffering is more
/// efficient, but may increase the time it takes for the end user to see the
/// first bits of output.
///
/// If you need more fine grained control over the buffering mode, then use one
/// of `stdout_buffered_line` or `stdout_buffered_block`.
///
/// The color choice given is passed along to the underlying writer. To
/// completely disable colors in all cases, use `ColorChoice::Never`.
pub fn stdout(color_choice: termcolor::ColorChoice) -> StandardStream {
if is_tty_stdout() {
stdout_buffered_line(color_choice)
} else {
stdout_buffered_block(color_choice)
}
}
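// A writing sketch (not part of the original source): the returned stream is
// an ordinary `io::Write` with color support layered on top.
//
//     use std::io::Write;
//     let mut wtr = stdout(termcolor::ColorChoice::Auto);
//     wtr.write_all(b"hello\n")?;
//     wtr.flush()?;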
/// Returns a line buffered writer to stdout for the given color choice.
///
/// This writer is useful when printing results directly to a tty such that
/// users see output as soon as it's written. The downside of this approach
/// is that it can be slower, especially when there is a lot of output.
///
/// You might consider using
/// [`stdout`](fn.stdout.html)
/// instead, which chooses the buffering strategy automatically based on
/// whether stdout is connected to a tty.
pub fn stdout_buffered_line(
color_choice: termcolor::ColorChoice,
) -> StandardStream {
let out = termcolor::StandardStream::stdout(color_choice);
StandardStream(StandardStreamKind::LineBuffered(out))
}
/// Returns a block buffered writer to stdout for the given color choice.
///
/// This writer is useful when printing results to a file since it amortizes
/// the cost of writing data. The downside of this approach is that it can
/// increase the latency of display output when writing to a tty.
///
/// You might consider using
/// [`stdout`](fn.stdout.html)
/// instead, which chooses the buffering strategy automatically based on
/// whether stdout is connected to a tty.
pub fn stdout_buffered_block(
color_choice: termcolor::ColorChoice,
) -> StandardStream {
let out = termcolor::BufferedStandardStream::stdout(color_choice);
StandardStream(StandardStreamKind::BlockBuffered(out))
}
enum StandardStreamKind {
LineBuffered(termcolor::StandardStream),
BlockBuffered(termcolor::BufferedStandardStream),
}
impl io::Write for StandardStream {
#[inline]
fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
use self::StandardStreamKind::*;
match self.0 {
LineBuffered(ref mut w) => w.write(buf),
BlockBuffered(ref mut w) => w.write(buf),
}
}
#[inline]
fn flush(&mut self) -> io::Result<()> {
use self::StandardStreamKind::*;
match self.0 {
LineBuffered(ref mut w) => w.flush(),
BlockBuffered(ref mut w) => w.flush(),
}
}
}
impl termcolor::WriteColor for StandardStream {
#[inline]
fn supports_color(&self) -> bool {
use self::StandardStreamKind::*;
match self.0 {
LineBuffered(ref w) => w.supports_color(),
BlockBuffered(ref w) => w.supports_color(),
}
}
#[inline]
fn set_color(&mut self, spec: &termcolor::ColorSpec) -> io::Result<()> {
use self::StandardStreamKind::*;
match self.0 {
LineBuffered(ref mut w) => w.set_color(spec),
BlockBuffered(ref mut w) => w.set_color(spec),
}
}
#[inline]
fn reset(&mut self) -> io::Result<()> {
use self::StandardStreamKind::*;
match self.0 {
LineBuffered(ref mut w) => w.reset(),
BlockBuffered(ref mut w) => w.reset(),
}
}
#[inline]
fn is_synchronous(&self) -> bool {
use self::StandardStreamKind::*;
match self.0 {
LineBuffered(ref w) => w.is_synchronous(),
BlockBuffered(ref w) => w.is_synchronous(),
}
}
}


@@ -1,6 +1,6 @@
[package]
name = "grep-matcher"
version = "0.1.0" #:version
version = "0.1.1" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
A trait for regular expressions, with a focus on line oriented search.
@@ -13,11 +13,14 @@ keywords = ["regex", "pattern", "trait"]
license = "Unlicense/MIT"
autotests = false
[dependencies]
memchr = "2"
[dependencies.bstr]
version = "*"
path = "/home/andrew/rust/bstr"
default-features = false
features = ["std"]
[dev-dependencies]
regex = "1"
regex = "1.1"
[[test]]
name = "integration"


@@ -1,6 +1,6 @@
use std::str;
use memchr::memchr;
use bstr::B;
/// Interpolate capture references in `replacement` and write the interpolation
/// result to `dst`. References in `replacement` take the form of $N or $name,
@@ -22,7 +22,7 @@ pub fn interpolate<A, N>(
N: FnMut(&str) -> Option<usize>
{
while !replacement.is_empty() {
match memchr(b'$', replacement) {
match B(replacement).find_byte(b'$') {
None => break,
Some(i) => {
dst.extend(&replacement[..i]);


@@ -38,13 +38,15 @@ implementations.
#![deny(missing_docs)]
extern crate memchr;
extern crate bstr;
use std::fmt;
use std::io;
use std::ops;
use std::u64;
use bstr::BStr;
use interpolate::interpolate;
mod interpolate;
@@ -180,6 +182,22 @@ impl ops::IndexMut<Match> for [u8] {
}
}
impl ops::Index<Match> for BStr {
type Output = BStr;
#[inline]
fn index(&self, index: Match) -> &BStr {
&self[index.start..index.end]
}
}
impl ops::IndexMut<Match> for BStr {
#[inline]
fn index_mut(&mut self, index: Match) -> &mut BStr {
&mut self[index.start..index.end]
}
}
impl ops::Index<Match> for str {
type Output = str;
@@ -266,6 +284,16 @@ impl LineTerminator {
LineTerminatorImp::CRLF => &[b'\r', b'\n'],
}
}
/// Returns true if and only if the given slice ends with this line
/// terminator.
///
/// If this line terminator is `CRLF`, then this only checks whether the
/// last byte is `\n`.
#[inline]
pub fn is_suffix(&self, slice: &[u8]) -> bool {
slice.last().map_or(false, |&b| b == self.as_byte())
}
}
impl Default for LineTerminator {


@@ -1,6 +1,6 @@
[package]
name = "grep-pcre2"
version = "0.1.0" #:version
version = "0.1.2" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
Use PCRE2 with the 'grep' crate.
@@ -13,5 +13,5 @@ keywords = ["regex", "grep", "pcre", "backreference", "look"]
license = "Unlicense/MIT"
[dependencies]
grep-matcher = { version = "0.1.0", path = "../grep-matcher" }
pcre2 = "0.1"
grep-matcher = { version = "0.1.1", path = "../grep-matcher" }
pcre2 = "0.1.1"


@@ -199,16 +199,34 @@ impl RegexMatcherBuilder {
self
}
/// Enable PCRE2's JIT.
/// Enable PCRE2's JIT and return an error if it's not available.
///
/// This generally speeds up matching quite a bit. The downside is that it
/// can increase the time it takes to compile a pattern.
///
/// This is disabled by default.
/// If the JIT isn't available or if JIT compilation returns an error, then
/// regex compilation will fail with the corresponding error.
///
/// This is disabled by default, and always overrides `jit_if_available`.
pub fn jit(&mut self, yes: bool) -> &mut RegexMatcherBuilder {
self.builder.jit(yes);
self
}
/// Enable PCRE2's JIT if it's available.
///
/// This generally speeds up matching quite a bit. The downside is that it
/// can increase the time it takes to compile a pattern.
///
/// If the JIT isn't available or if JIT compilation returns an error,
/// then a debug message with the error will be emitted and the regex will
/// otherwise silently fall back to non-JIT matching.
///
/// This is disabled by default, and always overrides `jit`.
pub fn jit_if_available(&mut self, yes: bool) -> &mut RegexMatcherBuilder {
self.builder.jit_if_available(yes);
self
}
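// A configuration sketch (not part of this diff, and it assumes the
// builder's usual `build(pattern)` constructor): prefer the JIT when
// present and silently fall back to non-JIT matching otherwise.
//
//     let matcher = RegexMatcherBuilder::new()
//         .jit_if_available(true)
//         .build(r"\w+")?;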
}
/// An implementation of the `Matcher` trait using PCRE2.


@@ -1,6 +1,6 @@
[package]
name = "grep-printer"
version = "0.1.0" #:version
version = "0.1.1" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
An implementation of the grep crate's Sink trait that provides standard
@@ -18,13 +18,13 @@ default = ["serde1"]
serde1 = ["base64", "serde", "serde_derive", "serde_json"]
[dependencies]
base64 = { version = "0.9", optional = true }
grep-matcher = { version = "0.1.0", path = "../grep-matcher" }
grep-searcher = { version = "0.1.0", path = "../grep-searcher" }
termcolor = "1"
serde = { version = "1", optional = true }
serde_derive = { version = "1", optional = true }
serde_json = { version = "1", optional = true }
base64 = { version = "0.10.0", optional = true }
grep-matcher = { version = "0.1.1", path = "../grep-matcher" }
grep-searcher = { version = "0.1.1", path = "../grep-searcher" }
termcolor = "1.0.4"
serde = { version = "1.0.77", optional = true }
serde_derive = { version = "1.0.77", optional = true }
serde_json = { version = "1.0.27", optional = true }
[dev-dependencies]
grep-regex = { version = "0.1.0", path = "../grep-regex" }
grep-regex = { version = "0.1.1", path = "../grep-regex" }


@@ -4,6 +4,25 @@ use std::str::FromStr;
use termcolor::{Color, ColorSpec, ParseColorError};
/// Returns a default set of color specifications.
///
/// This may change over time, but the color choices are meant to be fairly
/// conservative and to work across terminal themes.
///
/// Additional color specifications can be added to the list returned. More
/// recently added specifications override previously added specifications.
pub fn default_color_specs() -> Vec<UserColorSpec> {
vec![
#[cfg(unix)]
"path:fg:magenta".parse().unwrap(),
#[cfg(windows)]
"path:fg:cyan".parse().unwrap(),
"line:fg:green".parse().unwrap(),
"match:fg:red".parse().unwrap(),
"match:style:bold".parse().unwrap(),
]
}
/// An error that can occur when parsing color specifications.
#[derive(Clone, Debug, Eq, PartialEq)]
pub enum ColorError {
@@ -227,6 +246,15 @@ impl ColorSpecs {
merged
}
/// Create a default set of specifications that have color.
///
/// This is distinct from `ColorSpecs`'s `Default` implementation in that
/// this provides a set of default color choices, whereas the `Default`
/// implementation provides no color choices.
pub fn default_with_color() -> ColorSpecs {
ColorSpecs::new(&default_color_specs())
}
/// Return the color specification for coloring file paths.
pub fn path(&self) -> &ColorSpec {
&self.path


@@ -114,39 +114,6 @@ impl<'a> Data<'a> {
// so we do the easy thing for now.
Data::Text { text: path.to_string_lossy() }
}
// Unused deserialization routines.
/*
fn into_bytes(self) -> Vec<u8> {
match self {
Data::Text { text } => text.into_bytes(),
Data::Bytes { bytes } => bytes,
}
}
#[cfg(unix)]
fn into_path_buf(&self) -> PathBuf {
use std::os::unix::ffi::OsStrExt;
match self {
Data::Text { text } => PathBuf::from(text),
Data::Bytes { bytes } => {
PathBuf::from(OsStr::from_bytes(bytes))
}
}
}
#[cfg(not(unix))]
fn into_path_buf(&self) -> PathBuf {
match self {
Data::Text { text } => PathBuf::from(text),
Data::Bytes { bytes } => {
PathBuf::from(String::from_utf8_lossy(&bytes).into_owned())
}
}
}
*/
}
fn to_base64<T, S>(
@@ -178,36 +145,3 @@ where P: AsRef<Path>,
{
path.as_ref().map(|p| Data::from_path(p.as_ref())).serialize(ser)
}
// The following are some deserialization helpers, in case we decide to support
// deserialization of the above types.
/*
fn from_base64<'de, D>(
de: D,
) -> Result<Vec<u8>, D::Error>
where D: Deserializer<'de>
{
let encoded = String::deserialize(de)?;
let decoded = base64::decode(encoded.as_bytes())
.map_err(D::Error::custom)?;
Ok(decoded)
}
fn deser_bytes<'de, D>(
de: D,
) -> Result<Vec<u8>, D::Error>
where D: Deserializer<'de>
{
Data::deserialize(de).map(|datum| datum.into_bytes())
}
fn deser_path<'de, D>(
de: D,
) -> Result<Option<PathBuf>, D::Error>
where D: Deserializer<'de>
{
Option::<Data>::deserialize(de)
.map(|opt| opt.map(|datum| datum.into_path_buf()))
}
*/


@@ -83,7 +83,7 @@ extern crate serde_derive;
extern crate serde_json;
extern crate termcolor;
pub use color::{ColorError, ColorSpecs, UserColorSpec};
pub use color::{ColorError, ColorSpecs, UserColorSpec, default_color_specs};
#[cfg(feature = "serde1")]
pub use json::{JSON, JSONBuilder, JSONSink};
pub use standard::{Standard, StandardBuilder, StandardSink};


@@ -1,3 +1,4 @@
/// Like assert_eq, but nicer output for long strings.
#[cfg(test)]
#[macro_export]
macro_rules! assert_eq_printed {


@@ -239,8 +239,9 @@ impl StandardBuilder {
/// which may either be in index form (e.g., `$2`) or can reference named
/// capturing groups if present in the original pattern (e.g., `$foo`).
///
/// For documentation on the full format, please see the `Matcher` trait's
/// `interpolate` method.
/// For documentation on the full format, please see the `Captures` trait's
/// `interpolate` method in the
/// [grep-matcher](https://docs.rs/grep-matcher) crate.
pub fn replacement(
&mut self,
replacement: Option<Vec<u8>>,
@@ -1201,6 +1202,9 @@ impl<'a, M: Matcher, W: WriteColor> StandardImpl<'a, M, W> {
if !self.wtr().borrow().supports_color() || spec.is_none() {
return self.write_line(line);
}
if self.exceeds_max_columns(line) {
return self.write_exceeded_line();
}
let mut last_written =
if !self.config().trim_ascii {
@@ -1393,7 +1397,7 @@ impl<'a, M: Matcher, W: WriteColor> StandardImpl<'a, M, W> {
}
fn has_line_terminator(&self, buf: &[u8]) -> bool {
buf.last() == Some(&self.searcher.line_terminator().as_byte())
self.searcher.line_terminator().is_suffix(buf)
}
fn is_context(&self) -> bool {

View File

@@ -1,6 +1,6 @@
[package]
name = "grep-regex"
version = "0.1.0" #:version
version = "0.1.1" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
Use Rust's regex library with the 'grep' crate.
@@ -13,9 +13,9 @@ keywords = ["regex", "grep", "search", "pattern", "line"]
license = "Unlicense/MIT"
[dependencies]
log = "0.4"
grep-matcher = { version = "0.1.0", path = "../grep-matcher" }
regex = "1"
regex-syntax = "0.6"
log = "0.4.5"
grep-matcher = { version = "0.1.1", path = "../grep-matcher" }
regex = "1.1"
regex-syntax = "0.6.4"
thread_local = "0.3.6"
utf8-ranges = "1"
utf8-ranges = "1.0.1"

View File

@@ -166,10 +166,10 @@ fn union_required(expr: &Hir, lits: &mut Literals) {
lits.cut();
continue;
}
if lits2.contains_empty() {
if lits2.contains_empty() || !is_simple(&e) {
lits.cut();
}
if !lits.cross_product(&lits2) {
if !lits.cross_product(&lits2) || !lits2.any_complete() {
// If this expression couldn't yield any literal that
// could be extended, then we need to quit. Since we're
// short-circuiting, we also need to freeze every member.
@@ -250,6 +250,20 @@ fn alternate_literals<F: FnMut(&Hir, &mut Literals)>(
}
}
fn is_simple(expr: &Hir) -> bool {
match *expr.kind() {
HirKind::Empty
| HirKind::Literal(_)
| HirKind::Class(_)
| HirKind::Repetition(_)
| HirKind::Concat(_)
| HirKind::Alternation(_) => true,
HirKind::Anchor(_)
| HirKind::WordBoundary(_)
| HirKind::Group(_) => false,
}
}
/// Return the number of characters in the given class.
fn count_unicode_class(cls: &hir::ClassUnicode) -> u32 {
cls.iter().map(|r| 1 + (r.end() as u32 - r.start() as u32)).sum()
@@ -301,4 +315,12 @@ mod tests {
// assert_eq!(one_regex(r"\w(foo|bar|baz)"), pat("foo|bar|baz"));
// assert_eq!(one_regex(r"\w(foo|bar|baz)\w"), pat("foo|bar|baz"));
}
#[test]
fn regression_1064() {
// Regression from:
// https://github.com/BurntSushi/ripgrep/issues/1064
// assert_eq!(one_regex(r"a.*c"), pat("a"));
assert_eq!(one_regex(r"a(.*c)"), pat("a"));
}
}

View File

@@ -323,8 +323,15 @@ impl RegexMatcher {
/// Create a new matcher from the given pattern using the default
/// configuration, but matches lines terminated by `\n`.
///
/// This returns an error if the given pattern contains a literal `\n`.
/// Other uses of `\n` (such as in `\s`) are removed transparently.
/// This is meant to be a convenience constructor for using a
/// `RegexMatcherBuilder` and setting its
/// [`line_terminator`](struct.RegexMatcherBuilder.html#method.line_terminator)
/// to `\n`. The purpose of using this constructor is to permit special
/// optimizations that help speed up line oriented search. These types of
/// optimizations are only appropriate when matches span no more than one
/// line. For this reason, this constructor will return an error if the
/// given pattern contains a literal `\n`. Other uses of `\n` (such as in
/// `\s`) are removed transparently.
pub fn new_line_matcher(pattern: &str) -> Result<RegexMatcher, Error> {
RegexMatcherBuilder::new()
.line_terminator(Some(b'\n'))

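A short sketch of the constraint documented above, assuming the grep-regex API as published:

use grep_regex::RegexMatcher;

fn main() {
    // Fine: this pattern cannot match across a line boundary, so the
    // line-oriented optimizations apply.
    assert!(RegexMatcher::new_line_matcher(r"fo+").is_ok());
    // Error: a literal \n in the pattern could produce multi-line matches.
    assert!(RegexMatcher::new_line_matcher("foo\nbar").is_err());
}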
View File

@@ -1,6 +1,6 @@
[package]
name = "grep-searcher"
version = "0.1.0" #:version
version = "0.1.1" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
Fast line oriented regex searching as a library.
@@ -13,23 +13,26 @@ keywords = ["regex", "grep", "egrep", "search", "pattern"]
license = "Unlicense/MIT"
[dependencies]
bytecount = "0.3.1"
encoding_rs = "0.8"
encoding_rs_io = "0.1.2"
grep-matcher = { version = "0.1.0", path = "../grep-matcher" }
log = "0.4"
memchr = "2"
memmap = "0.6"
bytecount = "0.5"
encoding_rs = "0.8.14"
encoding_rs_io = "0.1.3"
grep-matcher = { version = "0.1.1", path = "../grep-matcher" }
log = "0.4.5"
memmap = "0.7"
[dependencies.bstr]
version = "*"
path = "/home/andrew/rust/bstr"
default-features = false
features = ["std"]
[dev-dependencies]
grep-regex = { version = "0.1.0", path = "../grep-regex" }
regex = "1"
grep-regex = { version = "0.1.1", path = "../grep-regex" }
regex = "1.1"
[features]
avx-accel = [
"bytecount/avx-accel",
]
simd-accel = [
"bytecount/simd-accel",
"encoding_rs/simd-accel",
]
default = ["bytecount/runtime-dispatch-simd"]
simd-accel = ["encoding_rs/simd-accel"]
# This feature is DEPRECATED. Runtime dispatch is used for SIMD now.
avx-accel = []

View File

@@ -74,14 +74,11 @@ fn example() -> Result<(), Box<Error>> {
let mut matches: Vec<(u64, String)> = vec![];
Searcher::new().search_slice(&matcher, SHERLOCK, UTF8(|lnum, line| {
// We are guaranteed to find a match, so the unwrap is OK.
eprintln!("LINE: {:?}", line);
let mymatch = matcher.find(line.as_bytes())?.unwrap();
matches.push((lnum, line[mymatch].to_string()));
Ok(true)
}))?;
eprintln!("MATCHES: {:?}", matches);
assert_eq!(matches.len(), 2);
assert_eq!(
matches[0],
@@ -102,13 +99,13 @@ searches stdin.
#![deny(missing_docs)]
extern crate bstr;
extern crate bytecount;
extern crate encoding_rs;
extern crate encoding_rs_io;
extern crate grep_matcher;
#[macro_use]
extern crate log;
extern crate memchr;
extern crate memmap;
#[cfg(test)]
extern crate regex;

View File

@@ -1,8 +1,7 @@
use std::cmp;
use std::io;
use std::ptr;
use memchr::{memchr, memrchr};
use bstr::{BStr, BString};
/// The default buffer capacity that we use for the line buffer.
pub(crate) const DEFAULT_BUFFER_CAPACITY: usize = 8 * (1<<10); // 8 KB
@@ -123,7 +122,7 @@ impl LineBufferBuilder {
pub fn build(&self) -> LineBuffer {
LineBuffer {
config: self.config,
buf: vec![0; self.config.capacity],
buf: BString::from(vec![0; self.config.capacity]),
pos: 0,
last_lineterm: 0,
end: 0,
@@ -254,7 +253,7 @@ impl<'b, R: io::Read> LineBufferReader<'b, R> {
}
/// Return the contents of this buffer.
pub fn buffer(&self) -> &[u8] {
pub fn buffer(&self) -> &BStr {
self.line_buffer.buffer()
}
@@ -284,7 +283,7 @@ pub struct LineBuffer {
/// The configuration of this buffer.
config: Config,
/// The primary buffer with which to hold data.
buf: Vec<u8>,
buf: BString,
/// The current position of this buffer. This is always a valid sliceable
/// index into `buf`, and its maximum value is the length of `buf`.
pos: usize,
@@ -339,13 +338,13 @@ impl LineBuffer {
}
/// Return the contents of this buffer.
fn buffer(&self) -> &[u8] {
fn buffer(&self) -> &BStr {
&self.buf[self.pos..self.last_lineterm]
}
/// Return the contents of the free space beyond the end of the buffer as
/// a mutable slice.
fn free_buffer(&mut self) -> &mut [u8] {
fn free_buffer(&mut self) -> &mut BStr {
&mut self.buf[self.end..]
}
@@ -396,7 +395,7 @@ impl LineBuffer {
assert_eq!(self.pos, 0);
loop {
self.ensure_capacity()?;
let readlen = rdr.read(self.free_buffer())?;
let readlen = rdr.read(self.free_buffer().as_bytes_mut())?;
if readlen == 0 {
// We're only done reading for good once the caller has
// consumed everything.
@@ -416,7 +415,7 @@ impl LineBuffer {
match self.config.binary {
BinaryDetection::None => {} // nothing to do
BinaryDetection::Quit(byte) => {
if let Some(i) = memchr(byte, newbytes) {
if let Some(i) = newbytes.find_byte(byte) {
self.end = oldend + i;
self.last_lineterm = self.end;
self.binary_byte_offset =
@@ -444,7 +443,7 @@ impl LineBuffer {
}
// Update our `last_lineterm` positions if we read one.
if let Some(i) = memrchr(self.config.lineterm, newbytes) {
if let Some(i) = newbytes.rfind_byte(self.config.lineterm) {
self.last_lineterm = oldend + i + 1;
return Ok(true);
}
@@ -467,40 +466,8 @@ impl LineBuffer {
return;
}
assert!(self.pos < self.end && self.end <= self.buf.len());
let roll_len = self.end - self.pos;
unsafe {
// SAFETY: A buffer contains Copy data, so there's no problem
// moving it around. Safety also depends on our indices being
// in bounds, which they should always be, and we enforce with
// an assert above.
//
// It seems like it should be possible to do this in safe code that
// results in the same codegen. I tried the obvious:
//
// for (src, dst) in (self.pos..self.end).zip(0..) {
// self.buf[dst] = self.buf[src];
// }
//
// But the above does not work, and in fact compiles down to a slow
// byte-by-byte loop. I tried a few other minor variations, but
// alas, better minds might prevail.
//
// Overall, this doesn't save us *too* much. It mostly matters when
// the number of bytes we're copying is large, which can happen
// if the searcher is asked to produce a lot of context. We could
// decide this isn't worth it, but it does make an appreciable
// impact at or around the context=30 range on my machine.
//
// We could also use a temporary buffer that compiles down to two
// memcpys and is faster than the byte-at-a-time loop, but it
// complicates our options for limiting memory allocation a bit.
ptr::copy(
self.buf[self.pos..].as_ptr(),
self.buf.as_mut_ptr(),
roll_len,
);
}
self.buf.copy_within(self.pos.., 0);
self.pos = 0;
self.last_lineterm = roll_len;
self.end = roll_len;
@@ -536,14 +503,15 @@ impl LineBuffer {
}
}
/// Replaces `src` with `replacement` in bytes.
fn replace_bytes(bytes: &mut [u8], src: u8, replacement: u8) -> Option<usize> {
/// Replaces `src` with `replacement` in bytes, and returns the offset of the
/// first replacement, if one exists.
fn replace_bytes(bytes: &mut BStr, src: u8, replacement: u8) -> Option<usize> {
if src == replacement {
return None;
}
let mut first_pos = None;
let mut pos = 0;
while let Some(i) = memchr(src, &bytes[pos..]).map(|i| pos + i) {
while let Some(i) = bytes[pos..].find_byte(src).map(|i| pos + i) {
if first_pos.is_none() {
first_pos = Some(i);
}
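Since `replace_bytes` is crate-internal, its contract is easiest to show as a test in this module; a hedged sketch (the byte values are arbitrary):

#[test]
fn replace_bytes_reports_first_offset() {
    use bstr::BString;

    let mut bytes = BString::from("a\x00b\x00c");
    // Every occurrence of the NUL byte is replaced; the returned offset
    // is that of the first replacement.
    assert_eq!(replace_bytes(&mut bytes, b'\x00', b'\n'), Some(1));
    assert_eq!(bytes, "a\nb\nc");
}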
@@ -560,6 +528,7 @@ fn replace_bytes(bytes: &mut [u8], src: u8, replacement: u8) -> Option<usize> {
#[cfg(test)]
mod tests {
use std::str;
use bstr::BString;
use super::*;
const SHERLOCK: &'static str = "\
@@ -575,18 +544,14 @@ and exhibited clearly, with a label attached.\
slice.to_string()
}
fn btos(slice: &[u8]) -> &str {
str::from_utf8(slice).unwrap()
}
fn replace_str(
slice: &str,
src: u8,
replacement: u8,
) -> (String, Option<usize>) {
let mut dst = slice.to_string().into_bytes();
let mut dst = BString::from(slice);
let result = replace_bytes(&mut dst, src, replacement);
(String::from_utf8(dst).unwrap(), result)
(dst.into_string().unwrap(), result)
}
#[test]
@@ -607,7 +572,7 @@ and exhibited clearly, with a label attached.\
assert!(rdr.buffer().is_empty());
assert!(rdr.fill().unwrap());
assert_eq!(btos(rdr.buffer()), "homer\nlisa\n");
assert_eq!(rdr.buffer(), "homer\nlisa\n");
assert_eq!(rdr.absolute_byte_offset(), 0);
rdr.consume(5);
assert_eq!(rdr.absolute_byte_offset(), 5);
@@ -615,7 +580,7 @@ and exhibited clearly, with a label attached.\
assert_eq!(rdr.absolute_byte_offset(), 11);
assert!(rdr.fill().unwrap());
assert_eq!(btos(rdr.buffer()), "maggie");
assert_eq!(rdr.buffer(), "maggie");
rdr.consume_all();
assert!(!rdr.fill().unwrap());
@@ -630,7 +595,7 @@ and exhibited clearly, with a label attached.\
let mut rdr = LineBufferReader::new(bytes.as_bytes(), &mut linebuf);
assert!(rdr.fill().unwrap());
assert_eq!(btos(rdr.buffer()), "homer\nlisa\nmaggie\n");
assert_eq!(rdr.buffer(), "homer\nlisa\nmaggie\n");
rdr.consume_all();
assert!(!rdr.fill().unwrap());
@@ -645,7 +610,7 @@ and exhibited clearly, with a label attached.\
let mut rdr = LineBufferReader::new(bytes.as_bytes(), &mut linebuf);
assert!(rdr.fill().unwrap());
assert_eq!(btos(rdr.buffer()), "\n");
assert_eq!(rdr.buffer(), "\n");
rdr.consume_all();
assert!(!rdr.fill().unwrap());
@@ -660,7 +625,7 @@ and exhibited clearly, with a label attached.\
let mut rdr = LineBufferReader::new(bytes.as_bytes(), &mut linebuf);
assert!(rdr.fill().unwrap());
assert_eq!(btos(rdr.buffer()), "\n\n");
assert_eq!(rdr.buffer(), "\n\n");
rdr.consume_all();
assert!(!rdr.fill().unwrap());
@@ -698,12 +663,12 @@ and exhibited clearly, with a label attached.\
let mut linebuf = LineBufferBuilder::new().capacity(1).build();
let mut rdr = LineBufferReader::new(bytes.as_bytes(), &mut linebuf);
let mut got = vec![];
let mut got = BString::new();
while rdr.fill().unwrap() {
got.extend(rdr.buffer());
got.push(rdr.buffer());
rdr.consume_all();
}
assert_eq!(bytes, btos(&got));
assert_eq!(bytes, got);
assert_eq!(rdr.absolute_byte_offset(), bytes.len() as u64);
assert_eq!(rdr.binary_byte_offset(), None);
}
@@ -718,11 +683,11 @@ and exhibited clearly, with a label attached.\
let mut rdr = LineBufferReader::new(bytes.as_bytes(), &mut linebuf);
assert!(rdr.fill().unwrap());
assert_eq!(btos(rdr.buffer()), "homer\n");
assert_eq!(rdr.buffer(), "homer\n");
rdr.consume_all();
assert!(rdr.fill().unwrap());
assert_eq!(btos(rdr.buffer()), "lisa\n");
assert_eq!(rdr.buffer(), "lisa\n");
rdr.consume_all();
// This returns an error because while we have just enough room to
@@ -732,11 +697,11 @@ and exhibited clearly, with a label attached.\
assert!(rdr.fill().is_err());
// We can mush on though!
assert_eq!(btos(rdr.buffer()), "m");
assert_eq!(rdr.buffer(), "m");
rdr.consume_all();
assert!(rdr.fill().unwrap());
assert_eq!(btos(rdr.buffer()), "aggie");
assert_eq!(rdr.buffer(), "aggie");
rdr.consume_all();
assert!(!rdr.fill().unwrap());
@@ -752,16 +717,16 @@ and exhibited clearly, with a label attached.\
let mut rdr = LineBufferReader::new(bytes.as_bytes(), &mut linebuf);
assert!(rdr.fill().unwrap());
assert_eq!(btos(rdr.buffer()), "homer\n");
assert_eq!(rdr.buffer(), "homer\n");
rdr.consume_all();
assert!(rdr.fill().unwrap());
assert_eq!(btos(rdr.buffer()), "lisa\n");
assert_eq!(rdr.buffer(), "lisa\n");
rdr.consume_all();
// We have just enough space.
assert!(rdr.fill().unwrap());
assert_eq!(btos(rdr.buffer()), "maggie");
assert_eq!(rdr.buffer(), "maggie");
rdr.consume_all();
assert!(!rdr.fill().unwrap());
@@ -777,7 +742,7 @@ and exhibited clearly, with a label attached.\
let mut rdr = LineBufferReader::new(bytes.as_bytes(), &mut linebuf);
assert!(rdr.fill().is_err());
assert_eq!(btos(rdr.buffer()), "");
assert_eq!(rdr.buffer(), "");
}
#[test]
@@ -789,7 +754,7 @@ and exhibited clearly, with a label attached.\
assert!(rdr.buffer().is_empty());
assert!(rdr.fill().unwrap());
assert_eq!(btos(rdr.buffer()), "homer\nli\x00sa\nmaggie\n");
assert_eq!(rdr.buffer(), "homer\nli\x00sa\nmaggie\n");
rdr.consume_all();
assert!(!rdr.fill().unwrap());
@@ -808,7 +773,7 @@ and exhibited clearly, with a label attached.\
assert!(rdr.buffer().is_empty());
assert!(rdr.fill().unwrap());
assert_eq!(btos(rdr.buffer()), "homer\nli");
assert_eq!(rdr.buffer(), "homer\nli");
rdr.consume_all();
assert!(!rdr.fill().unwrap());
@@ -825,7 +790,7 @@ and exhibited clearly, with a label attached.\
let mut rdr = LineBufferReader::new(bytes.as_bytes(), &mut linebuf);
assert!(!rdr.fill().unwrap());
assert_eq!(btos(rdr.buffer()), "");
assert_eq!(rdr.buffer(), "");
assert_eq!(rdr.absolute_byte_offset(), 0);
assert_eq!(rdr.binary_byte_offset(), Some(0));
}
@@ -841,7 +806,7 @@ and exhibited clearly, with a label attached.\
assert!(rdr.buffer().is_empty());
assert!(rdr.fill().unwrap());
assert_eq!(btos(rdr.buffer()), "homer\nlisa\nmaggie\n");
assert_eq!(rdr.buffer(), "homer\nlisa\nmaggie\n");
rdr.consume_all();
assert!(!rdr.fill().unwrap());
@@ -860,7 +825,7 @@ and exhibited clearly, with a label attached.\
assert!(rdr.buffer().is_empty());
assert!(rdr.fill().unwrap());
assert_eq!(btos(rdr.buffer()), "homer\nlisa\nmaggie");
assert_eq!(rdr.buffer(), "homer\nlisa\nmaggie");
rdr.consume_all();
assert!(!rdr.fill().unwrap());
@@ -878,7 +843,7 @@ and exhibited clearly, with a label attached.\
assert!(rdr.buffer().is_empty());
assert!(rdr.fill().unwrap());
assert_eq!(btos(rdr.buffer()), "\
assert_eq!(rdr.buffer(), "\
For the Doctor Watsons of this world, as opposed to the Sherlock
Holmeses, s\
");
@@ -901,7 +866,7 @@ Holmeses, s\
assert!(rdr.buffer().is_empty());
assert!(rdr.fill().unwrap());
assert_eq!(btos(rdr.buffer()), "homer\nli\nsa\nmaggie\n");
assert_eq!(rdr.buffer(), "homer\nli\nsa\nmaggie\n");
rdr.consume_all();
assert!(!rdr.fill().unwrap());
@@ -920,7 +885,7 @@ Holmeses, s\
assert!(rdr.buffer().is_empty());
assert!(rdr.fill().unwrap());
assert_eq!(btos(rdr.buffer()), "\nhomer\nlisa\nmaggie\n");
assert_eq!(rdr.buffer(), "\nhomer\nlisa\nmaggie\n");
rdr.consume_all();
assert!(!rdr.fill().unwrap());
@@ -939,7 +904,7 @@ Holmeses, s\
assert!(rdr.buffer().is_empty());
assert!(rdr.fill().unwrap());
assert_eq!(btos(rdr.buffer()), "homer\nlisa\nmaggie\n\n");
assert_eq!(rdr.buffer(), "homer\nlisa\nmaggie\n\n");
rdr.consume_all();
assert!(!rdr.fill().unwrap());
@@ -958,7 +923,7 @@ Holmeses, s\
assert!(rdr.buffer().is_empty());
assert!(rdr.fill().unwrap());
assert_eq!(btos(rdr.buffer()), "homer\nlisa\nmaggie\n\n");
assert_eq!(rdr.buffer(), "homer\nlisa\nmaggie\n\n");
rdr.consume_all();
assert!(!rdr.fill().unwrap());

View File

@@ -2,8 +2,8 @@
A collection of routines for performing operations on lines.
*/
use bstr::{B, BStr};
use bytecount;
use memchr::{memchr, memrchr};
use grep_matcher::{LineTerminator, Match};
/// An iterator over lines in a particular slice of bytes.
@@ -14,7 +14,7 @@ use grep_matcher::{LineTerminator, Match};
/// `'b` refers to the lifetime of the underlying bytes.
#[derive(Debug)]
pub struct LineIter<'b> {
bytes: &'b [u8],
bytes: &'b BStr,
stepper: LineStep,
}
@@ -23,7 +23,7 @@ impl<'b> LineIter<'b> {
/// are terminated by `line_term`.
pub fn new(line_term: u8, bytes: &'b [u8]) -> LineIter<'b> {
LineIter {
bytes: bytes,
bytes: B(bytes),
stepper: LineStep::new(line_term, 0, bytes.len()),
}
}
@@ -33,7 +33,7 @@ impl<'b> Iterator for LineIter<'b> {
type Item = &'b [u8];
fn next(&mut self) -> Option<&'b [u8]> {
self.stepper.next_match(self.bytes).map(|m| &self.bytes[m])
self.stepper.next_match(self.bytes).map(|m| self.bytes[m].as_bytes())
}
}
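For reference, the public `LineIter` still yields plain `&[u8]` lines even though it now stores a `BStr` internally; a minimal sketch, assuming the crate-root re-export:

use grep_searcher::LineIter;

fn main() {
    let mut it = LineIter::new(b'\n', b"foo\nbar\nbaz");
    assert_eq!(it.next(), Some(&b"foo\n"[..]));
    assert_eq!(it.next(), Some(&b"bar\n"[..]));
    // The final line is yielded even without a trailing terminator.
    assert_eq!(it.next(), Some(&b"baz"[..]));
    assert_eq!(it.next(), None);
}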
@@ -73,19 +73,19 @@ impl LineStep {
/// The range returned includes the line terminator. Ranges are always
/// non-empty.
pub fn next(&mut self, bytes: &[u8]) -> Option<(usize, usize)> {
self.next_impl(bytes)
self.next_impl(B(bytes))
}
/// Like next, but returns a `Match` instead of a tuple.
#[inline(always)]
pub(crate) fn next_match(&mut self, bytes: &[u8]) -> Option<Match> {
pub(crate) fn next_match(&mut self, bytes: &BStr) -> Option<Match> {
self.next_impl(bytes).map(|(s, e)| Match::new(s, e))
}
#[inline(always)]
fn next_impl(&mut self, mut bytes: &[u8]) -> Option<(usize, usize)> {
fn next_impl(&mut self, mut bytes: &BStr) -> Option<(usize, usize)> {
bytes = &bytes[..self.end];
match memchr(self.line_term, &bytes[self.pos..]) {
match bytes[self.pos..].find_byte(self.line_term) {
None => {
if self.pos < bytes.len() {
let m = (self.pos, bytes.len());
@@ -109,15 +109,15 @@ impl LineStep {
}
/// Count the number of occurrences of `line_term` in `bytes`.
pub fn count(bytes: &[u8], line_term: u8) -> u64 {
bytecount::count(bytes, line_term) as u64
pub fn count(bytes: &BStr, line_term: u8) -> u64 {
bytecount::count(bytes.as_bytes(), line_term) as u64
}
/// Given a line that possibly ends with a terminator, return that line without
/// the terminator.
#[inline(always)]
pub fn without_terminator(bytes: &[u8], line_term: LineTerminator) -> &[u8] {
let line_term = line_term.as_bytes();
pub fn without_terminator(bytes: &BStr, line_term: LineTerminator) -> &BStr {
let line_term = BStr::new(line_term.as_bytes());
let start = bytes.len().saturating_sub(line_term.len());
if bytes.get(start..) == Some(line_term) {
return &bytes[..bytes.len() - line_term.len()];
@@ -131,18 +131,20 @@ pub fn without_terminator(bytes: &[u8], line_term: LineTerminator) -> &[u8] {
/// Line terminators are considered part of the line they terminate.
#[inline(always)]
pub fn locate(
bytes: &[u8],
bytes: &BStr,
line_term: u8,
range: Match,
) -> Match {
let line_start = memrchr(line_term, &bytes[0..range.start()])
let line_start = bytes[..range.start()]
.rfind_byte(line_term)
.map_or(0, |i| i + 1);
let line_end =
if range.end() > line_start && bytes[range.end() - 1] == line_term {
range.end()
} else {
memchr(line_term, &bytes[range.end()..])
.map_or(bytes.len(), |i| range.end() + i + 1)
bytes[range.end()..]
.find_byte(line_term)
.map_or(bytes.len(), |i| range.end() + i + 1)
};
Match::new(line_start, line_end)
}
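A hedged sketch of `locate`'s semantics on the new `BStr` input, written as a test since the function is module-internal:

#[test]
fn locate_expands_to_full_line() {
    use bstr::B;

    // The match covers only `y` (5..6), but `locate` grows it to the
    // whole containing line, terminator included: b"xyz\n" at 4..8.
    assert_eq!(
        locate(B("abc\nxyz\n"), b'\n', Match::new(5, 6)),
        Match::new(4, 8)
    );
}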
@@ -155,7 +157,7 @@ pub fn locate(
///
/// If `bytes` ends with a line terminator, then the terminator itself is
/// considered part of the last line.
pub fn preceding(bytes: &[u8], line_term: u8, count: usize) -> usize {
pub fn preceding(bytes: &BStr, line_term: u8, count: usize) -> usize {
preceding_by_pos(bytes, bytes.len(), line_term, count)
}
@@ -169,7 +171,7 @@ pub fn preceding(bytes: &[u8], line_term: u8, count: usize) -> usize {
/// and `pos = 7`, `preceding(bytes, pos, b'\n', 0)` returns `4` (as does `pos
/// = 8`) and `preceding(bytes, pos, b'\n', 1)` returns `0`.
fn preceding_by_pos(
bytes: &[u8],
bytes: &BStr,
mut pos: usize,
line_term: u8,
mut count: usize,
@@ -180,7 +182,7 @@ fn preceding_by_pos(
pos -= 1;
}
loop {
match memrchr(line_term, &bytes[..pos]) {
match bytes[..pos].rfind_byte(line_term) {
None => {
return 0;
}
@@ -201,7 +203,10 @@ fn preceding_by_pos(
mod tests {
use std::ops::Range;
use std::str;
use bstr::B;
use grep_matcher::Match;
use super::*;
const SHERLOCK: &'static str = "\
@@ -220,7 +225,7 @@ and exhibited clearly, with a label attached.\
fn lines(text: &str) -> Vec<&str> {
let mut results = vec![];
let mut it = LineStep::new(b'\n', 0, text.len());
while let Some(m) = it.next_match(text.as_bytes()) {
while let Some(m) = it.next_match(B(text)) {
results.push(&text[m]);
}
results
@@ -229,26 +234,26 @@ and exhibited clearly, with a label attached.\
fn line_ranges(text: &str) -> Vec<Range<usize>> {
let mut results = vec![];
let mut it = LineStep::new(b'\n', 0, text.len());
while let Some(m) = it.next_match(text.as_bytes()) {
while let Some(m) = it.next_match(B(text)) {
results.push(m.start()..m.end());
}
results
}
fn prev(text: &str, pos: usize, count: usize) -> usize {
preceding_by_pos(text.as_bytes(), pos, b'\n', count)
preceding_by_pos(B(text), pos, b'\n', count)
}
fn loc(text: &str, start: usize, end: usize) -> Match {
locate(text.as_bytes(), b'\n', Match::new(start, end))
locate(B(text), b'\n', Match::new(start, end))
}
#[test]
fn line_count() {
assert_eq!(0, count(b"", b'\n'));
assert_eq!(1, count(b"\n", b'\n'));
assert_eq!(2, count(b"\n\n", b'\n'));
assert_eq!(2, count(b"a\nb\nc", b'\n'));
assert_eq!(0, count(B(""), b'\n'));
assert_eq!(1, count(B("\n"), b'\n'));
assert_eq!(2, count(B("\n\n"), b'\n'));
assert_eq!(2, count(B("a\nb\nc"), b'\n'));
}
#[test]
@@ -331,7 +336,7 @@ and exhibited clearly, with a label attached.\
#[test]
fn preceding_lines_doc() {
// These are the examples mentioned in the documentation of `preceding`.
let bytes = b"abc\nxyz\n";
let bytes = B("abc\nxyz\n");
assert_eq!(4, preceding_by_pos(bytes, 7, b'\n', 0));
assert_eq!(4, preceding_by_pos(bytes, 8, b'\n', 0));
assert_eq!(0, preceding_by_pos(bytes, 7, b'\n', 1));

View File

@@ -1,3 +1,4 @@
/// Like assert_eq, but nicer output for long strings.
#[cfg(test)]
#[macro_export]
macro_rules! assert_eq_printed {

View File

@@ -1,6 +1,6 @@
use std::cmp;
use memchr::memchr;
use bstr::BStr;
use grep_matcher::{LineMatchKind, Matcher};
use lines::{self, LineStep};
@@ -84,7 +84,7 @@ impl<'s, M: Matcher, S: Sink> Core<'s, M, S> {
pub fn matched(
&mut self,
buf: &[u8],
buf: &BStr,
range: &Range,
) -> Result<bool, S::Error> {
self.sink_matched(buf, range)
@@ -107,7 +107,7 @@ impl<'s, M: Matcher, S: Sink> Core<'s, M, S> {
})
}
pub fn match_by_line(&mut self, buf: &[u8]) -> Result<bool, S::Error> {
pub fn match_by_line(&mut self, buf: &BStr) -> Result<bool, S::Error> {
if self.is_line_by_line_fast() {
self.match_by_line_fast(buf)
} else {
@@ -115,7 +115,7 @@ impl<'s, M: Matcher, S: Sink> Core<'s, M, S> {
}
}
pub fn roll(&mut self, buf: &[u8]) -> usize {
pub fn roll(&mut self, buf: &BStr) -> usize {
let consumed =
if self.config.max_context() == 0 {
buf.len()
@@ -141,7 +141,7 @@ impl<'s, M: Matcher, S: Sink> Core<'s, M, S> {
consumed
}
pub fn detect_binary(&mut self, buf: &[u8], range: &Range) -> bool {
pub fn detect_binary(&mut self, buf: &BStr, range: &Range) -> bool {
if self.binary_byte_offset.is_some() {
return true;
}
@@ -149,7 +149,7 @@ impl<'s, M: Matcher, S: Sink> Core<'s, M, S> {
BinaryDetection::Quit(b) => b,
_ => return false,
};
if let Some(i) = memchr(binary_byte, &buf[*range]) {
if let Some(i) = buf[*range].find_byte(binary_byte) {
self.binary_byte_offset = Some(range.start() + i);
true
} else {
@@ -159,7 +159,7 @@ impl<'s, M: Matcher, S: Sink> Core<'s, M, S> {
pub fn before_context_by_line(
&mut self,
buf: &[u8],
buf: &BStr,
upto: usize,
) -> Result<bool, S::Error> {
if self.config.before_context == 0 {
@@ -194,7 +194,7 @@ impl<'s, M: Matcher, S: Sink> Core<'s, M, S> {
pub fn after_context_by_line(
&mut self,
buf: &[u8],
buf: &BStr,
upto: usize,
) -> Result<bool, S::Error> {
if self.after_context_left == 0 {
@@ -219,7 +219,7 @@ impl<'s, M: Matcher, S: Sink> Core<'s, M, S> {
pub fn other_context_by_line(
&mut self,
buf: &[u8],
buf: &BStr,
upto: usize,
) -> Result<bool, S::Error> {
let range = Range::new(self.last_line_visited, upto);
@@ -236,7 +236,7 @@ impl<'s, M: Matcher, S: Sink> Core<'s, M, S> {
Ok(true)
}
fn match_by_line_slow(&mut self, buf: &[u8]) -> Result<bool, S::Error> {
fn match_by_line_slow(&mut self, buf: &BStr) -> Result<bool, S::Error> {
debug_assert!(!self.searcher.multi_line_with_matcher(&self.matcher));
let range = Range::new(self.pos(), buf.len());
@@ -255,7 +255,7 @@ impl<'s, M: Matcher, S: Sink> Core<'s, M, S> {
&buf[line],
self.config.line_term,
);
match self.matcher.shortest_match(slice) {
match self.matcher.shortest_match(slice.as_bytes()) {
Err(err) => return Err(S::Error::error_message(err)),
Ok(result) => result.is_some(),
}
@@ -281,7 +281,7 @@ impl<'s, M: Matcher, S: Sink> Core<'s, M, S> {
Ok(true)
}
fn match_by_line_fast(&mut self, buf: &[u8]) -> Result<bool, S::Error> {
fn match_by_line_fast(&mut self, buf: &BStr) -> Result<bool, S::Error> {
debug_assert!(!self.config.passthru);
while !buf[self.pos()..].is_empty() {
@@ -316,7 +316,7 @@ impl<'s, M: Matcher, S: Sink> Core<'s, M, S> {
#[inline(always)]
fn match_by_line_fast_invert(
&mut self,
buf: &[u8],
buf: &BStr,
) -> Result<bool, S::Error> {
assert!(self.config.invert_match);
@@ -357,14 +357,14 @@ impl<'s, M: Matcher, S: Sink> Core<'s, M, S> {
#[inline(always)]
fn find_by_line_fast(
&self,
buf: &[u8],
buf: &BStr,
) -> Result<Option<Range>, S::Error> {
debug_assert!(!self.searcher.multi_line_with_matcher(&self.matcher));
debug_assert!(self.is_line_by_line_fast());
let mut pos = self.pos();
while !buf[pos..].is_empty() {
match self.matcher.find_candidate_line(&buf[pos..]) {
match self.matcher.find_candidate_line(buf[pos..].as_bytes()) {
Err(err) => return Err(S::Error::error_message(err)),
Ok(None) => return Ok(None),
Ok(Some(LineMatchKind::Confirmed(i))) => {
@@ -396,7 +396,7 @@ impl<'s, M: Matcher, S: Sink> Core<'s, M, S> {
&buf[line],
self.config.line_term,
);
match self.matcher.is_match(slice) {
match self.matcher.is_match(slice.as_bytes()) {
Err(err) => return Err(S::Error::error_message(err)),
Ok(true) => return Ok(Some(line)),
Ok(false) => {
@@ -413,7 +413,7 @@ impl<'s, M: Matcher, S: Sink> Core<'s, M, S> {
#[inline(always)]
fn sink_matched(
&mut self,
buf: &[u8],
buf: &BStr,
range: &Range,
) -> Result<bool, S::Error> {
if self.binary && self.detect_binary(buf, range) {
@@ -438,7 +438,7 @@ impl<'s, M: Matcher, S: Sink> Core<'s, M, S> {
&self.searcher,
&SinkMatch {
line_term: self.config.line_term,
bytes: linebuf,
bytes: linebuf.as_bytes(),
absolute_byte_offset: offset,
line_number: self.line_number,
},
@@ -454,7 +454,7 @@ impl<'s, M: Matcher, S: Sink> Core<'s, M, S> {
fn sink_before_context(
&mut self,
buf: &[u8],
buf: &BStr,
range: &Range,
) -> Result<bool, S::Error> {
if self.binary && self.detect_binary(buf, range) {
@@ -466,7 +466,7 @@ impl<'s, M: Matcher, S: Sink> Core<'s, M, S> {
&self.searcher,
&SinkContext {
line_term: self.config.line_term,
bytes: &buf[*range],
bytes: buf[*range].as_bytes(),
kind: SinkContextKind::Before,
absolute_byte_offset: offset,
line_number: self.line_number,
@@ -482,7 +482,7 @@ impl<'s, M: Matcher, S: Sink> Core<'s, M, S> {
fn sink_after_context(
&mut self,
buf: &[u8],
buf: &BStr,
range: &Range,
) -> Result<bool, S::Error> {
assert!(self.after_context_left >= 1);
@@ -496,7 +496,7 @@ impl<'s, M: Matcher, S: Sink> Core<'s, M, S> {
&self.searcher,
&SinkContext {
line_term: self.config.line_term,
bytes: &buf[*range],
bytes: buf[*range].as_bytes(),
kind: SinkContextKind::After,
absolute_byte_offset: offset,
line_number: self.line_number,
@@ -513,7 +513,7 @@ impl<'s, M: Matcher, S: Sink> Core<'s, M, S> {
fn sink_other_context(
&mut self,
buf: &[u8],
buf: &BStr,
range: &Range,
) -> Result<bool, S::Error> {
if self.binary && self.detect_binary(buf, range) {
@@ -525,7 +525,7 @@ impl<'s, M: Matcher, S: Sink> Core<'s, M, S> {
&self.searcher,
&SinkContext {
line_term: self.config.line_term,
bytes: &buf[*range],
bytes: buf[*range].as_bytes(),
kind: SinkContextKind::Other,
absolute_byte_offset: offset,
line_number: self.line_number,
@@ -555,7 +555,7 @@ impl<'s, M: Matcher, S: Sink> Core<'s, M, S> {
}
}
fn count_lines(&mut self, buf: &[u8], upto: usize) {
fn count_lines(&mut self, buf: &BStr, upto: usize) {
if let Some(ref mut line_number) = self.line_number {
if self.last_line_counted >= upto {
return;

View File

@@ -1,7 +1,9 @@
use std::cmp;
use std::io;
use bstr::BStr;
use grep_matcher::Matcher;
use lines::{self, LineStep};
use line_buffer::{DEFAULT_BUFFER_CAPACITY, LineBufferReader};
use sink::{Sink, SinkError};
@@ -77,14 +79,14 @@ where M: Matcher,
pub struct SliceByLine<'s, M: 's, S> {
config: &'s Config,
core: Core<'s, M, S>,
slice: &'s [u8],
slice: &'s BStr,
}
impl<'s, M: Matcher, S: Sink> SliceByLine<'s, M, S> {
pub fn new(
searcher: &'s Searcher,
matcher: M,
slice: &'s [u8],
slice: &'s BStr,
write_to: S,
) -> SliceByLine<'s, M, S> {
debug_assert!(!searcher.multi_line_with_matcher(&matcher));
@@ -127,7 +129,7 @@ impl<'s, M: Matcher, S: Sink> SliceByLine<'s, M, S> {
pub struct MultiLine<'s, M: 's, S> {
config: &'s Config,
core: Core<'s, M, S>,
slice: &'s [u8],
slice: &'s BStr,
last_match: Option<Range>,
}
@@ -135,7 +137,7 @@ impl<'s, M: Matcher, S: Sink> MultiLine<'s, M, S> {
pub fn new(
searcher: &'s Searcher,
matcher: M,
slice: &'s [u8],
slice: &'s BStr,
write_to: S,
) -> MultiLine<'s, M, S> {
debug_assert!(searcher.multi_line_with_matcher(&matcher));
@@ -306,7 +308,8 @@ impl<'s, M: Matcher, S: Sink> MultiLine<'s, M, S> {
}
fn find(&mut self) -> Result<Option<Range>, S::Error> {
match self.core.matcher().find(&self.slice[self.core.pos()..]) {
let haystack = &self.slice[self.core.pos()..];
match self.core.matcher().find(haystack.as_bytes()) {
Err(err) => Err(S::Error::error_message(err)),
Ok(None) => Ok(None),
Ok(Some(m)) => Ok(Some(m.offset(self.core.pos()))),

View File

@@ -5,6 +5,7 @@ use std::fs::File;
use std::io::{self, Read};
use std::path::Path;
use bstr::{B, BStr, BString};
use encoding_rs;
use encoding_rs_io::DecodeReaderBytesBuilder;
use grep_matcher::{LineTerminator, Match, Matcher};
@@ -311,9 +312,9 @@ impl SearcherBuilder {
Searcher {
config: config,
decode_builder: decode_builder,
decode_buffer: RefCell::new(vec![0; 8 * (1<<10)]),
decode_buffer: RefCell::new(BString::from(vec![0; 8 * (1<<10)])),
line_buffer: RefCell::new(self.config.line_buffer()),
multi_line_buffer: RefCell::new(vec![]),
multi_line_buffer: RefCell::new(BString::new()),
}
}
@@ -543,7 +544,7 @@ pub struct Searcher {
/// through the underlying bytes with no additional overhead.
decode_builder: DecodeReaderBytesBuilder,
/// A buffer that is used for transcoding scratch space.
decode_buffer: RefCell<Vec<u8>>,
decode_buffer: RefCell<BString>,
/// A line buffer for use in line oriented searching.
///
/// We wrap it in a RefCell to permit lending out borrows of `Searcher`
@@ -555,7 +556,7 @@ pub struct Searcher {
/// multi line search. In particular, multi line searches cannot be
/// performed incrementally, and need the entire haystack in memory at
/// once.
multi_line_buffer: RefCell<Vec<u8>>,
multi_line_buffer: RefCell<BString>,
}
impl Searcher {
@@ -666,7 +667,7 @@ impl Searcher {
let mut decode_buffer = self.decode_buffer.borrow_mut();
let read_from = self.decode_builder
.build_with_buffer(read_from, &mut *decode_buffer)
.build_with_buffer(read_from, decode_buffer.as_mut_vec())
.map_err(S::Error::error_io)?;
if self.multi_line_with_matcher(&matcher) {
@@ -698,12 +699,13 @@ impl Searcher {
where M: Matcher,
S: Sink,
{
let slice = B(slice);
self.check_config(&matcher).map_err(S::Error::error_config)?;
// We can search the slice directly, unless we need to do transcoding.
if self.slice_needs_transcoding(slice) {
trace!("slice reader: needs transcoding, using generic reader");
return self.search_reader(matcher, slice, write_to);
return self.search_reader(matcher, slice.as_bytes(), write_to);
}
if self.multi_line_with_matcher(&matcher) {
trace!("slice reader: searching via multiline strategy");
@@ -736,7 +738,7 @@ impl Searcher {
}
/// Returns true if and only if the given slice needs to be transcoded.
fn slice_needs_transcoding(&self, slice: &[u8]) -> bool {
fn slice_needs_transcoding(&self, slice: &BStr) -> bool {
self.config.encoding.is_some() || slice_has_utf16_bom(slice)
}
}
@@ -851,7 +853,9 @@ impl Searcher {
.map(|m| m.len() as usize + 1)
.unwrap_or(0);
buf.reserve(cap);
read_from.read_to_end(&mut *buf).map_err(S::Error::error_io)?;
read_from
.read_to_end(buf.as_mut_vec())
.map_err(S::Error::error_io)?;
return Ok(());
}
self.fill_multi_line_buffer_from_reader::<_, S>(read_from)
@@ -868,6 +872,7 @@ impl Searcher {
assert!(self.config.multi_line);
let mut buf = self.multi_line_buffer.borrow_mut();
let buf = buf.as_mut_vec();
buf.clear();
// If we don't have a heap limit, then we can defer to std's
@@ -919,8 +924,8 @@ impl Searcher {
///
/// This is used by the searcher to determine if a transcoder is necessary.
/// Otherwise, it is advantageous to search the slice directly.
fn slice_has_utf16_bom(slice: &[u8]) -> bool {
let enc = match encoding_rs::Encoding::for_bom(slice) {
fn slice_has_utf16_bom(slice: &BStr) -> bool {
let enc = match encoding_rs::Encoding::for_bom(slice.as_bytes()) {
None => return false,
Some((enc, _)) => enc,
};

View File

@@ -246,6 +246,53 @@ impl<'a, S: Sink> Sink for &'a mut S {
}
}
impl<S: Sink + ?Sized> Sink for Box<S> {
type Error = S::Error;
#[inline]
fn matched(
&mut self,
searcher: &Searcher,
mat: &SinkMatch,
) -> Result<bool, S::Error> {
(**self).matched(searcher, mat)
}
#[inline]
fn context(
&mut self,
searcher: &Searcher,
context: &SinkContext,
) -> Result<bool, S::Error> {
(**self).context(searcher, context)
}
#[inline]
fn context_break(
&mut self,
searcher: &Searcher,
) -> Result<bool, S::Error> {
(**self).context_break(searcher)
}
#[inline]
fn begin(
&mut self,
searcher: &Searcher,
) -> Result<bool, S::Error> {
(**self).begin(searcher)
}
#[inline]
fn finish(
&mut self,
searcher: &Searcher,
sink_finish: &SinkFinish,
) -> Result<(), S::Error> {
(**self).finish(searcher, sink_finish)
}
}
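This impl lets callers erase the concrete sink type, e.g. to choose a sink at runtime. A minimal sketch assuming the published grep-searcher and grep-regex APIs:

use grep_regex::RegexMatcher;
use grep_searcher::{Searcher, Sink};
use grep_searcher::sinks::UTF8;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let matcher = RegexMatcher::new("Watson")?;
    let mut lines = 0u64;
    // Type-erased sink: usable anywhere a concrete `Sink` is expected.
    let sink: Box<dyn Sink<Error = std::io::Error>> =
        Box::new(UTF8(|_lnum, _line| { lines += 1; Ok(true) }));
    Searcher::new().search_slice(&matcher, b"Watson\nHolmes\n", sink)?;
    assert_eq!(lines, 1);
    Ok(())
}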
/// Summary data reported at the end of a search.
///
/// This reports data such as the total number of bytes searched and the

View File

@@ -1,10 +1,10 @@
use std::io::{self, Write};
use std::str;
use bstr::B;
use grep_matcher::{
LineMatchKind, LineTerminator, Match, Matcher, NoCaptures, NoError,
};
use memchr::memchr;
use regex::bytes::{Regex, RegexBuilder};
use searcher::{BinaryDetection, Searcher, SearcherBuilder};
@@ -94,8 +94,8 @@ impl Matcher for RegexMatcher {
}
// Make it interesting and return the last byte in the current
// line.
let i = memchr(self.line_term.unwrap().as_byte(), haystack)
.map(|i| i)
let i = B(haystack)
.find_byte(self.line_term.unwrap().as_byte())
.unwrap_or(haystack.len() - 1);
Ok(Some(LineMatchKind::Candidate(i)))
} else {

View File

@@ -1,6 +1,6 @@
[package]
name = "grep"
version = "0.2.0" #:version
version = "0.2.3" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
Fast line oriented regex searching as a library.
@@ -13,18 +13,27 @@ keywords = ["regex", "grep", "egrep", "search", "pattern"]
license = "Unlicense/MIT"
[dependencies]
grep-matcher = { version = "0.1.0", path = "../grep-matcher" }
grep-pcre2 = { version = "0.1.0", path = "../grep-pcre2", optional = true }
grep-printer = { version = "0.1.0", path = "../grep-printer" }
grep-regex = { version = "0.1.0", path = "../grep-regex" }
grep-searcher = { version = "0.1.0", path = "../grep-searcher" }
grep-cli = { version = "0.1.1", path = "../grep-cli" }
grep-matcher = { version = "0.1.1", path = "../grep-matcher" }
grep-pcre2 = { version = "0.1.2", path = "../grep-pcre2", optional = true }
grep-printer = { version = "0.1.1", path = "../grep-printer" }
grep-regex = { version = "0.1.1", path = "../grep-regex" }
grep-searcher = { version = "0.1.1", path = "../grep-searcher" }
[dev-dependencies]
atty = "0.2.11"
termcolor = "1"
walkdir = "2.2.0"
regex = "1.1"
termcolor = "1.0.4"
walkdir = "2.2.7"
[dev-dependencies.clap]
version = "2.32.0"
default-features = false
features = ["suggestions"]
[features]
avx-accel = ["grep-searcher/avx-accel"]
simd-accel = ["grep-searcher/simd-accel"]
pcre2 = ["grep-pcre2"]
# This feature is DEPRECATED. Runtime dispatch is used for SIMD now.
avx-accel = []

View File

@@ -1,29 +1,19 @@
extern crate atty;
extern crate grep;
extern crate termcolor;
extern crate walkdir;
use std::env;
use std::error;
use std::error::Error;
use std::ffi::OsString;
use std::path::Path;
use std::process;
use std::result;
use grep::cli;
use grep::printer::{ColorSpecs, StandardBuilder};
use grep::regex::RegexMatcher;
use grep::searcher::{BinaryDetection, SearcherBuilder};
use termcolor::{ColorChoice, StandardStream};
use termcolor::ColorChoice;
use walkdir::WalkDir;
macro_rules! fail {
($($tt:tt)*) => {
return Err(From::from(format!($($tt)*)));
}
}
type Result<T> = result::Result<T, Box<error::Error>>;
fn main() {
if let Err(err) = try_main() {
eprintln!("{}", err);
@@ -31,45 +21,39 @@ fn main() {
}
}
fn try_main() -> Result<()> {
fn try_main() -> Result<(), Box<Error>> {
let mut args: Vec<OsString> = env::args_os().collect();
if args.len() < 2 {
fail!("Usage: simplegrep <pattern> [<path> ...]");
return Err("Usage: simplegrep <pattern> [<path> ...]".into());
}
if args.len() == 2 {
args.push(OsString::from("./"));
}
let pattern = match args[1].clone().into_string() {
Ok(pattern) => pattern,
Err(_) => {
fail!(
"pattern is not valid UTF-8: '{:?}'",
args[1].to_string_lossy()
);
}
};
search(&pattern, &args[2..])
search(cli::pattern_from_os(&args[1])?, &args[2..])
}
fn search(pattern: &str, paths: &[OsString]) -> Result<()> {
fn search(pattern: &str, paths: &[OsString]) -> Result<(), Box<Error>> {
let matcher = RegexMatcher::new_line_matcher(&pattern)?;
let mut searcher = SearcherBuilder::new()
.binary_detection(BinaryDetection::quit(b'\x00'))
.line_number(false)
.build();
let mut printer = StandardBuilder::new()
.color_specs(colors())
.build(StandardStream::stdout(color_choice()));
.color_specs(ColorSpecs::default_with_color())
.build(cli::stdout(
if cli::is_tty_stdout() {
ColorChoice::Auto
} else {
ColorChoice::Never
}
));
for path in paths {
for result in WalkDir::new(path) {
let dent = match result {
Ok(dent) => dent,
Err(err) => {
eprintln!(
"{}: {}",
err.path().unwrap_or(Path::new("error")).display(),
err,
);
eprintln!("{}", err);
continue;
}
};
@@ -88,20 +72,3 @@ fn search(pattern: &str, paths: &[OsString]) -> Result<()> {
}
Ok(())
}
fn color_choice() -> ColorChoice {
if atty::is(atty::Stream::Stdout) {
ColorChoice::Auto
} else {
ColorChoice::Never
}
}
fn colors() -> ColorSpecs {
ColorSpecs::new(&[
"path:fg:magenta".parse().unwrap(),
"line:fg:green".parse().unwrap(),
"match:fg:red".parse().unwrap(),
"match:style:bold".parse().unwrap(),
])
}

View File

@@ -14,6 +14,7 @@ A cookbook and a guide are planned.
#![deny(missing_docs)]
pub extern crate grep_cli as cli;
pub extern crate grep_matcher as matcher;
#[cfg(feature = "pcre2")]
pub extern crate grep_pcre2 as pcre2;

View File

@@ -1,6 +1,6 @@
[package]
name = "ignore"
version = "0.4.3" #:version
version = "0.4.6" #:version
authors = ["Andrew Gallant <jamslam@gmail.com>"]
description = """
A fast library for efficiently matching ignore files such as `.gitignore`
@@ -18,22 +18,21 @@ name = "ignore"
bench = false
[dependencies]
crossbeam = "0.3"
globset = { version = "0.4.0", path = "../globset" }
lazy_static = "1"
log = "0.4"
memchr = "2"
regex = "1"
same-file = "1"
thread_local = "0.3.2"
walkdir = "2.2.0"
crossbeam-channel = "0.3.6"
globset = { version = "0.4.2", path = "../globset" }
lazy_static = "1.1"
log = "0.4.5"
memchr = "2.1"
regex = "1.1"
same-file = "1.0.4"
thread_local = "0.3.6"
walkdir = "2.2.7"
[target.'cfg(windows)'.dependencies.winapi]
version = "0.3"
features = ["std", "winnt"]
[target.'cfg(windows)'.dependencies.winapi-util]
version = "0.1.1"
[dev-dependencies]
tempdir = "0.3.5"
tempfile = "3.0.5"
[features]
simd-accel = ["globset/simd-accel"]

View File

@@ -1,14 +1,12 @@
extern crate crossbeam;
extern crate crossbeam_channel as channel;
extern crate ignore;
extern crate walkdir;
use std::env;
use std::io::{self, Write};
use std::path::Path;
use std::sync::Arc;
use std::thread;
use crossbeam::sync::MsQueue;
use ignore::WalkBuilder;
use walkdir::WalkDir;
@@ -16,7 +14,7 @@ fn main() {
let mut path = env::args().nth(1).unwrap();
let mut parallel = false;
let mut simple = false;
let queue: Arc<MsQueue<Option<DirEntry>>> = Arc::new(MsQueue::new());
let (tx, rx) = channel::bounded::<DirEntry>(100);
if path == "parallel" {
path = env::args().nth(2).unwrap();
parallel = true;
@@ -25,10 +23,9 @@ fn main() {
simple = true;
}
let stdout_queue = queue.clone();
let stdout_thread = thread::spawn(move || {
let mut stdout = io::BufWriter::new(io::stdout());
while let Some(dent) = stdout_queue.pop() {
for dent in rx {
write_path(&mut stdout, dent.path());
}
});
@@ -36,26 +33,26 @@ fn main() {
if parallel {
let walker = WalkBuilder::new(path).threads(6).build_parallel();
walker.run(|| {
let queue = queue.clone();
let tx = tx.clone();
Box::new(move |result| {
use ignore::WalkState::*;
queue.push(Some(DirEntry::Y(result.unwrap())));
tx.send(DirEntry::Y(result.unwrap())).unwrap();
Continue
})
});
} else if simple {
let walker = WalkDir::new(path);
for result in walker {
queue.push(Some(DirEntry::X(result.unwrap())));
tx.send(DirEntry::X(result.unwrap())).unwrap();
}
} else {
let walker = WalkBuilder::new(path).build();
for result in walker {
queue.push(Some(DirEntry::Y(result.unwrap())));
tx.send(DirEntry::Y(result.unwrap())).unwrap();
}
}
queue.push(None);
drop(tx);
stdout_thread.join().unwrap();
}

View File

@@ -661,7 +661,7 @@ mod tests {
use std::io::Write;
use std::path::Path;
use tempdir::TempDir;
use tempfile::{self, TempDir};
use dir::IgnoreBuilder;
use gitignore::Gitignore;
@@ -683,9 +683,13 @@ mod tests {
}
}
fn tmpdir(prefix: &str) -> TempDir {
tempfile::Builder::new().prefix(prefix).tempdir().unwrap()
}
#[test]
fn explicit_ignore() {
let td = TempDir::new("ignore-test-").unwrap();
let td = tmpdir("ignore-test-");
wfile(td.path().join("not-an-ignore"), "foo\n!bar");
let (gi, err) = Gitignore::new(td.path().join("not-an-ignore"));
@@ -700,7 +704,7 @@ mod tests {
#[test]
fn git_exclude() {
let td = TempDir::new("ignore-test-").unwrap();
let td = tmpdir("ignore-test-");
mkdirp(td.path().join(".git/info"));
wfile(td.path().join(".git/info/exclude"), "foo\n!bar");
@@ -713,7 +717,7 @@ mod tests {
#[test]
fn gitignore() {
let td = TempDir::new("ignore-test-").unwrap();
let td = tmpdir("ignore-test-");
mkdirp(td.path().join(".git"));
wfile(td.path().join(".gitignore"), "foo\n!bar");
@@ -726,7 +730,7 @@ mod tests {
#[test]
fn gitignore_no_git() {
let td = TempDir::new("ignore-test-").unwrap();
let td = tmpdir("ignore-test-");
wfile(td.path().join(".gitignore"), "foo\n!bar");
let (ig, err) = IgnoreBuilder::new().build().add_child(td.path());
@@ -738,7 +742,7 @@ mod tests {
#[test]
fn ignore() {
let td = TempDir::new("ignore-test-").unwrap();
let td = tmpdir("ignore-test-");
wfile(td.path().join(".ignore"), "foo\n!bar");
let (ig, err) = IgnoreBuilder::new().build().add_child(td.path());
@@ -750,7 +754,7 @@ mod tests {
#[test]
fn custom_ignore() {
let td = TempDir::new("ignore-test-").unwrap();
let td = tmpdir("ignore-test-");
let custom_ignore = ".customignore";
wfile(td.path().join(custom_ignore), "foo\n!bar");
@@ -766,7 +770,7 @@ mod tests {
// Tests that a custom ignore file will override an .ignore.
#[test]
fn custom_ignore_over_ignore() {
let td = TempDir::new("ignore-test-").unwrap();
let td = tmpdir("ignore-test-");
let custom_ignore = ".customignore";
wfile(td.path().join(".ignore"), "foo");
wfile(td.path().join(custom_ignore), "!foo");
@@ -781,7 +785,7 @@ mod tests {
// Tests that earlier custom ignore files have lower precedence than later.
#[test]
fn custom_ignore_precedence() {
let td = TempDir::new("ignore-test-").unwrap();
let td = tmpdir("ignore-test-");
let custom_ignore1 = ".customignore1";
let custom_ignore2 = ".customignore2";
wfile(td.path().join(custom_ignore1), "foo");
@@ -798,7 +802,7 @@ mod tests {
// Tests that an .ignore will override a .gitignore.
#[test]
fn ignore_over_gitignore() {
let td = TempDir::new("ignore-test-").unwrap();
let td = tmpdir("ignore-test-");
wfile(td.path().join(".gitignore"), "foo");
wfile(td.path().join(".ignore"), "!foo");
@@ -810,7 +814,7 @@ mod tests {
// Tests that exclude has lower precedent than both .ignore and .gitignore.
#[test]
fn exclude_lowest() {
let td = TempDir::new("ignore-test-").unwrap();
let td = tmpdir("ignore-test-");
wfile(td.path().join(".gitignore"), "!foo");
wfile(td.path().join(".ignore"), "!bar");
mkdirp(td.path().join(".git/info"));
@@ -825,7 +829,7 @@ mod tests {
#[test]
fn errored() {
let td = TempDir::new("ignore-test-").unwrap();
let td = tmpdir("ignore-test-");
wfile(td.path().join(".gitignore"), "f**oo");
let (_, err) = IgnoreBuilder::new().build().add_child(td.path());
@@ -834,7 +838,7 @@ mod tests {
#[test]
fn errored_both() {
let td = TempDir::new("ignore-test-").unwrap();
let td = tmpdir("ignore-test-");
wfile(td.path().join(".gitignore"), "f**oo");
wfile(td.path().join(".ignore"), "fo**o");
@@ -844,7 +848,7 @@ mod tests {
#[test]
fn errored_partial() {
let td = TempDir::new("ignore-test-").unwrap();
let td = tmpdir("ignore-test-");
mkdirp(td.path().join(".git"));
wfile(td.path().join(".gitignore"), "f**oo\nbar");
@@ -855,7 +859,7 @@ mod tests {
#[test]
fn errored_partial_and_ignore() {
let td = TempDir::new("ignore-test-").unwrap();
let td = tmpdir("ignore-test-");
wfile(td.path().join(".gitignore"), "f**oo\nbar");
wfile(td.path().join(".ignore"), "!bar");
@@ -866,7 +870,7 @@ mod tests {
#[test]
fn not_present_empty() {
let td = TempDir::new("ignore-test-").unwrap();
let td = tmpdir("ignore-test-");
let (_, err) = IgnoreBuilder::new().build().add_child(td.path());
assert!(err.is_none());
@@ -876,7 +880,7 @@ mod tests {
fn stops_at_git_dir() {
// This tests that .gitignore files beyond a .git barrier aren't
// matched, but .ignore files are.
let td = TempDir::new("ignore-test-").unwrap();
let td = tmpdir("ignore-test-");
mkdirp(td.path().join(".git"));
mkdirp(td.path().join("foo/.git"));
wfile(td.path().join(".gitignore"), "foo");
@@ -897,7 +901,7 @@ mod tests {
#[test]
fn absolute_parent() {
let td = TempDir::new("ignore-test-").unwrap();
let td = tmpdir("ignore-test-");
mkdirp(td.path().join(".git"));
mkdirp(td.path().join("foo"));
wfile(td.path().join(".gitignore"), "bar");
@@ -920,7 +924,7 @@ mod tests {
#[test]
fn absolute_parent_anchored() {
let td = TempDir::new("ignore-test-").unwrap();
let td = tmpdir("ignore-test-");
mkdirp(td.path().join(".git"));
mkdirp(td.path().join("src/llvm"));
wfile(td.path().join(".gitignore"), "/llvm/\nfoo");

View File

@@ -419,6 +419,8 @@ impl GitignoreBuilder {
from: Option<PathBuf>,
mut line: &str,
) -> Result<&mut GitignoreBuilder, Error> {
#![allow(deprecated)]
if line.starts_with("#") {
return Ok(self);
}
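For context, `add_line` feeds one gitignore rule into the builder, and comment lines are dropped by the check above; a hedged sketch with a hypothetical root path:

use ignore::gitignore::GitignoreBuilder;

fn main() {
    let mut builder = GitignoreBuilder::new("/repo");
    builder.add_line(None, "# just a comment").unwrap();
    builder.add_line(None, "*.log").unwrap();
    let gi = builder.build().unwrap();
    assert!(gi.matched("/repo/debug.log", false).is_ignore());
    assert!(gi.matched("/repo/src/main.rs", false).is_none());
}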

View File

@@ -46,7 +46,7 @@ See the documentation for `WalkBuilder` for many other options.
#![deny(missing_docs)]
extern crate crossbeam;
extern crate crossbeam_channel as channel;
extern crate globset;
#[macro_use]
extern crate lazy_static;
@@ -56,11 +56,11 @@ extern crate memchr;
extern crate regex;
extern crate same_file;
#[cfg(test)]
extern crate tempdir;
extern crate tempfile;
extern crate thread_local;
extern crate walkdir;
#[cfg(windows)]
extern crate winapi;
extern crate winapi_util;
use std::error;
use std::fmt;

View File

@@ -103,10 +103,12 @@ const DEFAULT_TYPES: &'static [(&'static str, &'static [&'static str])] = &[
("amake", &["*.mk", "*.bp"]),
("asciidoc", &["*.adoc", "*.asc", "*.asciidoc"]),
("asm", &["*.asm", "*.s", "*.S"]),
("asp", &["*.aspx", "*.aspx.cs", "*.aspx.cs", "*.ascx", "*.ascx.cs", "*.ascx.vb"]),
("avro", &["*.avdl", "*.avpr", "*.avsc"]),
("awk", &["*.awk"]),
("bazel", &["*.bzl", "WORKSPACE", "BUILD"]),
("bazel", &["*.bzl", "WORKSPACE", "BUILD", "BUILD.bazel"]),
("bitbake", &["*.bb", "*.bbappend", "*.bbclass", "*.conf", "*.inc"]),
("buildstream", &["*.bst"]),
("bzip2", &["*.bz2"]),
("c", &["*.c", "*.h", "*.H", "*.cats"]),
("cabal", &["*.cabal"]),
@@ -127,7 +129,7 @@ const DEFAULT_TYPES: &'static [(&'static str, &'static [&'static str])] = &[
("cshtml", &["*.cshtml"]),
("css", &["*.css", "*.scss"]),
("csv", &["*.csv"]),
("cython", &["*.pyx"]),
("cython", &["*.pyx", "*.pxi", "*.pxd"]),
("dart", &["*.dart"]),
("d", &["*.d"]),
("dhall", &["*.dhall"]),
@@ -219,16 +221,19 @@ const DEFAULT_TYPES: &'static [(&'static str, &'static [&'static str])] = &[
("objcpp", &["*.h", "*.mm"]),
("ocaml", &["*.ml", "*.mli", "*.mll", "*.mly"]),
("org", &["*.org"]),
("pascal", &["*.pas", "*.dpr", "*.lpr", "*.pp", "*.inc"]),
("perl", &["*.perl", "*.pl", "*.PL", "*.plh", "*.plx", "*.pm", "*.t"]),
("pdf", &["*.pdf"]),
("php", &["*.php", "*.php3", "*.php4", "*.php5", "*.phtml"]),
("pod", &["*.pod"]),
("postscript", &[".eps", ".ps"]),
("protobuf", &["*.proto"]),
("ps", &["*.cdxml", "*.ps1", "*.ps1xml", "*.psd1", "*.psm1"]),
("puppet", &["*.erb", "*.pp", "*.rb"]),
("purs", &["*.purs"]),
("py", &["*.py"]),
("qmake", &["*.pro", "*.pri", "*.prf"]),
("qml", &["*.qml"]),
("readme", &["README*", "*README"]),
("r", &["*.R", "*.r", "*.Rmd", "*.Rnw"]),
("rdoc", &["*.rdoc"]),
@@ -279,6 +284,7 @@ const DEFAULT_TYPES: &'static [(&'static str, &'static [&'static str])] = &[
("tcl", &["*.tcl"]),
("tex", &["*.tex", "*.ltx", "*.cls", "*.sty", "*.bib"]),
("textile", &["*.textile"]),
("thrift", &["*.thrift"]),
("tf", &["*.tf"]),
("ts", &["*.ts", "*.tsx"]),
("txt", &["*.txt"]),

View File

@@ -10,7 +10,7 @@ use std::thread;
use std::time::Duration;
use std::vec;
use crossbeam::sync::MsQueue;
use channel;
use same_file::Handle;
use walkdir::{self, WalkDir};
@@ -36,6 +36,14 @@ impl DirEntry {
self.dent.path()
}
/// The full path that this entry represents.
/// Analogous to [`path`], but moves ownership of the path.
///
/// [`path`]: struct.DirEntry.html#method.path
pub fn into_path(self) -> PathBuf {
self.dent.into_path()
}
/// Whether this entry corresponds to a symbolic link or not.
pub fn path_is_symlink(&self) -> bool {
self.dent.path_is_symlink()
@@ -84,7 +92,8 @@ impl DirEntry {
/// Returns an error, if one exists, associated with processing this entry.
///
/// An example of an error is one that occurred while parsing an ignore
/// file.
/// file. Errors related to traversing a directory tree itself are reported
/// as part of yielding the directory entry, and not with this method.
pub fn error(&self) -> Option<&Error> {
self.err.as_ref()
}
@@ -143,6 +152,15 @@ impl DirEntryInner {
}
}
fn into_path(self) -> PathBuf {
use self::DirEntryInner::*;
match self {
Stdin => PathBuf::from("<stdin>"),
Walkdir(x) => x.into_path(),
Raw(x) => x.into_path(),
}
}
fn path_is_symlink(&self) -> bool {
use self::DirEntryInner::*;
match *self {
@@ -215,19 +233,6 @@ impl DirEntryInner {
}
/// Returns true if and only if this entry points to a directory.
///
/// This works around a bug in Rust's standard library:
/// https://github.com/rust-lang/rust/issues/46484
#[cfg(windows)]
fn is_dir(&self) -> bool {
self.metadata().map(|md| metadata_is_dir(&md)).unwrap_or(false)
}
/// Returns true if and only if this entry points to a directory.
///
/// This works around a bug in Rust's standard library:
/// https://github.com/rust-lang/rust/issues/46484
#[cfg(not(windows))]
fn is_dir(&self) -> bool {
self.file_type().map(|ft| ft.is_dir()).unwrap_or(false)
}
@@ -252,10 +257,6 @@ struct DirEntryRaw {
ino: u64,
/// The underlying metadata (Windows only). We store this on Windows
/// because this comes for free while reading a directory.
///
/// We use this to determine whether an entry is a directory or not, which
/// works around a bug in Rust's standard library:
/// https://github.com/rust-lang/rust/issues/46484
#[cfg(windows)]
metadata: fs::Metadata,
}
@@ -278,6 +279,10 @@ impl DirEntryRaw {
&self.path
}
fn into_path(self) -> PathBuf {
self.path
}
fn path_is_symlink(&self) -> bool {
self.ty.is_symlink() || self.follow_link
}
@@ -375,21 +380,29 @@ impl DirEntryRaw {
}
#[cfg(not(unix))]
fn from_link(depth: usize, pb: PathBuf) -> Result<DirEntryRaw, Error> {
fn from_path(
depth: usize,
pb: PathBuf,
link: bool,
) -> Result<DirEntryRaw, Error> {
let md = fs::metadata(&pb).map_err(|err| {
Error::Io(err).with_path(&pb)
})?;
Ok(DirEntryRaw {
path: pb,
ty: md.file_type(),
follow_link: true,
follow_link: link,
depth: depth,
metadata: md,
})
}
#[cfg(unix)]
fn from_link(depth: usize, pb: PathBuf) -> Result<DirEntryRaw, Error> {
fn from_path(
depth: usize,
pb: PathBuf,
link: bool,
) -> Result<DirEntryRaw, Error> {
use std::os::unix::fs::MetadataExt;
let md = fs::metadata(&pb).map_err(|err| {
@@ -398,7 +411,7 @@ impl DirEntryRaw {
Ok(DirEntryRaw {
path: pb,
ty: md.file_type(),
follow_link: true,
follow_link: link,
depth: depth,
ino: md.ino(),
})
@@ -460,10 +473,16 @@ pub struct WalkBuilder {
max_depth: Option<usize>,
max_filesize: Option<u64>,
follow_links: bool,
sorter: Option<Arc<
Fn(&OsStr, &OsStr) -> cmp::Ordering + Send + Sync + 'static
>>,
same_file_system: bool,
sorter: Option<Sorter>,
threads: usize,
skip: Option<Arc<Handle>>,
}
#[derive(Clone)]
enum Sorter {
ByName(Arc<Fn(&OsStr, &OsStr) -> cmp::Ordering + Send + Sync + 'static>),
ByPath(Arc<Fn(&Path, &Path) -> cmp::Ordering + Send + Sync + 'static>),
}
impl fmt::Debug for WalkBuilder {
@@ -475,6 +494,7 @@ impl fmt::Debug for WalkBuilder {
.field("max_filesize", &self.max_filesize)
.field("follow_links", &self.follow_links)
.field("threads", &self.threads)
.field("skip", &self.skip)
.finish()
}
}
@@ -493,8 +513,10 @@ impl WalkBuilder {
max_depth: None,
max_filesize: None,
follow_links: false,
same_file_system: false,
sorter: None,
threads: 0,
skip: None,
}
}
@@ -502,21 +524,30 @@ impl WalkBuilder {
pub fn build(&self) -> Walk {
let follow_links = self.follow_links;
let max_depth = self.max_depth;
let cmp = self.sorter.clone();
let sorter = self.sorter.clone();
let its = self.paths.iter().map(move |p| {
if p == Path::new("-") {
(p.to_path_buf(), None)
} else {
let mut wd = WalkDir::new(p);
wd = wd.follow_links(follow_links || path_is_file(p));
wd = wd.follow_links(follow_links || p.is_file());
wd = wd.same_file_system(self.same_file_system);
if let Some(max_depth) = max_depth {
wd = wd.max_depth(max_depth);
}
if let Some(ref cmp) = cmp {
let cmp = cmp.clone();
wd = wd.sort_by(move |a, b| {
cmp(a.file_name(), b.file_name())
});
if let Some(ref sorter) = sorter {
match sorter.clone() {
Sorter::ByName(cmp) => {
wd = wd.sort_by(move |a, b| {
cmp(a.file_name(), b.file_name())
});
}
Sorter::ByPath(cmp) => {
wd = wd.sort_by(move |a, b| {
cmp(a.path(), b.path())
});
}
}
}
(p.to_path_buf(), Some(WalkEventIter::from(wd)))
}
@@ -528,6 +559,7 @@ impl WalkBuilder {
ig_root: ig_root.clone(),
ig: ig_root.clone(),
max_filesize: self.max_filesize,
skip: self.skip.clone(),
}
}
@@ -543,7 +575,9 @@ impl WalkBuilder {
max_depth: self.max_depth,
max_filesize: self.max_filesize,
follow_links: self.follow_links,
same_file_system: self.same_file_system,
threads: self.threads,
skip: self.skip.clone(),
}
}
@@ -730,6 +764,30 @@ impl WalkBuilder {
self
}
/// Set a function for sorting directory entries by their path.
///
/// If a compare function is set, the resulting iterator will return all
/// paths in sorted order. The compare function will be called to compare
/// entries from the same directory.
///
/// This is like `sort_by_file_name`, except the comparator accepts
/// a `&Path` instead of the base file name, which permits it to sort by
/// more criteria.
///
/// This method will override any previous sorter set by this method or
/// by `sort_by_file_name`.
///
/// Note that this is not used in the parallel iterator.
pub fn sort_by_file_path<F>(
&mut self,
cmp: F,
) -> &mut WalkBuilder
where F: Fn(&Path, &Path) -> cmp::Ordering + Send + Sync + 'static
{
self.sorter = Some(Sorter::ByPath(Arc::new(cmp)));
self
}
/// Set a function for sorting directory entries by file name.
///
/// If a compare function is set, the resulting iterator will return all
@@ -737,11 +795,47 @@ impl WalkBuilder {
/// names from entries from the same directory using only the name of the
/// entry.
///
/// This method will override any previous sorter set by this method or
/// by `sort_by_file_path`.
///
/// Note that this is not used in the parallel iterator.
pub fn sort_by_file_name<F>(&mut self, cmp: F) -> &mut WalkBuilder
where F: Fn(&OsStr, &OsStr) -> cmp::Ordering + Send + Sync + 'static
{
self.sorter = Some(Arc::new(cmp));
self.sorter = Some(Sorter::ByName(Arc::new(cmp)));
self
}
/// Do not cross file system boundaries.
///
/// When this option is enabled, directory traversal will not descend into
/// directories that are on a different file system from the root path.
///
/// Currently, this option is only supported on Unix and Windows. If this
/// option is used on an unsupported platform, then directory traversal
/// will immediately return an error and will not yield any entries.
pub fn same_file_system(&mut self, yes: bool) -> &mut WalkBuilder {
self.same_file_system = yes;
self
}
/// Do not yield directory entries that are believed to correspond to
/// stdout.
///
/// This is useful when a command is invoked via shell redirection to a
/// file that is also being read. For example, `grep -r foo ./ > results`
/// might end up trying to search `results` even though it is also writing
/// to it, which could cause an unbounded feedback loop. Setting this
/// option prevents this from happening by skipping over the `results`
/// file.
///
/// This is disabled by default.
pub fn skip_stdout(&mut self, yes: bool) -> &mut WalkBuilder {
if yes {
self.skip = stdout_handle().map(Arc::new);
} else {
self.skip = None;
}
self
}
}
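Taken together, a rough sketch of how the new builder options compose (root and settings are arbitrary):

    use ignore::WalkBuilder;

    fn main() {
        let mut builder = WalkBuilder::new("./");
        builder
            .same_file_system(true) // don't cross mount points
            .skip_stdout(true);     // skip the file stdout is redirected to
        for result in builder.build() {
            if let Ok(entry) = result {
                println!("{}", entry.path().display());
            }
        }
    }

With `skip_stdout(true)`, an invocation like `rg foo ./ > results` will not attempt to search `results` itself.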
@@ -758,6 +852,7 @@ pub struct Walk {
ig_root: Ignore,
ig: Ignore,
max_filesize: Option<u64>,
skip: Option<Arc<Handle>>,
}
impl Walk {
@@ -770,12 +865,17 @@ impl Walk {
WalkBuilder::new(path).build()
}
fn skip_entry(&self, ent: &walkdir::DirEntry) -> bool {
fn skip_entry(&self, ent: &DirEntry) -> Result<bool, Error> {
if ent.depth() == 0 {
return false;
return Ok(false);
}
let is_dir = walkdir_entry_is_dir(ent);
if let Some(ref stdout) = self.skip {
if path_equals(ent, stdout)? {
return Ok(true);
}
}
let is_dir = ent.file_type().map_or(false, |ft| ft.is_dir());
let max_size = self.max_filesize;
let should_skip_path = skip_path(&self.ig, ent.path(), is_dir);
let should_skip_filesize = if !is_dir && max_size.is_some() {
@@ -784,7 +884,7 @@ impl Walk {
false
};
should_skip_path || should_skip_filesize
Ok(should_skip_path || should_skip_filesize)
}
}
@@ -804,7 +904,7 @@ impl Iterator for Walk {
}
Some((path, Some(it))) => {
self.it = Some(it);
if path_is_dir(&path) {
if path.is_dir() {
let (ig, err) = self.ig_root.add_parents(path);
self.ig = ig;
if let Some(err) = err {
@@ -826,7 +926,12 @@ impl Iterator for Walk {
self.ig = self.ig.parent().unwrap();
}
Ok(WalkEvent::Dir(ent)) => {
if self.skip_entry(&ent) {
let mut ent = DirEntry::new_walkdir(ent, None);
let should_skip = match self.skip_entry(&ent) {
Err(err) => return Some(Err(err)),
Ok(should_skip) => should_skip,
};
if should_skip {
self.it.as_mut().unwrap().it.skip_current_dir();
// Still need to push this on the stack because
// we'll get a WalkEvent::Exit event for this dir.
@@ -837,13 +942,19 @@ impl Iterator for Walk {
}
let (igtmp, err) = self.ig.add_child(ent.path());
self.ig = igtmp;
return Some(Ok(DirEntry::new_walkdir(ent, err)));
ent.err = err;
return Some(Ok(ent));
}
Ok(WalkEvent::File(ent)) => {
if self.skip_entry(&ent) {
let ent = DirEntry::new_walkdir(ent, None);
let should_skip = match self.skip_entry(&ent) {
Err(err) => return Some(Err(err)),
Ok(should_skip) => should_skip,
};
if should_skip {
continue;
}
return Some(Ok(DirEntry::new_walkdir(ent, None)));
return Some(Ok(ent));
}
}
}
@@ -894,7 +1005,7 @@ impl Iterator for WalkEventIter {
None => None,
Some(Err(err)) => Some(Err(err)),
Some(Ok(dent)) => {
if walkdir_entry_is_dir(&dent) {
if dent.file_type().is_dir() {
self.depth += 1;
Some(Ok(WalkEvent::Dir(dent)))
} else {
@@ -943,7 +1054,9 @@ pub struct WalkParallel {
max_filesize: Option<u64>,
max_depth: Option<usize>,
follow_links: bool,
same_file_system: bool,
threads: usize,
skip: Option<Arc<Handle>>,
}
impl WalkParallel {
@@ -956,18 +1069,43 @@ impl WalkParallel {
) where F: FnMut() -> Box<FnMut(Result<DirEntry, Error>) -> WalkState + Send + 'static> {
let mut f = mkf();
let threads = self.threads();
let queue = Arc::new(MsQueue::new());
// TODO: Figure out how to use a bounded channel here. With an
// unbounded channel, the workers can run away and fill up memory
// with all of the file paths. But a bounded channel doesn't work since
our producers are also consumers, so they end up getting stuck.
//
// We probably need to rethink parallel traversal completely to fix
// this. The best case scenario would be finding a way to use rayon
// to do this.
let (tx, rx) = channel::unbounded();
let mut any_work = false;
// Send the initial set of root paths to the pool of workers.
// Note that we only send directories. For files, we send them to the
// callback directly.
for path in self.paths {
let dent =
let (dent, root_device) =
if path == Path::new("-") {
DirEntry::new_stdin()
(DirEntry::new_stdin(), None)
} else {
match DirEntryRaw::from_link(0, path) {
Ok(dent) => DirEntry::new_raw(dent, None),
let root_device =
if !self.same_file_system {
None
} else {
match device_num(&path) {
Ok(root_device) => Some(root_device),
Err(err) => {
let err = Error::Io(err).with_path(path);
if f(Err(err)).is_quit() {
return;
}
continue;
}
}
};
match DirEntryRaw::from_path(0, path, false) {
Ok(dent) => {
(DirEntry::new_raw(dent, None), root_device)
}
Err(err) => {
if f(Err(err)).is_quit() {
return;
@@ -976,10 +1114,11 @@ impl WalkParallel {
}
}
};
queue.push(Message::Work(Work {
tx.send(Message::Work(Work {
dent: dent,
ignore: self.ig_root.clone(),
}));
root_device: root_device,
})).unwrap();
any_work = true;
}
// ... but there's no need to start workers if we don't need them.
@@ -994,7 +1133,8 @@ impl WalkParallel {
for _ in 0..threads {
let worker = Worker {
f: mkf(),
queue: queue.clone(),
tx: tx.clone(),
rx: rx.clone(),
quit_now: quit_now.clone(),
is_waiting: false,
is_quitting: false,
@@ -1004,9 +1144,12 @@ impl WalkParallel {
max_depth: self.max_depth,
max_filesize: self.max_filesize,
follow_links: self.follow_links,
skip: self.skip.clone(),
};
handles.push(thread::spawn(|| worker.run()));
}
drop(tx);
drop(rx);
for handle in handles {
handle.join().unwrap();
}
@@ -1040,6 +1183,9 @@ struct Work {
dent: DirEntry,
/// Any ignore matchers that have been built for this directory's parents.
ignore: Ignore,
/// The root device number. When present, only files with the same device
/// number should be considered.
root_device: Option<u64>,
}
impl Work {
@@ -1099,8 +1245,10 @@ impl Work {
struct Worker {
/// The caller's callback.
f: Box<FnMut(Result<DirEntry, Error>) -> WalkState + Send + 'static>,
/// A queue of work items. This is multi-producer and multi-consumer.
queue: Arc<MsQueue<Message>>,
/// The push side of our mpmc queue.
tx: channel::Sender<Message>,
/// The receive side of our mpmc queue.
rx: channel::Receiver<Message>,
/// Whether all workers should quit at the next opportunity. Note that
/// this is distinct from quitting because of exhausting the contents of
/// a directory. Instead, this is used when the caller's callback indicates
@@ -1125,6 +1273,9 @@ struct Worker {
/// Whether to follow symbolic links or not. When this is enabled, loop
/// detection is performed.
follow_links: bool,
/// A file handle to skip. Currently this is either `None` or a handle to
/// stdout, when stdout is a file and skipping entries identical to stdout
/// has been requested.
skip: Option<Arc<Handle>>,
}
impl Worker {
@@ -1159,6 +1310,23 @@ impl Worker {
continue;
}
};
let descend =
if let Some(root_device) = work.root_device {
match is_same_file_system(root_device, work.dent.path()) {
Ok(true) => true,
Ok(false) => false,
Err(err) => {
if (self.f)(Err(err)).is_quit() {
self.quit_now();
return;
}
false
}
}
} else {
true
};
let depth = work.dent.depth();
match (self.f)(Ok(work.dent)) {
WalkState::Continue => {}
@@ -1168,11 +1336,20 @@ impl Worker {
return;
}
}
if !descend {
continue;
}
if self.max_depth.map_or(false, |max| depth >= max) {
continue;
}
for result in readdir {
if self.run_one(&work.ignore, depth + 1, result).is_quit() {
let state = self.run_one(
&work.ignore,
depth + 1,
work.root_device,
result,
);
if state.is_quit() {
self.quit_now();
return;
}
@@ -1196,6 +1373,7 @@ impl Worker {
&mut self,
ig: &Ignore,
depth: usize,
root_device: Option<u64>,
result: Result<fs::DirEntry, io::Error>,
) -> WalkState {
let fs_dent = match result {
@@ -1213,7 +1391,7 @@ impl Worker {
let is_symlink = dent.file_type().map_or(false, |ft| ft.is_symlink());
if self.follow_links && is_symlink {
let path = dent.path().to_path_buf();
dent = match DirEntryRaw::from_link(depth, path) {
dent = match DirEntryRaw::from_path(depth, path, true) {
Ok(dent) => DirEntry::new_raw(dent, None),
Err(err) => {
return (self.f)(Err(err));
@@ -1225,20 +1403,35 @@ impl Worker {
}
}
}
if let Some(ref stdout) = self.skip {
let is_stdout = match path_equals(&dent, stdout) {
Ok(is_stdout) => is_stdout,
Err(err) => return (self.f)(Err(err)),
};
if is_stdout {
return WalkState::Continue;
}
}
let is_dir = dent.is_dir();
let max_size = self.max_filesize;
let should_skip_path = skip_path(ig, dent.path(), is_dir);
let should_skip_filesize = if !is_dir && max_size.is_some() {
skip_filesize(max_size.unwrap(), dent.path(), &dent.metadata().ok())
} else {
false
};
let should_skip_filesize =
if !is_dir && max_size.is_some() {
skip_filesize(
max_size.unwrap(),
dent.path(),
&dent.metadata().ok(),
)
} else {
false
};
if !should_skip_path && !should_skip_filesize {
self.queue.push(Message::Work(Work {
self.tx.send(Message::Work(Work {
dent: dent,
ignore: ig.clone(),
}));
root_device: root_device,
})).unwrap();
}
WalkState::Continue
}
@@ -1252,13 +1445,13 @@ impl Worker {
if self.is_quit_now() {
return None;
}
match self.queue.try_pop() {
Some(Message::Work(work)) => {
match self.rx.try_recv() {
Ok(Message::Work(work)) => {
self.waiting(false);
self.quitting(false);
return Some(work);
}
Some(Message::Quit) => {
Ok(Message::Quit) => {
// We can't just quit because a Message::Quit could be
// spurious. For example, it's possible to observe that
// all workers are waiting even if there's more work to
@@ -1289,12 +1482,12 @@ impl Worker {
// Otherwise, spin.
}
}
None => {
Err(_) => {
self.waiting(true);
self.quitting(false);
if self.num_waiting() == self.threads {
for _ in 0..self.threads {
self.queue.push(Message::Quit);
self.tx.send(Message::Quit).unwrap();
}
} else {
// You're right to consider this suspicious, but it's
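The waiting/quit protocol above is subtle. A simplified standalone model of the idea, assuming crossbeam-channel (this is not the crate's actual code, which handles more edge cases):

    use std::sync::Arc;
    use std::sync::atomic::{AtomicUsize, Ordering};

    use crossbeam_channel::{Receiver, Sender};

    enum Message {
        Work(u64),
        Quit,
    }

    struct Worker {
        tx: Sender<Message>,
        rx: Receiver<Message>,
        waiting: Arc<AtomicUsize>,
        threads: usize,
        is_waiting: bool,
    }

    impl Worker {
        fn run(mut self) {
            loop {
                match self.rx.try_recv() {
                    Ok(Message::Work(_unit)) => {
                        self.set_waiting(false);
                        // ... process the unit, possibly sending new
                        // Message::Work values down self.tx ...
                    }
                    Ok(Message::Quit) => {
                        // A Quit may be spurious: obey it only if every
                        // worker really is idle right now.
                        if self.waiting.load(Ordering::SeqCst) == self.threads {
                            return;
                        }
                    }
                    Err(_) => {
                        self.set_waiting(true);
                        // If all workers are idle, no more work can ever
                        // be produced, so tell everyone (ourselves
                        // included) to quit.
                        if self.waiting.load(Ordering::SeqCst) == self.threads {
                            for _ in 0..self.threads {
                                let _ = self.tx.send(Message::Quit);
                            }
                        }
                    }
                }
            }
        }

        fn set_waiting(&mut self, yes: bool) {
            if yes != self.is_waiting {
                self.is_waiting = yes;
                if yes {
                    self.waiting.fetch_add(1, Ordering::SeqCst);
                } else {
                    self.waiting.fetch_sub(1, Ordering::SeqCst);
                }
            }
        }
    }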
@@ -1408,7 +1601,11 @@ fn skip_filesize(
}
}
fn skip_path(ig: &Ignore, path: &Path, is_dir: bool) -> bool {
fn skip_path(
ig: &Ignore,
path: &Path,
is_dir: bool,
) -> bool {
let m = ig.matched(path, is_dir);
if m.is_ignore() {
debug!("ignoring {}: {:?}", path.display(), m);
@@ -1421,60 +1618,81 @@ fn skip_path(ig: &Ignore, path: &Path, is_dir: bool) -> bool {
}
}
/// Returns true if and only if this path points to a directory.
/// Returns a handle to stdout for filtering search.
///
/// This works around a bug in Rust's standard library:
/// https://github.com/rust-lang/rust/issues/46484
#[cfg(windows)]
fn path_is_dir(path: &Path) -> bool {
fs::metadata(path).map(|md| metadata_is_dir(&md)).unwrap_or(false)
}
/// Returns true if and only if this entry points to a directory.
#[cfg(not(windows))]
fn path_is_dir(path: &Path) -> bool {
path.is_dir()
}
/// Returns true if and only if this path points to a file.
/// A handle is returned if and only if stdout is being redirected to a file.
/// The handle returned corresponds to that file.
///
/// This works around a bug in Rust's standard library:
/// https://github.com/rust-lang/rust/issues/46484
#[cfg(windows)]
fn path_is_file(path: &Path) -> bool {
!path_is_dir(path)
/// This can be used to ensure that we do not attempt to search a file that we
/// may also be writing to.
fn stdout_handle() -> Option<Handle> {
let h = match Handle::stdout() {
Err(_) => return None,
Ok(h) => h,
};
let md = match h.as_file().metadata() {
Err(_) => return None,
Ok(md) => md,
};
if !md.is_file() {
return None;
}
Some(h)
}
/// Returns true if and only if this entry points to a directory.
#[cfg(not(windows))]
fn path_is_file(path: &Path) -> bool {
path.is_file()
/// Returns true if and only if the given directory entry is believed to be
/// equivalent to the given handle. If there was a problem querying the path
/// for information to determine equality, then that error is returned.
fn path_equals(dent: &DirEntry, handle: &Handle) -> Result<bool, Error> {
#[cfg(unix)]
fn never_equal(dent: &DirEntry, handle: &Handle) -> bool {
dent.ino() != Some(handle.ino())
}
#[cfg(not(unix))]
fn never_equal(_: &DirEntry, _: &Handle) -> bool {
false
}
// If we know for sure that these two things aren't equal, then avoid
// the costly extra stat call to determine equality.
if dent.is_stdin() || never_equal(dent, handle) {
return Ok(false);
}
Handle::from_path(dent.path())
.map(|h| &h == handle)
.map_err(|err| Error::Io(err).with_path(dent.path()))
}
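A minimal sketch of the underlying check using only the same_file crate (`is_stdout` is an illustrative name, not part of this API):

    use std::io;
    use std::path::Path;

    use same_file::Handle;

    /// Does `path` refer to the same file that stdout is redirected to?
    fn is_stdout(path: &Path) -> io::Result<bool> {
        Ok(Handle::stdout()? == Handle::from_path(path)?)
    }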
/// Returns true if and only if the given walkdir entry points to a directory.
///
/// This works around a bug in Rust's standard library:
/// https://github.com/rust-lang/rust/issues/46484
#[cfg(windows)]
fn walkdir_entry_is_dir(dent: &walkdir::DirEntry) -> bool {
dent.metadata().map(|md| metadata_is_dir(&md)).unwrap_or(false)
/// Returns true if and only if the given path is on the same device as the
/// given root device.
fn is_same_file_system(root_device: u64, path: &Path) -> Result<bool, Error> {
let dent_device = device_num(path)
.map_err(|err| Error::Io(err).with_path(path))?;
Ok(root_device == dent_device)
}
/// Returns true if and only if the given walkdir entry points to a directory.
#[cfg(not(windows))]
fn walkdir_entry_is_dir(dent: &walkdir::DirEntry) -> bool {
dent.file_type().is_dir()
#[cfg(unix)]
fn device_num<P: AsRef<Path>>(path: P) -> io::Result<u64> {
use std::os::unix::fs::MetadataExt;
path.as_ref().metadata().map(|md| md.dev())
}
/// Returns true if and only if the given metadata points to a directory.
///
/// This works around a bug in Rust's standard library:
/// https://github.com/rust-lang/rust/issues/46484
#[cfg(windows)]
fn metadata_is_dir(md: &fs::Metadata) -> bool {
use std::os::windows::fs::MetadataExt;
use winapi::um::winnt::FILE_ATTRIBUTE_DIRECTORY;
md.file_attributes() & FILE_ATTRIBUTE_DIRECTORY != 0
#[cfg(windows)]
fn device_num<P: AsRef<Path>>(path: P) -> io::Result<u64> {
use winapi_util::{Handle, file};
let h = Handle::from_path_any(path)?;
file::information(h).map(|info| info.volume_serial_number())
}
#[cfg(not(any(unix, windows)))]
fn device_num<P: AsRef<Path>>(_: P) -> io::Result<u64> {
Err(io::Error::new(
io::ErrorKind::Other,
"walkdir: same_file_system option not supported on this platform",
))
}
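A Unix-only sketch of the same device-number comparison outside this crate (`same_device` is a hypothetical helper):

    use std::io;
    use std::path::Path;

    #[cfg(unix)]
    fn same_device(a: &Path, b: &Path) -> io::Result<bool> {
        use std::os::unix::fs::MetadataExt;
        Ok(a.metadata()?.dev() == b.metadata()?.dev())
    }

    #[cfg(unix)]
    fn main() -> io::Result<()> {
        // On most Linux systems, /sys lives on a separate virtual
        // file system, so this typically prints "false".
        println!("{}", same_device(Path::new("/"), Path::new("/sys"))?);
        Ok(())
    }

    #[cfg(not(unix))]
    fn main() {}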
#[cfg(test)]
@@ -1484,9 +1702,9 @@ mod tests {
use std::path::Path;
use std::sync::{Arc, Mutex};
use tempdir::TempDir;
use tempfile::{self, TempDir};
use super::{WalkBuilder, WalkState};
use super::{DirEntry, WalkBuilder, WalkState};
fn wfile<P: AsRef<Path>>(path: P, contents: &str) {
let mut file = File::create(path).unwrap();
@@ -1537,28 +1755,32 @@ mod tests {
prefix: &Path,
builder: &WalkBuilder,
) -> Vec<String> {
let paths = Arc::new(Mutex::new(vec![]));
let prefix = Arc::new(prefix.to_path_buf());
let mut paths = vec![];
for dent in walk_collect_entries_parallel(builder) {
let path = dent.path().strip_prefix(prefix).unwrap();
if path.as_os_str().is_empty() {
continue;
}
paths.push(normal_path(path.to_str().unwrap()));
}
paths.sort();
paths
}
fn walk_collect_entries_parallel(builder: &WalkBuilder) -> Vec<DirEntry> {
let dents = Arc::new(Mutex::new(vec![]));
builder.build_parallel().run(|| {
let paths = paths.clone();
let prefix = prefix.clone();
let dents = dents.clone();
Box::new(move |result| {
let dent = match result {
Err(_) => return WalkState::Continue,
Ok(dent) => dent,
};
let path = dent.path().strip_prefix(&**prefix).unwrap();
if path.as_os_str().is_empty() {
return WalkState::Continue;
if let Ok(dent) = result {
dents.lock().unwrap().push(dent);
}
let mut paths = paths.lock().unwrap();
paths.push(normal_path(path.to_str().unwrap()));
WalkState::Continue
})
});
let mut paths = paths.lock().unwrap();
paths.sort();
paths.to_vec()
let dents = dents.lock().unwrap();
dents.to_vec()
}
fn mkpaths(paths: &[&str]) -> Vec<String> {
@@ -1567,20 +1789,24 @@ mod tests {
paths
}
fn tmpdir(prefix: &str) -> TempDir {
tempfile::Builder::new().prefix(prefix).tempdir().unwrap()
}
fn assert_paths(
prefix: &Path,
builder: &WalkBuilder,
expected: &[&str],
) {
let got = walk_collect(prefix, builder);
assert_eq!(got, mkpaths(expected));
assert_eq!(got, mkpaths(expected), "single threaded");
let got = walk_collect_parallel(prefix, builder);
assert_eq!(got, mkpaths(expected));
assert_eq!(got, mkpaths(expected), "parallel");
}
#[test]
fn no_ignores() {
let td = TempDir::new("walk-test-").unwrap();
let td = tmpdir("walk-test-");
mkdirp(td.path().join("a/b/c"));
mkdirp(td.path().join("x/y"));
wfile(td.path().join("a/b/foo"), "");
@@ -1593,7 +1819,7 @@ mod tests {
#[test]
fn custom_ignore() {
let td = TempDir::new("walk-test-").unwrap();
let td = tmpdir("walk-test-");
let custom_ignore = ".customignore";
mkdirp(td.path().join("a"));
wfile(td.path().join(custom_ignore), "foo");
@@ -1609,7 +1835,7 @@ mod tests {
#[test]
fn custom_ignore_exclusive_use() {
let td = TempDir::new("walk-test-").unwrap();
let td = tmpdir("walk-test-");
let custom_ignore = ".customignore";
mkdirp(td.path().join("a"));
wfile(td.path().join(custom_ignore), "foo");
@@ -1629,7 +1855,7 @@ mod tests {
#[test]
fn gitignore() {
let td = TempDir::new("walk-test-").unwrap();
let td = tmpdir("walk-test-");
mkdirp(td.path().join(".git"));
mkdirp(td.path().join("a"));
wfile(td.path().join(".gitignore"), "foo");
@@ -1645,7 +1871,7 @@ mod tests {
#[test]
fn explicit_ignore() {
let td = TempDir::new("walk-test-").unwrap();
let td = tmpdir("walk-test-");
let igpath = td.path().join(".not-an-ignore");
mkdirp(td.path().join("a"));
wfile(&igpath, "foo");
@@ -1661,7 +1887,7 @@ mod tests {
#[test]
fn explicit_ignore_exclusive_use() {
let td = TempDir::new("walk-test-").unwrap();
let td = tmpdir("walk-test-");
let igpath = td.path().join(".not-an-ignore");
mkdirp(td.path().join("a"));
wfile(&igpath, "foo");
@@ -1679,7 +1905,7 @@ mod tests {
#[test]
fn gitignore_parent() {
let td = TempDir::new("walk-test-").unwrap();
let td = tmpdir("walk-test-");
mkdirp(td.path().join(".git"));
mkdirp(td.path().join("a"));
wfile(td.path().join(".gitignore"), "foo");
@@ -1692,7 +1918,7 @@ mod tests {
#[test]
fn max_depth() {
let td = TempDir::new("walk-test-").unwrap();
let td = tmpdir("walk-test-");
mkdirp(td.path().join("a/b/c"));
wfile(td.path().join("foo"), "");
wfile(td.path().join("a/foo"), "");
@@ -1712,7 +1938,7 @@ mod tests {
#[test]
fn max_filesize() {
let td = TempDir::new("walk-test-").unwrap();
let td = tmpdir("walk-test-");
mkdirp(td.path().join("a/b"));
wfile_size(td.path().join("foo"), 0);
wfile_size(td.path().join("bar"), 400);
@@ -1739,7 +1965,7 @@ mod tests {
#[cfg(unix)] // because symlinks on windows are weird
#[test]
fn symlinks() {
let td = TempDir::new("walk-test-").unwrap();
let td = tmpdir("walk-test-");
mkdirp(td.path().join("a/b"));
symlink(td.path().join("a/b"), td.path().join("z"));
wfile(td.path().join("a/b/foo"), "");
@@ -1753,10 +1979,31 @@ mod tests {
]);
}
#[cfg(unix)] // because symlinks on windows are weird
#[test]
fn first_path_not_symlink() {
let td = tmpdir("walk-test-");
mkdirp(td.path().join("foo"));
let dents = WalkBuilder::new(td.path().join("foo"))
.build()
.into_iter()
.collect::<Result<Vec<_>, _>>()
.unwrap();
assert_eq!(1, dents.len());
assert!(!dents[0].path_is_symlink());
let dents = walk_collect_entries_parallel(
&WalkBuilder::new(td.path().join("foo")),
);
assert_eq!(1, dents.len());
assert!(!dents[0].path_is_symlink());
}
#[cfg(unix)] // because symlinks on windows are weird
#[test]
fn symlink_loop() {
let td = TempDir::new("walk-test-").unwrap();
let td = tmpdir("walk-test-");
mkdirp(td.path().join("a/b"));
symlink(td.path().join("a"), td.path().join("a/b/c"));
@@ -1768,4 +2015,40 @@ mod tests {
"a", "a/b",
]);
}
// It's a little tricky to test the 'same_file_system' option since
// we need an environment with more than one file system. We adopt a
// heuristic where /sys is typically a distinct volume on Linux and roll
// with that.
#[test]
#[cfg(target_os = "linux")]
fn same_file_system() {
use super::device_num;
// If for some reason /sys doesn't exist or isn't a directory, just
// skip this test.
if !Path::new("/sys").is_dir() {
return;
}
// If our test directory actually isn't a different volume from /sys,
// then this test is meaningless and we shouldn't run it.
let td = tmpdir("walk-test-");
if device_num(td.path()).unwrap() == device_num("/sys").unwrap() {
return;
}
mkdirp(td.path().join("same_file"));
symlink("/sys", td.path().join("same_file").join("alink"));
// Create a symlink to sys and enable following symlinks. If the
// same_file_system option doesn't work, then this probably will hit a
// permission error. Otherwise, it should just skip over the symlink
// completely.
let mut builder = WalkBuilder::new(td.path());
builder.follow_links(true).same_file_system(true);
assert_paths(td.path(), &builder, &[
"same_file", "same_file/alink",
]);
}
}


@@ -1,14 +1,14 @@
class RipgrepBin < Formula
version '0.9.0'
version '0.10.0'
desc "Recursively search directories for a regex pattern."
homepage "https://github.com/BurntSushi/ripgrep"
if OS.mac?
url "https://github.com/BurntSushi/ripgrep/releases/download/#{version}/ripgrep-#{version}-x86_64-apple-darwin.tar.gz"
sha256 "36003ea8b62ad6274dc14140039f448cdf5026827d53cf24dad2d84005557a8c"
sha256 "32754b4173ac87a7bfffd436d601a49362676eb1841ab33440f2f49c002c8967"
elsif OS.linux?
url "https://github.com/BurntSushi/ripgrep/releases/download/#{version}/ripgrep-#{version}-x86_64-unknown-linux-musl.tar.gz"
sha256 "2eb4443e58f95051ff76ea036ed1faf940d5a04af4e7ff5a7dbd74576b907e99"
sha256 "c76080aa807a339b44139885d77d15ad60ab8cdd2c2fdaf345d0985625bc0f97"
end
conflicts_with "ripgrep"

scripts/copy-examples Executable file

@@ -0,0 +1,33 @@
#!/usr/bin/env python
from __future__ import absolute_import, division, print_function
import argparse
import codecs
import os.path
import re
RE_EACH_CODE_BLOCK = re.compile(
r'(?s)(?:```|\{\{< high rust[^>]+>\}\})[^\n]*\n(.*?)(?:```|\{\{< /high >\}\})' # noqa
)
RE_MARKER = re.compile(r'^(?:# )?//([^/].*)$')
RE_STRIP_COMMENT = re.compile(r'^# ?')
if __name__ == '__main__':
p = argparse.ArgumentParser()
p.add_argument('--rust-file', default='src/cookbook.rs')
p.add_argument('--example-dir', default='grep/examples')
args = p.parse_args()
with codecs.open(args.rust_file, encoding='utf-8') as f:
rustcode = f.read()
for m in RE_EACH_CODE_BLOCK.finditer(rustcode):
lines = m.group(1).splitlines()
marker, codelines = lines[0], lines[1:]
m = RE_MARKER.search(marker)
if m is None:
continue
code = '\n'.join(RE_STRIP_COMMENT.sub('', line) for line in codelines)
fpath = os.path.join(args.example_dir, m.group(1))
with codecs.open(fpath, mode='w+', encoding='utf-8') as f:
print(code, file=f)


@@ -9,16 +9,18 @@
// is where we read clap's configuration from the end user's arguments and turn
// it into a ripgrep-specific configuration type that is not coupled with clap.
use clap::{self, App, AppSettings};
use clap::{self, App, AppSettings, crate_authors, crate_version};
use lazy_static::lazy_static;
const ABOUT: &str = "
ripgrep (rg) recursively searches your current directory for a regex pattern.
By default, ripgrep will respect your `.gitignore` and automatically skip
hidden files/directories and binary files.
By default, ripgrep will respect your .gitignore and automatically skip hidden
files/directories and binary files.
ripgrep's regex engine uses finite automata and guarantees linear time
searching. Because of this, features like backreferences and arbitrary
lookaround are not supported.
ripgrep's default regex engine uses finite automata and guarantees linear
time searching. Because of this, features like backreferences and arbitrary
look-around are not supported. However, if ripgrep is built with PCRE2, then
the --pcre2 flag can be used to enable backreferences and look-around.
ripgrep supports configuration files. Set RIPGREP_CONFIG_PATH to a
configuration file. The file can specify one shell argument per line. Lines
@@ -79,7 +81,7 @@ pub fn app() -> App<'static, 'static> {
/// Return the "long" format of ripgrep's version string.
///
/// If a revision hash is given, then it is used. If one isn't given, then
/// the RIPGREP_BUILD_GIT_HASH env var is inspect for it. If that isn't set,
/// the RIPGREP_BUILD_GIT_HASH env var is inspected for it. If that isn't set,
/// then a revision hash is not included in the version string returned.
pub fn long_version(revision_hash: Option<&str>) -> String {
// Do we have a git hash?
@@ -125,7 +127,7 @@ fn compile_cpu_features() -> Vec<&'static str> {
}
/// Returns the relevant CPU features enabled at runtime.
#[cfg(all(ripgrep_runtime_cpu, target_arch = "x86_64"))]
#[cfg(target_arch = "x86_64")]
fn runtime_cpu_features() -> Vec<&'static str> {
// This is kind of a dirty violation of abstraction, since it assumes
// knowledge about what specific SIMD features are being used.
@@ -145,7 +147,7 @@ fn runtime_cpu_features() -> Vec<&'static str> {
}
/// Returns the relevant CPU features enabled at runtime.
#[cfg(not(all(ripgrep_runtime_cpu, target_arch = "x86_64")))]
#[cfg(not(target_arch = "x86_64"))]
fn runtime_cpu_features() -> Vec<&'static str> {
vec![]
}
@@ -536,9 +538,14 @@ pub fn all_args_and_flags() -> Vec<RGArg> {
// The positional arguments must be defined first and in order.
arg_pattern(&mut args);
arg_path(&mut args);
// Flags can be defined in any order, but we do it alphabetically.
// Flags can be defined in any order, but we do it alphabetically. Note
// that each function may define multiple flags. For example,
// `flag_encoding` defines `--encoding` and `--no-encoding`. Most `--no`
// flags are hidden and merely mentioned in the docs of the corresponding
// "positive" flag.
flag_after_context(&mut args);
flag_before_context(&mut args);
flag_block_buffered(&mut args);
flag_byte_offset(&mut args);
flag_case_sensitive(&mut args);
flag_color(&mut args);
@@ -566,6 +573,7 @@ pub fn all_args_and_flags() -> Vec<RGArg> {
flag_ignore_file(&mut args);
flag_invert_match(&mut args);
flag_json(&mut args);
flag_line_buffered(&mut args);
flag_line_number(&mut args);
flag_line_regexp(&mut args);
flag_max_columns(&mut args);
@@ -582,14 +590,16 @@ pub fn all_args_and_flags() -> Vec<RGArg> {
flag_no_ignore_parent(&mut args);
flag_no_ignore_vcs(&mut args);
flag_no_messages(&mut args);
flag_no_pcre2_unicode(&mut args);
flag_null(&mut args);
flag_null_data(&mut args);
flag_one_file_system(&mut args);
flag_only_matching(&mut args);
flag_path_separator(&mut args);
flag_passthru(&mut args);
flag_pcre2(&mut args);
flag_pcre2_unicode(&mut args);
flag_pre(&mut args);
flag_pre_glob(&mut args);
flag_pretty(&mut args);
flag_quiet(&mut args);
flag_regex_size_limit(&mut args);
@@ -598,6 +608,8 @@ pub fn all_args_and_flags() -> Vec<RGArg> {
flag_search_zip(&mut args);
flag_smart_case(&mut args);
flag_sort_files(&mut args);
flag_sort(&mut args);
flag_sortr(&mut args);
flag_stats(&mut args);
flag_text(&mut args);
flag_threads(&mut args);
@@ -677,13 +689,49 @@ This overrides the --context flag.
args.push(arg);
}
fn flag_block_buffered(args: &mut Vec<RGArg>) {
const SHORT: &str = "Force block buffering.";
const LONG: &str = long!("\
When enabled, ripgrep will use block buffering. That is, whenever a matching
line is found, it will be written to an in-memory buffer and will not be
written to stdout until the buffer reaches a certain size. This is the default
when ripgrep's stdout is redirected to a pipeline or a file. When ripgrep's
stdout is connected to a terminal, line buffering will be used. Forcing block
buffering can be useful when dumping a large amount of contents to a terminal.
Forceful block buffering can be disabled with --no-block-buffered. Note that
using --no-block-buffered causes ripgrep to revert to its default behavior of
automatically detecting the buffering strategy. To force line buffering, use
the --line-buffered flag.
");
let arg = RGArg::switch("block-buffered")
.help(SHORT).long_help(LONG)
.overrides("no-block-buffered")
.overrides("line-buffered")
.overrides("no-line-buffered");
args.push(arg);
let arg = RGArg::switch("no-block-buffered")
.hidden()
.overrides("block-buffered")
.overrides("line-buffered")
.overrides("no-line-buffered");
args.push(arg);
}
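The trade-off these buffering flags control can be modeled with the standard library's writer wrappers (a sketch only; ripgrep's own implementation, via grep-cli, appears in the args.rs changes further down):

    use std::io::{self, BufWriter, LineWriter, Write};

    fn main() -> io::Result<()> {
        let stdout = io::stdout();
        // Block buffering: bytes accumulate in memory and are written
        // out only when the buffer fills or the writer is flushed or
        // dropped.
        let mut block = BufWriter::new(stdout.lock());
        writeln!(block, "written when the buffer fills")?;
        block.flush()?;

        // Line buffering: every newline triggers a write.
        let mut line = LineWriter::new(io::stderr());
        writeln!(line, "written as soon as this line ends")?;
        Ok(())
    }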
fn flag_byte_offset(args: &mut Vec<RGArg>) {
const SHORT: &str =
"Print the 0-based byte offset for each matching line.";
const LONG: &str = long!("\
Print the 0-based byte offset within the input file
before each line of output. If -o (--only-matching) is
specified, print the offset of the matching part itself.
Print the 0-based byte offset within the input file before each line of output.
If -o (--only-matching) is specified, print the offset of the matching part
itself.
If ripgrep does transcoding, then the byte offset is in terms of the result
of transcoding and not the original data. This applies similarly to another
transformation on the source, such as decompression or a --pre filter. Note
that when the PCRE2 regex engine is used, then UTF-8 transcoding is done by
default.
");
let arg = RGArg::switch("byte-offset").short("b")
.help(SHORT).long_help(LONG);
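A tiny sketch of the 0-based byte offset semantics (illustrative only, not ripgrep's exact output format):

    fn main() {
        let haystack = "foo\nbar\nfoobar\n";
        let mut offset = 0;
        for line in haystack.lines() {
            if line.contains("bar") {
                // Prints "4:bar" and "8:foobar": the offset of the
                // first byte of each matching line.
                println!("{}:{}", offset, line);
            }
            offset += line.len() + 1; // +1 for the '\n' terminator
        }
    }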
@@ -741,17 +789,17 @@ to one of eight choices: red, blue, green, cyan, magenta, yellow, white and
black. Styles are limited to nobold, bold, nointense, intense, nounderline
or underline.
The format of the flag is `{type}:{attribute}:{value}`. `{type}` should be
one of path, line, column or match. `{attribute}` can be fg, bg or style.
`{value}` is either a color (for fg and bg) or a text style. A special format,
`{type}:none`, will clear all color settings for `{type}`.
The format of the flag is '{type}:{attribute}:{value}'. '{type}' should be
one of path, line, column or match. '{attribute}' can be fg, bg or style.
'{value}' is either a color (for fg and bg) or a text style. A special format,
'{type}:none', will clear all color settings for '{type}'.
For example, the following command will change the match color to magenta and
the background color for line numbers to yellow:
rg --colors 'match:fg:magenta' --colors 'line:bg:yellow' foo.
Extended colors can be used for `{value}` when the terminal supports ANSI color
Extended colors can be used for '{value}' when the terminal supports ANSI color
sequences. These are specified as either 'x' (256-color) or 'x,x,x' (24-bit
truecolor) where x is a number between 0 and 255 inclusive. x may be given as
a normal decimal number or a hexadecimal number, which is prefixed by `0x`.
@@ -969,7 +1017,7 @@ fn flag_files(args: &mut Vec<RGArg>) {
const SHORT: &str = "Print each file that would be searched.";
const LONG: &str = long!("\
Print each file that would be searched without actually performing the search.
This is useful to determine whether a particular file is being search or not.
This is useful to determine whether a particular file is being searched or not.
");
let arg = RGArg::switch("files")
.help(SHORT).long_help(LONG)
@@ -1231,6 +1279,37 @@ The JSON Lines format can be disabled with --no-json.
args.push(arg);
}
fn flag_line_buffered(args: &mut Vec<RGArg>) {
const SHORT: &str = "Force line buffering.";
const LONG: &str = long!("\
When enabled, ripgrep will use line buffering. That is, whenever a matching
line is found, it will be flushed to stdout immediately. This is the default
when ripgrep's stdout is connected to a terminal, but otherwise, ripgrep will
use block buffering, which is typically faster. This flag forces ripgrep to
use line buffering even if it would otherwise use block buffering. This is
typically useful in shell pipelines, e.g.,
'tail -f something.log | rg foo --line-buffered | rg bar'.
Forceful line buffering can be disabled with --no-line-buffered. Note that
using --no-line-buffered causes ripgrep to revert to its default behavior of
automatically detecting the buffering strategy. To force block buffering, use
the --block-buffered flag.
");
let arg = RGArg::switch("line-buffered")
.help(SHORT).long_help(LONG)
.overrides("no-line-buffered")
.overrides("block-buffered")
.overrides("no-block-buffered");
args.push(arg);
let arg = RGArg::switch("no-line-buffered")
.hidden()
.overrides("line-buffered")
.overrides("block-buffered")
.overrides("no-block-buffered");
args.push(arg);
}
fn flag_line_number(args: &mut Vec<RGArg>) {
const SHORT: &str = "Show line numbers.";
const LONG: &str = long!("\
@@ -1568,6 +1647,48 @@ This flag can be disabled with the --messages flag.
args.push(arg);
}
fn flag_no_pcre2_unicode(args: &mut Vec<RGArg>) {
const SHORT: &str = "Disable Unicode mode for PCRE2 matching.";
const LONG: &str = long!("\
When PCRE2 matching is enabled, this flag will disable Unicode mode, which is
otherwise enabled by default. If PCRE2 matching is not enabled, then this flag
has no effect.
When PCRE2's Unicode mode is enabled, several different types of patterns
become Unicode aware. This includes '\\b', '\\B', '\\w', '\\W', '\\d', '\\D',
'\\s' and '\\S'. Similarly, the '.' meta character will match any Unicode
codepoint instead of any byte. Caseless matching will also use Unicode simple
case folding instead of ASCII-only case insensitivity.
Unicode mode in PCRE2 represents a critical trade off in the user experience
of ripgrep. In particular, unlike the default regex engine, PCRE2 does not
support the ability to search possibly invalid UTF-8 with Unicode features
enabled. Instead, PCRE2 *requires* that everything it searches when Unicode
mode is enabled is valid UTF-8. (Or valid UTF-16/UTF-32, but for the purposes
of ripgrep, we only discuss UTF-8.) This means that if you have PCRE2's Unicode
mode enabled and you attempt to search invalid UTF-8, then the search for that
file will halt and print an error. For this reason, when PCRE2's Unicode mode
is enabled, ripgrep will automatically \"fix\" invalid UTF-8 sequences by
replacing them with the Unicode replacement codepoint.
If you would rather see the encoding errors surfaced by PCRE2 when Unicode mode
is enabled, then pass the --no-encoding flag to disable all transcoding.
Related flags: --pcre2
This flag can be disabled with --pcre2-unicode.
");
let arg = RGArg::switch("no-pcre2-unicode")
.help(SHORT).long_help(LONG)
.overrides("pcre2-unicode");
args.push(arg);
let arg = RGArg::switch("pcre2-unicode")
.hidden()
.overrides("no-pcre2-unicode");
args.push(arg);
}
fn flag_null(args: &mut Vec<RGArg>) {
const SHORT: &str = "Print a NUL byte after file paths.";
const LONG: &str = long!("\
@@ -1604,6 +1725,33 @@ Using this flag implies -a/--text.
args.push(arg);
}
fn flag_one_file_system(args: &mut Vec<RGArg>) {
const SHORT: &str =
"Do not descend into directories on other file systems.";
const LONG: &str = long!("\
When enabled, ripgrep will not cross file system boundaries relative to where
the search started from.
Note that this applies to each path argument given to ripgrep. For example, in
the command 'rg --one-file-system /foo/bar /quux/baz', ripgrep will search both
'/foo/bar' and '/quux/baz' even if they are on different file systems, but will
not cross a file system boundary when traversing each path's directory tree.
This is similar to find's '-xdev' or '-mount' flag.
This flag can be disabled with --no-one-file-system.
");
let arg = RGArg::switch("one-file-system")
.help(SHORT).long_help(LONG)
.overrides("no-one-file-system");
args.push(arg);
let arg = RGArg::switch("no-one-file-system")
.hidden()
.overrides("one-file-system");
args.push(arg);
}
fn flag_only_matching(args: &mut Vec<RGArg>) {
const SHORT: &str = "Print only matches parts of a line.";
const LONG: &str = long!("\
@@ -1658,6 +1806,8 @@ Note that PCRE2 is an optional ripgrep feature. If PCRE2 wasn't included in
your build of ripgrep, then using this flag will result in ripgrep printing
an error message and exiting.
Related flags: --no-pcre2-unicode
This flag can be disabled with --no-pcre2.
");
let arg = RGArg::switch("pcre2").short("P")
@@ -1671,43 +1821,94 @@ This flag can be disabled with --no-pcre2.
args.push(arg);
}
fn flag_pcre2_unicode(args: &mut Vec<RGArg>) {
const SHORT: &str = "Enable Unicode mode for PCRE2 matching.";
fn flag_pre(args: &mut Vec<RGArg>) {
const SHORT: &str = "search outputs of COMMAND FILE for each FILE";
const LONG: &str = long!("\
When PCRE2 matching is enabled, this flag will enable Unicode mode. If PCRE2
matching is not enabled, then this flag has no effect.
For each input FILE, search the standard output of COMMAND FILE rather than the
contents of FILE. This option expects the COMMAND program to either be an
absolute path or to be available in your PATH. Either an empty string COMMAND
or the `--no-pre` flag will disable this behavior.
This flag is enabled by default when PCRE2 matching is enabled.
WARNING: When this flag is set, ripgrep will unconditionally spawn a
process for every file that is searched. Therefore, this can incur an
unnecessarily large performance penalty if you don't otherwise need the
flexibility offered by this flag.
When PCRE2's Unicode mode is enabled several different types of patterns become
Unicode aware. This includes '\\b', '\\B', '\\w', '\\W', '\\d', '\\D', '\\s'
and '\\S'. Similarly, the '.' meta character will match any Unicode codepoint
instead of any byte. Caseless matching will also use Unicode simple case
folding instead of ASCII-only case insensitivity.
A preprocessor is not run when ripgrep is searching stdin.
Unicode mode in PCRE2 represents a critical trade off in the user experience
of ripgrep. In particular, unlike the default regex engine, PCRE2 does not
support the ability to search possibly invalid UTF-8 with Unicode features
enabled. Instead, PCRE2 *requires* that everything it searches when Unicode
mode is enabled is valid UTF-8. (Or valid UTF-16/UTF-32, but for the purposes
of ripgrep, we only discuss UTF-8.) This means that if you have PCRE2's Unicode
mode enabled and you attempt to search invalid UTF-8, then the search for that
file will halt and print an error. For this reason, when PCRE2's Unicode mode
is enabled, ripgrep will automatically \"fix\" invalid UTF-8 sequences by
replacing them with the Unicode replacement codepoint.
When searching over sets of files that may require one of several decoders
as preprocessors, COMMAND should be a wrapper program or script which first
classifies FILE based on magic numbers/content or based on the FILE name and
then dispatches to an appropriate preprocessor. Each COMMAND also has its
standard input connected to FILE for convenience.
If you would rather see the encoding errors surfaced by PCRE2 when Unicode mode
is enabled, then pass the --no-encoding flag to disable all transcoding.
For example, a shell script for COMMAND might look like:
This flag can be disabled with --no-pcre2-unicode.
case \"$1\" in
*.pdf)
exec pdftotext \"$1\" -
;;
*)
case $(file \"$1\") in
*Zstandard*)
exec pzstd -cdq
;;
*)
exec cat
;;
esac
;;
esac
The above script uses `pdftotext` to convert a PDF file to plain text. For
all other files, the script uses the `file` utility to sniff the type of the
file based on its contents. If it is a compressed file in the Zstandard format,
then `pzstd` is used to decompress the contents to stdout.
This overrides the -z/--search-zip flag.
");
let arg = RGArg::switch("pcre2-unicode")
.help(SHORT).long_help(LONG);
let arg = RGArg::flag("pre", "COMMAND")
.help(SHORT).long_help(LONG)
.overrides("no-pre")
.overrides("search-zip");
args.push(arg);
let arg = RGArg::switch("no-pcre2-unicode")
let arg = RGArg::switch("no-pre")
.hidden()
.overrides("pcre2-unicode");
.overrides("pre");
args.push(arg);
}
fn flag_pre_glob(args: &mut Vec<RGArg>) {
const SHORT: &str =
"Include or exclude files from a preprocessing command.";
const LONG: &str = long!("\
This flag works in conjunction with the --pre flag. Namely, when one or more
--pre-glob flags are given, then only files that match the given set of globs
will be handed to the command specified by the --pre flag. Any non-matching
files will be searched without using the preprocessor command.
This flag is useful when searching many files with the --pre flag. Namely,
it makes it possible to avoid process overhead for files that don't need
preprocessing. For example, given the following shell script, 'pre-pdftotext':
#!/bin/sh
pdftotext \"$1\" -
then it is possible to use '--pre pre-pdftotext --pre-glob \'*.pdf\'' to make
it so ripgrep only executes the 'pre-pdftotext' command on files with a '.pdf'
extension.
Multiple --pre-glob flags may be used. Globbing rules match .gitignore globs.
Precede a glob with a ! to exclude it.
This flag has no effect if the --pre flag is not used.
");
let arg = RGArg::flag("pre-glob", "GLOB")
.help(SHORT).long_help(LONG)
.multiple()
.allow_leading_hyphen();
args.push(arg);
}
@@ -1816,64 +2017,6 @@ This flag can be disabled with --no-search-zip.
args.push(arg);
}
fn flag_pre(args: &mut Vec<RGArg>) {
const SHORT: &str = "search outputs of COMMAND FILE for each FILE";
const LONG: &str = long!("\
For each input FILE, search the standard output of COMMAND FILE rather than the
contents of FILE. This option expects the COMMAND program to either be an
absolute path or to be available in your PATH. Either an empty string COMMAND
or the `--no-pre` flag will disable this behavior.
WARNING: When this flag is set, ripgrep will unconditionally spawn a
process for every file that is searched. Therefore, this can incur an
unnecessarily large performance penalty if you don't otherwise need the
flexibility offered by this flag.
A preprocessor is not run when ripgrep is searching stdin.
When searching over sets of files that may require one of several decoders
as preprocessors, COMMAND should be a wrapper program or script which first
classifies FILE based on magic numbers/content or based on the FILE name and
then dispatches to an appropriate preprocessor. Each COMMAND also has its
standard input connected to FILE for convenience.
For example, a shell script for COMMAND might look like:
case \"$1\" in
*.pdf)
exec pdftotext \"$1\" -
;;
*)
case $(file \"$1\") in
*Zstandard*)
exec pzstd -cdq
;;
*)
exec cat
;;
esac
;;
esac
The above script uses `pdftotext` to convert a PDF file to plain text. For
all other files, the script uses the `file` utility to sniff the type of the
file based on its contents. If it is a compressed file in the Zstandard format,
then `pzstd` is used to decompress the contents to stdout.
This overrides the -z/--search-zip flag.
");
let arg = RGArg::flag("pre", "COMMAND")
.help(SHORT).long_help(LONG)
.overrides("no-pre")
.overrides("search-zip");
args.push(arg);
let arg = RGArg::switch("no-pre")
.hidden()
.overrides("pre");
args.push(arg);
}
fn flag_smart_case(args: &mut Vec<RGArg>) {
const SHORT: &str = "Smart case search.";
const LONG: &str = long!("\
@@ -1890,8 +2033,10 @@ This overrides the -s/--case-sensitive and -i/--ignore-case flags.
}
fn flag_sort_files(args: &mut Vec<RGArg>) {
const SHORT: &str = "Sort results by file path. Implies --threads=1.";
const SHORT: &str = "DEPRECATED";
const LONG: &str = long!("\
DEPRECATED: Use --sort or --sortr instead.
Sort results by file path. Note that this currently disables all parallelism
and runs search in a single thread.
@@ -1899,12 +2044,83 @@ This flag can be disabled with --no-sort-files.
");
let arg = RGArg::switch("sort-files")
.help(SHORT).long_help(LONG)
.overrides("no-sort-files");
.hidden()
.overrides("no-sort-files")
.overrides("sort")
.overrides("sortr");
args.push(arg);
let arg = RGArg::switch("no-sort-files")
.hidden()
.overrides("sort-files");
.overrides("sort-files")
.overrides("sort")
.overrides("sortr");
args.push(arg);
}
fn flag_sort(args: &mut Vec<RGArg>) {
const SHORT: &str =
"Sort results in ascending order. Implies --threads=1.";
const LONG: &str = long!("\
This flag enables sorting of results in ascending order. The possible values
for this flag are:
path Sort by file path.
modified Sort by the last modified time on a file.
accessed Sort by the last accessed time on a file.
created Sort by the creation time on a file.
none Do not sort results.
If the sorting criteria isn't available on your system (for example, creation
time is not available on ext4 file systems), then ripgrep will attempt to
detect this and print an error without searching any results. Otherwise, the
sort order is unspecified.
To sort results in reverse or descending order, use the --sortr flag. Also,
this flag overrides --sortr.
Note that sorting results currently always forces ripgrep to abandon
parallelism and run in a single thread.
");
let arg = RGArg::flag("sort", "SORTBY")
.help(SHORT).long_help(LONG)
.possible_values(&["path", "modified", "accessed", "created", "none"])
.overrides("sortr")
.overrides("sort-files")
.overrides("no-sort-files");
args.push(arg);
}
fn flag_sortr(args: &mut Vec<RGArg>) {
const SHORT: &str =
"Sort results in descending order. Implies --threads=1.";
const LONG: &str = long!("\
This flag enables sorting of results in descending order. The possible values
for this flag are:
path Sort by file path.
modified Sort by the last modified time on a file.
accessed Sort by the last accessed time on a file.
created Sort by the creation time on a file.
none Do not sort results.
If the sorting criteria isn't available on your system (for example, creation
time is not available on ext4 file systems), then ripgrep will attempt to
detect this and print an error without searching any results. Otherwise, the
sort order is unspecified.
To sort results in ascending order, use the --sort flag. Also, this flag
overrides --sort.
Note that sorting results currently always forces ripgrep to abandon
parallelism and run in a single thread.
");
let arg = RGArg::flag("sortr", "SORTBY")
.help(SHORT).long_help(LONG)
.possible_values(&["path", "modified", "accessed", "created", "none"])
.overrides("sort")
.overrides("sort-files")
.overrides("no-sort-files");
args.push(arg);
}


@@ -1,13 +1,14 @@
use std::cmp;
use std::env;
use std::ffi::OsStr;
use std::fs::File;
use std::io::{self, BufRead};
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::sync::Arc;
use std::time::SystemTime;
use atty;
use clap;
use grep::cli;
use grep::matcher::LineTerminator;
#[cfg(feature = "pcre2")]
use grep::pcre2::{
@@ -19,6 +20,7 @@ use grep::printer::{
JSON, JSONBuilder,
Standard, StandardBuilder,
Summary, SummaryBuilder, SummaryKind,
default_color_specs,
};
use grep::regex::{
RegexMatcher as RustRegexMatcher,
@@ -32,22 +34,22 @@ use ignore::types::{FileTypeDef, Types, TypesBuilder};
use ignore::{Walk, WalkBuilder, WalkParallel};
use log;
use num_cpus;
use path_printer::{PathPrinter, PathPrinterBuilder};
use regex::{self, Regex};
use same_file::Handle;
use regex;
use termcolor::{
WriteColor,
BufferedStandardStream, BufferWriter, ColorChoice, StandardStream,
BufferWriter, ColorChoice,
};
use app;
use config;
use logger::Logger;
use messages::{set_messages, set_ignore_messages};
use search::{PatternMatcher, Printer, SearchWorker, SearchWorkerBuilder};
use subject::SubjectBuilder;
use unescape::{escape, unescape};
use Result;
use crate::app;
use crate::config;
use crate::logger::Logger;
use crate::messages::{set_messages, set_ignore_messages};
use crate::path_printer::{PathPrinter, PathPrinterBuilder};
use crate::search::{
PatternMatcher, Printer, SearchWorker, SearchWorkerBuilder,
};
use crate::subject::SubjectBuilder;
use crate::Result;
/// The command that ripgrep should execute based on the command line
/// configuration.
@@ -285,6 +287,7 @@ impl Args {
builder
.json_stats(self.matches().is_present("json"))
.preprocessor(self.matches().preprocessor())
.preprocessor_globs(self.matches().preprocessor_globs()?)
.search_zip(self.matches().is_present("search-zip"));
Ok(builder.build(matcher, searcher, printer))
}
@@ -307,20 +310,20 @@ impl Args {
/// file or a stream such as stdin.
pub fn subject_builder(&self) -> SubjectBuilder {
let mut builder = SubjectBuilder::new();
builder
.strip_dot_prefix(self.using_default_path())
.skip(self.matches().stdout_handle());
builder.strip_dot_prefix(self.using_default_path());
builder
}
/// Execute the given function with a writer to stdout that enables color
/// support based on the command line configuration.
pub fn stdout(&self) -> Box<WriteColor + Send> {
let color_choice = self.matches().color_choice();
if atty::is(atty::Stream::Stdout) {
Box::new(StandardStream::stdout(color_choice))
pub fn stdout(&self) -> cli::StandardStream {
let color = self.matches().color_choice();
if self.matches().is_present("line-buffered") {
cli::stdout_buffered_line(color)
} else if self.matches().is_present("block-buffered") {
cli::stdout_buffered_block(color)
} else {
Box::new(BufferedStandardStream::stdout(color_choice))
cli::stdout(color)
}
}
@@ -360,6 +363,120 @@ enum OutputKind {
JSON,
}
/// The sort criteria, if present.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
struct SortBy {
/// Whether to reverse the sort criteria (i.e., descending order).
reverse: bool,
/// The actual sorting criteria.
kind: SortByKind,
}
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
enum SortByKind {
/// No sorting at all.
None,
/// Sort by path.
Path,
/// Sort by last modified time.
LastModified,
/// Sort by last accessed time.
LastAccessed,
/// Sort by creation time.
Created,
}
impl SortBy {
fn asc(kind: SortByKind) -> SortBy {
SortBy { reverse: false, kind: kind }
}
fn desc(kind: SortByKind) -> SortBy {
SortBy { reverse: true, kind: kind }
}
fn none() -> SortBy {
SortBy::asc(SortByKind::None)
}
/// Try to check that the sorting criteria selected is actually supported.
/// If it isn't, then an error is returned.
fn check(&self) -> Result<()> {
match self.kind {
SortByKind::None | SortByKind::Path => {}
SortByKind::LastModified => {
env::current_exe()?.metadata()?.modified()?;
}
SortByKind::LastAccessed => {
env::current_exe()?.metadata()?.accessed()?;
}
SortByKind::Created => {
env::current_exe()?.metadata()?.created()?;
}
}
Ok(())
}
fn configure_walk_builder(self, builder: &mut WalkBuilder) {
// This isn't entirely optimal. In particular, we will wind up issuing
// a stat for many files redundantly. Aside from having potentially
// inconsistent results with respect to sorting, this is also slow.
// We could fix this here at the expense of memory by caching stat
// calls. A better fix would be to find a way to push this down into
// directory traversal itself, but that's a somewhat nasty change.
match self.kind {
SortByKind::None => {}
SortByKind::Path => {
if self.reverse {
builder.sort_by_file_name(|a, b| a.cmp(b).reverse());
} else {
builder.sort_by_file_name(|a, b| a.cmp(b));
}
}
SortByKind::LastModified => {
builder.sort_by_file_path(move |a, b| {
sort_by_metadata_time(
a, b,
self.reverse,
|md| md.modified(),
)
});
}
SortByKind::LastAccessed => {
builder.sort_by_file_path(move |a, b| {
sort_by_metadata_time(
a, b,
self.reverse,
|md| md.accessed(),
)
});
}
SortByKind::Created => {
builder.sort_by_file_path(move |a, b| {
sort_by_metadata_time(
a, b,
self.reverse,
|md| md.created(),
)
});
}
}
}
}
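The `sort_by_metadata_time` helper referenced above is not shown in this hunk. A plausible shape, assuming entries with unreadable metadata fall back to plain path order:

    use std::cmp;
    use std::fs::Metadata;
    use std::io;
    use std::path::Path;
    use std::time::SystemTime;

    fn sort_by_metadata_time<G>(
        p1: &Path,
        p2: &Path,
        reverse: bool,
        get_time: G,
    ) -> cmp::Ordering
    where G: Fn(&Metadata) -> io::Result<SystemTime>
    {
        let t1 = match p1.metadata().and_then(|md| get_time(&md)) {
            Ok(t) => t,
            Err(_) => return p1.cmp(p2),
        };
        let t2 = match p2.metadata().and_then(|md| get_time(&md)) {
            Ok(t) => t,
            Err(_) => return p1.cmp(p2),
        };
        if reverse {
            t1.cmp(&t2).reverse()
        } else {
            t1.cmp(&t2)
        }
    }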
impl SortByKind {
fn new(kind: &str) -> SortByKind {
match kind {
"none" => SortByKind::None,
"path" => SortByKind::Path,
"modified" => SortByKind::LastModified,
"accessed" => SortByKind::LastAccessed,
"created" => SortByKind::Created,
_ => SortByKind::None,
}
}
}
impl ArgMatches {
/// Create an ArgMatches from clap's parse result.
fn new(clap_matches: clap::ArgMatches<'static>) -> ArgMatches {
@@ -376,7 +493,9 @@ impl ArgMatches {
fn reconfigure(self) -> ArgMatches {
// If the end user says no config, then respect it.
if self.is_present("no-config") {
debug!("not reading config files because --no-config is present");
log::debug!(
"not reading config files because --no-config is present"
);
return self;
}
// If the user wants ripgrep to use a config file, then parse args
@@ -390,7 +509,7 @@ impl ArgMatches {
args.insert(0, bin);
}
args.extend(cliargs);
debug!("final argv: {:?}", args);
log::debug!("final argv: {:?}", args);
ArgMatches::new(app::app().get_matches_from(args))
}
@@ -500,7 +619,10 @@ impl ArgMatches {
if let Some(limit) = self.dfa_size_limit()? {
builder.dfa_size_limit(limit);
}
Ok(builder.build(&patterns.join("|"))?)
match builder.build(&patterns.join("|")) {
Ok(m) => Ok(m),
Err(err) => Err(From::from(suggest_multiline(err.to_string()))),
}
}
/// Build a matcher using PCRE2.
@@ -515,17 +637,17 @@ impl ArgMatches {
.caseless(self.case_insensitive())
.multi_line(true)
.word(self.is_present("word-regexp"));
// For whatever reason, the JIT craps out during compilation with a
// "no more memory" error on 32 bit systems. So don't use it there.
// For whatever reason, the JIT craps out during regex compilation with
// a "no more memory" error on 32 bit systems. So don't use it there.
if !cfg!(target_pointer_width = "32") {
builder.jit(true);
builder.jit_if_available(true);
}
if self.pcre2_unicode() {
builder.utf(true).ucp(true);
if self.encoding()?.is_some() {
// SAFETY: If an encoding was specified, then we're guaranteed
// to get valid UTF-8, so we can disable PCRE2's UTF checking.
// (Feeding invalid UTF-8 to PCRE2 is UB.)
// (Feeding invalid UTF-8 to PCRE2 is undefined behavior.)
unsafe {
builder.disable_utf_check();
}
@@ -663,6 +785,8 @@ impl ArgMatches {
.follow_links(self.is_present("follow"))
.max_filesize(self.max_file_size()?)
.threads(self.threads()?)
.same_file_system(self.is_present("one-file-system"))
.skip_stdout(true)
.overrides(self.overrides()?)
.types(self.types()?)
.hidden(!self.hidden())
@@ -677,9 +801,9 @@ impl ArgMatches {
if !self.no_ignore() {
builder.add_custom_ignore_filename(".rgignore");
}
if self.is_present("sort-files") {
builder.sort_by_file_name(|a, b| a.cmp(b));
}
let sortby = self.sort_by()?;
sortby.check()?;
sortby.configure_walk_builder(&mut builder);
Ok(builder)
}
}
@@ -738,7 +862,7 @@ impl ArgMatches {
} else if preference == "ansi" {
ColorChoice::AlwaysAnsi
} else if preference == "auto" {
if atty::is(atty::Stream::Stdout) || self.is_present("pretty") {
if cli::is_tty_stdout() || self.is_present("pretty") {
ColorChoice::Auto
} else {
ColorChoice::Never
@@ -754,15 +878,7 @@ impl ArgMatches {
/// is returned.
fn color_specs(&self) -> Result<ColorSpecs> {
// Start with a default set of color specs.
let mut specs = vec![
#[cfg(unix)]
"path:fg:magenta".parse().unwrap(),
#[cfg(windows)]
"path:fg:cyan".parse().unwrap(),
"line:fg:green".parse().unwrap(),
"match:fg:red".parse().unwrap(),
"match:style:bold".parse().unwrap(),
];
let mut specs = default_color_specs();
for spec_str in self.values_of_lossy_vec("colors") {
specs.push(spec_str.parse()?);
}
@@ -798,9 +914,9 @@ impl ArgMatches {
///
/// If one was not provided, the default `--` is returned.
fn context_separator(&self) -> Vec<u8> {
match self.value_of_lossy("context-separator") {
match self.value_of_os("context-separator") {
None => b"--".to_vec(),
Some(sep) => unescape(&sep),
Some(sep) => cli::unescape_os(&sep),
}
}
@@ -875,7 +991,7 @@ impl ArgMatches {
if self.is_present("no-heading") || self.is_present("vimgrep") {
false
} else {
atty::is(atty::Stream::Stdout)
cli::is_tty_stdout()
|| self.is_present("heading")
|| self.is_present("pretty")
}
@@ -927,7 +1043,7 @@ impl ArgMatches {
// generally want to show line numbers by default when printing to a
// tty for human consumption, except for one interesting case: when
// we're only searching stdin. This makes pipelines work as expected.
(atty::is(atty::Stream::Stdout) && !self.is_only_stdin(paths))
(cli::is_tty_stdout() && !self.is_only_stdin(paths))
|| self.is_present("line-number")
|| self.is_present("column")
|| self.is_present("pretty")
@@ -1062,8 +1178,7 @@ impl ArgMatches {
let file_is_stdin = self.values_of_os("file")
.map_or(false, |mut files| files.any(|f| f == "-"));
let search_cwd =
atty::is(atty::Stream::Stdin)
|| !stdin_is_readable()
!cli::is_readable_stdin()
|| (self.is_present("file") && file_is_stdin)
|| self.is_present("files")
|| self.is_present("type-list");
@@ -1079,9 +1194,9 @@ impl ArgMatches {
/// If the provided path separator is more than a single byte, then an
/// error is returned.
fn path_separator(&self) -> Result<Option<u8>> {
let sep = match self.value_of_lossy("path-separator") {
let sep = match self.value_of_os("path-separator") {
None => return Ok(None),
Some(sep) => unescape(&sep),
Some(sep) => cli::unescape_os(&sep),
};
if sep.is_empty() {
Ok(None)
@@ -1092,7 +1207,7 @@ impl ArgMatches {
In some shells on Windows '/' is automatically \
expanded. Use '//' instead.",
sep.len(),
escape(&sep),
cli::escape(&sep),
)))
} else {
Ok(Some(sep[0]))
@@ -1139,18 +1254,12 @@ impl ArgMatches {
}
}
}
if let Some(files) = self.values_of_os("file") {
for file in files {
if file == "-" {
let stdin = io::stdin();
for line in stdin.lock().lines() {
pats.push(self.pattern_from_str(&line?));
}
if let Some(paths) = self.values_of_os("file") {
for path in paths {
if path == "-" {
pats.extend(cli::patterns_from_stdin()?);
} else {
let f = File::open(file)?;
for line in io::BufReader::new(f).lines() {
pats.push(self.pattern_from_str(&line?));
}
pats.extend(cli::patterns_from_path(path)?);
}
}
}
@@ -1172,7 +1281,7 @@ impl ArgMatches {
///
/// If the pattern is not valid UTF-8, then an error is returned.
fn pattern_from_os_str(&self, pat: &OsStr) -> Result<String> {
let s = pattern_to_str(pat)?;
let s = cli::pattern_from_os(pat)?;
Ok(self.pattern_from_str(s))
}
@@ -1222,6 +1331,17 @@ impl ArgMatches {
Some(Path::new(path).to_path_buf())
}
/// Builds the set of globs for filtering files to apply to the --pre
/// flag. If no --pre-globs are available, then this always returns an
/// empty set of globs.
fn preprocessor_globs(&self) -> Result<Override> {
let mut builder = OverrideBuilder::new(env::current_dir()?);
for glob in self.values_of_lossy_vec("pre-glob") {
builder.add(&glob)?;
}
Ok(builder.build()?)
}
/// Parse the regex-size-limit argument option into a byte count.
fn regex_size_limit(&self) -> Result<Option<usize>> {
let r = self.parse_human_readable_size("regex-size-limit")?;
@@ -1233,6 +1353,22 @@ impl ArgMatches {
self.value_of_lossy("replace").map(|s| s.into_bytes())
}
/// Returns the sorting criteria based on command line parameters.
fn sort_by(&self) -> Result<SortBy> {
// For backcompat, continue supporting deprecated --sort-files flag.
if self.is_present("sort-files") {
return Ok(SortBy::asc(SortByKind::Path));
}
let sortby = match self.value_of_lossy("sort") {
None => match self.value_of_lossy("sortr") {
None => return Ok(SortBy::none()),
Some(choice) => SortBy::desc(SortByKind::new(&choice)),
}
Some(choice) => SortBy::asc(SortByKind::new(&choice)),
};
Ok(sortby)
}
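For orientation, here is a minimal sketch of the shapes that `SortBy::none`, `SortBy::asc` and `SortBy::desc` suggest. The variant and field names below are assumptions for illustration only; the actual definitions are not part of this diff.

// Hedged sketch: plausible shapes for the SortBy/SortByKind types used by
// sort_by() above. Names are assumptions, not ripgrep's definitions.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
enum SortByKind {
    None,
    Path,
    LastModified,
}

#[derive(Clone, Copy, Debug)]
struct SortBy {
    kind: SortByKind,
    reverse: bool,
}

impl SortBy {
    fn none() -> SortBy { SortBy { kind: SortByKind::None, reverse: false } }
    fn asc(kind: SortByKind) -> SortBy { SortBy { kind, reverse: false } }
    fn desc(kind: SortByKind) -> SortBy { SortBy { kind, reverse: true } }
}

fn main() {
    // --sort sorts ascending; --sortr sorts descending.
    assert!(!SortBy::asc(SortByKind::Path).reverse);
    assert!(SortBy::desc(SortByKind::Path).reverse);
}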
/// Returns true if and only if aggregate statistics for a search should
/// be tracked.
///
@@ -1243,28 +1379,6 @@ impl ArgMatches {
self.output_kind() == OutputKind::JSON || self.is_present("stats")
}
/// Returns a handle to stdout for filtering search.
///
/// A handle is returned if and only if ripgrep's stdout is being
/// redirected to a file. The handle returned corresponds to that file.
///
/// This can be used to ensure that we do not attempt to search a file
/// that ripgrep is writing to.
fn stdout_handle(&self) -> Option<Handle> {
let h = match Handle::stdout() {
Err(_) => return None,
Ok(h) => h,
};
let md = match h.as_file().metadata() {
Err(_) => return None,
Ok(md) => md,
};
if !md.is_file() {
return None;
}
Some(h)
}
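This removal works because the walker now skips stdout itself (see the `.skip_stdout(true)` call added to the `WalkBuilder` above). For reference, the removed check condenses to a few `Option` combinators; a sketch assuming the same-file crate:

use same_file::Handle;

// Condensed version of the removed stdout_handle: yield a handle only when
// stdout is redirected to a regular file.
fn stdout_file_handle() -> Option<Handle> {
    let h = Handle::stdout().ok()?;
    let md = h.as_file().metadata().ok()?;
    if md.is_file() { Some(h) } else { None }
}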
/// When the output format is `Summary`, this returns the type of summary
/// output to show.
///
@@ -1288,7 +1402,7 @@ impl ArgMatches {
/// Return the number of threads that should be used for parallelism.
fn threads(&self) -> Result<usize> {
if self.is_present("sort-files") {
if self.sort_by()?.kind != SortByKind::None {
return Ok(1);
}
let threads = self.usize_of("threads")?.unwrap_or(0);
@@ -1386,40 +1500,11 @@ impl ArgMatches {
&self,
arg_name: &str,
) -> Result<Option<u64>> {
// Before (removed): parse the number and K/M/G suffix by hand.
lazy_static! {
    static ref RE: Regex = Regex::new(r"^([0-9]+)([KMG])?$").unwrap();
}

let arg_value = match self.value_of_lossy(arg_name) {
    Some(x) => x,
    None => return Ok(None)
};
let caps = RE
    .captures(&arg_value)
    .ok_or_else(|| {
        format!("invalid format for {}", arg_name)
    })?;
let value = caps[1].parse::<u64>()?;
let suffix = caps.get(2).map(|x| x.as_str());
let v_10 = value.checked_mul(1024);
let v_20 = v_10.and_then(|x| x.checked_mul(1024));
let v_30 = v_20.and_then(|x| x.checked_mul(1024));
let try_suffix = |x: Option<u64>| {
    if x.is_some() {
        Ok(x)
    } else {
        Err(From::from(format!("number too large for {}", arg_name)))
    }
};
match suffix {
    None => Ok(Some(value)),
    Some("K") => try_suffix(v_10),
    Some("M") => try_suffix(v_20),
    Some("G") => try_suffix(v_30),
    _ => Err(From::from(format!("invalid suffix for {}", arg_name)))
}

// After (added): delegate the parse to grep-cli.
let size = match self.value_of_lossy(arg_name) {
    None => return Ok(None),
    Some(size) => size,
};
Ok(Some(cli::parse_human_readable_size(&size)?))
}
}
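The grammar accepted here is unchanged: a decimal count with an optional K/M/G suffix, each suffix multiplying by a further factor of 1024, with overflow reported as an error rather than wrapping. A standalone sketch of that grammar follows; `parse_size_sketch` is our name, not grep-cli's, and it is slightly looser than the old regex (a leading `+` slips through, for instance).

// Standalone sketch of the size grammar: <digits>[K|M|G].
fn parse_size_sketch(s: &str) -> Result<u64, String> {
    let (digits, exp) = match s.as_bytes().last() {
        Some(&b'K') => (&s[..s.len() - 1], 1),
        Some(&b'M') => (&s[..s.len() - 1], 2),
        Some(&b'G') => (&s[..s.len() - 1], 3),
        _ => (s, 0),
    };
    let mut value: u64 = digits.parse().map_err(|e| format!("{}", e))?;
    for _ in 0..exp {
        value = value.checked_mul(1024).ok_or("number too large")?;
    }
    Ok(value)
}

fn main() {
    assert_eq!(parse_size_sketch("64"), Ok(64));
    assert_eq!(parse_size_sketch("10K"), Ok(10 * 1024));
    assert_eq!(parse_size_sketch("1G"), Ok(1 << 30));
}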
@@ -1453,21 +1538,6 @@ impl ArgMatches {
}
}
/// Convert an OsStr to a Unicode string.
///
/// Patterns _must_ be valid UTF-8, so if the given OsStr isn't valid UTF-8,
/// this returns an error.
fn pattern_to_str(s: &OsStr) -> Result<&str> {
s.to_str().ok_or_else(|| {
From::from(format!(
"Argument '{}' is not valid UTF-8. \
Use hex escape sequences to match arbitrary \
bytes in a pattern (e.g., \\xFF).",
s.to_string_lossy()
))
})
}
/// Inspect an error resulting from building a Rust regex matcher, and if it's
/// believed to correspond to a syntax error that PCRE2 could handle, then
/// add a message to suggest the use of -P/--pcre2.
@@ -1483,6 +1553,17 @@ and look-around.", msg)
}
}
fn suggest_multiline(msg: String) -> String {
if msg.contains("the literal") && msg.contains("not allowed") {
format!("{}
Consider enabling multiline mode with the --multiline flag (or -U for short).
When multiline mode is enabled, new line characters can be matched.", msg)
} else {
msg
}
}
/// Convert the result of parsing a human readable file size to a `usize`,
/// failing if the type does not fit.
fn u64_to_usize(
@@ -1502,32 +1583,30 @@ fn u64_to_usize(
}
}
// Removed: the platform-specific stdin checks, superseded by
// cli::is_readable_stdin in grep-cli.
/// Returns true if and only if stdin is deemed searchable.
#[cfg(unix)]
fn stdin_is_readable() -> bool {
    use std::os::unix::fs::FileTypeExt;

    let ft = match Handle::stdin().and_then(|h| h.as_file().metadata()) {
        Err(_) => return false,
        Ok(md) => md.file_type(),
    };
    ft.is_file() || ft.is_fifo()
}

/// Returns true if and only if stdin is deemed searchable.
#[cfg(windows)]
fn stdin_is_readable() -> bool {
    use std::os::windows::io::AsRawHandle;
    use winapi::um::fileapi::GetFileType;
    use winapi::um::winbase::{FILE_TYPE_DISK, FILE_TYPE_PIPE};

    let handle = match Handle::stdin() {
        Err(_) => return false,
        Ok(handle) => handle,
    };
    let raw_handle = handle.as_raw_handle();
    // SAFETY: As far as I can tell, it's not possible to use GetFileType in
    // a way that violates safety. We give it a handle and we get an integer.
    let ft = unsafe { GetFileType(raw_handle) };
    ft == FILE_TYPE_DISK || ft == FILE_TYPE_PIPE
}

// Added: a comparator backing the new time-based --sort/--sortr modes.
/// Builds a comparator for sorting two files according to a system time
/// extracted from the file's metadata.
///
/// If there was a problem extracting the metadata or if the time is not
/// available, then both entries compare equal.
fn sort_by_metadata_time<G>(
    p1: &Path,
    p2: &Path,
    reverse: bool,
    get_time: G,
) -> cmp::Ordering
where G: Fn(&fs::Metadata) -> io::Result<SystemTime>
{
    let t1 = match p1.metadata().and_then(|md| get_time(&md)) {
        Ok(t) => t,
        Err(_) => return cmp::Ordering::Equal,
    };
    let t2 = match p2.metadata().and_then(|md| get_time(&md)) {
        Ok(t) => t,
        Err(_) => return cmp::Ordering::Equal,
    };
    if reverse {
        t1.cmp(&t2).reverse()
    } else {
        t1.cmp(&t2)
    }
}
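For context, a hedged sketch of how such a comparator could be wired into the directory walker for `--sort modified`. It assumes ignore's `WalkBuilder::sort_by_file_path`, which hands the comparator full paths; the wrapper function is ours, not the literal args.rs code.

// Hypothetical wiring for `--sort modified` / `--sortr modified`;
// `reverse` flips the ordering for --sortr.
fn configure_modified_sort(builder: &mut ignore::WalkBuilder, reverse: bool) {
    builder.sort_by_file_path(move |a, b| {
        sort_by_metadata_time(a, b, reverse, |md| md.modified())
    });
}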

View File

@@ -9,7 +9,9 @@ use std::io::{self, BufRead};
use std::ffi::OsString;
use std::path::{Path, PathBuf};
use Result;
use log;
use crate::Result;
/// Return a sequence of arguments derived from ripgrep rc configuration files.
pub fn args() -> Vec<OsString> {
@@ -34,7 +36,7 @@ pub fn args() -> Vec<OsString> {
message!("{}:{}", config_path.display(), err);
}
}
debug!(
log::debug!(
"{}: arguments loaded from config file: {:?}",
config_path.display(),
args

View File

@@ -1,190 +0,0 @@
use std::collections::HashMap;
use std::ffi::OsStr;
use std::fmt;
use std::io::{self, Read};
use std::path::Path;
use std::process::{self, Stdio};
use globset::{Glob, GlobSet, GlobSetBuilder};
/// A decompression command. It contains the command to be spawned as well
/// as any necessary CLI args.
#[derive(Clone, Copy, Debug)]
struct DecompressionCommand {
cmd: &'static str,
args: &'static [&'static str],
}
impl DecompressionCommand {
/// Create a new decompression command.
fn new(
cmd: &'static str,
args: &'static [&'static str],
) -> DecompressionCommand {
DecompressionCommand {
cmd, args
}
}
}
impl fmt::Display for DecompressionCommand {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
write!(f, "{} {}", self.cmd, self.args.join(" "))
}
}
lazy_static! {
static ref DECOMPRESSION_COMMANDS: HashMap<
&'static str,
DecompressionCommand,
> = {
let mut m = HashMap::new();
const ARGS: &[&str] = &["-d", "-c"];
m.insert("gz", DecompressionCommand::new("gzip", ARGS));
m.insert("bz2", DecompressionCommand::new("bzip2", ARGS));
m.insert("xz", DecompressionCommand::new("xz", ARGS));
m.insert("lz4", DecompressionCommand::new("lz4", ARGS));
const LZMA_ARGS: &[&str] = &["--format=lzma", "-d", "-c"];
m.insert("lzma", DecompressionCommand::new("xz", LZMA_ARGS));
m
};
static ref SUPPORTED_COMPRESSION_FORMATS: GlobSet = {
let mut builder = GlobSetBuilder::new();
builder.add(Glob::new("*.gz").unwrap());
builder.add(Glob::new("*.bz2").unwrap());
builder.add(Glob::new("*.xz").unwrap());
builder.add(Glob::new("*.lz4").unwrap());
builder.add(Glob::new("*.lzma").unwrap());
builder.build().unwrap()
};
static ref TAR_ARCHIVE_FORMATS: GlobSet = {
let mut builder = GlobSetBuilder::new();
builder.add(Glob::new("*.tar.gz").unwrap());
builder.add(Glob::new("*.tar.xz").unwrap());
builder.add(Glob::new("*.tar.bz2").unwrap());
builder.add(Glob::new("*.tar.lz4").unwrap());
builder.add(Glob::new("*.tgz").unwrap());
builder.add(Glob::new("*.txz").unwrap());
builder.add(Glob::new("*.tbz2").unwrap());
builder.build().unwrap()
};
}
/// DecompressionReader provides an `io::Read` implementation for a limited
/// set of compression formats.
#[derive(Debug)]
pub struct DecompressionReader {
cmd: DecompressionCommand,
child: process::Child,
done: bool,
}
impl DecompressionReader {
/// Returns a handle to the stdout of the spawned decompression process for
/// `path`, which can be directly searched in the worker. When the returned
/// value is exhausted, the underlying process is reaped. If the underlying
/// process fails, then its stderr is read and converted into a normal
/// io::Error.
///
/// If there is any error in spawning the decompression command, then
/// return `None`, after outputting any necessary debug or error messages.
pub fn from_path(path: &Path) -> Option<DecompressionReader> {
let extension = match path.extension().and_then(OsStr::to_str) {
Some(extension) => extension,
None => {
debug!(
"{}: failed to get compresson extension", path.display());
return None;
}
};
let decompression_cmd = match DECOMPRESSION_COMMANDS.get(extension) {
Some(cmd) => cmd,
None => {
debug!(
"{}: failed to get decompression command", path.display());
return None;
}
};
let cmd = process::Command::new(decompression_cmd.cmd)
.args(decompression_cmd.args)
.arg(path)
.stdout(Stdio::piped())
.stderr(Stdio::piped())
.spawn();
let child = match cmd {
Ok(process) => process,
Err(_) => {
debug!(
"{}: decompression command '{}' not found",
path.display(), decompression_cmd.cmd);
return None;
}
};
Some(DecompressionReader::new(*decompression_cmd, child))
}
fn new(
cmd: DecompressionCommand,
child: process::Child,
) -> DecompressionReader {
DecompressionReader {
cmd: cmd,
child: child,
done: false,
}
}
fn read_error(&mut self) -> io::Result<io::Error> {
let mut errbytes = vec![];
self.child.stderr.as_mut().unwrap().read_to_end(&mut errbytes)?;
let errstr = String::from_utf8_lossy(&errbytes);
let errstr = errstr.trim();
Ok(if errstr.is_empty() {
let msg = format!("decompression command failed: '{}'", self.cmd);
io::Error::new(io::ErrorKind::Other, msg)
} else {
let msg = format!(
"decompression command '{}' failed: {}", self.cmd, errstr);
io::Error::new(io::ErrorKind::Other, msg)
})
}
}
impl io::Read for DecompressionReader {
fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
if self.done {
return Ok(0);
}
let nread = self.child.stdout.as_mut().unwrap().read(buf)?;
if nread == 0 {
self.done = true;
// Reap the child now that we're done reading.
// If the command failed, report stderr as an error.
if !self.child.wait()?.success() {
return Err(self.read_error()?);
}
}
Ok(nread)
}
}
/// Returns true if the given path contains a supported compression format or
/// is a TAR archive.
pub fn is_compressed(path: &Path) -> bool {
is_supported_compression_format(path) || is_tar_archive(path)
}
/// Returns true if the given path matches any one of the supported compression
/// formats.
fn is_supported_compression_format(path: &Path) -> bool {
SUPPORTED_COMPRESSION_FORMATS.is_match(path)
}
/// Returns true if the given path matches any of the known TAR file formats.
fn is_tar_archive(path: &Path) -> bool {
TAR_ARCHIVE_FORMATS.is_match(path)
}
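The detection above is purely name-based; no file contents are inspected. A minimal, self-contained illustration of the same globset matching, trimmed to two formats:

use std::path::Path;

use globset::{Glob, GlobSetBuilder};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Name-based format detection, as in SUPPORTED_COMPRESSION_FORMATS.
    let mut builder = GlobSetBuilder::new();
    builder.add(Glob::new("*.gz")?);
    builder.add(Glob::new("*.xz")?);
    let formats = builder.build()?;

    assert!(formats.is_match(Path::new("sherlock.gz")));
    assert!(!formats.is_match(Path::new("sherlock.txt")));
    Ok(())
}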

View File

@@ -1,23 +1,4 @@
extern crate atty;
#[macro_use]
extern crate clap;
extern crate globset;
extern crate grep;
extern crate ignore;
#[macro_use]
extern crate lazy_static;
#[macro_use]
extern crate log;
extern crate num_cpus;
extern crate regex;
extern crate same_file;
#[macro_use]
extern crate serde_json;
extern crate termcolor;
#[cfg(windows)]
extern crate winapi;
use std::io;
use std::io::{self, Write};
use std::process;
use std::sync::{Arc, Mutex};
use std::time::Instant;
@@ -33,13 +14,10 @@ mod messages;
mod app;
mod args;
mod config;
mod decompressor;
mod preprocessor;
mod logger;
mod path_printer;
mod search;
mod subject;
mod unescape;
type Result<T> = ::std::result::Result<T, Box<::std::error::Error>>;

View File

@@ -6,7 +6,7 @@ static IGNORE_MESSAGES: AtomicBool = ATOMIC_BOOL_INIT;
#[macro_export]
macro_rules! message {
($($tt:tt)*) => {
if ::messages::messages() {
if crate::messages::messages() {
eprintln!($($tt)*);
}
}
@@ -15,7 +15,7 @@ macro_rules! message {
#[macro_export]
macro_rules! ignore_message {
($($tt:tt)*) => {
if ::messages::messages() && ::messages::ignore_messages() {
if crate::messages::messages() && crate::messages::ignore_messages() {
eprintln!($($tt)*);
}
}

View File

@@ -1,93 +0,0 @@
use std::fs::File;
use std::io::{self, Read};
use std::path::{Path, PathBuf};
use std::process::{self, Stdio};
/// PreprocessorReader provides an `io::Read` impl for reading a child
/// preprocessor's output.
#[derive(Debug)]
pub struct PreprocessorReader {
cmd: PathBuf,
path: PathBuf,
child: process::Child,
done: bool,
}
impl PreprocessorReader {
/// Returns a handle to the stdout of the spawned preprocessor process for
/// `path`, which can be directly searched in the worker. When the returned
/// value is exhausted, the underlying process is reaped. If the underlying
/// process fails, then its stderr is read and converted into a normal
/// io::Error.
///
/// If there is any error in spawning the preprocessor command, then
/// return the corresponding error.
pub fn from_cmd_path(
cmd: PathBuf,
path: &Path,
) -> io::Result<PreprocessorReader> {
let child = process::Command::new(&cmd)
.arg(path)
.stdin(Stdio::from(File::open(path)?))
.stdout(Stdio::piped())
.stderr(Stdio::piped())
.spawn()
.map_err(|err| {
io::Error::new(
io::ErrorKind::Other,
format!(
"error running preprocessor command '{}': {}",
cmd.display(),
err,
),
)
})?;
Ok(PreprocessorReader {
cmd: cmd,
path: path.to_path_buf(),
child: child,
done: false,
})
}
fn read_error(&mut self) -> io::Result<io::Error> {
let mut errbytes = vec![];
self.child.stderr.as_mut().unwrap().read_to_end(&mut errbytes)?;
let errstr = String::from_utf8_lossy(&errbytes);
let errstr = errstr.trim();
Ok(if errstr.is_empty() {
let msg = format!(
"preprocessor command failed: '{} {}'",
self.cmd.display(),
self.path.display(),
);
io::Error::new(io::ErrorKind::Other, msg)
} else {
let msg = format!(
"preprocessor command failed: '{} {}': {}",
self.cmd.display(),
self.path.display(),
errstr,
);
io::Error::new(io::ErrorKind::Other, msg)
})
}
}
impl io::Read for PreprocessorReader {
fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
if self.done {
return Ok(0);
}
let nread = self.child.stdout.as_mut().unwrap().read(buf)?;
if nread == 0 {
self.done = true;
// Reap the child now that we're done reading.
// If the command failed, report stderr as an error.
if !self.child.wait()?.success() {
return Err(self.read_error()?);
}
}
Ok(nread)
}
}

View File

@@ -1,19 +1,22 @@
use std::fs::File;
use std::io;
use std::path::{Path, PathBuf};
use std::process::{Command, Stdio};
use std::time::Duration;
use grep::cli;
use grep::matcher::Matcher;
#[cfg(feature = "pcre2")]
use grep::pcre2::{RegexMatcher as PCRE2RegexMatcher};
use grep::printer::{JSON, Standard, Summary, Stats};
use grep::regex::{RegexMatcher as RustRegexMatcher};
use grep::searcher::Searcher;
use ignore::overrides::Override;
use serde_json as json;
use serde_json::json;
use termcolor::WriteColor;
use decompressor::{DecompressionReader, is_compressed};
use preprocessor::PreprocessorReader;
use subject::Subject;
use crate::subject::Subject;
/// The configuration for the search worker. Among a few other things, the
/// configuration primarily controls the way we show search results to users
@@ -22,6 +25,7 @@ use subject::Subject;
struct Config {
json_stats: bool,
preprocessor: Option<PathBuf>,
preprocessor_globs: Override,
search_zip: bool,
}
@@ -30,6 +34,7 @@ impl Default for Config {
Config {
json_stats: false,
preprocessor: None,
preprocessor_globs: Override::empty(),
search_zip: false,
}
}
@@ -39,6 +44,8 @@ impl Default for Config {
#[derive(Clone, Debug)]
pub struct SearchWorkerBuilder {
config: Config,
command_builder: cli::CommandReaderBuilder,
decomp_builder: cli::DecompressionReaderBuilder,
}
impl Default for SearchWorkerBuilder {
@@ -50,7 +57,17 @@ impl Default for SearchWorkerBuilder {
impl SearchWorkerBuilder {
/// Create a new builder for configuring and constructing a search worker.
pub fn new() -> SearchWorkerBuilder {
SearchWorkerBuilder { config: Config::default() }
let mut cmd_builder = cli::CommandReaderBuilder::new();
cmd_builder.async_stderr(true);
let mut decomp_builder = cli::DecompressionReaderBuilder::new();
decomp_builder.async_stderr(true);
SearchWorkerBuilder {
config: Config::default(),
command_builder: cmd_builder,
decomp_builder: decomp_builder,
}
}
/// Create a new search worker using the given searcher, matcher and
@@ -62,7 +79,12 @@ impl SearchWorkerBuilder {
printer: Printer<W>,
) -> SearchWorker<W> {
let config = self.config.clone();
SearchWorker { config, matcher, searcher, printer }
let command_builder = self.command_builder.clone();
let decomp_builder = self.decomp_builder.clone();
SearchWorker {
config, command_builder, decomp_builder,
matcher, searcher, printer,
}
}
/// Forcefully use JSON to emit statistics, even if the underlying printer
@@ -90,6 +112,17 @@ impl SearchWorkerBuilder {
self
}
/// Set the globs for determining which files should be run through the
/// preprocessor. By default, with no globs and a preprocessor specified,
/// every file is run through the preprocessor.
pub fn preprocessor_globs(
&mut self,
globs: Override,
) -> &mut SearchWorkerBuilder {
self.config.preprocessor_globs = globs;
self
}
/// Enable the decompression and searching of common compressed files.
///
/// When enabled, if a particular file path is recognized as a compressed
@@ -237,6 +270,8 @@ impl<W: WriteColor> Printer<W> {
#[derive(Debug)]
pub struct SearchWorker<W> {
config: Config,
command_builder: cli::CommandReaderBuilder,
decomp_builder: cli::DecompressionReaderBuilder,
matcher: PatternMatcher,
searcher: Searcher,
printer: Printer<W>,
@@ -278,20 +313,66 @@ impl<W: WriteColor> SearchWorker<W> {
let stdin = io::stdin();
// A `return` here appeases the borrow checker. NLL will fix this.
return self.search_reader(path, stdin.lock());
} else if self.config.preprocessor.is_some() {
let cmd = self.config.preprocessor.clone().unwrap();
let rdr = PreprocessorReader::from_cmd_path(cmd, path)?;
self.search_reader(path, rdr)
} else if self.config.search_zip && is_compressed(path) {
match DecompressionReader::from_path(path) {
None => Ok(SearchResult::default()),
Some(rdr) => self.search_reader(path, rdr),
}
} else if self.should_preprocess(path) {
self.search_preprocessor(path)
} else if self.should_decompress(path) {
self.search_decompress(path)
} else {
self.search_path(path)
}
}
/// Returns true if and only if the given file path should be
/// decompressed before searching.
fn should_decompress(&self, path: &Path) -> bool {
if !self.config.search_zip {
return false;
}
self.decomp_builder.get_matcher().has_command(path)
}
/// Returns true if and only if the given file path should be run through
/// the preprocessor.
fn should_preprocess(&self, path: &Path) -> bool {
if self.config.preprocessor.is_none() {
return false;
}
if self.config.preprocessor_globs.is_empty() {
return true;
}
!self.config.preprocessor_globs.matched(path, false).is_ignore()
}
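The `!...is_ignore()` test leans on a property of ignore's `Override`: once at least one whitelist glob exists, any path that matches no glob is reported as ignored. A small demonstration of those semantics (Unix-style paths assumed):

use ignore::overrides::OverrideBuilder;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Equivalent of `--pre-glob '*.pdf'`.
    let mut builder = OverrideBuilder::new("/");
    builder.add("*.pdf")?;
    let globs = builder.build()?;

    // Matches the whitelist: run it through the preprocessor.
    assert!(!globs.matched("/book.pdf", false).is_ignore());
    // Matches no glob: should_preprocess skips it.
    assert!(globs.matched("/notes.txt", false).is_ignore());
    Ok(())
}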
/// Search the given file path by first asking the preprocessor for the
/// data to search instead of opening the path directly.
fn search_preprocessor(
&mut self,
path: &Path,
) -> io::Result<SearchResult> {
let bin = self.config.preprocessor.clone().unwrap();
let mut cmd = Command::new(&bin);
cmd.arg(path).stdin(Stdio::from(File::open(path)?));
let rdr = self.command_builder.build(&mut cmd)?;
self.search_reader(path, rdr).map_err(|err| {
io::Error::new(
io::ErrorKind::Other,
format!("preprocessor command failed: '{:?}': {}", cmd, err),
)
})
}
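For reference, a minimal standalone sketch of the same pattern: build a `grep::cli::CommandReader` over an arbitrary command and consume its stdout as an `io::Read`. Here `cat` and `/etc/hosts` are stand-ins for illustration, not anything ripgrep itself runs.

use std::io::Read;
use std::process::Command;

use grep::cli;

fn main() -> std::io::Result<()> {
    // Stand-in preprocessor: any command that writes the data to stdout.
    let mut cmd = Command::new("cat");
    cmd.arg("/etc/hosts");

    let mut builder = cli::CommandReaderBuilder::new();
    builder.async_stderr(true);
    let mut rdr = builder.build(&mut cmd)?;

    let mut contents = String::new();
    rdr.read_to_string(&mut contents)?;
    println!("read {} bytes from the command", contents.len());
    Ok(())
}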
/// Attempt to decompress the data at the given file path and search the
/// result. If the given file path isn't recognized as a compressed file,
/// then search it without doing any decompression.
fn search_decompress(
&mut self,
path: &Path,
) -> io::Result<SearchResult> {
let rdr = self.decomp_builder.build(path)?;
self.search_reader(path, rdr)
}
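And the decompression counterpart: `DecompressionReaderBuilder` picks a decompression command from the path's extension and exposes its stdout as an `io::Read`. A hedged sketch; `sherlock.gz` is a hypothetical file and gzip must be on `PATH`.

use std::io::Read;
use std::path::Path;

use grep::cli;

fn main() -> std::io::Result<()> {
    let mut rdr = cli::DecompressionReaderBuilder::new()
        .build(Path::new("sherlock.gz"))?;
    let mut contents = Vec::new();
    rdr.read_to_end(&mut contents)?;
    println!("decompressed {} bytes", contents.len());
    Ok(())
}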
/// Search the contents of the given file path.
fn search_path(&mut self, path: &Path) -> io::Result<SearchResult> {
use self::PatternMatcher::*;

View File

@@ -1,26 +1,18 @@
use std::io;
use std::path::Path;
use std::sync::Arc;
use ignore::{self, DirEntry};
use same_file::Handle;
use log;
/// A configuration for describing how subjects should be built.
#[derive(Clone, Debug)]
struct Config {
skip: Option<Arc<Handle>>,
strip_dot_prefix: bool,
separator: Option<u8>,
terminator: Option<u8>,
}
impl Default for Config {
fn default() -> Config {
Config {
skip: None,
strip_dot_prefix: false,
separator: None,
terminator: None,
}
}
}
@@ -71,26 +63,6 @@ impl SubjectBuilder {
if subj.dent.is_stdin() {
return Some(subj);
}
// If we're supposed to skip a particular file, then skip it.
if let Some(ref handle) = self.config.skip {
match subj.equals(handle) {
Ok(false) => {} // fallthrough
Ok(true) => {
debug!(
"ignoring {}: (probably same file as stdout)",
subj.dent.path().display()
);
return None;
}
Err(err) => {
debug!(
"ignoring {}: got error: {}",
subj.dent.path().display(), err
);
return None;
}
}
}
// If this subject has a depth of 0, then it was provided explicitly
// by an end user (or via a shell glob). In this case, we always want
// to search it if it even smells like a file (e.g., a symlink).
@@ -108,7 +80,7 @@ impl SubjectBuilder {
// directory. Otherwise, emitting messages for directories is just
// noisy.
if !subj.is_dir() {
debug!(
log::debug!(
"ignoring {}: failed to pass subject filter: \
file type: {:?}, metadata: {:?}",
subj.dent.path().display(),
@@ -119,22 +91,6 @@ impl SubjectBuilder {
None
}
/// When provided, subjects that represent the same file as the handle
/// given will be skipped.
///
/// Typically, it is useful to pass a handle referring to stdout, such
/// that the file being written to isn't searched, which can lead to
/// an unbounded feedback mechanism.
///
/// Only one handle to skip can be provided.
pub fn skip(
&mut self,
handle: Option<Handle>,
) -> &mut SubjectBuilder {
self.config.skip = handle.map(Arc::new);
self
}
/// When enabled, if the subject's file path starts with `./` then it is
/// stripped.
///
@@ -172,59 +128,12 @@ impl Subject {
}
/// Returns true if and only if this subject points to a directory.
///
/// This works around a bug in Rust's standard library:
/// https://github.com/rust-lang/rust/issues/46484
#[cfg(windows)]
fn is_dir(&self) -> bool {
use std::os::windows::fs::MetadataExt;
use winapi::um::winnt::FILE_ATTRIBUTE_DIRECTORY;
self.dent.metadata().map(|md| {
md.file_attributes() & FILE_ATTRIBUTE_DIRECTORY != 0
}).unwrap_or(false)
}
/// Returns true if and only if this subject points to a directory.
#[cfg(not(windows))]
fn is_dir(&self) -> bool {
self.dent.file_type().map_or(false, |ft| ft.is_dir())
}
/// Returns true if and only if this subject points to a file.
///
/// This works around a bug in Rust's standard library:
/// https://github.com/rust-lang/rust/issues/46484
#[cfg(windows)]
fn is_file(&self) -> bool {
!self.is_dir()
}
/// Returns true if and only if this subject points to a file.
#[cfg(not(windows))]
fn is_file(&self) -> bool {
self.dent.file_type().map_or(false, |ft| ft.is_file())
}
/// Returns true if and only if this subject is believed to be equivalent
/// to the given handle. If there was a problem querying this subject for
/// information to determine equality, then that error is returned.
fn equals(&self, handle: &Handle) -> io::Result<bool> {
#[cfg(unix)]
fn never_equal(dent: &DirEntry, handle: &Handle) -> bool {
dent.ino() != Some(handle.ino())
}
#[cfg(not(unix))]
fn never_equal(_: &DirEntry, _: &Handle) -> bool {
false
}
// If we know for sure that these two things aren't equal, then avoid
// the costly extra stat call to determine equality.
if self.dent.is_stdin() || never_equal(&self.dent, handle) {
return Ok(false);
}
Handle::from_path(self.path()).map(|h| &h == handle)
}
}

View File

@@ -1,137 +0,0 @@
/// A single state in the state machine used by `unescape`.
#[derive(Clone, Copy, Eq, PartialEq)]
enum State {
/// The state after seeing a `\`.
Escape,
/// The state after seeing a `\x`.
HexFirst,
/// The state after seeing a `\x[0-9A-Fa-f]`.
HexSecond(char),
/// Default state.
Literal,
}
/// Escapes an arbitrary byte slice such that it can be presented as a human
/// readable string.
pub fn escape(bytes: &[u8]) -> String {
use std::ascii::escape_default;
let escaped = bytes.iter().flat_map(|&b| escape_default(b)).collect();
String::from_utf8(escaped).unwrap()
}
/// Unescapes a string given on the command line. It supports a limited set of
/// escape sequences:
///
/// * `\t`, `\r` and `\n` are mapped to their corresponding ASCII bytes.
/// * `\xZZ` hexadecimal escapes are mapped to their byte.
pub fn unescape(s: &str) -> Vec<u8> {
use self::State::*;
let mut bytes = vec![];
let mut state = Literal;
for c in s.chars() {
match state {
Escape => {
match c {
'n' => { bytes.push(b'\n'); state = Literal; }
'r' => { bytes.push(b'\r'); state = Literal; }
't' => { bytes.push(b'\t'); state = Literal; }
'x' => { state = HexFirst; }
c => {
bytes.extend(format!(r"\{}", c).into_bytes());
state = Literal;
}
}
}
HexFirst => {
match c {
'0'...'9' | 'A'...'F' | 'a'...'f' => {
state = HexSecond(c);
}
c => {
bytes.extend(format!(r"\x{}", c).into_bytes());
state = Literal;
}
}
}
HexSecond(first) => {
match c {
'0'...'9' | 'A'...'F' | 'a'...'f' => {
let ordinal = format!("{}{}", first, c);
let byte = u8::from_str_radix(&ordinal, 16).unwrap();
bytes.push(byte);
state = Literal;
}
c => {
let original = format!(r"\x{}{}", first, c);
bytes.extend(original.into_bytes());
state = Literal;
}
}
}
Literal => {
match c {
'\\' => { state = Escape; }
c => { bytes.extend(c.to_string().as_bytes()); }
}
}
}
}
match state {
Escape => bytes.push(b'\\'),
HexFirst => bytes.extend(b"\\x"),
HexSecond(c) => bytes.extend(format!("\\x{}", c).into_bytes()),
Literal => {}
}
bytes
}
#[cfg(test)]
mod tests {
use super::unescape;
fn b(bytes: &'static [u8]) -> Vec<u8> {
bytes.to_vec()
}
#[test]
fn unescape_nul() {
assert_eq!(b(b"\x00"), unescape(r"\x00"));
}
#[test]
fn unescape_nl() {
assert_eq!(b(b"\n"), unescape(r"\n"));
}
#[test]
fn unescape_tab() {
assert_eq!(b(b"\t"), unescape(r"\t"));
}
#[test]
fn unescape_carriage() {
assert_eq!(b(b"\r"), unescape(r"\r"));
}
#[test]
fn unescape_nothing_simple() {
assert_eq!(b(b"\\a"), unescape(r"\a"));
}
#[test]
fn unescape_nothing_hex0() {
assert_eq!(b(b"\\x"), unescape(r"\x"));
}
#[test]
fn unescape_nothing_hex1() {
assert_eq!(b(b"\\xz"), unescape(r"\xz"));
}
#[test]
fn unescape_nothing_hex2() {
assert_eq!(b(b"\\xzz"), unescape(r"\xzz"));
}
}

View File

@@ -1,5 +1,5 @@
use hay::{SHERLOCK, SHERLOCK_CRLF};
use util::{Dir, TestCommand, sort_lines};
use crate::hay::{SHERLOCK, SHERLOCK_CRLF};
use crate::util::{Dir, TestCommand, sort_lines};
// See: https://github.com/BurntSushi/ripgrep/issues/1
rgtest!(f1_sjis, |dir: Dir, mut cmd: TestCommand| {

View File

@@ -1,9 +1,10 @@
use std::time;
use serde_derive::Deserialize;
use serde_json as json;
use hay::{SHERLOCK, SHERLOCK_CRLF};
use util::{Dir, TestCommand};
use crate::hay::{SHERLOCK, SHERLOCK_CRLF};
use crate::util::{Dir, TestCommand};
#[derive(Clone, Debug, Deserialize, PartialEq, Eq)]
#[serde(tag = "type", content = "data")]
@@ -241,6 +242,49 @@ rgtest!(notutf8, |dir: Dir, mut cmd: TestCommand| {
);
});
rgtest!(notutf8_file, |dir: Dir, mut cmd: TestCommand| {
use std::ffi::OsStr;
// This test does not work with PCRE2 because PCRE2 does not support the
// `u` flag.
if dir.is_pcre2() {
return;
}
let name = "foo";
let contents = &b"quux\xFFbaz"[..];
// APFS does not support creating files with invalid UTF-8 bytes, so just
// skip the test if we can't create our file.
if dir.try_create_bytes(OsStr::new(name), contents).is_err() {
return;
}
cmd.arg("--json").arg(r"(?-u)\xFF");
let msgs = json_decode(&cmd.stdout());
assert_eq!(
msgs[0].unwrap_begin(),
Begin { path: Some(Data::text("foo")) }
);
assert_eq!(
msgs[1].unwrap_match(),
Match {
path: Some(Data::text("foo")),
lines: Data::bytes("cXV1eP9iYXo="),
line_number: Some(1),
absolute_offset: 0,
submatches: vec![
SubMatch {
m: Data::bytes("/w=="),
start: 4,
end: 5,
},
],
}
);
});
// See: https://github.com/BurntSushi/ripgrep/issues/416
//
// This test in particular checks that our match does _not_ include the `\r`

View File

@@ -3,11 +3,11 @@ macro_rules! rgtest {
($name:ident, $fun:expr) => {
#[test]
fn $name() {
let (dir, cmd) = ::util::setup(stringify!($name));
let (dir, cmd) = crate::util::setup(stringify!($name));
$fun(dir, cmd);
if cfg!(feature = "pcre2") {
let (dir, cmd) = ::util::setup_pcre2(stringify!($name));
let (dir, cmd) = crate::util::setup_pcre2(stringify!($name));
$fun(dir, cmd);
}
}

View File

@@ -1,5 +1,5 @@
use hay::SHERLOCK;
use util::{Dir, TestCommand, cmd_exists, sort_lines};
use crate::hay::SHERLOCK;
use crate::util::{Dir, TestCommand, cmd_exists, sort_lines};
// This file contains "miscellaneous" tests that were either written before
// features were tracked more explicitly, or were simply written without
@@ -816,6 +816,24 @@ be, to a very large extent, the result of luck. Sherlock Holmes
eqnice!(expected, cmd.stdout());
});
rgtest!(preprocessing_glob, |dir: Dir, mut cmd: TestCommand| {
if !cmd_exists("xzcat") {
return;
}
dir.create("sherlock", SHERLOCK);
dir.create_bytes("sherlock.xz", include_bytes!("./data/sherlock.xz"));
cmd.args(&["--pre", "xzcat", "--pre-glob", "*.xz", "Sherlock"]);
let expected = "\
sherlock.xz:For the Doctor Watsons of this world, as opposed to the Sherlock
sherlock.xz:be, to a very large extent, the result of luck. Sherlock Holmes
sherlock:For the Doctor Watsons of this world, as opposed to the Sherlock
sherlock:be, to a very large extent, the result of luck. Sherlock Holmes
";
eqnice!(sort_lines(expected), sort_lines(&cmd.stdout()));
});
rgtest!(compressed_gzip, |dir: Dir, mut cmd: TestCommand| {
if !cmd_exists("gzip") {
return;

View File

@@ -1,5 +1,5 @@
use hay::SHERLOCK;
use util::{Dir, TestCommand};
use crate::hay::SHERLOCK;
use crate::util::{Dir, TestCommand};
// This tests that multiline matches that span multiple lines, but where
// multiple matches may begin and end on the same line work correctly.

View File

@@ -1,5 +1,5 @@
use hay::SHERLOCK;
use util::{Dir, TestCommand, sort_lines};
use crate::hay::SHERLOCK;
use crate::util::{Dir, TestCommand, sort_lines};
// See: https://github.com/BurntSushi/ripgrep/issues/16
rgtest!(r16, |dir: Dir, mut cmd: TestCommand| {
@@ -562,3 +562,9 @@ rgtest!(r900, |dir: Dir, mut cmd: TestCommand| {
cmd.arg("-fpat").arg("sherlock").assert_err();
});
// See: https://github.com/BurntSushi/ripgrep/issues/1064
rgtest!(r1064, |dir: Dir, mut cmd: TestCommand| {
dir.create("input", "abc");
eqnice!("input:abc\n", cmd.arg("a(.*c)").stdout());
});

View File

@@ -1,8 +1,3 @@
extern crate serde;
#[macro_use]
extern crate serde_derive;
extern crate serde_json;
// Macros useful for testing.
#[macro_use]
mod macros;

View File

@@ -103,6 +103,7 @@ impl Dir {
/// Try to create a new file with the given name and contents in this
/// directory.
#[allow(dead_code)] // unused on Windows
pub fn try_create<P: AsRef<Path>>(
&self,
name: P,
@@ -222,6 +223,7 @@ impl Dir {
/// Creates a file symlink to the src with the given target name
/// in this directory.
#[cfg(windows)]
#[allow(dead_code)] // unused on Windows
pub fn link_file<S: AsRef<Path>, T: AsRef<Path>>(
&self,
src: S,