cli: replace clap with lexopt and supporting code

ripgrep began its life with docopt for argument parsing. Then it moved to Clap and stayed there for a number of years. Clap has served ripgrep well, and it probably could continue to serve ripgrep well, but I ended up deciding to move off of it.

Why?

The first time I had the thought of moving off of Clap was during the 2->3->4 transition. I thought the 3.x and 4.x releases were great, but for me, it ended up moving a little too quickly. Since the release of 4.x was telegraphed around when 3.x came out, I decided to just hold off and wait to migrate to 4.x instead of doing a 3.x migration followed shortly by another 4.x migration. Of course, I just never ended up doing the migration at all. I never got around to it and there just wasn't a compelling reason for me to upgrade. While I never investigated it, I saw an upgrade as a non-trivial amount of work in part because I didn't encapsulate the usage of Clap enough.

The above is just what got me started thinking about it. It wasn't enough to get me to move off of it on its own. What ended up pushing me over the edge was a combination of factors:

* As mentioned above, I didn't want to run on the migration treadmill. This has proven to not be much of an issue, but at the time of the 2->3->4 releases, I didn't know how long Clap 4.x would be out before a 5.x would come out.
* The release of lexopt[1] caught my eye. IMO, that crate demonstrates exactly how something new can arrive on the scene and just thoroughly solve a problem minimalistically. It has the docs, the reasoning, the simple API, the tests and good judgment. It gets all the weird corner cases right that Clap also gets right (and is part of why I was originally attracted to Clap).
* I have an overall desire to reduce the size of my dependency tree. In part because a smaller dependency tree tends to correlate with better compile times, but also in part because it reduces my reliance on, and trust in, others. It lets me be the "master" of ripgrep's destiny by reducing the amount of behavior that is the result of someone else's decision (whether good or bad).
* I perceived that Clap solves a more general problem than what I actually need solved. Despite the vast number of flags that ripgrep has, its requirements are actually pretty simple. We just need simple switches and flags that support one value. No multi-value flags. No sub-commands. And probably a lot of other functionality that Clap has that makes it so flexible for so many different use cases. (I'm being hand wavy on the last point.)

With all that said, perhaps most importantly, the future of ripgrep possibly demands a more flexible CLI argument parser. In today's world, I would really like, for example, flags like `--type` and `--type-not` to be able to accumulate their repeated values into a single sequence while respecting the order they appear on the CLI. For example, prior to this migration, `rg regex-automata -Tlock -ttoml` would not return results in `Cargo.lock` in this repository because the `-Tlock` always took priority even though `-ttoml` appeared after it. But with this migration, `-ttoml` now correctly overrides `-Tlock`. We would like to do similar things for `-g/--glob` and `--iglob` and potentially even now introduce a `-G/--glob-not` flag instead of requiring users to use `!` to negate a glob. (Which I had done originally to work around this problem.) And some day, I'd like to add some kind of boolean matching to ripgrep perhaps similar to how `git grep` does it. (Although I haven't thought too carefully on a design yet.) In order to do that, I perceive it would be difficult to implement correctly in Clap.

I believe that this last point is possible to implement correctly in Clap 2.x, although it is awkward to do so. I have not looked closely enough at the Clap 4.x API to know whether it's still possible there. In any case, these were enough reasons to move off of Clap and own more of the argument parsing process myself.

This did require a few things:

* I had to write my own logic for how arguments are combined into one single state object. Of course, I wanted this. This was part of the upside. But it's still code I didn't have to write for Clap.
* I had to write my own shell completion generator.
* I had to write my own `-h/--help` output generator.
* I also had to write my own man page generator. Well, I had to do this with Clap 2.x too, although my understanding is that Clap 4.x supports this. With that said, without having tried it, my guess is that I probably wouldn't have liked the output it generated because I ultimately had to write most of the roff by hand myself to get the man page I wanted. (This also had the benefit of dropping the build dependency on asciidoc/asciidoctor.)

While this is definitely a fair bit of extra work, it overall only cost me a couple days. IMO, that's a good trade-off given that this code is unlikely to change again in any substantial way. And it should also allow for more flexible semantics going forward.

Fixes #884, Fixes #1648, Fixes #1701, Fixes #1814, Fixes #1966

[1]: https://docs.rs/lexopt/0.3.0/lexopt/index.html
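The ordering behavior described for `--type` and `--type-not` can be sketched without any parser library. The following is a minimal, standard-library-only illustration. It is not ripgrep's implementation and does not use lexopt's API: `TypeChange`, `collect_type_changes` and `last_decision` are hypothetical names, and the sketch assumes that `Cargo.lock` belongs to both the `lock` and `toml` file types, as the `-Tlock -ttoml` example implies.

```rust
/// One file type selection, recorded in command line order.
/// Hypothetical type for illustration only.
#[derive(Clone, Debug, PartialEq, Eq)]
enum TypeChange {
    /// From `-t`/`--type`.
    Select(String),
    /// From `-T`/`--type-not`.
    Negate(String),
}

/// Collect every type flag into one sequence, preserving CLI order.
/// Short flag bundling (e.g. `-it`) is deliberately not handled here.
fn collect_type_changes(args: &[&str]) -> Result<Vec<TypeChange>, String> {
    let mut changes = Vec::new();
    let mut it = args.iter().copied();
    while let Some(arg) = it.next() {
        // Work out which flag this is and whether its value is attached,
        // as in `-ttoml` or `--type=toml`.
        let (negate, attached) = if let Some(v) = arg.strip_prefix("--type-not=") {
            (true, Some(v))
        } else if let Some(v) = arg.strip_prefix("--type=") {
            (false, Some(v))
        } else if arg == "--type-not" {
            (true, None)
        } else if arg == "--type" {
            (false, None)
        } else if let Some(v) = arg.strip_prefix("-T") {
            (true, Some(v).filter(|v| !v.is_empty()))
        } else if let Some(v) = arg.strip_prefix("-t") {
            (false, Some(v).filter(|v| !v.is_empty()))
        } else {
            // Patterns, paths and all other flags are ignored in this sketch.
            continue;
        };
        // A detached value is taken from the next argument.
        let value = match attached {
            Some(v) => v,
            None => it.next().ok_or_else(|| format!("missing value for {arg}"))?,
        };
        changes.push(if negate {
            TypeChange::Negate(value.to_string())
        } else {
            TypeChange::Select(value.to_string())
        });
    }
    Ok(changes)
}

/// The last change that mentions one of the file's types wins.
/// Returns `None` when no change applies to the file.
fn last_decision(changes: &[TypeChange], file_types: &[&str]) -> Option<bool> {
    changes.iter().rev().find_map(|change| match change {
        TypeChange::Select(t) if file_types.contains(&t.as_str()) => Some(true),
        TypeChange::Negate(t) if file_types.contains(&t.as_str()) => Some(false),
        _ => None,
    })
}
```

With the arguments `-Tlock -ttoml`, the recorded sequence is a negation of `lock` followed by a selection of `toml`, so a file carrying both types is selected because the later flag wins. A parser that stores the two flags in separate buckets loses exactly this ordering information.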
2025-08-01 12:41:58 -07:00 · 2023-10-16 18:05:39 -04:00
parent c33f623719
commit 082245dadb
47 changed files with 12730 additions and 6147 deletions
--- a/crates/core/app.rs
+++ b/crates/core/app.rs
--- a/crates/core/args.rs
+++ b/crates/core/args.rs
--- a/crates/core/flags/complete/bash.rs
+++ b/crates/core/flags/complete/bash.rs
@@ -0,0 +1,107 @@
+/*!
+Provides completions for ripgrep's CLI for the bash shell.
+*/
+
+use crate::flags::defs::FLAGS;
+
+const TEMPLATE_FULL: &'static str = "
+_rg() {
+  local i cur prev opts cmd
+  COMPREPLY=()
+  cur=\"${COMP_WORDS[COMP_CWORD]}\"
+  prev=\"${COMP_WORDS[COMP_CWORD-1]}\"
+  cmd=\"\"
+  opts=\"\"
+
+  for i in ${COMP_WORDS[@]}; do
+    case \"${i}\" in
+      rg)
+        cmd=\"rg\"
+        ;;
+      *)
+        ;;
+    esac
+  done
+
+  case \"${cmd}\" in
+    rg)
+      opts=\"!OPTS!\"
+      if [[ ${cur} == -* || ${COMP_CWORD} -eq 1 ]] ; then
+        COMPREPLY=($(compgen -W \"${opts}\" -- \"${cur}\"))
+        return 0
+      fi
+      case \"${prev}\" in
+!CASES!
+      esac
+      COMPREPLY=($(compgen -W \"${opts}\" -- \"${cur}\"))
+      return 0
+      ;;
+  esac
+}
+
+complete -F _rg -o bashdefault -o default rg
+";
+
+const TEMPLATE_CASE: &'static str = "
+        !FLAG!)
+          COMPREPLY=($(compgen -f \"${cur}\"))
+          return 0
+          ;;
+";
+
+const TEMPLATE_CASE_CHOICES: &'static str = "
+        !FLAG!)
+          COMPREPLY=($(compgen -W \"!CHOICES!\" -- \"${cur}\"))
+          return 0
+          ;;
+";
+
+/// Generate completions for Bash.
+///
+/// Note that these completions are based on what was produced for ripgrep <=13
+/// using Clap 2.x. Improvements on this are welcome.
+pub(crate) fn generate() -> String {
+    let mut opts = String::new();
+    for flag in FLAGS.iter() {
+        opts.push_str("--");
+        opts.push_str(flag.name_long());
+        opts.push(' ');
+        if let Some(short) = flag.name_short() {
+            opts.push('-');
+            opts.push(char::from(short));
+            opts.push(' ');
+        }
+        if let Some(name) = flag.name_negated() {
+            opts.push_str("--");
+            opts.push_str(name);
+            opts.push(' ');
+        }
+    }
+    opts.push_str("<PATTERN> <PATH>...");
+
+    let mut cases = String::new();
+    for flag in FLAGS.iter() {
+        let template = if !flag.doc_choices().is_empty() {
+            let choices = flag.doc_choices().join(" ");
+            TEMPLATE_CASE_CHOICES.trim_end().replace("!CHOICES!", &choices)
+        } else {
+            TEMPLATE_CASE.trim_end().to_string()
+        };
+        let name = format!("--{}", flag.name_long());
+        cases.push_str(&template.replace("!FLAG!", &name));
+        if let Some(short) = flag.name_short() {
+            let name = format!("-{}", char::from(short));
+            cases.push_str(&template.replace("!FLAG!", &name));
+        }
+        if let Some(negated) = flag.name_negated() {
+            let name = format!("--{negated}");
+            cases.push_str(&template.replace("!FLAG!", &name));
+        }
+    }
+
+    TEMPLATE_FULL
+        .replace("!OPTS!", &opts)
+        .replace("!CASES!", &cases)
+        .trim_start()
+        .to_string()
+}
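To make the placeholder substitution in `generate` concrete, here is a small standalone sketch of the same idea applied to a two-flag table. The `Flag` struct and its fields are hypothetical stand-ins for the accessors used above (`name_long`, `name_short`, `name_negated`, `doc_choices`), and the one-line case templates are simplified from `TEMPLATE_CASE` and `TEMPLATE_CASE_CHOICES`.

```rust
/// Hypothetical, simplified flag description for illustration only.
struct Flag {
    long: &'static str,
    short: Option<u8>,
    negated: Option<&'static str>,
    choices: &'static [&'static str],
}

// Simplified one-line versions of the case templates.
const CASE_FILES: &str = "!FLAG!) COMPREPLY=($(compgen -f \"${cur}\")) ;;";
const CASE_CHOICES: &str =
    "!FLAG!) COMPREPLY=($(compgen -W \"!CHOICES!\" -- \"${cur}\")) ;;";

/// Build the space-separated word list that replaces `!OPTS!`.
fn opts_word_list(flags: &[Flag]) -> String {
    let mut opts = String::new();
    for flag in flags {
        opts.push_str(&format!("--{} ", flag.long));
        if let Some(short) = flag.short {
            opts.push_str(&format!("-{} ", char::from(short)));
        }
        if let Some(negated) = flag.negated {
            opts.push_str(&format!("--{negated} "));
        }
    }
    opts.push_str("<PATTERN> <PATH>...");
    opts
}

/// Build one `case` arm: complete from a fixed choice list when the flag
/// has one, otherwise fall back to file name completion.
fn case_arm(name: &str, choices: &[&str]) -> String {
    if choices.is_empty() {
        CASE_FILES.replace("!FLAG!", name)
    } else {
        CASE_CHOICES
            .replace("!CHOICES!", &choices.join(" "))
            .replace("!FLAG!", name)
    }
}
```

Each flag contributes up to three names (long, short, negated) to both the word list and the set of case arms, which is why the generator walks the flag table twice.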
--- a/crates/core/flags/complete/fish.rs
+++ b/crates/core/flags/complete/fish.rs
@@ -0,0 +1,47 @@
+/*!
+Provides completions for ripgrep's CLI for the fish shell.
+*/
+
+use crate::flags::defs::FLAGS;
+
+const TEMPLATE: &'static str =
+    "complete -c rg -n '__fish_use_subcommand' !SHORT! !LONG! !DOC!\n";
+const TEMPLATE_CHOICES: &'static str =
+    "complete -c rg -n '__fish_use_subcommand' !SHORT! !LONG! !DOC! -r -f -a '!CHOICES!'\n";
+
+/// Generate completions for Fish.
+///
+/// Note that these completions are based on what was produced for ripgrep <=13
+/// using Clap 2.x. Improvements on this are welcome.
+pub(crate) fn generate() -> String {
+    let mut out = String::new();
+    for flag in FLAGS.iter() {
+        let short = match flag.name_short() {
+            None => "".to_string(),
+            Some(byte) => format!("-s {}", char::from(byte)),
+        };
+        let long = format!("-l '{}'", flag.name_long().replace("'", "\\'"));
+        let doc = format!("-d '{}'", flag.doc_short().replace("'", "\\'"));
+        let template = if flag.doc_choices().is_empty() {
+            TEMPLATE.to_string()
+        } else {
+            TEMPLATE_CHOICES
+                .replace("!CHOICES!", &flag.doc_choices().join(" "))
+        };
+        out.push_str(
+            &template
+                .replace("!SHORT!", &short)
+                .replace("!LONG!", &long)
+                .replace("!DOC!", &doc),
+        );
+        if let Some(negated) = flag.name_negated() {
+            out.push_str(
+                &template
+                    .replace("!SHORT!", "")
+                    .replace("!LONG!", &format!("-l '{negated}'"))
+                    .replace("!DOC!", &doc),
+            );
+        }
+    }
+    out
+}
--- a/crates/core/flags/complete/mod.rs
+++ b/crates/core/flags/complete/mod.rs
@@ -0,0 +1,8 @@
+/*!
+Modules for generating completions for various shells.
+*/
+
+pub(super) mod bash;
+pub(super) mod fish;
+pub(super) mod powershell;
+pub(super) mod zsh;
--- a/crates/core/flags/complete/powershell.rs
+++ b/crates/core/flags/complete/powershell.rs
@@ -0,0 +1,86 @@
+/*!
+Provides completions for ripgrep's CLI for PowerShell.
+*/
+
+use crate::flags::defs::FLAGS;
+
+const TEMPLATE: &'static str = "
+using namespace System.Management.Automation
+using namespace System.Management.Automation.Language
+
+Register-ArgumentCompleter -Native -CommandName 'rg' -ScriptBlock {
+  param($wordToComplete, $commandAst, $cursorPosition)
+  $commandElements = $commandAst.CommandElements
+  $command = @(
+    'rg'
+    for ($i = 1; $i -lt $commandElements.Count; $i++) {
+        $element = $commandElements[$i]
+        if ($element -isnot [StringConstantExpressionAst] -or
+            $element.StringConstantType -ne [StringConstantType]::BareWord -or
+            $element.Value.StartsWith('-')) {
+            break
+        }
+        $element.Value
+  }) -join ';'
+
+  $completions = @(switch ($command) {
+    'rg' {
+!FLAGS!
+    }
+  })
+
+  $completions.Where{ $_.CompletionText -like \"$wordToComplete*\" } |
+    Sort-Object -Property ListItemText
+}
+";
+
+const TEMPLATE_FLAG: &'static str =
+    "[CompletionResult]::new('!DASH_NAME!', '!NAME!', [CompletionResultType]::ParameterName, '!DOC!')";
+
+/// Generate completions for PowerShell.
+///
+/// Note that these completions are based on what was produced for ripgrep <=13
+/// using Clap 2.x. Improvements on this are welcome.
+pub(crate) fn generate() -> String {
+    let mut flags = String::new();
+    for (i, flag) in FLAGS.iter().enumerate() {
+        let doc = flag.doc_short().replace("'", "''");
+
+        let dash_name = format!("--{}", flag.name_long());
+        let name = flag.name_long();
+        if i > 0 {
+            flags.push('\n');
+        }
+        flags.push_str("      ");
+        flags.push_str(
+            &TEMPLATE_FLAG
+                .replace("!DASH_NAME!", &dash_name)
+                .replace("!NAME!", &name)
+                .replace("!DOC!", &doc),
+        );
+
+        if let Some(byte) = flag.name_short() {
+            let dash_name = format!("-{}", char::from(byte));
+            let name = char::from(byte).to_string();
+            flags.push_str("\n      ");
+            flags.push_str(
+                &TEMPLATE_FLAG
+                    .replace("!DASH_NAME!", &dash_name)
+                    .replace("!NAME!", &name)
+                    .replace("!DOC!", &doc),
+            );
+        }
+
+        if let Some(negated) = flag.name_negated() {
+            let dash_name = format!("--{}", negated);
+            flags.push_str("\n      ");
+            flags.push_str(
+                &TEMPLATE_FLAG
+                    .replace("!DASH_NAME!", &dash_name)
+                    .replace("!NAME!", &negated)
+                    .replace("!DOC!", &doc),
+            );
+        }
+    }
+    TEMPLATE.trim_start().replace("!FLAGS!", &flags)
+}
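Both the fish and PowerShell generators embed flag documentation inside single-quoted strings, and the two shells escape an embedded single quote differently: fish with a backslash, PowerShell by doubling the quote. A minimal sketch of the two rules follows. The function names are hypothetical, and the backslash handling for fish goes beyond what the generator above does; it is an assumption about what a more defensive version would need.

```rust
/// Quote `s` for a fish single-quoted string. Inside single quotes fish
/// treats `\\` and `\'` as escapes, so backslashes are escaped first.
fn quote_fish(s: &str) -> String {
    format!("'{}'", s.replace('\\', "\\\\").replace('\'', "\\'"))
}

/// Quote `s` for a PowerShell single-quoted string, where an embedded
/// single quote is written as two single quotes.
fn quote_powershell(s: &str) -> String {
    format!("'{}'", s.replace('\'', "''"))
}
```

Keeping the escaping next to each shell's template, as the generators above do, avoids applying one shell's rule to another shell's output.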
--- a/crates/core/flags/complete/rg.zsh
+++ b/crates/core/flags/complete/rg.zsh
@@ -0,0 +1,661 @@
+#compdef rg
+
+##
+# zsh completion function for ripgrep
+#
+# Run ci/test-complete after building to ensure that the options supported by
+# this function stay in synch with the `rg` binary.
+#
+# For convenience, a completion reference guide is included at the bottom of
+# this file.
+#
+# Originally based on code from the zsh-users project — see copyright notice
+# below.
+
+_rg() {
+  local curcontext=$curcontext no='!' descr ret=1
+  local -a context line state state_descr args tmp suf
+  local -A opt_args
+
+  # ripgrep has many options which negate the effect of a more common one — for
+  # example, `--no-column` to negate `--column`, and `--messages` to negate
+  # `--no-messages`. There are so many of these, and they're so infrequently
+  # used, that some users will probably find it irritating if they're completed
+  # indiscriminately, so let's not do that unless either the current prefix
+  # matches one of those negation options or the user has the `complete-all`
+  # style set. Note that this prefix check has to be updated manually to account
+  # for all of the potential negation options listed below!
+  if
+    # We also want to list all of these options during testing
+    [[ $_RG_COMPLETE_LIST_ARGS == (1|t*|y*) ]] ||
+    # (--[imnp]* => --ignore*, --messages, --no-*, --pcre2-unicode)
+    [[ $PREFIX$SUFFIX == --[imnp]* ]] ||
+    zstyle -t ":completion:${curcontext}:" complete-all
+  then
+    no=
+  fi
+
+  # We make heavy use of argument groups here to prevent the option specs from
+  # growing unwieldy. These aren't supported in zsh <5.4, though, so we'll strip
+  # them out below if necessary. This makes the exclusions inaccurate on those
+  # older versions, but oh well — it's not that big a deal
+  args=(
+    + '(exclusive)' # Misc. fully exclusive options
+    '(: * -)'{-h,--help}'[display help information]'
+    '(: * -)'{-V,--version}'[display version information]'
+    '(: * -)'--pcre2-version'[print the version of PCRE2 used by ripgrep, if available]'
+
+    + '(buffered)' # buffering options
+    '--line-buffered[force line buffering]'
+    $no"--no-line-buffered[don't force line buffering]"
+    '--block-buffered[force block buffering]'
+    $no"--no-block-buffered[don't force block buffering]"
+
+    + '(case)' # Case-sensitivity options
+    {-i,--ignore-case}'[search case-insensitively]'
+    {-s,--case-sensitive}'[search case-sensitively]'
+    {-S,--smart-case}'[search case-insensitively if pattern is all lowercase]'
+
+    + '(context-a)' # Context (after) options
+    '(context-c)'{-A+,--after-context=}'[specify lines to show after each match]:number of lines'
+
+    + '(context-b)' # Context (before) options
+    '(context-c)'{-B+,--before-context=}'[specify lines to show before each match]:number of lines'
+
+    + '(context-c)' # Context (combined) options
+    '(context-a context-b)'{-C+,--context=}'[specify lines to show before and after each match]:number of lines'
+
+    + '(column)' # Column options
+    '--column[show column numbers for matches]'
+    $no"--no-column[don't show column numbers for matches]"
+
+    + '(count)' # Counting options
+    {-c,--count}'[only show count of matching lines for each file]'
+    '--count-matches[only show count of individual matches for each file]'
+    '--include-zero[include files with zero matches in summary]'
+    $no"--no-include-zero[don't include files with zero matches in summary]"
+
+    + '(encoding)' # Encoding options
+    {-E+,--encoding=}'[specify text encoding of files to search]: :_rg_encodings'
+    $no'--no-encoding[use default text encoding]'
+
+    + '(engine)' # Engine choice options
+    '--engine=[select which regex engine to use]:when:((
+      default\:"use default engine"
+      pcre2\:"identical to --pcre2"
+      auto\:"identical to --auto-hybrid-regex"
+    ))'
+
+    + file # File-input options
+    '(1)*'{-f+,--file=}'[specify file containing patterns to search for]: :_files'
+
+    + '(file-match)' # Files with/without match options
+    '(stats)'{-l,--files-with-matches}'[only show names of files with matches]'
+    '(stats)--files-without-match[only show names of files without matches]'
+
+    + '(file-name)' # File-name options
+    {-H,--with-filename}'[show file name for matches]'
+    {-I,--no-filename}"[don't show file name for matches]"
+
+    + '(file-system)' # File system options
+    "--one-file-system[don't descend into directories on other file systems]"
+    $no'--no-one-file-system[descend into directories on other file systems]'
+
+    + '(fixed)' # Fixed-string options
+    {-F,--fixed-strings}'[treat pattern as literal string instead of regular expression]'
+    $no"--no-fixed-strings[don't treat pattern as literal string]"
+
+    + '(follow)' # Symlink-following options
+    {-L,--follow}'[follow symlinks]'
+    $no"--no-follow[don't follow symlinks]"
+
+    + '(generate)' # Options for generating ancillary data
+    '--generate=[generate man page or completion scripts]:when:((
+      man\:"man page"
+      complete-bash\:"shell completions for bash"
+      complete-zsh\:"shell completions for zsh"
+      complete-fish\:"shell completions for fish"
+      complete-powershell\:"shell completions for PowerShell"
+    ))'
+
+    + glob # File-glob options
+    '*'{-g+,--glob=}'[include/exclude files matching specified glob]:glob'
+    '*--iglob=[include/exclude files matching specified case-insensitive glob]:glob'
+
+    + '(glob-case-insensitive)' # File-glob case sensitivity options
+    '--glob-case-insensitive[treat -g/--glob patterns case insensitively]'
+    $no'--no-glob-case-insensitive[treat -g/--glob patterns case sensitively]'
+
+    + '(heading)' # Heading options
+    '(pretty-vimgrep)--heading[show matches grouped by file name]'
+    "(pretty-vimgrep)--no-heading[don't show matches grouped by file name]"
+
+    + '(hidden)' # Hidden-file options
+    {-.,--hidden}'[search hidden files and directories]'
+    $no"--no-hidden[don't search hidden files and directories]"
+
+    + '(hybrid)' # hybrid regex options
+    '--auto-hybrid-regex[DEPRECATED: dynamically use PCRE2 if necessary]'
+    $no"--no-auto-hybrid-regex[DEPRECATED: don't dynamically use PCRE2 if necessary]"
+
+    + '(ignore)' # Ignore-file options
+    "(--no-ignore-global --no-ignore-parent --no-ignore-vcs --no-ignore-dot)--no-ignore[don't respect ignore files]"
+    $no'(--ignore-global --ignore-parent --ignore-vcs --ignore-dot)--ignore[respect ignore files]'
+
+    + '(ignore-file-case-insensitive)' # Ignore-file case sensitivity options
+    '--ignore-file-case-insensitive[process ignore files case insensitively]'
+    $no'--no-ignore-file-case-insensitive[process ignore files case sensitively]'
+
+    + '(ignore-exclude)' # Local exclude (ignore)-file options
+    "--no-ignore-exclude[don't respect local exclude (ignore) files]"
+    $no'--ignore-exclude[respect local exclude (ignore) files]'
+
+    + '(ignore-global)' # Global ignore-file options
+    "--no-ignore-global[don't respect global ignore files]"
+    $no'--ignore-global[respect global ignore files]'
+
+    + '(ignore-parent)' # Parent ignore-file options
+    "--no-ignore-parent[don't respect ignore files in parent directories]"
+    $no'--ignore-parent[respect ignore files in parent directories]'
+
+    + '(ignore-vcs)' # VCS ignore-file options
+    "--no-ignore-vcs[don't respect version control ignore files]"
+    $no'--ignore-vcs[respect version control ignore files]'
+
+    + '(require-git)' # git specific settings
+    "--no-require-git[don't require git repository to respect gitignore rules]"
+    $no'--require-git[require git repository to respect gitignore rules]'
+
+    + '(ignore-dot)' # .ignore options
+    "--no-ignore-dot[don't respect .ignore files]"
+    $no'--ignore-dot[respect .ignore files]'
+
+    + '(ignore-files)' # custom global ignore file options
+    "--no-ignore-files[don't respect --ignore-file flags]"
+    $no'--ignore-files[respect --ignore-file files]'
+
+    + '(json)' # JSON options
+    '--json[output results in JSON Lines format]'
+    $no"--no-json[don't output results in JSON Lines format]"
+
+    + '(line-number)' # Line-number options
+    {-n,--line-number}'[show line numbers for matches]'
+    {-N,--no-line-number}"[don't show line numbers for matches]"
+
+    + '(line-terminator)' # Line-terminator options
+    '--crlf[use CRLF as line terminator]'
+    $no"--no-crlf[don't use CRLF as line terminator]"
+    '(text)--null-data[use NUL as line terminator]'
+
+    + '(max-columns-preview)' # max column preview options
+    '--max-columns-preview[show preview for long lines (with -M)]'
+    $no"--no-max-columns-preview[don't show preview for long lines (with -M)]"
+
+    + '(max-depth)' # Directory-depth options
+    '--max-depth=[specify max number of directories to descend]:number of directories'
+    '--maxdepth=[alias for --max-depth]:number of directories'
+    '!--maxdepth=:number of directories'
+
+    + '(messages)' # Error-message options
+    '(--no-ignore-messages)--no-messages[suppress some error messages]'
+    $no"--messages[don't suppress error messages affected by --no-messages]"
+
+    + '(messages-ignore)' # Ignore-error message options
+    "--no-ignore-messages[don't show ignore-file parse error messages]"
+    $no'--ignore-messages[show ignore-file parse error messages]'
+
+    + '(mmap)' # mmap options
+    '--mmap[search using memory maps when possible]'
+    "--no-mmap[don't search using memory maps]"
+
+    + '(multiline)' # Multiline options
+    {-U,--multiline}'[permit matching across multiple lines]'
+    $no'(multiline-dotall)--no-multiline[restrict matches to at most one line each]'
+
+    + '(multiline-dotall)' # Multiline DOTALL options
+    '(--no-multiline)--multiline-dotall[allow "." to match newline (with -U)]'
+    $no"(--no-multiline)--no-multiline-dotall[don't allow \".\" to match newline (with -U)]"
+
+    + '(only)' # Only-match options
+    {-o,--only-matching}'[show only matching part of each line]'
+
+    + '(passthru)' # Pass-through options
+    '(--vimgrep)--passthru[show both matching and non-matching lines]'
+    '(--vimgrep)--passthrough[alias for --passthru]'
+
+    + '(pcre2)' # PCRE2 options
+    {-P,--pcre2}'[enable matching with PCRE2]'
+    $no'(pcre2-unicode)--no-pcre2[disable matching with PCRE2]'
+
+    + '(pcre2-unicode)' # PCRE2 Unicode options
+    $no'(--no-pcre2 --no-pcre2-unicode)--pcre2-unicode[DEPRECATED: enable PCRE2 Unicode mode (with -P)]'
+    '(--no-pcre2 --pcre2-unicode)--no-pcre2-unicode[DEPRECATED: disable PCRE2 Unicode mode (with -P)]'
+
+    + '(pre)' # Preprocessing options
+    '(-z --search-zip)--pre=[specify preprocessor utility]:preprocessor utility:_command_names -e'
+    $no'--no-pre[disable preprocessor utility]'
+
+    + pre-glob # Preprocessing glob options
+    '*--pre-glob[include/exclude files for preprocessing with --pre]'
+
+    + '(pretty-vimgrep)' # Pretty/vimgrep display options
+    '(heading)'{-p,--pretty}'[alias for --color=always --heading -n]'
+    '(heading passthru)--vimgrep[show results in vim-compatible format]'
+
+    + regexp # Explicit pattern options
+    '(1 file)*'{-e+,--regexp=}'[specify pattern]:pattern'
+
+    + '(replace)' # Replacement options
+    {-r+,--replace=}'[specify string used to replace matches]:replace string'
+
+    + '(sort)' # File-sorting options
+    '(threads)--sort=[sort results in ascending order (disables parallelism)]:sort method:((
+      none\:"no sorting"
+      path\:"sort by file path"
+      modified\:"sort by last modified time"
+      accessed\:"sort by last accessed time"
+      created\:"sort by creation time"
+    ))'
+    '(threads)--sortr=[sort results in descending order (disables parallelism)]:sort method:((
+      none\:"no sorting"
+      path\:"sort by file path"
+      modified\:"sort by last modified time"
+      accessed\:"sort by last accessed time"
+      created\:"sort by creation time"
+    ))'
+    '(threads)--sort-files[DEPRECATED: sort results by file path (disables parallelism)]'
+    $no"--no-sort-files[DEPRECATED: do not sort results]"
+
+    + '(stats)' # Statistics options
+    '(--files file-match)--stats[show search statistics]'
+    $no"--no-stats[don't show search statistics]"
+
+    + '(text)' # Binary-search options
+    {-a,--text}'[search binary files as if they were text]'
+    "--binary[search binary files, don't print binary data]"
+    $no"--no-binary[don't search binary files]"
+    $no"(--null-data)--no-text[don't search binary files as if they were text]"
+
+    + '(threads)' # Thread-count options
+    '(sort)'{-j+,--threads=}'[specify approximate number of threads to use]:number of threads'
+
+    + '(trim)' # Trim options
+    '--trim[trim any ASCII whitespace prefix from each line]'
+    $no"--no-trim[don't trim ASCII whitespace prefix from each line]"
+
+    + type # Type options
+    '*'{-t+,--type=}'[only search files matching specified type]: :_rg_types'
+    '*--type-add=[add new glob for specified file type]: :->typespec'
+    '*--type-clear=[clear globs previously defined for specified file type]: :_rg_types'
+    # This should actually be exclusive with everything but other type options
+    '(: *)--type-list[show all supported file types and their associated globs]'
+    '*'{-T+,--type-not=}"[don't search files matching specified file type]: :_rg_types"
+
+    + '(word-line)' # Whole-word/line match options
+    {-w,--word-regexp}'[only show matches surrounded by word boundaries]'
+    {-x,--line-regexp}'[only show matches surrounded by line boundaries]'
+
+    + '(unicode)' # Unicode options
+    $no'--unicode[enable Unicode mode]'
+    '--no-unicode[disable Unicode mode]'
+
+    + '(zip)' # Compression options
+    '(--pre)'{-z,--search-zip}'[search in compressed files]'
+    $no"--no-search-zip[don't search in compressed files]"
+
+    + misc # Other options — no need to separate these at the moment
+    '(-b --byte-offset)'{-b,--byte-offset}'[show 0-based byte offset for each matching line]'
+    $no"--no-byte-offset[don't show byte offsets for each matching line]"
+    '--color=[specify when to use colors in output]:when:((
+      never\:"never use colors"
+      auto\:"use colors or not based on stdout, TERM, etc."
+      always\:"always use colors"
+      ansi\:"always use ANSI colors (even on Windows)"
+    ))'
+    '*--colors=[specify color and style settings]: :->colorspec'
+    '--context-separator=[specify string used to separate non-continuous context lines in output]:separator'
+    $no"--no-context-separator[don't print context separators]"
+    '--debug[show debug messages]'
+    '--field-context-separator[set string to delimit fields in context lines]'
+    '--field-match-separator[set string to delimit fields in matching lines]'
+    '--hostname-bin=[executable for getting system hostname]:hostname executable:_command_names -e'
+    '--hyperlink-format=[specify pattern for hyperlinks]:pattern'
+    '--trace[show more verbose debug messages]'
+    '--dfa-size-limit=[specify upper size limit of generated DFA]:DFA size (bytes)'
+    "(1 stats)--files[show each file that would be searched (but don't search)]"
+    '*--ignore-file=[specify additional ignore file]:ignore file:_files'
+    '(-v --invert-match)'{-v,--invert-match}'[invert matching]'
+    $no"--no-invert-match[do not invert matching]"
+    '(-M --max-columns)'{-M+,--max-columns=}'[specify max length of lines to print]:number of bytes'
+    '(-m --max-count)'{-m+,--max-count=}'[specify max number of matches per file]:number of matches'
+    '--max-filesize=[specify size above which files should be ignored]:file size (bytes)'
+    "--no-config[don't load configuration files]"
+    '(-0 --null)'{-0,--null}'[print NUL byte after file names]'
+    '--path-separator=[specify path separator to use when printing file names]:separator'
+    '(-q --quiet)'{-q,--quiet}'[suppress normal output]'
+    '--regex-size-limit=[specify upper size limit of compiled regex]:regex size (bytes)'
+    '*'{-u,--unrestricted}'[reduce level of "smart" searching]'
+    '--stop-on-nonmatch[stop on first non-matching line after a matching one]'
+
+    + operand # Operands
+    '(--files --type-list file regexp)1: :_guard "^-*" pattern'
+    '(--type-list)*: :_files'
+  )
+
+  # This is used with test-complete to verify that there are no options
+  # listed in the help output that aren't also defined here
+  [[ $_RG_COMPLETE_LIST_ARGS == (1|t*|y*) ]] && {
+    print -rl - $args
+    return 0
+  }
+
+  # Strip out argument groups where unsupported (see above)
+  [[ $ZSH_VERSION == (4|5.<0-3>)(.*)# ]] &&
+  args=( ${(@)args:#(#i)(+|[a-z0-9][a-z0-9_-]#|\([a-z0-9][a-z0-9_-]#\))} )
+
+  _arguments -C -s -S : $args && ret=0
+
+  case $state in
+    colorspec)
+      if [[ ${IPREFIX#--*=}$PREFIX == [^:]# ]]; then
+        suf=( -qS: )
+        tmp=(
+          'column:specify coloring for column numbers'
+          'line:specify coloring for line numbers'
+          'match:specify coloring for match text'
+          'path:specify coloring for file names'
+        )
+        descr='color/style type'
+      elif [[ ${IPREFIX#--*=}$PREFIX == (column|line|match|path):[^:]# ]]; then
+        suf=( -qS: )
+        tmp=(
+          'none:clear color/style for type'
+          'bg:specify background color'
+          'fg:specify foreground color'
+          'style:specify text style'
+        )
+        descr='color/style attribute'
+      elif [[ ${IPREFIX#--*=}$PREFIX == [^:]##:(bg|fg):[^:]# ]]; then
+        tmp=( black blue green red cyan magenta yellow white )
+        descr='color name or r,g,b'
+      elif [[ ${IPREFIX#--*=}$PREFIX == [^:]##:style:[^:]# ]]; then
+        tmp=( {,no}bold {,no}intense {,no}underline )
+        descr='style name'
+      else
+        _message -e colorspec 'no more arguments'
+      fi
+
+      (( $#tmp )) && {
+        compset -P '*:'
+        _describe -t colorspec $descr tmp $suf && ret=0
+      }
+      ;;
+
+    typespec)
+      if compset -P '[^:]##:include:'; then
+        _sequence -s , _rg_types && ret=0
+      # @todo This bit in particular could be better, but it's a little
+      # complex, and attempting to solve it seems to run us up against a crash
+      # bug — zsh # 40362
+      elif compset -P '[^:]##:'; then
+        _message 'glob or include directive' && ret=1
+      elif [[ ! -prefix *:* ]]; then
+        _rg_types -qS : && ret=0
+      fi
+      ;;
+  esac
+
+  return ret
+}
+
+# Complete encodings
+_rg_encodings() {
+  local -a expl
+  local -aU _encodings
+
+  # This is impossible to read, but these encodings rarely if ever change, so it
+  # probably doesn't matter. They are derived from the list given here:
+  # https://encoding.spec.whatwg.org/#concept-encoding-get
+  _encodings=(
+    {{,us-}ascii,arabic,chinese,cyrillic,greek{,8},hebrew,korean}
+    logical visual mac {,cs}macintosh x-mac-{cyrillic,roman,ukrainian}
+    866 ibm{819,866} csibm866
+    big5{,-hkscs} {cn-,cs}big5 x-x-big5
+    cp{819,866,125{0..8}} x-cp125{0..8}
+    csiso2022{jp,kr} csiso8859{6,8}{e,i}
+    csisolatin{{1..6},9} csisolatin{arabic,cyrillic,greek,hebrew}
+    ecma-{114,118} asmo-708 elot_928 sun_eu_greek
+    euc-{jp,kr} x-euc-jp cseuckr cseucpkdfmtjapanese
+    {,x-}gbk csiso58gb231280 gb18030 {,cs}gb2312 gb_2312{,-80} hz-gb-2312
+    iso-2022-{cn,cn-ext,jp,kr}
+    iso8859{,-}{{1..11},13,14,15}
+    iso-8859-{{1..11},{6,8}-{e,i},13,14,15,16} iso_8859-{{1..9},15}
+    iso_8859-{1,2,6,7}:1987 iso_8859-{3,4,5,8}:1988 iso_8859-9:1989
+    iso-ir-{58,100,101,109,110,126,127,138,144,148,149,157}
+    koi{,8,8-r,8-ru,8-u,8_r} cskoi8r
+    ks_c_5601-{1987,1989} ksc{,_}5691 csksc56011987
+    latin{1..6} l{{1..6},9}
+    shift{-,_}jis csshiftjis {,x-}sjis ms_kanji ms932
+    utf{,-}8 utf-16{,be,le} unicode-1-1-utf-8
+    windows-{31j,874,949,125{0..8}} dos-874 tis-620 ansi_x3.4-1968
+    x-user-defined auto none
+  )
+
+  _wanted encodings expl encoding compadd -a "$@" - _encodings
+}
+
+# Complete file types
+_rg_types() {
+  local -a expl
+  local -aU _types
+
+  _types=( ${(@)${(f)"$( _call_program types $words[1] --type-list )"}//:[[:space:]]##/:} )
+
+  if zstyle -t ":completion:${curcontext}:types" extra-verbose; then
+    _describe -t types 'file type' _types
+  else
+    _wanted types expl 'file type' compadd "$@" - ${(@)_types%%:*}
+  fi
+}
+
+_rg "$@"
+
+################################################################################
+# ZSH COMPLETION REFERENCE
+#
+# For the convenience of developers who aren't especially familiar with zsh
+# completion functions, a brief reference guide follows. This is in no way
+# comprehensive; it covers just enough of the basic structure, syntax, and
+# conventions to help someone make simple changes like adding new options. For
+# more complete documentation regarding zsh completion functions, please see the
+# following:
+#
+# * http://zsh.sourceforge.net/Doc/Release/Completion-System.html
+# * https://github.com/zsh-users/zsh/blob/master/Etc/completion-style-guide
+#
+# OVERVIEW
+#
+# Most zsh completion functions are defined in terms of `_arguments`, which is a
+# shell function that takes a series of argument specifications. The specs for
+# `rg` are stored in an array, which is common for more complex functions; the
+# elements of the array are passed to `_arguments` on invocation.
+#
+# ARGUMENT-SPECIFICATION SYNTAX
+#
+# The following is a contrived example of the argument specs for a simple tool:
+#
+#   '(: * -)'{-h,--help}'[display help information]'
+#   '(-q -v --quiet --verbose)'{-q,--quiet}'[decrease output verbosity]'
+#   '!(-q -v --quiet --verbose)--silent'
+#   '(-q -v --quiet --verbose)'{-v,--verbose}'[increase output verbosity]'
+#   '--color=[specify when to use colors]:when:(always never auto)'
+#   '*:example file:_files'
+#
+# Although there may appear to be six specs here, there are actually nine; we
+# use brace expansion to combine specs for options that go by multiple names,
+# like `-q` and `--quiet`. This is customary, and ties in with the fact that zsh
+# merges completion possibilities together when they have the same description.
+#
+# The first line defines the option `-h`/`--help`. With most tools, it isn't
+# useful to complete anything after `--help` because it effectively overrides
+# all others; the `(: * -)` at the beginning of the spec tells zsh not to
+# complete any other operands (`:` and `*`) or options (`-`) after this one has
+# been used. The `[...]` at the end associates a description with `-h`/`--help`;
+# as mentioned, zsh will see the identical descriptions and merge these options
+# together when offering completion possibilities.
+#
+# The next line defines `-q`/`--quiet`. Here we don't want to suppress further
+# completions entirely, but we don't want to offer `-q` if `--quiet` has been
+# given (since they do the same thing), nor do we want to offer `-v` (since it
+# doesn't make sense to be quiet and verbose at the same time). We don't need to
+# tell zsh not to offer `--quiet` a second time, since that's the default
+# behaviour, but since this line expands to two specs describing `-q` *and*
+# `--quiet` we do need to explicitly list all of them here.
+#
+# The next line defines a hidden option `--silent` — maybe it's a deprecated
+# synonym for `--quiet`. The leading `!` indicates that zsh shouldn't offer this
+# option during completion. The benefit of providing a spec for an option that
+# shouldn't be completed is that, if someone *does* use it, we can correctly
+# suppress completion of other options afterwards.
+#
+# The next line defines `-v`/`--verbose`; this works just like `-q`/`--quiet`.
+#
+# The next line defines `--color`. In this example, `--color` doesn't have a
+# corresponding short option, so we don't need to use brace expansion. Further,
+# there are no other options it's exclusive with (just itself), so we don't need
+# to define those at the beginning. However, it does take a mandatory argument.
+# The `=` at the end of `--color=` indicates that the argument may appear either
+# like `--color always` or like `--color=always`; this is how most GNU-style
+# command-line tools work. The corresponding short option would normally use `+`
+# — for example, `-c+` would allow either `-c always` or `-calways`. For this
+# option, the arguments are known ahead of time, so we can simply list them in
+# parentheses at the end (`when` is used as the description for the argument).
+#
+# The last line defines an operand (a non-option argument). In this example, the
+# operand can be used any number of times (the leading `*`), and it should be a
+# file path, so we tell zsh to call the `_files` function to complete it. The
+# `example file` in the middle is the description to use for this operand; we
+# could use a space instead to accept the default provided by `_files`.
+#
+# GROUPING ARGUMENT SPECIFICATIONS
+#
+# Newer versions of zsh support grouping argument specs together. All specs
+# following a `+` and then a group name are considered to be members of the
+# named group. Grouping is useful mostly for organisational purposes; it makes
+# the relationship between different options more obvious, and makes it easier
+# to specify exclusions.
+#
+# We could rewrite our example above using grouping as follows:
+#
+#   '(: * -)'{-h,--help}'[display help information]'
+#   '--color=[specify when to use colors]:when:(always never auto)'
+#   '*:example file:_files'
+#   + '(verbosity)'
+#   {-q,--quiet}'[decrease output verbosity]'
+#   '!--silent'
+#   {-v,--verbose}'[increase output verbosity]'
+#
+# Here we take advantage of a useful feature of spec grouping — when the group
+# name is surrounded by parentheses, as in `(verbosity)`, it tells zsh that all
+# of the options in that group are exclusive with each other. As a result, we
+# don't need to manually list out the exclusions at the beginning of each
+# option.
+#
+# Groups can also be referred to by name in other argument specs; for example:
+#
+#   '(xyz)--aaa' '*: :_files'
+#   + xyz --xxx --yyy --zzz
+#
+# Here we use the group name `xyz` to tell zsh that `--xxx`, `--yyy`, and
+# `--zzz` are not to be completed after `--aaa`. This makes the exclusion list
+# much more compact and reusable.
+#
+# CONVENTIONS
+#
+# zsh completion functions generally adhere to the following conventions:
+#
+# * Use two spaces for indentation
+# * Combine specs for options with different names using brace expansion
+# * In combined specs, list the short option first (as in `{-a,--text}`)
+# * Use `+` or `=` as described above for options that take arguments
+# * Provide a description for all options, option-arguments, and operands
+# * Capitalise/punctuate argument descriptions as phrases, not complete
+#   sentences — 'display help information', never 'Display help information.'
+#   (but still capitalise acronyms and proper names)
+# * Write argument descriptions as verb phrases — 'display x', 'enable y',
+#   'use z'
+# * Word descriptions to make it clear when an option expects an argument;
+#   usually this is done with the word 'specify', as in 'specify x' or
+#   'use specified x'
+# * Write argument descriptions as tersely as possible — for example, articles
+#   like 'a' and 'the' should be omitted unless it would be confusing
+#
+# Other conventions currently used by this function:
+#
+# * Order argument specs alphabetically by group name, then option name
+# * Group options that are directly related, mutually exclusive, or frequently
+#   referenced by other argument specs
+# * Use only characters in the set [a-z0-9_-] in group names
+# * Order exclusion lists as follows: short options, long options, groups
+# * Use American English in descriptions
+# * Use 'don't' in descriptions instead of 'do not'
+# * Word descriptions for related options as similarly as possible. For example,
+#   `--foo[enable foo]` and `--no-foo[disable foo]`, or `--foo[use foo]` and
+#   `--no-foo[don't use foo]`
+# * Word descriptions to make it clear when an option only makes sense with
+#   another option, usually by adding '(with -x)' to the end
+# * Don't quote strings or variables unnecessarily. When quotes are required,
+#   prefer single-quotes to double-quotes
+# * Prefix option specs with `$no` when the option serves only to negate the
+#   behaviour of another option that must be provided explicitly by the user.
+#   This prevents rarely used options from cluttering up the completion menu
+################################################################################
+
+# ------------------------------------------------------------------------------
+# Copyright (c) 2011 Github zsh-users - http://github.com/zsh-users
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions are met:
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in the
+#       documentation and/or other materials provided with the distribution.
+#     * Neither the name of the zsh-users nor the
+#       names of its contributors may be used to endorse or promote products
+#       derived from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
+# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+# WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+# DISCLAIMED. IN NO EVENT SHALL ZSH-USERS BE LIABLE FOR ANY
+# DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+# (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+# LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+# ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+# SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+# ------------------------------------------------------------------------------
+# Description
+# -----------
+#
+#  Completion script for ripgrep
+#
+# ------------------------------------------------------------------------------
+# Authors
+# -------
+#
+#  * arcizan <ghostrevery@gmail.com>
+#  * MaskRay <i@maskray.me>
+#
+# ------------------------------------------------------------------------------
+
+# Local Variables:
+# mode: shell-script
+# coding: utf-8-unix
+# indent-tabs-mode: nil
+# sh-indentation: 2
+# sh-basic-offset: 2
+# End:
+# vim: ft=zsh sw=2 ts=2 et
--- a/crates/core/flags/complete/zsh.rs
+++ b/crates/core/flags/complete/zsh.rs
@@ -0,0 +1,23 @@
+/*!
+Provides completions for ripgrep's CLI for the zsh shell.
+
+Unlike completion scripts for other shells (at time of writing), zsh's
+completions for ripgrep are maintained by hand. This is because:
+
+1. They are lovingly written by an expert in such things.
+2. They are much higher in quality than the ones that are auto-generated.
+Namely, the zsh completions take application level context about flag
+compatibility into account.
+3. There is a CI script that fails if a new flag is added to ripgrep that
+isn't included in the zsh completions.
+4. There is a wealth of documentation in the zsh script explaining how it
+works and how it can be extended.
+
+In principle, I'd be open to maintaining any completion script by hand so
+long as it meets criteria 3 and 4 above.
+*/
+
+/// Generate completions for zsh.
+pub(crate) fn generate() -> String {
+    include_str!("rg.zsh").to_string()
+}
--- a/crates/core/flags/config.rs
+++ b/crates/core/flags/config.rs
@@ -1,6 +1,9 @@
-// This module provides routines for reading ripgrep config "rc" files. The
-// primary output of these routines is a sequence of arguments, where each
-// argument corresponds precisely to one shell argument.
+/*!
+This module provides routines for reading ripgrep config "rc" files.
+
+The primary output of these routines is a sequence of arguments, where each
+argument corresponds precisely to one shell argument.
+*/

 use std::{
    ffi::OsString,
--- a/crates/core/flags/defs.rs
+++ b/crates/core/flags/defs.rs
--- a/crates/core/flags/doc/help.rs
+++ b/crates/core/flags/doc/help.rs
@@ -0,0 +1,259 @@
+/*!
+Provides routines for generating ripgrep's "short" and "long" help
+documentation.
+
+The short version is used when the `-h` flag is given, while the long version
+is used when the `--help` flag is given.
+*/
+
+use std::{collections::BTreeMap, fmt::Write};
+
+use crate::flags::{defs::FLAGS, doc::version, Category, Flag};
+
+const TEMPLATE_SHORT: &'static str = include_str!("template.short.help");
+const TEMPLATE_LONG: &'static str = include_str!("template.long.help");
+
+/// Wraps `std::write!` and asserts there is no failure.
+///
+/// We only write to `String` in this module.
+macro_rules! write {
+    ($($tt:tt)*) => { std::write!($($tt)*).unwrap(); }
+}
+
+/// Generate short documentation, i.e., for `-h`.
+pub(crate) fn generate_short() -> String {
+    let mut cats: BTreeMap<Category, (Vec<String>, Vec<String>)> =
+        BTreeMap::new();
+    let (mut maxcol1, mut maxcol2) = (0, 0);
+    for flag in FLAGS.iter().copied() {
+        let columns =
+            cats.entry(flag.doc_category()).or_insert((vec![], vec![]));
+        let (col1, col2) = generate_short_flag(flag);
+        maxcol1 = maxcol1.max(col1.len());
+        maxcol2 = maxcol2.max(col2.len());
+        columns.0.push(col1);
+        columns.1.push(col2);
+    }
+    let mut out =
+        TEMPLATE_SHORT.replace("!!VERSION!!", &version::generate_digits());
+    for (cat, (col1, col2)) in cats.iter() {
+        let var = format!("!!{name}!!", name = cat.as_str());
+        let val = format_short_columns(col1, col2, maxcol1, maxcol2);
+        out = out.replace(&var, &val);
+    }
+    out
+}
+
+/// Generate short documentation for a single flag.
+///
+/// The first element corresponds to the flag name while the second element
+/// corresponds to the documentation string.
+fn generate_short_flag(flag: &dyn Flag) -> (String, String) {
+    let (mut col1, mut col2) = (String::new(), String::new());
+
+    // Some of the variable names are fine for longer form
+    // docs, but they make the succinct short help very noisy.
+    // So just shorten some of them.
+    let var = flag.doc_variable().map(|s| {
+        let mut s = s.to_string();
+        s = s.replace("SEPARATOR", "SEP");
+        s = s.replace("REPLACEMENT", "TEXT");
+        s = s.replace("NUM+SUFFIX?", "NUM");
+        s
+    });
+
+    // Generate the first column, the flag name.
+    if let Some(byte) = flag.name_short() {
+        let name = char::from(byte);
+        write!(col1, r"-{name}");
+        write!(col1, r", ");
+    }
+    write!(col1, r"--{name}", name = flag.name_long());
+    if let Some(var) = var.as_ref() {
+        write!(col1, r"={var}");
+    }
+
+    // And now the second column, with the description.
+    write!(col2, "{}", flag.doc_short());
+
+    (col1, col2)
+}
+
+/// Write two columns of documentation.
+///
+/// `maxcol1` should be the maximum length (in bytes) of the first column,
+/// while `maxcol2` should be the maximum length (in bytes) of the second
+/// column.
+fn format_short_columns(
+    col1: &[String],
+    col2: &[String],
+    maxcol1: usize,
+    _maxcol2: usize,
+) -> String {
+    assert_eq!(col1.len(), col2.len(), "columns must have equal length");
+    const PAD: usize = 2;
+    let mut out = String::new();
+    for (i, (c1, c2)) in col1.iter().zip(col2.iter()).enumerate() {
+        if i > 0 {
+            write!(out, "\n");
+        }
+
+        let pad = maxcol1 - c1.len() + PAD;
+        write!(out, "  ");
+        write!(out, "{c1}");
+        write!(out, "{}", " ".repeat(pad));
+        write!(out, "{c2}");
+    }
+    out
+}
+
+/// Generate long documentation, i.e., for `--help`.
+pub(crate) fn generate_long() -> String {
+    let mut cats = BTreeMap::new();
+    for flag in FLAGS.iter().copied() {
+        let mut cat = cats.entry(flag.doc_category()).or_insert(String::new());
+        if !cat.is_empty() {
+            write!(cat, "\n\n");
+        }
+        generate_long_flag(flag, &mut cat);
+    }
+
+    let mut out =
+        TEMPLATE_LONG.replace("!!VERSION!!", &version::generate_digits());
+    for (cat, value) in cats.iter() {
+        let var = format!("!!{name}!!", name = cat.as_str());
+        out = out.replace(&var, value);
+    }
+    out
+}
+
+/// Write generated documentation for `flag` to `out`.
+fn generate_long_flag(flag: &dyn Flag, out: &mut String) {
+    if let Some(byte) = flag.name_short() {
+        let name = char::from(byte);
+        write!(out, r"    -{name}");
+        if let Some(var) = flag.doc_variable() {
+            write!(out, r" {var}");
+        }
+        write!(out, r", ");
+    } else {
+        write!(out, r"    ");
+    }
+
+    let name = flag.name_long();
+    write!(out, r"--{name}");
+    if let Some(var) = flag.doc_variable() {
+        write!(out, r"={var}");
+    }
+    write!(out, "\n");
+
+    let doc = flag.doc_long().trim();
+    let doc = super::render_custom_markup(doc, "flag", |name, out| {
+        let Some(flag) = crate::flags::parse::lookup(name) else {
+            unreachable!(r"found unrecognized \flag{{{name}}} in --help docs")
+        };
+        if let Some(name) = flag.name_short() {
+            write!(out, r"-{}/", char::from(name));
+        }
+        write!(out, r"--{}", flag.name_long());
+    });
+    let doc = super::render_custom_markup(&doc, "flag-negate", |name, out| {
+        let Some(flag) = crate::flags::parse::lookup(name) else {
+            unreachable!(
+                r"found unrecognized \flag-negate{{{name}}} in --help docs"
+            )
+        };
+        let Some(name) = flag.name_negated() else {
+            let long = flag.name_long();
+            unreachable!(
+                "found \\flag-negate{{{long}}} in --help docs but \
+                 {long} does not have a negation"
+            );
+        };
+        write!(out, r"--{name}");
+    });
+
+    let mut cleaned = remove_roff(&doc);
+    if let Some(negated) = flag.name_negated() {
+        // Flags that can be negated that aren't switches, like
+        // --context-separator, are somewhat weird. Because of that, the docs
+        // for those flags should discuss the semantics of negation explicitly.
+        // But for switches, the behavior is always the same.
+        if flag.is_switch() {
+            write!(cleaned, "\n\nThis flag can be disabled with --{negated}.");
+        }
+    }
+    let indent = " ".repeat(8);
+    let wrapopts = textwrap::Options::new(71)
+        // Normally I'd be fine with breaking at hyphens, but ripgrep's docs
+        // include a lot of flag names, and they in turn contain hyphens.
+        // Breaking flag names across lines is not great.
+        .word_splitter(textwrap::WordSplitter::NoHyphenation);
+    for (i, paragraph) in cleaned.split("\n\n").enumerate() {
+        if i > 0 {
+            write!(out, "\n\n");
+        }
+        let mut new = paragraph.to_string();
+        if paragraph.lines().all(|line| line.starts_with("    ")) {
+            // Re-indent but don't refill so as to preserve line breaks
+            // in code/shell example snippets.
+            new = textwrap::indent(&new, &indent);
+        } else {
+            new = new.replace("\n", " ");
+            new = textwrap::refill(&new, &wrapopts);
+            new = textwrap::indent(&new, &indent);
+        }
+        write!(out, "{}", new.trim_end());
+    }
+}
+
+/// Removes roff syntax from `v` such that the result is approximately readable
+/// as plain text.
+///
+/// This is basically a mish mash of heuristics based on the specific roff used
+/// in the docs for the flags in this tool. If new kinds of roff are used in
+/// the docs, then this may need to be updated to handle them.
+fn remove_roff(v: &str) -> String {
+    let mut lines = vec![];
+    for line in v.trim().lines() {
+        assert!(!line.is_empty(), "roff should have no empty lines");
+        if line.starts_with(".") {
+            if line.starts_with(".IP ") {
+                let item_label = line
+                    .split(" ")
+                    .nth(1)
+                    .expect("first argument to .IP")
+                    .replace(r"\(bu", r"•")
+                    .replace(r"\fB", "")
+                    .replace(r"\fP", ":");
+                lines.push(item_label);
+            } else if line.starts_with(".IB ") || line.starts_with(".BI ") {
+                let pieces = line
+                    .split_whitespace()
+                    .skip(1)
+                    .collect::<Vec<_>>()
+                    .concat();
+                lines.push(pieces);
+            } else if line.starts_with(".sp")
+                || line.starts_with(".PP")
+                || line.starts_with(".TP")
+            {
+                lines.push("".to_string());
+            }
+        } else if line.starts_with(r"\fB") && line.ends_with(r"\fP") {
+            let line = line.replace(r"\fB", "").replace(r"\fP", "");
+            lines.push(format!("{line}:"));
+        } else {
+            lines.push(line.to_string());
+        }
+    }
+    // Squash multiple adjacent paragraph breaks into one.
+    lines.dedup_by(|l1, l2| l1.is_empty() && l2.is_empty());
+    lines
+        .join("\n")
+        .replace(r"\fB", "")
+        .replace(r"\fI", "")
+        .replace(r"\fP", "")
+        .replace(r"\-", "-")
+        .replace(r"\\", r"\")
+}
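The two-column layout produced by `format_short_columns` in the file above can be sketched in isolation as follows. This is a simplified illustration, not the code from the commit: the name `format_columns` is hypothetical, and it computes the column width from the rows it is given, whereas the real code receives a `maxcol1` computed across all flag categories so that every section lines up.

```rust
/// Hypothetical stand-alone version of the two-column layout.
/// Each row is (flag name, short description).
fn format_columns(rows: &[(&str, &str)]) -> String {
    // Gap between the widest first column and the second column.
    const PAD: usize = 2;
    // Assumption: width is taken from these rows only. The real code is
    // handed a maximum computed over every category.
    let width = rows.iter().map(|(c1, _)| c1.len()).max().unwrap_or(0);
    let mut out = String::new();
    for (i, (c1, c2)) in rows.iter().enumerate() {
        if i > 0 {
            out.push('\n');
        }
        // Two-space indent, then the flag, then enough spaces to reach
        // the description column.
        out.push_str("  ");
        out.push_str(c1);
        out.push_str(&" ".repeat(width - c1.len() + PAD));
        out.push_str(c2);
    }
    out
}
```

As in the original, lengths are measured in bytes, which is adequate as long as flag names are ASCII.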
--- a/crates/core/flags/doc/man.rs
+++ b/crates/core/flags/doc/man.rs
@@ -0,0 +1,110 @@
+/*!
+Provides routines for generating ripgrep's man page in `roff` format.
+*/
+
+use std::{collections::BTreeMap, fmt::Write};
+
+use crate::flags::{defs::FLAGS, doc::version, Flag};
+
+const TEMPLATE: &'static str = include_str!("template.rg.1");
+
+/// Wraps `std::write!` and asserts there is no failure.
+///
+/// We only write to `String` in this module.
+macro_rules! write {
+    ($($tt:tt)*) => { std::write!($($tt)*).unwrap(); }
+}
+
+/// Wraps `std::writeln!` and asserts there is no failure.
+///
+/// We only write to `String` in this module.
+macro_rules! writeln {
+    ($($tt:tt)*) => { std::writeln!($($tt)*).unwrap(); }
+}
+
+/// Returns a `roff` formatted string corresponding to ripgrep's entire man
+/// page.
+pub(crate) fn generate() -> String {
+    let mut cats = BTreeMap::new();
+    for flag in FLAGS.iter().copied() {
+        let mut cat = cats.entry(flag.doc_category()).or_insert(String::new());
+        if !cat.is_empty() {
+            writeln!(cat, ".sp");
+        }
+        generate_flag(flag, &mut cat);
+    }
+
+    let mut out = TEMPLATE.replace("!!VERSION!!", &version::generate_digits());
+    for (cat, value) in cats.iter() {
+        let var = format!("!!{name}!!", name = cat.as_str());
+        out = out.replace(&var, value);
+    }
+    out
+}
+
+/// Writes `roff` formatted documentation for `flag` to `out`.
+fn generate_flag(flag: &'static dyn Flag, out: &mut String) {
+    if let Some(byte) = flag.name_short() {
+        let name = char::from(byte);
+        write!(out, r"\fB\-{name}\fP");
+        if let Some(var) = flag.doc_variable() {
+            write!(out, r" \fI{var}\fP");
+        }
+        write!(out, r", ");
+    }
+
+    let name = flag.name_long();
+    write!(out, r"\fB\-\-{name}\fP");
+    if let Some(var) = flag.doc_variable() {
+        write!(out, r"=\fI{var}\fP");
+    }
+    write!(out, "\n");
+
+    writeln!(out, ".RS 4");
+    let doc = flag.doc_long().trim();
+    // Convert \flag{foo} into something nicer.
+    let doc = super::render_custom_markup(doc, "flag", |name, out| {
+        let Some(flag) = crate::flags::parse::lookup(name) else {
+            unreachable!(r"found unrecognized \flag{{{name}}} in roff docs")
+        };
+        out.push_str(r"\fB");
+        if let Some(name) = flag.name_short() {
+            write!(out, r"\-{}/", char::from(name));
+        }
+        write!(out, r"\-\-{}", flag.name_long());
+        out.push_str(r"\fP");
+    });
+    // Convert \flag-negate{foo} into something nicer.
+    let doc = super::render_custom_markup(&doc, "flag-negate", |name, out| {
+        let Some(flag) = crate::flags::parse::lookup(name) else {
+            unreachable!(
+                r"found unrecognized \flag-negate{{{name}}} in roff docs"
+            )
+        };
+        let Some(name) = flag.name_negated() else {
+            let long = flag.name_long();
+            unreachable!(
+                "found \\flag-negate{{{long}}} in roff docs but \
+                 {long} does not have a negation"
+            );
+        };
+        out.push_str(r"\fB");
+        write!(out, r"\-\-{name}");
+        out.push_str(r"\fP");
+    });
+    writeln!(out, "{doc}");
+    if let Some(negated) = flag.name_negated() {
+        // Flags that can be negated that aren't switches, like
+        // --context-separator, are somewhat weird. Because of that, the docs
+        // for those flags should discuss the semantics of negation explicitly.
+        // But for switches, the behavior is always the same.
+        if flag.is_switch() {
+            writeln!(out, ".sp");
+            writeln!(
+                out,
+                r"This flag can be disabled with \fB\-\-{negated}\fP."
+            );
+        }
+    }
+    writeln!(out, ".RE");
+}
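Both `help.rs` and `man.rs` fill their templates the same way: replace `!!VERSION!!`, then replace one `!!category!!` placeholder per flag category with the text accumulated for it. A minimal sketch of that substitution step, assuming a hypothetical `fill_template` helper and plain string keys in place of the `Category` type:

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: substitute `!!VERSION!!` and one `!!name!!`
/// placeholder per category. Keys are plain strings here; the commit
/// uses `Category::as_str()` to obtain the placeholder name.
fn fill_template(
    template: &str,
    version: &str,
    cats: &BTreeMap<&str, String>,
) -> String {
    let mut out = template.replace("!!VERSION!!", version);
    for (name, value) in cats {
        let var = format!("!!{name}!!");
        out = out.replace(&var, value);
    }
    out
}
```

A placeholder with no matching category is left in the output untouched, so a misspelled name in a template shows up verbatim in the generated text.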
--- a/crates/core/flags/doc/mod.rs
+++ b/crates/core/flags/doc/mod.rs
@@ -0,0 +1,38 @@
+/*!
+Modules for generating documentation for ripgrep's flags.
+*/
+
+pub(crate) mod help;
+pub(crate) mod man;
+pub(crate) mod version;
+
+/// Searches for `\tag{...}` occurrences in `doc` and calls `replacement` for
+/// each such tag found.
+///
+/// The first argument given to `replacement` is the tag value, `...`. The
+/// second argument is the buffer that accumulates the full replacement text.
+///
+/// Since this function is only intended to be used on doc strings written into
+/// the program source code, callers should panic in `replacement` if there are
+/// any errors or unexpected circumstances.
+fn render_custom_markup(
+    mut doc: &str,
+    tag: &str,
+    mut replacement: impl FnMut(&str, &mut String),
+) -> String {
+    let mut out = String::with_capacity(doc.len());
+    let tag_prefix = format!(r"\{tag}{{");
+    while let Some(offset) = doc.find(&tag_prefix) {
+        out.push_str(&doc[..offset]);
+
+        let start = offset + tag_prefix.len();
+        let Some(end) = doc[start..].find('}').map(|i| start + i) else {
+            unreachable!(r"found {tag_prefix} without closing }}");
+        };
+        let name = &doc[start..end];
+        replacement(name, &mut out);
+        doc = &doc[end + 1..];
+    }
+    out.push_str(doc);
+    out
+}
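To illustrate how `render_custom_markup` is driven by its callers, the sketch below reproduces the same scanning loop and pairs it with a hypothetical `short_name` table and `render_flags` wrapper. Those two names are assumptions made for this example; the real callers resolve names through `crate::flags::parse::lookup` and panic on unknown flags.

```rust
/// Same scanning strategy as `render_custom_markup` above, repeated so
/// that the example stands alone.
fn render_custom_markup(
    mut doc: &str,
    tag: &str,
    mut replacement: impl FnMut(&str, &mut String),
) -> String {
    let mut out = String::with_capacity(doc.len());
    // For tag = "flag" this is the literal text `\flag{`.
    let tag_prefix = format!(r"\{tag}{{");
    while let Some(offset) = doc.find(&tag_prefix) {
        // Copy everything before the tag verbatim.
        out.push_str(&doc[..offset]);
        let start = offset + tag_prefix.len();
        let end = doc[start..]
            .find('}')
            .map(|i| start + i)
            .expect("tag without closing brace");
        replacement(&doc[start..end], &mut out);
        // Continue scanning after the closing brace.
        doc = &doc[end + 1..];
    }
    out.push_str(doc);
    out
}

/// Hypothetical stand-in for the real flag lookup.
fn short_name(long: &str) -> Option<char> {
    match long {
        "ignore-case" => Some('i'),
        _ => None,
    }
}

/// Render `\flag{name}` as `-s/--name`, or `--name` when the flag has
/// no short form in the table above.
fn render_flags(doc: &str) -> String {
    render_custom_markup(doc, "flag", |name, out| {
        if let Some(short) = short_name(name) {
            out.push('-');
            out.push(short);
            out.push('/');
        }
        out.push_str("--");
        out.push_str(name);
    })
}
```

Because the prefix includes the opening brace, a pass for `flag` does not consume `\flag-negate{...}` tags, which is why the callers can run the two passes one after the other.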
--- a/crates/core/flags/doc/template.long.help
+++ b/crates/core/flags/doc/template.long.help
@@ -0,0 +1,61 @@
+ripgrep !!VERSION!!
+Andrew Gallant <jamslam@gmail.com>
+
+ripgrep (rg) recursively searches the current directory for a regex pattern.
+By default, ripgrep will respect gitignore rules and automatically skip hidden
+files/directories and binary files.
+
+Use -h for short descriptions and --help for more details.
+
+Project home page: https://github.com/BurntSushi/ripgrep
+
+USAGE:
+    rg [OPTIONS] PATTERN [PATH ...]
+    rg [OPTIONS] -e PATTERN ... [PATH ...]
+    rg [OPTIONS] -f PATTERNFILE ... [PATH ...]
+    rg [OPTIONS] --files [PATH ...]
+    rg [OPTIONS] --type-list
+    command | rg [OPTIONS] PATTERN
+    rg [OPTIONS] --help
+    rg [OPTIONS] --version
+
+POSITIONAL ARGUMENTS:
+    <PATTERN>
+        A regular expression used for searching. To match a pattern beginning
+        with a dash, use the -e/--regexp flag.
+
+        For example, to search for the literal '-foo', you can use this flag:
+
+            rg -e -foo
+
+        You can also use the special '--' delimiter to indicate that no more
+        flags will be provided. Namely, the following is equivalent to the
+        above:
+
+            rg -- -foo
+
+    <PATH>...
+        A file or directory to search. Directories are searched recursively.
+        File paths specified on the command line override glob and ignore
+        rules.
+
+INPUT OPTIONS:
+!!input!!
+
+SEARCH OPTIONS:
+!!search!!
+
+FILTER OPTIONS:
+!!filter!!
+
+OUTPUT OPTIONS:
+!!output!!
+
+OUTPUT MODES:
+!!output-modes!!
+
+LOGGING OPTIONS:
+!!logging!!
+
+OTHER BEHAVIORS:
+!!other-behaviors!!
--- a/crates/core/flags/doc/template.rg.1
+++ b/crates/core/flags/doc/template.rg.1
@@ -0,0 +1,415 @@
+.TH RG 1 2023-11-13 "!!VERSION!!" "User Commands"
+.
+.
+.SH NAME
+rg \- recursively search the current directory for lines matching a pattern
+.
+.
+.SH SYNOPSIS
+.\" I considered using GNU troff's .SY and .YS "synopsis" macros here, but it
+.\" looks like they aren't portable. Specifically, they don't appear to be in
+.\" BSD's mdoc used on macOS.
+.sp
+\fBrg\fP [\fIOPTIONS\fP] \fIPATTERN\fP [\fIPATH\fP...]
+.sp
+\fBrg\fP [\fIOPTIONS\fP] \fB\-e\fP \fIPATTERN\fP... [\fIPATH\fP...]
+.sp
+\fBrg\fP [\fIOPTIONS\fP] \fB\-f\fP \fIPATTERNFILE\fP... [\fIPATH\fP...]
+.sp
+\fBrg\fP [\fIOPTIONS\fP] \fB\-\-files\fP [\fIPATH\fP...]
+.sp
+\fBrg\fP [\fIOPTIONS\fP] \fB\-\-type\-list\fP
+.sp
+\fIcommand\fP | \fBrg\fP [\fIOPTIONS\fP] \fIPATTERN\fP
+.sp
+\fBrg\fP [\fIOPTIONS\fP] \fB\-\-help\fP
+.sp
+\fBrg\fP [\fIOPTIONS\fP] \fB\-\-version\fP
+.
+.
+.SH DESCRIPTION
+ripgrep (rg) recursively searches the current directory for a regex pattern.
+By default, ripgrep will respect your \fB.gitignore\fP and automatically skip
+hidden files/directories and binary files.
+.sp
+ripgrep's default regex engine uses finite automata and guarantees linear
+time searching. Because of this, features like backreferences and arbitrary
+look-around are not supported. However, if ripgrep is built with PCRE2,
+then the \fB\-P/\-\-pcre2\fP flag can be used to enable backreferences and
+look-around.
+.sp
+ripgrep supports configuration files. Set \fBRIPGREP_CONFIG_PATH\fP to a
+configuration file. The file can specify one shell argument per line. Lines
+starting with \fB#\fP are ignored. For more details, see \fBCONFIGURATION
+FILES\fP below.
+.sp
+ripgrep will automatically detect if stdin exists and search stdin for a regex
+pattern, e.g. \fBls | rg foo\fP. In some environments, stdin may exist when
+it shouldn't. To turn off stdin detection, one can explicitly specify the
+directory to search, e.g. \fBrg foo ./\fP.
+.sp
+Tip: to disable all smart filtering and make ripgrep behave a bit more like
+classical grep, use \fBrg \-uuu\fP.
+.
+.
+.SH REGEX SYNTAX
+ripgrep uses Rust's regex engine by default, which documents its syntax:
+\fIhttps://docs.rs/regex/1.*/regex/#syntax\fP
+.sp
+ripgrep uses byte-oriented regexes, which have some additional documentation:
+\fIhttps://docs.rs/regex/1.*/regex/bytes/index.html#syntax\fP
+.sp
+To a first approximation, ripgrep uses Perl-like regexes without look-around or
+backreferences. This makes them very similar to the "extended" (ERE) regular
+expressions supported by \fBegrep\fP, but with a few additional features like
+Unicode character classes.
+.sp
+If you're using ripgrep with the \fB\-P/\-\-pcre2\fP flag, then please consult
+\fIhttps://www.pcre.org\fP or the PCRE2 man pages for documentation on the
+supported syntax.
+.
+.
+.SH POSITIONAL ARGUMENTS
+.TP 12
+\fIPATTERN\fP
+A regular expression used for searching. To match a pattern beginning with a
+dash, use the \fB\-e/\-\-regexp\fP option.
+.TP 12
+\fIPATH\fP
+A file or directory to search. Directories are searched recursively. File paths
+specified explicitly on the command line override glob and ignore rules.
+.
+.
+.SH OPTIONS
+This section documents all flags that ripgrep accepts. Flags are grouped into
+categories below according to their function.
+.sp
+Note that many options can be turned on and off. In some cases, those flags are
+not listed explicitly below. For example, the \fB\-\-column\fP flag (listed
+below) enables column numbers in ripgrep's output, but the \fB\-\-no\-column\fP
+flag (not listed below) disables them. The reverse can also exist. For example,
+the \fB\-\-no\-ignore\fP flag (listed below) disables ripgrep's \fBgitignore\fP
+logic, but the \fB\-\-ignore\fP flag (not listed below) enables it. These
+flags are useful for overriding a ripgrep configuration file (or alias) on the
+command line. Each flag's documentation notes whether an inverted flag exists.
+In all cases, the flag specified last takes precedence.
+.
+.SS INPUT OPTIONS
+!!input!!
+.
+.SS SEARCH OPTIONS
+!!search!!
+.
+.SS FILTER OPTIONS
+!!filter!!
+.
+.SS OUTPUT OPTIONS
+!!output!!
+.
+.SS OUTPUT MODES
+!!output-modes!!
+.
+.SS LOGGING OPTIONS
+!!logging!!
+.
+.SS OTHER BEHAVIORS
+!!other-behaviors!!
+.
+.
+.SH EXIT STATUS
+If ripgrep finds a match, then the exit status of the program is \fB0\fP.
+If no match could be found, then the exit status is \fB1\fP. If an error
+occurred, then the exit status is always \fB2\fP unless ripgrep was run with
+the \fB\-q/\-\-quiet\fP flag and a match was found. In summary:
+.sp
+.IP \(bu 3n
+\fB0\fP exit status occurs only when at least one match was found, and either
+no error occurred or \fB\-q/\-\-quiet\fP was given.
+.
+.IP \(bu 3n
+\fB1\fP exit status occurs only when no match was found and no error occurred.
+.
+.IP \(bu 3n
+\fB2\fP exit status occurs when an error occurred. This is true for both
+catastrophic errors (e.g., a regex syntax error) and for soft errors (e.g.,
+unable to read a file).
+.
+.
+.SH AUTOMATIC FILTERING
+ripgrep does a fair bit of automatic filtering by default. This section
+describes that filtering and how to control it.
+.sp
+\fBTIP\fP: To disable automatic filtering, use \fBrg \-uuu\fP.
+.sp
+ripgrep's automatic "smart" filtering is one of the most apparent
+differentiating features between ripgrep and other tools like \fBgrep\fP. As
+such, its behavior may be surprising to users that aren't expecting it.
+.sp
+ripgrep does four types of filtering automatically:
+.sp
+.
+.IP 1. 3n
+Files and directories that match ignore rules are not searched.
+.IP 2. 3n
+Hidden files and directories are not searched.
+.IP 3. 3n
+Binary files (files with a \fBNUL\fP byte) are not searched.
+.IP 4. 3n
+Symbolic links are not followed.
+.PP
+The first type of filtering is the most sophisticated. ripgrep will attempt to
+respect your \fBgitignore\fP rules as faithfully as possible. In particular,
+this includes the following:
+.
+.IP \(bu 3n
+Any global rules, e.g., in \fB$HOME/.config/git/ignore\fP.
+.
+.IP \(bu 3n
+Any rules in relevant \fB.gitignore\fP files.
+.
+.IP \(bu 3n
+Any local rules, e.g., in \fB.git/info/exclude\fP.
+.PP
+In some cases, ripgrep and \fBgit\fP will not always be in sync in terms
+of which files are ignored. For example, a file that is ignored via
+\fB.gitignore\fP but is tracked by \fBgit\fP would not be searched by ripgrep
+even though \fBgit\fP tracks it. This is unlikely to ever be fixed. Instead,
+you should either make sure your exclude rules match the files you track
+precisely, or otherwise use \fBgit grep\fP for search.
+.sp
+Additional ignore rules can be provided outside of a \fBgit\fP context:
+.
+.IP \(bu 3n
+Any rules in \fB.ignore\fP.
+.
+.IP \(bu 3n
+Any rules in \fB.rgignore\fP.
+.
+.IP \(bu 3n
+Any rules in files specified with the \fB\-\-ignore\-file\fP flag.
+.PP
+The precedence of ignore rules is as follows, with later items overriding
+earlier items:
+.
+.IP \(bu 3n
+Files given by \fB\-\-ignore\-file\fP.
+.
+.IP \(bu 3n
+Global gitignore rules, e.g., from \fB$HOME/.config/git/ignore\fP.
+.
+.IP \(bu 3n
+Local rules from \fB.git/info/exclude\fP.
+.
+.IP \(bu 3n
+Rules from \fB.gitignore\fP.
+.
+.IP \(bu 3n
+Rules from \fB.ignore\fP.
+.
+.IP \(bu 3n
+Rules from \fB.rgignore\fP.
+.PP
+So for example, if \fIfoo\fP were in a \fB.gitignore\fP and \fB!\fP\fIfoo\fP
+were in an \fB.rgignore\fP, then \fIfoo\fP would not be ignored since
+\fB.rgignore\fP takes precedence over \fB.gitignore\fP.
+.sp
+Each of the types of filtering can be configured via command line flags:
+.
+.IP \(bu 3n
+There are several flags starting with \fB\-\-no\-ignore\fP that toggle which,
+if any, ignore rules are respected. \fB\-\-no\-ignore\fP by itself will disable
+all
+of them.
+.
+.IP \(bu 3n
+\fB\-./\-\-hidden\fP will force ripgrep to search hidden files and directories.
+.
+.IP \(bu 3n
+\fB\-\-binary\fP will force ripgrep to search binary files.
+.
+.IP \(bu 3n
+\fB\-L/\-\-follow\fP will force ripgrep to follow symlinks.
+.PP
+As a special shorthand, the \fB\-u\fP flag can be specified up to three times.
+Each additional time incrementally decreases filtering:
+.
+.IP \(bu 3n
+\fB\-u\fP is equivalent to \fB\-\-no\-ignore\fP.
+.
+.IP \(bu 3n
+\fB\-uu\fP is equivalent to \fB\-\-no\-ignore \-\-hidden\fP.
+.
+.IP \(bu 3n
+\fB\-uuu\fP is equivalent to \fB\-\-no\-ignore \-\-hidden \-\-binary\fP.
+.PP
+In particular, \fBrg \-uuu\fP should search exactly the same content as \fBgrep
+\-r\fP.
+.
+.
+.SH CONFIGURATION FILES
+ripgrep supports reading configuration files that change ripgrep's default
+behavior. The format of the configuration file is an "rc" style and is very
+simple. It is defined by two rules:
+.
+.IP 1. 3n
+Every line is a shell argument, after trimming whitespace.
+.
+.IP 2. 3n
+Lines starting with \fB#\fP (optionally preceded by any amount of whitespace)
+are ignored.
+.PP
+ripgrep will look for a single configuration file if and only if the
+\fBRIPGREP_CONFIG_PATH\fP environment variable is set and is non-empty.
+ripgrep will parse arguments from this file on startup and will behave as if
+the arguments in this file were prepended to any explicit arguments given to
+ripgrep on the command line. Note though that the \fBrg\fP command you run
+must still be valid. That is, it must always contain at least one pattern at
+the command line, even if the configuration file uses the \fB\-e/\-\-regexp\fP
+flag.
+.sp
+For example, if your ripgreprc file contained a single line:
+.sp
+.EX
+    \-\-smart\-case
+.EE
+.sp
+then the following command
+.sp
+.EX
+    RIPGREP_CONFIG_PATH=wherever/.ripgreprc rg foo
+.EE
+.sp
+would behave identically to the following command:
+.sp
+.EX
+    rg \-\-smart\-case foo
+.EE
+.sp
+Another example is adding types, like so:
+.sp
+.EX
+    \-\-type\-add
+    web:*.{html,css,js}*
+.EE
+.sp
+The above would behave identically to the following command:
+.sp
+.EX
+    rg \-\-type\-add 'web:*.{html,css,js}*' foo
+.EE
+.sp
+The same applies to using globs. This:
+.sp
+.EX
+    \-\-glob=!.git
+.EE
+.sp
+or this:
+.sp
+.EX
+    \-\-glob
+    !.git
+.EE
+.sp
+would behave identically to the following command:
+.sp
+.EX
+    rg \-\-glob '!.git' foo
+.EE
+.sp
+The bottom line is that every shell argument needs to be on its own line. So
+for example, a config file containing
+.sp
+.EX
+    \-j 4
+.EE
+.sp
+is probably not doing what you intend. Instead, you want
+.sp
+.EX
+    \-j
+    4
+.EE
+.sp
+or
+.sp
+.EX
+    \-j4
+.EE
+.sp
+ripgrep also provides a flag, \fB\-\-no\-config\fP, that when present will
+suppress any and all support for configuration. This includes any future
+support for auto-loading configuration files from pre-determined paths.
+.sp
+Conflicts between configuration files and explicit arguments are handled
+exactly like conflicts in the same command line invocation. That is, assuming
+your config file contains only \fB\-\-smart\-case\fP, then this command:
+.sp
+.EX
+    RIPGREP_CONFIG_PATH=wherever/.ripgreprc rg foo \-\-case\-sensitive
+.EE
+.sp
+is exactly equivalent to
+.sp
+.EX
+    rg \-\-smart\-case foo \-\-case\-sensitive
+.EE
+.sp
+in which case, the \fB\-\-case\-sensitive\fP flag would override the
+\fB\-\-smart\-case\fP flag.
+.
+.
+.SH SHELL COMPLETION
+Shell completion files are included in the release tarball for Bash, Fish, Zsh
+and PowerShell.
+.sp
+For \fBbash\fP, move \fBrg.bash\fP to \fB$XDG_CONFIG_HOME/bash_completion\fP or
+\fB/etc/bash_completion.d/\fP.
+.sp
+For \fBfish\fP, move \fBrg.fish\fP to \fB$HOME/.config/fish/completions\fP.
+.sp
+For \fBzsh\fP, move \fB_rg\fP to one of your \fB$fpath\fP directories.
+.
+.
+.SH CAVEATS
+ripgrep may abort unexpectedly when using default settings if it searches a
+file that is simultaneously truncated. This behavior can be avoided by passing
+the \fB\-\-no\-mmap\fP flag which will forcefully disable the use of memory
+maps in all cases.
+.sp
+ripgrep may use a large amount of memory depending on a few factors. Firstly,
+if ripgrep uses parallelism for search (the default), then the entire
+output for each individual file is buffered into memory in order to prevent
+interleaving matches in the output. To avoid this, you can disable parallelism
+with the \fB\-j1\fP flag. Secondly, ripgrep always needs to have at least a
+single line in memory in order to execute a search. A file with a very long
+line can thus cause ripgrep to use a lot of memory. Generally, this only occurs
+when searching binary data with the \fB\-a/\-\-text\fP flag enabled. (When the
+\fB\-a/\-\-text\fP flag isn't enabled, ripgrep will replace all NUL bytes with
+line terminators, which typically prevents exorbitant memory usage.) Thirdly,
+when ripgrep searches a large file using a memory map, the process will likely
+report its resident memory usage as the size of the file. However, this does
+not mean ripgrep actually needed to use that much heap memory; the operating
+system will generally handle this for you.
+.
+.
+.SH VERSION
+!!VERSION!!
+.
+.
+.SH HOMEPAGE
+\fIhttps://github.com/BurntSushi/ripgrep\fP
+.sp
+Please report bugs and feature requests to the issue tracker. Please do your
+best to provide a reproducible test case for bugs. This should include the
+corpus being searched, the \fBrg\fP command, the actual output and the expected
+output. Please also include the output of running the same \fBrg\fP command but
+with the \fB\-\-debug\fP flag.
+.sp
+If you have questions that don't obviously fall into the "bug" or "feature
+request" category, then they are welcome in the Discussions section of the
+issue tracker: \fIhttps://github.com/BurntSushi/ripgrep/discussions\fP.
+.
+.
+.SH AUTHORS
+Andrew Gallant <\fIjamslam@gmail.com\fP>
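
The two configuration file rules described in the man page above (one shell argument per line after trimming whitespace, and `#` lines ignored) can be illustrated with a short sketch. This is not ripgrep's actual parser: the `parse_config` name is hypothetical, it works on UTF-8 strings for simplicity, and skipping empty lines is an assumption that the man page text does not state.

```rust
/// Hypothetical sketch of the "rc" rules from the man page:
///   1. Every line is one shell argument, after trimming whitespace.
///   2. Lines starting with `#` (after trimming) are ignored.
/// Assumption: empty lines are skipped as well.
fn parse_config(contents: &str) -> Vec<String> {
    contents
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .map(String::from)
        .collect()
}

fn main() {
    let rc = "# comment\n--smart-case\n  -j\n  4\n-j 4\n";
    let args = parse_config(rc);
    // The final line `-j 4` stays a single argument containing a space,
    // which is why the man page says it is probably not what you intend.
    assert_eq!(args, ["--smart-case", "-j", "4", "-j 4"]);
    println!("{args:?}");
}
```

Under these rules the resulting arguments would simply be prepended to the ones given explicitly on the command line.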
--- a/crates/core/flags/doc/template.short.help
+++ b/crates/core/flags/doc/template.short.help
@@ -0,0 +1,38 @@
+ripgrep !!VERSION!!
+Andrew Gallant <jamslam@gmail.com>
+
+ripgrep (rg) recursively searches the current directory for a regex pattern.
+By default, ripgrep will respect gitignore rules and automatically skip hidden
+files/directories and binary files.
+
+Use -h for short descriptions and --help for more details.
+
+Project home page: https://github.com/BurntSushi/ripgrep
+
+USAGE:
+  rg [OPTIONS] PATTERN [PATH ...]
+
+POSITIONAL ARGUMENTS:
+  <PATTERN>   A regular expression used for searching.
+  <PATH>...   A file or directory to search.
+
+INPUT OPTIONS:
+!!input!!
+
+SEARCH OPTIONS:
+!!search!!
+
+FILTER OPTIONS:
+!!filter!!
+
+OUTPUT OPTIONS:
+!!output!!
+
+OUTPUT MODES:
+!!output-modes!!
+
+LOGGING OPTIONS:
+!!logging!!
+
+OTHER BEHAVIORS:
+!!other-behaviors!!
--- a/crates/core/flags/doc/version.rs
+++ b/crates/core/flags/doc/version.rs
@@ -0,0 +1,148 @@
+/*!
+Provides routines for generating version strings.
+
+Version strings can be just the digits, an overall short one-line description
+or something more verbose that includes things like CPU target feature support.
+*/
+
+use std::fmt::Write;
+
+/// Generates just the numerical part of the version of ripgrep.
+///
+/// This includes the git revision hash when one was available at build time.
+pub(crate) fn generate_digits() -> String {
+    let semver = option_env!("CARGO_PKG_VERSION").unwrap_or("N/A");
+    match option_env!("RIPGREP_BUILD_GIT_HASH") {
+        None => semver.to_string(),
+        Some(hash) => format!("{semver} (rev {hash})"),
+    }
+}
+
+/// Generates a short version string of the form `ripgrep x.y.z`.
+pub(crate) fn generate_short() -> String {
+    let digits = generate_digits();
+    format!("ripgrep {digits}")
+}
+
+/// Generates a longer multi-line version string.
+///
+/// This includes not only the version of ripgrep but some other information
+/// about its build. For example, SIMD support and PCRE2 support.
+pub(crate) fn generate_long() -> String {
+    let (compile, runtime) = (compile_cpu_features(), runtime_cpu_features());
+
+    let mut out = String::new();
+    writeln!(out, "{}", generate_short()).unwrap();
+    writeln!(out, "features:{}", features().join(",")).unwrap();
+    if !compile.is_empty() {
+        writeln!(out, "simd(compile):{}", compile.join(",")).unwrap();
+    }
+    if !runtime.is_empty() {
+        writeln!(out, "simd(runtime):{}", runtime.join(",")).unwrap();
+    }
+    out
+}
+
+/// Returns the relevant SIMD features supported by the CPU at runtime.
+///
+/// This is kind of a dirty violation of abstraction, since it assumes
+/// knowledge about what specific SIMD features are being used by various
+/// components.
+fn runtime_cpu_features() -> Vec<String> {
+    #[cfg(target_arch = "x86_64")]
+    {
+        let mut features = vec![];
+
+        let sse2 = is_x86_feature_detected!("sse2");
+        features.push(format!("{sign}SSE2", sign = sign(sse2)));
+
+        let ssse3 = is_x86_feature_detected!("ssse3");
+        features.push(format!("{sign}SSSE3", sign = sign(ssse3)));
+
+        let avx2 = is_x86_feature_detected!("avx2");
+        features.push(format!("{sign}AVX2", sign = sign(avx2)));
+
+        features
+    }
+    #[cfg(target_arch = "aarch64")]
+    {
+        let mut features = vec![];
+
+        // memchr and aho-corasick only use NEON when it is available at
+        // compile time. This isn't strictly necessary, but NEON is supposed
+        // to be available for all aarch64 targets. If this isn't true, please
+        // file an issue at https://github.com/BurntSushi/memchr.
+        let neon = cfg!(target_feature = "neon");
+        features.push(format!("{sign}NEON", sign = sign(neon)));
+
+        features
+    }
+    #[cfg(not(any(target_arch = "x86_64", target_arch = "aarch64")))]
+    {
+        vec![]
+    }
+}
+
+/// Returns the SIMD features supported while compiling ripgrep.
+///
+/// In essence, any features listed here are required to run ripgrep correctly.
+///
+/// This is kind of a dirty violation of abstraction, since it assumes
+/// knowledge about what specific SIMD features are being used by various
+/// components.
+///
+/// An easy way to enable everything available on your current CPU is to
+/// compile ripgrep with `RUSTFLAGS="-C target-cpu=native"`. But note that
+/// the binary produced by this will not be portable.
+fn compile_cpu_features() -> Vec<String> {
+    #[cfg(target_arch = "x86_64")]
+    {
+        let mut features = vec![];
+
+        let sse2 = cfg!(target_feature = "sse2");
+        features.push(format!("{sign}SSE2", sign = sign(sse2)));
+
+        let ssse3 = cfg!(target_feature = "ssse3");
+        features.push(format!("{sign}SSSE3", sign = sign(ssse3)));
+
+        let avx2 = cfg!(target_feature = "avx2");
+        features.push(format!("{sign}AVX2", sign = sign(avx2)));
+
+        features
+    }
+    #[cfg(target_arch = "aarch64")]
+    {
+        let mut features = vec![];
+
+        let neon = cfg!(target_feature = "neon");
+        features.push(format!("{sign}NEON", sign = sign(neon)));
+
+        features
+    }
+    #[cfg(not(any(target_arch = "x86_64", target_arch = "aarch64")))]
+    {
+        vec![]
+    }
+}
+
+/// Returns a list of "features" supported (or not) by this build of ripgrep.
+fn features() -> Vec<String> {
+    let mut features = vec![];
+
+    let simd_accel = cfg!(feature = "simd-accel");
+    features.push(format!("{sign}simd-accel", sign = sign(simd_accel)));
+
+    let pcre2 = cfg!(feature = "pcre2");
+    features.push(format!("{sign}pcre2", sign = sign(pcre2)));
+
+    features
+}
+
+/// Returns `+` when `enabled` is `true` and `-` otherwise.
+fn sign(enabled: bool) -> &'static str {
+    if enabled {
+        "+"
+    } else {
+        "-"
+    }
+}
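
To make the shape of the `features:` line in `generate_long` concrete, here is a small standalone sketch. It reuses the `sign` idea from the file above, but `feature_line` is a hypothetical stand-in that takes the feature states as parameters rather than reading them via `cfg!`, so the result does not depend on how the example is compiled.

```rust
/// Returns `+` when `enabled` is `true` and `-` otherwise.
fn sign(enabled: bool) -> &'static str {
    if enabled {
        "+"
    } else {
        "-"
    }
}

/// Hypothetical stand-in for `features()`: the feature states are passed in
/// explicitly instead of being read from `cfg!(feature = "...")`.
fn feature_line(flags: &[(&str, bool)]) -> String {
    let items: Vec<String> = flags
        .iter()
        .map(|&(name, on)| format!("{}{}", sign(on), name))
        .collect();
    format!("features:{}", items.join(","))
}

fn main() {
    // prints "features:+simd-accel,-pcre2"
    println!("{}", feature_line(&[("simd-accel", true), ("pcre2", false)]));
}
```

The same `+`/`-` prefix convention is what the `simd(compile):` and `simd(runtime):` lines use for each CPU feature.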
--- a/crates/core/flags/hiargs.rs
+++ b/crates/core/flags/hiargs.rs
--- a/crates/core/flags/lowargs.rs
+++ b/crates/core/flags/lowargs.rs
@@ -0,0 +1,758 @@
+/*!
+Provides the definition of low level arguments from CLI flags.
+*/
+
+use std::{
+    ffi::{OsStr, OsString},
+    path::PathBuf,
+};
+
+use {
+    bstr::{BString, ByteVec},
+    grep::printer::{HyperlinkFormat, UserColorSpec},
+};
+
+/// A collection of "low level" arguments.
+///
+/// The "low level" here is meant to constrain this type to be as close to the
+/// actual CLI flags and arguments as possible. Namely, other than some
+/// convenience types to help validate flag values and deal with overrides
+/// between flags, these low level arguments do not contain any higher level
+/// abstractions.
+///
+/// Another self-imposed constraint is that populating low level arguments
+/// should not require anything other than validating what the user has
+/// provided. For example, low level arguments should not contain a
+/// `HyperlinkConfig`, since in order to get a full configuration, one needs to
+/// discover the hostname of the current system (which might require running a
+/// binary or a syscall).
+///
+/// Low level arguments are populated by the parser directly via the `update`
+/// method on the corresponding implementation of the `Flag` trait.
+#[derive(Debug, Default)]
+pub(crate) struct LowArgs {
+    // Essential arguments.
+    pub(crate) special: Option<SpecialMode>,
+    pub(crate) mode: Mode,
+    pub(crate) positional: Vec<OsString>,
+    pub(crate) patterns: Vec<PatternSource>,
+    // Everything else, sorted lexicographically.
+    pub(crate) binary: BinaryMode,
+    pub(crate) boundary: Option<BoundaryMode>,
+    pub(crate) buffer: BufferMode,
+    pub(crate) byte_offset: bool,
+    pub(crate) case: CaseMode,
+    pub(crate) color: ColorChoice,
+    pub(crate) colors: Vec<UserColorSpec>,
+    pub(crate) column: Option<bool>,
+    pub(crate) context: ContextMode,
+    pub(crate) context_separator: ContextSeparator,
+    pub(crate) crlf: bool,
+    pub(crate) dfa_size_limit: Option<usize>,
+    pub(crate) encoding: EncodingMode,
+    pub(crate) engine: EngineChoice,
+    pub(crate) field_context_separator: FieldContextSeparator,
+    pub(crate) field_match_separator: FieldMatchSeparator,
+    pub(crate) fixed_strings: bool,
+    pub(crate) follow: bool,
+    pub(crate) glob_case_insensitive: bool,
+    pub(crate) globs: Vec<String>,
+    pub(crate) heading: Option<bool>,
+    pub(crate) hidden: bool,
+    pub(crate) hostname_bin: Option<PathBuf>,
+    pub(crate) hyperlink_format: HyperlinkFormat,
+    pub(crate) iglobs: Vec<String>,
+    pub(crate) ignore_file: Vec<PathBuf>,
+    pub(crate) ignore_file_case_insensitive: bool,
+    pub(crate) include_zero: bool,
+    pub(crate) invert_match: bool,
+    pub(crate) line_number: Option<bool>,
+    pub(crate) logging: Option<LoggingMode>,
+    pub(crate) max_columns: Option<u64>,
+    pub(crate) max_columns_preview: bool,
+    pub(crate) max_count: Option<u64>,
+    pub(crate) max_depth: Option<usize>,
+    pub(crate) max_filesize: Option<u64>,
+    pub(crate) mmap: MmapMode,
+    pub(crate) multiline: bool,
+    pub(crate) multiline_dotall: bool,
+    pub(crate) no_config: bool,
+    pub(crate) no_ignore_dot: bool,
+    pub(crate) no_ignore_exclude: bool,
+    pub(crate) no_ignore_files: bool,
+    pub(crate) no_ignore_global: bool,
+    pub(crate) no_ignore_messages: bool,
+    pub(crate) no_ignore_parent: bool,
+    pub(crate) no_ignore_vcs: bool,
+    pub(crate) no_messages: bool,
+    pub(crate) no_require_git: bool,
+    pub(crate) no_unicode: bool,
+    pub(crate) null: bool,
+    pub(crate) null_data: bool,
+    pub(crate) one_file_system: bool,
+    pub(crate) only_matching: bool,
+    pub(crate) path_separator: Option<u8>,
+    pub(crate) pre: Option<PathBuf>,
+    pub(crate) pre_glob: Vec<String>,
+    pub(crate) quiet: bool,
+    pub(crate) regex_size_limit: Option<usize>,
+    pub(crate) replace: Option<BString>,
+    pub(crate) search_zip: bool,
+    pub(crate) sort: Option<SortMode>,
+    pub(crate) stats: bool,
+    pub(crate) stop_on_nonmatch: bool,
+    pub(crate) threads: Option<usize>,
+    pub(crate) trim: bool,
+    pub(crate) type_changes: Vec<TypeChange>,
+    pub(crate) unrestricted: usize,
+    pub(crate) vimgrep: bool,
+    pub(crate) with_filename: Option<bool>,
+}
+
+/// A "special" mode that supersedes everything else.
+///
+/// When one of these modes is present, it overrides everything else and causes
+/// ripgrep to short-circuit. In particular, we avoid converting low-level
+/// argument types into higher level arguments types that can fail for various
+/// reasons related to the environment. (Parsing the low-level arguments can
+/// fail too, but usually not in a way that can't be worked around by removing
+/// the corresponding arguments from the CLI command.) This is overall a hedge
+/// to ensure that version and help information are basically always available.
+#[derive(Clone, Copy, Debug, Eq, PartialEq)]
+pub(crate) enum SpecialMode {
+    /// Show a condensed version of "help" output. Generally speaking, this
+    /// shows each flag and an extremely terse description of that flag on
+    /// a single line. This corresponds to the `-h` flag.
+    HelpShort,
+    /// Shows a very verbose version of the "help" output. The docs for some
+    /// flags will be paragraphs long. This corresponds to the `--help` flag.
+    HelpLong,
+    /// Show condensed version information. e.g., `ripgrep x.y.z`.
+    VersionShort,
+    /// Show verbose version information. Includes "short" information as well
+    /// as features included in the build.
+    VersionLong,
+    /// Show PCRE2's version information, or an error if this version of
+    /// ripgrep wasn't compiled with PCRE2 support.
+    VersionPCRE2,
+}
+
+/// The overall mode that ripgrep should operate in.
+///
+/// If ripgrep were designed without the legacy of grep, these would probably
+/// be sub-commands? Perhaps not, since they aren't as frequently used.
+///
+/// The point of putting these in one enum is that they are all mutually
+/// exclusive and override one another.
+///
+/// Note that -h/--help and -V/--version are not included in this because
+/// they always override everything else, regardless of where they appear
+/// in the command line. They are treated as "special" modes that short-circuit
+/// ripgrep's usual flow.
+#[derive(Clone, Copy, Debug, Eq, PartialEq)]
+pub(crate) enum Mode {
+    /// ripgrep will execute a search of some kind.
+    Search(SearchMode),
+    /// Show the files that *would* be searched, but don't actually search
+    /// them.
+    Files,
+    /// List all file type definitions configured, including the default file
+    /// types and any additional file types added to the command line.
+    Types,
+    /// Generate various things like the man page and completion files.
+    Generate(GenerateMode),
+}
+
+impl Default for Mode {
+    fn default() -> Mode {
+        Mode::Search(SearchMode::Standard)
+    }
+}
+
+impl Mode {
+    /// Update this mode to the new mode while implementing various override
+    /// semantics. For example, a search mode cannot override a non-search
+    /// mode.
+    pub(crate) fn update(&mut self, new: Mode) {
+        match *self {
+            // If we're in a search mode, then anything can override it.
+            Mode::Search(_) => *self = new,
+            _ => {
+                // Once we're in a non-search mode, other non-search modes
+                // can override it. But search modes cannot. So for example,
+                // `--files -l` will still be Mode::Files.
+                if !matches!(new, Mode::Search(_)) {
+                    *self = new;
+                }
+            }
+        }
+    }
+}
+
+/// The kind of search that ripgrep is going to perform.
+#[derive(Clone, Copy, Debug, Eq, PartialEq)]
+pub(crate) enum SearchMode {
+    /// The default standard mode of operation. ripgrep looks for matches and
+    /// prints them when found.
+    ///
+    /// There is no specific flag for this mode since it's the default. But
+    /// some of the modes below, like JSON, have negation flags like --no-json
+    /// that let you revert back to this default mode.
+    Standard,
+    /// Show files containing at least one match.
+    FilesWithMatches,
+    /// Show files that don't contain any matches.
+    FilesWithoutMatch,
+    /// Show files containing at least one match and the number of matching
+    /// lines.
+    Count,
+    /// Show files containing at least one match and the total number of
+    /// matches.
+    CountMatches,
+    /// Print matches in a JSON lines format.
+    JSON,
+}
+
+/// The thing to generate via the --generate flag.
+#[derive(Clone, Copy, Debug, Eq, PartialEq)]
+pub(crate) enum GenerateMode {
+    /// Generate the raw roff used for the man page.
+    Man,
+    /// Completions for bash.
+    CompleteBash,
+    /// Completions for zsh.
+    CompleteZsh,
+    /// Completions for fish.
+    CompleteFish,
+    /// Completions for PowerShell.
+    CompletePowerShell,
+}
+
+/// Indicates how ripgrep should treat binary data.
+#[derive(Debug, Eq, PartialEq)]
+pub(crate) enum BinaryMode {
+    /// Automatically determine the binary mode to use. Essentially, when
+    /// a file is searched explicitly, then it will be searched using the
+    /// `SearchAndSuppress` strategy. Otherwise, it will be searched in a way
+    /// that attempts to skip binary files as much as possible. That is, once
+    /// a file is classified as binary, searching will immediately stop.
+    Auto,
+    /// Search files even when they have binary data, but if a match is found,
+    /// suppress it and emit a warning.
+    ///
+    /// In this mode, `NUL` bytes are replaced with line terminators. This is
+    /// a heuristic meant to reduce heap memory usage, since true binary data
+    /// isn't line oriented. If one attempts to treat such data as line
+    /// oriented, then one may wind up with impractically large lines. For
+    /// example, many binary files contain very long runs of NUL bytes.
+    SearchAndSuppress,
+    /// Treat all files as if they were plain text. There's no skipping and no
+    /// replacement of `NUL` bytes with line terminators.
+    AsText,
+}
+
+impl Default for BinaryMode {
+    fn default() -> BinaryMode {
+        BinaryMode::Auto
+    }
+}
+
+/// Indicates what kind of boundary mode to use (line or word).
+#[derive(Debug, Eq, PartialEq)]
+pub(crate) enum BoundaryMode {
+    /// Only allow matches when surrounded by line boundaries.
+    Line,
+    /// Only allow matches when surrounded by word boundaries.
+    Word,
+}
+
+/// Indicates the buffer mode that ripgrep should use when printing output.
+///
+/// The default is `Auto`.
+#[derive(Debug, Eq, PartialEq)]
+pub(crate) enum BufferMode {
+    /// Select the buffer mode, 'line' or 'block', automatically based on
+    /// whether stdout is connected to a tty.
+    Auto,
+    /// Flush the output buffer whenever a line terminator is seen.
+    ///
+    /// This is useful when one wants to see search results more immediately,
+    /// for example, with `tail -f`.
+    Line,
+    /// Flush the output buffer whenever it reaches some fixed size. The size
+    /// is usually big enough to hold many lines.
+    ///
+    /// This is useful for maximum performance, particularly when printing
+    /// lots of results.
+    Block,
+}
+
+impl Default for BufferMode {
+    fn default() -> BufferMode {
+        BufferMode::Auto
+    }
+}
+
+/// Indicates the case mode for how to interpret all patterns given to ripgrep.
+///
+/// The default is `Sensitive`.
+#[derive(Debug, Eq, PartialEq)]
+pub(crate) enum CaseMode {
+    /// Patterns are matched case sensitively. i.e., `a` does not match `A`.
+    Sensitive,
+    /// Patterns are matched case insensitively. i.e., `a` does match `A`.
+    Insensitive,
+    /// Patterns are automatically matched case insensitively only when they
+    /// consist of all lowercase literal characters. For example, the pattern
+    /// `a` will match `A` but `A` will not match `a`.
+    Smart,
+}
+
+impl Default for CaseMode {
+    fn default() -> CaseMode {
+        CaseMode::Sensitive
+    }
+}
+
+/// Indicates whether ripgrep should include color/hyperlinks in its output.
+///
+/// The default is `Auto`.
+#[derive(Debug, Eq, PartialEq)]
+pub(crate) enum ColorChoice {
+    /// Color and hyperlinks will never be used.
+    Never,
+    /// Color and hyperlinks will be used only when stdout is connected to a
+    /// tty.
+    Auto,
+    /// Color will always be used.
+    Always,
+    /// Color will always be used and only ANSI escapes will be used.
+    ///
+    /// This only makes sense in the context of legacy Windows console APIs.
+    /// At time of writing, ripgrep will try to use the legacy console APIs
+    /// if ANSI coloring isn't believed to be possible. This option will force
+    /// ripgrep to use ANSI coloring.
+    Ansi,
+}
+
+impl Default for ColorChoice {
+    fn default() -> ColorChoice {
+        ColorChoice::Auto
+    }
+}
+
+impl ColorChoice {
+    /// Convert this color choice to the corresponding termcolor type.
+    pub(crate) fn to_termcolor(&self) -> termcolor::ColorChoice {
+        match *self {
+            ColorChoice::Never => termcolor::ColorChoice::Never,
+            ColorChoice::Auto => termcolor::ColorChoice::Auto,
+            ColorChoice::Always => termcolor::ColorChoice::Always,
+            ColorChoice::Ansi => termcolor::ColorChoice::AlwaysAnsi,
+        }
+    }
+}
+
+/// Indicates the line context options ripgrep should use for output.
+///
+/// The default is no context at all.
+#[derive(Debug, Eq, PartialEq)]
+pub(crate) enum ContextMode {
+    /// All lines will be printed. That is, the context is unbounded.
+    Passthru,
+    /// Only show a certain number of lines before and after each match.
+    Limited(ContextModeLimited),
+}
+
+impl Default for ContextMode {
+    fn default() -> ContextMode {
+        ContextMode::Limited(ContextModeLimited::default())
+    }
+}
+
+impl ContextMode {
+    /// Set the "before" context.
+    ///
+    /// If this was set to "passthru" context, then it is overridden in favor
+    /// of limited context with the given value for "before" and `None` for
+    /// "after" and "both".
+    pub(crate) fn set_before(&mut self, lines: usize) {
+        match *self {
+            ContextMode::Passthru => {
+                *self = ContextMode::Limited(ContextModeLimited {
+                    before: Some(lines),
+                    after: None,
+                    both: None,
+                })
+            }
+            ContextMode::Limited(ContextModeLimited {
+                ref mut before,
+                ..
+            }) => *before = Some(lines),
+        }
+    }
+
+    /// Set the "after" context.
+    ///
+    /// If this was set to "passthru" context, then it is overridden in favor
+    /// of limited context with the given value for "after" and `None` for
+    /// "before" and "both".
+    pub(crate) fn set_after(&mut self, lines: usize) {
+        match *self {
+            ContextMode::Passthru => {
+                *self = ContextMode::Limited(ContextModeLimited {
+                    before: None,
+                    after: Some(lines),
+                    both: None,
+                })
+            }
+            ContextMode::Limited(ContextModeLimited {
+                ref mut after, ..
+            }) => *after = Some(lines),
+        }
+    }
+
+    /// Set the "both" context.
+    ///
+    /// If this was set to "passthru" context, then it is overridden in favor
+    /// of limited context with the given value for "both" and `None` for
+    /// "before" and "after".
+    pub(crate) fn set_both(&mut self, lines: usize) {
+        match *self {
+            ContextMode::Passthru => {
+                *self = ContextMode::Limited(ContextModeLimited {
+                    before: None,
+                    after: None,
+                    both: Some(lines),
+                })
+            }
+            ContextMode::Limited(ContextModeLimited {
+                ref mut both, ..
+            }) => *both = Some(lines),
+        }
+    }
+
+    /// A convenience function for use in tests that returns the limited
+    /// context. If this mode isn't limited, then it panics.
+    #[cfg(test)]
+    pub(crate) fn get_limited(&self) -> (usize, usize) {
+        match *self {
+            ContextMode::Passthru => unreachable!("context mode is passthru"),
+            ContextMode::Limited(ref limited) => limited.get(),
+        }
+    }
+}
+
+/// A context mode for a finite number of lines.
+///
+/// Namely, this indicates that a specific number of lines (possibly zero)
+/// should be shown before and/or after each matching line.
+///
+/// Note that there is a subtle difference between `Some(0)` and `None`. In the
+/// former case, it happens when `0` is given explicitly, whereas `None` is
+/// the default value and occurs when no value is specified.
+///
+/// `both` is only set by the -C/--context flag. The reason why we don't just
+/// set before = after = --context is because the before and after context
+/// settings always take precedence over the -C/--context setting, regardless
+/// of order. Thus, we need to keep track of them separately.
+#[derive(Debug, Default, Eq, PartialEq)]
+pub(crate) struct ContextModeLimited {
+    before: Option<usize>,
+    after: Option<usize>,
+    both: Option<usize>,
+}
+
+impl ContextModeLimited {
+    /// Returns the specific number of contextual lines that should be shown
+    /// around each match. This takes proper precedence into account, i.e.,
+    /// that `before` and `after` both partially override `both` in all cases.
+    ///
+    /// By default, this returns `(0, 0)`.
+    pub(crate) fn get(&self) -> (usize, usize) {
+        let (mut before, mut after) =
+            self.both.map(|lines| (lines, lines)).unwrap_or((0, 0));
+        // --before and --after always override --context, regardless
+        // of where they appear relative to each other.
+        if let Some(lines) = self.before {
+            before = lines;
+        }
+        if let Some(lines) = self.after {
+            after = lines;
+        }
+        (before, after)
+    }
+}
+
+/// Represents the separator to use between non-contiguous sections of
+/// contextual lines.
+///
+/// The default is `--`.
+#[derive(Clone, Debug, Eq, PartialEq)]
+pub(crate) struct ContextSeparator(Option<BString>);
+
+impl Default for ContextSeparator {
+    fn default() -> ContextSeparator {
+        ContextSeparator(Some(BString::from("--")))
+    }
+}
+
+impl ContextSeparator {
+    /// Create a new context separator from the user provided argument. This
+    /// handles unescaping.
+    pub(crate) fn new(os: &OsStr) -> anyhow::Result<ContextSeparator> {
+        let Some(string) = os.to_str() else {
+            anyhow::bail!(
+                "separator must be valid UTF-8 (use escape sequences \
+                 to provide a separator that is not valid UTF-8)"
+            )
+        };
+        Ok(ContextSeparator(Some(Vec::unescape_bytes(string).into())))
+    }
+
+    /// Creates a new separator that instructs the printer to disable
+    /// contextual separators entirely.
+    pub(crate) fn disabled() -> ContextSeparator {
+        ContextSeparator(None)
+    }
+
+    /// Return the raw bytes of this separator.
+    ///
+    /// If context separators were disabled, then this returns `None`.
+    ///
+    /// Note that this may return a `Some` variant with zero bytes.
+    pub(crate) fn into_bytes(self) -> Option<Vec<u8>> {
+        self.0.map(|sep| sep.into())
+    }
+}
+
+/// The encoding mode the searcher will use.
+///
+/// The default is `Auto`.
+#[derive(Debug, Eq, PartialEq)]
+pub(crate) enum EncodingMode {
+    /// Use only BOM sniffing to auto-detect an encoding.
+    Auto,
+    /// Use an explicit encoding forcefully, but let BOM sniffing override it.
+    Some(grep::searcher::Encoding),
+    /// Use no explicit encoding and disable all BOM sniffing. This will
+    /// always result in searching the raw bytes, regardless of their
+    /// true encoding.
+    Disabled,
+}
+
+impl Default for EncodingMode {
+    fn default() -> EncodingMode {
+        EncodingMode::Auto
+    }
+}
+
+/// The regex engine to use.
+///
+/// The default is `Default`.
+#[derive(Debug, Eq, PartialEq)]
+pub(crate) enum EngineChoice {
+    /// Uses the default regex engine: Rust's `regex` crate.
+    ///
+    /// (Well, technically it uses `regex-automata`, but `regex-automata` is
+    /// the implementation of the `regex` crate.)
+    Default,
+    /// Dynamically select the right engine to use.
+    ///
+    /// This works by trying to use the default engine, and if the pattern does
+    /// not compile, it switches over to the PCRE2 engine if it's available.
+    Auto,
+    /// Uses the PCRE2 regex engine if it's available.
+    PCRE2,
+}
+
+impl Default for EngineChoice {
+    fn default() -> EngineChoice {
+        EngineChoice::Default
+    }
+}
+
+/// The field context separator to use between metadata for each contextual
+/// line.
+///
+/// The default is `-`.
+#[derive(Clone, Debug, Eq, PartialEq)]
+pub(crate) struct FieldContextSeparator(BString);
+
+impl Default for FieldContextSeparator {
+    fn default() -> FieldContextSeparator {
+        FieldContextSeparator(BString::from("-"))
+    }
+}
+
+impl FieldContextSeparator {
+    /// Create a new separator from the given argument value provided by the
+    /// user. Unescaping is automatically handled.
+    pub(crate) fn new(os: &OsStr) -> anyhow::Result<FieldContextSeparator> {
+        let Some(string) = os.to_str() else {
+            anyhow::bail!(
+                "separator must be valid UTF-8 (use escape sequences \
+                 to provide a separator that is not valid UTF-8)"
+            )
+        };
+        Ok(FieldContextSeparator(Vec::unescape_bytes(string).into()))
+    }
+
+    /// Return the raw bytes of this separator.
+    ///
+    /// Note that this may return an empty `Vec`.
+    pub(crate) fn into_bytes(self) -> Vec<u8> {
+        self.0.into()
+    }
+}
+
+/// The field match separator to use between metadata for each matching
+/// line.
+///
+/// The default is `:`.
+#[derive(Clone, Debug, Eq, PartialEq)]
+pub(crate) struct FieldMatchSeparator(BString);
+
+impl Default for FieldMatchSeparator {
+    fn default() -> FieldMatchSeparator {
+        FieldMatchSeparator(BString::from(":"))
+    }
+}
+
+impl FieldMatchSeparator {
+    /// Create a new separator from the given argument value provided by the
+    /// user. Unescaping is automatically handled.
+    pub(crate) fn new(os: &OsStr) -> anyhow::Result<FieldMatchSeparator> {
+        let Some(string) = os.to_str() else {
+            anyhow::bail!(
+                "separator must be valid UTF-8 (use escape sequences \
+                 to provide a separator that is not valid UTF-8)"
+            )
+        };
+        Ok(FieldMatchSeparator(Vec::unescape_bytes(string).into()))
+    }
+
+    /// Return the raw bytes of this separator.
+    ///
+    /// Note that this may return an empty `Vec`.
+    pub(crate) fn into_bytes(self) -> Vec<u8> {
+        self.0.into()
+    }
+}
+
+/// The type of logging to do. `Debug` emits some details while `Trace` emits
+/// much more.
+#[derive(Debug, Eq, PartialEq)]
+pub(crate) enum LoggingMode {
+    Debug,
+    Trace,
+}
+
+/// Indicates when to use memory maps.
+///
+/// The default is `Auto`.
+#[derive(Debug, Eq, PartialEq)]
+pub(crate) enum MmapMode {
+    /// This instructs ripgrep to use heuristics for selecting when to and not
+    /// to use memory maps for searching.
+    Auto,
+    /// This instructs ripgrep to always try memory maps when possible. (Memory
+    /// maps are not possible to use in all circumstances, for example, for
+    /// virtual files.)
+    AlwaysTryMmap,
+    /// Never use memory maps under any circumstances. This includes even
+    /// when multi-line search is enabled where ripgrep will read the entire
+    /// contents of a file onto the heap before searching it.
+    Never,
+}
+
+impl Default for MmapMode {
+    fn default() -> MmapMode {
+        MmapMode::Auto
+    }
+}
+
+/// Represents a source of patterns that ripgrep should search for.
+///
+/// The reason to unify these is so that we can retain the order of `-f/--file`
+/// and `-e/--regexp` flags relative to one another.
+#[derive(Debug, Eq, PartialEq)]
+pub(crate) enum PatternSource {
+    /// Comes from the `-e/--regexp` flag.
+    Regexp(String),
+    /// Comes from the `-f/--file` flag.
+    File(PathBuf),
+}
+
+/// The sort criteria, if present.
+#[derive(Debug, Eq, PartialEq)]
+pub(crate) struct SortMode {
+    /// Whether to reverse the sort criteria (i.e., descending order).
+    pub(crate) reverse: bool,
+    /// The actual sorting criteria.
+    pub(crate) kind: SortModeKind,
+}
+
+/// The criteria to use for sorting.
+#[derive(Debug, Eq, PartialEq)]
+pub(crate) enum SortModeKind {
+    /// Sort by path.
+    Path,
+    /// Sort by last modified time.
+    LastModified,
+    /// Sort by last accessed time.
+    LastAccessed,
+    /// Sort by creation time.
+    Created,
+}
+
+impl SortMode {
+    /// Checks whether the selected sort mode is supported. If it isn't, an
+    /// error (hopefully explaining why) is returned.
+    pub(crate) fn supported(&self) -> anyhow::Result<()> {
+        match self.kind {
+            SortModeKind::Path => Ok(()),
+            SortModeKind::LastModified => {
+                let md = std::env::current_exe()
+                    .and_then(|p| p.metadata())
+                    .and_then(|md| md.modified());
+                let Err(err) = md else { return Ok(()) };
+                anyhow::bail!(
+                    "sorting by last modified isn't supported: {err}"
+                );
+            }
+            SortModeKind::LastAccessed => {
+                let md = std::env::current_exe()
+                    .and_then(|p| p.metadata())
+                    .and_then(|md| md.accessed());
+                let Err(err) = md else { return Ok(()) };
+                anyhow::bail!(
+                    "sorting by last accessed isn't supported: {err}"
+                );
+            }
+            SortModeKind::Created => {
+                let md = std::env::current_exe()
+                    .and_then(|p| p.metadata())
+                    .and_then(|md| md.created());
+                let Err(err) = md else { return Ok(()) };
+                anyhow::bail!(
+                    "sorting by creation time isn't supported: {err}"
+                );
+            }
+        }
+    }
+}
+
+/// A single instance of either a change or a selection of one of ripgrep's
+/// file types.
+#[derive(Debug, Eq, PartialEq)]
+pub(crate) enum TypeChange {
+    /// Clear the given type from ripgrep.
+    Clear { name: String },
+    /// Add the given type definition (name and glob) to ripgrep.
+    Add { def: String },
+    /// Select the given type for filtering.
+    Select { name: String },
+    /// Select the given type for filtering but negate it.
+    Negate { name: String },
+}
--- a/crates/core/flags/mod.rs
+++ b/crates/core/flags/mod.rs
@@ -0,0 +1,282 @@
+/*!
+Defines ripgrep's command line interface.
+
+This module deals with everything involving ripgrep's flags and positional
+arguments. This includes generating shell completions, `--help` output and even
+ripgrep's man page. It's also responsible for parsing and validating every
+flag (including reading ripgrep's config file), and manages the contact points
+between these flags and ripgrep's cast of supporting libraries. For example,
+once [`HiArgs`] has been created, it knows how to create a multi-threaded
+recursive directory traverser.
+*/
+use std::{
+    ffi::OsString,
+    fmt::Debug,
+    panic::{RefUnwindSafe, UnwindSafe},
+};
+
+pub(crate) use crate::flags::{
+    complete::{
+        bash::generate as generate_complete_bash,
+        fish::generate as generate_complete_fish,
+        powershell::generate as generate_complete_powershell,
+        zsh::generate as generate_complete_zsh,
+    },
+    doc::{
+        help::{
+            generate_long as generate_help_long,
+            generate_short as generate_help_short,
+        },
+        man::generate as generate_man_page,
+        version::{
+            generate_long as generate_version_long,
+            generate_short as generate_version_short,
+        },
+    },
+    hiargs::HiArgs,
+    lowargs::{GenerateMode, Mode, SearchMode, SpecialMode},
+    parse::{parse, ParseResult},
+};
+
+mod complete;
+mod config;
+mod defs;
+mod doc;
+mod hiargs;
+mod lowargs;
+mod parse;
+
+/// A trait that encapsulates the definition of an optional flag for ripgrep.
+///
+/// This trait is meant to be used via dynamic dispatch. Namely, the `defs`
+/// module provides a single global slice of `&dyn Flag` values corresponding
+/// to all of the flags in ripgrep.
+///
+/// ripgrep's required positional arguments are handled by the parser and by
+/// the conversion from low-level arguments to high level arguments. Namely,
+/// all of ripgrep's positional arguments are treated as file paths, except
+/// in certain circumstances where the first argument is treated as a regex
+/// pattern.
+///
+/// Note that each implementation of this trait requires a long flag name,
+/// but can also optionally have a short version and even a negation flag.
+/// For example, the `-E/--encoding` flag accepts a value, but it also has a
+/// `--no-encoding` negation flag for reverting back to "automatic" encoding
+/// detection. All three of `-E`, `--encoding` and `--no-encoding` are provided
+/// by a single implementation of this trait.
+///
+/// ripgrep only supports flags that are switches or flags that accept a single
+/// value. Flags that accept multiple values are an unsupported aberration.
+trait Flag: Debug + Send + Sync + UnwindSafe + RefUnwindSafe + 'static {
+    /// Returns true if this flag is a switch. When a flag is a switch, the
+    /// CLI parser will not look for a value after the flag is seen.
+    fn is_switch(&self) -> bool;
+
+    /// A short single byte name for this flag. This returns `None` by default,
+    /// which signifies that the flag has no short name.
+    ///
+    /// The byte returned must be an ASCII codepoint that is a `.` or is
+    /// alpha-numeric.
+    fn name_short(&self) -> Option<u8> {
+        None
+    }
+
+    /// Returns the long name of this flag. All flags must have a "long" name.
+    ///
+    /// The long name must be at least 2 bytes, and all of its bytes must be
+    /// ASCII codepoints that are either `-` or alpha-numeric.
+    fn name_long(&self) -> &'static str;
+
+    /// Returns a list of aliases for this flag.
+    ///
+    /// The aliases must follow the same rules as `Flag::name_long`.
+    ///
+    /// By default, an empty slice is returned.
+    fn aliases(&self) -> &'static [&'static str] {
+        &[]
+    }
+
+    /// Returns a negated name for this flag. The negation of a flag is
+    /// intended to have the opposite meaning of a flag or to otherwise turn
+    /// something "off" or revert it to its default behavior.
+    ///
+    /// Negated flags are not listed in their own section in the `-h/--help`
+    /// output or man page. Instead, they are automatically mentioned at the
+    /// end of the documentation section of the flag they negated.
+    ///
+    /// The negated name must follow the same rules as `Flag::name_long`.
+    ///
+    /// By default, a flag has no negation and this returns `None`.
+    fn name_negated(&self) -> Option<&'static str> {
+        None
+    }
+
+    /// Returns the variable name describing the type of value this flag
+    /// accepts. This should always be set for non-switch flags and never set
+    /// for switch flags.
+    ///
+    /// For example, the `--max-count` flag has its variable name set to `NUM`.
+    ///
+    /// The convention is to capitalize variable names.
+    ///
+    /// By default this returns `None`.
+    fn doc_variable(&self) -> Option<&'static str> {
+        None
+    }
+
+    /// Returns the category of this flag.
+    ///
+    /// Every flag must have a single category. Categories are used to organize
+    /// flags in the generated documentation.
+    fn doc_category(&self) -> Category;
+
+    /// A (very) short documentation string describing what this flag does.
+    ///
+    /// This may sacrifice "proper English" in order to be as terse as
+    /// possible. Generally, we try to ensure that `rg -h` doesn't have any
+    /// lines that exceed 79 columns.
+    fn doc_short(&self) -> &'static str;
+
+    /// A (possibly very) longer documentation string describing in full
+    /// detail what this flag does. This should be in mandoc/mdoc format.
+    fn doc_long(&self) -> &'static str;
+
+    /// If this is a non-switch flag that accepts a small set of specific
+    /// values, then this should list them.
+    ///
+    /// This returns an empty slice by default.
+    fn doc_choices(&self) -> &'static [&'static str] {
+        &[]
+    }
+
+    /// Given the parsed value (which might just be a switch), this should
+    /// update the state in `args` based on the value given for this flag.
+    ///
+    /// This may update state for other flags as appropriate.
+    ///
+    /// The `-V/--version` and `-h/--help` flags are treated specially in the
+    /// parser and should do nothing here.
+    ///
+    /// By convention, implementations should generally not try to "do"
+    /// anything other than validate the value given. For example, the
+    /// implementation for `--hostname-bin` should not try to resolve the
+    /// hostname to use by running the binary provided. That should be saved
+    /// for a later step. This convention is used to ensure that getting the
+    /// low-level arguments is as reliable and quick as possible. It also
+    /// ensures that "doing something" occurs a minimal number of times. For
+    /// example, by avoiding trying to find the hostname here, we can do it
+    /// once later no matter how many times `--hostname-bin` is provided.
+    ///
+    /// Implementations should not include the flag name in the error message
+    /// returned. The flag name is included automatically by the parser.
+    fn update(
+        &self,
+        value: FlagValue,
+        args: &mut crate::flags::lowargs::LowArgs,
+    ) -> anyhow::Result<()>;
+}
+
+/// The category that a flag belongs to.
+///
+/// Categories are used to organize flags into "logical" groups in the
+/// generated documentation.
+#[derive(Clone, Copy, Debug, Eq, Hash, PartialEq, PartialOrd, Ord)]
+enum Category {
+    /// Flags related to how ripgrep reads its input. Its "input" generally
+    /// consists of the patterns it is trying to match and the haystacks it is
+    /// trying to search.
+    Input,
+    /// Flags related to the operation of the search itself. For example,
+    /// whether case insensitive matching is enabled.
+    Search,
+    /// Flags related to how ripgrep filters haystacks. For example, whether
+    /// to respect gitignore files or not.
+    Filter,
+    /// Flags related to how ripgrep shows its search results. For example,
+    /// whether to show line numbers or not.
+    Output,
+    /// Flags related to changing ripgrep's output at a more fundamental level.
+    /// For example, flags like `--count` suppress printing of individual
+    /// lines, and instead just print the total count of matches for each file
+    /// searched.
+    OutputModes,
+    /// Flags related to logging behavior such as emitting non-fatal error
+    /// messages or printing search statistics.
+    Logging,
+    /// Other behaviors not related to ripgrep's core functionality. For
+    /// example, printing the file type globbing rules, or printing the list
+    /// of files ripgrep would search without actually searching them.
+    OtherBehaviors,
+}
+
+impl Category {
+    /// Returns a string representation of this category.
+    ///
+    /// This string is the name of the variable used in various templates for
+    /// generated documentation. This name can be used for interpolation.
+    fn as_str(&self) -> &'static str {
+        match *self {
+            Category::Input => "input",
+            Category::Search => "search",
+            Category::Filter => "filter",
+            Category::Output => "output",
+            Category::OutputModes => "output-modes",
+            Category::Logging => "logging",
+            Category::OtherBehaviors => "other-behaviors",
+        }
+    }
+}
+
+/// Represents a value parsed from the command line.
+///
+/// This doesn't include the corresponding flag, but values come in one of
+/// two forms: a switch (on or off) or an arbitrary value.
+///
+/// Note that the CLI doesn't directly support negated switches. For example,
+/// you can't do anything like `-n=false` or any of that nonsense. Instead,
+/// the CLI parser knows about which flag names are negations and which aren't
+/// (courtesy of the `Flag` trait). If a flag given is known as a negation,
+/// then a `FlagValue::Switch(false)` value is passed into `Flag::update`.
+#[derive(Debug)]
+enum FlagValue {
+    /// A flag that is either on or off.
+    Switch(bool),
+    /// A flag that comes with an arbitrary user value.
+    Value(OsString),
+}
+
+impl FlagValue {
+    /// Return the yes or no value of this switch.
+    ///
+    /// If this flag value is not a switch, then this panics.
+    ///
+    /// This is useful when writing the implementation of `Flag::update`.
+    /// Namely, callers usually know whether a switch or a value is expected.
+    /// If a flag is something different, then it indicates a bug, and thus a
+    /// panic is acceptable.
+    fn unwrap_switch(self) -> bool {
+        match self {
+            FlagValue::Switch(yes) => yes,
+            FlagValue::Value(_) => {
+                unreachable!("got flag value but expected switch")
+            }
+        }
+    }
+
+    /// Return the user provided value of this flag.
+    ///
+    /// If this flag is a switch, then this panics.
+    ///
+    /// This is useful when writing the implementation of `Flag::update`.
+    /// Namely, callers usually know whether a switch or a value is expected.
+    /// If a flag is something different, then it indicates a bug, and thus a
+    /// panic is acceptable.
+    fn unwrap_value(self) -> OsString {
+        match self {
+            FlagValue::Switch(_) => {
+                unreachable!("got switch but expected flag value")
+            }
+            FlagValue::Value(v) => v,
+        }
+    }
+}
--- a/crates/core/flags/parse.rs
+++ b/crates/core/flags/parse.rs
@@ -0,0 +1,392 @@
+/*!
+Parses command line arguments into a structured and typed representation.
+*/
+
+use std::ffi::OsString;
+
+use anyhow::Context;
+
+use crate::flags::{
+    defs::FLAGS,
+    hiargs::HiArgs,
+    lowargs::{LoggingMode, LowArgs, SpecialMode},
+    Flag, FlagValue,
+};
+
+/// The result of parsing CLI arguments.
+///
+/// This is basically an `anyhow::Result<T>`, but with one extra variant that is
+/// inhabited whenever ripgrep should execute a "special" mode. That is, when a
+/// user provides the `-h/--help` or `-V/--version` flags.
+///
+/// This special variant exists to allow CLI parsing to short circuit as
+/// quickly as is reasonable. For example, it lets CLI parsing avoid reading
+/// ripgrep's configuration and converting low level arguments into a higher
+/// level representation.
+#[derive(Debug)]
+pub(crate) enum ParseResult<T> {
+    Special(SpecialMode),
+    Ok(T),
+    Err(anyhow::Error),
+}
+
+impl<T> ParseResult<T> {
+    /// If this result is `Ok`, then apply `then` to it. Otherwise, return this
+    /// result unchanged.
+    fn and_then<U>(
+        self,
+        mut then: impl FnMut(T) -> ParseResult<U>,
+    ) -> ParseResult<U> {
+        match self {
+            ParseResult::Special(mode) => ParseResult::Special(mode),
+            ParseResult::Ok(t) => then(t),
+            ParseResult::Err(err) => ParseResult::Err(err),
+        }
+    }
+}
+
+/// Parse CLI arguments and convert them to their high level representation.
+pub(crate) fn parse() -> ParseResult<HiArgs> {
+    parse_low().and_then(|low| match HiArgs::from_low_args(low) {
+        Ok(hi) => ParseResult::Ok(hi),
+        Err(err) => ParseResult::Err(err),
+    })
+}
+
+/// Parse CLI arguments only into their low level representation.
+///
+/// This takes configuration into account. That is, it will try to read
+/// `RIPGREP_CONFIG_PATH` and prepend any arguments found there to the
+/// arguments passed to this process.
+///
+/// This will also set one-time global state flags, such as the log level and
+/// whether messages should be printed.
+fn parse_low() -> ParseResult<LowArgs> {
+    if let Err(err) = crate::logger::Logger::init() {
+        let err = anyhow::anyhow!("failed to initialize logger: {err}");
+        return ParseResult::Err(err);
+    }
+
+    let parser = Parser::new();
+    let mut low = LowArgs::default();
+    if let Err(err) = parser.parse(std::env::args_os().skip(1), &mut low) {
+        return ParseResult::Err(err);
+    }
+    // Even though we haven't parsed the config file yet (assuming it exists),
+    // we can still use the arguments given on the CLI to set up ripgrep's
+    // logging preferences. Even if the config file changes them in some way,
+    // it's really the best we can do. This way, for example, folks can pass
+    // `--trace` and see any messages logged during config file parsing.
+    set_log_levels(&low);
+    // Before we try to take configuration into account, we can bail early
+    // if a special mode was enabled. This is basically only for version and
+    // help output which shouldn't be impacted by extra configuration.
+    if let Some(special) = low.special.take() {
+        return ParseResult::Special(special);
+    }
+    // If the end user says no config, then respect it.
+    if low.no_config {
+        log::debug!("not reading config files because --no-config is present");
+        return ParseResult::Ok(low);
+    }
+    // Look for arguments from a config file. If we got nothing (whether the
+    // file is empty or RIPGREP_CONFIG_PATH wasn't set), then we don't need
+    // to re-parse.
+    let config_args = crate::flags::config::args();
+    if config_args.is_empty() {
+        log::debug!("no extra arguments found from configuration file");
+        return ParseResult::Ok(low);
+    }
+    // The final arguments are just the arguments from the CLI appended to
+    // the end of the config arguments.
+    let mut final_args = config_args;
+    final_args.extend(std::env::args_os().skip(1));
+
+    // Now do the CLI parsing dance again.
+    let mut low = LowArgs::default();
+    if let Err(err) = parser.parse(final_args.into_iter(), &mut low) {
+        return ParseResult::Err(err);
+    }
+    // Reset the message and logging levels, since they could have changed.
+    set_log_levels(&low);
+    ParseResult::Ok(low)
+}
+
+/// Sets global state flags that control logging based on low-level arguments.
+fn set_log_levels(low: &LowArgs) {
+    crate::messages::set_messages(!low.no_messages);
+    crate::messages::set_ignore_messages(!low.no_ignore_messages);
+    match low.logging {
+        Some(LoggingMode::Trace) => {
+            log::set_max_level(log::LevelFilter::Trace)
+        }
+        Some(LoggingMode::Debug) => {
+            log::set_max_level(log::LevelFilter::Debug)
+        }
+        None => log::set_max_level(log::LevelFilter::Warn),
+    }
+}
+
+/// Parse the sequence of CLI arguments into a low level typed set of
+/// arguments.
+///
+/// This is exposed for testing that the correct low-level arguments are parsed
+/// from a CLI. It just runs the parser once over the CLI arguments. It doesn't
+/// set up logging or read from a config file.
+///
+/// This assumes the iterator given does *not* begin with the binary name.
+#[cfg(test)]
+pub(crate) fn parse_low_raw(
+    rawargs: impl IntoIterator<Item = impl Into<OsString>>,
+) -> anyhow::Result<LowArgs> {
+    let mut args = LowArgs::default();
+    Parser::new().parse(rawargs, &mut args)?;
+    Ok(args)
+}
+
+/// Return the metadata for the flag of the given name.
+pub(super) fn lookup(name: &str) -> Option<&'static dyn Flag> {
+    // N.B. Creating a new parser might look expensive, but it only builds
+    // the lookup trie exactly once. That is, we get a `&'static Parser` from
+    // `Parser::new()`.
+    match Parser::new().find_long(name) {
+        FlagLookup::Match(&FlagInfo { flag, .. }) => Some(flag),
+        _ => None,
+    }
+}
+
+/// A parser for turning a sequence of command line arguments into a more
+/// strictly typed set of arguments.
+#[derive(Debug)]
+struct Parser {
+    /// A single map that contains all possible flag names. This includes
+    /// short and long names, aliases and negations. This maps those names to
+    /// indices into `info`.
+    map: FlagMap,
+    /// A map from IDs returned by the `map` to the corresponding flag
+    /// information.
+    info: Vec<FlagInfo>,
+}
+
+impl Parser {
+    /// Create a new parser.
+    ///
+    /// This always creates the same parser and only does it once. Callers may
+    /// call this repeatedly, and the parser will only be built once.
+    fn new() -> &'static Parser {
+        use std::sync::OnceLock;
+
+        // Since a parser's state is immutable and completely determined by
+        // FLAGS, and since FLAGS is a constant, we can initialize it exactly
+        // once.
+        static P: OnceLock<Parser> = OnceLock::new();
+        P.get_or_init(|| {
+            let mut infos = vec![];
+            for &flag in FLAGS.iter() {
+                infos.push(FlagInfo {
+                    flag,
+                    name: Ok(flag.name_long()),
+                    kind: FlagInfoKind::Standard,
+                });
+                for alias in flag.aliases() {
+                    infos.push(FlagInfo {
+                        flag,
+                        name: Ok(alias),
+                        kind: FlagInfoKind::Alias,
+                    });
+                }
+                if let Some(byte) = flag.name_short() {
+                    infos.push(FlagInfo {
+                        flag,
+                        name: Err(byte),
+                        kind: FlagInfoKind::Standard,
+                    });
+                }
+                if let Some(name) = flag.name_negated() {
+                    infos.push(FlagInfo {
+                        flag,
+                        name: Ok(name),
+                        kind: FlagInfoKind::Negated,
+                    });
+                }
+            }
+            let map = FlagMap::new(&infos);
+            Parser { map, info: infos }
+        })
+    }
+
+    /// Parse the given CLI arguments into a low level representation.
+    ///
+    /// The iterator given should *not* start with the binary name.
+    fn parse<I, O>(&self, rawargs: I, args: &mut LowArgs) -> anyhow::Result<()>
+    where
+        I: IntoIterator<Item = O>,
+        O: Into<OsString>,
+    {
+        let mut p = lexopt::Parser::from_args(rawargs);
+        while let Some(arg) = p.next().context("invalid CLI arguments")? {
+            let lookup = match arg {
+                lexopt::Arg::Value(value) => {
+                    args.positional.push(value);
+                    continue;
+                }
+                lexopt::Arg::Short(ch) if ch == 'h' => {
+                    // Special case -h/--help since behavior is different
+                    // based on whether short or long flag is given.
+                    args.special = Some(SpecialMode::HelpShort);
+                    continue;
+                }
+                lexopt::Arg::Short(ch) if ch == 'V' => {
+                    // Special case -V/--version since behavior is different
+                    // based on whether short or long flag is given.
+                    args.special = Some(SpecialMode::VersionShort);
+                    continue;
+                }
+                lexopt::Arg::Short(ch) => self.find_short(ch),
+                lexopt::Arg::Long(name) if name == "help" => {
+                    // Special case -h/--help since behavior is different
+                    // based on whether short or long flag is given.
+                    args.special = Some(SpecialMode::HelpLong);
+                    continue;
+                }
+                lexopt::Arg::Long(name) if name == "version" => {
+                    // Special case -V/--version since behavior is different
+                    // based on whether short or long flag is given.
+                    args.special = Some(SpecialMode::VersionLong);
+                    continue;
+                }
+                lexopt::Arg::Long(name) => self.find_long(name),
+            };
+            let mat = match lookup {
+                FlagLookup::Match(mat) => mat,
+                FlagLookup::UnrecognizedShort(name) => {
+                    anyhow::bail!("unrecognized flag -{name}")
+                }
+                FlagLookup::UnrecognizedLong(name) => {
+                    anyhow::bail!("unrecognized flag --{name}")
+                }
+            };
+            let value = if matches!(mat.kind, FlagInfoKind::Negated) {
+                // Negated flags are always switches, even if the non-negated
+                // flag is not. For example, --context-separator accepts a
+                // value, but --no-context-separator does not.
+                FlagValue::Switch(false)
+            } else if mat.flag.is_switch() {
+                FlagValue::Switch(true)
+            } else {
+                FlagValue::Value(p.value().with_context(|| {
+                    format!("missing value for flag {mat}")
+                })?)
+            };
+            mat.flag
+                .update(value, args)
+                .with_context(|| format!("error parsing flag {mat}"))?;
+        }
+        Ok(())
+    }
+
+    /// Look for a flag by its short name.
+    fn find_short(&self, ch: char) -> FlagLookup<'_> {
+        if !ch.is_ascii() {
+            return FlagLookup::UnrecognizedShort(ch);
+        }
+        let byte = u8::try_from(ch).unwrap();
+        let Some(index) = self.map.find(&[byte]) else {
+            return FlagLookup::UnrecognizedShort(ch);
+        };
+        FlagLookup::Match(&self.info[index])
+    }
+
+    /// Look for a flag by its long name.
+    ///
+    /// This also works for aliases and negated names.
+    fn find_long(&self, name: &str) -> FlagLookup<'_> {
+        let Some(index) = self.map.find(name.as_bytes()) else {
+            return FlagLookup::UnrecognizedLong(name.to_string());
+        };
+        FlagLookup::Match(&self.info[index])
+    }
+}
+
+/// The result of looking up a flag name.
+#[derive(Debug)]
+enum FlagLookup<'a> {
+    /// Lookup found a match and the metadata for the flag is attached.
+    Match(&'a FlagInfo),
+    /// The given short name is unrecognized.
+    UnrecognizedShort(char),
+    /// The given long name is unrecognized.
+    UnrecognizedLong(String),
+}
+
+/// The info about a flag associated with a flag's ID in the flag map.
+#[derive(Debug)]
+struct FlagInfo {
+    /// The flag object and its associated metadata.
+    flag: &'static dyn Flag,
+    /// The actual name that is stored in the flag map. When this is a byte,
+    /// it corresponds to a short single character ASCII flag. The actual key
+    /// that's in the flag map is just the single byte (as a one-element
+    /// byte string).
+    name: Result<&'static str, u8>,
+    /// The type of flag that is stored for the corresponding name in the
+    /// flag map.
+    kind: FlagInfoKind,
+}
+
+/// The kind of flag that is being matched.
+#[derive(Debug)]
+enum FlagInfoKind {
+    /// A standard flag, e.g., --passthru.
+    Standard,
+    /// A negation of a standard flag, e.g., --no-multiline.
+    Negated,
+    /// An alias for a standard flag, e.g., --passthrough.
+    Alias,
+}
+
+impl std::fmt::Display for FlagInfo {
+    fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
+        match self.name {
+            Ok(long) => write!(f, "--{long}"),
+            Err(short) => write!(f, "-{short}", short = char::from(short)),
+        }
+    }
+}
+
+/// A map from flag names (short, long, negated and aliases) to their ID.
+///
+/// Once an ID is known, it can be used to look up a flag's metadata in the
+/// parser's internal state.
+#[derive(Debug)]
+struct FlagMap {
+    map: std::collections::HashMap<Vec<u8>, usize>,
+}
+
+impl FlagMap {
+    /// Create a new map of flags for the given flag information.
+    ///
+    /// The index of each flag info corresponds to its ID.
+    fn new(infos: &[FlagInfo]) -> FlagMap {
+        let mut map = std::collections::HashMap::with_capacity(infos.len());
+        for (i, info) in infos.iter().enumerate() {
+            match info.name {
+                Ok(name) => {
+                    assert_eq!(None, map.insert(name.as_bytes().to_vec(), i));
+                }
+                Err(byte) => {
+                    assert_eq!(None, map.insert(vec![byte], i));
+                }
+            }
+        }
+        FlagMap { map }
+    }
+
+    /// Look for a match of `name` in this flag map.
+    ///
+    /// This only returns a match if `name` is exactly equal to one of the
+    /// names in the map. Prefixes of a name do not match.
+    fn find(&self, name: &[u8]) -> Option<usize> {
+        self.map.get(name).copied()
+    }
+}
--- a/crates/core/haystack.rs
+++ b/crates/core/haystack.rs
@@ -1,108 +1,111 @@
+/*!
+Defines a builder for haystacks.
+
+A "haystack" represents something we want to search. It encapsulates the logic
+for whether a haystack ought to be searched or not, separate from the standard
+ignore rules and other filtering logic.
+
+Effectively, a haystack wraps a directory entry and adds some light application
+level logic around it.
+*/
+
 use std::path::Path;

-/// A configuration for describing how subjects should be built.
-#[derive(Clone, Debug)]
-struct Config {
-    strip_dot_prefix: bool,
-}
-
-impl Default for Config {
-    fn default() -> Config {
-        Config { strip_dot_prefix: false }
-    }
-}
-
 /// A builder for constructing things to search over.
 #[derive(Clone, Debug)]
-pub struct SubjectBuilder {
-    config: Config,
+pub(crate) struct HaystackBuilder {
+    strip_dot_prefix: bool,
 }

-impl SubjectBuilder {
-    /// Return a new subject builder with a default configuration.
-    pub fn new() -> SubjectBuilder {
-        SubjectBuilder { config: Config::default() }
+impl HaystackBuilder {
+    /// Return a new haystack builder with a default configuration.
+    pub(crate) fn new() -> HaystackBuilder {
+        HaystackBuilder { strip_dot_prefix: false }
    }

-    /// Create a new subject from a possibly missing directory entry.
+    /// Create a new haystack from a possibly missing directory entry.
    ///
    /// If the directory entry isn't present, then the corresponding error is
-    /// logged if messages have been configured. Otherwise, if the subject is
-    /// deemed searchable, then it is returned.
-    pub fn build_from_result(
+    /// logged if messages have been configured. Otherwise, if the directory
+    /// entry is deemed searchable, then it is returned as a haystack.
+    pub(crate) fn build_from_result(
        &self,
        result: Result<ignore::DirEntry, ignore::Error>,
-    ) -> Option<Subject> {
+    ) -> Option<Haystack> {
        match result {
            Ok(dent) => self.build(dent),
            Err(err) => {
-                err_message!("{}", err);
+                err_message!("{err}");
                None
            }
        }
    }

-    /// Create a new subject using this builder's configuration.
+    /// Create a new haystack using this builder's configuration.
    ///
-    /// If a subject could not be created or should otherwise not be searched,
-    /// then this returns `None` after emitting any relevant log messages.
-    pub fn build(&self, dent: ignore::DirEntry) -> Option<Subject> {
-        let subj =
-            Subject { dent, strip_dot_prefix: self.config.strip_dot_prefix };
-        if let Some(ignore_err) = subj.dent.error() {
-            ignore_message!("{}", ignore_err);
+    /// If a haystack could not be created or should otherwise not be
+    /// searched, then this returns `None` after emitting any relevant log
+    /// messages.
+    fn build(&self, dent: ignore::DirEntry) -> Option<Haystack> {
+        let hay = Haystack { dent, strip_dot_prefix: self.strip_dot_prefix };
+        if let Some(err) = hay.dent.error() {
+            ignore_message!("{err}");
        }
        // If this entry was explicitly provided by an end user, then we always
        // want to search it.
-        if subj.is_explicit() {
-            return Some(subj);
+        if hay.is_explicit() {
+            return Some(hay);
        }
        // At this point, we only want to search something if it's explicitly a
        // file. This omits symlinks. (If ripgrep was configured to follow
        // symlinks, then they have already been followed by the directory
        // traversal.)
-        if subj.is_file() {
-            return Some(subj);
+        if hay.is_file() {
+            return Some(hay);
        }
        // We got nothing. Emit a debug message, but only if this isn't a
        // directory. Otherwise, emitting messages for directories is just
        // noisy.
-        if !subj.is_dir() {
+        if !hay.is_dir() {
            log::debug!(
-                "ignoring {}: failed to pass subject filter: \
+                "ignoring {}: failed to pass haystack filter: \
                 file type: {:?}, metadata: {:?}",
-                subj.dent.path().display(),
-                subj.dent.file_type(),
-                subj.dent.metadata()
+                hay.dent.path().display(),
+                hay.dent.file_type(),
+                hay.dent.metadata()
            );
        }
        None
    }

-    /// When enabled, if the subject's file path starts with `./` then it is
+    /// When enabled, if the haystack's file path starts with `./` then it is
    /// stripped.
    ///
    /// This is useful when implicitly searching the current working directory.
-    pub fn strip_dot_prefix(&mut self, yes: bool) -> &mut SubjectBuilder {
-        self.config.strip_dot_prefix = yes;
+    pub(crate) fn strip_dot_prefix(
+        &mut self,
+        yes: bool,
+    ) -> &mut HaystackBuilder {
+        self.strip_dot_prefix = yes;
        self
    }
 }

-/// A subject is a thing we want to search. Generally, a subject is either a
-/// file or stdin.
+/// A haystack is a thing we want to search.
+///
+/// Generally, a haystack is either a file or stdin.
 #[derive(Clone, Debug)]
-pub struct Subject {
+pub(crate) struct Haystack {
    dent: ignore::DirEntry,
    strip_dot_prefix: bool,
 }

-impl Subject {
-    /// Return the file path corresponding to this subject.
+impl Haystack {
+    /// Return the file path corresponding to this haystack.
    ///
-    /// If this subject corresponds to stdin, then a special `<stdin>` path
+    /// If this haystack corresponds to stdin, then a special `<stdin>` path
    /// is returned instead.
-    pub fn path(&self) -> &Path {
+    pub(crate) fn path(&self) -> &Path {
        if self.strip_dot_prefix && self.dent.path().starts_with("./") {
            self.dent.path().strip_prefix("./").unwrap()
        } else {
@@ -111,21 +114,21 @@ impl Subject {
    }

    /// Returns true if and only if this entry corresponds to stdin.
-    pub fn is_stdin(&self) -> bool {
+    pub(crate) fn is_stdin(&self) -> bool {
        self.dent.is_stdin()
    }

-    /// Returns true if and only if this entry corresponds to a subject to
+    /// Returns true if and only if this entry corresponds to a haystack to
    /// search that was explicitly supplied by an end user.
    ///
    /// Generally, this corresponds to either stdin or an explicit file path
    /// argument. e.g., in `rg foo some-file ./some-dir/`, `some-file` is
-    /// an explicit subject, but, e.g., `./some-dir/some-other-file` is not.
+    /// an explicit haystack, but, e.g., `./some-dir/some-other-file` is not.
    ///
    /// However, note that ripgrep does not see through shell globbing. e.g.,
    /// in `rg foo ./some-dir/*`, `./some-dir/some-other-file` will be treated
-    /// as an explicit subject.
-    pub fn is_explicit(&self) -> bool {
+    /// as an explicit haystack.
+    pub(crate) fn is_explicit(&self) -> bool {
        // stdin is obvious. When an entry has a depth of 0, that means it
        // was explicitly provided to our directory iterator, which means it
        // was in turn explicitly provided by the end user. The !is_dir check
@@ -135,7 +138,7 @@ impl Subject {
        self.is_stdin() || (self.dent.depth() == 0 && !self.is_dir())
    }

-    /// Returns true if and only if this subject points to a directory after
+    /// Returns true if and only if this haystack points to a directory after
    /// following symbolic links.
    fn is_dir(&self) -> bool {
        let ft = match self.dent.file_type() {
@@ -150,7 +153,7 @@ impl Subject {
        self.dent.path_is_symlink() && self.dent.path().is_dir()
    }

-    /// Returns true if and only if this subject points to a file.
+    /// Returns true if and only if this haystack points to a file.
    fn is_file(&self) -> bool {
        self.dent.file_type().map_or(false, |ft| ft.is_file())
    }
--- a/crates/core/logger.rs
+++ b/crates/core/logger.rs
@@ -1,7 +1,10 @@
-// This module defines a super simple logger that works with the `log` crate.
-// We don't need anything fancy; just basic log levels and the ability to
-// print to stderr. We therefore avoid bringing in extra dependencies just
-// for this functionality.
+/*!
+Defines a super simple logger that works with the `log` crate.
+
+We don't do anything fancy. We just need basic log levels and the ability to
+print to stderr. We therefore avoid bringing in extra dependencies just for
+this functionality.
+*/

 use log::{self, Log};

@@ -10,15 +13,16 @@ use log::{self, Log};
 /// This logger does no filtering. Instead, it relies on the `log` crates
 /// filtering via its global max_level setting.
 #[derive(Debug)]
-pub struct Logger(());
+pub(crate) struct Logger(());

+/// A singleton used as the target for an implementation of the `Log` trait.
 const LOGGER: &'static Logger = &Logger(());

 impl Logger {
    /// Create a new logger that logs to stderr and initialize it as the
    /// global logger. If there was a problem setting the logger, then an
    /// error is returned.
-    pub fn init() -> Result<(), log::SetLoggerError> {
+    pub(crate) fn init() -> Result<(), log::SetLoggerError> {
        log::set_logger(LOGGER)
    }
 }
--- a/crates/core/main.rs
+++ b/crates/core/main.rs
@@ -1,21 +1,20 @@
-use std::{
-    io::{self, Write},
-    time::Instant,
-};
+/*!
+The main entry point into ripgrep.
+*/
+
+use std::{io::Write, process::ExitCode};

 use ignore::WalkState;

-use crate::{args::Args, subject::Subject};
+use crate::flags::{HiArgs, SearchMode};

 #[macro_use]
 mod messages;

-mod app;
-mod args;
-mod config;
+mod flags;
+mod haystack;
 mod logger;
 mod search;
-mod subject;

 // Since Rust no longer uses jemalloc by default, ripgrep will, by default,
 // use the system allocator. On Linux, this would normally be glibc's
@@ -40,143 +39,163 @@ mod subject;
 #[global_allocator]
 static ALLOC: jemallocator::Jemalloc = jemallocator::Jemalloc;

-fn main() {
-    if let Err(err) = Args::parse().and_then(try_main) {
-        eprintln_locked!("{:#}", err);
-        std::process::exit(2);
-    }
-}
-
-fn try_main(args: Args) -> anyhow::Result<()> {
-    use args::Command::*;
-
-    let matched = match args.command() {
-        Search => search(&args),
-        SearchParallel => search_parallel(&args),
-        SearchNever => Ok(false),
-        Files => files(&args),
-        FilesParallel => files_parallel(&args),
-        Types => types(&args),
-        PCRE2Version => pcre2_version(&args),
-    }?;
-    if matched && (args.quiet() || !messages::errored()) {
-        std::process::exit(0)
-    } else if messages::errored() {
-        std::process::exit(2)
-    } else {
-        std::process::exit(1)
-    }
-}
-
-/// The top-level entry point for single-threaded search. This recursively
-/// steps through the file list (current directory by default) and searches
-/// each file sequentially.
-fn search(args: &Args) -> anyhow::Result<bool> {
-    /// The meat of the routine is here. This lets us call the same iteration
-    /// code over each file regardless of whether we stream over the files
-    /// as they're produced by the underlying directory traversal or whether
-    /// they've been collected and sorted (for example) first.
-    fn iter(
-        args: &Args,
-        subjects: impl Iterator<Item = Subject>,
-        started_at: std::time::Instant,
-    ) -> anyhow::Result<bool> {
-        let quit_after_match = args.quit_after_match()?;
-        let mut stats = args.stats()?;
-        let mut searcher = args.search_worker(args.stdout())?;
-        let mut matched = false;
-        let mut searched = false;
-
-        for subject in subjects {
-            searched = true;
-            let search_result = match searcher.search(&subject) {
-                Ok(search_result) => search_result,
-                // A broken pipe means graceful termination.
-                Err(err) if err.kind() == io::ErrorKind::BrokenPipe => break,
-                Err(err) => {
-                    err_message!("{}: {}", subject.path().display(), err);
-                    continue;
+/// Then, as it was, then again it will be.
+fn main() -> ExitCode {
+    match run(flags::parse()) {
+        Ok(code) => code,
+        Err(err) => {
+            // Look for a broken pipe error. In this case, we generally want
+            // to exit "gracefully" with a success exit code. This matches
+            // existing Unix convention. We need to handle this explicitly
+            // since the Rust runtime doesn't ask for PIPE signals, and thus
+            // we get an I/O error instead. Traditional C Unix applications
+            // quit by getting a PIPE signal that they don't handle, and thus
+            // the unhandled signal causes the process to unceremoniously
+            // terminate.
+            for cause in err.chain() {
+                if let Some(ioerr) = cause.downcast_ref::<std::io::Error>() {
+                    if ioerr.kind() == std::io::ErrorKind::BrokenPipe {
+                        return ExitCode::from(0);
+                    }
                }
-            };
-            matched |= search_result.has_match();
-            if let Some(ref mut stats) = stats {
-                *stats += search_result.stats().unwrap();
-            }
-            if matched && quit_after_match {
-                break;
            }
+            eprintln_locked!("{:#}", err);
+            ExitCode::from(2)
        }
-        if args.using_default_path() && !searched {
-            eprint_nothing_searched();
-        }
-        if let Some(ref stats) = stats {
-            let elapsed = Instant::now().duration_since(started_at);
-            // We don't care if we couldn't print this successfully.
-            let _ = searcher.print_stats(elapsed, stats);
-        }
-        Ok(matched)
-    }
-
-    let started_at = Instant::now();
-    let subject_builder = args.subject_builder();
-    let subjects = args
-        .walker()?
-        .filter_map(|result| subject_builder.build_from_result(result));
-    if args.needs_stat_sort() {
-        let subjects = args.sort_by_stat(subjects).into_iter();
-        iter(args, subjects, started_at)
-    } else {
-        iter(args, subjects, started_at)
    }
 }

-/// The top-level entry point for multi-threaded search. The parallelism is
-/// itself achieved by the recursive directory traversal. All we need to do is
-/// feed it a worker for performing a search on each file.
+/// The main entry point for ripgrep.
+///
+/// The given parse result determines ripgrep's behavior. The parse
+/// result should be the result of parsing CLI arguments in a low level
+/// representation, and then followed by an attempt to convert them into a
+/// higher level representation. The higher level representation has some nicer
+/// abstractions. For example, instead of representing the `-g/--glob` flag
+/// as a `Vec<String>` (as in the low level representation), the globs are
+/// converted into a single matcher.
+fn run(result: crate::flags::ParseResult<HiArgs>) -> anyhow::Result<ExitCode> {
+    use crate::flags::{Mode, ParseResult};
+
+    let args = match result {
+        ParseResult::Err(err) => return Err(err),
+        ParseResult::Special(mode) => return special(mode),
+        ParseResult::Ok(args) => args,
+    };
+    let matched = match args.mode() {
+        Mode::Search(_) if !args.matches_possible() => false,
+        Mode::Search(mode) if args.threads() == 1 => search(&args, mode)?,
+        Mode::Search(mode) => search_parallel(&args, mode)?,
+        Mode::Files if args.threads() == 1 => files(&args)?,
+        Mode::Files => files_parallel(&args)?,
+        Mode::Types => return types(&args),
+        Mode::Generate(mode) => return generate(mode),
+    };
+    Ok(if matched && (args.quiet() || !messages::errored()) {
+        ExitCode::from(0)
+    } else if messages::errored() {
+        ExitCode::from(2)
+    } else {
+        ExitCode::from(1)
+    })
+}
+
+/// The top-level entry point for single-threaded search.
+///
+/// This recursively steps through the file list (current directory by default)
+/// and searches each file sequentially.
+fn search(args: &HiArgs, mode: SearchMode) -> anyhow::Result<bool> {
+    let started_at = std::time::Instant::now();
+    let haystack_builder = args.haystack_builder();
+    let unsorted = args
+        .walk_builder()?
+        .build()
+        .filter_map(|result| haystack_builder.build_from_result(result));
+    let haystacks = args.sort(unsorted);
+
+    let mut matched = false;
+    let mut searched = false;
+    let mut stats = args.stats();
+    let mut searcher = args.search_worker(
+        args.matcher()?,
+        args.searcher()?,
+        args.printer(mode, args.stdout()),
+    )?;
+    for haystack in haystacks {
+        searched = true;
+        let search_result = match searcher.search(&haystack) {
+            Ok(search_result) => search_result,
+            // A broken pipe means graceful termination.
+            Err(err) if err.kind() == std::io::ErrorKind::BrokenPipe => break,
+            Err(err) => {
+                err_message!("{}: {}", haystack.path().display(), err);
+                continue;
+            }
+        };
+        matched = matched || search_result.has_match();
+        if let Some(ref mut stats) = stats {
+            *stats += search_result.stats().unwrap();
+        }
+        if matched && args.quit_after_match() {
+            break;
+        }
+    }
+    if args.has_implicit_path() && !searched {
+        eprint_nothing_searched();
+    }
+    if let Some(ref stats) = stats {
+        let wtr = searcher.printer().get_mut();
+        let _ = print_stats(mode, stats, started_at, wtr);
+    }
+    Ok(matched)
+}
+
+/// The top-level entry point for multi-threaded search.
+///
+/// The parallelism is itself achieved by the recursive directory traversal.
+/// All we need to do is feed it a worker for performing a search on each file.
 ///
 /// Requesting a sorted output from ripgrep (such as with `--sort path`) will
 /// automatically disable parallelism and hence sorting is not handled here.
-fn search_parallel(args: &Args) -> anyhow::Result<bool> {
-    use std::sync::atomic::{AtomicBool, Ordering::SeqCst};
+fn search_parallel(args: &HiArgs, mode: SearchMode) -> anyhow::Result<bool> {
+    use std::sync::atomic::{AtomicBool, Ordering};

-    let quit_after_match = args.quit_after_match()?;
-    let started_at = Instant::now();
-    let subject_builder = args.subject_builder();
-    let bufwtr = args.buffer_writer()?;
-    let stats = args.stats()?.map(std::sync::Mutex::new);
+    let started_at = std::time::Instant::now();
+    let haystack_builder = args.haystack_builder();
+    let bufwtr = args.buffer_writer();
+    let stats = args.stats().map(std::sync::Mutex::new);
    let matched = AtomicBool::new(false);
    let searched = AtomicBool::new(false);
-    let mut searcher_err = None;
-    args.walker_parallel()?.run(|| {
+
+    let mut searcher = args.search_worker(
+        args.matcher()?,
+        args.searcher()?,
+        args.printer(mode, bufwtr.buffer()),
+    )?;
+    args.walk_builder()?.build_parallel().run(|| {
        let bufwtr = &bufwtr;
        let stats = &stats;
        let matched = &matched;
        let searched = &searched;
-        let subject_builder = &subject_builder;
-        let mut searcher = match args.search_worker(bufwtr.buffer()) {
-            Ok(searcher) => searcher,
-            Err(err) => {
-                searcher_err = Some(err);
-                return Box::new(move |_| WalkState::Quit);
-            }
-        };
+        let haystack_builder = &haystack_builder;
+        let mut searcher = searcher.clone();

        Box::new(move |result| {
-            let subject = match subject_builder.build_from_result(result) {
-                Some(subject) => subject,
+            let haystack = match haystack_builder.build_from_result(result) {
+                Some(haystack) => haystack,
                None => return WalkState::Continue,
            };
-            searched.store(true, SeqCst);
+            searched.store(true, Ordering::SeqCst);
            searcher.printer().get_mut().clear();
-            let search_result = match searcher.search(&subject) {
+            let search_result = match searcher.search(&haystack) {
                Ok(search_result) => search_result,
                Err(err) => {
-                    err_message!("{}: {}", subject.path().display(), err);
+                    err_message!("{}: {}", haystack.path().display(), err);
                    return WalkState::Continue;
                }
            };
            if search_result.has_match() {
-                matched.store(true, SeqCst);
+                matched.store(true, Ordering::SeqCst);
            }
            if let Some(ref locked_stats) = *stats {
                let mut stats = locked_stats.lock().unwrap();
@@ -184,128 +203,110 @@ fn search_parallel(args: &Args) -> anyhow::Result<bool> {
            }
            if let Err(err) = bufwtr.print(searcher.printer().get_mut()) {
                // A broken pipe means graceful termination.
-                if err.kind() == io::ErrorKind::BrokenPipe {
+                if err.kind() == std::io::ErrorKind::BrokenPipe {
                    return WalkState::Quit;
                }
                // Otherwise, we continue on our merry way.
-                err_message!("{}: {}", subject.path().display(), err);
+                err_message!("{}: {}", haystack.path().display(), err);
            }
-            if matched.load(SeqCst) && quit_after_match {
+            if matched.load(Ordering::SeqCst) && args.quit_after_match() {
                WalkState::Quit
            } else {
                WalkState::Continue
            }
        })
    });
-    if let Some(err) = searcher_err.take() {
-        return Err(err);
-    }
-    if args.using_default_path() && !searched.load(SeqCst) {
+    if args.has_implicit_path() && !searched.load(Ordering::SeqCst) {
        eprint_nothing_searched();
    }
    if let Some(ref locked_stats) = stats {
-        let elapsed = Instant::now().duration_since(started_at);
        let stats = locked_stats.lock().unwrap();
-        let mut searcher = args.search_worker(args.stdout())?;
-        // We don't care if we couldn't print this successfully.
-        let _ = searcher.print_stats(elapsed, &stats);
+        let mut wtr = searcher.printer().get_mut();
+        let _ = print_stats(mode, &stats, started_at, &mut wtr);
+        let _ = bufwtr.print(&mut wtr);
    }
-    Ok(matched.load(SeqCst))
+    Ok(matched.load(Ordering::SeqCst))
 }

-fn eprint_nothing_searched() {
-    err_message!(
-        "No files were searched, which means ripgrep probably \
-         applied a filter you didn't expect.\n\
-         Running with --debug will show why files are being skipped."
-    );
-}
+/// The top-level entry point for file listing without searching.
+///
+/// This recursively steps through the file list (current directory by default)
+/// and prints each path sequentially using a single thread.
+fn files(args: &HiArgs) -> anyhow::Result<bool> {
+    let haystack_builder = args.haystack_builder();
+    let unsorted = args
+        .walk_builder()?
+        .build()
+        .filter_map(|result| haystack_builder.build_from_result(result));
+    let haystacks = args.sort(unsorted);

-/// The top-level entry point for listing files without searching them. This
-/// recursively steps through the file list (current directory by default) and
-/// prints each path sequentially using a single thread.
-fn files(args: &Args) -> anyhow::Result<bool> {
-    /// The meat of the routine is here. This lets us call the same iteration
-    /// code over each file regardless of whether we stream over the files
-    /// as they're produced by the underlying directory traversal or whether
-    /// they've been collected and sorted (for example) first.
-    fn iter(
-        args: &Args,
-        subjects: impl Iterator<Item = Subject>,
-    ) -> anyhow::Result<bool> {
-        let quit_after_match = args.quit_after_match()?;
-        let mut matched = false;
-        let mut path_printer = args.path_printer(args.stdout())?;
-
-        for subject in subjects {
-            matched = true;
-            if quit_after_match {
+    let mut matched = false;
+    let mut path_printer = args.path_printer_builder().build(args.stdout());
+    for haystack in haystacks {
+        matched = true;
+        if args.quit_after_match() {
+            break;
+        }
+        if let Err(err) = path_printer.write(haystack.path()) {
+            // A broken pipe means graceful termination.
+            if err.kind() == std::io::ErrorKind::BrokenPipe {
                break;
            }
-            if let Err(err) = path_printer.write(subject.path()) {
-                // A broken pipe means graceful termination.
-                if err.kind() == io::ErrorKind::BrokenPipe {
-                    break;
-                }
-                // Otherwise, we have some other error that's preventing us from
-                // writing to stdout, so we should bubble it up.
-                return Err(err.into());
-            }
+            // Otherwise, we have some other error that's preventing us from
+            // writing to stdout, so we should bubble it up.
+            return Err(err.into());
        }
-        Ok(matched)
-    }
-
-    let subject_builder = args.subject_builder();
-    let subjects = args
-        .walker()?
-        .filter_map(|result| subject_builder.build_from_result(result));
-    if args.needs_stat_sort() {
-        let subjects = args.sort_by_stat(subjects).into_iter();
-        iter(args, subjects)
-    } else {
-        iter(args, subjects)
    }
+    Ok(matched)
 }

-/// The top-level entry point for listing files without searching them. This
-/// recursively steps through the file list (current directory by default) and
-/// prints each path sequentially using multiple threads.
+/// The top-level entry point for multi-threaded file listing without
+/// searching.
+///
+/// This recursively steps through the file list (current directory by default)
+/// and prints each path sequentially using multiple threads.
 ///
 /// Requesting a sorted output from ripgrep (such as with `--sort path`) will
 /// automatically disable parallelism and hence sorting is not handled here.
-fn files_parallel(args: &Args) -> anyhow::Result<bool> {
-    use std::sync::atomic::AtomicBool;
-    use std::sync::atomic::Ordering::SeqCst;
-    use std::sync::mpsc;
-    use std::thread;
+fn files_parallel(args: &HiArgs) -> anyhow::Result<bool> {
+    use std::{
+        sync::{
+            atomic::{AtomicBool, Ordering},
+            mpsc,
+        },
+        thread,
+    };

-    let quit_after_match = args.quit_after_match()?;
-    let subject_builder = args.subject_builder();
-    let mut path_printer = args.path_printer(args.stdout())?;
+    let haystack_builder = args.haystack_builder();
+    let mut path_printer = args.path_printer_builder().build(args.stdout());
    let matched = AtomicBool::new(false);
-    let (tx, rx) = mpsc::channel::<Subject>();
+    let (tx, rx) = mpsc::channel::<crate::haystack::Haystack>();

-    let print_thread = thread::spawn(move || -> io::Result<()> {
-        for subject in rx.iter() {
-            path_printer.write(subject.path())?;
+    // We spawn a single printing thread to make sure we don't tear writes.
+    // We use a channel here under the presumption that it's probably faster
+    // than using a mutex in the worker threads below, but this has never been
+    // seriously litigated.
+    let print_thread = thread::spawn(move || -> std::io::Result<()> {
+        for haystack in rx.iter() {
+            path_printer.write(haystack.path())?;
        }
        Ok(())
    });
-    args.walker_parallel()?.run(|| {
-        let subject_builder = &subject_builder;
+    args.walk_builder()?.build_parallel().run(|| {
+        let haystack_builder = &haystack_builder;
        let matched = &matched;
        let tx = tx.clone();

        Box::new(move |result| {
-            let subject = match subject_builder.build_from_result(result) {
-                Some(subject) => subject,
+            let haystack = match haystack_builder.build_from_result(result) {
+                Some(haystack) => haystack,
                None => return WalkState::Continue,
            };
-            matched.store(true, SeqCst);
-            if quit_after_match {
+            matched.store(true, Ordering::SeqCst);
+            if args.quit_after_match() {
                WalkState::Quit
            } else {
-                match tx.send(subject) {
+                match tx.send(haystack) {
                    Ok(_) => WalkState::Continue,
                    Err(_) => WalkState::Quit,
                }
@@ -317,18 +318,18 @@ fn files_parallel(args: &Args) -> anyhow::Result<bool> {
        // A broken pipe means graceful termination, so fall through.
        // Otherwise, something bad happened while writing to stdout, so bubble
        // it up.
-        if err.kind() != io::ErrorKind::BrokenPipe {
+        if err.kind() != std::io::ErrorKind::BrokenPipe {
            return Err(err.into());
        }
    }
-    Ok(matched.load(SeqCst))
+    Ok(matched.load(Ordering::SeqCst))
 }

-/// The top-level entry point for --type-list.
-fn types(args: &Args) -> anyhow::Result<bool> {
+/// The top-level entry point for `--type-list`.
+fn types(args: &HiArgs) -> anyhow::Result<ExitCode> {
    let mut count = 0;
    let mut stdout = args.stdout();
-    for def in args.type_defs()? {
+    for def in args.types().definitions() {
        count += 1;
        stdout.write_all(def.name().as_bytes())?;
        stdout.write_all(b": ")?;
@@ -343,32 +344,156 @@ fn types(args: &Args) -> anyhow::Result<bool> {
        }
        stdout.write_all(b"\n")?;
    }
-    Ok(count > 0)
+    Ok(ExitCode::from(if count == 0 { 1 } else { 0 }))
 }

-/// The top-level entry point for --pcre2-version.
-fn pcre2_version(args: &Args) -> anyhow::Result<bool> {
-    #[cfg(feature = "pcre2")]
-    fn imp(args: &Args) -> anyhow::Result<bool> {
-        use grep::pcre2;
+/// Implements ripgrep's "generate" modes.
+///
+/// These modes correspond to generating some kind of ancillary data related
+/// to ripgrep. At present, this includes ripgrep's man page (in roff format)
+/// and supported shell completions.
+fn generate(mode: crate::flags::GenerateMode) -> anyhow::Result<ExitCode> {
+    use crate::flags::GenerateMode;

-        let mut stdout = args.stdout();
+    let output = match mode {
+        GenerateMode::Man => flags::generate_man_page(),
+        GenerateMode::CompleteBash => flags::generate_complete_bash(),
+        GenerateMode::CompleteZsh => flags::generate_complete_zsh(),
+        GenerateMode::CompleteFish => flags::generate_complete_fish(),
+        GenerateMode::CompletePowerShell => {
+            flags::generate_complete_powershell()
+        }
+    };
+    writeln!(std::io::stdout(), "{}", output.trim_end())?;
+    Ok(ExitCode::from(0))
+}
+
+/// Implements ripgrep's "special" modes.
+///
+/// A special mode is one that generally short-circuits most (not all) of
+/// ripgrep's initialization logic and skips right to this routine. The
+/// special modes essentially consist of printing help and version output. The
+/// idea behind the short circuiting is to ensure there is as little as possible
+/// (within reason) that would prevent ripgrep from emitting help output.
+///
+/// For example, part of the initialization logic that is skipped (among
+/// other things) is accessing the current working directory. If that fails,
+/// ripgrep emits an error. We don't want to emit an error if it fails and
+/// the user requested version or help information.
+fn special(mode: crate::flags::SpecialMode) -> anyhow::Result<ExitCode> {
+    use crate::flags::SpecialMode;
+
+    let output = match mode {
+        SpecialMode::HelpShort => flags::generate_help_short(),
+        SpecialMode::HelpLong => flags::generate_help_long(),
+        SpecialMode::VersionShort => flags::generate_version_short(),
+        SpecialMode::VersionLong => flags::generate_version_long(),
+        // --pcre2-version is a little special because it emits an error
+        // exit code if this build of ripgrep doesn't support PCRE2.
+        SpecialMode::VersionPCRE2 => return version_pcre2(),
+    };
+    writeln!(std::io::stdout(), "{}", output.trim_end())?;
+    Ok(ExitCode::from(0))
+}
+
+/// The top-level entry point for `--pcre2-version`.
+fn version_pcre2() -> anyhow::Result<ExitCode> {
+    let mut stdout = std::io::stdout().lock();
+
+    #[cfg(feature = "pcre2")]
+    {
+        use grep::pcre2;

        let (major, minor) = pcre2::version();
        writeln!(stdout, "PCRE2 {}.{} is available", major, minor)?;
-
        if cfg!(target_pointer_width = "64") && pcre2::is_jit_available() {
            writeln!(stdout, "JIT is available")?;
        }
-        Ok(true)
+        Ok(ExitCode::from(0))
    }

    #[cfg(not(feature = "pcre2"))]
-    fn imp(args: &Args) -> anyhow::Result<bool> {
-        let mut stdout = args.stdout();
+    {
        writeln!(stdout, "PCRE2 is not available in this build of ripgrep.")?;
-        Ok(false)
+        Ok(ExitCode::from(1))
+    }
+}
+
+/// Prints a heuristic error message when nothing is searched.
+///
+/// This can happen if an applicable ignore file has one or more rules that
+/// are too broad and cause ripgrep to ignore everything.
+///
+/// We only show this error message when the user does *not* provide an
+/// explicit path to search. This is because the message can otherwise be
+/// noisy, e.g., when it is intended that there is nothing to search.
+fn eprint_nothing_searched() {
+    err_message!(
+        "No files were searched, which means ripgrep probably \
+         applied a filter you didn't expect.\n\
+         Running with --debug will show why files are being skipped."
+    );
+}
+
+/// Prints the given statistics to the given writer.
+///
+/// The search mode given determines whether the stats should be printed in
+/// a plain text format or in a JSON format.
+///
+/// The `started` time should be the time at which ripgrep started working.
+///
+/// If an error occurs while writing, then writing stops and the error is
+/// returned. Note that callers should probably ignore this error, since
+/// whether stats fail to print or not generally shouldn't cause ripgrep to
+/// enter into an "error" state. And usually the only way for this to fail is
+/// if writing to stdout itself fails.
+fn print_stats<W: Write>(
+    mode: SearchMode,
+    stats: &grep::printer::Stats,
+    started: std::time::Instant,
+    mut wtr: W,
+) -> std::io::Result<()> {
+    let elapsed = std::time::Instant::now().duration_since(started);
+    if matches!(mode, SearchMode::JSON) {
+        // We specifically match the format laid out by the JSON printer in
+        // the grep-printer crate. We simply "extend" it with the 'summary'
+        // message type.
+        serde_json::to_writer(
+            &mut wtr,
+            &serde_json::json!({
+                "type": "summary",
+                "data": {
+                    "stats": stats,
+                    "elapsed_total": {
+                        "secs": elapsed.as_secs(),
+                        "nanos": elapsed.subsec_nanos(),
+                        "human": format!("{:0.6}s", elapsed.as_secs_f64()),
+                    },
+                }
+            }),
+        )?;
+        write!(wtr, "\n")
+    } else {
+        write!(
+            wtr,
+            "
+{matches} matches
+{lines} matched lines
+{searches_with_match} files contained matches
+{searches} files searched
+{bytes_printed} bytes printed
+{bytes_searched} bytes searched
+{search_time:0.6} seconds spent searching
+{process_time:0.6} seconds
+",
+            matches = stats.matches(),
+            lines = stats.matched_lines(),
+            searches_with_match = stats.searches_with_match(),
+            searches = stats.searches(),
+            bytes_printed = stats.bytes_printed(),
+            bytes_searched = stats.bytes_searched(),
+            search_time = stats.elapsed().as_secs_f64(),
+            process_time = elapsed.as_secs_f64(),
+        )
    }
-
-    imp(args)
 }
--- a/crates/core/messages.rs
+++ b/crates/core/messages.rs
@@ -1,21 +1,59 @@
+/*!
+This module defines some macros and some light shared mutable state.
+
+This state is responsible for keeping track of whether we should emit certain
+kinds of messages to the user (such as errors) that are distinct from the
+standard "debug" or "trace" log messages. This state is specifically set at
+startup time when CLI arguments are parsed and then never changed.
+
+The other state tracked here is whether ripgrep experienced an error
+condition. Aside from errors associated with invalid CLI arguments, ripgrep
+generally does not abort when an error occurs (e.g., if reading a file failed).
+But when an error does occur, it will alter ripgrep's exit status. Thus, when
+an error message is emitted via `err_message`, then a global flag is toggled
+indicating that at least one error occurred. When ripgrep exits, this flag is
+consulted to determine what the exit status ought to be.
+*/
+
 use std::sync::atomic::{AtomicBool, Ordering};

+/// When false, "messages" will not be printed.
 static MESSAGES: AtomicBool = AtomicBool::new(false);
+/// When false, "messages" related to ignore rules will not be printed.
 static IGNORE_MESSAGES: AtomicBool = AtomicBool::new(false);
+/// Flipped to true when an error message is printed.
 static ERRORED: AtomicBool = AtomicBool::new(false);

-/// Like eprintln, but locks STDOUT to prevent interleaving lines.
+/// Like eprintln, but locks stdout to prevent interleaving lines.
+///
+/// This locks stdout, not stderr, even though this prints to stderr. This
+/// avoids the appearance of interleaving output when stdout and stderr both
+/// correspond to a tty.
 #[macro_export]
 macro_rules! eprintln_locked {
    ($($tt:tt)*) => {{
        {
+            use std::io::Write;
+
            // This is a bit of an abstraction violation because we explicitly
-            // lock STDOUT before printing to STDERR. This avoids interleaving
+            // lock stdout before printing to stderr. This avoids interleaving
            // lines within ripgrep because `search_parallel` uses `termcolor`,
-            // which accesses the same STDOUT lock when writing lines.
+            // which accesses the same stdout lock when writing lines.
            let stdout = std::io::stdout();
            let _handle = stdout.lock();
-            eprintln!($($tt)*);
+            // We handle write errors here explicitly instead of panicking.
+            // One plausible error is a broken pipe, and when that occurs, we
+            // should exit gracefully. Otherwise, just abort with an error
+            // code because there isn't much else we can do.
+            //
+            // See: https://github.com/BurntSushi/ripgrep/issues/1966
+            if let Err(err) = writeln!(std::io::stderr(), $($tt)*) {
+                if err.kind() == std::io::ErrorKind::BrokenPipe {
+                    std::process::exit(0);
+                } else {
+                    std::process::exit(2);
+                }
+            }
        }
    }}
 }
@@ -52,19 +90,19 @@ macro_rules! ignore_message {
 }

 /// Returns true if and only if messages should be shown.
-pub fn messages() -> bool {
+pub(crate) fn messages() -> bool {
    MESSAGES.load(Ordering::SeqCst)
 }

 /// Set whether messages should be shown or not.
 ///
 /// By default, they are not shown.
-pub fn set_messages(yes: bool) {
+pub(crate) fn set_messages(yes: bool) {
    MESSAGES.store(yes, Ordering::SeqCst)
 }

 /// Returns true if and only if "ignore" related messages should be shown.
-pub fn ignore_messages() -> bool {
+pub(crate) fn ignore_messages() -> bool {
    IGNORE_MESSAGES.load(Ordering::SeqCst)
 }

@@ -75,16 +113,19 @@ pub fn ignore_messages() -> bool {
 /// Note that this is overridden if `messages` is disabled. Namely, if
 /// `messages` is disabled, then "ignore" messages are never shown, regardless
 /// of this setting.
-pub fn set_ignore_messages(yes: bool) {
+pub(crate) fn set_ignore_messages(yes: bool) {
    IGNORE_MESSAGES.store(yes, Ordering::SeqCst)
 }

 /// Returns true if and only if ripgrep came across a non-fatal error.
-pub fn errored() -> bool {
+pub(crate) fn errored() -> bool {
    ERRORED.load(Ordering::SeqCst)
 }

 /// Indicate that ripgrep has come across a non-fatal error.
-pub fn set_errored() {
+///
+/// Callers should not use this directly. Instead, it is called automatically
+/// via the `err_message` macro.
+pub(crate) fn set_errored() {
    ERRORED.store(true, Ordering::SeqCst);
 }
--- a/crates/core/search.rs
+++ b/crates/core/search.rs
@@ -1,59 +1,47 @@
-use std::{
-    io,
-    path::{Path, PathBuf},
-    time::Duration,
-};
+/*!
+Defines a very high level "search worker" abstraction.

-use {
-    grep::{
-        cli,
-        matcher::Matcher,
-        printer::{Standard, Stats, Summary, JSON},
-        regex::RegexMatcher as RustRegexMatcher,
-        searcher::{BinaryDetection, Searcher},
-    },
-    ignore::overrides::Override,
-    serde_json::{self as json, json},
-    termcolor::WriteColor,
-};
+A search worker manages the high level interaction points between the matcher
+(i.e., which regex engine is used), the searcher (i.e., how data is actually
+read and matched using the regex engine) and the printer. For example, the
+search worker is where things like preprocessors or decompression happen.
+*/

-#[cfg(feature = "pcre2")]
-use grep::pcre2::RegexMatcher as PCRE2RegexMatcher;
+use std::{io, path::Path};

-use crate::subject::Subject;
+use {grep::matcher::Matcher, termcolor::WriteColor};

-/// The configuration for the search worker. Among a few other things, the
-/// configuration primarily controls the way we show search results to users
-/// at a very high level.
+/// The configuration for the search worker.
+///
+/// Among a few other things, the configuration primarily controls the way we
+/// show search results to users at a very high level.
 #[derive(Clone, Debug)]
 struct Config {
-    json_stats: bool,
-    preprocessor: Option<PathBuf>,
-    preprocessor_globs: Override,
+    preprocessor: Option<std::path::PathBuf>,
+    preprocessor_globs: ignore::overrides::Override,
    search_zip: bool,
-    binary_implicit: BinaryDetection,
-    binary_explicit: BinaryDetection,
+    binary_implicit: grep::searcher::BinaryDetection,
+    binary_explicit: grep::searcher::BinaryDetection,
 }

 impl Default for Config {
    fn default() -> Config {
        Config {
-            json_stats: false,
            preprocessor: None,
-            preprocessor_globs: Override::empty(),
+            preprocessor_globs: ignore::overrides::Override::empty(),
            search_zip: false,
-            binary_implicit: BinaryDetection::none(),
-            binary_explicit: BinaryDetection::none(),
+            binary_implicit: grep::searcher::BinaryDetection::none(),
+            binary_explicit: grep::searcher::BinaryDetection::none(),
        }
    }
 }

 /// A builder for configuring and constructing a search worker.
 #[derive(Clone, Debug)]
-pub struct SearchWorkerBuilder {
+pub(crate) struct SearchWorkerBuilder {
    config: Config,
-    command_builder: cli::CommandReaderBuilder,
-    decomp_builder: cli::DecompressionReaderBuilder,
+    command_builder: grep::cli::CommandReaderBuilder,
+    decomp_builder: grep::cli::DecompressionReaderBuilder,
 }

 impl Default for SearchWorkerBuilder {
@@ -64,11 +52,11 @@ impl Default for SearchWorkerBuilder {

 impl SearchWorkerBuilder {
    /// Create a new builder for configuring and constructing a search worker.
-    pub fn new() -> SearchWorkerBuilder {
-        let mut cmd_builder = cli::CommandReaderBuilder::new();
+    pub(crate) fn new() -> SearchWorkerBuilder {
+        let mut cmd_builder = grep::cli::CommandReaderBuilder::new();
        cmd_builder.async_stderr(true);

-        let mut decomp_builder = cli::DecompressionReaderBuilder::new();
+        let mut decomp_builder = grep::cli::DecompressionReaderBuilder::new();
        decomp_builder.async_stderr(true);

        SearchWorkerBuilder {
@@ -80,10 +68,10 @@ impl SearchWorkerBuilder {

    /// Create a new search worker using the given searcher, matcher and
    /// printer.
-    pub fn build<W: WriteColor>(
+    pub(crate) fn build<W: WriteColor>(
        &self,
        matcher: PatternMatcher,
-        searcher: Searcher,
+        searcher: grep::searcher::Searcher,
        printer: Printer<W>,
    ) -> SearchWorker<W> {
        let config = self.config.clone();
@@ -99,29 +87,17 @@ impl SearchWorkerBuilder {
        }
    }

-    /// Forcefully use JSON to emit statistics, even if the underlying printer
-    /// is not the JSON printer.
-    ///
-    /// This is useful for implementing flag combinations like
-    /// `--json --quiet`, which uses the summary printer for implementing
-    /// `--quiet` but still wants to emit summary statistics, which should
-    /// be JSON formatted because of the `--json` flag.
-    pub fn json_stats(&mut self, yes: bool) -> &mut SearchWorkerBuilder {
-        self.config.json_stats = yes;
-        self
-    }
-
    /// Set the path to a preprocessor command.
    ///
    /// When this is set, instead of searching files directly, the given
    /// command will be run with the file path as the first argument, and the
    /// output of that command will be searched instead.
-    pub fn preprocessor(
+    pub(crate) fn preprocessor(
        &mut self,
-        cmd: Option<PathBuf>,
+        cmd: Option<std::path::PathBuf>,
    ) -> anyhow::Result<&mut SearchWorkerBuilder> {
        if let Some(ref prog) = cmd {
-            let bin = cli::resolve_binary(prog)?;
+            let bin = grep::cli::resolve_binary(prog)?;
            self.config.preprocessor = Some(bin);
        } else {
            self.config.preprocessor = None;
@@ -132,9 +108,9 @@ impl SearchWorkerBuilder {
    /// Set the globs for determining which files should be run through the
    /// preprocessor. By default, with no globs and a preprocessor specified,
    /// every file is run through the preprocessor.
-    pub fn preprocessor_globs(
+    pub(crate) fn preprocessor_globs(
        &mut self,
-        globs: Override,
+        globs: ignore::overrides::Override,
    ) -> &mut SearchWorkerBuilder {
        self.config.preprocessor_globs = globs;
        self
@@ -147,7 +123,10 @@ impl SearchWorkerBuilder {
    ///
    /// Note that if a preprocessor command is set, then it overrides this
    /// setting.
-    pub fn search_zip(&mut self, yes: bool) -> &mut SearchWorkerBuilder {
+    pub(crate) fn search_zip(
+        &mut self,
+        yes: bool,
+    ) -> &mut SearchWorkerBuilder {
        self.config.search_zip = yes;
        self
    }
@@ -155,13 +134,14 @@ impl SearchWorkerBuilder {
    /// Set the binary detection that should be used when searching files
    /// found via a recursive directory search.
    ///
-    /// Generally, this binary detection may be `BinaryDetection::quit` if
-    /// we want to skip binary files completely.
+    /// Generally, this binary detection may be
+    /// `grep::searcher::BinaryDetection::quit` if we want to skip binary files
+    /// completely.
    ///
    /// By default, no binary detection is performed.
-    pub fn binary_detection_implicit(
+    pub(crate) fn binary_detection_implicit(
        &mut self,
-        detection: BinaryDetection,
+        detection: grep::searcher::BinaryDetection,
    ) -> &mut SearchWorkerBuilder {
        self.config.binary_implicit = detection;
        self
@@ -170,14 +150,14 @@ impl SearchWorkerBuilder {
    /// Set the binary detection that should be used when searching files
    /// explicitly supplied by an end user.
    ///
-    /// Generally, this binary detection should NOT be `BinaryDetection::quit`,
-    /// since we never want to automatically filter files supplied by the end
-    /// user.
+    /// Generally, this binary detection should NOT be
+    /// `grep::searcher::BinaryDetection::quit`, since we never want to
+    /// automatically filter files supplied by the end user.
    ///
    /// By default, no binary detection is performed.
-    pub fn binary_detection_explicit(
+    pub(crate) fn binary_detection_explicit(
        &mut self,
-        detection: BinaryDetection,
+        detection: grep::searcher::BinaryDetection,
    ) -> &mut SearchWorkerBuilder {
        self.config.binary_explicit = detection;
        self
@@ -191,14 +171,14 @@ impl SearchWorkerBuilder {
 /// every search also has some aggregate statistics or meta data that may be
 /// useful to higher level routines.
 #[derive(Clone, Debug, Default)]
-pub struct SearchResult {
+pub(crate) struct SearchResult {
    has_match: bool,
-    stats: Option<Stats>,
+    stats: Option<grep::printer::Stats>,
 }

 impl SearchResult {
    /// Whether the search found a match or not.
-    pub fn has_match(&self) -> bool {
+    pub(crate) fn has_match(&self) -> bool {
        self.has_match
    }

@@ -206,103 +186,36 @@ impl SearchResult {
    ///
    /// It can be expensive to compute statistics, so these are only present
    /// if explicitly enabled in the printer provided by the caller.
-    pub fn stats(&self) -> Option<&Stats> {
+    pub(crate) fn stats(&self) -> Option<&grep::printer::Stats> {
        self.stats.as_ref()
    }
 }

 /// The pattern matcher used by a search worker.
 #[derive(Clone, Debug)]
-pub enum PatternMatcher {
-    RustRegex(RustRegexMatcher),
+pub(crate) enum PatternMatcher {
+    RustRegex(grep::regex::RegexMatcher),
    #[cfg(feature = "pcre2")]
-    PCRE2(PCRE2RegexMatcher),
+    PCRE2(grep::pcre2::RegexMatcher),
 }

 /// The printer used by a search worker.
 ///
 /// The `W` type parameter refers to the type of the underlying writer.
-#[derive(Debug)]
-pub enum Printer<W> {
+#[derive(Clone, Debug)]
+pub(crate) enum Printer<W> {
    /// Use the standard printer, which supports the classic grep-like format.
-    Standard(Standard<W>),
+    Standard(grep::printer::Standard<W>),
    /// Use the summary printer, which supports aggregate displays of search
    /// results.
-    Summary(Summary<W>),
+    Summary(grep::printer::Summary<W>),
    /// A JSON printer, which emits results in the JSON Lines format.
-    JSON(JSON<W>),
+    JSON(grep::printer::JSON<W>),
 }

 impl<W: WriteColor> Printer<W> {
-    fn print_stats(
-        &mut self,
-        total_duration: Duration,
-        stats: &Stats,
-    ) -> io::Result<()> {
-        match *self {
-            Printer::JSON(_) => self.print_stats_json(total_duration, stats),
-            Printer::Standard(_) | Printer::Summary(_) => {
-                self.print_stats_human(total_duration, stats)
-            }
-        }
-    }
-
-    fn print_stats_human(
-        &mut self,
-        total_duration: Duration,
-        stats: &Stats,
-    ) -> io::Result<()> {
-        write!(
-            self.get_mut(),
-            "
-{matches} matches
-{lines} matched lines
-{searches_with_match} files contained matches
-{searches} files searched
-{bytes_printed} bytes printed
-{bytes_searched} bytes searched
-{search_time:0.6} seconds spent searching
-{process_time:0.6} seconds
-",
-            matches = stats.matches(),
-            lines = stats.matched_lines(),
-            searches_with_match = stats.searches_with_match(),
-            searches = stats.searches(),
-            bytes_printed = stats.bytes_printed(),
-            bytes_searched = stats.bytes_searched(),
-            search_time = fractional_seconds(stats.elapsed()),
-            process_time = fractional_seconds(total_duration)
-        )
-    }
-
-    fn print_stats_json(
-        &mut self,
-        total_duration: Duration,
-        stats: &Stats,
-    ) -> io::Result<()> {
-        // We specifically match the format laid out by the JSON printer in
-        // the grep-printer crate. We simply "extend" it with the 'summary'
-        // message type.
-        let fractional = fractional_seconds(total_duration);
-        json::to_writer(
-            self.get_mut(),
-            &json!({
-                "type": "summary",
-                "data": {
-                    "stats": stats,
-                    "elapsed_total": {
-                        "secs": total_duration.as_secs(),
-                        "nanos": total_duration.subsec_nanos(),
-                        "human": format!("{:0.6}s", fractional),
-                    },
-                }
-            }),
-        )?;
-        write!(self.get_mut(), "\n")
-    }
-
    /// Return a mutable reference to the underlying printer's writer.
-    pub fn get_mut(&mut self) -> &mut W {
+    pub(crate) fn get_mut(&mut self) -> &mut W {
        match *self {
            Printer::Standard(ref mut p) => p.get_mut(),
            Printer::Summary(ref mut p) => p.get_mut(),
@@ -316,29 +229,32 @@ impl<W: WriteColor> Printer<W> {
 /// It is intended for a single worker to execute many searches, and is
 /// generally intended to be used from a single thread. When searching using
 /// multiple threads, it is better to create a new worker for each thread.
-#[derive(Debug)]
-pub struct SearchWorker<W> {
+#[derive(Clone, Debug)]
+pub(crate) struct SearchWorker<W> {
    config: Config,
-    command_builder: cli::CommandReaderBuilder,
-    decomp_builder: cli::DecompressionReaderBuilder,
+    command_builder: grep::cli::CommandReaderBuilder,
+    decomp_builder: grep::cli::DecompressionReaderBuilder,
    matcher: PatternMatcher,
-    searcher: Searcher,
+    searcher: grep::searcher::Searcher,
    printer: Printer<W>,
 }

 impl<W: WriteColor> SearchWorker<W> {
-    /// Execute a search over the given subject.
-    pub fn search(&mut self, subject: &Subject) -> io::Result<SearchResult> {
-        let bin = if subject.is_explicit() {
+    /// Execute a search over the given haystack.
+    pub(crate) fn search(
+        &mut self,
+        haystack: &crate::haystack::Haystack,
+    ) -> io::Result<SearchResult> {
+        let bin = if haystack.is_explicit() {
            self.config.binary_explicit.clone()
        } else {
            self.config.binary_implicit.clone()
        };
-        let path = subject.path();
+        let path = haystack.path();
        log::trace!("{}: binary detection: {:?}", path.display(), bin);

        self.searcher.set_binary_detection(bin);
-        if subject.is_stdin() {
+        if haystack.is_stdin() {
            self.search_reader(path, &mut io::stdin().lock())
        } else if self.should_preprocess(path) {
            self.search_preprocessor(path)
@@ -350,28 +266,10 @@ impl<W: WriteColor> SearchWorker<W> {
    }

    /// Return a mutable reference to the underlying printer.
-    pub fn printer(&mut self) -> &mut Printer<W> {
+    pub(crate) fn printer(&mut self) -> &mut Printer<W> {
        &mut self.printer
    }

-    /// Print the given statistics to the underlying writer in a way that is
-    /// consistent with this searcher's printer's format.
-    ///
-    /// While `Stats` contains a duration itself, this only corresponds to the
-    /// time spent searching, where as `total_duration` should roughly
-    /// approximate the lifespan of the ripgrep process itself.
-    pub fn print_stats(
-        &mut self,
-        total_duration: Duration,
-        stats: &Stats,
-    ) -> io::Result<()> {
-        if self.config.json_stats {
-            self.printer().print_stats_json(total_duration, stats)
-        } else {
-            self.printer().print_stats(total_duration, stats)
-        }
-    }
-
    /// Returns true if and only if the given file path should be
    /// decompressed before searching.
    fn should_decompress(&self, path: &Path) -> bool {
@@ -399,10 +297,11 @@ impl<W: WriteColor> SearchWorker<W> {
        &mut self,
        path: &Path,
    ) -> io::Result<SearchResult> {
+        use std::{fs::File, process::Stdio};
+
        let bin = self.config.preprocessor.as_ref().unwrap();
        let mut cmd = std::process::Command::new(bin);
-        cmd.arg(path)
-            .stdin(std::process::Stdio::from(std::fs::File::open(path)?));
+        cmd.arg(path).stdin(Stdio::from(File::open(path)?));

        let mut rdr = self.command_builder.build(&mut cmd).map_err(|err| {
            io::Error::new(
@@ -478,7 +377,7 @@ impl<W: WriteColor> SearchWorker<W> {
 /// searcher and printer.
 fn search_path<M: Matcher, W: WriteColor>(
    matcher: M,
-    searcher: &mut Searcher,
+    searcher: &mut grep::searcher::Searcher,
    printer: &mut Printer<W>,
    path: &Path,
 ) -> io::Result<SearchResult> {
@@ -514,7 +413,7 @@ fn search_path<M: Matcher, W: WriteColor>(
 /// and printer.
 fn search_reader<M: Matcher, R: io::Read, W: WriteColor>(
    matcher: M,
-    searcher: &mut Searcher,
+    searcher: &mut grep::searcher::Searcher,
    printer: &mut Printer<W>,
    path: &Path,
    mut rdr: R,
@@ -546,8 +445,3 @@ fn search_reader<M: Matcher, R: io::Read, W: WriteColor>(
        }
    }
 }
-
-/// Return the given duration as fractional seconds.
-fn fractional_seconds(duration: Duration) -> f64 {
-    (duration.as_secs() as f64) + (duration.subsec_nanos() as f64 * 1e-9)
-}