regex: migrate grep-regex to regex-automata

We just do a "basic" dumb migration. We don't try to improve anything here.
2025-08-04 06:02:00 -07:00 · 2023-06-15 15:05:07 -04:00
parent 1035f6b1ff
commit e028ea3792
12 changed files with 237 additions and 129 deletions
--- a/crates/regex/src/literal.rs
+++ b/crates/regex/src/literal.rs
@@ -1,5 +1,25 @@
 use regex_syntax::hir::Hir;

+// BREADCRUMBS:
+//
+// The way we deal with line terminators in the regex is clunky, but probably
+// the least bad option for now unfortunately.
+//
+// The `non_matching_bytes` routine currently hardcodes line terminators for
+// anchors. But it's not really clear it should even care about line terminators
+// anyway, since anchors aren't actually part of a match. If we fix that
+// though, that currently reveals a different bug elsewhere: '(?-m:^)' isn't
+// implemented correctly in multi-line search, because it defers to the fast
+// line-by-line strategy, which ends up being wrong. I think the way forward
+// there is to:
+//
+// 1) Add something to the grep-matcher interface that exposes a way to
+// query for \A and \z specifically. If they're in the pattern, then we can
+// decide how to handle them.
+//
+// 2) Perhaps provide a way to "translate \A/\z to ^/$" for cases when
+// multi-line search is not enabled.
+
 #[derive(Clone, Debug)]
 pub struct LiteralSets {}
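
The two ideas listed in the BREADCRUMBS comment can be sketched roughly as follows. This is only an illustration under stated assumptions: `Anchor`, `Node`, `has_text_anchor` and `text_anchors_to_line_anchors` are hypothetical names made up for this sketch. They are not part of grep-matcher, grep-regex or regex-syntax, and the real change would presumably operate on `regex_syntax::hir::Hir` instead of this toy tree.

```rust
// Toy stand-in for a parsed pattern. These types are hypothetical and are
// NOT regex-syntax's `Hir`; they exist only to illustrate the two ideas.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum Anchor {
    StartText, // \A
    EndText,   // \z
    StartLine, // ^ with multi-line mode enabled
    EndLine,   // $ with multi-line mode enabled
}

#[derive(Clone, Debug, PartialEq, Eq)]
pub enum Node {
    Literal(String),
    Anchor(Anchor),
    Concat(Vec<Node>),
    Alternation(Vec<Node>),
}

/// Idea (1): report whether \A or \z occurs anywhere in the pattern, so
/// that a caller can decide how to handle them.
pub fn has_text_anchor(node: &Node) -> bool {
    match node {
        Node::Literal(_) => false,
        Node::Anchor(a) => matches!(a, Anchor::StartText | Anchor::EndText),
        Node::Concat(subs) | Node::Alternation(subs) => {
            subs.iter().any(has_text_anchor)
        }
    }
}

/// Idea (2): rewrite \A/\z to ^/$. A caller would only apply this when
/// multi-line search is not enabled.
pub fn text_anchors_to_line_anchors(node: Node) -> Node {
    match node {
        Node::Anchor(Anchor::StartText) => Node::Anchor(Anchor::StartLine),
        Node::Anchor(Anchor::EndText) => Node::Anchor(Anchor::EndLine),
        Node::Concat(subs) => Node::Concat(
            subs.into_iter().map(text_anchors_to_line_anchors).collect(),
        ),
        Node::Alternation(subs) => Node::Alternation(
            subs.into_iter().map(text_anchors_to_line_anchors).collect(),
        ),
        // Literals and line anchors are left untouched.
        other => other,
    }
}
```

In this sketch the query and the rewrite are kept separate so that a searcher could first ask whether text anchors are present and only then choose a strategy. Whether that split matches what grep-matcher would eventually expose is an open question.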