ripgrep

mirror of https://github.com/BurntSushi/ripgrep.git synced 2025-07-31 20:21:59 -07:00

Author	SHA1	Message	Date
Andrew Gallant	254b8b67bb	globset: small perf improvements This tweaks the path handling functions slightly to make them a hair faster. In particular, `file_name` is called on every path that ripgrep visits, and it was possible to remove a few branches without changing behavior.	2019-04-05 23:24:08 -04:00
Andrew Gallant	8a7f43b84d	globset: use bstr This simplifies the various path related functions and pushes more platform-dependent code down into bstr. This likely also makes things a bit more efficient on Windows, since we now only do a single UTF-8 check for each file path.	2019-04-05 23:24:08 -04:00
Andrew Gallant	96ee4482cd	globset: remove use of unsafe This commit removes, in retrospect, a silly use of `unsafe`. In particular, to extract a file name extension (distinct from how `std` implements it), we were transmuting an OsStr to its underlying WTF-8 byte representation and then searching that. This required `unsafe` and relied on an undocumented std API, so it was a bad choice to make, but everything gets sacrificed at the Altar of Performance. The thing I didn't seem to realize at the time was that: 1. On Unix, you can already get the raw byte representation in a manner that has zero cost. 2. On Windows, paths are already being encoded and copied every which way. So doing a UTF-8 check and, in rare cases (for invalid UTF-8), an extra copy, doesn't seem like that much more of an added expense. Thus, rewrite the extension extraction using safe APIs. On Unix, this should have identical performance characteristics as the previous implementation. On Windows, we do pay a higher cost in the UTF-8 check, but Windows is already paying a similar cost a few times over anyway.	2018-02-10 22:28:12 -05:00
Andrew Gallant	d79add341b	Move all gitignore matching to separate crate. This PR introduces a new sub-crate, `ignore`, which primarily provides a fast recursive directory iterator that respects ignore files like gitignore and other configurable filtering rules based on globs or even file types. This results in a substantial source of complexity moved out of ripgrep's core and into a reusable component that others can now (hopefully) benefit from. While much of the ignore code carried over from ripgrep's core, a substantial portion of it was rewritten with the following goals in mind: 1. Reuse matchers built from gitignore files across directory iteration. 2. Design the matcher data structure to be amenable for parallelizing directory iteration. (Indeed, writing the parallel iterator is the next step.) Fixes #9, #44, #45	2016-10-29 20:48:59 -04:00
Andrew Gallant	e96d93034a	Finish overhaul of glob matching. This commit completes the initial move of glob matching to an external crate, including fixing up cross platform support, polishing the external crate for others to use and fixing a number of bugs in the process. Fixes #87, #127, #131	2016-10-10 19:24:18 -04:00
Andrew Gallant	175406df01	Refactor and test glob sets. This commit goes a long way toward refactoring glob sets so that the code is easier to maintain going forward. In particular, it makes the literal optimizations that glob sets used a lot more structured and much easier to extend. Tests have also been modified to include glob sets. There's still a bit of polish work left to do before a release. This also fixes the immediate issue where large gitignore files were causing ripgrep to slow way down. While we don't technically fix it for good, we're a lot better about reducing the number of regexes we compile. In particular, if a gitignore file contains thousands of patterns that can't be matched more simply using literals, then ripgrep will slow down again. We could fix this for good by avoiding RegexSet if the number of regexes grows too large. Fixes #134.	2016-10-04 20:28:56 -04:00
Andrew Gallant	fdf24317ac	Move glob implementation to new crate. It is isolated and complex enough that it deserves attention all on its own. It's also eminently reusable.	2016-09-30 19:42:41 -04:00

7 Commits
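
Note on 96ee4482cd: the safe extension extraction described in that commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the commit. The names `os_str_bytes` and `file_name_ext` are hypothetical, and the extension semantics (everything from the last `.` to the end of the file name, dot included) are an assumption made only for this example. The Unix branch borrows the raw bytes at zero cost. The fallback branch does a UTF-8 check and copies only when the name is not valid UTF-8.

```rust
use std::borrow::Cow;
use std::ffi::OsStr;

/// Hypothetical helper: view an `OsStr` as bytes without `unsafe`.
/// On Unix this is a zero-cost borrow of the underlying bytes.
#[cfg(unix)]
fn os_str_bytes(s: &OsStr) -> Cow<'_, [u8]> {
    use std::os::unix::ffi::OsStrExt;
    Cow::Borrowed(s.as_bytes())
}

/// Hypothetical helper for other platforms: `to_string_lossy` performs a
/// UTF-8 check and allocates only if the name is not valid UTF-8.
#[cfg(not(unix))]
fn os_str_bytes(s: &OsStr) -> Cow<'_, [u8]> {
    match s.to_string_lossy() {
        Cow::Borrowed(valid) => Cow::Borrowed(valid.as_bytes()),
        Cow::Owned(lossy) => Cow::Owned(lossy.into_bytes()),
    }
}

/// Assumed semantics for this sketch: the extension runs from the last `.`
/// in the file name to the end, including the dot itself.
/// Returns `None` when the name contains no dot.
fn file_name_ext(name: &[u8]) -> Option<&[u8]> {
    let last_dot = name.iter().rposition(|&b| b == b'.')?;
    Some(&name[last_dot..])
}
```

A caller would take `Path::file_name()`, pass the resulting `OsStr` through `os_str_bytes`, and then call `file_name_ext` on the bytes. Working on bytes rather than `str` lets the Unix path avoid any encoding check.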