mirror of
https://github.com/BurntSushi/ripgrep.git
synced 2025-08-01 04:32:01 -07:00
This commit removes what was, in retrospect, a silly use of `unsafe`. In particular, to extract a file name extension (distinct from how `std` implements it), we were transmuting an `OsStr` to its underlying WTF-8 byte representation and then searching that. This required `unsafe` and relied on an undocumented std API, so it was a bad choice to make, but everything gets sacrificed at the Altar of Performance.

The thing I didn't seem to realize at the time was that:

1. On Unix, you can already get the raw byte representation in a manner that has zero cost.
2. On Windows, paths are already being encoded and copied every which way. So doing a UTF-8 check and, in rare cases (for invalid UTF-8), an extra copy, doesn't seem like that much more of an added expense.

Thus, rewrite the extension extraction using safe APIs. On Unix, this should have performance characteristics identical to the previous implementation. On Windows, we do pay a higher cost in the UTF-8 check, but Windows is already paying a similar cost a few times over anyway.
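The safe approach described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the helper names `os_str_bytes` and `file_name_ext` are hypothetical, and the extension semantics (everything from the final `.` onward, so `.bashrc` counts as its own extension, unlike `Path::extension`) are an assumption chosen only to show one way of differing from `std`.

```rust
use std::borrow::Cow;
use std::ffi::OsStr;
use std::path::Path;

/// Hypothetical helper: view an `OsStr` as bytes without `unsafe`.
/// On Unix this is a zero-cost borrow of the underlying bytes.
#[cfg(unix)]
fn os_str_bytes(s: &OsStr) -> Cow<'_, [u8]> {
    use std::os::unix::ffi::OsStrExt;
    Cow::Borrowed(s.as_bytes())
}

/// Hypothetical helper for non-Unix platforms (e.g. Windows).
/// `to_string_lossy` performs a UTF-8 check and only allocates a copy
/// when the name is not valid UTF-8.
#[cfg(not(unix))]
fn os_str_bytes(s: &OsStr) -> Cow<'_, [u8]> {
    match s.to_string_lossy() {
        Cow::Borrowed(valid) => Cow::Borrowed(valid.as_bytes()),
        Cow::Owned(lossy) => Cow::Owned(lossy.into_bytes()),
    }
}

/// Hypothetical helper: return the bytes of the file name starting at
/// its final `.` (assumed semantics, see the note above).
/// Returns `None` when there is no file name or no `.` in it.
fn file_name_ext(path: &Path) -> Option<Vec<u8>> {
    let name = os_str_bytes(path.file_name()?);
    let last_dot = name.iter().rposition(|&b| b == b'.')?;
    Some(name[last_dot..].to_vec())
}
```

Under these assumptions, `file_name_ext(Path::new("src/main.rs"))` yields the bytes of `.rs`, and a name without a dot such as `Makefile` yields `None`. Returning an owned `Vec<u8>` keeps the sketch short; an implementation that cares about allocations would more likely return a borrowed slice or a `Cow`.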