searcher: do UTF-8 BOM sniffing like UTF-16

Previously, we only looked for a UTF-16 BOM when deciding whether to
transcode. We should look for a UTF-8 BOM as well.
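
As a rough illustration of what BOM sniffing means here, the sketch below
checks the leading bytes of a buffer for a UTF-8 or UTF-16 byte-order mark
and strips it. It is not the code from this commit: the `Bom` type and the
`sniff_bom` and `strip_bom` functions are hypothetical names chosen for the
example, and it assumes the whole input is available as a byte slice.

```rust
/// Byte-order marks recognized by this sketch (hypothetical type, not ripgrep's).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum Bom {
    Utf8,
    Utf16Le,
    Utf16Be,
}

impl Bom {
    /// Length of the mark in bytes.
    fn byte_len(self) -> usize {
        match self {
            Bom::Utf8 => 3,
            Bom::Utf16Le | Bom::Utf16Be => 2,
        }
    }
}

/// Inspect the leading bytes and report which BOM, if any, is present.
fn sniff_bom(bytes: &[u8]) -> Option<Bom> {
    if bytes.starts_with(b"\xef\xbb\xbf") {
        Some(Bom::Utf8)
    } else if bytes.starts_with(b"\xff\xfe") {
        Some(Bom::Utf16Le)
    } else if bytes.starts_with(b"\xfe\xff") {
        Some(Bom::Utf16Be)
    } else {
        None
    }
}

/// Return the buffer with any recognized BOM removed.
fn strip_bom(bytes: &[u8]) -> &[u8] {
    match sniff_bom(bytes) {
        Some(bom) => &bytes[bom.byte_len()..],
        None => bytes,
    }
}

fn main() {
    // Same bytes as the r1638 regression test: a UTF-8 BOM followed by "x".
    let haystack: &[u8] = b"\xef\xbb\xbfx";
    assert_eq!(sniff_bom(haystack), Some(Bom::Utf8));

    // Compute a 1-based column for "x" after the BOM has been stripped.
    let column = strip_bom(haystack)
        .iter()
        .position(|&b| b == b'x')
        .map(|i| i + 1);
    println!("{:?}", column);
}
```

If the three BOM bytes were left in place, a byte-offset column for "x"
would be shifted by the BOM's length, which is the kind of miscount the
r1638 test guards against.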

Fixes #1638, Closes #1697
Alessandro Menezes
2020-10-02 16:17:39 -04:00
committed by Andrew Gallant
parent 53c4855517
commit 2295061e80
3 changed files with 34 additions and 4 deletions


@@ -867,6 +867,15 @@ use B;
    eqnice!("2\n", cmd.stdout());
});
// See: https://github.com/BurntSushi/ripgrep/issues/1638
//
// Tests that when a UTF-8 BOM is sniffed, the column index is correct.
rgtest!(r1638, |dir: Dir, mut cmd: TestCommand| {
    dir.create_bytes("foo", b"\xef\xbb\xbfx");
    eqnice!("foo:1:1:x\n", cmd.arg("--column").arg("x").stdout());
});
// See: https://github.com/BurntSushi/ripgrep/issues/1765
rgtest!(r1765, |dir: Dir, mut cmd: TestCommand| {
    dir.create("test", "\n");