doc: clarify automatic encoding detection

Fixes #1103
Andrew Gallant 2019-01-26 13:55:17 -05:00
parent afb89bcdad
commit 6d5dba85bd
3 changed files with 11 additions and 3 deletions

@@ -27,6 +27,8 @@ Bug fixes:
 `**` is now accepted as valid syntax anywhere in a glob.
 * [BUG #1095](https://github.com/BurntSushi/ripgrep/issues/1095):
 Fix corner cases involving the `--crlf` flag.
+* [BUG #1103](https://github.com/BurntSushi/ripgrep/issues/1103):
+Clarify what `--encoding auto` does.
 * [BUG #1106](https://github.com/BurntSushi/ripgrep/issues/1106):
 `--files-with-matches` and `--files-without-match` work with one file.
 * [BUG #1093](https://github.com/BurntSushi/ripgrep/pull/1093):

@@ -609,7 +609,8 @@ topic, but we can try to summarize its relevancy to ripgrep:
 the most popular encodings likely consist of ASCII, latin1 or UTF-8. As
 a special exception, UTF-16 is prevalent in Windows environments
-In light of the above, here is how ripgrep behaves:
+In light of the above, here is how ripgrep behaves when `--encoding auto` is
+given, which is the default:
 * All input is assumed to be ASCII compatible (which means every byte that
 corresponds to an ASCII codepoint actually is an ASCII codepoint). This

@@ -982,10 +982,15 @@ fn flag_encoding(args: &mut Vec<RGArg>) {
 const LONG: &str = long!("\
 Specify the text encoding that ripgrep will use on all files searched. The
 default value is 'auto', which will cause ripgrep to do a best effort automatic
-detection of encoding on a per-file basis. Other supported values can be found
-in the list of labels here:
+detection of encoding on a per-file basis. Automatic detection in this case
+only applies to files that begin with a UTF-8 or UTF-16 byte-order mark (BOM).
+No other automatic detection is performed.
+Other supported values can be found in the list of labels here:
 https://encoding.spec.whatwg.org/#concept-encoding-get
+For more details on encoding and how ripgrep deals with it, see GUIDE.md.
 This flag can be disabled with --no-encoding.
 ");
 let arg = RGArg::flag("encoding", "ENCODING").short("E")
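
The new help text says that automatic detection only applies to files that begin with a UTF-8 or UTF-16 byte-order mark. As a rough illustration of what BOM-only detection means, here is a minimal sketch. It is not ripgrep's actual implementation: `BomEncoding` and `sniff_bom` are hypothetical names introduced for this example, and the byte sequences are the standard Unicode BOMs.

```rust
/// Encodings that a leading byte-order mark can identify in this sketch.
/// (Hypothetical type, not part of ripgrep.)
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum BomEncoding {
    Utf8,
    Utf16Le,
    Utf16Be,
}

/// Returns the encoding implied by a leading BOM together with the BOM's
/// length in bytes, or `None` when the input does not start with a
/// recognized BOM. (Hypothetical helper, not part of ripgrep.)
fn sniff_bom(bytes: &[u8]) -> Option<(BomEncoding, usize)> {
    if bytes.starts_with(&[0xEF, 0xBB, 0xBF]) {
        // UTF-8 BOM: EF BB BF
        Some((BomEncoding::Utf8, 3))
    } else if bytes.starts_with(&[0xFF, 0xFE]) {
        // UTF-16 little-endian BOM: FF FE
        Some((BomEncoding::Utf16Le, 2))
    } else if bytes.starts_with(&[0xFE, 0xFF]) {
        // UTF-16 big-endian BOM: FE FF
        Some((BomEncoding::Utf16Be, 2))
    } else {
        // No BOM: nothing is detected. The content itself is never inspected.
        None
    }
}
```

Under this sketch, a UTF-16 file without a BOM yields `None`, which matches the point of the clarification: such a file is not detected automatically and would need an explicit `--encoding` value.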