RE: Unicode and end users

From: Yves Arrouye ([email protected])
Date: Sat Feb 16 2002 - 23:09:37 EST

Previous message: Asmus Freytag: "Re: Smiles, faces, etc"
Maybe in reply to: Martin Kochanski: "Unicode and end users"
Next in thread: Lars Kristan: "RE: Unicode and end users"

> If "foo" is a US-ASCII string, "grep foo file" will work fine with any
> US-ASCII-superset charset for which non-ASCII characters do not use
> bytes < 0x80, including the hypothetical one I described, with no
> possibility of a false match. However "grep f�� file" will work only
> if the current shell charset (i.e. of argv[1]) matches the encoding of
> "file".

Not necessarily. It will work as long as the sequence of 3 bytes f�� is the
representation of the string you are looking for in the file, in that file's
encoding. grep does not validate anything, nor should it IMHO. If you want
to guarantee the encoding, use a converter like ICU's uconv(1) or iconv(1).
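To make the byte-level point concrete, here is a minimal Python sketch. The search term "f\u00e9\u00e9" is a stand-in I chose for illustration (3 bytes in Latin-1), and byte_grep is a hypothetical helper that only mimics a raw substring search; it is not how grep is implemented.

```python
# Stand-in search term (an assumption for illustration): "f" plus two "e-acute".
needle = "f\u00e9\u00e9"

# The same text stored in two different encodings.
text = "caf\u00e9 f\u00e9\u00e9 bar\n"
latin1_file = text.encode("latin-1")
utf8_file = text.encode("utf-8")


def byte_grep(pattern: bytes, data: bytes) -> bool:
    """Hypothetical helper: plain byte-substring search, no charset validation."""
    return pattern in data


# What the shell would hand over as argv[1] under each charset.
latin1_pattern = needle.encode("latin-1")  # 3 bytes: 66 e9 e9
utf8_pattern = needle.encode("utf-8")      # 5 bytes: 66 c3 a9 c3 a9

# A match depends only on whether the pattern bytes occur in the file bytes.
print(byte_grep(latin1_pattern, latin1_file))  # same encoding on both sides
print(byte_grep(latin1_pattern, utf8_file))    # encodings differ
print(byte_grep(utf8_pattern, utf8_file))      # same encoding on both sides

# Converting the file first (the job of uconv or iconv) makes the bytes line up.
converted = utf8_file.decode("utf-8").encode("latin-1")
print(byte_grep(latin1_pattern, converted))
```

What matters is the bytes that reach the search, not what the shell believes its charset to be: if the 3 bytes happen to be the file's representation of the string, the match succeeds.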

This archive was generated by hypermail 2.1.2 : Sat Feb 16 2002 - 22:43:10 EST