Regular expressions for non-English languages

From: Alain LaBont/e'/ (
Date: Thu Feb 06 1997 - 14:01:41 EST

>Return-Path: <alexis@VNET.IBM.COM>
>Date: 6 Feb 1997 13:52:17 EST
>From: <alexis@VNET.IBM.COM>
>To: <>
>Subject: Regular expressions for non-English languages
>To: UMAVS --TOROLAB6 Uma Umamaheswaran
> LaBont<e'>, Alain <>
>cc: CHUPA --TOROLAB6 Chupa, Ken
>From: C.Y.Alexis Cheng, National Language Technical Centre
>IBM Mail: CAIB2499 at IBMMAIL Internet:
>( )IBM Confidential (x)Unclassified
>Subject: Regular expressions for non-English languages
>Hi there,
>
>Someone asked about regular expressions for non-English languages.
>Please forward this note to whoever was asking that question.
>
>Version 2 of the X/Open "Internationalisation Guide" describes
>an enhancement to the regular expression syntax that lets an
>expression support a particular language based on the active locale.
>
>For example, instead of using "(a-z)" to denote all lower case
>letters, where ( and ) are really square brackets,
>one would specify "((:lower:))", which is locale
>sensitive. Another example is "((=a=))", which matches
>every character equivalent to the lower case letter 'a', including
>a, a acute, a grave, etc. "((.ch.)-e)" matches all characters from
>'ch' through 'e' in collation order, in a locale where 'ch' is a
>collating element.
>
>Commands affected by this enhancement include
>awk, ed, egrep, expr, grep, pg, and sed.
>
>For existing regular expressions, the Guide suggests that shell
>scripts imported from non-internationalized environments be executed
>in the default POSIX (or C) locale.
>
>Cheers, Alexis
>IBM Canada Laboratory, 3R/979/1150/TOR, Phone: (416)448-3670
>1150 Eglinton Avenue East, North York, Fax: (416)448-4414
>Ontario, Canada. M3C 1H7 Tieline: 778-
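
The bracket expressions described in the note can be tried with any POSIX grep. What follows is a minimal sketch, not taken from the Guide: the function names only_lower and legacy_range are hypothetical, and the expressions are written with real square brackets rather than ( and ). Both calls run in the C locale so that the result does not depend on which locales are installed. With a locale such as fr_FR.UTF-8 (assuming it is available on the system), [[:lower:]] would be expected to accept accented lower case letters as well.

```shell
#!/bin/sh

# Hypothetical helper: print the input lines made up only of lower case
# letters, as classified by the locale named in $1.
only_lower() {
    LC_ALL="$1" grep -E '^[[:lower:]]+$'
}

# Hypothetical helper: run a legacy range expression in the POSIX (C)
# locale, so that [a-z] keeps its traditional meaning, as the Guide
# suggests for scripts from non-internationalized environments.
legacy_range() {
    LC_ALL=C grep -E '^[a-z]+$'
}

# In the C locale both forms select only the first line.
printf 'abc\nABC\nab1\n' | only_lower C
printf 'abc\nABC\nab1\n' | legacy_range
```

Passing the locale name as an argument keeps the setting local to the one grep call; the calling shell's own locale is left untouched.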

This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:33 EDT