Re: Unicode Emoji 5.0 characters now final from Mark Davis ☕️ on 2017-03-28 (Unicode Mail List Archive)

From: Mark Davis ☕️ <mark_at_macchiato.com>
Date: Tue, 28 Mar 2017 12:49:39 +0200

Thanks. Probably best as:

unicode_locale_id = unicode_language_id
( transformed_extensions unicode_locale_extensions?
| unicode_locale_extensions transformed_extensions? )?
;

Even clearer would be two steps:

unicode_locale_id = unicode_language_id extensions? ;

extensions = transformed_extensions unicode_locale_extensions?
| unicode_locale_extensions transformed_extensions? ;
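
As a rough sanity check of the two-step form, here is a minimal Python sketch that enumerates which orderings of the extension blocks it accepts. The one-letter stand-ins (L, T, U) and the helper name `matches_two_step_grammar` are made up for illustration and are not part of TR35. The sketch checks only the ordering of the blocks, not the internal syntax of each one.

```python
import re

# Hypothetical one-letter stand-ins for the nonterminals (illustration only):
#   L = unicode_language_id
#   T = transformed_extensions
#   U = unicode_locale_extensions
EXTENSIONS = r"(?:TU?|UT?)"                  # extensions = T U? | U T? ;
LOCALE_ID = re.compile(rf"L{EXTENSIONS}?")   # unicode_locale_id = L extensions? ;


def matches_two_step_grammar(shape: str) -> bool:
    """Return True if the sequence of stand-in letters fits the two-step grammar."""
    return LOCALE_ID.fullmatch(shape) is not None


candidates = ["L", "LT", "LU", "LTU", "LUT", "LTT", "LUTU", "T", ""]
accepted = [s for s in candidates if matches_two_step_grammar(s)]
print(accepted)  # ['L', 'LT', 'LU', 'LTU', 'LUT']
```

Each accepted shape is derived in exactly one way, because an extensions block must start with either T or U and neither alternative can be empty.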

Could you file a CLDR ticket on this?

Mark

On Tue, Mar 28, 2017 at 12:36 PM, Philippe Verdy <verdy_p_at_wanadoo.fr> wrote:

> I note this in TR35
> *3.2 Unicode Locale Identifier
> <http://unicode.org/reports/tr35/index.html#Unicode_locale_identifier>*
>
> Rule: unicode_locale_id
> <http://unicode.org/reports/tr35/index.html#unicode_locale_id>
>
> EBNF:
>   = unicode_language_id
>     (transformed_extensions unicode_locale_extensions?
>     | unicode_locale_extensions? transformed_extensions?) ;
>
> ABNF:
>   = unicode_language_id
>     ([trasformed_extensions [unicode_locale_extensions]]
>     / [unicode_locale_extensions [transformed_extensions]])
>
> * first there's a typo in the ABNF syntax ("trasformed")
> * the syntax is not strictly equivalent, or the ABNF is unnecessarily not
> context-free
>
> It would be better as:
>
> Rule: unicode_locale_id
> <http://unicode.org/reports/tr35/index.html#unicode_locale_id>
>
> EBNF:
>   = unicode_language_id
>     (transformed_extensions unicode_locale_extensions?
>     | unicode_locale_extensions transformed_extensions?)? ;
>
> ABNF:
>   = unicode_language_id
>     [transformed_extensions [unicode_locale_extensions]
>     / unicode_locale_extensions [transformed_extensions]]
>
>
>
> 2017-03-28 11:56 GMT+02:00 Joan Montané <joan_at_montane.cat>:
>
>>
>>
>> 2017-03-28 7:57 GMT+02:00 Mark Davis ☕️ <mark_at_macchiato.com>:
>>
>>> To add to what Ken and Markus said: like many other identifiers, there
>>> are a number of different categories.
>>>
>>> 1. *Ill-formed:* "$1"
>>> 2. *Well-formed, but not valid:* "usx". Is *syntactic* according to
>>> <http://unicode.org/reports/tr51/proposed.html#def_emoji_tag_sequence>,
>>> but is not *valid* according to
>>> <http://unicode.org/reports/tr51/proposed.html#valid-emoji-tag-sequences>.
>>> 3. *Valid, but not recommended:* "usca". Corresponds to the valid
>>> Unicode subdivision code for California according to
>>> <http://unicode.org/reports/tr51/proposed.html#valid-emoji-tag-sequences>
>>> and CLDR, but is not listed in http://unicode.org/Public/emoji/5.0/.
>>> 4. *Recommended:* "gbsct". Corresponds to the valid Unicode
>>> subdivision code for Scotland, and *is* listed in
>>> <http://unicode.org/Public/emoji/5.0/>.
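
The four categories quoted above can be read as a chain of progressively stricter checks. The Python sketch below illustrates that reading under stated assumptions. The well-formedness pattern is only an approximation of the subdivision code shape (two letters or three digits, followed by one to four alphanumerics). The two sets are tiny hypothetical samples standing in for the CLDR subdivision data and the emoji 5.0 data files. The function name `classify` is made up for this example.

```python
import re

# ASSUMPTION: approximate shape of a subdivision code, used here as the
# well-formedness test: region (2 letters or 3 digits) + 1-4 alphanumerics.
WELL_FORMED = re.compile(r"(?:[a-z]{2}|[0-9]{3})[a-z0-9]{1,4}")

# ASSUMPTION: tiny hypothetical samples; the real lists live in CLDR and in
# the emoji 5.0 data files.
VALID_SUBDIVISIONS = {"usca", "gbsct"}
RECOMMENDED = {"gbsct"}


def classify(tag_spec: str) -> str:
    """Place a tag spec in the first category whose check it fails to pass."""
    if WELL_FORMED.fullmatch(tag_spec) is None:
        return "ill-formed"
    if tag_spec not in VALID_SUBDIVISIONS:
        return "well-formed, but not valid"
    if tag_spec not in RECOMMENDED:
        return "valid, but not recommended"
    return "recommended"


for spec in ["$1", "usx", "usca", "gbsct"]:
    print(spec, "->", classify(spec))
```

Ordering the checks from syntax to data lookup means each category is a strict subset of the previous one, which matches how the list is laid out.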
>>>
>>> As Ken says, the terminology is a little bit in flux for the term
>>> 'recommended'. TR51 is still open for comment, although we won't make any
>>> changes that would invalidate http://unicode.org/Public/emoji/5.0/.
>>>
>>
>> Just two remarks
>>
>> 1st one: point 4 (Unicode subdivision codes listed on the Unicode emoji
>> site) gives rise to something of a chicken-and-egg problem. Vendors don't
>> readily add new subdivision flags (because they aren't recommended), and
>> Unicode doesn't recommend new subdivision flags (because they aren't
>> supported by vendors).
>>
>> 2nd one: What about "Adopt a Character" (AKA "Adopt an emoji")? Will
>> valid, but not recommended, Unicode subdivision codes be eligible? For
>> instance, could someone adopt the California, Texas, Pomerania, or
>> Catalonia flags?
>>
>>
>> Regards,
>> Joan Montané
>>
>>
>
Received on Tue Mar 28 2017 - 05:50:31 CDT
