Re: Emoji data in UCD xml ? from Mark Davis ☕️ on 2015-10-29 (Unicode Mail List Archive)

From: Mark Davis ☕️ <mark_at_macchiato.com>
Date: Thu, 29 Oct 2015 11:20:50 -0700

As Ken said, there's been some preliminary discussion, but we wanted to get
initial information out in connection with UTR #51 first, and take more
time to consider what UCD properties would look like, and which are
necessary.

The basic information that people want to access for implementations is:

   - Is a character emoji or not?
   - Which emoji have default text presentation? (others having emoji
   presentation)
   - Which emoji are modifiers, and which are modifier bases? (others being
   neither)
   - Which sequences of emoji are recommended (zwj and/or combining marks)
   for those who support them?
   - flags and modifier sequences are specified algorithmically, and don't
      need to be listed.
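
As a rough illustration of the questions above, here is a minimal Python sketch. It assumes a hypothetical data file laid out as "code point(s) ; property # comment"; the property names (Emoji, Emoji_Presentation, Emoji_Modifier, Emoji_Modifier_Base) and the sample rows are illustrative assumptions, not something this message defines. The flag and modifier checks only test whether a sequence is well formed, not whether any platform supports it.

```python
# Hypothetical sample data; the layout and property names are assumptions.
SAMPLE = """\
0023          ; Emoji                # NUMBER SIGN
00A9          ; Emoji                # COPYRIGHT SIGN
1F600         ; Emoji                # GRINNING FACE
1F600         ; Emoji_Presentation   # GRINNING FACE
261D          ; Emoji                # WHITE UP POINTING INDEX
261D          ; Emoji_Modifier_Base  # WHITE UP POINTING INDEX
1F3FB..1F3FF  ; Emoji                # skin tone modifiers
1F3FB..1F3FF  ; Emoji_Presentation   # skin tone modifiers
1F3FB..1F3FF  ; Emoji_Modifier       # skin tone modifiers
1F1E6..1F1FF  ; Emoji                # regional indicators
1F1E6..1F1FF  ; Emoji_Presentation   # regional indicators
"""


def parse(text):
    """Map each property name to the set of code points that carry it."""
    props = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comment
        if not line:
            continue
        rng, name = (field.strip() for field in line.split(";"))
        first, _, last = rng.partition("..")
        lo = int(first, 16)
        hi = int(last, 16) if last else lo
        props.setdefault(name, set()).update(range(lo, hi + 1))
    return props


PROPS = parse(SAMPLE)


def has(name, ch):
    return ord(ch) in PROPS.get(name, set())


def is_emoji(ch):
    return has("Emoji", ch)


def default_presentation(ch):
    """Return "emoji" or "text" for emoji characters, None otherwise."""
    if not is_emoji(ch):
        return None
    return "emoji" if has("Emoji_Presentation", ch) else "text"


def is_flag_sequence(s):
    """Algorithmic check: a pair of regional indicator symbols."""
    return len(s) == 2 and all(0x1F1E6 <= ord(c) <= 0x1F1FF for c in s)


def is_modifier_sequence(s):
    """Algorithmic check: a modifier base followed by a modifier."""
    return (len(s) == 2
            and has("Emoji_Modifier_Base", s[0])
            and has("Emoji_Modifier", s[1]))
```

With lookups like these, the only data that would still need an explicit list are the recommended ZWJ and combining-mark sequences.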

The levels, the distinction between primary and secondary, and the carrier
sources were useful in developing the emoji data and UTR #51, but aren't
really necessary for implementations.

Mark

On Thu, Oct 29, 2015 at 9:14 AM, Ken Whistler <kenwhistler_at_att.net> wrote:

> There has been some preliminary discussion of this. The problem is that
> the data in emoji-data.txt has not yet been formally rationalized into a
> coherent set of Unicode character properties. The UTC would first need to
> determine exactly what property (or list of properties) is involved, before
> incorporating it (or them) formally into the Unicode Character Database
> (UCD) and into the XML version of the UCD, and the documentation of it (or
> them) formally into UAX #44.
>
> --Ken
>
>
> On 10/26/2015 10:39 AM, Daniel Bünzli wrote:
>
>> If I read correctly UTR #51, the way of determining if a scalar value is
>> an emoji character is to consult this data file [1]. Are there any plans to
>> integrate this data in the UCD xml ?
>>
>>
>>
>
Received on Thu Oct 29 2015 - 13:22:35 CDT
