Flag emoji

From: Mark Davis ☕ <mark_at_macchiato.com>
Date: Thu, 31 May 2012 16:18:01 -0700

The UTC considered that as one of the possible approaches to the problem. While
easier in terms of line breaking, there'd still be a requirement to change
grapheme cluster boundaries and word boundaries to join sequences
like 🇦🇦, and people felt the approach didn't work well with encoding
conversion. About conversion, I think the discussion was something like the
following:

It is relatively simple to have a mapping like:

<sjis bytes> ↔ 🇦[joiner]🇦

If we used ZWSP, then we'd have:

<sjis bytes> ← 🇦🇦 // but the code wouldn't know when to also absorb
adjacent ZWSPs.

<sjis bytes> → 🇦🇦 // but the code would need context to know when to add
adjacent ZWSPs.

Both of those would be complicated for encoding converters to handle.
People also felt that 🇦[joiner]🇦 would be more consistent with treating
the sequence as a unit, both conceptually and in fonts.
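The difference between the two mappings can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the legacy codes are placeholders (not real Shift-JIS byte values), U+200D ZWJ stands in for "[joiner]", and all function names are hypothetical. With a joiner, each legacy code maps to a fixed three-code-point unit in both directions; with ZWSP, the decoder has to inspect what it has already emitted before deciding whether to add a separator.

```python
ZWJ = "\u200D"   # ZERO WIDTH JOINER, standing in for "[joiner]" (assumption)
ZWSP = "\u200B"  # ZERO WIDTH SPACE
RI_FIRST, RI_LAST = 0x1F1E6, 0x1F1FF  # REGIONAL INDICATOR SYMBOL LETTER A..Z


def ri(letter):
    """Return the regional indicator symbol for an ASCII letter A-Z."""
    return chr(RI_FIRST + ord(letter) - ord("A"))


def is_ri(ch):
    """True if ch is a regional indicator symbol."""
    return RI_FIRST <= ord(ch) <= RI_LAST


# Hypothetical legacy codes: placeholders only, NOT real Shift-JIS bytes.
LEGACY_TO_REGION = {b"<flag-1>": "US", b"<flag-2>": "ES"}


def joined_unit(legacy):
    """Joiner approach: a fixed, context-free unit per legacy code."""
    a, b = LEGACY_TO_REGION[legacy]
    return ri(a) + ZWJ + ri(b)


UNIT_TO_LEGACY = {joined_unit(code): code for code in LEGACY_TO_REGION}


def decode_joined(codes):
    """Legacy -> Unicode: plain table lookup, no context needed."""
    return "".join(joined_unit(code) for code in codes)


def encode_joined(text):
    """Unicode -> legacy: assumes text is a run of 3-code-point units."""
    return [UNIT_TO_LEGACY[text[i:i + 3]] for i in range(0, len(text), 3)]


def decode_zwsp(codes):
    """ZWSP approach: whether to emit a separator depends on the output so far."""
    out = ""
    for code in codes:
        a, b = LEGACY_TO_REGION[code]
        if out and is_ri(out[-1]):
            out += ZWSP  # context-dependent: only needed next to another pair
        out += ri(a) + ri(b)
    return out
```

The joined form round-trips by table lookup alone, while `decode_zwsp` already needs to look at its own previous output, and a matching encoder would additionally have to decide which neighboring ZWSPs belong to the flag and which were in the text for other reasons.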

I personally favored the ZWSP, but was convinced during the discussion that
ZWJ was a better approach.

------------------------------
Mark <https://plus.google.com/114199149796022210033>
*— Il meglio è l’inimico del bene —*

On Thu, May 31, 2012 at 2:47 AM, Andrew West <andrewcwest_at_gmail.com> wrote:

> On 31 May 2012 00:24, Mark Davis ☕ <mark_at_macchiato.com> wrote:
> >
> > There is definitely a problem.
>
> Is it really such a problem? Why can't implementations simply use
> ZWSP to demarcate the 2-character units in a sequence of more than two
> regional indicator symbols (and maybe always emit 2-character codes
> wrapped between ZWSP on either side to be safe), so for example
> US<ZWSP>ES<ZWSP>GE would be parsed as the regional indicator symbols
> for USA, SPAIN and Georgia, whereas U<ZWSP>SE<ZWSP>SG<ZWSP>E would be
> parsed as the regional indicator symbols for U (invalid), Sweden,
> Singapore and E (invalid). Algorithms such as line-breaking would not
> break between two regional indicator symbols, but only at a ZWSP.
>
> And if implementations wanted to support two- and three-letter
> regional codes, they might parse
> <ZWSP>GB<ZWSP>CYM<ZWSP>ENG<ZWSP>NIR<ZWSP>SCO<ZWSP> as the codes for
> United Kingdom, Wales, England, Northern Ireland, and Scotland, and
> represent them visually with the appropriate flag icons.
>
> Andrew
>
>
>
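
For reference, the ZWSP-delimited parsing that Andrew describes in the quoted text could look roughly like the following Python sketch. The function names and the `allowed_lengths` parameter are hypothetical, and the sketch assumes the input contains only regional indicator symbols and ZWSP. It splits on ZWSP and marks a unit as valid only if it has an allowed length; it does not check that the letters form an assigned region code.

```python
ZWSP = "\u200B"     # ZERO WIDTH SPACE
RI_BASE = 0x1F1E6   # REGIONAL INDICATOR SYMBOL LETTER A


def to_ri(letters):
    """Map ASCII letters A-Z to regional indicator symbols."""
    return "".join(chr(RI_BASE + ord(c) - ord("A")) for c in letters)


def from_ri(symbols):
    """Map regional indicator symbols back to ASCII letters A-Z."""
    return "".join(chr(ord(c) - RI_BASE + ord("A")) for c in symbols)


def parse_units(text, allowed_lengths=(2,)):
    """Split on ZWSP; return (letters, is_valid) for each non-empty unit.

    Assumes text holds only regional indicator symbols and ZWSP.
    Validity here means only "has an allowed length".
    """
    units = []
    for chunk in text.split(ZWSP):
        if not chunk:
            continue  # ignore leading, trailing, or doubled ZWSP
        letters = from_ri(chunk)
        units.append((letters, len(letters) in allowed_lengths))
    return units


# The two examples from the quoted message:
good = ZWSP.join(to_ri(s) for s in ["US", "ES", "GE"])
bad = ZWSP.join(to_ri(s) for s in ["U", "SE", "SG", "E"])
print(parse_units(good))
print(parse_units(bad))
```

Passing `allowed_lengths=(2, 3)` would accept the two- and three-letter variant from the second quoted paragraph.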
Received on Thu May 31 2012 - 18:20:01 CDT