Re: Unused Unicode planes

From: verdy_p (verdy_p@wanadoo.fr)
Date: Wed Jan 14 2009 - 03:34:26 CST


    "Doug Ewell" <doug@ewellic.org> wrote:
    > Ruszlan Gaszanov <ruszlan at ather dot net> wrote:
    >
    > >> A hypothetical "Everycode" standard that encodes arbitrary bits of
    > >> data certainly should include Unicode characters as a subset
    > >
    > > I believe this format is more commonly known as "raw binary data" and
    > > has been rumored to be in widespread use ever since the invention of
    > > first electronic computers ;)
    >
    > "Raw binary data" isn't a structured standard. I was talking about a
    > general-purpose data format, just as Michael D'Errico was, except that
    > his suggestion was to extend UTF-8, whereas mine was to create an
    > all-new format with one of the data components being Unicode characters.
    >
    > As an example, PNG has an "iTXt" chunk that consists partly of Unicode
    > characters encoded in UTF-8. This is not at all the same as raw binary
    > data into which some UTF-8 characters have been asynchronously thrown
    > in.

    To my mind, the "hypothetical" Everycode standard (a fully structured format that can embed arbitrary Unicode
    characters or text as a subset, and arbitrary binary data as well) already exists, and everyone on this list must
    already know it. In TUS, it is described as an "upper layer protocol", and its name is... "X M L".
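
To make the point concrete, here is a minimal sketch of one XML document carrying both Unicode text and arbitrary bytes. The element names ("document", "text", "blob") and the helper functions are invented for illustration only. The sketch assumes the binary part is base64-encoded, since XML 1.0 forbids most control characters (NUL among them) and so cannot hold raw bytes directly.

```python
import base64
import xml.etree.ElementTree as ET


def build_document(text, payload):
    """Serialize Unicode text plus arbitrary bytes into one XML document."""
    root = ET.Element("document")  # hypothetical element names throughout
    ET.SubElement(root, "text").text = text
    blob = ET.SubElement(root, "blob", encoding="base64")
    # Raw bytes are not legal XML character data, so encode them as base64.
    blob.text = base64.b64encode(payload).decode("ascii")
    return ET.tostring(root, encoding="utf-8")


def read_document(data):
    """Parse the document back into (text, payload)."""
    root = ET.fromstring(data)
    text = root.findtext("text")
    payload = base64.b64decode(root.findtext("blob"))
    return text, payload


if __name__ == "__main__":
    doc = build_document("snowman \u2603 clef \U0001D11E", bytes(range(256)))
    text, payload = read_document(doc)
    print(text == "snowman \u2603 clef \U0001D11E", payload == bytes(range(256)))
```

The text survives as ordinary character data, while the bytes survive only through the base64 layer, so the "arbitrary binary data" part is a convention built on top of XML rather than something the format stores natively.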


