Re: proposal for a "Standard-Exit" or "Namespace" character

From: Dennis Heuer (dh@triple-media.com)
Date: Tue Apr 14 2009 - 03:55:01 CDT


    On Tue, 14 Apr 2009 15:54:12 +1000
    "Dean Harding" <dean.harding@dload.com.au> wrote:

    > > escape sequences are messy, and one can't tell from a text file
    > > which convention is behind an escape sequence. hence, was it
    > > ISO-2022 or a different one?
    >
    > Here's an idea. Write up your proposal, and use one of the many available
    > characters in the PUA of Unicode (there's 130,000-odd to choose from). If
    > your proposed scheme gains traction, you can come back here and ask that
    > your characters be encoded in Unicode.

    at least that's a constructive comment. but the unicode site says
    that ideas are better discussed here first. if i can't gain
    traction here, is a separate proposal worth anything?

    what else shall i write? the character-set switch is just one character,
    already explained, and based on an existing idea you know of.
    the most basic formatting characters (bold, italic, etc.) are also
    well known; just take a look at the font dialog of your word processor.
    how to use them is known from existing file formats: one can either
    enclose a passage, which is clearer but needs more parsing, or only
    switch forward to the next formatting, like:
    <bold><italic>important<normal>.
    that's it. the symbols you can also take from your word processor. they
    are already quite standardized by the de facto standard MS W*rd. as you
    can see, there's prior art for it.
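
    a minimal sketch of the "switch forward" variant described above,
    assuming three hypothetical private-use code points (U+E000 for bold,
    U+E001 for italic, U+E002 for normal). none of these assignments exist
    in unicode; the names and the `parse_runs` helper are made up for
    illustration only.

```python
# Hypothetical private-use assignments; nothing here is encoded in Unicode.
BOLD = "\uE000"
ITALIC = "\uE001"
NORMAL = "\uE002"

# Characters that switch a style on; NORMAL switches everything off.
SWITCHES = {BOLD: "bold", ITALIC: "italic"}


def parse_runs(text):
    """Split text into (styles, run) pairs using switch-style format characters."""
    runs = []
    active = set()   # styles currently switched on
    buf = []         # characters of the run being collected

    def flush():
        # Close the current run, if any, with the styles active so far.
        if buf:
            runs.append((frozenset(active), "".join(buf)))
            buf.clear()

    for ch in text:
        if ch in SWITCHES:
            flush()
            active.add(SWITCHES[ch])
        elif ch == NORMAL:
            flush()
            active.clear()
        else:
            buf.append(ch)
    flush()
    return runs


if __name__ == "__main__":
    sample = "see " + BOLD + ITALIC + "important" + NORMAL + " text"
    for styles, run in parse_runs(sample):
        print(sorted(styles), repr(run))
```

    one pass over the text with a single set of active styles is all the
    parsing needed; the enclosing variant would additionally have to match
    opening and closing characters.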

    > In general, characters do not get encoded in Unicode unless they're already
    > in use. The SOCCER BALL symbol that you referenced before, if you look at
    > the proposal, already had a number of uses in the real world.
    >
    > > yes, i did. what the heck html has to do in plain text files. there is
    > > no generally agreed on text format except of plain text. today, we
    > > could talk about OpenDocument. but would you store your system config
    > > in odt? would you store your programming scripts in odt? would you
    > > store your emails in odt? ...
    >
    > Do you want bold and italics in your system configuration?

    don't you? would you rather have blocks like:

    ####################
    ### this is info ###
    ####################

    also, this again only singles out one case to pick at. you may not
    even want bold text in your email. but have you ever heard of
    html emails? some even want flash in emails. wait for silverlight to
    spread! meanwhile, others just struggle with old-school restrictions
    that enforce ugly formatting like the above, although the following
    would give a much more natural experience:

    <header_moderate># this is info<normal>

    just to be safe (you comment on everything): the names written in
    angle brackets in the above example are only placeholders. the user
    would instead press a key or key combination to insert the respective
    formatting character.

    btw., it seems that unicode smileys in emails are ok!? the unicode
    standard cares about a lot of nonsense (even klingon script was
    proposed), dreaming of being a cultural heritage archive (if not a
    museum) instead of a technical standard. yet the most commonly used
    formatting is supposed to be irrelevant??? again, what about all these
    spacing characters??? did any of you complain that they are not
    available on your keyboard????

    just keep your own private statistics: how often did you use 'U+271D
    LATIN CROSS', and how often did you type bold text this month?

    > > why? text-processing systems already have buttons for entering those
    > > codes. only, at the moment they enter meta-data into their proprietary
    > > text-file formats...
    >
    > My keyboard has no way for me to type the "ESC" character. Pressing the
    > "ESC" key does something totally different. My keyboard certainly doesn't
    > have a way for me to type totally new characters that haven't even been
    > encoded yet!

    are you referring to my paragraph above? i can't see the connection.
    did you mistake 'button' for 'key'? i talked about clickable
    buttons in the graphical interface. anyway, yes, a character set does
    not guarantee that every font will include all glyphs, and a
    keyboard does not guarantee that every key will behave as expected in
    all programs or on all desktops. what is your point? and
    did that hinder the development of unicode? how do you type
    your preferred newline character or your preferred quote characters?
    are these available on your keyboard??? yes, it takes more than
    just unicode. is this new to you? are you in shock that the
    unicode standard did not remove your hardware restrictions? am i
    writing to the unicode mailing list?????

    >
    > > read from above again and try to understand that ISO-2022 is not an
    > > example of what people want.
    >
    > What "people" are you talking about? It's not what YOU want, granted, but
    > why generalize that to include ALL people?

    ISO-2022 introduces a deep change to how text is stored on disk. this
    must be done in one go for all text-processing tools. otherwise, there
    is ambiguity because the tools used on the text (which might be
    a script or program output, for example) might misinterpret the escape
    sequences. if unicode provides just one new key, this is a harmless
    update. its use can be recognized easily and answered with a warning,
    for example (which still needs a small update to the program).
    otherwise, the replacement character signals the unknown character in
    the text. also, this approach does not introduce any ambiguity because
    there is no prior use of the code point.
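
    a minimal sketch of that fallback, assuming a hypothetical private-use
    code point U+E010 stands in for the proposed character: a tool that
    does not support it collects a warning and shows U+FFFD REPLACEMENT
    CHARACTER in its place. the name `check_text` and the warning wording
    are made up for illustration.

```python
# Hypothetical stand-in for the proposed character; not an encoded character.
NAMESPACE = "\uE010"
REPLACEMENT = "\uFFFD"  # U+FFFD REPLACEMENT CHARACTER


def check_text(text):
    """Return (display_text, warnings) for a tool that lacks namespace support."""
    warnings = []
    out = []
    for offset, ch in enumerate(text):
        if ch == NAMESPACE:
            # The code point has no prior meaning, so detection is unambiguous.
            warnings.append(
                f"offset {offset}: unsupported namespace character U+{ord(ch):04X}"
            )
            out.append(REPLACEMENT)
        else:
            out.append(ch)
    return "".join(out), warnings


if __name__ == "__main__":
    display, notes = check_text("ab" + NAMESPACE + "cd")
    print(display)
    for note in notes:
        print("warning:", note)
```

    the check is a comparison against a single code point, which is why it
    is a small update; an escape sequence, by contrast, is made of bytes
    that can also occur as ordinary text.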

    >
    > > not for graphical text editors, for email and other messaging systems,
    > > for wikis, blogs and ... ah ... so many ... also, before they switch to
    > > html they rather support new uni-codes.
    >
    > You mentioned above having toolbar buttons for entering these new
    > characters which affect bold, italics, etc. What's wrong with having
    > toolbar buttons that output HTML instead? There's plenty of WYSIWYG HTML
    > editors for wikis, blogs, and so on. What's wrong with those?

    first, this does not work in gnomepad, does it??? when will you get the
    point that unicode is not just one standard for just one task but a
    general standard for general tasks, meeting the needs of even
    old-school text-editor users? it should be fully integrable even at
    the console (admin) level. html editors are definitely not! there's not
    only the problem that html does not improve the use of plain text
    files; there's also the problem known from implementing html-capable
    email readers. some, like sylpheed, dropped the approach for
    maintenance reasons; others tried small but inadequate solutions.
    but plugging in firefox only to read email is weird! this is what
    i meant by the difference between a typographic standard and a
    meta-format. introducing a technically complex beast like html (no
    matter how reduced the semantics actually are) for only some basic
    typographic markup (as is so common in all texts, even those that
    must fall back (heavily) on ASCII formatting) is weird.

    second, embedding html code into text can make it unreadable. it is
    also a security problem (cross-site scripting, for example). some
    don't use a text field but rather lots of javascript to implement a
    pseudo-typewriter. this is complex and can also introduce security
    issues. look at the best-known wiki on earth: wikipedia. the syntax
    it uses is widespread among wikis. it is a more typographic approach,
    using indentation and 'ASCII art' to instruct the editor. this is much
    easier to use and implement than html. however, even this causes a lot
    of confusion for beginners and implies a learning curve. just setting
    something to bold or italic by inserting a character, by contrast,
    feels more natural and is the more easily understood alternative.

    > You say you don't want to use HTML or some other higher-level markup
    > language, but you haven't really said why. You just assume that it's
    > "obvious" that HTML is not the right way to do these things. Why not? It's
    > worked pretty well so far.

    i talked about it in several paragraphs in all of my emails to this
    mailing list (except the first, which was about the namespace key). if
    you can't read, what shall i do?

    regards,
    dennis


