RE: PH technical issues (was RE: Why Fraktur is irrelevant

From: Peter Constable (petercon@microsoft.com)
Date: Fri May 28 2004 - 13:08:57 CDT


    > From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org]
    > On Behalf Of Peter Kirk

    > To me the answer to this argument is simple: plain text is intended to
    > communicate semantic content only...

    The problem with your response is that it bypasses the question of what
    should reasonably be considered semantic content. The premise of the
    argument -- that the semantic content is the same -- is not valid as a
    premise because it is precisely what is at stake. Your argument is
    therefore circular: PH and sq H should not be considered semantically
    different because they are not semantically different.

    Obviously, we can decide -- and will be deciding -- whether
    encoded characters for PH and sq Heb are considered to have the same or
    different semantic content -- i.e. whether they are the same encoded
    characters or different encoded characters. Our decision cannot be based
    on a premise that they do or do not. It must be based on factors such
    as:

    - whether users *perceive* the semantic content to be the same or
    different (as that will affect their expectations of how IT systems
    behave)

    - whether the IT needs of users overall will be best served by
    considering the semantic content to be the same or different.

    The point of this usage scenario is that Sally and Latisha would be
    well-served if the scripts are encoded as having *distinct* semantic
    content.

    > she should not have
    > any expectations about what the result will look like - only that its
    > semantics will be preserved, which they will be as her Phoenician words
    > are still meaningful with square Hebrew glyphs

    Her "Phoenician words" in this case are probably something like her
    name, or a transliteration of English words. The suggestion is that they
    would not be meaningful to her in square Hebrew glyphs, but that
    suggestion (like your argument) presupposes what is considered semantic
    content.

    > (that is how many scholars represent Phoenician text).

    But, as we have already seen, only *some* scholars, and the overall user
    community includes many people other than paleography scholars.

    > If she wants to control its
    > appearance, she should use graphics or PDF format, or at least HTML
    > which will specify the font used on her computer

    So, your rebuttal is the same as David Starner's: use markup or non-text
    representation. As I said to him, I think that is a greater
    inconvenience to these users than character folding is to the Semitic
    paleographers who consider the semantics the same.

    > The scenario you are looking at is actually a rather unlikely one.

    Fine. So you reject this scenario. Let's move on. As I said, it was just
    an example that tried to get away from paleography. You and David have
    said to use markup or don't use text. I've suggested that's probably a
    greater burden than character folding. I think we should then try to
    consider whether other more reasonable scenarios would lead us to accept
    or reject that position.

     
    > On 28/05/2004 02:56, Christopher Fynn wrote:

    Is there a reason why you have started grouping your responses together?
    I don't at all care for it.

    > > If this is "trivial" for scholarly users then using a tailoring to
    > > achieve interleaved collation and / or folding wouldn't be difficult
    > > for them either.
    >
    > I disagree. Tailoring is possible, but it is far more complicated than
    > adding a script or scribe name tag to a database.

    I know that you use Shoebox or Toolbox for linguistic corpus data. In
    that context, this amounts to setting up a new language configuration,
    or a new sort order and character classes for a given language. If you
    do the former, you will end up doing the latter. So, for you, both are
    possible and comparable in terms of difficulty.

    > Anyway, D. Starner's
    > requirement for detailed script marking will not be met by defining a
    > separate Phoenician script.

    That's not really relevant. No matter what we decide here, the situation
    David has described would still require detailed script marking. That is
    *not* a problem scenario that the PH proposal (or any proposal for *any*
    script) was intended to solve.

    > I think we can assume that Unicode will not
    > want to encode individual scribes' handwriting as separate scripts. :-)

    Not only has it not been proposed, it has already been stated clearly
    that that is not what is sought.

    Peter
     
    Peter Constable
    Globalization Infrastructure and Font Technologies
    Microsoft Windows Division


