Formatting Durations
The following is a strawman proposal for additions to CLDR 1.5 to support
durations. The goal is to have a mechanism that allows for reasonable formatting
of common durations, with a format which is as easy as possible for translators
to use (with instruction).
Formats
- Add a set of flexible formats targeted at durations.
<durationFormatItem id="hhmmss">hh:mm:ss</durationFormatItem>
<durationFormatItem id="hhmm">hh:mm</durationFormatItem>
<durationFormatItem id="hhhmmm">hhh mmm</durationFormatItem>
<durationFormatItem id="wwwddd">ddd www</durationFormatItem>
- The one and two letter fields have their normal semantics, except that
the numeric width of the top field is unbounded. Eg 108 hours and 23.5
minutes would be "108:23:30" with the first of the above formats. The
difference between hh,HH,kk,KK is ignored: all hour fields are 0..∞.
- Field lengths of 3 or above are handled by formatting each field as a special
field value (see below), then substituting. Three letters (eg hhh) get the
abbreviated format; four letters (eg hhhh) get the wide format.
- There is an additional concatenation format for fallback, eg "{0}, {1}".
If there is no exact match, the longest initial match in big-endian order is
used, and the results concatenated with this format. Eg, suppose the key is
"dms", and there is no match. Then we try for "dm", then "d", and
concatenate that result with the result of formatting the rest, using "{0},
{1}". So we might get "1 day, 3 minutes 17 seconds".
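The fallback lookup described above can be sketched as follows. This is only an illustration, not part of the proposal: the function name, the dictionary of already-formatted pieces, and the assumption that every field in the key is a single letter (so a prefix never splits a field such as "hhh") are all hypothetical.

```python
def format_with_fallback(key, formatted, glue="{0}, {1}"):
    """Return the text for `key`, falling back to the longest initial match.

    `formatted` is a hypothetical table mapping a key (one letter per field,
    big-endian order) to text that has already been formatted for that key.
    """
    if key in formatted:
        return formatted[key]
    # Try successively shorter prefixes: for "dms" that is "dm", then "d".
    for end in range(len(key) - 1, 0, -1):
        head, rest = key[:end], key[end:]
        if head in formatted:
            # Format the remainder the same way, then join with the
            # concatenation format.
            return glue.format(formatted[head],
                               format_with_fallback(rest, formatted, glue))
    raise KeyError("no duration format matches any prefix of %r" % key)


# Pieces available for the example in the text: "d" and "ms", but no "dms".
pieces = {"d": "1 day", "ms": "3 minutes 17 seconds"}
print(format_with_fallback("dms", pieces))
```

With the pieces shown, neither "dms" nor "dm" is present, so "d" is used and joined to the result for "ms", giving the string quoted in the text.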
Special Field Values
- We add structure to CLDR for each field type, something like the following.
<durationLength type="wide">
<duration type="h" number="one">1 Stunde</duration>
<duration type="h" number="other">{0} Stunden</duration>
<duration type="m" number="one">1 Minute</duration>
...
</durationLength>
<durationLength type="abbreviated">
<duration type="h" number="other">{0}h</duration>
<duration type="m" number="one">1m</duration>
...
</durationLength>
- In these fields, {0} is a placeholder that uses the default number
format for that locale.
- The number attribute keywords are defined to be the following,
initially. (We would add more keywords as we find languages that need them.)
So for Russian, what corresponds to the above list would contain oneMod
and fewMod. The other keyword is matched if none of the available
others match.
| keyword | tests condition | comment |
| zero | x == 0 | |
| one | x == 1 | |
| two | x == 2 | used in Slovenian |
| some | x == 3 || x == 4 | used in Slovenian |
| oneMod | x == 1 || x > 20 && (x mod 10) == 1 | used in Russian, Serbian,... |
| fewMod | 2 <= x && x <= 4 || x > 20 && 2 <= (x mod 10) && (x mod 10) <= 4 | used in Russian, Serbian,... |
| other | x == anything | only matches if no other conditions true |
- Issue: should we allow multiple keywords in the number attribute of a
single element, like <duration type="h" number="one some">{0}zw</duration>?
At this point I don't think it is necessary.
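The keyword table can be read as a list of tests, of which only those keywords a locale actually supplies are considered, with other as the catch-all. The sketch below is one possible reading under stated assumptions: the order in which keywords are tried, the function names, and the use of plain str.format in place of the locale's default number format are not specified by the proposal.

```python
# Tests transcribed from the keyword table; the order of trial is an assumption.
CONDITIONS = [
    ("zero", lambda x: x == 0),
    ("one", lambda x: x == 1),
    ("two", lambda x: x == 2),
    ("some", lambda x: x == 3 or x == 4),
    ("oneMod", lambda x: x == 1 or (x > 20 and x % 10 == 1)),
    ("fewMod", lambda x: 2 <= x <= 4 or (x > 20 and 2 <= x % 10 <= 4)),
]


def select_keyword(x, available):
    """Pick the first available keyword whose test holds, else "other"."""
    for keyword, test in CONDITIONS:
        if keyword in available and test(x):
            return keyword
    return "other"


def format_field(x, patterns):
    """Substitute x into the pattern chosen for it.

    `patterns` maps keywords to patterns for one field type and is assumed
    to contain "other". A real implementation would format {0} with the
    locale's default number format rather than str.format.
    """
    return patterns[select_keyword(x, patterns)].format(x)


# Wide hour patterns from the German example above.
hours_wide = {"one": "1 Stunde", "other": "{0} Stunden"}
print(format_field(1, hours_wide))
print(format_field(108, hours_wide))

# A locale that supplies only oneMod, fewMod and other, as suggested for Russian.
russian_keywords = {"oneMod", "fewMod", "other"}
print([select_keyword(n, russian_keywords) for n in (1, 3, 5, 11, 21, 22)])
```

Because only the supplied keywords are tested, a value of 1 selects one for the German data but oneMod for a locale that provides oneMod instead.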
Expected API
The expected API would have certain parameters.
- It would allow the programmer to pass in a key (eg "hm" or "hhhmmm") as
in flexible formats. (While it is possible to pass in fields of mixed
lengths, doing so is unlikely to produce good results.)
- It would allow the least significant field to have fractions. The
programmer could pass in just the min/max fractional digits (or, for
generality, a number format for that locale). Thus 108 hours 23.5
minutes would be "108:23.5" with the second of the above formats.
- It would allow leading, trailing, and/or interior zero fields (in any
combination) to be suppressed. Suppose for example that the key is "dhms",
and the actual value turns out to be 0 days 3 hours 0 minutes 5 seconds.
Then here are some results:
- suppress leading results in "3 hours 0 minutes 5 seconds"
- suppress leading+interior results in "3 hours 5 seconds"
The actual key value that is looked up will change if suppression is
chosen. So if the key is dhms and the h value is zero, then "dms" is
actually looked up in the flexible duration list.
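How suppression changes the key that is looked up can be sketched as below. The function name, its boolean parameters, the single-letter-per-field key, and the choice to keep the smallest field when every value is zero are assumptions made for illustration; the proposal does not define them.

```python
def suppress_zero_fields(key, values, leading=False, interior=False,
                         trailing=False):
    """Return the lookup key after dropping zero-valued fields as requested.

    `key` lists the fields in big-endian order, one letter each (eg "dhms");
    `values` maps each field letter to its numeric value.
    """
    nonzero = [i for i, field in enumerate(key) if values.get(field, 0) != 0]
    if not nonzero:
        # Assumption: when everything is zero, keep only the smallest field.
        return key[-1]
    first, last = nonzero[0], nonzero[-1]
    kept = []
    for i, field in enumerate(key):
        is_zero = values.get(field, 0) == 0
        drop = is_zero and (
            (leading and i < first)
            or (trailing and i > last)
            or (interior and first < i < last)
        )
        if not drop:
            kept.append(field)
    return "".join(kept)


# The example from the text: 0 days 3 hours 0 minutes 5 seconds.
value = {"d": 0, "h": 3, "m": 0, "s": 5}
print(suppress_zero_fields("dhms", value, leading=True))
print(suppress_zero_fields("dhms", value, leading=True, interior=True))
```

The first call yields the key "hms" and the second "hs"; the resulting key is what would then be looked up in the flexible duration list.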