Old concerns about transliteration fonts

The content of this post is outdated, but I am keeping it for "historical" reasons. An up-to-date page can be found at http://jsesh.qenherkhopeshef.org/fr/node/1132

Currently, this topic is mostly a reminder that I want to write something on the subject. The problem of transliteration fonts is definitely not solved yet. But what exactly is the problem, and what are the issues?

At first glance, one wants to print nice transliteration characters in one's articles. Not a big deal, it seems. There are lots of font editors out there, and lots of people have created transliteration fonts.

Well, incompatible transliteration fonts, that is. A font is, to put it briefly, a catalogue of characters. Each character has a numeric code. In the past, those codes were limited to the range 0-255. Most encodings were based on ASCII, which defined the first 128 codes, leaving 128 others for extensions. Lots of extensions were created, all of them incompatible for everything non-ASCII. For instance, MacRoman, the default Mac encoding for Western European writing systems, was completely different from the corresponding Windows encoding and from the ISO encoding generally used by Unix systems at that time.
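To make the incompatibility concrete, here is a minimal Python sketch. The text above does not name the Windows and ISO encodings, so the codec names cp1252 and latin_1 are assumptions standing in for them; the helper names are hypothetical too.

```python
# The same character under three legacy 8-bit encodings.
# Assumption: cp1252 and latin_1 stand in for the Windows and ISO encodings.
CODECS = ["mac_roman", "cp1252", "latin_1"]


def legacy_bytes(char):
    """Return the byte used for `char` in each legacy codec."""
    return {codec: char.encode(codec) for codec in CODECS}


def misread(char, written_as, read_as):
    """Encode with one codec and decode with another, as a careless file transfer would."""
    return char.encode(written_as).decode(read_as)


if __name__ == "__main__":
    # An accented letter gets a different byte on the Mac than on Windows...
    print(legacy_bytes("\u00e9"))
    # ...so a Windows file opened as MacRoman shows the wrong letter.
    print(misread("\u00e9", "cp1252", "mac_roman"))
    # Plain ASCII survives the trip unchanged.
    print(misread("b", "cp1252", "mac_roman"))
```

Only the upper half of the table is affected, which is exactly where a transliteration font has to put anything that does not fit in ASCII.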

For the transliteration of Egyptian, a de facto standard appeared with the “Manuel de Codage”. Basically, most ASCII codes were kept the same (the code for "b" was used for "b"), and a few others were changed: the code for "a" was reused for "ayin", the code for "A" for "aleph", etc.

It was as close to a standard for transliteration as possible. And yet, the software that implemented the Manuel had no "standard" encoding for capital letters. For instance, as code 72 (ASCII H) is used for ḥ, how do you code a capital "H"? How do you code a capital "Ḥ"? Actually, in Winglyph and in the Manuel de Codage, those signs would have been encoded ^h and ^H respectively, but only for inclusion in the hieroglyphic software, not for mixing with other text in a word processor.
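The convention described above can be sketched as a small converter from Manuel de Codage transliteration to Unicode. This is only an illustration: the entries for "a", "A" and "H" and the "^" prefix come from the text, while the other table entries are assumptions based on common usage, and the function name is hypothetical.

```python
import re

# ASCII code -> Unicode transliteration character.
MDC_TO_UNICODE = {
    "A": "\ua723",  # aleph (from the text)
    "a": "\ua725",  # ayin (from the text)
    "H": "\u1e25",  # h with dot below (from the text)
    "x": "\u1e2b",  # h with breve below (assumed)
    "X": "\u1e96",  # h with line below (assumed)
    "S": "\u0161",  # s with caron (assumed)
    "T": "\u1e6f",  # t with line below (assumed)
    "D": "\u1e0f",  # d with line below (assumed)
}


def mdc_to_unicode(text):
    """Convert MdC transliteration; a leading ^ capitalises the next letter."""

    def convert(match):
        caret, char = match.group(1), match.group(2)
        converted = MDC_TO_UNICODE.get(char, char)
        return converted.upper() if caret else converted

    return re.sub(r"(\^?)(.)", convert, text)


if __name__ == "__main__":
    print(mdc_to_unicode("Htp"))   # lowercase h with dot below, then "tp"
    print(mdc_to_unicode("^Htp"))  # capital H with dot below, then "tp"
    print(mdc_to_unicode("^htp"))  # plain capital H, then "tp"
```

The point of the sketch is that the capital forms only exist through the "^" escape, which a word processor knows nothing about.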

The result was quite a nightmare for anyone writing an article on a Mac and sending it to a journal typeset on Windows, or vice versa. I remember removing all capital letters in transliteration from an article because of that.

Now, a bright future awaits. Unicode is going to save us. In fact, Unicode should have saved us a long time ago, because most of the characters we need for transliteration are already there (see the ḥ and the Ḫ). Unicode has been available on personal computers for at least ten years now.

The only problem was that three characters were missing: the aleph, the ayin, and the yod. Now, both the aleph and the ayin have been incorporated in Unicode 5.1, which is great. Of course, most fonts don't include them yet.
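For reference, the aleph and ayin sit at U+A722 to U+A725. A quick way to check that a given Python installation knows them is to ask the standard unicodedata module for their names; the helper below is just an illustrative sketch.

```python
import unicodedata

# Capital and small Egyptological aleph and ayin.
ALEPH_AYIN = ["\ua722", "\ua723", "\ua724", "\ua725"]


def describe(chars):
    """Return (code point, official Unicode name) pairs for each character."""
    return [(f"U+{ord(c):04X}", unicodedata.name(c)) for c in chars]


if __name__ == "__main__":
    for code_point, name in describe(ALEPH_AYIN):
        print(code_point, name)
```

Knowing the code points is the easy part; whether a font actually has glyphs for them is a separate question that this check cannot answer.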

Another problem is that Unicode has composite characters. For instance, you can create an "é" by using an "e" followed by the code U+0301 "Combining Acute Accent". Normally, software should handle those characters gracefully. But it still doesn't, at least not in every case. For Egyptology, that's a bit annoying, as three important characters are only available in this form.
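A short sketch of what "composite" means in practice, using Python's standard unicodedata module. The two spellings of "é" are different strings until they are normalized; the function name is hypothetical.

```python
import unicodedata

decomposed = "e\u0301"   # "e" followed by Combining Acute Accent
precomposed = "\u00e9"   # the single-code-point form


def same_text(a, b):
    """Compare two strings after canonical composition (NFC)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


if __name__ == "__main__":
    print(decomposed == precomposed)            # raw comparison
    print(same_text(decomposed, precomposed))   # comparison after NFC
    print(len(decomposed), len(precomposed))    # number of code points
```

Software that compares, searches or measures text without normalizing first will treat the two spellings as different words.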

First, there is the capital ẖ, which is normally H + U+0331 ("combining macron below"). If you are lucky, you have a decent capital "ẖ" here: "H̱". Or maybe you are not lucky and you see "H□". And does it work with italics "H̱"? In many cases, I do indeed see an "H" with a line below it, but most of the time, the alignment of that line is far from perfect. Alas, alas.
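The asymmetry between the lowercase and capital letter can be checked with normalization: if NFC folds a base letter and a mark into one code point, a precomposed character exists. A minimal sketch, with a hypothetical helper name:

```python
import unicodedata

MACRON_BELOW = "\u0331"
DOT_BELOW = "\u0323"


def has_precomposed_form(base, mark):
    """True if NFC folds base + mark into a single code point."""
    return len(unicodedata.normalize("NFC", base + mark)) == 1


if __name__ == "__main__":
    print(has_precomposed_form("h", MACRON_BELOW))  # lowercase h with line below
    print(has_precomposed_form("H", MACRON_BELOW))  # capital H with line below
    print(has_precomposed_form("H", DOT_BELOW))     # capital H with dot below
```

So the capital with a line below always reaches the font as two code points, and its appearance depends entirely on how well the font and the rendering software position the mark.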

The Egyptological yod suffers the same fate. As the poor letter has a bit of an "i" in it, the Unicode committee decided to make it a composite character (precomposed characters are supposed to be a thing of the past, kept only for legacy software, except that they work better with existing software and hardware). The problem is that the Egyptological yod is complex to recreate. In fact, the "right" accent would be the Greek psili "ı̓" (U+0313 "combining comma above"). But "i" and U+0313 do not combine well for the capital yod. It gives something like "Ỉ" (well, actually, it's "I̓"). The diacritic stands above the "I" and not in front of it. We would like something like "Ἰ" (I'm cheating again).

The right accent has been found: it's the Cyrillic psili pneumata (U+0486 "Combining Cyrillic Psili Pneumata"). Normally, i҆ and I҆ should have just the right shapes. The problem is that lots of fonts have wrongly shaped glyphs for this particular diacritic (a kind of horizontal "T", and not a comma).
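The two candidate spellings of the yod discussed above can be built and inspected as follows. This is only a sketch with hypothetical names; it shows the code point sequences, not how any particular font will draw them.

```python
import unicodedata

# The two combining marks discussed in the text.
YOD_CANDIDATES = {
    "comma above": "\u0313",
    "psili pneumata": "\u0486",
}


def yod_sequences(mark):
    """Return the small and capital yod built from a base letter plus `mark`."""
    return ("i" + mark, "I" + mark)


def report():
    """List (label, official mark name, length after NFC) for each sequence."""
    rows = []
    for label, mark in YOD_CANDIDATES.items():
        for sequence in yod_sequences(mark):
            composed = unicodedata.normalize("NFC", sequence)
            rows.append((label, unicodedata.name(mark), len(composed)))
    return rows


if __name__ == "__main__":
    for row in report():
        print(row)
```

Every sequence stays two code points long after normalization, so in both cases the final shape is left to the font.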

All this rambling to say that I really (really) wish the Egyptological community would decide on some temporary convention: either to keep using the "old" ASCII encoding for a while, or to encode the problematic glyphs in the private use area of Unicode. Because a) people need to write transliteration now, and b) we need to move toward Unicode in an orderly way, lest we end up with a bunch of unusable text files, at least until all software and computers are able to use full Unicode notation. As Egyptologists are not gamers who change their computer every other year (it's more like every ten years), that may be five years from now.
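To illustrate what a private-area convention could look like, here is a hedged sketch. The code points U+E000 to U+E002 are placeholders chosen for the example, not an agreed or proposed assignment, and the function names are hypothetical. The important property is that the mapping is reversible, so files can be converted to standard Unicode later.

```python
import unicodedata

# Hypothetical assignments: placeholders only, not an agreed convention.
PUA_CONVENTION = {
    "H\u0331": "\ue000",  # capital h with line below
    "i\u0486": "\ue001",  # small yod
    "I\u0486": "\ue002",  # capital yod
}


def to_private_use(text):
    """Replace each problematic sequence with its private-use code point."""
    for sequence, pua in PUA_CONVENTION.items():
        text = text.replace(sequence, pua)
    return text


def to_standard(text):
    """Undo to_private_use, restoring the standard combining sequences."""
    for sequence, pua in PUA_CONVENTION.items():
        text = text.replace(pua, sequence)
    return text


def is_private_use(char):
    """True if `char` belongs to a Unicode private use area (category Co)."""
    return unicodedata.category(char) == "Co"
```

A font following such a convention would carry one well-drawn glyph per private code point, at the price of text that is meaningless without that font.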

P.S. I'm contemplating the idea of getting some academic support to ask for the Egyptological yod to be made a first-class citizen in Unicode.

The participants in the "Informatique & Égyptologie" conference in Vienna were asked about the problem of the yod, and most of us were in favour of a first-class citizen yod.

Another point in its favour is that only OpenType fonts would be able to compose the accent correctly if we used the Cyrillic psili. TrueType and PostScript fonts would not do. And there is still a lot of software that uses TrueType fonts.

Serge Rosmorduc

Good luck with getting that yod in. I asked about this years ago and it went nowhere. Most of the people at Unicode were downright against it, arguing that precomposed glyphs are not really ideal and that it can be composed from other glyphs. There really seemed to be little interest in listening to professional Egyptologists at that time.

Maybe things have changed, but I doubt it. I just use the space that the Tebtunis group has suggested (and which I believe was in the earlier proposal). Sadly, nothing will change until the TLA, and maybe the IFAO, CCER and other big groups, start demanding it. On the plus side, /j/ doesn't require a special glyph.