
1. TEI internationalization project

The aim is to lower the barrier for entry of non-English-speaking users by:
  • Ensuring that all of the TEI is Unicode-safe
  • Translating the reference element and attribute descriptions
  • Providing tools to easily access non-English versions
  • Localizing TEI software
  • Localizing the TEI examples
  • Translating the running prose of the Guidelines
  • Translating the object names

2. Reminder: definitions

Internationalization (I18N)
Internationalization is the process of generalizing a product so that it can handle multiple languages and cultural conventions without the need for redesign. Internationalization takes place at the level of program design and document development.
Localization (L10N)
Localization is the process of taking a product and making it linguistically and culturally appropriate to a given target locale (country/region and language) where it will be used.

3. Examples of translation

  • instead of <addrLine>, the TEI user might prefer to write <líneaDirección>, <ligneAdresse>, <linDireccio> or <AdressZeile>.
  • instead of "supplies the descriptive and declarative information making up an electronic title page prefixed to every TEI-conformant text.", the French-speaking user might find it more helpful to read "L'élément En-tête TEI <TeiHeader> fournit les informations descriptives et déclaratives précédant chaque texte conforme à la TEI et qui permettent de constituer une page de titre électronique."
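
One way to picture translated element names is as a lookup table from each localized name back to the canonical English one. The Python sketch below is illustrative only: the `LOCAL_TO_CANONICAL` table and the `canonicalize` function are hypothetical names, not part of any TEI tool, and the table holds just the four names listed above.

```python
import xml.etree.ElementTree as ET

# Hypothetical lookup table: localized element name -> canonical TEI name.
# Only the four example names from the list above are included.
LOCAL_TO_CANONICAL = {
    "líneaDirección": "addrLine",
    "ligneAdresse": "addrLine",
    "linDireccio": "addrLine",
    "AdressZeile": "addrLine",
}


def canonicalize(root, mapping=LOCAL_TO_CANONICAL):
    """Rename localized elements in place to their canonical names."""
    for el in root.iter():
        # ElementTree stores namespaced tags as "{uri}local"; keep any
        # "{uri}" prefix and translate only the local part.
        ns, sep, local = el.tag.rpartition("}")
        el.tag = ns + sep + mapping.get(local, local)
    return root


doc = ET.fromstring(
    "<address><ligneAdresse>13 Banbury Road, Oxford OX2 6NN, UK</ligneAdresse></address>"
)
canonicalize(doc)
print(doc[0].tag)  # addrLine
```

Names that are missing from the table pass through unchanged, so a document that mixes canonical and localized names would still be handled.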

4. Localization of examples

What does this
<lg>
 <l>Sire Thopas was a doghty swayn;</l>
 <l>White was his face as payndemayn,</l>
 <l>His lippes rede as rose;</l>
 <l>His rode is lyk scarlet in grayn,</l>
 <l>And I yow telle in good certayn,</l>
 <l>He hadde a semely nose.</l>
</lg>
mean to a Chinese scholar?

5. Example of translated ODD

<elementSpec xmlns="http://www.tei-c.org/ns/1.0"

 module="header"
 xml:id="TEIHEAD"
 usage="req"
 ident="teiHeader">

<equiv xmlns="http://www.tei-c.org/ns/1.0"
/>

<gloss xmlns="http://www.tei-c.org/ns/1.0"
>
TEI Header</gloss>
<gloss xmlns="http://www.tei-c.org/ns/1.0"
 version="2006-10-28" xml:lang="ja">
TEIヘダー</gloss>
<gloss xmlns="http://www.tei-c.org/ns/1.0"
 xml:lang="zh-tw">
tei標頭</gloss>
<gloss xmlns="http://www.tei-c.org/ns/1.0"
 xml:lang="fr">
En-tête TEI</gloss>
<desc xmlns="http://www.tei-c.org/ns/1.0"
>
supplies the descriptive and declarative information making
up an electronic title page prefixed to every TEI-conformant
text.</desc>
<desc xmlns="http://www.tei-c.org/ns/1.0"
 version="2006-10-28" xml:lang="ja">
TEI準拠テキストに付与される,電子版のタイトルページを
構成する,記述的・宣言的情報を含む.</desc>
<desc xmlns="http://www.tei-c.org/ns/1.0"
 xml:lang="zh-tw">
在所有符合TEI標準的文本起始的電子題名頁當中提供敘述性以及宣告性的資訊。</desc>
<desc xmlns="http://www.tei-c.org/ns/1.0"
 xml:lang="fr">
L'élément En-tête TEI <gi xmlns="http://www.tei-c.org/ns/1.0"
>
TeiHeader</gi> fournit les informations descriptives et déclaratives précédant chaque texte conforme à la TEI et qui permettent de constituer une page de titre électronique.</desc>
<content xmlns="http://www.tei-c.org/ns/1.0"
>

<rng:group>
 <rng:ref name="fileDesc"/>
 <rng:zeroOrMore>
  <rng:ref name="model.headerPart"/>
 </rng:zeroOrMore>
 <rng:optional>
  <rng:ref name="revisionDesc"/>
 </rng:optional>
</rng:group></content>
<attList xmlns="http://www.tei-c.org/ns/1.0"
>

<attDef xmlns="http://www.tei-c.org/ns/1.0"
 ident="type" usage="opt">

<equiv xmlns="http://www.tei-c.org/ns/1.0"
/>

<desc xmlns="http://www.tei-c.org/ns/1.0"
>
specifies the kind of document to which the header is attached.</desc>
<desc xmlns="http://www.tei-c.org/ns/1.0"
 version="2006-10-28" xml:lang="ja">
当該ヘダーが付与される文書の種類を示す.</desc>
<desc xmlns="http://www.tei-c.org/ns/1.0"
 xml:lang="fr">
Spécifie le type de document auquel l'en-tête se rapporte.</desc>
<datatype xmlns="http://www.tei-c.org/ns/1.0"
>

<rng:ref name="data.enumerated"/></datatype>
<defaultVal xmlns="http://www.tei-c.org/ns/1.0"
>
text</defaultVal>
<valList xmlns="http://www.tei-c.org/ns/1.0"
 type="open">

<valItem xmlns="http://www.tei-c.org/ns/1.0"
 ident="text">

<equiv xmlns="http://www.tei-c.org/ns/1.0"
/>

<gloss xmlns="http://www.tei-c.org/ns/1.0"
>
the header is attached to a single text.</gloss>
<gloss xmlns="http://www.tei-c.org/ns/1.0"
 xml:lang="fr">
L'en-tête se rapporte à un texte isolé.</gloss></valItem>
<valItem xmlns="http://www.tei-c.org/ns/1.0"
 ident="corpus">

<equiv xmlns="http://www.tei-c.org/ns/1.0"
/>

<gloss xmlns="http://www.tei-c.org/ns/1.0"
>
the header is attached to a corpus.</gloss>
<gloss xmlns="http://www.tei-c.org/ns/1.0"
 xml:lang="fr">
L'en-tête se rapporte à un corpus.</gloss></valItem></valList></attDef></attList>
<exemplum xmlns="http://www.tei-c.org/ns/1.0"
>

<egXML><teiHeader>
  <fileDesc>
   <titleStmt>
    <title>Shakespeare: the first folio (1623) in electronic form</title>
    <author>Shakespeare, William (1564–1616)</author>
    <respStmt>
     <resp>Originally prepared by</resp>
     <name>Trevor Howard-Hill</name>
    </respStmt>
    <respStmt>
     <resp>Revised and edited by</resp>
     <name>Christine Avern-Carr</name>
    </respStmt>
   </titleStmt>
   <publicationStmt>
    <distributor>Oxford Text Archive</distributor>
    <address>
     <addrLine>13 Banbury Road, Oxford OX2 6NN, UK</addrLine>
    </address>
    <idno type="OTA">119</idno>
    <availability>
     <p>Freely available on a non-commercial basis.</p>
    </availability>
    <date value="1968">1968</date>
   </publicationStmt>
   <sourceDesc>
    <bibl>The first folio of Shakespeare, prepared by Charlton Hinman
         (The Norton Facsimile, 1968)</bibl>
   </sourceDesc>
  </fileDesc>
  <encodingDesc>
   <projectDesc>
    <p>Originally prepared for use in the production of a series of
         old-spelling concordances in 1968, this text was extensively
         checked and revised for use during the editing of the new Oxford
         Shakespeare (Wells and Taylor, 1989).</p>
   </projectDesc>
   <editorialDecl>
    <correction>
     <p>Turned letters are silently corrected.</p>
    </correction>
    <normalization>
     <p>Original spelling and typography is retained, except
           that long s and ligatured forms are not encoded.</p>
    </normalization>
   </editorialDecl>
   <refsDecl xml:id="ASLREF">
    <cRefPattern
      matchPattern="(\S+) ([^.]+)\.(.*)"
      replacementPattern="#xpath(//div1[@n='$1']/div2[@n='$2']//lb[@n='$3'])">

     <p>A reference is created by assembling the following,
           in the reverse of the order listed here:
     <list>
       <item>the <att>n</att> value of the preceding <gi>lb</gi>
       </item>
       <item>a period</item>
       <item>the <att>n</att> value of the ancestor <gi>div2</gi>
       </item>
       <item>a space</item>
       <item>the <att>n</att> value of the parent <gi>div1</gi>
       </item>
      </list>
     </p>
    </cRefPattern>
   </refsDecl>
  </encodingDesc>
  <revisionDesc>
   <list>
    <item>
     <date value="1989-04-12">12 Apr 89</date> Last checked by CAC</item>
    <item>
     <date value="1989-03-01">1 Mar 89</date> LB made new file</item>
   </list>
  </revisionDesc>
</teiHeader>
</egXML></exemplum>
<exemplum xmlns="http://www.tei-c.org/ns/1.0"
 xml:lang="fr">

<egXML><teiHeader>
  <fileDesc>
   <titleStmt>
    <title>Shakespeare: the first folio (1623) sous forme électronique</title>
    <author>Shakespeare, William (1564–1616)</author>
    <respStmt>
     <resp>Préparé par</resp>
     <name>Trevor Howard-Hill</name>
    </respStmt>
    <respStmt>
     <resp>Révisé et édité par</resp>
     <name>Christine Avern-Carr</name>
    </respStmt>
   </titleStmt>
   <publicationStmt>
    <distributor>Oxford Text Archive</distributor>
    <address>
     <addrLine>13 Banbury Road, Oxford OX2 6NN, UK</addrLine>
    </address>
    <idno type="OTA">119</idno>
    <availability>
     <p>Disponible gratuitement à des fins non commerciales.</p>
    </availability>
    <date value="1968">1968</date>
   </publicationStmt>
   <sourceDesc>
    <bibl>The first folio of Shakespeare, préparé par Charlton Hinman
         (The Norton Facsimile, 1968)</bibl>
   </sourceDesc>
  </fileDesc>
  <encodingDesc>
   <projectDesc>
    <p>Préparé pour la production d'une collection de concordances old-spelling en 1968, ce texte a été profondément relu et révisé pour l'édition du new Oxford
         Shakespeare (Wells and Taylor, 1989).</p>
   </projectDesc>
   <editorialDecl>
    <correction>
     <p>Les caractères bloqués sont corrigés sans commentaire.</p>
    </correction>
    <normalization>
     <p>L'orthographe et la typographie originales sont conservées, à l'exception des s longs et des ligatures qui ne sont pas encodées.</p>
    </normalization>
   </editorialDecl>
   <refsDecl xml:id="ASLREF-FR">
    <cRefPattern
      matchPattern="(\S+) ([^.]+)\.(.*)"
      replacementPattern="#xpath(//div1[@n='$1']/div2[@n='$2']//lb[@n='$3'])">

     <p>Une référence est créée en assemblant les éléments suivants dans l'ordre inverse de la liste suivante :
     <list>
       <item>la valeur de l'attribut <att>n</att> de l'élément <gi>lb</gi> précédent.</item>
       <item>un point</item>
       <item>la valeur de l'attribut <att>n</att> de l'élément <gi>div2</gi> ancêtre.</item>
       <item>un espace</item>
       <item>la valeur de l'attribut <att>n</att> de l'élément <gi>div1</gi> parent.</item>
      </list>
     </p>
    </cRefPattern>
   </refsDecl>
  </encodingDesc>
  <revisionDesc>
   <list>
    <item>
     <date value="1989-04-12">12 avril 1989</date> Dernière vérification par CAC</item>
    <item>
     <date value="1989-03-01">1er mars 1989</date> Nouveau fichier par LB</item>
   </list>
  </revisionDesc>
</teiHeader>
</egXML></exemplum>
<remarks xmlns="http://www.tei-c.org/ns/1.0"
>

<p xmlns="http://www.tei-c.org/ns/1.0"
>
One of the few elements unconditionally required in any
TEI document.</p></remarks>
<remarks xmlns="http://www.tei-c.org/ns/1.0"
 xml:lang="fr">

<p xmlns="http://www.tei-c.org/ns/1.0"
>
Un des rares éléments obligatoires dans tout document TEI.</p></remarks>
<listRef xmlns="http://www.tei-c.org/ns/1.0"
>

<ptr xmlns="http://www.tei-c.org/ns/1.0"
 target="#HD11"/>

<ptr xmlns="http://www.tei-c.org/ns/1.0"
 target="#CCDEF"/>
</listRef></elementSpec>
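
A tool that consumes a specification like the one above has to choose which `<gloss>` or `<desc>` to show. The Python sketch below is one possible approach and not the TEI stylesheets' actual logic: it assumes that a child without `xml:lang` is the English original and falls back to it when no translation exists. The function name `localized_text` is hypothetical.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# A cut-down copy of the specification shown above (glosses only).
SPEC = """<elementSpec xmlns="http://www.tei-c.org/ns/1.0"
    module="header" ident="teiHeader">
  <gloss>TEI Header</gloss>
  <gloss xml:lang="ja">TEIヘダー</gloss>
  <gloss xml:lang="zh-tw">tei標頭</gloss>
  <gloss xml:lang="fr">En-tête TEI</gloss>
</elementSpec>"""


def localized_text(spec, tag, lang):
    """Return the text of the direct child `tag` written in `lang`.

    Falls back to the first child without xml:lang (assumed here to be
    the English original), or None if there is no such child.
    """
    fallback = None
    for child in spec.findall(f"{{{TEI_NS}}}{tag}"):
        # Flatten nested markup and normalize whitespace.
        text = " ".join("".join(child.itertext()).split())
        child_lang = child.get(XML_LANG)
        if child_lang == lang:
            return text
        if child_lang is None and fallback is None:
            fallback = text
    return fallback


spec = ET.fromstring(SPEC)
print(localized_text(spec, "gloss", "fr"))  # En-tête TEI
print(localized_text(spec, "gloss", "de"))  # TEI Header (no German gloss yet)
```

Falling back to the untranslated text keeps a partially translated specification usable while translation is still in progress.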

6. The scale of work involved

  • 494 elements
  • 116 classes
  • 476 attributes
  • 1115 <gloss> elements, 29357 characters
  • 1170 <desc> elements, 98415 characters
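
Figures of this kind can be gathered by walking the ODD source. The Python sketch below is illustrative only: it assumes that only elements without `xml:lang` (the presumed English originals) are counted and that whitespace is normalized before counting characters. The slides do not say how the totals above were computed, and `translation_workload` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def translation_workload(root, tags=("gloss", "desc")):
    """Map each tag to (element count, character count).

    Only elements without xml:lang are counted, on the assumption
    that these are the originals still awaiting translation.
    """
    totals = {}
    for tag in tags:
        originals = [
            el for el in root.iter(f"{{{TEI_NS}}}{tag}")
            if el.get(XML_LANG) is None
        ]
        chars = sum(
            len(" ".join("".join(el.itertext()).split())) for el in originals
        )
        totals[tag] = (len(originals), chars)
    return totals


SAMPLE = """<elementSpec xmlns="http://www.tei-c.org/ns/1.0" ident="teiHeader">
  <gloss>TEI Header</gloss>
  <gloss xml:lang="fr">En-tête TEI</gloss>
  <desc>supplies the descriptive and declarative information.</desc>
  <attList>
    <attDef ident="type">
      <desc>specifies the kind of document.</desc>
    </attDef>
  </attList>
</elementSpec>"""

for tag, (count, chars) in translation_workload(ET.fromstring(SAMPLE)).items():
    print(f"{count} <{tag}> elements, {chars} characters")
```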

7. Who's up for this?

Volunteers for the following languages have stepped up:
  • Chinese
  • Dutch
  • French
  • German
  • Hindi
  • Italian
  • Japanese
  • Polish
  • Portuguese
  • Romanian
  • Serbian
  • Slovenian
  • Spanish
  • Swedish

8. The 2007 project

The ALLC has generously provided funding of £10,0000 to oil the wheels and we should now be able to deliver in French, Spanish, German, Chinese and Japanese by this time next year, producing:
  • translated <desc> and <gloss> texts
  • a web application to allow review and update of translations
  • a mechanism to allow users to easily take advantage of the work
  • translated element and attribute names

9. Infrastructure work

Roma has been changed to support the following output schemes:
  • canonical: English names, descriptions in English
  • local descriptions: English names, descriptions in chosen language
  • local names: names designed to make sense to a speaker of the chosen language, descriptions in English
  • fully localized: both names and descriptions in chosen language
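
The four schemes amount to two independent choices: which language the names are in, and which language the descriptions are in. The Python sketch below only restates that table as data. It is not Roma's implementation, and `SCHEMES` and `resolve_languages` are hypothetical names.

```python
# Each scheme maps to (localize names?, localize descriptions?).
SCHEMES = {
    "canonical": (False, False),
    "local descriptions": (False, True),
    "local names": (True, False),
    "fully localized": (True, True),
}


def resolve_languages(scheme, chosen_lang, canonical_lang="en"):
    """Return (name language, description language) for an output scheme."""
    local_names, local_descriptions = SCHEMES[scheme]
    name_lang = chosen_lang if local_names else canonical_lang
    desc_lang = chosen_lang if local_descriptions else canonical_lang
    return name_lang, desc_lang


for scheme in SCHEMES:
    names, descriptions = resolve_languages(scheme, "fr")
    print(f"{scheme}: names in {names}, descriptions in {descriptions}")
```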

10. Partners

Chinese
Weining Huang, Würzburg; checking by Marcus Bingenheimer
French
Coordinated by Pierre-Yves Duchesmin (ENSSIB) for AFNOR
German
Werner Wegstein, Würzburg; checking by Christian Wittern
Japanese
Ohya Kazushi, Tsurumi University
Spanish
? University of Alicante
Infrastructure
Sebastian Rahtz


Sebastian Rahtz. Date: October 2006
Copyright University of Oxford