IT Services, University of Oxford

1. Before you start

In this exercise, you will use Roma, a web tool available from the TEI web site and usable with any web browser (Firefox, Internet Explorer or Opera, for example). Once you have created your schema, you will also need an XML-aware editor; we use oXygen in our example.

Our goal is to make a schema which we can use to validate our ‘SleepyHollow’ document. Although the document currently validates, it does so against the default TEI schema, which includes all sorts of things we don't need for our particular use of the TEI. Experience has shown that if you constrain your schema to include only those elements that you actually want to use (adding others only when you really need them), your encoders will be less likely to use the wrong elements and your documents will be easier to process. To that end, we're going to create a schema suited to marking up this type of document.

It would have made sense to do this before we marked up the document, since then we could have really used it! The key to creating a good schema is being familiar with the documents one is going to edit. Sometimes marking up a small sample of the documents is a good way to get a feel for the types of phenomena they contain. Since we don't need all of TEI Lite, never mind the full TEI, we'll need to pick and choose from the various modules those bits that we want. We'll also have to tinker with some of the modules to make our schema more helpful with daily editing.

2. Making your own schema

  1. Open the Roma application by pointing your favourite web browser at
    http://tei.oucs.ox.ac.uk/Roma/
  2. The Roma start screen allows you to create a new customization, or to upload an existing customization for further work. We will start from scratch, which means ticking the first radio button ("Build schema (Create a new customisation by adding elements and modules to the smallest recommended schema)"). Press the Submit button at bottom left of the screen to continue.
The next and subsequent screens show you a row of tabs for acting on your customization (New, Customize, Language, Modules, Add elements, Change classes, Schema, Documentation, Save Customization, and Sanity Checker). We won't explore all of these in this exercise because we just don't have time! By default the "Set your parameters" screen is displayed. This allows you to specify a file name and other details for the schema, and also to change the interface language if you wish. Type in the following values:
Title
Schema for Sleepy Hollow
Filename
sleepyhollow
Language
English
Author name
YOUR NAME
Description
A schema to validate Sleepy Hollow
When you have filled in these values click on the ‘Submit’ button.
When you have done this, click on the Modules tab to proceed. The modules screen shows two lists: on the left are all available TEI modules; on the right are the modules currently selected for your schema. (By default, 'core', 'tei', 'header' and 'textstructure' are selected, because you almost certainly don't want to create a TEI document without these!) You can add modules from the list on the left, and remove modules from the list on the right, by clicking the appropriate word next to the module you wish to operate on.
  1. For this exercise, we will need the following extra modules:
    • transcr
    • namesdates
    Click the word add next to these modules.
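
For orientation, the module selection made on this screen is recorded in the saved ODD as a list of <moduleRef> elements inside a <schemaSpec>. The Python sketch below embeds a hand-written fragment of that kind and lists the modules it references. The fragment is an assumption about the general shape, not a copy of what Roma writes: a real saved file wraps it in a complete TEI document and may differ in detail.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hand-written illustration of the module selection; Roma's real output
# wraps this in a full TEI document and may differ in detail.
ODD_FRAGMENT = f"""
<schemaSpec xmlns="{TEI_NS}" ident="sleepyhollow" start="TEI">
  <moduleRef key="core"/>
  <moduleRef key="tei"/>
  <moduleRef key="header"/>
  <moduleRef key="textstructure"/>
  <moduleRef key="transcr"/>
  <moduleRef key="namesdates"/>
</schemaSpec>
"""

def selected_modules(odd_xml: str) -> list[str]:
    """Return the module keys referenced by an ODD fragment, in document order."""
    root = ET.fromstring(odd_xml)
    return [ref.get("key") for ref in root.iter(f"{{{TEI_NS}}}moduleRef")]

print(selected_modules(ODD_FRAGMENT))
```

Running the same function over the ODD you save later in this exercise is one quick way to confirm that both extra modules were actually added.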

We are now ready to generate a schema. Click the Schema tab, select 'Relax NG XML schema' and then press Submit. (The Relax NG compact syntax is also available if you prefer it.) Your browser should ask whether you want to save or open the generated file: you should save it onto your Desktop.

Complete this stage by going to the Save tab and saving your work to the same place. This saved file is an ‘ODD’ file, itself written in a specialized subset of TEI markup. Do not close the web browser; we'll use it again shortly.

3. Using your schema in oXygen

You can use oXygen and the file SleepyHollow-roma.xml to check that you've made your schema correctly. Proceed as follows:
  1. If it isn't already open, load the oXygen XML editor.
  2. Open the file SleepyHollow-roma.xml.
  3. In oXygen, go to the Document menu, then select XML Document and then Associate Schema. (There is also an icon button to do this which looks a bit like a drawing pin poking a red square and blue triangle.) Choose the RelaxNG Schema tab, and locate your schema file (using the folder icon on the right to browse).
  4. If all goes well, oXygen will insert a processing instruction at the top of the file to indicate the schema location, and attempt to validate the file. Try inserting some new elements, and you should see a much-reduced collection compared to when you first edited the file. To make sure that oXygen is using the new schema, you can go to the Document menu, then select Validate, and then Reset Cache and Validate.
  5. Have a play with adding new elements: for example, find a name and surround it with a <name> element.
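
The association that oXygen records is an ordinary XML processing instruction placed before the root element, so you can inspect it outside the editor as well. The Python sketch below lists the processing instructions in a document's prolog. The sample instruction is an assumption about what your copy of oXygen writes: the target and pseudo-attributes vary between oXygen versions, so compare it with what actually appears at the top of your own file.

```python
from xml.dom import minidom

# Assumed sample: the exact target and pseudo-attributes vary by oXygen version.
SAMPLE = """<?oxygen RNGSchema="sleepyhollow.rng" type="xml"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0"><teiHeader/><text/></TEI>
"""

def prolog_instructions(xml_text):
    """Return (target, data) pairs for processing instructions before the root."""
    doc = minidom.parseString(xml_text)
    return [
        (node.target, node.data)
        for node in doc.childNodes
        if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
    ]

for target, data in prolog_instructions(SAMPLE):
    print(target, "->", data)
```

If the schema file name shown here does not match the file you saved from Roma, the editor is still validating against something else.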

4. Removing unnecessary elements

The modules chosen contain many more elements than we need, so we will now remove some of them in order to decrease the number of elements available to us and simplify the view in the XML editor. Click the name of a module in the List of selected modules (the right-most column) to see a list of the elements this module defines.

Each element listed has a name, a radio button indicating whether it is to be included or excluded, a tag name, a description, and a link to a further screen where its attributes are specified. You can toggle inclusion or exclusion of all elements in the list by clicking the appropriate column heading. You can click on Exclude to remove all elements from the module. To return to the list of modules, click the Modules tab.

Now work down the list below by excluding all the elements of each module, and then clicking the radio button to add back in those elements needed for this exercise. Remember to click the Submit Query button at the bottom when you have finished editing each module. One of the benefits of doing this exercise is that you get to see all the elements you are not including, and thus increase your familiarity with the elements that the TEI provides. When you have submitted your changes to a module, click the Modules tab or back links to go back to the list of modules.
From the core module
Include: <abbr>, <add>, <author>, <choice>, <corr>, <del>, <editor>, <expan>, <gap>, <head>, <hi>, <l>, <lg>, <milestone>, <note>, <orig>, <p>, <reg>, <sic>, <title>, and <unclear>.
From the header module
Include: <fileDesc>, <publicationStmt>, <sourceDesc>, <teiHeader>, and <titleStmt>.
From the textstructure module
Include: <TEI>, <back>, <body>, <div>, <front>, and <text>.
From the transcr module
Include: <damage> and <supplied>.
From the namesdates module
Include: <affiliation>, <age>, <birth>, <country>, <death>, <district>, <education>, <event>, <faith>, <floruit>, <forename>, <listPerson>, <listPlace>, <location>, <nationality>, <occupation>, <persName>, <person>, <personGrp>, <place>, <placeName>, <population>, <region>, <settlement>, <sex>, <state>, <surname>, <trait>.
Any other already selected modules
Include all elements by default.

Delete the existing schema file on your desktop and generate your schema again as described above, saving it to your desktop. (If you don't delete the old schema, the new one will be saved with a sequential number like ‘(1)’ appended to its name, and you'll have to re-associate your document with this new file instead.) You may also wish to replace the TEI ODD XML file you saved before with a newly saved customization. To make sure that oXygen is using the new schema, you can go to the Document menu, then select Validate, and then Reset Cache and Validate. Test to make sure that only these elements are available. For example, you should be able to add <persName> and <placeName>, but the <name> element shouldn't be available unless you accidentally included it in the 'core' module. A number of other elements are no longer available either; you can choose to edit your document until it is valid, or return to Roma to add these elements back in.
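
To get a quick overview of which elements in a document fall outside the trimmed schema, you can compare the element names it uses against the include lists above. The Python sketch below does this for a made-up miniature document (not the real SleepyHollow-roma.xml). It is only a rough check under that assumption: it looks at element names alone, so it says nothing about content models or attributes and does not replace validation in oXygen.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Element names copied from the include lists in this section.
ALLOWED = set("""
abbr add author choice corr del editor expan gap head hi l lg milestone note
orig p reg sic title unclear
fileDesc publicationStmt sourceDesc teiHeader titleStmt
TEI back body div front text
damage supplied
affiliation age birth country death district education event faith floruit
forename listPerson listPlace location nationality occupation persName person
personGrp place placeName population region settlement sex state surname trait
""".split())

def excluded_elements(xml_text):
    """Return the sorted names of TEI elements used in xml_text but not in ALLOWED."""
    prefix = f"{{{TEI_NS}}}"
    used = {
        el.tag[len(prefix):]
        for el in ET.fromstring(xml_text).iter()
        if isinstance(el.tag, str) and el.tag.startswith(prefix)
    }
    return sorted(used - ALLOWED)

# Made-up miniature document, not the real SleepyHollow-roma.xml.
SAMPLE = f"""<TEI xmlns="{TEI_NS}">
  <teiHeader><fileDesc><titleStmt><title>Test</title></titleStmt>
    <publicationStmt><p>Unpublished</p></publicationStmt>
    <sourceDesc><p>None</p></sourceDesc></fileDesc></teiHeader>
  <text><body><div><p>A tale by <name>Diedrich Knickerbocker</name>,
    set near <placeName>Tarry Town</placeName>.</p></div></body></text>
</TEI>"""

print(excluded_elements(SAMPLE))
```

For this sample the only name reported is the <name> element, which matches the behaviour you should see in the editor after excluding it from the 'core' module.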

5. Enhancing your schema: Constraining attribute values

Now let us return to Roma and constrain the schema further, to make sure we get just what we want in our documents. The example we have chosen is to constrain the allowed values of the type attribute on <div>, and to make it compulsory.

First delete the schema (.rng or .rnc) you saved to your desktop as before.

Go back to Roma. If you have closed the browser, you could restart Roma and load in the session (a TEI ODD file) that you saved earlier. Go to the Modules tab and click on textstructure in the right-hand column. Find the <div> element definition and click on Change attributes on the right-hand side. This will show you all the attributes of <div>. Click on type, and you will be able to change its properties:
  1. Change the Is it optional radio button to make it compulsory
  2. Change the radio button for Closed list? to make it a closed list
  3. In the box for List of values, type
    preface, main, postscript
    (i.e. a list of possible values, separated by commas).
Now save the schema as before and reload the file in oXygen. There should be validation errors, because the <div> elements in the SleepyHollow-roma.xml file have no type attribute. Add one to each <div>, save the file and see if it validates.
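
In the saved ODD, a change like this is expressed as an <elementSpec> in change mode, roughly as in the comment in the sketch below. That fragment is hand-written and is an assumption about the general shape, not Roma's literal output. The Python function underneath mimics the new constraint so that you can see which <div> elements need attention; the sample text is made up for illustration.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
DIV_TYPES = {"preface", "main", "postscript"}  # the closed list entered in Roma

# Hand-written sketch of how the constraint might look in the ODD:
#   <elementSpec ident="div" module="textstructure" mode="change">
#     <attList>
#       <attDef ident="type" mode="change" usage="req">
#         <valList type="closed" mode="replace">
#           <valItem ident="preface"/>
#           <valItem ident="main"/>
#           <valItem ident="postscript"/>
#         </valList>
#       </attDef>
#     </attList>
#   </elementSpec>

def div_type_problems(xml_text):
    """Return one message per <div> whose type is missing or not in DIV_TYPES."""
    problems = []
    for n, div in enumerate(ET.fromstring(xml_text).iter(f"{{{TEI_NS}}}div"), 1):
        value = div.get("type")
        if value is None:
            problems.append(f"div {n}: type attribute is missing")
        elif value not in DIV_TYPES:
            problems.append(f"div {n}: type '{value}' is not in the closed list")
    return problems

# Made-up sample with one valid, one missing and one disallowed value.
SAMPLE = f"""<text xmlns="{TEI_NS}"><body>
  <div type="preface"><p>First</p></div>
  <div><p>Second</p></div>
  <div type="chapter"><p>Third</p></div>
</body></text>"""

for message in div_type_problems(SAMPLE):
    print(message)
```

The two cases it reports correspond to the two properties changed in Roma: making the attribute compulsory and closing its list of values.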

6. Documenting your schema with Roma

One of the major benefits of Roma is that, after you have customized your schema, it can document the changes you have made in two ways. One of these is the ODD file that you saved earlier (and can save again if you want), which records how your schema differs from full TEI: which modules you have included, and which elements you have added or changed, amongst other things. This is a good file to keep with your generated schema, in case you need to generate a new schema with additional elements or constrain it further. It also means that others can generate the schema for your documents in different schema languages if needed.

You may wish to open the ODD file you saved earlier and see how this format works.

You don't need to use the web version of Roma to create ODD files; you can author them by hand if you are feeling geeky. There is also a command-line Roma script which you can use to generate any of the outputs which the web version produces.

However, since the TEI Guidelines themselves are made up of ODD documents, Roma also allows you to generate a set of documentation for your particular customization of the TEI. You don't need to worry about creating the documentation for the existing elements you have included, because Roma already knows about them. If you add new elements, you can provide descriptions and other information about them at that point, and Roma will use that. Moreover, if you've changed the names of elements or their descriptions (perhaps for reasons of internationalization), it will use the new names and descriptions in the documentation it generates.

You may wish to generate this from the Documentation tab.

This concludes the brief exercises on Roma for customizing the TEI schema. If you have time you may wish to experiment with making some other customizations to your schema! Some ideas are below.

7. Other things to try with Roma

  1. How do you go about renaming an existing element? What happens in the ODD when you do?
  2. How do you add a new element? What namespace does it end up in when you do? How can you control this?
  3. Once you have an ODD, you can generate project-specific documentation. Where would you put this prose in the ODD file? Try generating some test documentation with additions you have made.
  4. Experiment with starting from the different exemplar customizations Roma offers. Which do you think would be best for your project?
  5. Save the ODD for the TEI ALL Plus schema -- how does this add in schemas from other namespaces?
  6. How do you go about changing the content model of an element? What implications does this have for your schema? How is it expressed in ODD?
  7. Try modifying an ODD 'manually' in oXygen to include/exclude other elements and then submitting it to Roma to generate a schema.


James Cummings. Date: 2007-10-31
Copyright University of Oxford