
1. Before you begin...

In this and subsequent exercises you will be using the XML editor Oxygen to work with data files we've prepared for you. Oxygen should be installed on your work stations already: you can start it up by just clicking on the Oxygen XML Editor icon on the desktop, or selecting it from the Programs menu in the usual way.

We recommend that you keep all the files you make during the exercises in a single folder. We refer to this below as your Work folder; we have pre-installed a Work folder on the USB key supplied for the summer school with all the files you'll need.

2. First steps with Oxygen

In the first part of this exercise you will learn how to use Oxygen to
  • create a new XML document
  • add markup to a document
  • keep your document well-formed
  • display and edit your document without seeing tags

3. Creating a new document

Start up Oxygen, and click the New Document icon at top left (or select New from the File menu, or type CTRL-N) to display the New dialog.

Select the default option (XML), and press OK.

In the "Create an XML Document" dialog, uncheck the tick box Use a DTD or a schema. In this part of the exercise we will produce a well-formed (but not valid) file, since we don't yet have a schema to validate it against.

Press OK to continue.

Oxygen opens a blank document, containing just the XML declaration at the start. As you start typing in XML as instructed below, notice how hard Oxygen tries to make your document well-formed.

  1. move to the second line
  2. type in <div type=" and pause
  3. notice that Oxygen supplies the closing quote for you
  4. continue typing verse and move the cursor after the closing quote (use an arrow key)
  5. Type the closing > and note that Oxygen immediately supplies the closing </div> for you. Press the RETURN key

4. Adding text to a document

Your document should now look like this:
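
A minimal sketch of the expected result, assuming you followed the steps above exactly. The encoding shown in the XML declaration is an assumption, and Oxygen may lay out the end-tag slightly differently:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<div type="verse">

</div>
```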

  1. From the Document menu select File (Note: Not the File Menu on the menu bar, but that one found under the Document Menu!)
  2. Select Insert File from the submenu that opens.
  3. Navigate to the file progress.txt in your Work folder (or find it at materials/Work/progress.txt if you are reading this online) and insert it into your document.

This is a plain text version of the poem at the start of our sample Punch issue. In the rest of this exercise we will add some minimal tagging.

5. Tagging bits of a document

  1. With the mouse, select the word ‘PROGRESS’ at the start of the text.
  2. Press CTRL-E (or select XML Refactoring/Surround with Tag from the Document menu).
  3. Type the name of the tag you want to use into the popup: this is a heading, so type head and press OK

Repeat the process for the whole of the following paragraph including square brackets: this time, tag it as a <p>.
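
As a rough illustration, the start of the document might now look like the sketch below. The bracketed ellipsis is a placeholder for the text of the paragraph, which is not reproduced here:

```xml
<div type="verse">
  <head>PROGRESS</head>
  <p>[...]</p>
  <!-- the untagged text of the poem follows here -->
</div>
```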

6. Tagging the poem

We will tag the poem proper using <lg> to enclose each stanza, and <l> to enclose each line.

  1. Try typing the sequence <lg><l> at the start of the first line. Delete the unwanted </l> and possibly </lg> tags inserted by Oxygen.
  2. Put the cursor at the end of the first stanza (after the word ‘sticks’) and type <
  3. A small menu opens, showing that you can enter an end-tag (to close the <l> element) here. Select it.
  4. Type another < and the same menu appears: but note that this time the end-tag to be inserted is </lg>. Select this, and the document is well-formed — no red lines visible.
  5. What will happen if you repeat the process, do you think? Try it and see!
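
A sketch of what the document might look like after steps 1 to 4. The ellipses are placeholders for the text of the poem, which is not reproduced here:

```xml
<div type="verse">
  <head>PROGRESS</head>
  <p>[...]</p>
  <lg><l>... die!
  ... (the other lines of the first stanza) ...
  ... sticks</l></lg>
  ... (the remaining stanzas, still untagged) ...
</div>
```

Note that the whole stanza sits inside one <l> element: well-formed, but not an accurate description of the text.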

7. Tagging the poem properly

Although well-formed, our tagging is not very honest. We have a single <l> element containing several lines, and lots of stanzas which are not tagged as anything at all. If we were validating this document against a schema we'd be in trouble.

  1. put the cursor at the end of the first line (after ‘die!’)
  2. select XML Refactoring/Split Element from the Document menu.

This closes the current <l> and immediately opens a new one, so that our document remains well-formed. We just need to repeat this process for each line. We could do that by repeating what we just did. Or, more simply perhaps, we could add the XML Document Refactoring toolbar, which would provide a single button to do the job. Or we could note the shortcut key for this operation when in the Document/XML Document/Refactoring menu, and use that!

Another way would be to copy and paste it: use the mouse to select the sequence of characters you just inserted (</l><l>); copy it with CTRL-C; move the cursor to the end of the next line; paste it with CTRL-V. Repeat this for each line (except the last one of course) in the stanza.
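
Whichever method you use, the first stanza should end up with one <l> element per line of verse. A sketch, with ellipses standing in for the text of the poem and an assumed (not actual) number of lines:

```xml
<lg>
  <l>... die!</l>
  <l>...</l>
  <!-- one <l> element for each further line of the stanza -->
  <l>... sticks</l>
</lg>
```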

8. Tagging another way

Some people just don't like tags. Fortunately, Oxygen also has a ‘tag free’ editing mode: it works by displaying parts of the text which are tagged in different ways in different styles. We specify the style for each tag by means of a stylesheet associated with the document.

  1. Select XML Document/Associate XSLT/CSS Stylesheet from the Document menu (or click the appropriate button if you can find it). In the Associate dialog, navigate to the file progress.css in your working folder and select it. Observe that a new processing instruction is added at the start of your document.
  2. At the bottom of the editing window, you see a choice of Mode displays: Text (the default), Grid, and Author. Select the last. Observe that the display changes and a new Menu option (Author) is now available. Select this and observe the effect of the various command options on it.
  3. Choose No Tags for the next part of this exercise.
  4. You will see that the last part of the poem is now displayed as a single block of text — the line breaks have disappeared because by default XML regards them as the same as any other white space characters, and there is no XML element to indicate where the lines should be broken.
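
The processing instruction added in step 1 is the standard way of associating a stylesheet with an XML document. A sketch of the top of the file, assuming that progress.css sits in the same folder as the document (Oxygen may record a different or absolute path in the href):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/css" href="progress.css"?>
<div type="verse">
  <!-- head, p and lg elements as before -->
</div>
```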

9. Quick splitting

  1. Use the mouse to select the rest of the poem (from the word ‘The’ to the word ‘shrink’) excluding the name ‘Evoe’ at the end.
  2. Use CTRL-E to tag this all as a single <lg>.
  3. Select the same stretch of text again and tag it as a single <l> in the same way.
  4. Now place the cursor at the end of the first line of the second stanza (after ‘around’). Press RETURN. A menu offers you the choice of splitting the <l> element. Press RETURN again.
  5. As you move around, notice the help Oxygen gives you to tell you which element you are currently inside.
  6. Move on to the next line (after ‘banks’) and repeat. Repeat for each subsequent line in the poem. You'll have to guess where the lines start (hint: they start with capitals and follow rhyme words).
  7. To split the stanzas, put the cursor at the end of the last line of the stanza, and use the right arrow key to move it between the invisible tags (or if you get frustrated switch back to Text mode). You should then be able to split the stanzas in exactly the same way.
  8. You can also experiment with other ways of splitting the text, of course. If you get into a mess, use CTRL-Z to undo the last change you made. Remember you can switch between Author and Text views as much as you like.
  9. Finally, tag the name of the author of the poem (‘Evoe’) using the <signed> element.
  10. When you are done, save your file under a different name in your Work directory: we suggest you call it progress.xml.
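
For orientation, here is a sketch of the overall shape progress.xml should have at this point. Ellipses and comments are placeholders for the text of the poem; the number of stanzas and lines is not shown, and the stylesheet processing instruction is omitted:

```xml
<div type="verse">
  <head>PROGRESS</head>
  <p>[...]</p>
  <lg>
    <l>... die!</l>
    <!-- further lines of the first stanza -->
    <l>... sticks</l>
  </lg>
  <lg>
    <l>The ... around</l>
    <l>... banks</l>
    <!-- further lines of the second stanza -->
  </lg>
  <!-- further stanzas, the last line ending with 'shrink' -->
  <signed>Evoe</signed>
</div>
```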

The rest of the exercises will assume that you are using the ‘Text’ mode rather than the ‘Author’ mode.

10. Second steps with Oxygen

Now that you are a bit more familiar with Oxygen, we will use it to make a complete TEI document, with all its parts. We will also check its validity, and start enriching its mark up.

10.1. Creating a valid TEI document

Open Oxygen again (if necessary) and as before choose the File/New menu item.

Oxygen provides two ways of specifying the kind of document you want to create. You can use one of the predefined templates or you can specify directly that you want to create an XML document and supply a schema yourself. Oxygen comes with many predefined templates, including some for several popular TEI schemas, which you may wish to explore at your leisure.

For this exercise, however, we suggest that you select the default ‘XML Document’ as before. This time when you see the "Create an XML Document" dialog, check the tick box Use a DTD or a schema.

You need to specify the schema language for your schema, by choosing one of the tabs XML Schema, DTD, or Relax NG. It doesn't matter much which you choose, but we recommend the last. Then click on the little yellow folder icon to the right of the empty box labelled URL, choose Browse for local file, and navigate to the location where the required schema file is stored.

For this exercise we've prepared a schema called tei_ipp which you should find in your Work folder, as an XSD, DTD, RNG or RNC file. Oxygen will choose the one it wants, depending on the schema language you chose earlier.

Press OK to continue.

Oxygen does its best to take advantage of the information in your schema file, both by providing you with online help, and by adding any mandatory elements for you. It also has a slightly distressing habit of trying to hide the tags from you by default. If you are now looking at a display in which no tags are visible, all you need to do is to select the "Text" mode at the bottom left in order to see a more reassuring display, in which the tags are shown and any validation errors are underlined in red.

(The details of the display may be slightly different depending on the validator you are using; if you don't see any red lines, click on the red tick on the menu bar in order to force validation of your document)

To make the document valid, you need to insert some minimal metadata. At the very least, you should supply a non-empty <title> element, and put a few remarks into paragraphs inside the <publicationStmt> and <sourceDesc> elements. It's up to you how much detail you feel like adding, but we suggest you should aim to produce something like this:
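
A minimal sketch of such a document. The wording of the title and of the two paragraphs is purely illustrative, the placeholder paragraph in <body> is an assumption made to keep the sketch complete, and the schema-association processing instruction that Oxygen adds is omitted:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <!-- illustrative title: supply your own -->
        <title>Extracts from Punch: a summer school exercise</title>
      </titleStmt>
      <publicationStmt>
        <p>Unpublished exercise file.</p>
      </publicationStmt>
      <sourceDesc>
        <p>Transcribed from the sample issue of Punch supplied with the exercise materials.</p>
      </sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <!-- placeholder, to be replaced by real content -->
      <p>Content to be added.</p>
    </body>
  </text>
</TEI>
```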

10.2. Adding some content

If your document doesn't have a <body> element inside the <text> element, then add it. Your document now has a <body>, but there is nothing in it. Why not insert the <div> element containing the poem which you made earlier? If you still have that open, you can cut and paste between windows in Oxygen. If not, use the Document/File menu to insert the file progress.xml you saved earlier. And if you didn't quite finish the previous part of the exercise, we have thoughtfully prepared a suitable file for you: it's in the file called spoilers/progress.xml on your USB key.

10.3. Add some more sections

There is also a file called snippets.xml in your Work directory. Open this with Oxygen, and cut and paste all the <div>s which it contains after the verse <div>. The document should still be well-formed.

In the same way, insert the <div> containing Toby's Essence of Parliament in the right place in your document. A lightly marked up version of this is available in the file called essence.xml in your Work directory. Use the ‘Outline view’ in Oxygen (lower left panel) to check that you have the right structure for your document. If you have turned this off you can turn it back on by going to Perspective -> Show View -> Outline. It should look something like this:

10.4. Validate the file

Up in the menu bar, look for the validation button, and click on it. This will validate your file against the tei_ipp schema we specified when you created it. (You may have noticed in text view that there is an incantation at the top of the file which begins like this: <?oxygen RNGSchema=. This is an Oxygen-specific processing instruction which indicates the location of the schema to be used to validate the file.) You can configure the ‘validation scenario’ Oxygen uses in many other ways, which we won't detail in this tutorial.
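
For illustration, the top of the file might look something like the sketch below. The file name tei_ipp.rng and the type pseudo-attribute are assumptions: the actual values depend on the schema language you chose and on where the schema is stored.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- schema location below is an assumed example -->
<?oxygen RNGSchema="tei_ipp.rng" type="xml"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <!-- teiHeader and text as before -->
</TEI>
```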

10.5. Explore!

As you've seen, Oxygen is very good at suggesting elements you might want to use to tag parts of the document. Just type a < at any point, to see a drop-down list of elements legal at that point. For a list of the attributes an element can take, insert a space inside its start-tag. In each case you will also see a small amount of information for each element and each attribute. For full details however, you need to consult the TEI Guidelines.

Here are some suggestions for ways you could improve the mark up of the material added:
  • mark the page boundaries, using <pb/>
  • mark as <title> elements all the titles currently tagged as <hi rend="it">
  • mark up the dialogue between Greece and Turkey using <sp> elements
  • decide what to do about the paragraphs of stars: maybe they should just go? maybe they should become milestones? maybe the paragraphs they separate should become list items?
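
A hedged sketch of how some of these suggestions might be encoded. The page number, the unit and rend values, and the use of <milestone> for the stars are illustrative choices rather than the only correct answers, and the ellipses are placeholders for the actual text. It also assumes that the tei_ipp schema permits these elements here; validate to check.

```xml
<div>
  <!-- page boundary; the value of n is hypothetical -->
  <pb n="1"/>
  <!-- a title previously tagged as <hi rend="it"> -->
  <p>... <title>...</title> ...</p>
  <!-- one <sp> per speech in the dialogue -->
  <sp>
    <speaker>Greece</speaker>
    <p>...</p>
  </sp>
  <sp>
    <speaker>Turkey</speaker>
    <p>...</p>
  </sp>
  <!-- one possible replacement for a paragraph of stars -->
  <milestone unit="section" rend="stars"/>
  <p>...</p>
</div>
```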

Make sure you leave your document valid (check for a happy green square in the upper-right corner!) and remember to save the file you've made in the Work folder on your USB key. Call it exercise-1b.xml — we will use it again in a later exercise.

11. A reading knowledge of the TEI

You need to be familiar with reading the TEI Guidelines. Start by visiting http://www.tei-c.org/Guidelines/P5/ and make sure you can
  • See the PDF version of the entire Guidelines
  • Browse the chapters of the Guidelines in English
  • (Browse the chapters of the Guidelines in another language, if you need)
  • Navigate the element catalogue at http://www.tei-c.org/release/doc/tei-p5-doc/en/html/REF-ELEMENTS.html and follow the reference sections.
  • Jump around the reference materials following links

We've included a copy of the whole of the latest release of the TEI Guidelines on your USB key as well, so you don't need to be online to read them.

TEI@Oxford. Date: 2010-07
Copyright University of Oxford