These Guidelines provide an encoding scheme suitable for encoding a wide range of texts, and capable of being extended in well-defined and documented ways for texts that cannot be conveniently or appropriately encoded using what is provided. This chapter describes how the TEI encoding scheme may be modified and extended.
The formal mechanisms provided by these Guidelines are syntactic
mechanisms for encoding information in electronic texts. The semantics
associated with these syntactic mechanisms are described only
informally in the accompanying text. Although the descriptions have been
written with care, there will inevitably be cases where the intention of
the contributors has not been conveyed with sufficient clarity to
prevent users of the Guidelines from
Beyond this unintentional semantic extension, some of the elements
provided can intentionally be used in a variety of ways; for example,
the element
Furthermore, there are several ways of combining and extending the
existing syntactic mechanisms themselves. Earlier chapters have
identified these:
The TEI DTD is designed to support modification of the tag sets
in a documented way that can be validated by an SGML parser. Those
wishing to modify the tag sets must do so using
This approach to modification implies the existence of two versions
of any DTD derived from the TEI tag sets. The first version is a
fixed human-readable version. The TEI DTD fragments that have
appeared earlier in these Guidelines are all presented in this
version, save in the few cases where forward pointers to this chapter
appear. This is the
The second version of a TEI DTD is automatically derived from the
first by the introduction of parameter entities. These parameter
entities are given values that contain small parts of the definition
of the DTD. If these parameter entities are not modified, their
default expansion (equivalent to the publication version) is used
within the DTD. However, their definitions can be easily changed for
a specific application. This is the
In the absence of any modification, the TEI core DTD and
the additions to it embodied in the base and
additional tag sets behave as follows:
These Guidelines provide for user modification of the TEI DTD largely
by using SGML parameter entities at appropriate points in the DTD. It
is not absolutely essential to understand them in detail to modify the
TEI DTD (the examples later in this chapter can be followed cookbook
fashion), but it will probably prove helpful.
Parameter entities are an SGML mechanism for allowing string
substitution within markup declarations; they can thus be used to
effect changes in declarations.
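As a minimal sketch of string substitution in markup declarations, the fragment below declares one parameter entity and references it from two element declarations. All names in it (phraseStuff, para, item, emph, term) are hypothetical and chosen only for illustration; they are not taken from the TEI DTD.

```sgml
<!-- Hypothetical names throughout; this is not a fragment of the TEI DTD. -->

<!-- A parameter entity whose value is part of a content model. -->
<!ENTITY % phraseStuff '#PCDATA | emph | term'>

<!-- The entity is referenced as %phraseStuff; inside declarations. -->
<!ELEMENT para          - O (%phraseStuff;)*>
<!ELEMENT item          - O (%phraseStuff;)*>
<!ELEMENT (emph | term) - - (#PCDATA)>

<!-- After substitution the parser sees, in effect:
     <!ELEMENT para - O (#PCDATA | emph | term)*>
     <!ELEMENT item - O (#PCDATA | emph | term)*>      -->
```

Changing the single entity declaration changes both content models at once, which is the property the modification mechanisms rely on.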
A parameter entity, unlike an element, may be declared more than once in
an SGML document type definition; if more than one declaration is given,
the parser uses the first one it encounters. Since the declaration
subset within the document is read before the external file containing
the predefined DTD, an entity declaration in the DTD subset will take
precedence over one in the external file. In the TEI DTD, the literal
string which defines the model group for some elements, say
Local modifications are most conveniently grouped into two files,
one containing modifications to the TEI parameter entities, and the
other containing new or changed declarations of elements and their
attributes. Names for these files should be specified by the
parameter entities
There are several kinds of modification that can be made to the TEI
DTD, as follows:
Each kind of modification changes the set of documents that can be
parsed using the DTD. Each combination of the original TEI DTD
fragments may be thought of as defining a certain set of
documents. Each DTD resulting from a modified set of TEI DTD
fragments allows a different set of documents to be parsed. The set
of documents parsed by the original DTD may be properly contained in
the set of documents parsed by a modified DTD, or vice versa.
Modifications that have either of these two results are called
The simplest way to modify the supplied tag sets is to suppress one
or more of the supplied elements. In the modifiable version of the
DTDs, every element declaration is enclosed by a marked section. The
marked section is governed by one of the keywords
Thus, to delete the declaration of a generic identifier and thus
suppress the element entirely, the entity that provides the guard on the
marked section wherein the element declaration appears must simply be
set to
Two different cases of deleting one or more elements from the TEI
DTD can be identified. The first case involves deleting only elements
that are optional wherever they appear in TEI documents. Deleting these
is clean in the sense that documents that are parseable with the modified
DTD can also be parsed according to the original TEI DTD. To say this
another way, the set of documents matching the new DTD is contained in
the set of documents matching the original DTD.
The second case involves deleting elements that are required in one
or more of their appearances in TEI documents. Deleting these is
unclean in that some documents that can be parsed according to the new
DTD could not be parsed according to the original TEI DTD. To say this
another way, the set of documents matching the new DTD neither contains
nor is contained in the set of documents matching the original DTD.
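The deletion mechanism can be sketched as follows. The fragment assumes, for illustration only, that the element being suppressed is a note element and that it is optional wherever it appears; the content model shown is a simplification and not the one in the TEI DTD. It relies on two SGML rules: a marked section guarded by IGNORE is skipped, and the first declaration of a parameter entity takes precedence over later ones.

```sgml
<!-- 1. Local override, read first (for example from the local file
        of entity modifications): suppress the element. -->
<!ENTITY % note 'IGNORE'>

<!-- 2. Default in the modifiable DTD, read later. It has no effect
        here, because the entity has already been declared. -->
<!ENTITY % note 'INCLUDE'>

<!-- 3. Guarded declaration: marked-section open, content, close.
        The content model is simplified for illustration. -->
<![ %note; [
<!ELEMENT note - - (#PCDATA)>
]]>
```

With the override in place the parser never sees the element declaration, so any document using the element is rejected. Without it, the guard expands to INCLUDE and the declaration is parsed as usual.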
In the modifiable version of the TEI DTD, elements are not referred to
directly by their generic identifiers; instead, the modifiable version
of the DTD makes use of parameter entities that expand to the standard
generic identifiers. This allows renaming of elements by redefining the
appropriate parameter entities. The names of parameter entities used
for naming are formed by taking the standard generic identifier of the
element and attaching the string
Two different cases of renaming can be identified. The first case
involves replacing existing names with names that are otherwise unused
in the TEI name space. (This can be easily checked by looking in the
index of the Guidelines.) Such a modification is clean in that the new
DTD would still accept any document accepted by the publication DTD
(given the appropriate renaming of elements). The new name cannot
possibly conflict with the generic identifier of any other element,
since there can be no other occurrences. To say this another way, the
set of documents matching the new DTD is isomorphic to the set of
documents matching the old DTD. The example given results in a clean
modification because there is no element
The second case involves introducing a name already used somewhere
in a TEI tag set. This is unclean in that it changes what an existing
generic identifier means. The name in question could not be declared
by any tag set that is used in the document, as it is syntactically
invalid to provide two declarations for the same element. The new
generic identifier might occur in some TEI tag set not currently
included in the DTD used to parse the document. For example, if in
some setting the element
As a special case, consider translating all of the generic
identifiers for all elements into some other language, L. It may be,
for example, that the word for
The formal declarations of the parameter entities used for generic
identifiers are contained in the file
If an element is renamed using the techniques described here, its
declaration for the global
In the normal course of events, the value of this attribute will
never be specified in a TEI-conformant document; all occurrences will
have the default value. In some special circumstances, it can be
useful to specify a non-default value on some instances of an element;
this allows application programs to process correctly a locally
defined element which usually corresponds to one TEI element (which
would be expressed by the default value) but sometimes to another TEI
element (which would be expressed by explicit values attached to the
element instance).
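The renaming mechanism described in this section can be sketched as follows. The new name para is hypothetical and is assumed to be unused elsewhere in the TEI name space; the content model is simplified. The attribute recording the original name is assumed here to be called TEIform, as in the published TEI DTDs; treat that name as an assumption of this sketch.

```sgml
<!-- Local override, read first: rename the paragraph element.
     'para' is a hypothetical name chosen for illustration. -->
<!ENTITY % n.p 'para'>

<!-- Default in the modifiable DTD, read later and therefore
     without effect once the override above has been seen. -->
<!ENTITY % n.p 'p'>

<!-- Declarations refer to the element only through the entity.
     Content model simplified; attribute name assumed (see above). -->
<!ELEMENT %n.p; - O (#PCDATA)>
<!ATTLIST %n.p; TEIform CDATA 'p'>
```

Under these declarations a document uses the renamed tag, while the default attribute value still tells an application which TEI element the renamed element corresponds to.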
In
In the modifiable version of the TEI DTD, an additional entity is
defined for each model class. This additional entity also takes the
name of the class, this time prefixed by the string
For example, the class
An encoder can add an element to the class by providing a new
declaration for the x-dot entity. For example, to add a new element
called
Class extension is always clean in that the set of documents matching
the DTD containing the extended class contains all of the documents
matching the original DTD. A class can be extended either by adding an
existing element to it, or by defining a new element (as described in
the next section) and adding that new element to the class.
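Class extension can be sketched as follows for the bibl class. The element myBibl is a hypothetical addition, its content model is illustrative only, and the way the class entity incorporates the x-dot entity is an assumption of this sketch rather than a quotation from the DTD. Each added generic identifier is assumed to be followed by a vertical bar so that the expanded list remains well formed.

```sgml
<!-- Local override, read first: add the hypothetical element myBibl
     to the class. Note the trailing vertical bar. -->
<!ENTITY % x.bibl 'myBibl |'>

<!-- Defaults in the modifiable DTD. The empty x.bibl is ignored
     because the override above was read first. -->
<!ENTITY % x.bibl ''>
<!ENTITY % m.bibl '%x.bibl; bibl | biblFull | biblStruct'>

<!-- %m.bibl; now expands to:
     myBibl | bibl | biblFull | biblStruct               -->

<!-- The new element itself must also be declared, in the local
     file of new declarations. Content model illustrative only. -->
<!ELEMENT myBibl - - (#PCDATA)>
```

Because the original members of the class are all still present, every document valid against the unextended DTD remains valid.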
Encoders can modify the content models that specify what is
contained in an element or set of elements defined by the TEI DTD,
modify the attributes of existing elements, or add new elements to the
DTD.
Content models or attributes for existing elements are modified in
two stages. First, the existing definition of the element must be
deleted in the manner described in the first section of this chapter.
Second, a new definition of the element is given. This new definition
must be inserted in the file associated with the entity
For example, suppose that symbolic designations to be marked with the
element
New elements are defined by inserting their definitions into the file
associated with the entity
The set of documents matched by the modified DTD and the set of
documents matched by the original DTD may be related in several
different ways. It is certainly possible that
the former could properly include the latter or vice versa; either of
these could be said to be clean modifications because the set to be
matched has become strictly larger or strictly smaller.
It is also possible that the set of documents matched by the modified
DTD is different from the set matched by the original DTD and they may
either contain some common documents or have no documents in common;
either of these is said to be an unclean modification.
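The two-stage procedure for revising an existing element can be sketched as follows. The choice of element (term), the new content model, and the type attribute are all illustrative assumptions, not taken from the TEI DTD; the generic identifier is written literally here for readability.

```sgml
<!-- Stage 1, in the local file of entity modifications: suppress
     the supplied declaration of the element being revised. -->
<!ENTITY % term 'IGNORE'>

<!-- Stage 2, in the local file of new or changed declarations:
     supply the replacement. Content model and attribute are
     illustrative only. -->
<!ELEMENT term - - (#PCDATA)>
<!ATTLIST term
          type  CDATA  #IMPLIED>
```

Whether such a change is clean depends on how the new declaration relates to the original: a replacement that accepts strictly more or strictly less than the original gives a clean modification, while any other relationship gives an unclean one.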
Radical revision is possible. It would be possible to remodel so
that the
When the modification mechanisms are used, their use must be
documented. There are two ways in which information about the
modifications is recorded.
The first record of the modifications is in the use of the extension
files. The file associated with the entity
The file associated with the entity
These files give an SGML parser sufficient information to implement
the modifications and are also useful in providing human readers with
some indication of the changes made in the TEI DTD. Full
documentation of any additional or modified elements should also be
provided, using the ancillary tag set described in chapter
This chapter explains how the supplied tag sets can be modified by
suppressing elements, renaming elements, extending syntactic classes,
and adding elements. The different techniques described here have
different effects on the level of TEI conformance to be ascribed to a
text, as described in chapter
The modification mechanisms allow the user to override these defaults
in the following ways, while retaining conformance to the TEI
Guidelines:
The modification mechanisms presented in this chapter are quite general,
and may be used to make all the types of changes just listed. They can
also be used to make more complete modifications of the DTD; if changes
other than those listed above are made, however, the resulting text will
no longer be TEI conformant.
These are described in the remaining sections of this chapter.
IGNORE or INCLUDE, which is provided indirectly using a parameter
entity. This parameter entity has the same name as the generic
identifier of the element. Thus, the declaration for the paragraph
element,
INCLUDE in all cases. The construct above is interpreted thus: the
first of the three lines is the opening of a marked section; when the
parser encounters the section and sees the keyword INCLUDE as its guard
(more precisely, sees a parameter entity the value of which is the
keyword INCLUDE), the content of the marked section is parsed; the
second line of the three is the content of the marked section; and the
third line of the three is the closing of a marked section. If the
guard is changed to IGNORE, the SGML parser will ignore the content of
the marked section.
IGNORE. For example, if the
n. (for name) as a prefix. Thus, the standard generic identifiers for
paragraphs, notes and quotations,
paragraph in language L begins with the letter s and that thus the
paragraph element is renamed as
m. (for model) to the name of the class. For example, the value of the
parameter entity
x. (for extension). The default value of these
bibl | biblFull | biblStruct.
myBibl | bibl | biblFull | biblStruct. If more than one element is to
be added to a class, the x-dot entity for the class should be redefined
as a list of the new generic identifiers, each one (
IGNORE
for each revised element is done in the last section of the file.