v0.999, 2001-07-25
Robert (\rho) Barta
Bond University
Since the stabilisation of XTM, an XML-based notation for Topic Maps, the interest in authoring Topic Maps has increased.
While topic maps in XTM can easily be generated automatically, manual authoring is tedious and error-prone, even with XML-aware development tools such as XML editors. This may change once Topic Map authoring tools are broadly available, but that is yet to come. Server-side solutions are certainly more powerful, but are too slow and inconvenient to use at the time of writing.
In the following we suggest a textual notation, AsTMa, sufficiently rich to prototype medium-sized topic maps. This notation is heavily influenced by LTM, Ontopia's Linear Topic Map Notation, where the need for a simplified notation was already acknowledged. Moreover, AsTMa has the following design objectives:
At this stage AsTMa does not fulfill all of the above objectives.
This document has no formal status. It is a technical report of the local university.
The setting assumes that the AsTMa text will be either directly understood by Topic Map processing software or that a specialized processor will convert the AsTMa stream into an XTM stream.
First we present the core concepts in a short tutorial, before we turn to a semi-formal language specification. You can find the running example used throughout the tutorial at the Bond Topicmap Server. You might want to peruse an online converter.
filesystem (software)
already defines a topic (as explained below). If there is more to a topic (or an association) this information will be on follow-up lines:
filesystem (software)
bn: File System

An empty line thus separates items such as topics and associations. On any line, white space is silently ignored. Any line can also contain a comment, like
filesystem (software) # more information will follow
Such comments will be discarded by any processor and are only for internal documentation purposes. If you would like to have a comment in the processor output, then this comment MUST begin at the start of the line:
AsTMa:

    # I will survive and (hopefully)
    # the line structure will not
    # be broken

XTM:

    <!-- I will survive and (hopefully)
         the line structure will not
         be broken -->
Comments on consecutive lines will be treated as one comment. Any non-comment line signals the end of a group. Also, any '-->' occurrence within a comment will be converted into '- - >'.
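The comment rules above can be sketched in Python roughly as follows. This is only an illustration under assumptions: the function name `convert_comments` is hypothetical, a trailing comment is assumed to start at a `#` that is preceded by white space (so that a `#` inside a URL is left alone), and non-comment lines are simply passed through instead of being converted to XTM.

```python
import re


def convert_comments(lines):
    """Return output lines: grouped XML comments plus content lines."""
    out = []
    group = []  # consecutive line-start comments collected so far

    def flush():
        # Emit the collected comment lines as one XML comment.
        if group:
            text = "\n".join(group).replace("-->", "- - >")
            out.append("<!-- " + text + " -->")
            group.clear()

    for line in lines:
        if line.startswith("#"):
            # A comment at the start of the line survives into the output.
            group.append(line[1:].strip())
        else:
            # Any non-comment line ends the current comment group.
            flush()
            # Assumption: a trailing comment starts at white space + '#'.
            out.append(re.sub(r"\s+#.*$", "", line))
    flush()
    return out
```

Feeding it the two kinds of comment shown above yields one XML comment for the leading group, while the trailing `# more information will follow` is dropped from the topic line.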
filesystem (software)
indicates the definition of a topic with id filesystem, which is an instance of another topic, software:
AsTMa:

    filesystem (software)

XTM:

    <topic id="filesystem">
      <instanceOf>
        <topicRef xlink:href="#software"/>
      </instanceOf>
      <baseName>
        <baseNameString>filesystem</baseNameString>
      </baseName>
    </topic>
As we did not provide a base name, the topic id 'filesystem' is also assumed to be the base name. While this heuristic works fine for some words, it does not for others, say,
linux-distribution (software)
Any AsTMa processor is free to apply other heuristics, such as substituting dashes by blanks, looking up 3rd-party databases, or leaving the id as it is. Substituting dashes by blanks, for instance, yields:

AsTMa:

    linux-distribution (software)

XTM:

    <topic id="linux-distribution">
      <instanceOf>
        <topicRef xlink:href="#software"/>
      </instanceOf>
      <baseName>
        <baseNameString>linux distribution</baseNameString>
      </baseName>
    </topic>

Of course, the author can enforce a particular base name:
AsTMa:

    RedHat-Linux-sparc (linux-distribution-port)
    bn: RedHat Linux for SPARC

XTM:

    <topic id="RedHat-Linux-sparc">
      <instanceOf>
        <topicRef xlink:href="#linux-distribution-port"/>
      </instanceOf>
      <baseName>
        <baseNameString>RedHat Linux for SPARC</baseNameString>
      </baseName>
    </topic>
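To make the base name fallback concrete, here is a minimal Python sketch of converting one topic item into XTM. It is not a reference implementation: the names `TOPIC_LINE` and `topic_to_xtm` and the `dash_to_blank` switch are hypothetical, the pattern for the first line is inferred from the examples only, and only `bn:` lines are handled.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Assumed shape of a topic line: an id, optionally followed by "(type)".
TOPIC_LINE = re.compile(r"^\s*(?P<id>[^\s()]+)\s*(?:\((?P<type>[^()]+)\))?\s*$")


def topic_to_xtm(item_lines, dash_to_blank=True):
    """Convert one topic item (a list of lines) into an XTM <topic> string."""
    match = TOPIC_LINE.match(item_lines[0])
    if match is None:
        raise ValueError("not a topic line: %r" % item_lines[0])
    topic_id, topic_type = match.group("id"), match.group("type")

    # An explicit "bn:" line overrides any heuristic.
    base_name = None
    for line in item_lines[1:]:
        key, _, value = line.partition(":")
        if key.strip() == "bn":
            base_name = value.strip()
    if base_name is None:
        # Fallback: derive the base name from the topic id.
        base_name = topic_id.replace("-", " ") if dash_to_blank else topic_id

    parts = ["<topic id=%s>" % quoteattr(topic_id)]
    if topic_type:
        ref = quoteattr("#" + topic_type.strip())
        parts.append("  <instanceOf><topicRef xlink:href=%s/></instanceOf>" % ref)
    parts.append(
        "  <baseName><baseNameString>%s</baseNameString></baseName>"
        % escape(base_name)
    )
    parts.append("</topic>")
    return "\n".join(parts)
```

Calling `topic_to_xtm(["linux-distribution (software)"])` applies the dash heuristic, whereas passing a second line `bn: RedHat Linux for SPARC` enforces that name regardless of the id.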
Similarly, you can specify occurrences for topics:
AsTMa:

    linux (os)
    bn: Linux kernel
    oc: http://www.kernel.org/

XTM:

    <topic id="linux">
      <instanceOf>
        <topicRef xlink:href="#os"/>
      </instanceOf>
      <baseName>
        <baseNameString>Linux kernel</baseNameString>
      </baseName>
      <occurrence>
        <resourceRef xlink:href="http://www.kernel.org/"/>
      </occurrence>
    </topic>
The example above uses a resource reference. Inline data (aka resourceData) can be given as well:
AsTMa:

    linux-port-on-sparc (linux-port)
    bn: SPARC Linux port
    oc: http://www.sparc.org/linux.shtml
    in: The kernel and kernel modules \
        are 64-bit on sparc64, \
        userland is still 32-bit, \
        and in fact the same as on sparc32.

XTM:

    <topic id="linux-port-on-sparc">
      <instanceOf>
        <topicRef xlink:href="#linux-port"/>
      </instanceOf>
      <baseName>
        <baseNameString>SPARC Linux port</baseNameString>
      </baseName>
      <occurrence>
        <resourceRef xlink:href="http://www.sparc.org/linux.shtml"/>
      </occurrence>
      <occurrence>
        <resourceData>The kernel and kernel mod....</resourceData>
      </occurrence>
    </topic>
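The inline-data example relies on a trailing backslash to continue a line. The text does not spell out the exact rule, so the following Python sketch rests on assumptions: the backslash is taken to be the last non-blank character of a continued line, and the pieces are joined with a single blank. The helper name `join_continuations` is hypothetical.

```python
def join_continuations(lines):
    """Merge lines ending in a backslash with the line that follows."""
    joined = []
    pending = None  # text of a line still waiting for its continuation
    for line in lines:
        text = line.strip()
        if pending is not None:
            text = pending + " " + text
            pending = None
        if text.endswith("\\"):
            # Drop the backslash and wait for the next line.
            pending = text[:-1].rstrip()
        else:
            joined.append(text)
    if pending is not None:
        # A dangling backslash on the last line: keep what we have.
        joined.append(pending)
    return joined
```

Running such a step before splitting lines into keyword and value would let the `in:` line above be handled like any other single-line characteristic.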
If appropriate, you can also type topic characteristics:
AsTMa:

    reiserfs (filesystem)
    bn: Reiser File System, ReiserFS
    oc (download): http://www.namesys.com/download.html

XTM:

    <topic id="reiserfs">
      <instanceOf>
        <topicRef xlink:href="#filesystem"/>
      </instanceOf>
      <baseName>
        <baseNameString>Reiser File System, ReiserFS</baseNameString>
      </baseName>
      <occurrence>
        <instanceOf>
          <topicRef xlink:href="#download"/>
        </instanceOf>
        <resourceRef xlink:href="http://www.namesys.com/download.html"/>
      </occurrence>
    </topic>
To scope a characteristic you use '@' to introduce a particular context:
AsTMa:

    RedHat-Linux-sparc (linux-distribution-port)
    bn: RedHat Linux for SPARC
    bn @ deutsch : RedHat Linux für SPARC

XTM:

    <topic id="RedHat-Linux-sparc">
      <instanceOf>
        <topicRef xlink:href="#linux-distribution-port"/>
      </instanceOf>
      <baseName>
        <baseNameString>RedHat Linux for SPARC</baseNameString>
      </baseName>
      <baseName><scope><topicRef xlink:href="#deutsch"/></scope>
        <baseNameString>RedHat Linux für SPARC</baseNameString>
      </baseName>
    </topic>
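The typed and scoped characteristic lines shown so far share one shape: a keyword (`bn`, `oc` or `in`), an optional `@ scope`, an optional `(type)`, a colon, and the value. A Python sketch of splitting such a line into its parts might look as follows. The pattern is inferred from the examples, not taken from a formal grammar, and the names `Characteristic` and `parse_characteristic` are hypothetical. The scope is assumed to be a plain topic id or a macro reference that has not yet been expanded.

```python
import re
from typing import NamedTuple, Optional


class Characteristic(NamedTuple):
    kind: str             # "bn", "oc" or "in"
    scope: Optional[str]  # topic given after '@', if any
    type: Optional[str]   # topic given in parentheses, if any
    value: str            # everything after the colon


CHARACTERISTIC = re.compile(
    r"^\s*(?P<kind>bn|oc|in)"
    r"(?:\s*@\s*(?P<scope>[^\s():]+))?"
    r"(?:\s*\((?P<type>[^()]+)\))?"
    r"\s*:\s*(?P<value>.*?)\s*$"
)


def parse_characteristic(line):
    """Return a Characteristic, or None if the line is something else."""
    match = CHARACTERISTIC.match(line)
    if match is None:
        return None
    type_ = match.group("type")
    return Characteristic(
        kind=match.group("kind"),
        scope=match.group("scope"),
        type=type_.strip() if type_ else None,
        value=match.group("value"),
    )
```

For `oc (download): http://www.namesys.com/download.html` this gives kind `oc`, type `download` and no scope. For the scoped base name above, the scope is `deutsch` and the type is empty.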
Associations may or may not have a particular type. In any case they have a number of members playing roles:
AsTMa:

    (kernel-patch-provides-feature)
    feature: reiserfs
    platform: i386
    patch: generic-reiserfs-patch-2.4.x

XTM:

    <association>
      <instanceOf>
        <topicRef xlink:href="#kernel-patch-provides-feature"/>
      </instanceOf>
      <member>
        <roleSpec>
          <topicRef xlink:href="#feature"/>
        </roleSpec>
        <topicRef xlink:href="#reiserfs"/>
      </member>
      <member>
        <roleSpec>
          <topicRef xlink:href="#platform"/>
        </roleSpec>
        <topicRef xlink:href="#i386"/>
      </member>
      <member>
        <roleSpec>
          <topicRef xlink:href="#patch"/>
        </roleSpec>
        <topicRef xlink:href="#generic-reiserfs-patch-2.4.x"/>
      </member>
    </association>
For better readability you may want to indent the roles:

    (kernel-patch-provides-feature)
      feature: reiserfs
      platform: i386
      patch: generic-reiserfs-patch-2.4.x
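The mapping from a typed association to XTM is mechanical enough to sketch in a few lines of Python. The function name `association_to_xtm` is hypothetical. Untyped associations are not covered, because their syntax is not shown in this tutorial, and each role line is assumed to have the form `role: player`.

```python
from xml.sax.saxutils import quoteattr


def association_to_xtm(item_lines):
    """Convert one typed association item into an XTM <association> string."""
    head = item_lines[0].strip()
    if not (head.startswith("(") and head.endswith(")")) or not head[1:-1].strip():
        raise ValueError("not a typed association line: %r" % item_lines[0])
    assoc_type = head[1:-1].strip()

    parts = ["<association>"]
    type_ref = quoteattr("#" + assoc_type)
    parts.append("  <instanceOf><topicRef xlink:href=%s/></instanceOf>" % type_ref)
    for line in item_lines[1:]:
        # Leading indentation is irrelevant, so strip both sides.
        role, sep, player = line.partition(":")
        if not sep:
            raise ValueError("not a role line: %r" % line)
        role_ref = quoteattr("#" + role.strip())
        player_ref = quoteattr("#" + player.strip())
        parts.append("  <member>")
        parts.append("    <roleSpec><topicRef xlink:href=%s/></roleSpec>" % role_ref)
        parts.append("    <topicRef xlink:href=%s/>" % player_ref)
        parts.append("  </member>")
    parts.append("</association>")
    return "\n".join(parts)
```

Because each role line is stripped before it is split, the indented and the unindented form of the example produce the same output.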
To inform the processor about the name (id) of the topic map itself, the very first non-empty line within the document MUST provide it:
AsTMa:

    sparclinux : iso-8859-1

XTM:

    <?xml version="1.0" encoding="iso-8859-1"?>
    <topicMap id="sparclinux"
              xmlns = 'http://www.topicmaps.org/xtm/1.0/'
              xmlns:xlink = 'http://www.w3.org/1999/xlink'>
Optionally, you can specify a particular encoding, as in the example above. If none is given, the encoding defaults to iso-8859-1.
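A processor could read this first line along the following lines. The Python sketch below assumes that the id and the encoding are separated by the first colon. The names `parse_header` and `xtm_prolog` are hypothetical, and comment lines before the header are not considered.

```python
DEFAULT_ENCODING = "iso-8859-1"


def parse_header(lines):
    """Return (map_id, encoding) taken from the first non-empty line."""
    for line in lines:
        text = line.strip()
        if not text:
            continue  # skip empty lines before the header
        map_id, _, encoding = text.partition(":")
        return map_id.strip(), (encoding.strip() or DEFAULT_ENCODING)
    raise ValueError("document contains no header line")


def xtm_prolog(map_id, encoding):
    """Build the XML declaration and the opening <topicMap> tag."""
    return (
        '<?xml version="1.0" encoding="%s"?>\n'
        '<topicMap id="%s"\n'
        '          xmlns="http://www.topicmaps.org/xtm/1.0/"\n'
        '          xmlns:xlink="http://www.w3.org/1999/xlink">'
        % (encoding, map_id)
    )
```

With the header `sparclinux : iso-8859-1` this returns the pair `("sparclinux", "iso-8859-1")`, and a bare `sparclinux` yields the same pair through the default.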
For authors who have no access to a general macro expansion environment, the language supports a rudimentary macro facility, which comes in handy for abbreviating long strings. The idea is to first declare a macro via
de=http://www.topicmaps.org/xtm/1.0/language.xtm#de
somewhere towards the beginning of the document and then use this definition for, say, scoping:
AsTMa:

    oc @ &de; (press-release) : http://www....

XTM:

    <occurrence>
      <scope><topicRef xlink:href="http://www.topicmaps.org/xtm/1.0/language.xtm#de"/></scope>
      <instanceOf>
        <topicRef xlink:href="#press-release"/>
      </instanceOf>
      <resourceRef xlink:href="http://www.s...."/>
    </occurrence>
Every instance of &de; throughout the document will be expanded.
While these macros cannot have parameters, they will be evaluated recursively, i.e. macros can contain other macros. Any processor MAY detect circular definitions.
It goes without saying that the notation &...; may collide with other XML entities. It is the responsibility of the author to take care of that.
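Recursive expansion with detection of circular definitions can be sketched in Python as below. The function names are hypothetical, a declaration is assumed to be split at its first `=`, and references to undeclared names are left untouched so that ordinary XML entities pass through. Raising an error on a cycle is just one possible reaction, since detection is optional for a processor.

```python
import re

MACRO_REF = re.compile(r"&([^\s&;=]+);")


def parse_macro_declaration(line):
    """Split 'name=value' at the first '=' (assumed declaration syntax)."""
    name, sep, value = line.partition("=")
    if not sep or not name.strip():
        raise ValueError("not a macro declaration: %r" % line)
    return name.strip(), value.strip()


def expand_macros(text, macros, _active=()):
    """Expand &name; references recursively, refusing circular definitions."""

    def replace(match):
        name = match.group(1)
        if name not in macros:
            # Unknown name: possibly an XML entity, so leave it alone.
            return match.group(0)
        if name in _active:
            chain = " -> ".join(_active + (name,))
            raise ValueError("circular macro definition: " + chain)
        # Expand the macro body with this macro marked as in progress.
        return expand_macros(macros[name], macros, _active + (name,))

    return MACRO_REF.sub(replace, text)
```

Given the declaration `de=http://www.topicmaps.org/xtm/1.0/language.xtm#de`, the line `oc @ &de; (press-release) : http://www....` expands so that the scope becomes the full URL, while a pair of macros that refer to each other raises a `ValueError`.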
tbd, just thoughts at this stage.
special character in id, encoding according to www.isi.edu/in-notes/iana/assignments/character-sets,
causal stream, no implicit merging (not even with ids)
conformant converter is free to produce any additional topic/assocs appropriate??
in terms of an abstract pattern processor?
causal streams (really?) , performance at bigger maps
specifying additional behavior? auto complete, defaults for assocs (types?)