Document Type Definition and Schema

This version:

This Document: 0.14 17 Feb 2001
The DTD: 0.13 17 Feb 2001
The Schema: 0.11 26 Jan 2001

Editor: Michael Lake (mikel AATT

Contributing Authors:
Michael Lake (mikel AATT
Martin Laverty (martinl AT
John Halleck (jhalleck AT

This document can be found at:

Table of Contents

CaveSurveyXML Abstract


This document describes a draft Document Type Definition and schema for XML-based cave survey data.

There are two DTDs/Schemas currently being developed:

  1. CaveSurvey DTD and Schema - which covers the survey of the cave and which this document describes.
  2. CaveMap DTD - which covers objects in the cave and which may appear on a cave map. This is in a separate document.

Status of this Document

This document describes the CaveScript CaveSurveyXML being developed by Michael Lake. The DTD and schema described here may differ from those being developed by the International Union of Speleology Informatics Commission's XML Working Group at:

UISIC XML Working Group

CaveSurveyXML Scope

The CaveSurvey DTD/schema defines the XML structure which describes cave survey data: the instruments, the surveyors, the stations and their descriptions, the shots or legs and their data, information or hints as to how the data should be processed, and anything else that pertains to the actual cave survey. One might describe these as the abstract, non-natural things about a cave survey.

The CaveSurvey XML does not extend to describing the objects or contents of a cave which appear on the cave map. These are the things that you see in a cave, both natural and artificial, such as the walls, floor and roof features, speleothems, clastic deposits, biota, geology and other features. Although this information would be recorded in the cave surveyor's notes, it belongs in the CaveMap file, which will have a separate DTD/schema to describe it.

Documentation, DTD and Schema Version and Change History

Note: Increment the version number of this document at the top (above the table of contents).

The version number of this document must always be equal to or greater than the version number of the DTD or schema described. This document may have a version number greater than the DTD or schema if changes are made to this document which do not affect the DTD or schema. However changes to the DTD or schema must result in the version number of this document being incremented.

The DTD version number is separate from the schema version number. At present both are being developed in parallel; however, development of the DTD will eventually slow as schemas become the norm, and at some stage the DTD will not be developed further.

Documentation Version and Change History

Date         Doc Version  Change
17 Feb 2001  0.14         * DTD change: encoding changed from US-ASCII to UTF-8 on the suggestion of John Halleck
                          * DTD change: moved comments to after the <?xml> declaration. The declaration must be the first line (brought to my notice by John Halleck)
                          * DTD change: in the shot element changed the dist, azim and elev attributes from required to implied.
                          * Separated the version number of the DTD from that of the Schema. I originally kept the version numbers the same and incremented one whenever I incremented the other, but even now they have different functionality and eventually the DTD will be phased out as schemas become the norm.
26 Jan 2001  0.13         * added an 'Abstract' and a 'Status of this Document', the latter to make clear the relationship between this work and the UISIC XML Working Group's CaveXML
                          * DTD change and schema change
20 Jan 2001  0.12         Merged Schema from Martin Laverty into this documentation. This will help to keep the two in sync.
01 Jun 2000  0.11         DTD change
01 Jan 2000  0.1          First draft version published

DTD Version and Change History

<cavesurvey.dtd version>= (U->)
<?xml version="1.0" encoding="UTF-8"?>

<!-- Version number of this CaveSurvey DTD -->
<!ENTITY % version "0.13">

DTD Version Change History:
Date         DTD Version  Change
17 Feb 2001  0.13         * Changed encoding from US-ASCII to UTF-8 on the suggestion of John Halleck
                          * Moved comments to after the <?xml> declaration. The declaration must be the first line (brought to my notice by John Halleck)
                          * in the shot element changed the dist, azim and elev attributes from required to implied. This is because when surveying by triangulation the shots don't have distances, and when surveying by trilateration the shots are complete with just tape measurements.
26 Jan 2001  0.12         * changed my DTD date element name to dates
                          * changed my dElementName to errorElementName to fit in with Martin Laverty's schema
                          * added a general comment element
01 Jun 2000  0.11         Corrected XML version in declaration
01 Jan 2000  0.1          First draft version published

Schema Version and Change History

Similarly for the Schema we have:

<cavesurvey.schema version>= (U->)
<?xml version="1.0" encoding="UTF-8"?>

<!-- Version number of this CaveSurvey Schema -->
<!-- version "0.12" -->

Date         Schema Version  Change
17 Feb 2001  0.12            * in the shot element changed the dist, azim and elev attributes from required to implied. This is because when surveying by triangulation the shots don't have distances, and when surveying by trilateration the shots are complete with just tape measurements.
26 Jan 2001  0.11            * changed my DTD date element name to dates
                             * changed my dElementName to errorElementName to fit in with Martin Laverty's schema
                             * added schema element and namespace at start
                             * added a general comment element
01 Jan 2000  0.1             First draft version published

Notes and Ideas

This section is for quick notes on ideas for new tags or attributes.


Previous Questions and their followup.

Preface Comments to DTD and Schema

Some comments are inserted at the start of the DTD and Schema files so that I'll know where these files came from.

DTD Code

<cavesurvey.dtd preface>= (U->)
<!-- DTD for the CaveScript CaveSurvey XML -->
<!-- Author: Michael Lake, mikel AATT -->
<!-- This file is generated from noweb source file CaveSurvey.nw -->

Schema Code

<cavesurvey.schema preface>= (U->)
<!-- XSD for the CaveScript CaveSurvey XML -->
<!-- by: Martin Laverty, -->
<!-- after DTD by: Michael Lake, -->

<!-- notation name="svx" system="survex.exe" /-->

<schema xmlns=''

The schema must have a closing element tag.

<cavesurvey.schema end>= (U->)

Parameter Entities for the DTD

Parameter entity references appear in DTDs and are replaced by their entity definitions in the DTD. All parameter entity references begin with a percent sign, which means they cannot be used in an XML document, only in the DTD in which they are defined. (If you want entities that can substitute for other characters inside an XML document then refer to `General Entities'.)

Parameter entity references are used to make it easier for sets of elements to share common attributes and, in the case of the %if and other entities, to provide the XML document some control over the DTD.

The parameter entity references are at the start because they must be declared before they are used.

The declaration for a parameter entity uses the following format:

<!ENTITY % name "replacement_characters" >

DTD Code

<cavesurvey.dtd>= [D->]
<cavesurvey.dtd version>
<cavesurvey.dtd preface>

<!-- ============ Parameter Entities ============ -->

<!-- Date models for use later -->

<!-- These values can be set within the XML docs to select appropriate -->
<!-- STN attributes for normal/diving/topofil surveys.                 -->
<!-- Set to either INCLUDE or IGNORE.                                  -->
<!ENTITY % ifdiving  "IGNORE">
<!ENTITY % iftopofil "IGNORE">

Defines %day, %ifdiving, %iftopofil, %month, %year (links are to index).

In the following chunk I have defined the parameter entities %zero; and %one; because the double quotes surrounding the 0.0 or 1.0 cannot appear inside the double quotes that bound the entity being defined.

This would be illegal: <!ENTITY % zero_correct "ZERO CDATA "0.0"" >

<cavesurvey.dtd>+= [<-D->]
<!-- These are for the Instruments section -->
<!ENTITY % instrument_id "ID ID #IMPLIED" > 
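<!-- The inner single quotes are part of the value so that the expansion is a quoted attribute default -->
<!ENTITY % zero "'0.0'" >
<!ENTITY % one  "'1.0'" >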
<!ENTITY % zero_correct  "ZERO  CDATA %zero;" >
<!ENTITY % scale_correct "SCALE CDATA %one;"  >

Defines %accuracy, %instrument_id, %one, %scale_correct, %used, %zero, %zero_correct (links are to index).
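The substitution described above can be imitated with a few lines of Python. This is only a rough sketch of textual replacement, not how a real XML parser handles parameter entities; the function name expand, the TAPE attribute list used as input and the single-quoted value of zero are all assumptions made for illustration.

```python
import re

def expand(text, entities):
    """Replace each %name; reference with its replacement text, repeatedly."""
    pattern = re.compile(r"%([A-Za-z_][\w.-]*);")
    while pattern.search(text):
        text = pattern.sub(lambda m: entities[m.group(1)], text)
    return text

# Hypothetical entity table mirroring the chunk above; the inner single
# quotes are assumed so that the final default value is quoted.
entities = {
    "zero": "'0.0'",
    "zero_correct": "ZERO CDATA %zero;",
}

print(expand("<!ATTLIST TAPE %zero_correct; >", entities))
# <!ATTLIST TAPE ZERO CDATA '0.0' >
```

The nested reference is resolved in two passes: first %zero_correct; and then the %zero; it contains.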

Parameter entities don't appear in schemas.

Root Element Name

Element structures for DTDs are declared using an element type declaration with the following syntax:
<!ELEMENT elementName contentModel>

For the CaveSurvey DTD the root element name is declared by:

DTD Code

<cavesurvey.dtd>+= [<-D->]
<!-- ============ Root Element Name and Content ============ -->

Defines CAVESURVEY (links are to index).

Schema Code

<cavesurvey.schema>= [D->]
<cavesurvey.schema version>
<cavesurvey.schema preface>
<!-- ============ Root Element Name and Content ============ -->
<element name="caveSurvey" type="caveSurvey" >


<complexType name="caveSurvey">
  <element name="head" type="head" maxOccurs="1" />
  <element name="surveyors" minOccurs="0" maxOccurs="unlimited" />
  <element name="instruments" minOccurs="0" maxOccurs="unlimited" />
  <element name="surveySeries" maxOccurs="unlimited" />
  <element ref="comment" type="string" minOccurs="0" maxOccurs="unlimited" />

<simpleType name="comment"></simpleType>

Defines cavesurvey (links are to index).

TODO: check on syntax for comment in schema above.


This defines a content model with four elements: HEAD, SURVEYORS, INSTRUMENTS and SERIES.

This specifies exactly what elements the root element can contain.

  1. The root element CAVESURVEY must contain one and only one HEAD. (DTD: there is no following indicator, Schema: maxOccurs="1").
  2. The root element can contain an optional list of the elements SURVEYORS and/or INSTRUMENTS but if they are used they can appear only once (DTD: a ? indicator follows, Schema: ).
  3. The element SERIES must occur at least once but multiple consecutive occurrences are allowed (DTD: the + indicator).
  4. The root element cannot contain any character data, i.e. text, because its content model does not contain the keyword #PCDATA.
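The four rules above can be checked mechanically. The following Python sketch is one possible illustration and is not part of CaveScript: it assumes the DTD content model is (HEAD, SURVEYORS?, INSTRUMENTS?, SERIES+), as the list describes, and the helper name root_content_ok is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Assumed content model, taken from the four rules above:
# (HEAD, SURVEYORS?, INSTRUMENTS?, SERIES+) and no character data.
CONTENT_MODEL = re.compile(r"HEAD (SURVEYORS )?(INSTRUMENTS )?(SERIES )+")

def root_content_ok(xml_text):
    """Return True if the root element's children match the assumed model."""
    root = ET.fromstring(xml_text)
    if root.tag != "CAVESURVEY":
        return False
    # Rule 4: no text directly inside the root (whitespace is ignored here).
    if (root.text or "").strip() or any((c.tail or "").strip() for c in root):
        return False
    # Spell the child names out as one string and match it against the model.
    child_names = "".join(child.tag + " " for child in root)
    return CONTENT_MODEL.fullmatch(child_names) is not None

print(root_content_ok("<CAVESURVEY><HEAD/><SERIES/><SERIES/></CAVESURVEY>"))  # True
print(root_content_ok("<CAVESURVEY><SERIES/><HEAD/></CAVESURVEY>"))           # False
```

A validating XML parser would do this work from the DTD itself; the regular expression is only a compact way to show what the ?, + and sequence indicators mean.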

The root element name must be the same as the name of the root element of the document in which the declaration appears. As the root element name is CAVESURVEY, this name must appear as the root element in a CaveScript CaveSurvey XML document. See the example fragment of a CaveSurvey XML file below.

Example Fragment of CaveSurvey XML File

<?xml version="1.0" encoding="UTF-8"?>

<mysurvey.xml head>
<mysurvey.xml surveyors>
<mysurvey.xml instruments>
<mysurvey.xml series>

All XML documents start with the XML declaration, which tells the processing application that this is an XML document, which version of XML is being used, what character encoding is used and whether the document may need to refer to external resources for parsing. The declaration must be in lower case. Note that the version 1.0 is the version of the W3C XML specification, not the version of CaveScript.

The declaration for CaveScript XML documents is:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>

Documents that use DTDs must be prefaced by a document type declaration, commonly referred to as the DOCTYPE declaration. The document type declaration must come between the XML declaration and the root element of the document, and only one such declaration may appear in a document.

The syntax for the DOCTYPE declaration is:
<!DOCTYPE rootElementName [ ...declarations... ]>

CaveScript XML files do not need to specify if they are or are not standalone and so the standalone declaration will not be used.

Finally the document begins and ends with the root element name CAVESURVEY.

The Head Element

The HEAD element provides identifying information on the cave. The information here should be sufficient to uniquely identify the area, the cave and the survey.

DTD Code

<cavesurvey.dtd>+= [<-D->]
<!-- ============ Area Cave and Date Information ============ -->


Defines HEAD (links are to index).

Schema Code

<cavesurvey.schema>+= [<-D->]
<!-- ============ Area Cave and Date Information ============ -->

<complexType name="head" >
  <element name="area"  type="area"  maxOccurs="1" />  
  <element name="cave"  type="cave"  maxOccurs="1" />  
  <element name="dates" type="dates" minOccurs="0" maxOccurs="1" /> 

Defines head (links are to index).


The AREA and CAVE elements are required and must occur only once within HEAD. The DATES element is optional but can appear only once. The table below describes the elements and their attributes. The declination attribute is an interesting one. It is a "property" of both the area and the date of a survey, and could appear within either an AREA or a SERIES (as the different series of a survey may have been done years apart).

Element  Attribute    Purpose
AREA     NAME         The name of the region the cave is in, e.g. Jenolan Caves, NSW
AREA     DECLINATION  The declination correction to be applied for that area and the given survey date. This value will be used if there is no value set in a subsequent SERIES element.
CAVE     NAME         The name of the cave, e.g. Spider Cave
CAVE     TAG          The tag number affixed to the cave entrance, e.g. J174
DATES    (none)       The DATES element is used as a container for other date elements. There are no attributes.
Elements and attributes declared within element HEAD

Area and Cave Elements


DTD Code

<cavesurvey.dtd>+= [<-D->]
          NAME            CDATA #REQUIRED


Defines AREA, CAVE (links are to index).

Schema Code

<cavesurvey.schema>+= [<-D->]
<complexType name="area" >
  <attribute name="name"        use="required" />   
  <attribute name="declination" type="double" />

<complexType name="cave" >
  <attribute name="name" minOccurs="0" /> 
  <attribute name="tag"  use="required"  /> 

Defines area, cave (links are to index).


DTD attributes for elements are declared with an ATTLIST. The general syntax is:

<!ATTLIST elementName
          attributeName attributeType defaultDeclaration 
          [...more attributes...] >

The AREA and CAVE elements have a content model of EMPTY. They do not contain any content themselves but carry information in their attributes.

The attribute type CDATA for NAME means the attribute value can be any string of legal XML characters.

The attribute default #REQUIRED means that element instances must explicitly provide a value for this attribute each time it is used. The #IMPLIED attribute default means elements can provide a value for this attribute if the document author wishes.

The Dates Element

DTD Code

<cavesurvey.dtd>+= [<-D->]
                                LASTACCESSDATE?) >




Schema Code

<cavesurvey.schema>+= [<-D->]
<!-- change date to dates and add attribute type? -->
<complexType name="dates" >
  <element name="surveyDate"       type="date" minOccurs="0" maxOccurs="1" />  
  <element name="creationDate"     type="date" minOccurs="0" maxOccurs="1" />  
  <element name="modificationDate" type="date" minOccurs="0" maxOccurs="1" />  
  <element name="lastAccessDate"   type="date" minOccurs="0" maxOccurs="1" />  


The DATES element is used as a container for useful date values. If the SERIES does not provide a date to the parsing application then it could use the SURVEYDATE value as a default.

Element           Purpose
SURVEYDATE        The date the survey was done. This value will be used if there is no value set in a subsequent SERIES element.
CREATIONDATE      The creation date of this document.
MODIFICATIONDATE  The date that this document was last modified.
LASTACCESSDATE    The date that this document was last read.
Elements declared within element DATES

Attribute  Purpose
YEAR       The year as four digits.
MONTH      The month as two digits.
DAY        The day as two digits.
Attributes declared within DATE type elements

The order and separator used to format the date can be looked after by a style sheet and the application parsing and displaying the XML file. That way the aussies, brits and yanks can have their own date formats.

Example Fragment of CaveSurvey XML File

Following the DOCTYPE declaration in the CaveSurvey XML file is the HEAD element.

<mysurvey.xml head>= (<-U)
        DECLINATION="-11.0" />
  <CAVE NAME="Sigma Cave" 
        TAG="W15" >The cave is located on the hillside.</CAVE>
  <DATES><SURVEYDATE       YEAR="1974" MONTH="09" DAY="12" />
         <CREATIONDATE     YEAR="1998" MONTH="01" DAY="10" />
         <MODIFICATIONDATE YEAR="1999" MONTH="06" DAY="22" /></DATES>

There is an International Organization for Standardization (ISO) standard for dates, ISO 8601; however, there are still many countries that do not follow this standard. In addition it is not clear how to format the date when you only know a partial date, e.g. a year and month but not the day. Because of this the date is broken into its elemental components [cite xmlwg:Date1, xmlwg:Date2, xmlwg:Date3].
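Because the date is stored as separate YEAR, MONTH and DAY attributes, the displaying application is free to choose the order and separator, and a partial date needs no special syntax. The Python sketch below shows one way this might look; the function name format_date and its order and sep parameters are hypothetical and not part of CaveScript.

```python
def format_date(year, month=None, day=None, order="DMY", sep="/"):
    """Format the YEAR, MONTH and DAY attribute values, skipping unknown parts."""
    parts = {"Y": year, "M": month, "D": day}
    return sep.join(parts[key] for key in order if parts[key])

# Attribute values as they appear in the SURVEYDATE element above.
survey = {"YEAR": "1974", "MONTH": "09", "DAY": "12"}

print(format_date(survey["YEAR"], survey["MONTH"], survey["DAY"]))               # 12/09/1974
print(format_date(survey["YEAR"], survey["MONTH"], survey["DAY"], order="MDY"))  # 09/12/1974
print(format_date("1974", "09", order="YMD", sep="-"))                           # 1974-09
```

The last call shows a partial date: the missing day is simply left out.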

The Surveyors and Surveyor Elements


DTD code

<cavesurvey.dtd>+= [<-D->]
<!-- ============ The Cave Surveyors ============ -->

          NAME                CDATA #REQUIRED

Defines SURVEYOR, SURVEYORS (links are to index).

Schema code

<cavesurvey.schema>+= [<-D->]
<!-- ============ The Cave Surveyors ============ -->

<complexType name="surveyors" base="surveyor" derivedBy="extension" >
  <element name="surveyor" type="surveyor" minOccurs="1" maxOccurs="unlimited" />

<complexType name="surveyor" >   
  <attribute name="name" use="required" />
  <attribute name="name_link" type="uri"   />  
  <attribute name="affiliation" />  
  <attribute name="affiliation_link" type="uri"  />  


If used, the SURVEYORS element must contain one or more SURVEYOR elements (DTD: hence the + sign).

Each SURVEYOR element corresponds to one surveyor or organisation represented on a survey. The SURVEYOR element is empty, i.e. there is no content between the element tags (hence the EMPTY keyword). The attributes of the element contain all the information about each surveyor, as described in the table below.

The SURVEYOR element must contain the name of a surveyor or organisation (hence the #REQUIRED) and may contain further information (#IMPLIED).

Attribute            Purpose
NAME                 Name of the surveyor
NAME_CONTACT         A contact address, phone or email for that surveyor
AFFILIATION          The affiliation of the surveyor, e.g. their caving club
AFFILIATION_CONTACT  A contact address or phone for the organisation or club
Attributes of the SURVEYOR element

Override Behavior of the SURVEYORS Element

The SURVEYORS element can appear within the CAVESURVEY element tags or within survey SERIES element tags (see the section on the Series element).

If there is a SURVEYORS element within the CAVESURVEY element tags then it shall provide default values for all the survey series within that cave survey, i.e. all SERIES elements within the CAVESURVEY tags. This means that we don't need to repeat surveyor information in every survey series: we can include it once and override it for a particular series if needed.

If there is a SURVEYORS element within survey SERIES element tags then that information will override previous surveyor information.
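A parsing application could implement this override rule with a simple recursive walk. The following Python sketch is only one possible reading of the rule; the function name surveyors_by_series and the series names "entrance" and "streamway" are made up for the example.

```python
import xml.etree.ElementTree as ET

def surveyors_by_series(element, inherited=(), result=None):
    """Map each SERIES NAME to the surveyor names that apply to it."""
    if result is None:
        result = {}
    own = element.find("SURVEYORS")
    # A SURVEYORS child overrides whatever came from the enclosing element.
    if own is not None:
        current = tuple(s.get("NAME") for s in own.findall("SURVEYOR"))
    else:
        current = tuple(inherited)
    if element.tag == "SERIES":
        result[element.get("NAME")] = list(current)
    for child in element.findall("SERIES"):
        surveyors_by_series(child, current, result)
    return result

doc = ET.fromstring("""
<CAVESURVEY>
  <SURVEYORS>
    <SURVEYOR NAME="Mike Lake" /><SURVEYOR NAME="Jill Rowling" />
  </SURVEYORS>
  <SERIES NAME="entrance" />
  <SERIES NAME="streamway">
    <SURVEYORS><SURVEYOR NAME="Jill Rowling" /></SURVEYORS>
  </SERIES>
</CAVESURVEY>
""")
print(surveyors_by_series(doc))
# {'entrance': ['Mike Lake', 'Jill Rowling'], 'streamway': ['Jill Rowling']}
```

The "entrance" series inherits the defaults from CAVESURVEY, while "streamway" replaces them with its own list.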


Example Fragment of CaveSurvey XML File

<mysurvey.xml surveyors>= (<-U)
        <SURVEYOR NAME="Mike Lake"    AFFILIATION="SUSS" />
        <SURVEYOR NAME="Jill Rowling" AFFILIATION="SUSS" />

The Instruments Element and its Children


DTD Code

<cavesurvey.dtd>+= [<-D->]
<!-- ============ The Instruments Used ============ -->



<!-- Depth Gauges are used in cave diving surveys -->


Schema Code

<cavesurvey.schema>+= [<-D->]
<!-- ============ The Instruments Used ============ -->

<complexType name="instruments" >
  <element name="instrument" type="instrument" abstract="true" 
           minOccurs="0" maxOccurs="unlimited" />

<complexType name="instrument" > 
  <attribute name="instrument_id" type="" use="required" />
  <attribute name="used"          type="" use="required" />
  <attribute name="zero_correct"  type="" use="default"  value="0.0" />
  <attribute name="scale_correct" type="" use="default"  value="1.0" />
  <attribute name="accuracy"      type="" />

<element name="tape" equivClass="instrument" >
  <attribute name="units"  type="" use="default" value="metres">
   <enumeration value="metres" />
   <enumeration value="feet" />
   <enumeration value="yards" />

 <element name="compass" equivClass="instrument" >
  <attribute name="units" type="" use="default" value="degrees" >
   <enumeration value="deg" />
   <enumeration value="degrees" />
   <enumeration value="grads" />
   <enumeration value="mils" />
   <enumeration value="minutes" />

 <element name="clinometer" equivClass="instrument" > 
  <attribute name="units" type="" use="default" value="degrees" >
   <enumeration value="deg" />
   <enumeration value="degrees" />
   <enumeration value="grads" />
   <enumeration value="mils" />
   <enumeration value="percent" />
  </attribute >

 <element name="theodolite" equivClass="instrument" >
  <attribute name="units" type="" use="default" value="degrees" >
   <enumeration value="deg" />
   <enumeration value="degrees" />
   <enumeration value="grads" />
   <enumeration value="mils" />
   <enumeration value="minutes" />
   <enumeration value="seconds" />

 <element name="topofil" equivClass="instrument" >
  <attribute name="units"  type="" use="default" value="metres" >
   <enumeration value="metres" />
   <enumeration value="feet" />
   <enumeration value="yards" />

 <!-- Depth Gauges are used in cave diving surveys -->
 <element name="depthGauge" equivClass="instrument" > 
  <attribute name="units" use="default" value="metres" >
   <enumeration value="metres" />
   <enumeration value="feet" />
   <enumeration value="yards" />


The instruments are optional but can only appear once. The individual instruments can have text content (#PCDATA) which can be used to hold further information about them such as detailed descriptions.

Attributes of the Instruments

Most instruments share common attributes, as shown in the table below. Note that the DTD declarations for these attributes make use of several parameter entities.

Attribute      Purpose
ID             A unique identification for the instrument. It may be the serial number or some other string, e.g. "T3" for a tape
ZERO_CORRECT   The length correction to be applied to this instrument to bring it back to zero
SCALE_CORRECT  The scale multiplication correction to be applied to this instrument
ACCURACY       Equivalent to the standard deviation expected from this type of instrument
UNITS          The units of the instrument, e.g. "metres"
DESCRIPTION    Further description of the instrument, e.g. "Jims Stfford's tape"
Attributes of Instruments

ID and Used Attributes

If an attribute is given an attribute type of ID then its value must be unique among all attributes of type ID in the document. Attributes of type ID can never have fixed default values, and only one attribute per element can be of type ID.

The parameter entity %instrument_id; expands to "ID ID #IMPLIED" and provides a unique reference for that particular instrument. It is not required, but many caving clubs will identify their instruments, especially if some instruments require zero or scale corrections to be applied. The ID can be any string of characters but it must start with a letter. See below for a discussion of the problem that this XML restriction creates.

To refer to this instrument elsewhere in an XML document we can use an IDREF attribute. The value of such an attribute must match the value of an ID attribute of an element in the same XML document. The parameter entity %used; expands to "USED IDREF #IMPLIED".

Example: If we had:

<TAPE ID="SUSS10">Fibreglass tape with 6cm missing off end.</TAPE>

Later in the document we could specify that that instrument was used for a particular series by writing:

<TAPE USED="SUSS10" />

We should not have to add any information again about the zero correction factors, accuracy or description.
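An application reading the file has to follow each USED reference back to the element that carries the matching ID. The Python sketch below shows one way that lookup might be done; resolve_used is a hypothetical helper, and the surrounding CAVESURVEY, INSTRUMENTS and SERIES structure in the sample is assumed for the example.

```python
import xml.etree.ElementTree as ET

def resolve_used(root):
    """Pair each element carrying USED with the element whose ID it names."""
    by_id = {el.get("ID"): el for el in root.iter() if el.get("ID") is not None}
    pairs = []
    for el in root.iter():
        ref = el.get("USED")
        if ref is None:
            continue
        if ref not in by_id:
            raise ValueError(f"USED={ref!r} does not match any ID in the document")
        pairs.append((el, by_id[ref]))
    return pairs

doc = ET.fromstring("""
<CAVESURVEY>
  <INSTRUMENTS>
    <TAPE ID="SUSS10">Fibreglass tape with 6cm missing off end.</TAPE>
  </INSTRUMENTS>
  <SERIES NAME="A">
    <INSTRUMENTS><TAPE USED="SUSS10" /></INSTRUMENTS>
  </SERIES>
</CAVESURVEY>
""")
for user, target in resolve_used(doc):
    print(user.tag, "->", target.text)
# TAPE -> Fibreglass tape with 6cm missing off end.
```

A validating parser would reject a dangling IDREF on its own; the ValueError here stands in for that check.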

Problem 1: If a Suunto Twin is used the compass and clino are one unit and any identification will pertain to both ``instruments'', however the compass and clino ID's cannot be the same in one XML document. Options for resolving this are:

  1. prefix or affix a letter to the instrument IDs to make them unique
  2. have a new COMP_CLINO combined element with a single ID
  3. inscribe two IDs onto the back of all your Suunto twins ;-)

Problem 2: The XML specification states that the value of an ID attribute must begin with a letter but otherwise can be composed of letters, digits, hyphens, underscores and the full stop character. What if the ID of an instrument, inscribed onto it, starts with a digit? This would not be an uncommon situation. One way would be to prefix all instruments that have numeric IDs with the club name, like SUSS6.
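The club-name prefix suggested above is easy to automate. The sketch below uses a simplified, ASCII-only version of the ID rule as stated in this section; the real XML rule for names is broader. The names ID_PATTERN and make_instrument_id are hypothetical.

```python
import re

# Simplified ASCII-only form of the rule described above: a letter first,
# then letters, digits, hyphens, underscores or full stops.
ID_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9._-]*")

def make_instrument_id(inscribed, club):
    """Return inscribed unchanged if it is a usable ID, else prefix the club name."""
    if ID_PATTERN.fullmatch(inscribed):
        return inscribed
    candidate = club + inscribed
    if not ID_PATTERN.fullmatch(candidate):
        raise ValueError(f"cannot build a valid ID from {inscribed!r}")
    return candidate

print(make_instrument_id("6", "SUSS"))   # SUSS6
print(make_instrument_id("T3", "SUSS"))  # T3
```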

Zero and Scale Correction

The zero correction factor of ``0.0'' and a scale correction of ``1.0'' will be used as a default value if the attributes are not specifically declared in the element.

Value = ( Reading - ZeroError ) * Scale

Be careful about the sign of the zero error. As in Survex it is the amount needed to correct the reading you got to zero. If the tape measure has the end missing, and you are using the 30cm mark to take all measurements from, then correct it with:

  <TAPE ZERO_CORRECT="+0.3">30m fibreglass with end missing</TAPE>
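As a worked example of the formula, the Python sketch below applies Value = ( Reading - ZeroError ) * Scale with the default values of 0.0 and 1.0. The function name corrected and the reading of 12.80 m are invented for the illustration.

```python
def corrected(reading, zero_correct=0.0, scale_correct=1.0):
    """Apply Value = (Reading - ZeroError) * Scale with the stated defaults."""
    return (reading - zero_correct) * scale_correct

# Attribute values arrive as strings such as "+0.3"; float() accepts the sign.
zero = float("+0.3")

# Tape read from the 30 cm mark: a reading of 12.80 m is really 12.50 m.
print(round(corrected(12.80, zero_correct=zero), 2))  # 12.5
```

With no attributes given the defaults leave the reading unchanged, which matches the behaviour described for ZERO_CORRECT and SCALE_CORRECT.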



Bearings in degrees, minutes and seconds are not implemented, nor are units such as grads or mils. [360 degrees = 400 grads; a full circle is also 6400 NATO mils.] Does anyone need these?

Description of Instruments

Descriptive information about the instruments is currently included as element content like so:

  <TAPE ZERO_CORRECT="+0.3">30m fibreglass with end missing</TAPE>

However it could be placed into a DESC description attribute defined in the DTD like this:


and the XML document would then look like:

  <TAPE ZERO_CORRECT="+0.3"
        DESC="30m fibreglass with end missing"></TAPE>


I really don't require any set order here but the commas force a sequence.


Example Fragment of CaveSurvey XML File

<mysurvey.xml instruments>= (<-U)
        <TAPE ID="SUSS10" >30m fibreglass</TAPE>
        <TAPE ID="SUSS11" ZERO_CORRECT="+0.1" >30m fibreglass</TAPE>
        <TAPE ID="JILL">6m steel</TAPE>
        <COMPASS ID="SUSS1">Suunto Twin</COMPASS>
        <CLINO ID="SUSS2"/>

The Series Element

This element contains the survey or traverse data. You can have one series for each survey trip into a cave, or organise it so that each series is a particular section of the cave, or use some other system again.

DTD Code

<cavesurvey.dtd>+= [<-D->]
<!-- ============ The Survey Data ============ -->
          NAME        CDATA #REQUIRED
          DATE        CDATA #IMPLIED
          DECLINATION CDATA #IMPLIED          

Defines SERIES (links are to index).

Schema Code

<cavesurvey.schema>+= [<-D->]
<!-- ============ The Survey Data ============ -->
<complexType name="surveySeries" >
      <element name="surveyors" type="surveyors" minOccurs="0" 
                   maxOccurs="unlimited" /> 
          <element name="instruments" type="instruments" minOccurs="0" 
               maxOccurs="unlimited" />
      <element name="status" type="status" minOccurs="0" 
                   maxOccurs="unlimited" />

      <element name="station" type="station" maxOccurs="unlimited" />
      <element name="shot" type="shot" maxOccurs="unlimited" />

  <element name="x_sect" type="x_sect" minOccurs="0" maxOccurs="unlimited" />
  <element name="surveySeries" type="surveySeries" minOccurs="0" 
           maxOccurs="unlimited" />

  <attribute name="description" />
  <attribute name="date" type="date" minOccurs="0" />
  <attribute name="declination" type="double" minOccurs="0" />

<simpleType name="status" >
  <enumeration value="raw data" />
  <enumeration value="calibrated data" />
  <enumeration value="fixed data" />
  <enumeration value="preliminary" />
  <enumeration value="fully processed" />


The series element encloses survey data in the same way that Survex's *begin and *end do. A series can specify the surveyors for that particular series and its own instrument settings. As can be seen from the DTD these included elements are optional, but if they are used they must appear first, before any survey information, and they can only occur once.

The station and shot information comes second. There must be at least one instance of a station/shot group. Normal compass and tape data consists of at least two stations and one shot. Unfortunately XML does not allow one to specify that at least two of something is required.

Note that the element SERIES can also contain further SERIES elements. This is how CaveScript CaveSurvey files implement hierarchical station naming like Survex.
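To illustrate how nested SERIES elements give hierarchical names, the following Python sketch builds Survex-style dotted station names. It is an assumption that a reading application would join the names with a full stop as Survex does; the function name qualified_stations and the series names "sigma" and "upper" are invented for the example.

```python
import xml.etree.ElementTree as ET

def qualified_stations(series, prefix=""):
    """Yield dotted names for every STN in nested SERIES elements."""
    path = prefix + series.get("NAME") + "."
    for child in series:
        if child.tag == "STN":
            yield path + child.get("NAME")
        elif child.tag == "SERIES":
            yield from qualified_stations(child, path)

doc = ET.fromstring("""
<SERIES NAME="sigma">
  <STN NAME="1" />
  <SERIES NAME="upper"><STN NAME="1" /><STN NAME="2" /></SERIES>
</SERIES>
""")
print(list(qualified_stations(doc)))
# ['sigma.1', 'sigma.upper.1', 'sigma.upper.2']
```

Two stations can both be named "1" without clashing because the enclosing series names keep the full names distinct.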


All survey series must contain at least one station though usually a series would contain many stations and the shots between them.

DTD Code

<cavesurvey.dtd>+= [<-D->]
          ERROR_E     CDATA #IMPLIED
          ERROR_N     CDATA #IMPLIED
          ERROR_H     CDATA #IMPLIED

Schema Code

<cavesurvey.schema>+= [<-D->]
<complexType name="station" >
  <element name="name" type="string" maxOccurs="1" />
  <element name="east" type="double" maxOccurs="1" />
  <element name="north" type="double" maxOccurs="1" />
  <element name="vertical" type="double" maxOccurs="1" />
  <!-- altitude, depth, rel, abs -->
  <element name="error_east" type="double" minOccurs="0" maxOccurs="1" />
  <element name="error_north" type="double" minOccurs="0" maxOccurs="1" />
  <element name="error_vertical" type="double" minOccurs="0" maxOccurs="1" />

<cavesurvey.schema>+= [<-D->]
<complexType name="grade" >
  <attribute name="type" >
    <enumeration value="ASF" />
    <enumeration value="BCRA" />
    <enumeration value="CRG" />
  <attribute name="level_line" >
    <enumeration value="2" />
    <enumeration value="4" />
    <enumeration value="5" />
  <attribute name="level_detail" >
    <enumeration value="a" />
    <enumeration value="b" />
    <enumeration value="c" />
    <enumeration value="d" />

Aside: The station NAME is of type CDATA whereas it would have been nicer to have it of type ID. There are two problems with using ID:

  1. In XML all IDs must start with a letter, so a station with ID="10" is illegal; we must have something like ID="A10".
  2. The ID value must be unique among all attributes of type ID within the document. Yet in Survex data we can have two or more stations named "1" if they are in different SERIES.

Note: To satisfy the latter requirement perhaps a prefix defined as a parameter entity could be added to the DTD:
<!ENTITY % prefix "A">
<!ENTITY % affix "Z">

A fixed station determined via RDF or theodolite would have EAST, NORTH and HEIGHT values set. A constrained station might only have EAST and NORTH specified.

The attributes ERROR_E, ERROR_N and ERROR_H are the errors or standard deviations in the Easting, Northing and Height.

The station description is provided via the parsed character data within the element.
Example: <STN NAME="87">Cusp of rock in narrow passage.</STN>

Question: What about control points specified as Long and Lat?


DTD Code

<cavesurvey.dtd>+= [<-D->]
          FROM  CDATA                           #REQUIRED
          TO    CDATA                           #REQUIRED
          DIST  CDATA               #IMPLIED
          AZIM (CDATA | - )         #IMPLIED
          ELEV (CDATA | UP | DOWN ) #IMPLIED
          ERROR_DIST CDATA          #IMPLIED
          ERROR_AZIM CDATA          #IMPLIED
          ERROR_ELEV CDATA          #IMPLIED

<![%ifdiving; [

Defines SHOT (links are to index).

Schema Code

<cavesurvey.schema>+= [<-D->]
<element name="shot" type="shot"  />

<complexType name="shot"  abstract="true" >
  <attribute name="from" type="" use="required" />
  <attribute name="to"   type="" use="required" />
  <attribute name="dist" type="double"       minInclusive="0"/>
  <attribute name="azim" type="double"       minInclusive="0" 
             maxInclusive="360" />
  <attribute name="error_dist" type="double" minInclusive="0" />
  <attribute name="error_azim" type="double" minInclusive="0" />
  <choice >
    <element name="standardLeg" />   
    <element name="divingLeg" />   
<element name="standardLeg" equivClass="shot"  >
  <attribute name="incl" type="double" use="required" minInclusive="-90" 
             maxInclusive="90" />
  <attribute name="error_i" type="double" minInclusive="0" />
<element name="divingLeg" equivClass="shot"  >
  <attribute name="fromDepth"  type="double" use="required" />
  <attribute name="toDepth"  type="double" use="required" />
  <attribute name="error_v" type="double" minInclusive="0" />

Shots can contain parsed character data which will usually be any comments about that shot.


<SHOT FROM="88" TO="88a" DIST="7.18" AZIM="10" ELEV="+44.5" >
      This shot was a bit difficult.</SHOT>

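The first five attributes (FROM, TO, DIST, AZIM and ELEV) should be present for a normal shot, although the DTD code above marks only FROM and TO as #REQUIRED; DIST, AZIM and ELEV are #IMPLIED.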
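To show how an application might use these attributes, here is a minimal sketch that turns one SHOT into an east/north/up displacement. It assumes AZIM is in degrees clockwise from north, ELEV is in degrees and positive upwards, and that both are numeric (the UP, DOWN and - values allowed by the DTD are not handled). The function name is hypothetical.

```python
import math
import xml.etree.ElementTree as ET


def shot_vector(shot):
    """Return (east, north, up) for a SHOT element with numeric attributes."""
    dist = float(shot.get("DIST"))
    azim = math.radians(float(shot.get("AZIM")))
    elev = math.radians(float(shot.get("ELEV")))   # float() accepts "+43"
    horizontal = dist * math.cos(elev)             # plan length of the leg
    east = horizontal * math.sin(azim)
    north = horizontal * math.cos(azim)
    up = dist * math.sin(elev)
    return east, north, up


shot = ET.fromstring('<SHOT FROM="86" TO="85" DIST="5.42" AZIM="328" ELEV="+43" />')
print(shot_vector(shot))
```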

Note: If the station names were of type ID, the FROM and TO attributes could be of type IDREF, meaning that each value must match the value of an ID attribute of some element in the document; in this case a STN element.
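Since the parser cannot enforce this with CDATA attributes, an application could make the check itself. The sketch below assumes that station names are scoped to the enclosing SERIES and only looks at its direct STN and SHOT children; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def dangling_refs(series):
    """List (attribute, value) pairs of SHOTs that name no STN in this SERIES."""
    names = {stn.get("NAME") for stn in series.findall("STN")}
    missing = []
    for shot in series.findall("SHOT"):
        for attr in ("FROM", "TO"):
            if shot.get(attr) not in names:
                missing.append((attr, shot.get(attr)))
    return missing


series = ET.fromstring(
    '<SERIES NAME="sigma">'
    '<STN NAME="85"/><STN NAME="86"/>'
    '<SHOT FROM="86" TO="85"/><SHOT FROM="86" TO="88a"/>'
    '</SERIES>'
)
print(dangling_refs(series))
```

Whether a shot to an undescribed station should be an error or just a warning is left to the application.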

Some surveyors may prefer alternative terms to DIST, AZIM and ELEV such as those in the following table:

Name    Element Name    Alternative Element Names
Alternative names for some series elements

Note that angles can be expressed as an azimuth, as a bearing or, less commonly, in mils or grads. Azimuthal readings go from 0 to 360° whereas bearings only range from 0 to 90° because they are measured within one quadrant of the full circle. An example would be South 30 degrees East, written S-30°-E.
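Converting a quadrant bearing to an azimuth is simple arithmetic; for example South 30 degrees East corresponds to an azimuth of 150 degrees. A minimal sketch follows; the function name and argument layout are hypothetical.

```python
def bearing_to_azimuth(start, angle, toward):
    """Convert a quadrant bearing such as ('S', 30, 'E') to an azimuth in degrees."""
    if not 0 <= angle <= 90:
        raise ValueError("quadrant angle must lie between 0 and 90 degrees")
    key = (start.upper(), toward.upper())
    if key == ("N", "E"):
        return angle
    if key == ("S", "E"):
        return 180 - angle
    if key == ("S", "W"):
        return 180 + angle
    if key == ("N", "W"):
        return (360 - angle) % 360   # N-0-W is due north, i.e. 0 not 360
    raise ValueError("bearing must run from N or S toward E or W")


print(bearing_to_azimuth("S", 30, "E"))
```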

The ERROR_DIST, ERROR_AZIM and ERROR_ELEV attributes allow the surveyor to downgrade a shot compared to what is currently specified.

Question: How can we ensure that a higher accuracy than what's possible with the given instruments can't be given here?

Theodolite, Topofil and Other Data

Not yet supported are:

Normal/Spherical Polar:  FromStn ToStn Dist Bearing Elev
Diving:                  FromStn ToStn Dist Bearing FromDepth ToDepth
Topofil:                 FromStn ToStn FromCount ToCount Bearing Elev
Theodolite:              AtStn IncludedAngle ElevBack ElevFore Dist

Cross Sections and LRUD Information

The Left, Right, Up and Down (LRUD) information that is often collected by surveyors can be viewed as cross section information where only four data points are recorded. CaveScript uses a generic cross section element to store all cross sections.

DTD Code

<cavesurvey.dtd>+= [<-D->]
          SHOT  CDATA                                   #REQUIRED
          POS  (#PCDATA | start | end)* #REQUIRED
          STRIKE CDATA                  #IMPLIED
          DIP CDATA                     #IMPLIED

Defines XDATA, XSECT (links are to index).

Schema Code

<cavesurvey.schema>+= [<-D->]
<element name="x_sect" type="x_sect" />

<complexType name="x_sect" >
  <element name="points" maxOccurs="unbounded" />

<complexType name="points" >
  <attribute name="shot" type="shot" use="required" />
  <attribute name="position" type="double"  />
  <attribute name="strike" type="double"  minInclusive="0" maxInclusive="360" />
  <attribute name="dip" type="double" minInclusive="-90" maxInclusive="90" />

Cross section data is stored as character content within XDATA elements as a sequence of measurements r1 r2 r3 ... taken at equally spaced angles theta around the cross section, where 0 <= theta <= 360°. Attributes of the XDATA element specify the orientation of the cross section.

Defaults: STRIKE="Bearing of the shot" and DIP="0.0"
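Because the measurements are equispaced, the angle of each one follows from its position in the list. The sketch below assumes the angle is counted from the first reading; the LRUD example in this section suggests the order left, up, right, down, but that is an inference rather than something the DTD states. The function name is hypothetical.

```python
def xsect_points(text):
    """Pair each measurement in XDATA content with its angle in degrees."""
    radii = [float(token) for token in text.split()]
    if not radii:
        return []
    step = 360.0 / len(radii)      # equal angular spacing between readings
    return [(index * step, r) for index, r in enumerate(radii)]


print(xsect_points("0.5 2.0 1.5 0.4"))
```

Four readings give a step of 90 degrees and eight readings a step of 45 degrees, which matches the two cases drawn in this section.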

Note: It would have been nicer to have the SHOT attribute of type IDREFS, which means that there must be at least one value (though in this case there will always be exactly two) and that the values must match ID values in the document; in this case those of STN elements.

           2.0                      2.0
            ^                    0.8 ^ 0.9 
            |                        |
    <--0.5     1.5-->       <--0.5       1.5-->
            |                        |
            |                    0.9 | 1.0
           0.4                      0.4

     LURD Info only           8 measurements taken, equispaced 
<XDATA SHOT="88 88a" POS="start"> at start of leg, i.e. at station 88

<XDATA SHOT="88 88a" POS="2.0"> at position 2.0 metres along leg

<XDATA SHOT="88 88a" POS="end"> at end of leg, i.e. at station 88a

The application needs to ensure that a numeric value given for POS does not exceed the DIST of the shot.
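A minimal sketch of such a check, assuming POS is either the keyword start or end or a distance in the same units as DIST. The function name is hypothetical.

```python
def resolve_pos(pos, dist):
    """Return the distance along the leg that a POS value refers to."""
    if pos == "start":
        return 0.0
    if pos == "end":
        return dist
    value = float(pos)
    if not 0.0 <= value <= dist:
        raise ValueError(f"POS {value} lies outside a shot of length {dist}")
    return value


print(resolve_pos("2.0", 7.18))
```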

Example Fragment of CaveSurvey XML File

<mysurvey.xml series>= (<-U)
<SERIES NAME="sigma">
<STN NAME="85">Cusp (lower one) of rock at base of 1st drop.</STN>
        <STN NAME="86">Cusp of rock at apex or corner in passage.</STN>
        <STN NAME="87">Cusp of rock in narrow passage.</STN>
        <STN NAME="88">Cusp of rock 1m above stream bed.</STN>
        <STN NAME="89">Southerly-most end of ridge of rock at waist height.</STN>
        <STN NAME="90" EAST="0.5" NORTH="2.0" HEIGHT="0.0">Fixed by RDF location.</STN>
        <SHOT FROM="86" TO="85" DIST="5.42" AZIM="328" ELEV="+43" />
        <SHOT FROM="87" TO="86" DIST="2.16" AZIM="0.0" ELEV="+22" />
        <SHOT FROM="88" TO="87" DIST="5.90" AZIM="343" ELEV="+1" />
        <SHOT FROM="88" TO="89" DIST="3.71" AZIM="180" ELEV="-3" />
        <SHOT FROM="90" TO="89" DIST="10.3" AZIM="69"  ELEV="-7" />

        <SERIES NAME="sigma2" DATE="1999-11-02" DECLINATION="-12.2">
                <?CAVERN PROCESS="NO" ?>
                Upstream section from the top of &quot;Fallaway Drop&quot; at the end 
                of &quot;Gamma Grovel&quot; (which is the far end of the Pointed Finger 
                chamber) then along the streamway to &quot;Knockers Cavern Two&quot;.
                        <SURVEYOR NAME="Phil Maynard"    />
                        <SURVEYOR NAME="Geoff McDonnell" /> 
                        <TAPE USED="SUSS11" >Old 30m fibreglass one</TAPE>
                        <TAPE>A 6m metal tape was used for cross sections.</TAPE>
                <SHOT FROM="88" TO="88a" DIST="7.18" AZIM="10" ELEV="+44.5" >
                This shot was a bit difficult.</SHOT>
                <SHOT FROM="88b" TO="88a" DIST="3.6" AZIM="225" ELEV="-55" />
                <SHOT FROM="88b" TO="88c" DIST="7.3" AZIM="82"  ELEV="+43" />
                <SHOT FROM="88b" TO="88d" DIST="2.84" AZIM="92" ELEV="+37" />
                <SHOT FROM="88b" TO="88e" DIST="3.08" AZIM="62" ELEV="+39" />
                        <XDATA SHOT="88 88a" POS="start">
                                0.5 2.0 1.5 0.4 
                        <XDATA SHOT="88 88a" POS="2.0">
                                0.5 0.8 2.0 0.9 1.5 1.0 0.4 0.9
                        <XDATA SHOT="88 88a" STRIKE="30" DIP="-45" POS="end">
                                0.5 2.0 1.5 0.4 


Equating Survey Stations


DTD Code

<cavesurvey.dtd>+= [<-D]

Schema Code

<cavesurvey.schema>+= [<-D->]
<element name="equate" type="equate" />

<complexType name="equate" >
  <attribute name="equate" type="xml:link" use="required" />

Finally append the remaining element end tags.

<cavesurvey.schema>+= [<-D]
<cavesurvey.schema end>

Unresolved Questions


Survey Comments

TODO: How do I allow text information such as comments that might occur in other survey programs to be imported into the XML files?

In Survex such comments are preceded by a semi-colon (;) whilst in Walls comments are preceded by a hash (#). In a cave survey XML document such comments would not be required as all information should be contained as content within appropriate elements.

I could have CDATA sections to include such information or a specific element called COMMENT which contains the comments.

Eg. From Survex a comment such as:

; This was a difficult leg.

would become either

<![CDATA[; This was a difficult leg.]]> or

<COMMENT>; This was a difficult leg.</COMMENT>

TODO: I need to ensure that when converting older Survex or Walls data their comments don't contain the characters & < > " or '.
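One option is to escape those characters during conversion rather than reject them. The sketch below uses the standard library's xml.sax.saxutils.escape and wraps the result in the COMMENT element proposed above; the function name is hypothetical and the leading semicolon is kept, as in the example.

```python
from xml.sax.saxutils import escape


def survex_comment_to_xml(line):
    """Wrap a Survex comment line in a COMMENT element, escaping markup characters."""
    # escape() handles & < > itself; the quote characters are added explicitly.
    safe = escape(line, {'"': "&quot;", "'": "&apos;"})
    return "<COMMENT>" + safe + "</COMMENT>"


print(survex_comment_to_xml("; depth < 2m & very wet"))
```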

Processing Instructions

Example: <?CAVERN PROCESS="NO" ?>

The ``target'' application is CAVERN and a processing instruction embedded within the XML file instructs the application not to process the series that is in scope.
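How an application honours this instruction is up to the application. As a sketch only, the following uses xml.dom.minidom (which keeps processing instructions in the tree) to list the SERIES that carry a PROCESS="NO" instruction as a direct child. Treating ``in scope'' as ``direct child of the SERIES'' is my assumption, and the function name is hypothetical.

```python
from xml.dom import minidom


def series_to_skip(xml_text, target="CAVERN"):
    """Return the NAME of every SERIES holding a <?target PROCESS="NO" ?> child."""
    doc = minidom.parseString(xml_text)
    skipped = []
    for series in doc.getElementsByTagName("SERIES"):
        for child in series.childNodes:
            if (child.nodeType == child.PROCESSING_INSTRUCTION_NODE
                    and child.target == target
                    and 'PROCESS="NO"' in child.data):
                skipped.append(series.getAttribute("NAME"))
                break
    return skipped


example = (
    '<SERIES NAME="sigma">'
    '<SERIES NAME="sigma2"><?CAVERN PROCESS="NO" ?>'
    '<SHOT FROM="88" TO="88a"/></SERIES>'
    '</SERIES>'
)
print(series_to_skip(example))
```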

Quick Summary of noweb Usage

The package noweb is a literate programming tool where code chunks are interspersed with the documentation that describes them. Invoking noweb with the appropriate options either generates the documentation in LaTeX (this documentation) or assembles the code from the code chunks.

The first table below shows the noweb commands used to extract the code and documentation, and the second lists the options supplied to noweb. The Makefile is a more convenient way to perform the same tasks and is covered in the section ``Makefile for DTD's and Documentation''. Defined code chunks in this noweb document are listed in the section ``Defined Chunks''.

To create:            Run:
Makefile              notangle -t4 -RMakefile CaveSurvey.nw > Makefile
CaveSurvey DTD        notangle -t4 -Rcavesurvey.dtd CaveSurvey.nw > CaveSurvey.dtd
CaveSurvey schema     notangle -t4 -Rcavesurvey.schema CaveSurvey.nw > CaveSurvey.xsd
Example XML file      notangle -t4 -Rmysurvey.xml CaveSurvey.nw > mysurvey.xml
LaTeX documentation   noweave -t4 -delay -index CaveSurvey.nw > CaveSurvey.tex
                      latex CaveSurvey.tex
HTML documentation    noweave -html -filter l2h -x CaveSurvey.nw | htmltoc > CaveSurvey.html
All the files         noweb CaveSurvey.nw
Quick summary of noweb usage

notangle/noweave options
-t4   Copy tabs untouched from input to output, and use tabs for indentation.
      Tabs get set to 8 by default in noweb.
-R    Extracts the named root chunk from the noweb file, e.g. -Rcavesurvey.dtd
Options in notangle/noweave usage

Interpreting noweb Cross References

Throughout the dvi and Postscript documentation (not the HTML documentation) you will see that each chunk of code is uniquely identified by a page number and an alphabetic sub-page reference. An example is:

10b <cavesurvey.dtd 9>+=== (15) 10a 11

This line tells us that we are now in code chunk 10b. This code chunk is on page 10 and it is the second code chunk defined on this page.

The construct <cavesurvey.dtd 9>+=== tells us that we are in a code chunk called cavesurvey.dtd, that its definition began in chunk 9 and the +=== means we are adding to its definition (noweb concatenates definitions with the same name in order of appearance).

At the right margin we find: (15) 10a 11

This tells us that the chunk we're defining is used within chunk 15, and that this current chunk is continued from chunk 10a and is continued in chunk 11.
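The concatenation of same-named chunks can be illustrated with a toy extractor. This is not how notangle is implemented and it leaves out most of what notangle does (in particular it does not expand chunk references inside code); it only shows definitions with the same name being joined in order of appearance. In noweb source a code chunk starts with a line of the form <<name>>= and ends at a line beginning with @.

```python
import re

_HEADER = re.compile(r"<<(.+)>>=\s*$")


def tangle(source, root):
    """Join every definition of chunk `root`, in order of appearance."""
    out = []
    current = None                       # name of the chunk we are inside
    for line in source.splitlines():
        match = _HEADER.match(line)
        if match:
            current = match.group(1)     # a new code chunk begins
        elif line == "@" or line.startswith("@ "):
            current = None               # back to documentation
        elif current == root:
            out.append(line)
    return "\n".join(out)


demo = "<<a.dtd>>=\nfirst\n@\nSome text.\n<<a.dtd>>=\nsecond\n@ %def X\n"
print(tangle(demo, "a.dtd"))
```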

At the end of each code chunk a %def is used to define any variables within that code chunk that we want to cross reference. These defined variables are listed in the noweb index with the page number where they were defined. The LaTeX hyperref package is being used, so this page number will be a hyperlink and show as underlined.

Any defined variable enclosed in double square brackets in the documentation text, like this [[variable]], becomes a hyperlink, again to the place where that variable is defined.

Makefile for DTD's and Documentation


The following Makefile provides a convenient way to create or update the code or documentation after modifications to the noweb source file rather than typing all the notangle or noweave commands. Code or documentation changes are done by making the modifications in the noweb source file and running the appropriate make command.

To extract the Makefile:

notangle -t4 -RMakefile CaveSurvey.nw > Makefile

Run ``make help'' to see what options there are.

For instance, after making changes to any of the DTD's via the noweb source file I run ``make dvi'' to see my changes in xdvi or do a ``make dtd'' to create the up-to-date DTD's. One generally never changes the output files directly (except for quick hacks).

# Makefile for creating CaveSurvey DTD's and their Documentation

# The noweb source file
CAVESURVEY_NOWEB_SOURCE = CaveSurvey.nw

# If the user just types 'make' with no args then help, being the 
# first routine will be invoked.
help:
	@echo 'Usage: make [dtd schema examples dvi ps html all clean]'

# Create DTDs
dtd:
	@echo 'Creating CaveSurvey.dtd ...'
	notangle -t4 -Rcavesurvey.dtd $(CAVESURVEY_NOWEB_SOURCE) | cpif CaveSurvey.dtd
#	notangle -t4 -Rcavemap.dtd    $(CAVEMAP_NOWEB_SOURCE) | cpif CaveMap.dtd

# Create Schemas
schema:
	@echo 'Creating CaveSurvey.xsd ...'
	notangle -t4 -Rcavesurvey.schema $(CAVESURVEY_NOWEB_SOURCE) | cpif CaveSurvey.xsd

# Create examples
examples:
	@echo 'Creating example XML files ...'
	notangle -t4 -Rmysurvey.xml $(CAVESURVEY_NOWEB_SOURCE) | cpif mysurvey.xml
#	notangle -t4 -Rmymap.xml   $(CAVEMAP_NOWEB_SOURCE) | cpif mymap.xml

# Create documentation
dvi:
	noweave -t4 -delay -index $(CAVESURVEY_NOWEB_SOURCE) >| CaveSurvey.tex
	@echo 'Running "latex CaveSurvey.tex" ...'
	latex CaveSurvey.tex
	@echo 'You may need to run latex again.'
	@echo 'latex CaveSurvey.tex'

ps: dvi
	dvips CaveSurvey.dvi -o

html:
	noweave -html -filter l2h -index $(CAVESURVEY_NOWEB_SOURCE)\
		| htmltoc >| CaveSurvey.html
#	noweave -html -filter l2h -index $(CAVEMAP_NOWEB_SOURCE)\
#		| htmltoc >| CaveMap.html

all: dtd schema ps html

clean:
#	lintex
	rm -f *.aux
	rm -f *.dvi
	rm -f *.lof
	rm -f *.log
	rm -f *.lot
	rm -f *.toc



Complete CaveSurvey DTD

CaveSurvey.dtd: CaveSurvey.dtd

Complete CaveSurvey Schema

This schema was originally created by Martin Laverty based on the DTD defined in this document and since extended by Michael Lake. CaveSurvey.xsd: CaveSurvey.xsd

Example CaveSurvey XML File


Accuracy is specified in asf5.xml

mysurvey.xml mysurvey.xml

Useful References

Hierarchical Tagged Objects (HTO) Portable Data Format Specification, Version 1.4, Doug Dotson, December 15, 1994

Inside XML DTDs, St Laurent and Biggar, McGraw-Hill, 1999

Literate Programming Using Noweb, Linux Journal, Issue 42, October 1997, p64-69

The site "" can generate a DTD for a supplied XML document.



[1] Email: Subject: Re: text vs. comment and data format, Date: Sun, 04 Feb 2001, From: Devin Kouts

[2] Email: Subject: Re: text vs. comment and data format, Date: Sun, 4 Feb 2001, From: Martin Laverty

[3] Email: Subject: Re: text vs. comment and data format, Date: Sun, 04 Feb 2001, From: Michael Lake

Defined Chunks


The following is a list of all the code chunks defined in this document. References are interpreted as in the following example:
(cavesurvey.dtd 32a) 32a 32b: The code chunk cavesurvey.dtd was defined on page 32; the a means it was the first chunk defined on that page. The 32a and 32b are the places which reference the chunk.


This is a list of all defined variables which are declared using %def.