This version:
This Document: | 0.14 | 17 Feb 2001 |
The DTD: | 0.13 | 17 Feb 2001 |
The Schema: | 0.12 | 17 Feb 2001 |
Editor: Michael Lake (mikel AATT speleonics.com.au)
Contributing Authors:
Michael Lake (mikel AATT speleonics.com.au)
Martin Laverty (martinl AT talk21.com)
John Halleck (jhalleck AT netcom.utah.edu)
This document can be found at: http://www.speleonics.com.au/cavescript/
This document describes a draft Document Type Definition and schema for XML based cave survey data.
There are two DTDs/Schemas currently being developed:
This document describes the CaveScript CaveSurveyXML being developed by
Michael Lake. The DTD and schemas described here may differ from those being
developed by the
International Union of Speleology, Informatics Commission's XML Working Group at:
http://www.karto.ethz.ch/neumann/caving/cavexml/
The CaveSurvey DTD/schema defines the XML structure that describes cave survey data: the instruments, the surveyors, the stations and their descriptions, the shots or legs and their data, information or hints as to how the data should be processed, and anything else that pertains to the actual cave survey. One might describe these as the abstract, non-natural aspects of a cave survey.
The CaveSurvey XML does not extend to describing the objects or contents of a cave that appear on the cave map. These are the things that you see in a cave, both natural and artificial, such as the walls, floor and roof features, speleothems, clastic deposits, biota and geology. Although this information would be recorded in the cave surveyor's notes, it belongs to the CaveMap file, which will have a separate DTD/schema to describe it.
Note: Increment the version number of this document at the top (above the table of contents).
The version number of this document must always be equal to or greater than the version number of the DTD or schema described. This document may have a version number greater than the DTD or schema if changes are made to this document which do not affect the DTD or schema. However changes to the DTD or schema must result in the version number of this document being incremented.
The DTD version number is separate from the schema version number. At present both are being developed in parallel; however, development of the DTD will eventually decrease as schemas become the norm, and at some stage the DTD will not be developed further.
Date | Doc Version | Change |
2001 | ||
17 Feb 2001: | 0.14 | * DTD change: encoding from US-ASCII to UTF-8 on suggestion of John Halleck |
* DTD change: moved comments to after <?xml> declaration. The declaration must be the first line (brought to my notice by John Halleck) | ||
* DTD change: in the shot element changed dist, azim and elev attribute type from required to implied. | ||
* Separated version number of DTD from Schema. I originally had the version numbers being the same and incremented one whenever I incremented the other but even now they have different functionality and eventually the DTD will be phased out as schemas become the norm. | ||
26 Jan 2001 | 0.13 | * added an 'Abstract' and a 'Status of this Document' the latter to make clear the relationship between this work and that of the UISIC XML Working Group's CaveXML |
* DTD change and schema change | ||
20 Jan 2001 | 0.12 | Merged Schema from Martin Laverty into this documentation. This will help to keep the two in sync. |
2000 | ||
01 Jun 2000 | 0.11 | DTD change |
01 Jan 2000 | 0.1 | First draft version published |
<cavesurvey.dtd version>= (U->) <?xml version="1.0" encoding="UTF-8"?> <!-- Version number of this CaveSurvey DTD --> <!ENTITY % version "0.13">
DTD Version Change History:
Date | DTD Version | Change |
2001 | ||
17 Feb 2001 | 0.13 | * Changed encoding from US-ASCII to UTF-8 on suggestion of John Halleck |
* Moved comments to after <?xml> declaration. The declaration must be the first line (brought to my notice by John Halleck) | ||
* In the shot element changed dist, azim and elev attribute type from required to implied. This is because if surveying by triangulation the shots don't have distances, and if doing trilateration the shots are complete with just tape measurements. | ||
26 Jan 2001 | 0.12 | * changed my DTD date element name to dates |
* changed my dElementName to errorElementName to fit in
with Martin Laverty's schema | ||
* added a general comment element | ||
2000 | ||
01 Jun 2000 | 0.11 | Corrected XML version in declaration |
01 Jan 2000 | 0.1 | First draft version published |
Similarly for the Schema we have:
<cavesurvey.schema version>= (U->) <?xml version="1.0" encoding="UTF-8"?> <!-- Version number of this CaveSurvey Schema --> <!-- version "0.12" -->
Date | Schema Version | Change |
2001 | ||
17 Feb 2001 | 0.12 | * In the shot element changed dist, azim and elev attribute type from required to implied. This is because if surveying by triangulation the shots don't have distances, and if doing trilateration the shots are complete with just tape measurements. |
26 Jan 2001 | 0.11 | * changed my DTD date element name to dates |
* changed my dElementName to errorElementName to fit in
with Martin Laverty's schema | ||
* added schema element and namespace at start | ||
* added a general comment element | ||
2000 | ||
01 Jan 2000 | 0.1 | First draft version published |
This section is for quick notes on ideas for new tags or attributes.
Example:
<SERIES>
  <INCLUDE src="file://tattered_survey.xml#root()" />
  <INCLUDE src="file://tattered_survey.xml#root() .child(all, series, name, 'sigma')" />
</SERIES>
<coord type="UTM" zone="56" east="224000" north="6255000" elev="838" />
<coord type="geo" zone="56">
  <lat deg="32" min="48" sec="27.2" hemi="S" />
  <long deg="150" min="01" sec="11.5" dir="E" />
  <elev value="838" />
</coord>
Previous Questions and their followup.
Some comments are inserted at the start of the DTD and Schema files so that I'll know where these files came from.
<cavesurvey.dtd preface>= (U->) <!-- DTD for the CaveScript CaveSurvey XML --> <!-- Author: Michael Lake, mikel AATT speleonics.com.au --> <!-- This file is generated from noweb source file CaveSurvey.nw -->
<cavesurvey.schema preface>= (U->) <!-- XSD for the CaveScript CaveSurvey XML --> <!-- by: Martin Laverty, martinl@talk21.com --> <!-- after DTD by: Michael Lake, Mike.Lake@uts.edu.au --> <!-- notation name="svx" system="survex.exe" /--> <schema xmlns='http://www.w3.org/1999/XMLSchema' version='0.12'>
The schema must have a closing element tag.
<cavesurvey.schema end>= (U->) </schema>
Parameter entity references appear in DTDs and are replaced by their entity definitions in the DTD. All parameter entity references begin with a percent sign which means they cannot be used in an XML document---only the DTD in which they are defined. (If you want entities that can substitute for other characters inside an XML document then refer to `General Entities'.)
Parameter entity references are used to make it easier for sets of elements to share common attributes and, in the case of the %if and other entities, to provide the XML document some control over the DTD.
The parameter entity references are at the start because they must be declared before they are used.
The declaration for a parameter entity uses the following format:
<!ENTITY % name "replacement_characters" >
<cavesurvey.dtd>= [D->]
<cavesurvey.dtd version>
<cavesurvey.dtd preface>
<!-- ============ Parameter Entities ============ -->
<!-- Date models for use later -->
<!ENTITY % year "YEAR CDATA #IMPLIED" >
<!ENTITY % month "MONTH CDATA #IMPLIED" >
<!ENTITY % day "DAY CDATA #IMPLIED" >
<!-- These values can be set within the XML docs to select appropriate -->
<!-- STN attributes for normal/diving/topofil surveys. -->
<!-- Set to either INCLUDE or IGNORE. -->
<!ENTITY % ifdiving "IGNORE">
<!ENTITY % iftopofil "IGNORE">
Defines%day
,%ifdiving
,%iftopofil
,%month
,%year
(links are to index).
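As a rough illustration of how parameter entity references are replaced by their definitions, the following Python sketch expands the date entities with a naive regular-expression pass. The function name `expand_parameter_entities` is hypothetical, and the sketch makes a single pass over double-quoted declarations only; a real XML parser also handles single-quoted values, nested references and conditional sections.

```python
import re

# Declarations copied from the date-model chunk, plus one ATTLIST that uses them.
DTD_TEXT = '''
<!ENTITY % year "YEAR CDATA #IMPLIED" >
<!ENTITY % month "MONTH CDATA #IMPLIED" >
<!ENTITY % day "DAY CDATA #IMPLIED" >
<!ATTLIST SURVEYDATE %year; %month; %day; >
'''

ENTITY_DECL = re.compile(r'<!ENTITY\s+%\s+(\w+)\s+"([^"]*)"\s*>')
ENTITY_REF = re.compile(r'%(\w+);')

def expand_parameter_entities(dtd_text):
    """Replace each %name; reference with the text declared for name."""
    entities = dict(ENTITY_DECL.findall(dtd_text))
    # Drop the declarations themselves, then substitute the references.
    body = ENTITY_DECL.sub('', dtd_text)
    return ENTITY_REF.sub(lambda m: entities[m.group(1)], body).strip()

expanded = expand_parameter_entities(DTD_TEXT)
print(expanded)
```

The printed declaration is what the DTD processor effectively sees for SURVEYDATE once the three references have been substituted.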
In the following chunk I have defined parameters %zero; and %one; because the quotes surrounding the 0.0 or 1.0 cannot be used within the quotes bounding the parameter being defined. Their values carry single quotes so that the expanded attribute default is still a quoted value, as an ATTLIST requires.
This would be illegal: <!ENTITY % zero_correct "ZERO_CORRECT CDATA "0.0"" >
<cavesurvey.dtd>+= [<-D->]
<!-- These are for the Instruments section -->
<!ENTITY % instrument_id "ID ID #IMPLIED" >
<!ENTITY % used "USED IDREF #IMPLIED" >
<!ENTITY % zero "'0.0'" >
<!ENTITY % one "'1.0'" >
<!ENTITY % zero_correct "ZERO_CORRECT CDATA %zero;" >
<!ENTITY % scale_correct "SCALE_CORRECT CDATA %one;" >
<!ENTITY % accuracy "ACCURACY CDATA #IMPLIED" >
Defines%accuracy
,%instrument_id
,%one
,%scale_correct
,%used
,%zero
,%zero_correct
(links are to index).
Parameter entities don't appear in schemas.
Element structures for DTDs are declared using an element type
declaration with the following syntax:
<!ELEMENT elementName contentModel>
For the CaveSurvey DTD the root element name is declared by:
<cavesurvey.dtd>+= [<-D->]
<!-- ============ Root Element Name and Content ============ -->
<!ELEMENT CAVESURVEY ( HEAD, SURVEYORS?, INSTRUMENTS?, SERIES+ ) >
DefinesCAVESURVEY
(links are to index).
<cavesurvey.schema>= [D->]
<cavesurvey.schema version>
<cavesurvey.schema preface>
<!-- ============ Root Element Name and Content ============ -->
<element name="caveSurvey" type="caveSurvey" >
  <annotation>
    <appinfo></appinfo>
    <documentation><h1>CaveScript</h1></documentation>
  </annotation>
</element>
<complexType name="caveSurvey">
  <element name="head" type="head" maxOccurs="1" />
  <element name="surveyors" minOccurs="0" maxOccurs="unlimited" />
  <element name="instruments" minOccurs="0" maxOccurs="unlimited" />
  <element name="surveySeries" maxOccurs="unlimited" />
  <element ref="comment" type="string" minOccurs="0" maxOccurs="unlimited" />
</complexType>
<simpleType name="comment"></simpleType>
Definescavesurvey
(links are to index).
TODO: check on syntax for comment in schema above.
This defines a content model with four elements: HEAD, SURVEYORS, INSTRUMENTS and SERIES.
This specifies exactly what elements the root element can contain.
CAVESURVEY must contain one and only one HEAD (DTD: there is no following indicator, Schema: maxOccurs="1").
CAVESURVEY may contain SURVEYORS and/or INSTRUMENTS but if they are used they can appear only once (DTD: a ? indicator follows, Schema: ).
SERIES must occur at least once but multiple consecutive occurrences can occur (the + indicator).
#PCDATA
.
The root element name must be the same as the name of the root element for
the document in which the declaration appears. That means that, as the root
element name is CAVESURVEY, this name must appear as the root element in a
CaveScript CaveSurvey XML document. See the example CaveSurvey XML file below.
<mysurvey.xml>=
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE CAVESURVEY SYSTEM "CaveSurvey.dtd">
<CAVESURVEY>
<mysurvey.xml head>
<mysurvey.xml surveyors>
<mysurvey.xml instruments>
<mysurvey.xml series>
</CAVESURVEY>
All XML documents start with the XML declaration, which tells the processing application that this is an XML document, the version of XML being used, what character encoding is used and whether the document may need to refer to external resources for parsing. The declaration must be in lower case. Note that the version 1.0 is that of the W3C XML specification, not the version of CaveScript.
The declaration for CaveScript XML documents is:
<?xml version="1.0" encoding="UTF-8"?>
Documents that use DTDs must be prefaced by a document type declaration, commonly referred to as the DOCTYPE declaration. The document type declaration must come between the XML declaration and the root element of the document, and only one such declaration may appear in a document.
The syntax for the DOCTYPE declaration is:
<!DOCTYPE rootElementName [ ...declarations... ]>
CaveScript XML files do not need to specify if they are or are not standalone and so the standalone declaration will not be used.
Finally the document begins and ends with the root element name CAVESURVEY
.
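The root content model can also be checked in application code. The following Python sketch, using only the standard library, tests that a document's root is CAVESURVEY and that its children follow ( HEAD, SURVEYORS?, INSTRUMENTS?, SERIES+ ). The function name `root_matches_model` and the abbreviated sample document are assumptions for illustration; the sketch checks the root element only and does not validate against the full DTD.

```python
import re
import xml.etree.ElementTree as ET

# Abbreviated sample: child elements are left empty on purpose, because
# only the children of the root are examined here.
DOC = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE CAVESURVEY SYSTEM "CaveSurvey.dtd">
<CAVESURVEY>
  <HEAD/>
  <SURVEYORS/>
  <SERIES/>
</CAVESURVEY>"""

# Regular-expression form of ( HEAD, SURVEYORS?, INSTRUMENTS?, SERIES+ ),
# matched against the child tag names joined with trailing commas.
ROOT_MODEL = re.compile(r"HEAD,(SURVEYORS,)?(INSTRUMENTS,)?(SERIES,)+")

def root_matches_model(xml_bytes):
    """Return True if the root is CAVESURVEY and its children fit the model."""
    root = ET.fromstring(xml_bytes)
    if root.tag != "CAVESURVEY":
        return False
    children = "".join(child.tag + "," for child in root)
    return ROOT_MODEL.fullmatch(children) is not None

print(root_matches_model(DOC))
```

A validating parser would perform this check, and every other declaration in the DTD, automatically; the sketch is only meant to make the meaning of the ?, + and comma operators concrete.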
The HEAD
element provides identifying information on the cave.
The information here should be sufficient to
uniquely identify the area, the cave and the survey.
<cavesurvey.dtd>+= [<-D->]
<!-- ============ Area Cave and Date Information ============ -->
<!ELEMENT HEAD (AREA, CAVE, DATES?) >
DefinesHEAD
(links are to index).
<cavesurvey.schema>+= [<-D->] <!-- ============ Area Cave and Date Information ============ --> <complexType name="head" > <element name="area" type="area" maxOccurs="1" /> <element name="cave" type="cave" maxOccurs="1" /> <element name="dates" type="dates" minOccurs="0" maxOccurs="1" /> </complexType>
Defineshead
(links are to index).
The AREA
and CAVE
elements are required and must occur only once
within HEAD
. The DATES
element is optional but can appear only once.
Table [->] describes the elements and their attributes.
The declination attribute is an interesting one. It's a ``property'' of both the area and
the date of a survey and could appear within either an AREA
or a SERIES
(as the different series of a survey may have been done years apart).
Element | Attribute | Purpose |
AREA | NAME | The name of the region the cave is in eg. Jenolan Caves, NSW |
DECLINATION | The declination correction to be applied for that
area and the given survey date. This value will be used if there is
no value set in a subsequent SERIES element. | |
CAVE | NAME | The name of the cave eg. Spider Cave |
TAG | The tag number affixed to the cave entrance, eg. J174 | |
DATES | The DATES element is used as a container for other date elements. There are no attributes. |
<cavesurvey.dtd>+= [<-D->]
<!ELEMENT AREA EMPTY >
<!ATTLIST AREA
    NAME        CDATA #REQUIRED
    DECLINATION CDATA #IMPLIED >
<!ELEMENT CAVE EMPTY >
<!ATTLIST CAVE
    NAME CDATA #REQUIRED
    TAG  CDATA #IMPLIED >
DefinesAREA
,CAVE
(links are to index).
<cavesurvey.schema>+= [<-D->] <complexType name="area" > <attribute name="name" use="required" /> <attribute name="declination" type="double" /> </complexType> <complexType name="cave" > <attribute name="name" minOccurs="0" /> <attribute name="tag" use="required" /> </complexType>
Definesarea
,cave
(links are to index).
DTD attributes for elements are declared with an ATTLIST
. The general syntax is:
<!ATTLIST elementName attributeName attributeType defaultDeclaration [...more attributes...] >
The AREA and CAVE elements have a content model of EMPTY.
They do not contain any content themselves but carry information in their
attributes.
The attribute type CDATA
for NAME
means the attribute value can be
any string of legal XML characters.
The attribute default #REQUIRED
means that element instances must
explicitly provide a value for this attribute each time it is used.
The #IMPLIED
attribute default means elements can provide a value
for this attribute if the document author wishes.
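The difference between #REQUIRED and #IMPLIED can be illustrated with a small Python sketch that reports which required attributes an element fails to supply. The table `REQUIRED` and the function `missing_required` are hypothetical names; the attribute lists are taken from the AREA and CAVE declarations of the DTD.

```python
import xml.etree.ElementTree as ET

# #REQUIRED attributes from the AREA and CAVE ATTLIST declarations.
# DECLINATION and TAG are #IMPLIED, so they are not listed.
REQUIRED = {"AREA": ["NAME"], "CAVE": ["NAME"]}

def missing_required(elem):
    """Return the #REQUIRED attribute names that elem does not supply."""
    return [name for name in REQUIRED.get(elem.tag, []) if elem.get(name) is None]

ok = ET.fromstring('<CAVE NAME="Spider Cave" TAG="J174" />')
bad = ET.fromstring('<AREA DECLINATION="-11.0" />')

print(missing_required(ok))   # []
print(missing_required(bad))  # ['NAME']
```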
<cavesurvey.dtd>+= [<-D->]
<!ELEMENT DATES (SURVEYDATE?, CREATIONDATE?, MODIFICATIONDATE?, LASTACCESSDATE?) >
<!ELEMENT SURVEYDATE EMPTY>
<!ELEMENT CREATIONDATE EMPTY >
<!ELEMENT MODIFICATIONDATE EMPTY >
<!ELEMENT LASTACCESSDATE EMPTY >
<!ATTLIST SURVEYDATE %year; %month; %day; >
<!ATTLIST CREATIONDATE %year; %month; %day; >
<!ATTLIST MODIFICATIONDATE %year; %month; %day; >
<!ATTLIST LASTACCESSDATE %year; %month; %day; >
DefinesCREATIONDATE
,DATES
,LASTACCESSDATE
,MODIFICATIONDATE
,SURVEYDATE
(links are to index).
<cavesurvey.schema>+= [<-D->] <!-- change date to dates and add attribute type? --> <complexType name="dates" > <element name="surveyDate" type="date" minOccurs="0" maxOccurs="1" /> <element name="creationDate" type="date" minOccurs="0" maxOccurs="1" /> <element name="modificationDate" type="date" minOccurs="0" maxOccurs="1" /> <element name="lastAccessDate" type="date" minOccurs="0" maxOccurs="1" /> </complexType>
The DATES
element is used as a container for useful date values.
If the SERIES
does not provide a date to the parsing application then it
could use the SURVEYDATE
value as a default.
Element | Purpose |
SURVEYDATE | The date the survey was done. This value will be used if
there is no value set in a subsequent SERIES element. |
CREATIONDATE | The creation date of this document. |
MODIFICATIONDATE | The date that this document was last modified. |
LASTACCESSDATE | The date that this document was last read. |
Attribute | Purpose |
YEAR | The year as four digits. |
MONTH | The month as two digits. |
DAY | The day as two digits. |
The order and separator used to format the date can be looked after by a style sheet and the application parsing and displaying the XML file. That way the aussies, brits and yanks can have their own date formats.
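A displaying application might assemble the date from its components along the following lines. This Python sketch is only an illustration: the function name `format_date` and its parameters are assumptions, and it simply skips any component that the document does not supply.

```python
import xml.etree.ElementTree as ET

def format_date(elem, order=("YEAR", "MONTH", "DAY"), sep="-"):
    """Join the date attributes of elem in the given order, skipping missing ones."""
    parts = [elem.get(name) for name in order]
    return sep.join(p for p in parts if p is not None)

survey_date = ET.fromstring('<SURVEYDATE YEAR="1974" MONTH="09" DAY="12" />')
partial_date = ET.fromstring('<SURVEYDATE YEAR="1974" MONTH="09" />')

print(format_date(survey_date))                                 # 1974-09-12
print(format_date(survey_date, ("DAY", "MONTH", "YEAR"), "/"))  # 12/09/1974
print(format_date(partial_date, ("MONTH", "DAY", "YEAR"), "/")) # 09/1974
```

Because the components are stored separately, the same document yields day-first, month-first or year-first output without any change to the XML.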
Following the opening CAVESURVEY tag in the CaveSurvey XML file is the HEAD element.
<mysurvey.xml head>= (<-U)
<HEAD>
  <AREA NAME="WOMBEYAN CAVES, NSW" DECLINATION="-11.0" />
  <!-- The cave is located on the hillside. -->
  <CAVE NAME="Sigma Cave" TAG="W15" />
  <DATES>
    <SURVEYDATE YEAR="1974" MONTH="09" DAY="12" />
    <CREATIONDATE YEAR="1998" MONTH="01" DAY="10" />
    <MODIFICATIONDATE YEAR="1999" MONTH="06" DAY="22" />
  </DATES>
</HEAD>
There is an International Organization for Standardization standard for dates, ISO 8601; however there are still many countries that do not follow this standard. In addition it is not clear how to format the date when you only know a partial date eg. a year and month but not the day. Because of this the date is broken into its elemental components [cite xmlwg:Date1, xmlwg:Date2, xmlwg:Date3].
<cavesurvey.dtd>+= [<-D->]
<!-- ============ The Cave Surveyors ============ -->
<!ELEMENT SURVEYORS ( SURVEYOR+ ) >
<!ELEMENT SURVEYOR EMPTY >
<!ATTLIST SURVEYOR
    NAME                CDATA #REQUIRED
    NAME_CONTACT        CDATA #IMPLIED
    AFFILIATION         CDATA #IMPLIED
    AFFILIATION_CONTACT CDATA #IMPLIED >
DefinesSURVEYOR
,SURVEYORS
(links are to index).
<cavesurvey.schema>+= [<-D->] <!-- ============ The Cave Surveyors ============ --> <complexType name="surveyors" base="surveyor" derivedBy="extension" > <element name="surveyor" type="surveyor" minOccurs="1" maxOccurs="unlimited" /> </complexType> <complexType name="surveyor" > <attribute name="name" use="required" /> <attribute name="name_link" type="uri" /> <attribute name="affiliation" /> <attribute name="affiliation_link" type="uri" /> </complexType>
If used, the SURVEYORS element must contain one or more
SURVEYOR elements (DTD: hence the + sign).
Each surveyor element corresponds to one surveyor or organisation
represented on a survey. The surveyor element is empty ie. there is no
content between the element tags (hence the EMPTY
keyword).
The attributes of the element contain all the information about each
surveyor as described in Table [->].
The surveyor element must contain the name of a surveyor or organisation (hence the #REQUIRED) and may contain further information (#IMPLIED).
Attribute | Purpose |
NAME | Name of the surveyor |
NAME_CONTACT | A contact address, phone or email for that surveyor |
AFFILIATION | The affiliation of the surveyor eg. their caving club |
AFFILIATION_CONTACT | A contact address or phone for the organisation or club |
The SURVEYORS element can appear within the CAVESURVEY element tags or
within survey SERIES element tags (see Section [->]).
If there is a SURVEYORS
element within CAVESURVEY
element tags then it
shall provide default values for all the survey series within that
cave survey ie. all SERIES
elements within the CAVESURVEY
tags. This
requirement means that we don't need to repeat surveyor information in several
or more survey series - we can just include it once and override it for a
particular series if needed.
If there is a SURVEYORS
element within survey SERIES
element tags then
that information will override previous surveyor information.
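The default-and-override rule could be implemented by a parsing application roughly as follows. This Python sketch is an assumption about how such an application might behave, not part of CaveScript: `surveyors_for` is a hypothetical function, the sample document is abbreviated (no HEAD, stations or shots), and only top-level SERIES elements are handled.

```python
import xml.etree.ElementTree as ET

DOC = """<CAVESURVEY>
  <SURVEYORS><SURVEYOR NAME="Mike Lake" /></SURVEYORS>
  <SERIES NAME="entrance" />
  <SERIES NAME="streamway">
    <SURVEYORS><SURVEYOR NAME="Jill Rowling" /></SURVEYORS>
  </SERIES>
</CAVESURVEY>"""

def surveyors_for(series, root):
    """Use the series' own SURVEYORS if present, else the CAVESURVEY default."""
    own = series.find("SURVEYORS")
    source = own if own is not None else root.find("SURVEYORS")
    if source is None:
        return []
    return [s.get("NAME") for s in source.findall("SURVEYOR")]

root = ET.fromstring(DOC)
names = {s.get("NAME"): surveyors_for(s, root) for s in root.findall("SERIES")}
print(names)
```

Here the "entrance" series inherits the surveyor given at the CAVESURVEY level, while the "streamway" series overrides it with its own.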
Question: Should there be content in the SURVEYOR element or should it be empty?
<mysurvey.xml surveyors>= (<-U)
<SURVEYORS>
  <SURVEYOR NAME="Mike Lake" AFFILIATION="SUSS" />
  <SURVEYOR NAME="Jill Rowling" AFFILIATION="SUSS" />
</SURVEYORS>
<cavesurvey.dtd>+= [<-D->] <!-- ============ The Instruments Used ============ --> <!ELEMENT INSTRUMENTS ( TAPE?, COMPASS?, CLINOMETER?, TOPOFIL?, THEODOLITE?, DEPTHGAUGE?) > <!ELEMENT TAPE (#PCDATA) > <!ELEMENT COMPASS (#PCDATA) > <!ELEMENT CLINOMETER (#PCDATA) > <!ELEMENT TOPOFIL (#PCDATA) > <!ELEMENT THEODOLITE (#PCDATA) > <!ELEMENT DEPTHGAUGE (#PCDATA) > <!ATTLIST TAPE %instrument_id; %used; %zero_correct; %scale_correct; %accuracy; UNITS (METRES | FEET | YARDS ) "METRES" > <!ATTLIST COMPASS %instrument_id; %used; %zero_correct; %scale_correct; %accuracy; UNITS (DEG | DEGREES | GRADS | MILS | MINUTES) "DEGREES" > <!ATTLIST CLINOMETER %instrument_id; %used; %zero_correct; %scale_correct; %accuracy; UNITS (DEG | DEGREES | GRADS | MILS | PERCENT | PERCENTAGE) "DEGREES" > <!ATTLIST THEODOLITE %instrument_id; %used; %zero_correct; %scale_correct; %accuracy; > <!ATTLIST TOPOFIL %instrument_id; %used; %zero_correct; %scale_correct; %accuracy; UNITS (METRES | METERS | FEET | YARDS ) "METRES" > <!-- Depth Gauges are used in cave diving surveys --> <!ATTLIST DEPTHGAUGE %instrument_id; %used; %zero_correct; %scale_correct; %accuracy; UNITS (METRES | METERS | FEET | YARDS ) "METRES" >
DefinesCLINOMETER
,COMPASS
,DEPTHGAUGE
,INSTRUMENTS
,TAPE
,THEODOLITE
,TOPOFIL
(links are to index).
<cavesurvey.schema>+= [<-D->] <!-- ============ The Instruments Used ============ --> <complexType name="instruments" > <element name="instrument" type="instrument" abstract="true" minOccurs="0" maxOccurs="unlimited" /> </complexType> <complexType name="instrument" > <attribute name="instrument_id" type="" use="required" /> <attribute name="used" type="" use="required" /> <attribute name="zero_correct" type="" use="default" value="0.0" /> <attribute name="scale_correct" type="" use="default" value="1.0" /> <attribute name="accuracy" type="" /> </complexType> <element name="tape" equivClass="instrument" > <attribute name="units" type="" use="default" value="metres"> <enumeration value="metres" /> <enumeration value="feet" /> <enumeration value="yards" /> </attribute> </element> <element name="compass" equivClass="instrument" > <attribute name="units" type="" use="default" value="degrees" > <enumeration value="deg" /> <enumeration value="degrees" /> <enumeration value="grads" /> <enumeration value="mils" /> <enumeration value="minutes" /> </attribute> </element> <element name="clinometer" equivClass="instrument" > <attribute name="units" type="" use="default" value="degrees" > <enumeration value="deg" /> <enumeration value="degrees" /> <enumeration value="grads" /> <enumeration value="mils" /> <enumeration value="percent" /> </attribute > </element> <element name="theodolite" equivClass="instrument" > <attribute name="units" type="" use="default" value="degrees" > <enumeration value="deg" /> <enumeration value="degrees" /> <enumeration value="grads" /> <enumeration value="mils" /> <enumeration value="minutes" /> <enumeration value="seconds" /> </attribute> </element> <element name="topofil" equivClass="instrument" > <attribute name="units" type="" use="default" value="metres" > <enumeration value="metres" /> <enumeration value="feet" /> <enumeration value="yards" /> </attribute> </element> <!-- Depth Gauges are used in cave diving surveys --> <element 
name="depthGauge" equivClass="instrument" > <attribute name="units" use="default" value="metres" > <enumeration value="metres" /> <enumeration value="feet" /> <enumeration value="yards" /> </attribute> </element>
The instruments are optional but can only appear once.
The individual instruments can
have text content (#PCDATA
) which can be used to hold further information
about them such as detailed descriptions.
Most instruments will share common attributes as shown in Table [->]. Note that the DTD declarations for these attributes make use of several parameter entities.
Attribute | Description |
ID | A unique identification for the instrument. It may be the serial number or some other string eg. ``T3'' for a tape |
ZERO_CORRECT | The length correction to be applied to this instrument to bring it back to zero |
SCALE_CORRECT | The scale multiplication correction to be applied to this instrument |
ACCURACY | Equivalent to the standard deviation expected from this type of instrument |
UNITS | The units of the instrument eg. ``metres'' |
DESCRIPTION | Further description of the instrument eg. ``Jims Stfford's tape'' |
If an attribute is given an attribute type of ID then its value must be unique among all attributes of type ID in the document. Attributes of type ID can never have fixed default values and only one attribute per element can be of type ID.
The parameter entity %instrument_id; expands to ``ID ID #IMPLIED''
and is a unique reference for that particular instrument. It is not required but
many caving clubs will identify their instruments, especially if some
instruments require zero or scale corrections applied. The ID can be any
string of XML name characters but it must start with a letter or an underscore,
not a digit. See below for a discussion of the problem that this XML
restriction creates.
To refer to this instrument in an XML document we can use an IDREF
attribute. These attribute values must match the value of an ID
attribute for an element in the same XML document. The parameter entity
%used; expands to ``USED IDREF #IMPLIED''.
Example: If we had:
<TAPE ID="SUSS10">Fibreglass tape with 6cm missing off end.</TAPE>
Later in the document we could specify that that instrument was used for a particular series by writing:
<TAPE USED="SUSS10" />
We should not have to add any information again about the zero correction factors, accuracy or description.
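An application reading the file would need to follow each USED reference back to the element that carries the matching ID. A minimal Python sketch of that lookup is shown here; `resolve_used` is a hypothetical function and the sample document is abbreviated to the instrument elements only.

```python
import xml.etree.ElementTree as ET

DOC = """<CAVESURVEY>
  <INSTRUMENTS>
    <TAPE ID="SUSS10">Fibreglass tape with 6cm missing off end.</TAPE>
  </INSTRUMENTS>
  <SERIES NAME="entrance">
    <INSTRUMENTS><TAPE USED="SUSS10" /></INSTRUMENTS>
  </SERIES>
</CAVESURVEY>"""

def resolve_used(root):
    """Map each USED value to the element whose ID attribute it names."""
    by_id = {e.get("ID"): e for e in root.iter() if e.get("ID") is not None}
    resolved = {}
    for elem in root.iter():
        ref = elem.get("USED")
        if ref is None:
            continue
        if ref not in by_id:
            # A validating parser would report this as an IDREF error.
            raise ValueError("USED refers to unknown ID: " + ref)
        resolved[ref] = by_id[ref]
    return resolved

root = ET.fromstring(DOC)
tape = resolve_used(root)["SUSS10"]
print(tape.text)
```

The description and any corrections are then read from the resolved element rather than repeated in each series.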
Problem 1: If a Suunto Twin is used the compass and clino are one unit and any identification will pertain to both ``instruments''; however, the compass and clino IDs cannot be the same in one XML document. Options for resolving this are:
* a COMP_CLINO combined element with a single ID
Problem 2: The XML specification states that the value of an ID attribute must begin with a letter or underscore but otherwise can be composed of letters, digits, hyphens, underscores and the full stop character. What if the ID of an instrument, inscribed onto it, starts with a digit? This would not be an uncommon situation. One way would be to prefix all instruments that have numeric IDs with the club name, like SUSS6.
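The club-name prefix idea can be sketched in Python as follows. The function `make_xml_id` and the pattern `VALID_ID` are hypothetical, and the pattern is an ASCII-only simplification of the XML name rule, which in full also admits many non-ASCII letters and the colon.

```python
import re

# ASCII-only approximation: a letter or underscore first, then letters,
# digits, full stops, underscores or hyphens.
VALID_ID = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")

def make_xml_id(inscribed, club):
    """Return inscribed unchanged if usable as an ID, else prefix the club name."""
    if VALID_ID.fullmatch(inscribed):
        return inscribed
    candidate = club + inscribed
    if not VALID_ID.fullmatch(candidate):
        raise ValueError("cannot form a valid ID from " + repr(inscribed))
    return candidate

print(make_xml_id("6", "SUSS"))   # SUSS6
print(make_xml_id("T3", "SUSS"))  # T3
```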
Value = ( Reading - ZeroError ) * Scale
Be careful about the sign of the zero error. As in Survex it is the amount needed to correct the reading you got to zero. If the tape measure has the end missing, and you are using the 30cm mark to take all measurements from, then correct it with:
<INSTRUMENTS> <TAPE ZERO_CORRECT="+0.3">30m fibreglass with end missing </TAPE> </INSTRUMENTS>
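The correction formula is small enough to show directly. In this Python sketch the function name `corrected_value` is an assumption; the defaults of 0.0 and 1.0 follow the attribute defaults declared in the DTD, and the sample reading of 10.3 is invented for the tape that is read from its 30cm mark.

```python
import math

def corrected_value(reading, zero_error=0.0, scale=1.0):
    """Apply Value = ( Reading - ZeroError ) * Scale."""
    return (reading - zero_error) * scale

# A tape read from its 30cm mark shows 10.3 for a true length of 10.0.
value = corrected_value(10.3, zero_error=0.3)
print(round(value, 3))
```

With no corrections supplied the reading passes through unchanged, which is why 0.0 and 1.0 are sensible defaults.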
Bearings in degrees, minutes and seconds are not implemented nor are units such as grads or mils. [360 degrees = 400 grads (also known as gons); mils are a separate unit, with 6400 mils to the circle in the NATO definition] Does anyone need these?
Descriptive information about the instruments is currently included as element content like so:
<INSTRUMENTS> <TAPE ZERO_CORRECT="+0.3">30m fibreglass with end missing </TAPE> </INSTRUMENTS>
However it could be placed into a DESC
description attribute defined in
the DTD like this:
<!ATTLIST TAPE DESC CDATA #IMPLIED >
and the XML document would then look like:
<INSTRUMENTS> <TAPE ZERO_CORRECT="+0.3" DESC="30m fibreglass with end missing"></TAPE> </INSTRUMENTS>
Questions
I really don't require any set order here but the commas force a sequence.
<!ELEMENT INSTRUMENTS ( TAPE?, COMPASS?, CLINOMETER?,
TOPOFIL?, THEODOLITE?, DEPTHGAUGE?) >
<mysurvey.xml instruments>= (<-U)
<INSTRUMENTS>
  <TAPE ID="SUSS10" >30m fibreglass</TAPE>
  <TAPE ID="SUSS11" ZERO_CORRECT="+0.1" >30m fibreglass</TAPE>
  <TAPE ID="JILL">6m steel</TAPE>
  <COMPASS ID="SUSS1">Suunto Twin</COMPASS>
  <CLINOMETER ID="SUSS2"/>
</INSTRUMENTS>
This element contains the survey or traverse data. You can have one series for each survey trip into a cave or organise it so that each series is a particular section of the cave or some other system again.
<cavesurvey.dtd>+= [<-D->]
<!-- ============ The Survey Data ============ -->
<!ELEMENT SERIES ( (SURVEYORS?, INSTRUMENTS?)?, (STN+, SHOT+)+, XSECT*, SERIES* ) >
<!ATTLIST SERIES
    NAME        CDATA #REQUIRED
    DATE        CDATA #IMPLIED
    DECLINATION CDATA #IMPLIED >
DefinesSERIES
(links are to index).
<cavesurvey.schema>+= [<-D->] <!-- ============ The Survey Data ============ --> <complexType name="surveySeries" > <sequence> <sequence> <element name="surveyors" type="surveyors" minOccurs="0" maxOccurs="unlimited" /> <element name="instruments" type="instruments" minOccurs="0" maxOccurs="unlimited" /> <element name="status" type="status" minOccurs="0" maxOccurs="unlimited" /> </sequence> <sequence> <element name="station" type="station" maxOccurs="unlimited" /> <element name="shot" type="shot" maxOccurs="unlimited" /> </sequence> <element name="x_sect" type="x_sect" minOccurs="0" maxOccurs="unlimited" /> <element name="surveySeries" type="surveySeries" minOccurs="0" maxOccurs="unlimited" /> </sequence> <attribute name="description" /> <attribute name="date" type="date" minOccurs="0" /> <attribute name="declination" type="double" minOccurs="0" /> </complexType> <simpleType name="status" > <enumeration value="raw data" /> <enumeration value="calibrated data" /> <enumeration value="fixed data" /> <enumeration value="preliminary" /> <enumeration value="fully processed" /> </simpleType>
The series element encloses survey data in the same way that Survex's *begin and *end does. A series can specify the surveyors for that particular series and its own instrument settings. As can be seen from the DTD these included elements are optional but if they are used they must appear first---before any survey information---and they can only occur once.
The station and shot information comes second. There must be at least one instance of a station/shot group. Normal compass and tape data consists of at least two stations and one shot. Unfortunately the occurrence indicators of a DTD (?, * and +) cannot state directly that at least two of something are required.
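A rule such as "at least two stations and one shot" can instead be checked by the application after parsing. The following Python sketch shows one way to do so; `series_problems` is a hypothetical function and it looks only at the direct STN and SHOT children of a single SERIES.

```python
import xml.etree.ElementTree as ET

def series_problems(series):
    """Return a list of complaints about the station and shot counts."""
    problems = []
    if len(series.findall("STN")) < 2:
        problems.append("fewer than two stations")
    if len(series.findall("SHOT")) < 1:
        problems.append("no shots")
    return problems

good = ET.fromstring(
    '<SERIES NAME="a"><STN NAME="1"/><STN NAME="2"/>'
    '<SHOT FROM="1" TO="2" DIST="4.5"/></SERIES>')
bad = ET.fromstring('<SERIES NAME="b"><STN NAME="1"/></SERIES>')

print(series_problems(good))  # []
print(series_problems(bad))   # ['fewer than two stations', 'no shots']
```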
Note that the element SERIES
can also contain further SERIES
elements. This
is how CaveScript CaveSurvey files implement hierarchical station
naming like Survex.
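Hierarchical names can be derived by walking the nested SERIES elements. This Python sketch joins series and station names with a full stop, in the manner of Survex; the function `qualified_stations` and the sample data are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

DOC = """<SERIES NAME="sigma">
  <STN NAME="1" /><STN NAME="2" />
  <SHOT FROM="1" TO="2" DIST="5.0" />
  <SERIES NAME="upper">
    <STN NAME="1" /><STN NAME="2" />
    <SHOT FROM="1" TO="2" DIST="3.2" />
  </SERIES>
</SERIES>"""

def qualified_stations(series, prefix=""):
    """Return station names prefixed by the names of all enclosing series."""
    path = prefix + series.get("NAME")
    names = [path + "." + stn.get("NAME") for stn in series.findall("STN")]
    for child in series.findall("SERIES"):
        names.extend(qualified_stations(child, path + "."))
    return names

stations = qualified_stations(ET.fromstring(DOC))
print(stations)
```

Station ``1'' appears in both series, yet the qualified names sigma.1 and sigma.upper.1 remain distinct.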
All survey series must contain at least one station though usually a series would contain many stations and the shots between them.
<cavesurvey.dtd>+= [<-D->]
<!ELEMENT STN (#PCDATA)>
<!ATTLIST STN
    NAME    CDATA #REQUIRED
    EAST    CDATA #IMPLIED
    NORTH   CDATA #IMPLIED
    HEIGHT  CDATA #IMPLIED
    ERROR_E CDATA #IMPLIED
    ERROR_N CDATA #IMPLIED
    ERROR_H CDATA #IMPLIED >
<cavesurvey.schema>+= [<-D->] <complexType name="station" > <element name="name" type="string" maxOccurs="1" /> <element name="east" type="double" maxOccurs="1" /> <element name="north" type="double" maxOccurs="1" /> <element name="vertical" type="double" maxOccurs="1" /> <!-- altitude, depth, rel, abs --> <element name="error_east" type="double" minOccurs="0" maxOccurs="1" /> <element name="error_north" type="double" minOccurs="0" maxOccurs="1" /> <element name="error_vertical" type="double" minOccurs="0" maxOccurs="1" /> </complexType>
<cavesurvey.schema>+= [<-D->] <complexType name="grade" > <attribute name="type" > <enumeration value="ASF" /> <enumeration value="BCRA" /> <enumeration value="CRG" /> </attribute> <attribute name="level_line" > <enumeration value="2" /> <enumeration value="4" /> <enumeration value="5" /> </attribute> <attribute name="level_detail" > <enumeration value="a" /> <enumeration value="b" /> <enumeration value="c" /> <enumeration value="d" /> </attribute> </complexType>
Aside: The station NAME is of type CDATA whereas it would have been
nicer to have it of type ID. There are two problems with this:
* IDs in XML must start with a letter or underscore, so a station with an ID="10" is illegal; we must have something like ID="A10".
* An ID value must be unique among all attributes of type ID within the document. Yet in Survex data we can have two or more stations of name ``1'' if they are in a different SERIES.
Note: To satisfy the latter requirement
perhaps a prefix defined as a parameter entity could be added to the DTD.
<!ENTITY % prefix "A">
<!ENTITY % affix "Z">
A fixed station determined via RDF or theodolite would have EAST, NORTH and HEIGHT values set. A constrained station might only have EAST and NORTH specified.
The attributes ERROR_E, ERROR_N and ERROR_H are the errors or standard deviations in the Easting, Northing and Height.
The station description is provided via the parsed character data within the element.
Example: <STN NAME="87">Cusp of rock in narrow passage.</STN>
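An application reading STN elements might sort stations by which coordinates are present. This is a sketch only: the function name and the label "free" for a station with no coordinates are my own, and I assume that any partial set of coordinates counts as constrained.

```python
import xml.etree.ElementTree as ET


def classify_station(stn):
    """Return 'fixed', 'constrained' or 'free' for an STN element."""
    present = [stn.get(attr) is not None for attr in ("EAST", "NORTH", "HEIGHT")]
    if all(present):
        return "fixed"        # EAST, NORTH and HEIGHT all given
    if any(present):
        return "constrained"  # only some coordinates given (assumption)
    return "free"             # position comes from the shots alone


fixed = ET.fromstring(
    '<STN NAME="90" EAST="0.5" NORTH="2.0" HEIGHT="0.0">Fixed by RDF location.</STN>'
)
plain = ET.fromstring('<STN NAME="87">Cusp of rock in narrow passage.</STN>')
print(classify_station(fixed), classify_station(plain))
```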
Question: What about control points specified as Long and Lat?
<cavesurvey.dtd>+= [<-D->]
<!ELEMENT SHOT (#PCDATA)>
<!-- AZIM is a number, or "-" for a vertical shot; ELEV is a number, UP or DOWN.
     A DTD cannot mix CDATA with enumerated tokens, so both are declared CDATA
     and the application must check the values. -->
<!ATTLIST SHOT
    FROM       CDATA #REQUIRED
    TO         CDATA #REQUIRED
    DIST       CDATA #IMPLIED
    AZIM       CDATA #IMPLIED
    ELEV       CDATA #IMPLIED
    ERROR_DIST CDATA #IMPLIED
    ERROR_AZIM CDATA #IMPLIED
    ERROR_ELEV CDATA #IMPLIED
>
<![%ifdiving; [
<!ATTLIST SHOT
    FROMDEPTH   CDATA #IMPLIED
    TODEPTH     CDATA #IMPLIED
    ERROR_DEPTH CDATA #IMPLIED
>
]]>
Defines SHOT (links are to index).
<cavesurvey.schema>+= [<-D->]
<element name="shot" type="shot" />
<complexType name="shot" abstract="true" >
  <attribute name="from" type="station.name" use="required" />
  <attribute name="to" type="station.name" use="required" />
  <attribute name="dist" type="double" minInclusive="0"/>
  <attribute name="azim" type="double" minInclusive="0" maxInclusive="360" />
  <attribute name="error_dist" type="double" minInclusive="0" />
  <attribute name="error_azim" type="double" minInclusive="0" />
  <choice >
    <element name="standardLeg" />
    <element name="divingLeg" />
  </choice>
</complexType>
<element name="standardLeg" equivClass="shot" >
  <attribute name="incl" type="double" use="required" minInclusive="-90" maxInclusive="90" />
  <attribute name="error_i" type="double" minInclusive="0" />
</element>
<element name="divingLeg" equivClass="shot" >
  <attribute name="fromDepth" type="double" use="required" />
  <attribute name="toDepth" type="double" use="required" />
  <attribute name="error_v" type="double" minInclusive="0" />
</element>
Shots can contain parsed character data which will usually be any comments about that shot.
Example:
<SHOT FROM="88" TO="88a" DIST="7.18" AZIM="10" ELEV="+44.5" > This shot was a bit difficult.</SHOT>
Only FROM and TO are declared #REQUIRED in the DTD above; DIST, AZIM and ELEV are #IMPLIED, although a normal shot needs all five of these attributes.
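To show how an application might use DIST, AZIM and ELEV, here is a sketch that turns one SHOT into east, north and up offsets. The function name is hypothetical, treating UP and DOWN as +90 and -90 degrees and "-" as an azimuth of 0 are my reading of the draft DTD, and DECLINATION is ignored.

```python
import math
import xml.etree.ElementTree as ET


def shot_vector(shot):
    """Return (east, north, up) offsets in the units of DIST for one SHOT."""
    dist = float(shot.get("DIST"))  # assumes DIST is present

    elev_raw = shot.get("ELEV")
    if elev_raw == "UP":
        elev = 90.0
    elif elev_raw == "DOWN":
        elev = -90.0
    else:
        elev = float(elev_raw)  # float() accepts a leading "+"

    azim_raw = shot.get("AZIM")
    # A vertical shot has no meaningful azimuth; 0 is used as a placeholder.
    azim = 0.0 if azim_raw in (None, "-") else float(azim_raw)

    horizontal = dist * math.cos(math.radians(elev))
    east = horizontal * math.sin(math.radians(azim))
    north = horizontal * math.cos(math.radians(azim))
    up = dist * math.sin(math.radians(elev))
    return east, north, up


shot = ET.fromstring('<SHOT FROM="88" TO="89" DIST="3.71" AZIM="180" ELEV="-3" />')
east, north, up = shot_vector(shot)
print(round(east, 3), round(north, 3), round(up, 3))
```

A shot heading due south and slightly downhill gives a negative north offset and a small negative up offset, as expected.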
Note: If the station names could have been of type ID then the FROM and TO attribute types could be IDREF, which means that the value must match the value of an ID attribute for an element in the document, in this case a STN element.
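Since the names are CDATA, a check like the one IDREF would give has to be done by the application. The following sketch (the function name is mine) reports station names used by the shots of a series that have no STN element in that same series.

```python
import xml.etree.ElementTree as ET


def undefined_stations(series):
    """Return the set of FROM/TO names with no matching STN in this series."""
    # findall() with a bare tag name looks at direct children only,
    # so nested series are checked separately.
    defined = {stn.get("NAME") for stn in series.findall("STN")}
    used = set()
    for shot in series.findall("SHOT"):
        used.update((shot.get("FROM"), shot.get("TO")))
    return used - defined


series = ET.fromstring(
    '<SERIES NAME="sigma">'
    '<STN NAME="85">a</STN><STN NAME="86">b</STN>'
    '<SHOT FROM="86" TO="85" DIST="5.42" AZIM="328" ELEV="+43" />'
    '<SHOT FROM="87" TO="86" DIST="2.16" AZIM="0.0" ELEV="+22" />'
    '</SERIES>'
)
print(undefined_stations(series))  # {'87'}
```

Whether an undescribed station should be an error or just a warning is a design decision this sketch does not make.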
Some surveyors may prefer alternative terms to DIST, AZIM and ELEV such as those in the following table:
Name | Attribute Name | Alternative Names |
Distance | DIST | LENGTH |
Azimuth | AZIM | COMP, BEARING |
Elevation | ELEV | CLINO, INCLINE, GRADIENT |
Note that angles can be expressed as an azimuth, a bearing or less commonly as mils or grads. Azimuthal readings go from 0 to 360° whereas bearings only range from 0 to 90° because they are the angle within one quadrant of the full circle. An example would be South 30 degrees East, written S-30°-E.
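A converter that accepts quadrant bearings would need to turn them into azimuths for the AZIM attribute. Here is a sketch, assuming the bearing is written as a letter, an angle and a letter with optional hyphens; the function name and the accepted spelling are my assumptions.

```python
import re

# N or S, an angle, then E or W; hyphens between the parts are optional.
_BEARING_RE = re.compile(r"^([NS])-?(\d+(?:\.\d+)?)-?([EW])$")


def bearing_to_azimuth(bearing):
    """Convert a quadrant bearing such as 'S-30-E' to an azimuth in degrees."""
    match = _BEARING_RE.match(bearing.strip().upper())
    if match is None:
        raise ValueError(f"not a quadrant bearing: {bearing!r}")
    ns, angle, ew = match.group(1), float(match.group(2)), match.group(3)
    if angle > 90.0:
        raise ValueError("a quadrant bearing cannot exceed 90 degrees")
    if ns == "N":
        # Measured from north, towards east or west.
        return angle if ew == "E" else (360.0 - angle) % 360.0
    # Measured from south, towards east or west.
    return 180.0 - angle if ew == "E" else 180.0 + angle


print(bearing_to_azimuth("S-30-E"))  # 150.0
```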
The ERROR_DIST, ERROR_AZIM and ERROR_ELEV attributes allow the surveyor to downgrade the accuracy of a shot compared to what is currently specified.
Question: How can we ensure that a higher accuracy than what's possible with the given instruments can't be given here?
Not supported yet is:
Normal/Spherical Polar | FromStn ToStn Dist Bearing Elev |
Diving | FromStn ToStn Dist Bearing FromDepth ToDepth |
Topofil | FromStn ToStn FromCount ToCount Bearing Elev |
Theodolite | AtStn IncludedAngle ElevBack ElevFore Dist |
The Left, Right, Up and Down (LRUD) information that is often collected by surveyors can be viewed as cross section information where only four data points are recorded. CaveScript uses a generic cross section element to store all cross sections.
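<cavesurvey.dtd>+= [<-D->]
<!ELEMENT XSECT (XDATA)+>
<!ELEMENT XDATA (#PCDATA) >
<!-- POS is a distance along the shot, or one of the words start or end.
     A DTD cannot mix free text with enumerated tokens, so it is declared
     CDATA and the application must check the value. -->
<!ATTLIST XDATA
    SHOT   CDATA #REQUIRED
    POS    CDATA #REQUIRED
    STRIKE CDATA #IMPLIED
    DIP    CDATA #IMPLIED
>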
Defines XDATA, XSECT (links are to index).
<cavesurvey.schema>+= [<-D->]
<element name="x_sect" type="x_sect" />
<complexType name="x_sect" >
  <element name="points" maxOccurs="unlimited" />
</complexType>
<complexType name="points" >
  <attribute name="shot" type="shot" use="required" />
  <attribute name="position" type="double" />
  <attribute name="strike" type="double" minInclusive="0" maxInclusive="360" />
  <attribute name="dip" type="double" minInclusive="-90" maxInclusive="90" />
</complexType>
Cross section data is stored as character content within XDATA elements as a sequence of measurements in the form r1 r2 r3 ... taken at equispaced angles theta, where 0 <= theta <= 360°. Attributes of the XDATA element specify the orientation of the cross section.
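As a sketch of how an application might read this content, the function below pairs each measurement with its angle. The function name is mine, and the angle is simply counted from the first reading in the order recorded; the text does not fix the zero direction, though the examples in this document list a four point section as left, up, right, down.

```python
def cross_section_points(text):
    """Return a list of (theta_degrees, r) pairs from XDATA character content."""
    radii = [float(token) for token in text.split()]
    if not radii:
        raise ValueError("XDATA contains no measurements")
    # Equispaced readings share the full circle between them.
    step = 360.0 / len(radii)
    return [(index * step, r) for index, r in enumerate(radii)]


print(cross_section_points(" 0.5 2.0 1.5 0.4 "))
# [(0.0, 0.5), (90.0, 2.0), (180.0, 1.5), (270.0, 0.4)]
```

With eight readings the same function gives a spacing of 45 degrees.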
Defaults: STRIKE="Bearing of the shot" and DIP="0.0"
Note: It would be nicer to have the SHOT attribute type as IDREFS, which means that there must be at least one value (though in this case there will be only two) and the values must match ID values in the document, in this case those of STN elements.
          2.0                           2.0
           ^                      0.8    ^    0.9
           |                             |
  <--0.5       1.5-->           <--0.5       1.5-->
           |                             |
           |                      0.9    |    1.0
          0.4                           0.4
    LURD Info only            8 measurements taken, equispaced
<XDATA SHOT="88 88a" POS="start"> at start of leg, i.e. at station 88
<XDATA SHOT="88 88a" POS="2.0"> at position 2.0 metres along leg
<XDATA SHOT="88 88a" POS="end"> at end of leg, i.e. at station 88a
The application needs to ensure that the value given for POS is less than or at most equal to the DIST of the shot.
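This check might look like the following sketch. The function name is hypothetical; it maps the words start and end to 0 and DIST and rejects any numeric position outside the shot.

```python
def resolve_pos(pos, dist):
    """Return the position along the shot as a distance, checking it against DIST."""
    if pos == "start":
        return 0.0
    if pos == "end":
        return dist
    value = float(pos)
    if not 0.0 <= value <= dist:
        raise ValueError(f"POS {value} lies outside the shot (DIST is {dist})")
    return value


print(resolve_pos("2.0", 7.18))  # 2.0
```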
<mysurvey.xml series>= (<-U)
<SERIES NAME="sigma">
  <STN NAME="85">Cusp (lower one) of rock at base of 1st drop.</STN>
  <STN NAME="86">Cusp of rock at apex or corner in passage.</STN>
  <STN NAME="87">Cusp of rock in narrow passage.</STN>
  <STN NAME="88">Cusp of rock 1m above stream bed.</STN>
  <STN NAME="89">Southerly-most end of ridge of rock at waist height.</STN>
  <STN NAME="90" EAST="0.5" NORTH="2.0" HEIGHT="0.0">Fixed by RDF location.</STN>
  <SHOT FROM="86" TO="85" DIST="5.42" AZIM="328" ELEV="+43" />
  <SHOT FROM="87" TO="86" DIST="2.16" AZIM="0.0" ELEV="+22" />
  <SHOT FROM="88" TO="87" DIST="5.90" AZIM="343" ELEV="+1" />
  <SHOT FROM="88" TO="89" DIST="3.71" AZIM="180" ELEV="-3" />
  <SHOT FROM="90" TO="89" DIST="10.3" AZIM="69" ELEV="-7" />
  <SERIES NAME="sigma2" DATE="1999-11-02" DECLINATION="-12.2">
    <?CAVERN PROCESS="NO" ?>
    Upstream section from the top of "Fallaway Drop" at the end of
    "Gamma Grovel" (which is the far end of the Pointed Finger chamber)
    then along the streamway to "Knockers Cavern Two".
    <SURVEYORS>
      <SURVEYOR NAME="Phil Maynard" />
      <SURVEYOR NAME="Geoff McDonnell" />
    </SURVEYORS>
    <INSTRUMENTS>
      <TAPE USED="SUSS11" >Old 30m fibreglass one</TAPE>
      <TAPE>A 6m metal tape was used for cross sections.</TAPE>
    </INSTRUMENTS>
    <SHOT FROM="88" TO="88a" DIST="7.18" AZIM="10" ELEV="+44.5" > This shot was a bit difficult.</SHOT>
    <SHOT FROM="88b" TO="88a" DIST="3.6" AZIM="225" ELEV="-55" />
    <SHOT FROM="88b" TO="88c" DIST="7.3" AZIM="82" ELEV="+43" />
    <SHOT FROM="88b" TO="88d" DIST="2.84" AZIM="92" ELEV="+37" />
    <SHOT FROM="88b" TO="88e" DIST="3.08" AZIM="62" ELEV="+39" />
    <XSECT>
      <XDATA SHOT="88 88a" POS="start"> 0.5 2.0 1.5 0.4 </XDATA>
      <XDATA SHOT="88 88a" POS="2.0"> 0.5 0.8 2.0 0.9 1.5 1.0 0.4 0.9 </XDATA>
      <XDATA SHOT="88 88a" STRIKE="30" DIP="-45" POS="end"> 0.5 2.0 1.5 0.4 </XDATA>
    </XSECT>
  </SERIES>
</SERIES>
<cavesurvey.dtd>+= [<-D]
<!ELEMENT EQUATE EMPTY >
<!ATTLIST EQUATE
    XML:LINK CDATA #REQUIRED
>
<cavesurvey.schema>+= [<-D->]
<element name="equate" type="equate" />
<complexType name="equate" >
  <attribute name="equate" type="xml:link" use="required" />
</complexType>
Finally append the remaining element end tags.
<cavesurvey.schema>+= [<-D] <cavesurvey.schema end>
Survey Comments
TODO: How do I allow text information such as comments that might occur in other survey programs to be imported into the XML files?
In Survex such comments are preceded by a semi-colon (;) whilst in Walls comments are preceded by a hash (#). In a cave survey XML document such comments would not be required as all information should be contained as content within appropriate elements.
I could have CDATA sections to include such information or a specific element called COMMENT which contains the comments.
E.g. from Survex a comment such as:
; This was a difficult leg.
would become either
<![CDATA[; This was a difficult leg.]]> or
<COMMENT>; This was a difficult leg.</COMMENT>
TODO: When converting older Survex or Walls data I need to ensure that their comments don't contain the characters: & < > " or '.
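One way a converter could deal with these characters is to escape them rather than reject them. This is a sketch only: the function name is mine and the COMMENT element is still just the proposal above. It uses escape() from Python's xml.sax.saxutils, which handles & < and >, plus an extra mapping for the two quote characters.

```python
from xml.sax.saxutils import escape

# escape() covers & < > itself; this mapping adds both quote characters.
_QUOTES = {'"': "&quot;", "'": "&apos;"}


def survex_comment_to_xml(line):
    """Wrap a Survex comment line in a COMMENT element, escaping markup characters."""
    text = line.strip()
    if not text.startswith(";"):
        raise ValueError("not a Survex comment line")
    return "<COMMENT>" + escape(text, _QUOTES) + "</COMMENT>"


print(survex_comment_to_xml("; This was a difficult leg."))
# <COMMENT>; This was a difficult leg.</COMMENT>
print(survex_comment_to_xml("; Tight & wet, <1m high"))
# <COMMENT>; Tight &amp; wet, &lt;1m high</COMMENT>
```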
Processing Instructions
Example: <?CAVERN PROCESS="NO" ?>
The ``target'' application is CAVERN and a processing instruction embedded within the XML file instructs the application not to process the series that is in scope.
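To illustrate how a target application might find such instructions, the sketch below uses Python's xml.dom.minidom, which keeps processing instructions in the tree. The function name is hypothetical, and taking the enclosing SERIES element as the scope of the instruction is my assumption.

```python
from xml.dom import minidom


def cavern_instructions(xml_text):
    """Return (series name, instruction data) pairs for every CAVERN instruction."""
    found = []

    def walk(node):
        for child in node.childNodes:
            if (child.nodeType == child.PROCESSING_INSTRUCTION_NODE
                    and child.target == "CAVERN"):
                parent = child.parentNode
                # An instruction outside any element has no series in scope.
                name = (parent.getAttribute("NAME")
                        if parent.nodeType == parent.ELEMENT_NODE else None)
                found.append((name, child.data.strip()))
            walk(child)

    walk(minidom.parseString(xml_text))
    return found


sample = ('<SERIES NAME="sigma"><SERIES NAME="sigma2">'
          '<?CAVERN PROCESS="NO" ?></SERIES></SERIES>')
print(cavern_instructions(sample))  # [('sigma2', 'PROCESS="NO"')]
```

The data of a processing instruction is plain text, so the application still has to interpret PROCESS="NO" itself.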
noweb Usage
The package noweb is a literate programming tool where code chunks are interspersed with the documentation that describes them. Invoking noweb with the appropriate options either generates the documentation in LaTeX (this documentation) or assembles the code from the code chunks.
The first table below shows the noweb commands used to extract the code and documentation. The options supplied to noweb are listed in the second table. The Makefile is a more convenient way to perform the same tasks and is covered later in this section.
Defined code chunks in this noweb document are listed at the end of this document.
To create: | Run: |
Makefile: | notangle -t4 -RMakefile CaveSurvey.nw > Makefile |
CaveSurvey DTD: | notangle -t4 -Rcavesurvey.dtd CaveSurvey.nw > CaveSurvey.dtd |
CaveSurvey schema: | notangle -t4 -Rcavesurvey.schema CaveSurvey.nw > CaveSurvey.xsd |
Example XML file: | notangle -t4 -Rmysurvey.xml CaveSurvey.nw > mysurvey.xml |
LaTeX documentation: | noweave -t4 -delay -index CaveSurvey.nw > CaveSurvey.tex |
 | latex CaveSurvey.tex |
HTML documentation: | noweave -html -filter l2h -x CaveSurvey.nw | htmltoc > CaveSurvey.html |
All the files: | noweb CaveSurvey.nw |
notangle/noweave options | |
-t4 | Copy tabs untouched from input to output, and use tabs for indentation, with tab stops every 4 columns. Tabs get set to 8 by default in noweb. |
-R | Extracts the named root chunk from the noweb file. |
Table: notangle/noweave usage
Throughout the dvi and Postscript documentation (not the HTML documentation) you will see that each chunk of code is uniquely identified by a page number and an alphabetic sub-page reference. An example is:
10b <cavesurvey.dtd 9>+≡ (15) 10a 11
This line tells us that we are now in code chunk 10b. This code chunk is on page 10 and it is the second code chunk defined on this page.
The construct <cavesurvey.dtd 9>+≡ tells us that we are in a code chunk called cavesurvey.dtd, that its definition began in chunk 9 and the +≡ means we are adding to its definition (noweb concatenates definitions with the same name in order of appearance).
At the right margin we find: (15) 10a 11
This tells us that the chunk we're defining is used within chunk 15, and that this current chunk is continued from chunk 10a and is continued in chunk 11.
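The concatenation of chunk definitions can be illustrated with a much simplified stand-in for notangle. This sketch is not how noweb itself is implemented: it only understands a chunk definition line of the form <<name>>=, a documentation line starting with @, and a chunk reference that sits alone on its line.

```python
import re

CHUNK_DEF = re.compile(r"^<<(.+)>>=\s*$")
CHUNK_USE = re.compile(r"^(\s*)<<(.+)>>\s*$")


def collect_chunks(source):
    """Map each chunk name to its lines, appending repeated definitions in order."""
    chunks = {}
    current = None
    for line in source.splitlines():
        definition = CHUNK_DEF.match(line)
        if definition:
            current = definition.group(1)
            chunks.setdefault(current, [])
        elif line == "@" or line.startswith("@ "):
            current = None  # back in documentation
        elif current is not None:
            chunks[current].append(line)
    return chunks


def tangle(chunks, root, indent=""):
    """Expand the named root chunk, following references to other chunks."""
    out = []
    for line in chunks[root]:
        use = CHUNK_USE.match(line)
        if use:
            out.extend(tangle(chunks, use.group(2), indent + use.group(1)))
        else:
            out.append(indent + line)
    return out


source = """\
Some documentation.
<<cavesurvey.dtd>>=
<!ELEMENT STN (#PCDATA)>
@
More documentation.
<<cavesurvey.dtd>>=
<!ELEMENT SHOT (#PCDATA)>
@
"""
print("\n".join(tangle(collect_chunks(source), "cavesurvey.dtd")))
```

Both definitions of cavesurvey.dtd end up in the output, in the order they appear in the source.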
At the end of each code chunk a %def is used to define any variables within that code chunk that we want to cross reference. These defined variables get listed in the noweb index with a page number to where they were defined. The LaTeX hyperref package is being used so this page number will be a hyperlink and show as underlined.
Any defined variables enclosed in double square brackets like this [[variable]] in the documentation text become hyperlinks, again to the place where that variable is defined.
The following Makefile provides a convenient way to create or update the code or documentation after modifications to the noweb source file rather than typing all the notangle or noweave commands. Code or documentation changes are done by making the modifications in the noweb source file and running the appropriate make command.
To extract the Makefile:
notangle -t4 -RMakefile CaveSurvey.nw > Makefile
Run ``make help'' to see what options there are.
For instance, after making changes to any of the DTDs via the noweb source file I run ``make dvi'' to see my changes in xdvi or do a ``make dtd'' to create the up-to-date DTDs. One generally never changes the output files directly (except for quick hacks).
<Makefile>=
#################################################################
# Makefile for creating CaveSurvey DTD's and their Documentation
#################################################################

# The noweb source file
CAVESURVEY_NOWEB_SOURCE = CaveSurvey.nw
CAVEMAP_NOWEB_SOURCE = CaveMap.nw

# If the user just types 'make' with no args then help, being the
# first routine will be invoked.
help:
	@echo 'Usage: make [dtd schema examples dvi ps html all clean]'

# Create DTDs
dtd: $(CAVESURVEY_NOWEB_SOURCE) $(CAVEMAP_NOWEB_SOURCE)
	@echo 'Creating CaveSurvey.dtd ...'
	notangle -t4 -Rcavesurvey.dtd $(CAVESURVEY_NOWEB_SOURCE) | cpif CaveSurvey.dtd
#	notangle -t4 -Rcavemap.dtd $(CAVEMAP_NOWEB_SOURCE) | cpif CaveMap.dtd

# Create Schemas
schema: $(CAVESURVEY_NOWEB_SOURCE)
	@echo 'Creating CaveSurvey.xsd ...'
	notangle -t4 -Rcavesurvey.schema $(CAVESURVEY_NOWEB_SOURCE) | cpif CaveSurvey.xsd

# Create examples
examples: $(CAVESURVEY_NOWEB_SOURCE) $(CAVEMAP_NOWEB_SOURCE)
	@echo 'Creating example XML files ...'
	notangle -t4 -Rmysurvey.xml $(CAVESURVEY_NOWEB_SOURCE) | cpif mysurvey.xml
#	notangle -t4 -Rmymap.xml $(CAVEMAP_NOWEB_SOURCE) | cpif mymap.xml

# Create documentation
dvi: $(CAVESURVEY_NOWEB_SOURCE) $(CAVEMAP_NOWEB_SOURCE)
	noweave -t4 -delay -index $(CAVESURVEY_NOWEB_SOURCE) >| CaveSurvey.tex
	@echo
	@echo 'Running "latex CaveSurvey.tex" ...'
	latex CaveSurvey.tex
	@echo
	@echo 'You may need to run latex again.'
	@echo
	@echo 'latex CaveSurvey.tex'
	@echo

ps: dvi
	dvips CaveSurvey.dvi -o CaveSurvey.ps

html: $(CAVESURVEY_NOWEB_SOURCE) $(CAVEMAP_NOWEB_SOURCE)
	noweave -html -filter l2h -index $(CAVESURVEY_NOWEB_SOURCE)\
	| htmltoc >| CaveSurvey.html
#	noweave -html -filter l2h -index $(CAVEMAP_NOWEB_SOURCE)\
#	| htmltoc >| CaveMap.html

all: dtd schema ps html

clean:
#	lintex
	rm -f *.aux
	rm -f *.dvi
	rm -f *.lof
	rm -f *.log
	rm -f *.lot
	rm -f *.toc
Accuracy is specified in asf5.xml
mysurvey.xml
Hierarchical Tagged Objects (HTO) Portable Data Format Specification, Version 1.4, Doug Dotson, December 15, 1994
Inside XML DTDs, St Laurent and Biggar, McGraw-Hill, 1999
Literate Programming Using Noweb, Linux Journal, October 97, Issue 42, p64-69
The site "http://www.pault.com/Xmltube/dtdgen.html" can generate a DTD for a supplied XML document.
[1] Email: Subject: Re: text vs. comment and data format, Date: Sun, 04 Feb 2001, From: Devin Kouts
[2] Email: Subject: Re: text vs. comment and data format, Date: Sun, 04 Feb 2001, From: Martin Laverty
[3] Email: Subject: Re: text vs. comment and data format, Date: Sun, 04 Feb 2001, From: Michael Lake
The following is a list of all the code chunks defined in this document.
References are interpreted as in the following example:
(cavesurvey.dtd 32a) 32a 32b: The code chunk cavesurvey.dtd was defined on page 32. The a means it was the first chunk defined on that page. The 32a and 32b are all the places which reference the chunk.
This is a list of all defined variables which are declared using %def.