Next: Discussion: Analysis Database Up: Sharing information on Speleothems Previous: Interested Parties

Case Study: Designing a Geological Specimen Database for the Cave Aragonites Project

In my project, ``Cave Aragonites of NSW'', I need to keep track of rock samples from various surface and underground sites. Some samples have been cut for thin sections; some are intact and others are powders. Conservation note: for this particular project, most of the aragonite samples are very small. Further details are in Rowling (2000).

To keep samples in order, I mark the sample bags and containers with a marker pen as follows:

T/N or T/N/M

This forms a field sample ID where:

T = Cave tag number (eg J105) of cave (underground) or nearest cave (surface)
N = Sample number (eg 5)
M = Sub-sample number (eg 3)
So the complete field sample ID could be J105/5 or W52/1/2.
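
The labelling scheme above is regular enough to be checked mechanically when the bag labels are typed in. The following is a minimal sketch in Python, not part of the original proposal. It assumes that a cave tag is one or more letters followed by digits (as in J105 or W52) and that sample and sub-sample numbers are plain integers; the names `FieldSampleID` and `parse_field_id` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class FieldSampleID(NamedTuple):
    cave_tag: str              # T, eg "J105"
    sample: int                # N, eg 5
    sub_sample: Optional[int]  # M, eg 3, or None when not given


# Assumption: a cave tag is letters followed by digits, eg J105 or W52.
_FIELD_ID = re.compile(r"^([A-Za-z]+\d+)/(\d+)(?:/(\d+))?$")


def parse_field_id(text: str) -> FieldSampleID:
    """Split a bag label of the form T/N or T/N/M into its parts."""
    match = _FIELD_ID.match(text.strip())
    if match is None:
        raise ValueError(f"not a field sample ID: {text!r}")
    tag, sample, sub = match.groups()
    sub_sample = int(sub) if sub is not None else None
    return FieldSampleID(tag.upper(), int(sample), sub_sample)


print(parse_field_id("J105/5"))
print(parse_field_id("W52/1/2"))
```

Rejecting anything that does not match the pattern is a deliberate choice: a label that was smudged in the field is better flagged when it is entered than silently stored.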

The idea is to be able to mark the sample bags in the field under less than ideal conditions (eg poor light, poor weather). In my field notebook, I record more details about the sample such as:

Back home, I can then fill in more details about the sample as I look at it more closely under the microscope or perform some tests on it. If the sample is to be sectioned or broken up, sub-sample numbers are given to each of the pieces. For example, if sample J58/7 is to be thin sectioned, new sub-samples would be called J58/7/1, J58/7/2, J58/7/3 and so on.
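
The numbering of sub-samples can be sketched in a few lines of Python. This is only an illustration under the assumption that pieces are numbered consecutively from 1; the function name `sub_sample_ids` is hypothetical.

```python
from typing import List


def sub_sample_ids(sample_id: str, pieces: int) -> List[str]:
    """Return the IDs for `pieces` sub-samples made from an undivided sample."""
    if pieces < 1:
        raise ValueError("a sample must be divided into at least one piece")
    if sample_id.count("/") != 1:
        raise ValueError(f"expected an undivided T/N sample ID, got {sample_id!r}")
    # Pieces are numbered 1, 2, 3, ... after a second separator.
    return [f"{sample_id}/{m}" for m in range(1, pieces + 1)]


print(sub_sample_ids("J58/7", 3))
```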

Eventually all the samples will be catalogued and referenced using a small database which I will set up on my PC.

These are my proposed fields:

  1. Unique identifier consisting of 6 parts:

    1. Sequence number: Similar to the ``AUASF'' numbers used by the Karst Index. In this case, however, there may be several samples per cave, so there could potentially be a large number of sequence numbers. Each cave scientist may need to generate their own set of sequence numbers because, unlike cave surveying, which is often done by several people in a club, cave mineral sampling is more often done by individuals as part of their university study. The sequence number would be something like:

      Country Code (eg AU)
      Researcher code (eg JR)
      Number which is assigned by the database program (eg 00005) which is unique for each of the researcher's samples. Are 5 digits sufficient? That is, 99999 samples per researcher?

    2. Cave Area Code
      eg 2J (for NSW, Jenolan Caves). This uses the same codes as the Karst Index Database.

    3. Cave Tag Number
      eg 105
      For surface samples, use the closest cave tag (or the most relevant).

    4. Separator /. Separators are not part of the database but they are used out in the field to label sample bags and other things. The separator would appear in listings though.

    5. Sample number
      eg 8
      Sample numbers go 1 .. n for each sampling site (ie cave).
      This information also goes on the sample bag.

    6. Sub Sample Number
      If the sample is derived from another, eg a thin section or a broken-off bit, an additional number identifies it.
      eg 1
      (use / as a separator again, same comments as before). The initial unbroken sample would be numbered 0, although it is unlikely that this number would actually be written on the sample bag. When a sample is broken up, the original sample no longer exists as such, and its sample bag may as well be numbered sub-sample 1.

  2. Map identification
    An alphanumeric sequence which is on a map showing where the sample was obtained from. As several samples can come from one place, I use a different set of numbers to indicate where the sample came from.

    The first part of this field is the map number which is based on the IUS definitions for Map Codes.
    eg 2J105.JR1
    Next comes a separator (again, not part of the database but it is part of the display)
    Then the sample site alphanumeric (as marked on the researcher's copy of the map)
    One problem is where there is more than one map sheet referencing the sample, eg plan view and side view. This field is therefore better stored as a list of map IDs referring to the sample.

  3. Surface, underground or entrance
    entrance area (twilight zone)

  4. Description of sample as per field notes. This would be a short text string (say 75 characters) copied from the field notebook, eg ``Red dolomitic? rock near survey point 2''

  5. Classification of sample. Other classifications could be added to this list.
    Organic deposit
    Other deposit

  6. Type. Tables are required for each of the sample classifications, eg:

  7. Sub-type. Tables of sub-types are required.

  8. Orientation (to magnetic north; can be corrected using date). We may also wish to record the type of orientation (eg magnetic) and the date when it was measured.

  9. List of minerals (this is added after sample is analysed). Use the list in Hill & Forti, using a number to represent the mineral eg 1 = ankerite etc.

  10. Date of sample using an internationally recognised date format. Local environment settings can be used to display the date in a way which is appropriate.

  11. Analysis X-ref
    This is a cross reference to the actual analyses of the material. Usually this is a publication or a report.
    eg ``Morphology, Crystallography and Origin of Needle-fibre Calcite in Quaternary Pedogenic Calcretes of South Australia'' by Phillips, S. E. and Self, P. G., Aust. J. Soil Res., vol 25 no 1 1987, pages 429-444.
    Alternatively, this field could hold a short reference, with the longer form of the reference kept in another database. I believe this was proposed some decades back for the Karst Index but never implemented.

  12. Analysis done
    Not formally analysed
    Optically analysed; see X-ref for more information
    Other analysis; see X-ref for more information
    Possibly this could be extended to list all types of analyses done on the material.
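
To make the structure of the unique identifier (field 1) concrete, here is a minimal Python sketch. It rests on assumptions the proposal does not settle: the three parts of the sequence number are simply concatenated, and the field label is printed as area code, cave tag, then the `/`-separated sample and sub-sample numbers. The class name `SampleID` and its method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SampleID:
    country: str         # eg "AU"
    researcher: str      # eg "JR"
    number: int          # assigned by the database program, eg 5
    area_code: str       # same codes as the Karst Index Database, eg "2J"
    cave_tag: int        # eg 105
    sample: int          # 1 .. n for each sampling site
    sub_sample: int = 0  # 0 means the initial unbroken sample

    def sequence_number(self) -> str:
        # Assumption: parts are concatenated, with the number zero-padded
        # to 5 digits as in the example 00005.
        if not 0 < self.number <= 99999:
            raise ValueError("number does not fit in 5 digits")
        return f"{self.country}{self.researcher}{self.number:05d}"

    def label(self) -> str:
        # The separator "/" is produced for display only; it is not stored.
        # Sub-sample 0 is omitted, as it is unlikely to be written on a bag.
        text = f"{self.area_code}{self.cave_tag}/{self.sample}"
        if self.sub_sample:
            text += f"/{self.sub_sample}"
        return text


sid = SampleID("AU", "JR", 5, "2J", 105, 8, 1)
print(sid.sequence_number(), sid.label())
```

Keeping the separator out of the stored fields, and generating it only in `label()`, follows the remark in the proposal that separators appear in listings but are not part of the database.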

The next step for me will be to create some of the tables described above, and create a small database of my samples. The software tools for this will be MySQL (database) with a Web front end, all running on Linux. The back end software may be a mixture of Perl, HTML, and possibly PHP, all running with the Apache web server. The total cost of the software is nothing but my time. If and when I get it going it will be made available under an open source software license.
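
As a rough illustration of what two of those tables might look like, the sketch below uses Python's built-in `sqlite3` module as a stand-in for MySQL so that it runs without a server; the SQL would need adjusting for MySQL. All table and column names are hypothetical, storing the date as ISO 8601 text is an assumption, and the second map ID (`2J105.JR2`) and the site letter `A` are invented for the example.

```python
import sqlite3

SCHEMA = """
CREATE TABLE sample (
    country      TEXT    NOT NULL,            -- eg AU
    researcher   TEXT    NOT NULL,            -- eg JR
    number       INTEGER NOT NULL,            -- unique per researcher
    area_code    TEXT    NOT NULL,            -- eg 2J
    cave_tag     INTEGER NOT NULL,            -- eg 105
    sample_no    INTEGER NOT NULL,
    sub_sample   INTEGER NOT NULL DEFAULT 0,  -- 0 = initial unbroken sample
    description  TEXT CHECK (length(description) <= 75),
    sampled_on   TEXT,                        -- assumed ISO 8601 (YYYY-MM-DD)
    PRIMARY KEY (country, researcher, number)
);
-- A sample may appear on several map sheets, so map IDs get their own table.
CREATE TABLE sample_map (
    country    TEXT    NOT NULL,
    researcher TEXT    NOT NULL,
    number     INTEGER NOT NULL,
    map_id     TEXT    NOT NULL,  -- eg 2J105.JR1
    site       TEXT    NOT NULL,  -- site alphanumeric marked on the map
    FOREIGN KEY (country, researcher, number)
        REFERENCES sample (country, researcher, number)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

key = ("AU", "JR", 5)
conn.execute(
    "INSERT INTO sample VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
    key + ("2J", 105, 8, 1, "Red dolomitic? rock near survey point 2", None),
)
# Hypothetical map sheets: a plan view and a side view of the same site.
conn.executemany(
    "INSERT INTO sample_map VALUES (?, ?, ?, ?, ?)",
    [key + ("2J105.JR1", "A"), key + ("2J105.JR2", "A")],
)

# The "/" separators are added in the listing, not stored in the table.
rows = conn.execute(
    "SELECT area_code || cave_tag || '/' || sample_no || '/' || sub_sample, map_id "
    "FROM sample JOIN sample_map USING (country, researcher, number) "
    "ORDER BY map_id"
).fetchall()
print(rows)
```

Splitting the map IDs into their own table is one way to handle the case where more than one map sheet refers to a sample; the list of minerals could be treated the same way.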

One thing that will no doubt be considered is the incorporation of this sort of information into a GIS so that spatial information can be related to samples.

I am not a GIS expert; however, the short answer from those who are is ``yes - it can be done''. It is easier, apparently, to link a ``normal'' relational database (RDB) of information to a GIS than it is to link a running GIS to another database. The problem is that everything in a GIS is accessed spatially, whereas I am trying to record both spatial information (eg sample location) as well as non-spatial information (eg classification of mineral and analytical information).

There is a big push in information technology to web-enable applications. Rather than having a single large application to do everything, several applications work together to get the information to the person who needs it. Thus a web-enabled GIS could query a web-enabled RDB to give the right information.

Choice of GIS would be up to the user; however, be aware that they are generally not cheap, typically several thousand dollars per seat. There is also a free GIS called ``GRASS'' which is used by parts of the US Military to depict spatial information. I will leave that as an exercise for the reader to investigate.
