CDFT: what goes in it/ source data
Julian Todd
julian@ncgraphics.co.uk
Thu, 11 Jan 2001 19:49:49 -0000
Hi there,
I'll just throw this idea in before anyone gets around to it.
Our cave surveying procedure involves coming out of
the cave with a damp waterproof notepad covered in mud on
which we have scribbled our measurements and a few
diagrams.
What we tend to do is copy the notes from the paper
neatly into a book, dry off the paper, stick it in an
envelope and staple the envelope to the same page of the book
(so we can verify suspicions of bad handwriting and so forth).
Survex has the advantage that you can copy the text
of the book into an ASCII file exactly as it is and
it will parse it. Survex has several commands which
set the mode so it can read whatever ridiculous out-of-order
method you use to write your data down in the cave (eg compass
and clino the wrong way round, things spread over two lines,
or previous survey station assumed to be the next
survey station).
This shows that the conversion from your ASCII notes
convention to any XML format can be done automatically; there
is no point in converting them into XML by hand or --
more nuttily -- doing your original survey notes in the
cave in XML.
What I would propose is that the text in the survey notebook
is copied out character for character into a text window.
Then, using whatever parser you have for the job (eg the
general purpose Survex one set to the correct mode),
you convert this into an XML file containing all the weird
<shot></shot> elements and so forth, but you still store the
original source data in the file too, inside an XML construct
like the following (which works like a <pre></pre>
in HTML):
<source_survey form="SurvexStandard1999">
<![CDATA[
; survey text copied from the paper.
*units metres
A B 10 10 9
]]>
</source_survey>
--- followed by, perhaps,
<stations>
<station name="A"></station>
<station name="B"></station>
</stations>
<shots>
<shot from="A" to="B" tape="10" compass="10" clino="9"></shot>
</shots>
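As a rough sketch of the conversion step, here is how the example above could be generated in Python. This is not the Survex parser: parse_legs is a hypothetical stand-in that only understands "from to tape compass clino" lines and skips ";" comments and "*" commands. The <survey> wrapper element and the function names are my assumptions too. Note that ElementTree writes the source text with escaped characters rather than a CDATA section, which an XML parser reads back as the same text.

```python
import xml.etree.ElementTree as ET


def parse_legs(source_text):
    """Toy parser (assumption): one 'from to tape compass clino' leg per line."""
    legs = []
    for raw in source_text.splitlines():
        line = raw.split(";", 1)[0].strip()  # drop ';' comments
        if not line or line.startswith("*"):  # skip blanks and '*' commands
            continue
        frm, to, tape, compass, clino = line.split()
        legs.append({"from": frm, "to": to, "tape": tape,
                     "compass": compass, "clino": clino})
    return legs


def build_survey(source_text, form="SurvexStandard1999"):
    """Build XML holding both the untouched source text and the parsed data."""
    root = ET.Element("survey")  # hypothetical wrapper element
    source = ET.SubElement(root, "source_survey", form=form)
    source.text = source_text  # stored character for character

    legs = parse_legs(source_text)

    # Collect station names in order of first appearance.
    names = []
    for leg in legs:
        for name in (leg["from"], leg["to"]):
            if name not in names:
                names.append(name)

    stations = ET.SubElement(root, "stations")
    for name in names:
        ET.SubElement(stations, "station", name=name)

    shots = ET.SubElement(root, "shots")
    for leg in legs:
        ET.SubElement(shots, "shot", leg)
    return root


if __name__ == "__main__":
    notes = "; survey text copied from the paper.\n*units metres\nA B 10 10 9\n"
    print(ET.tostring(build_survey(notes), encoding="unicode"))
```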
This text (in the CDATA[]), being a faithful representation of your notes
from the cave, will not change unless there has been a transcription
error or other blunder. After you have run your Survex parser and
extracted the data into XML notation you could delete the source text and
carry on without it. However, aside from trying to comply with
the dogma that one must Never Represent the Same Data Twice
Because it Might Clash, I would claim that nothing is really
gained by throwing it away. It is in fact serving the purpose of those
little envelopes of dried out notes you staple into your neat survey
book. In that form it is easy to compare and check for transcription
errors. And everyone can keep to their own quirky notational
habits without losing anything.
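For those worried about the two copies clashing, a check along these lines could re-read the stored source text and compare it with the <shot> elements. It is only a sketch under assumptions: legs_from_source is a toy stand-in for the real Survex parser (five columns per line, ";" comments and "*" commands skipped), the function names are made up, and the <source_survey> and <shots> are assumed to sit inside some wrapper element.

```python
import xml.etree.ElementTree as ET
from itertools import zip_longest

KEYS = ("from", "to", "tape", "compass", "clino")


def legs_from_source(text):
    """Toy parser (assumption): whitespace-separated legs, one per line."""
    legs = []
    for raw in text.splitlines():
        line = raw.split(";", 1)[0].strip()  # drop ';' comments
        if line and not line.startswith("*"):  # skip blanks and '*' commands
            legs.append(tuple(line.split()))
    return legs


def legs_from_shots(root):
    """Read the same five values back out of the <shot> attributes."""
    return [tuple(shot.get(k) for k in KEYS) for shot in root.iter("shot")]


def find_clashes(xml_text):
    """Return (index, source_leg, shot_leg) for every leg that disagrees.

    A leg missing on one side shows up as None on that side.
    """
    root = ET.fromstring(xml_text)
    source = legs_from_source(root.findtext(".//source_survey", default=""))
    shots = legs_from_shots(root)
    return [(i, s, x)
            for i, (s, x) in enumerate(zip_longest(source, shots))
            if s != x]
```

An empty list means the XML still agrees with the notes; anything else points at the leg to check against the envelope.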
Julian Todd.