FoX documentation.

Introduction

This document is the primary documentation for FoX, the Fortran/XML library. See below for other sources of documentation. It consists of:

Reference information on versions, standards compliance, and licensing.

Information about how to get up and running with FoX and how to use FoX in an existing project.

Finally, there is full API reference documentation.

Other documentation

This documentation is largely reference in nature. For new users it is best to start elsewhere:

iFaX workshops

Two workshops, entitled iFaX (Integrating Fortran and XML), have been run teaching the use of FoX, one in January 2007 and one in January 2008. The full documentation and lectures from these may be found at:

Tutorials

Out of the above workshops, some tutorial material has been written, focussing on different use cases. Currently two are available:

There is also tutorial information on the use of WKML here.

API documentation

COMMON interfaces

OUTPUT interfaces

INPUT interface

These documents describe all publicly usable APIs.

Worked examples of the use of some of these APIs may be found in the examples/ subdirectory, and tutorial-style documentation is available from the links above.

Other things


FoX versioning

This documentation describes version 4.1 of the FoX library.

This version includes output modules for general XML, and for CML; and a fully validating XML parser, exposed through a Fortran version of the SAX2 input parser and a Fortran mapping of the W3C DOM interface.

This is a stable branch, which will be maintained with important bugfixes.

FoX Changes

As of FoX-3.0, there is one user-visible change that should be noted.

Configuration/compilation

In previous versions of FoX, the configure script was accessible as config/configure. Version 3.0 now follows common practice by placing the script in the main directory, so it is now called as ./configure.

Previous versions of FoX made it quite hard to compile only portions of the library (eg only the CML output portion; or just the SAX input). This is now possible by specifying arguments to the configuration script. For example,

./configure --enable-wcml

will cause the generated Makefile to only compile the CML writing module and its dependencies.

See Compilation for further details.


Configuration and compilation

You will have received the FoX source code as a tar.gz file.

Unpack it as normal, and change directory into the top-level directory, FoX-$VERSION.

Requirements for use

FoX requires a Fortran 95 compiler - not just Fortran 90. All currently available versions of Fortran compilers claim to support F95. If your favoured compiler is not listed as working, I recommend the use of g95, which is free to download and use. And in such a case, please send a bug report to your compiler vendor.

In the event that you need to write a code targeted at multiple compilers, including some which have bugs preventing FoX compilation, please note the possibility of producing a dummy library.

Configuration

Running ./configure with no arguments should suffice for most installations. However:

  1. You may not be interested in all of the modules that FoX supplies. For example, you may only be interested in output, not input. If so, you can select which modules you want using --enable-MODULENAME, where MODULENAME is one of wxml, wcml, wkml, sax, dom. If none are explicitly enabled, then all will be built. (Alternatively, you can exclude modules one at a time with --disable-MODULENAME.) Thus, for example, if you only care about CML output and nothing else: ./configure --enable-wcml

  2. If you have more than one Fortran compiler available, or it is not on your PATH, you can force the choice by doing:

    ./configure FC=/path/to/compiler/of/choice

  3. It is possible that the configuration fails. In this case

    • please tell me about it so I can fix it
    • all relevant compiler details are placed in the file arch.make; you may be able to edit that file to allow compilation. Again, if so, please let me know what you need to do.
  4. By default the resultant files are installed under the objs directory. If you wish them to be installed elsewhere, you may do

    ./configure --prefix=/path/to/installation

Note that the configure process encodes the current directory location in several places. If you move the FoX directory later on, you will need to re-run configure.

Compilation

In order to compile the full library, now simply do:

make

This will build all the requested FoX modules, and the relevant examples.

Testing

In the full version of the FoX library, there are several testsuites included.

To run them all, simply run make check from the top-level directory. This will run the individual testsuites, and collate their results.

If any failures occur (unrelated to known compiler issues, see the up-to-date list), please send a message to the mailing list (fox-discuss@googlegroups.com) with details of compiler, hardware platform, and the nature of the failure.

The testsuites for the SAX and DOM libraries are very extensive, and are somewhat fragile, so are not distributed with FoX. Please contact the author for details.

Linking to an existing program

A script is provided which will provide the appropriate compiler and linker flags for you; this will be created after configuration, in the top-level directory, and is called FoX-config. It may be taken from there and placed anywhere.

FoX-config takes the following arguments:

If it is called with no arguments, it will expand to compile & link flags, thusly:

f95 -o program program.f90 `FoX-config`

For compiling only against FoX, do the following:

f95 -c `FoX-config --fcflags` sourcefile.f90

For linking only to the FoX library, do:

f95 -o program `FoX-config --libs` *.o

or similar, according to your compilation scheme.

Note that by default, FoX-config assumes you are using all modules of the library. If you are only using part, then this can be specified by also passing the name of each module required, like so:

FoX-config --fcflags --wcml

Compiling a dummy library

Because of the shortcomings in some compilers, it is not possible to compile FoX everywhere. Equally, sometimes it is useful to be able to compile a code both with and without support for FoX (perhaps to reduce executable size). Especially where FoX is being used only for additional output, it is useful to be able to run the code and perform computations even without the possibility of XML output.

For this reason, it is possible to compile a dummy version of FoX. This includes all public interfaces, so that your code will compile and link correctly - however none of the subroutines do anything, so you can retain the same version of your code without having to comment out all FoX calls.

Because this dummy version of FoX contains nothing except empty subroutines, it compiles and links with all known Fortran 95 compilers, regardless of compiler bugs.

To compile the dummy code, use the --enable-dummy switch. Note that the dummy mode is not yet available for the DOM module.

Alternative build methods

The "-full" versions of FoX are also shipped with files to help compile the code on other systems, using CMake or from within Microsoft Visual Studio. Brief instructions for using these files are below.

CMake

CMake does not build software itself, but generates makefiles or project files (depending on the platform) that are then used to compile the software. It should thus be a cross-platform method for building FoX (in theory at least).

Files needed for building FoX with CMake are included in the "-full" distribution. These can:

However, CMake cannot, at present, run the test suite or the packaging scripts used for release. To build FoX with CMake the following is needed:

CMake build instructions (Linux): Once you have installed CMake, go to the main directory of FoX, create a build directory, and from there run cmake thus:

cd fox/

mkdir build/ && cd build/

cmake ../

make -j

Libraries and module files can then be found in the subdirectories of build.

Windows

It is also possible to build FoX from within Microsoft Visual Studio and the file FoX.vfproj contains a Visual Studio project for Intel Fortran to simplify this process. At time of writing, it is compatible with Visual Studio 2011 and Intel Visual Fortran Composer XE 2011.

The project will build FoX in one of the four configurations: Win32/x64 and debug/release. When building FoX for a specific configuration, an output library file Fox_debug.lib or Fox.lib and associated modules are created in a folder in a relative path ../lib or ../libx64 respectively.

For a given configuration in your application project you will then need to:

  1. In "Fortran" "General" "Additional Include Directories" add the respective modules folder (generated above)

  2. In "Linker" "General" "Additional library directories" add the path to the respective lib or libx64 folder.

  3. In "Linker" "Input" "Additional dependencies" add Fox_debug.lib or FoX.lib respectively.

Your application should now be able to build and link with FoX.


Using FoX in your own project.

The recommended way to use FoX is to embed the full source code as a subdirectory, into an existing project.

In order to do this, you need to do something like the following:

  1. Put the full source code as a top-level subdirectory of the tree, called FoX.
  2. Incorporate calls to FoX into the program.
  3. Incorporate building FoX into your build process.

To incorporate into the program

It is probably best to isolate use of XML facilities to a small part of the program. This is easily accomplished for XML input, which will generally happen in only one or two places.

For XML output, this can be more complex. The easiest and least intrusive way is probably to create an F90 module for your program, looking something like example_xml_module.f90.

Then you must somewhere (probably in your main program), use this module, and call initialize_xml_output() at the start; and then end_xml_output() at the end of the program.

In any of the subroutines where you want to output data to the xml file, you should then insert use example_xml_module at the beginning of the subroutine. You can then use any of the xml output routines with no further worries, as shown in the examples.
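As a rough illustration, a wrapper module of this kind might look like the sketch below. It is not the example_xml_module.f90 file shipped with FoX: the output file name output.xml, the root element name results, and the handle name xf are all placeholders chosen for this sketch. It relies only on xml_OpenFile, xml_NewElement and xml_Close from FoX_wxml.

```fortran
! Hypothetical wrapper module; file and element names are placeholders.
module example_xml_module
  use FoX_wxml          ! re-exported, so users of this module see the xml_* routines
  implicit none

  ! Shared file handle, visible to every routine that uses this module.
  type(xmlf_t), save :: xf

contains

  subroutine initialize_xml_output()
    ! Open the output file and start the root element.
    call xml_OpenFile(filename="output.xml", xf=xf)
    call xml_NewElement(xf, "results")
  end subroutine initialize_xml_output

  subroutine end_xml_output()
    ! xml_Close also closes any elements that are still open.
    call xml_Close(xf)
  end subroutine end_xml_output

end module example_xml_module
```

Because the module does not declare itself private, a routine that says use example_xml_module can call the wxml routines on the shared handle xf without a separate use FoX_wxml statement.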

It is easy to make the use of FoX optional, by the use of preprocessor defines. This can be done simply by wrapping each call to your XML wrapper routines in #ifdef XML, or similar. Alternatively, the use of the dummy FoX interfaces allows you to switch FoX on and off at compile time - see Compilation.
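A minimal sketch of the preprocessor approach follows. It assumes the wrapper module and routine names mentioned above, and a macro named XML defined on the compiler command line (commonly -DXML; the exact flag, and whether a .F90 suffix is needed to enable preprocessing, depend on your compiler).

```fortran
! Must be passed through the preprocessor; directives start in column 1.
program main
#ifdef XML
  use example_xml_module
#endif
  implicit none

#ifdef XML
  call initialize_xml_output()
#endif

  ! ... the computation itself runs with or without XML support ...
  print *, "computation finished"

#ifdef XML
  call end_xml_output()
#endif
end program main
```

Built without the macro, the program has no dependency on FoX at all; built with it, the XML file is opened at start-up and closed at exit.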

To incorporate into the build process:

Configuration

First, FoX must be configured, to ensure that it is set up correctly for your compiler. (See Compilation) If your main code has a configure step, then run FoX's configure as part of it.

If your code doesn't have its own configure step, then the first thing that "make" does should be to configure FoX, if it's not already configured. But that should only happen once; every time you make your code thereafter, you don't need to re-configure FoX, because nothing has changed. To do that, put a target like the following in your Makefile.

FoX/.config:
        (cd FoX; ./configure FC=$(FC))

(Assuming that your Makefile already has a variable FC which sets the Fortran compiler)

When FoX configure completes, it "touch"es a file called FoX/.config. That means that whenever you re-run your own make, it checks to see if FoX/.config exists - if it does, then it knows FoX doesn't need to be re-configured, so it doesn't bother.

Compilation of FoX

Then, FoX needs to be compiled before your code (because your modules will depend on FoX's modules.) But again, it only needs to be compiled once. You won't be changing FoX, you'll only be changing your own code, so recompiling your code doesn't require recompiling FoX.

So, add another target like the following;

FoX/.FoX: FoX/.config
        (cd FoX; $(MAKE))

This has a dependency on the configure script as I showed above, but it will only run it if the configure script hasn't already been run.

When FoX is successfully compiled, the last thing its Makefile does is "touch" the file called FoX/.FoX. So the above target checks to see if that file exists; and if it does, then it doesn't bother recompiling FoX, because it's already compiled. On the very first time you compile your code, it will cd into the FoX directory and compile it - but then never again.

You then need to have that rule be a dependency of your main target; like so:

  MyExecutable: FoX/.FoX

(or whatever your default Makefile rule is).

which will ensure that before MyExecutable is compiled, make will check to see that FoX has been compiled (which most of the time it will be, so nothing further will happen). But the first time you compile your code, it will call the FoX target, and FoX will be configured & compiled.

Compiling/linking your code

You should add this to your FFLAGS (or equivalent - the variable that holds flags for compile-time use).

FFLAGS=-g -O2 -whatever-else $$(FoX/FoX-config --fcflags)

to make sure that you get the path to your modules. (Different compilers have different flags for specifying module paths; some use -I, some use -M, etc. If you use the above construction it will pick the right one automatically for your compiler.)

Similarly, for linking, add the following to your LDFLAGS (or equivalent - the variable that holds flags for link-time use.)

LDFLAGS=-lwhatever $$(FoX/FoX-config --libs)

(For full details of the FoX-config script, see Compilation)

Cleaning up

Finally - you probably have a clean target in your makefile. Don't tie FoX into this target - most of the time when you make clean, you don't want to make clean with FoX as well, because there's no need - FoX won't have changed and it'll take a couple of minutes to recompile.

However, you can add a distclean (or something) target, which you use before moving your code to another machine, that looks like:

distclean: clean
        (cd FoX; $(MAKE) distclean)

and that will ensure that when you do make distclean, even FoX's object files are cleaned up. But of course that will mean that you have to reconfigure & recompile FoX next time you compile your code.


Standards compliance

FoX is written with reference to the following standards:

[XML10]: http://www.w3.org/TR/REC-xml/

[XML11]: http://www.w3.org/TR/xml11

[Namespaces10]: http://www.w3.org/TR/xml-names

[Namespaces11]: http://www.w3.org/TR/xml-names11

[xml:id]: http://www.w3.org/TR/xml-id/

[xml:base]: http://www.w3.org/TR/xmlbase/

[CanonicalXML]: http://www.w3.org/TR/xml-c14n

[SAX2]: http://saxproject.org

[DOM1]: http://www.w3.org/TR/REC-DOM-Level-1/level-one-core.html

[DOM2]: http://www.w3.org/TR/DOM-Level-2-Core/

[DOM3]: http://www.w3.org/TR/DOM-Level-3-Core/

In particular:

For exceptions, please see the relevant parts of the FoX documentation.


FoX_common

FoX_common is a module exporting interfaces to a set of convenience functions common to all of the FoX modules, which are of more general use.

Currently, there are three publicly available functions and four subroutines:

It is fully described in StringFormatting

It is fully described in StringConversion

The final four procedures change the way that errors and warnings are handled when encountered by any FoX module. Using these procedures it is possible to convert non-fatal warnings and fatal errors to calls to the internal abort routine. This generally has the effect of generating a stack trace or core dump of the program before termination. This is a global setting for all XML documents being manipulated. Two subroutines take a single logical argument to turn the feature on (true) and off (false) for warnings and errors respectively:

and two functions (without arguments) allow the state to be checked:

Both fatal warnings and errors are off by default. This corresponds to the previous behaviour.


String handling in FoX

Many of the routines in wxml, and indeed in wcml which is built on top of wxml, are overloaded so that data may be passed to the same routine as string, integer, logical, real, or complex data.

In such cases, a few notes on the conversion of non-textual data to text are in order. The standard Fortran I/O formatting routines do not offer the control required for useful XML output, so FoX performs all its own formatting.

This formatting is done internally through a function which is also publicly available to the user, str.

To use this in your program, import it via:

use FoX_common, only: str

and use it like so:

 print*, str(data)

In addition, for ease of use, the // concatenation operator is overloaded, such that strings can easily be formed by concatenation of strings to other datatypes. To use this you must import it via:

 use FoX_common, only: operator(//)

and use it like so:

 integer :: data
 print*, "This is a number "//data

This works for all native Fortran data types. However, the floating-point format control described below is available only through str(), not through concatenation.

You may pass data of the following primitive types to str:

Scalar data

Character (default kind)

Character data is returned unchanged.

Logical (default kind)

Logical data is output such that True values are converted to the string 'true', and False to the string 'false'.

Integer (default kind)

Integer data is converted to the standard decimal representation.

Real numbers (single and double precision)

Real numbers, both single and double precision, are converted to strings with some control offered to the user. The output will conform to the real number formats specified by XML Schema Datatypes.

This may be done in one of two ways:

  1. Exponential notation, with variable number of significant figures. Format strings of the form "sn" are accepted, where n is the number of significant figures.

    Thus the number 111, when output with various formats, will produce the following output:

s1 1e2
s2 1.1e2
s3 1.11e2
s4 1.110e2

The number of significant figures should lie between 1 and the number of digits precision provided by the real kind. If a larger or smaller number is specified, output will be truncated accordingly. If unspecified, then a sensible default will be chosen.

This format is not permitted by XML Schema Datatypes 1.0, though it is in 2.0.

  2. Non-exponential notation, with variable number of digits after the decimal point. Format strings of the form "rn" are accepted, where n is the number of digits after the decimal point.

    Thus the number 3.14159, when output with various formats, will produce the following output:

r0 3
r1 3.1
r2 3.14
r3 3.142

The number of decimal places must lie between 0 and whatever would output the maximum digits precision for that real kind. If a larger or smaller number is specified, output will be truncated accordingly. If unspecified, then a sensible default will be chosen.

This format is the only one permitted by XML Schema Datatypes 1.0.

If no format is specified, then a default of exponential notation will be used.

If a format is specified not conforming to either of the two forms above, a run-time error will be generated.
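The two format families can be tried out with a short program such as the sketch below. The variable names are arbitrary, and the format string is passed as the second positional argument to str, which is an assumption about the argument order. The expected text for each call is whatever the tables above give for the same format; no output is asserted here.

```fortran
program str_format_demo
  use FoX_common, only: str, operator(//)
  implicit none

  real :: x, pi
  integer :: n

  x = 111.0
  pi = 3.14159
  n = 42

  ! Exponential notation, 3 significant figures (compare the "s3" row above).
  print *, str(x, "s3")

  ! Non-exponential notation, 2 digits after the decimal point (the "r2" row).
  print *, str(pi, "r2")

  ! No format given: the default (exponential) notation is used.
  print *, str(pi)

  ! Concatenation accepts non-character operands but offers no format control.
  print *, "n = "//n
end program str_format_demo
```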

NB Since by using FoX or str, you are passing real numbers through various functions, this means that they must be valid real numbers. A corollary of this is that if you pass in +/-Infinity, or NaN, then the behaviour of FoX is unpredictable, and may well result in a crash. This is a consequence of the Fortran standard, which strictly disallows doing anything at all with such numbers, including even just passing them to a subroutine.

Complex numbers (single and double precision)

Complex numbers will be output as pairs of real numbers, in the following way:

(1.0e0)+i(1.0e0)

where the two halves can be formatted in the way described for 'Real numbers' above; only one format may be specified, and it will apply to both.

All the caveats described above apply for complex numbers as well; that is, output of complex numbers either of whose components are infinite or NaN is illegal in Fortran, and more than likely will cause a crash in FoX.

Arrays and matrices

All of the above types of data may be passed in as arrays and matrices as well. In this case, a string containing all the individual elements will be returned, ordered as they would be in memory, each element separated by a single space.

If the data is character data, then there is an additional option to str, delimiter, which may be any single-character string and will replace the space as the delimiter.
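A small sketch of array conversion, assuming nothing beyond str and its delimiter option as described above; the array contents are invented for illustration.

```fortran
program str_array_demo
  use FoX_common, only: str
  implicit none

  integer :: v(3)
  character(len=2) :: names(3)

  v = (/ 1, 2, 3 /)
  names = (/ "Fe", "Ni", "Cu" /)

  ! Elements are returned in memory order, separated by single spaces.
  print *, str(v)

  ! For character arrays only, another single-character delimiter may be chosen.
  print *, str(names, delimiter=",")
end program str_array_demo
```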

wxml/wcml wrappers.

All functions in wxml which can accept arbitrary data (roughly, wherever you put anything that is not an XML name; attribute values, pseudo-attribute values, character data) will take scalars, arrays, and matrices of any of the above data types, with fmt= and delimiter= optional arguments where appropriate.

wcml functions which can accept varied data behave in the same way.


String conversion

Two procedures are provided to simplify reading data retrieved from XML documents into Fortran variables. The subroutine rts performs the data conversion step and the function countrts can be used to allocate an array of the correct size for the incoming data.

rts subroutine

The rts subroutine can be imported from FoX_common. In its simplest form, it is called in this fashion:

call rts(string, data)

string is a simple Fortran string (probably retrieved from an XML file.)

data is any native Fortran datatype: logical, character, integer, real, double precision, complex, double complex, and may be a scalar, 1D or 2D array.

rts will attempt to parse the contents of string into the appropriate datatype, and return the value in data.

Additional information or error handling is accomplished with the following optional arguments:

num

num is an integer; on returning from the subroutine it indicates the number of data items read before either:

iostat

iostat is an integer, which on return from the subroutine has the values:

NB if iostat is not specified, and a non-zero value is returned, then the program will stop with an error message.
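The call pattern with both optional arguments can be sketched as follows. The literal string stands in for text retrieved from an XML document, and the variable names are arbitrary. Only the zero/non-zero distinction for iostat is relied upon here.

```fortran
program rts_demo
  use FoX_common, only: rts
  implicit none

  real :: coords(3)
  integer :: n, ios

  ! Placeholder input; in practice this would come from an XML file.
  call rts("1.0 2.5 -3.0", coords, num=n, iostat=ios)

  if (ios /= 0) then
    ! Passing iostat keeps the program running so the problem can be reported.
    print *, "conversion problem: iostat =", ios, ", items read =", n
  else
    print *, coords
  end if
end program rts_demo
```

Supplying iostat is what prevents the program from stopping on a failed conversion, so it is worth passing whenever the input is not fully trusted.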

String formatting

When string is expected to be an array of strings, the following options are used to break string into its constituent elements:

Numerical formatting.

Numbers are expected to be formatted according to the usual conventions for Fortran input.

Complex number formatting.

Complex numbers may be formatted according to either normal Fortran conventions (comma-separated pairs) or CMLComp conventions.

Logical variable formatting.

Logical variables must be encoded according to the conventions of XML Schema Datatypes - that is, True may be written as "true" or "1", and False may be written as "false" or "0".

countrts function

The countrts function can also be imported from FoX_common. In its simplest form, it is called in this fashion:

countrts(string, datatype)

string is a simple Fortran string (probably retrieved from an XML file).

datatype is a scalar argument of any native Fortran datatype (logical, character, integer, real, double precision, complex or double complex).

The function returns a default integer equal to the number of elements that rts would return if called with a sufficiently large array of the same type as datatype. countrts returns 0 to indicate that characters were found in the string that could not be converted. If datatype is a character, the optional arguments separator and csv are available as described in "String formatting" above. The countrts function is pure and can be used as a specification function.
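A typical use is to size an allocatable array before calling rts, as in this sketch. The input string and variable names are placeholders. The literal 0 passed to countrts is used here only to select the integer datatype.

```fortran
program countrts_demo
  use FoX_common, only: rts, countrts
  implicit none

  character(len=*), parameter :: text = "10 20 30 40"
  integer, allocatable :: values(:)
  integer :: n

  ! The second argument is a scalar of the target type.
  n = countrts(text, 0)

  if (n > 0) then
    allocate(values(n))
    call rts(text, values)
    print *, values
  else
    ! A count of 0 signals that the string could not be converted.
    print *, "no convertible integer data found"
  end if
end program countrts_demo
```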


WXML

wxml is a general Fortran XML output library. It offers a Fortran interface, in the form of a number of subroutines, to generate well-formed XML documents. Almost all of the XML features described in XML11 and Namespaces are available, and wxml will diagnose almost all attempts to produce an invalid document. Exceptions below describes where wxml falls short of these aims.

First, Conventions describes the conventions used in this document.

Then, Functions lists all of wxml's publicly exported functions, in three sections:

  1. Firstly, the very few functions necessary to create the simplest XML document, containing only elements, attributes, and text.
  2. Secondly, those functions concerned with XML Namespaces, and how Namespaces affect the behaviour of the first tranche of functions.
  3. Thirdly, a set of more rarely used functions required to access some of the more esoteric corners of the XML specification.

Please note that where the documentation below is not clear, it may be useful to look at some of the example files. There is a very simple example in the examples/ subdirectory, which nevertheless shows the use of most of the features you will use.

A more elaborate example, using almost all of the XML features found here, is available in the top-level directory as wxml_example.f90. It will be automatically compiled as part of the build process.

Conventions and notes:

Conventions used below.

Note that where strings are passed in, they will be passed through entirely unchanged to the output file - no truncation of whitespace will occur.

It is strongly recommended that the functions be used with keyword arguments rather than relying on implicit ordering.

Derived type: xmlf_t

This is an opaque type representing the XML file handle. Each function requires this as an argument, so it knows which file to operate on. (It is an output of the xml_OpenFile subroutine.) Since all subroutines require it, it is not mentioned below.

Function listing

Frequently used functions

Open a file for writing XML

By default, the XML will have no extraneous text nodes. This can have the effect of it looking slightly ugly, since there will be no newlines inserted between tags.

This behaviour can be changed to produce slightly nicer looking XML, by switching on pretty_print. This will insert newlines and spaces between some tags where they are unlikely to carry semantics. Note, though, that this does result in the XML produced being not quite what was asked for, since extra characters and text nodes have been inserted.

NB: The replace option should be noted. By default, xml_OpenFile will fail with a runtime error if you try to write to an existing file. If you are sure you want to continue on in such a case, then you can specify replace=.true. and any existing files will be overwritten. If finer granularity is required over how to proceed in such cases, use the Fortran inquire statement in your code. There is no 'append' functionality by design - any XML file created by appending to an existing file would be invalid.

Close an opened XML file, closing all still-opened tags so that it is well-formed.

In the normal run of events, trying to close an XML file with no root element will cause an error, since this is not well-formed. However, an optional argument, empty, is provided in case it is desirable to close files which may be empty. In this case, a warning will still be emitted, but no fatal error generated.

Open a new element tag

Close an open tag

Add an attribute to the currently open tag.

By default, if the attribute value contains markup characters, they will be escaped automatically by wxml before output.

However, in rare cases you may not wish this to happen - if you wish to output Unicode characters, or entity references. In this case, you should set escape=.false. for the relevant subroutine call. Note that if you do this, no checking on the validity of the output string is performed; the onus is on you to ensure well-formedness.

The value to be added may be of any type; it will be converted to text according to FoX's formatting rules, and if it is a 1- or 2-dimensional array, the elements will all be output, separated by spaces (except if it is a character array, in which case the delimiter may be changed to any other single character using an optional argument).

NB The type option is only provided so that in the case of an external DTD which FoX is unaware of, the attribute type can be specified (which gives FoX more information to ensure well-formedness and validity). Specifying the type incorrectly may result in spurious error messages.

Add text data. The data to be added may be of any type; they will be converted to text according to FoX's formatting rules, and if they are a 1- or 2-dimensional array, the elements will all be output, separated by spaces (except if it is a character array, in which case the delimiter may be changed to any other single character using an optional argument).

Within the context of character output, add a (system-dependent) newline character. This function can only be called wherever xml_AddCharacters can be called. (Newlines outside of character context are under FoX's control, and cannot be manipulated by the user.)
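Putting the frequently used calls together, a complete program might look like the sketch below. The file name, element name and attribute are invented for illustration. The routine names xml_Close and xml_AddAttribute and the keyword names filename and xf are taken from the FoX wxml interface as best understood here and should be checked against the function listing.

```fortran
program wxml_minimal
  use FoX_wxml
  implicit none

  type(xmlf_t) :: xf

  ! replace=.true. overwrites an existing file instead of stopping with an error;
  ! pretty_print=.true. inserts newlines between tags where that is harmless.
  call xml_OpenFile(filename="minimal.xml", xf=xf, pretty_print=.true., replace=.true.)

  call xml_NewElement(xf, "measurement")
  call xml_AddAttribute(xf, "units", "eV")

  ! Non-character data is converted using FoX's own formatting rules.
  call xml_AddCharacters(xf, 3.14159, fmt="r3")

  call xml_EndElement(xf, "measurement")

  ! Any elements still open at this point would be closed automatically.
  call xml_Close(xf)
end program wxml_minimal
```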

Namespace-aware functions:

Add an XML Namespace declaration. This function may be called at any time, and its precise effect depends on when it is called; see below

Undeclare an XML namespace. This is equivalent to declaring a namespace with an empty URI, and renders the namespace ineffective for the scope of the declaration. For explanation of its scope, see below.

NB Use of xml_UndeclareNamespace implies that the resultant document will be compliant with XML Namespaces 1.1, but not 1.0; wxml will issue an error when trying to undeclare namespaces under XML 1.0.

Scope of namespace functions

If xml_[Un]declareNamespace is called immediately prior to an xml_NewElement call, then the namespace will be declared in that next element, and will therefore take effect in all child elements.

If it is called prior to an xml_NewElement call, but that element has namespaced attributes

To explain by means of example: In order to generate the following XML output:

 <cml:cml xmlns:cml="http://www.xml-cml.org/schema"/>

then the following two calls are necessary, in the prescribed order:

  call xml_DeclareNamespace(xf, prefix='cml', nsURI='http://www.xml-cml.org/schema')
  call xml_NewElement(xf, 'cml:cml')

However, to generate XML input like so: that is, where the namespace refers to an attribute at the same level, then as long as the xml_DeclareNamespace call is made before the element tag is closed (either by xml_EndElement, or by a new element tag being opened, or some text being added etc.) the correct XML will be generated.

Two previously mentioned functions are affected when used in a namespace-aware fashion.

The element or attribute name is checked, and if it is a QName (i.e. if it is of the form prefix:tagName) then wxml will check that prefix is a registered namespace prefix, and generate an error if not.
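As a complete but minimal sketch of the first case (a namespace declared immediately before the element that uses it), the program below writes a single namespaced root element. The output file name is a placeholder, and the keyword names nsURI and prefix are assumptions about the xml_DeclareNamespace interface that should be checked against the function listing.

```fortran
program wxml_namespace_demo
  use FoX_wxml
  implicit none

  type(xmlf_t) :: xf

  call xml_OpenFile(filename="namespace.xml", xf=xf, replace=.true.)

  ! Declared just before the element, so the prefix is in scope for
  ! cml:cml itself and for all of its children.
  call xml_DeclareNamespace(xf, nsURI="http://www.xml-cml.org/schema", prefix="cml")
  call xml_NewElement(xf, "cml:cml")
  call xml_EndElement(xf, "cml:cml")

  call xml_Close(xf)
end program wxml_namespace_demo
```

Swapping the first two calls would ask wxml to open cml:cml before the prefix is registered, which is the situation the QName check described above is designed to catch.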

More rarely used functions:

If you don't know the purpose of any of these, then you don't need to.

Add XML declaration to the first line of output. If used, then the file must have been opened with addDecl = .false., and this must be the first wxml call to the document.

NB The only XML versions available are 1.0 and 1.1. Attempting to specify anything else will result in an error. Specifying version 1.0 results in additional output checks to ensure the resultant document is XML-1.0-conformant.

NB Note that if the encoding is specified, and is specified to not be UTF-8, then if the specified encoding does not match that supported by the Fortran processor, you may end up with output you do not expect.

Add an XML document type declaration. If used, this must be used prior to first xml_NewElement call, and only one such call must be made.

Define an internal entity for the document. If used, this call must be made after xml_AddDOCTYPE and before the first xml_NewElement call.

Define an external entity for the document. If used, this call must be made after xml_AddDOCTYPE and before the first xml_NewElement call.

Define a parameter entity for the document. If used, this call must be made after xml_AddDOCTYPE and before the first xml_NewElement call.

Define a notation for the document. If used, this call must be made after xml_AddDOCTYPE and before the first xml_NewElement call.

Add an ELEMENT declaration to the DTD. The syntax of the declaration is not checked in any way, nor does this affect how elements may be added in the content of the XML document.

If used, this call must be made after xml_AddDOCTYPE and before the first xml_NewElement call.

Add an ATTLIST declaration to the DTD. The syntax of the declaration is not checked in any way, nor does this affect how attributes may be added in the content of the XML document.

If used, this call must be made after xml_AddDOCTYPE and before the first xml_NewElement call.

Add a reference to a Parameter Entity in the DTD. No check is made according to whether the PE exists, has been declared, or may legally be used.

If used, this call must be made after xml_AddDOCTYPE and before the first xml_NewElement call.

Add XML stylesheet processing instruction, as described in [Stylesheets]. If used, this call must be made before the first xml_NewElement call.

Add an XML Processing Instruction.

If data is present, nothing further can be added to the PI. If it is not present, then pseudoattributes may be added using the call below. Normally, the name is checked to ensure that it is XML-compliant. This requires that PI targets not start with [Xx][Mm][Ll], because such names are reserved. However, some are defined by later W3C specifications. If you wish to use such PI targets, then set xml=.true. when outputting them.

The output PI will look like: <?name data?>

Add a pseudoattribute to the currently open PI.

Add an XML comment.

This may be used anywhere that xml_AddCharacters may be, and will insert an entity reference into the contents of the XML document at that point. Note that if the entity inserted is a character entity, its validity will be checked according to the rules of XML-1.1, not 1.0.

If the entity reference is not a character entity, then no check is made of its validity, and a warning will be issued.

Functions to query XML file objects

These functions may be of use in building wrapper libraries:

Return the filename of an open XML file

Return the currently open tag of the current XML file (or the empty string if none is open)

Return the current value of pretty_print.

Set the current value of pretty_print to the NewValue. This may be useful in a mixed namespace document where pretty printing the output may change the meaning under one of the namespaces.

Exceptions

Below are explained areas where wxml fails to implement the whole of XML 1.0/1.1. These are divided into two lists; where wxml does not permit the generation of a particular well-formed XML document, and where it does permit the generation of a particular non-well-formed document.

Ways in which wxml renders it impossible to produce a certain sort of well-formed XML document:

  1. Unicode support is limited. Due to the limitations of Fortran, wxml is unable to manipulate characters outwith 7-bit US-ASCII. wxml will ensure that characters corresponding to those in 7-bit ASCII are output correctly within the constraints of the version of XML in use, for a UTF-8 encoding. Attempts to directly output any other characters will have undefined effects. Output of other Unicode characters is possible through the use of character entities.
  2. Due to the constraints of the Fortran IO specification, it is impossible to output arbitrarily long strings without carriage returns. The size of the limit varies between processors, but may be as low as 1024 characters. To avoid overrunning this limit, wxml will by default insert carriage returns before every new element, and if an unbroken string of attribute or text data longer than 1024 characters is requested, then carriage returns will be inserted as appropriate (within whitespace if possible) to ensure it is broken up into smaller sections that fit within the limits.
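The whitespace-preferring break described in item 2 can be pictured with a small sketch. This is not wxml's actual implementation: the module name wrap_sketch and the function break_point are hypothetical, and the line limit is simply passed in as maxlen (assumed to be at least 1).

```fortran
module wrap_sketch
  implicit none
contains

  ! Hypothetical helper, not part of FoX: return the index of the last
  ! character that should go on the current output line, so that the
  ! line holds at most maxlen characters (maxlen >= 1 is assumed).
  pure function break_point(text, maxlen) result(pos)
    character(len=*), intent(in) :: text
    integer, intent(in) :: maxlen
    integer :: pos

    if (len(text) <= maxlen) then
      pos = len(text)                  ! everything fits on one line
    else
      ! Prefer the last blank inside the first maxlen characters ...
      pos = scan(text(1:maxlen), ' ', back=.true.)
      ! ... and fall back to a hard break if there is none.
      if (pos == 0) pos = maxlen
    end if
  end function break_point

end module wrap_sketch
```

A caller would write text(1:pos), start a new line, and repeat on text(pos+1:) until nothing remains.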

wxml will try very hard to ensure that output is well-formed. However, it is possible to fool wxml into producing ill-formed XML documents. Avoid doing so if possible; for completeness these ways are listed here. In all cases where ill-formedness is a possibility, a warning can be issued. These warnings can be verbose, so are off by default, but if they are desired, they can be switched on by manipulating the warning argument to xml_OpenFile.

  1. If you specify a non-default text encoding, and then run FoX on a platform which does not use this encoding, then the result will be nonsense, and more than likely ill-formed. FoX will issue a warning in this case.
  2. When adding any text, if any characters are passed in (regardless of character set) which do not have equivalents within 7-bit ASCII, then the results are processor-dependent, and may result in an invalid document on output. A warning will be issued if this occurs. If you need a guarantee that such characters will be passed correctly, use character entities.
  3. If any parameter entities are referenced, no checks are made that the document after parameter-entity-expansion is well-formed. A warning will be issued.

Validity constraints

Finally, note that constraints on XML documents are divided into two sets - well-formedness constraints (WFC) and validity constraints (VC). The above only applies to WFC checks. wxml can make some minimal checks on VCs, but this is by no means complete, nor is it intended to be. These checks are off by default, but may be switched on by manipulating the validate argument to xml_OpenFile.


WCML

WCML is a library for outputting CML data. It wraps all the necessary XML calls, such that you should never need to touch any WXML calls when outputting CML.

The CML output is conformant to version 2.4 of the CML schema. The output can also be made conformant to the CompChem convention.

The available functions and their intended use are listed below. Quite deliberately, no reference is made to the actual CML output by each function.

WCML is not intended to be a generalized Fortran CML output layer; rather, it is intended to be a library which allows the output of a limited set of well-defined syntactical fragments.

Further information on these fragments, and on the style of CML generated here, is available at http://www.uszla.me.uk/specs/subset.html.

This section of the manual will detail the available CML output subroutines.

Use of WCML

wcml subroutines can be accessed from within a module or subroutine by inserting

 use FoX_wcml

at the start. This will import all of the subroutines described below, plus the derived type xmlf_t needed to manipulate a CML file.

No other entities will be imported; public/private Fortran namespaces are very carefully controlled within the library.

Dictionaries.

The use of dictionaries with WCML is strongly encouraged. (For those not conversant with dictionaries, a fairly detailed explanation is available at http://www.xml-cml.org/information/dictionaries)

In brief, dictionaries are used in two ways.

Identification

Firstly, to identify and disambiguate output data. Every output function below takes an optional argument, dictRef="". It is intended that every piece of data output is tagged with a dictionary reference, which will look something like nameOfCode:nameOfThing.

So, for example, in SIESTA, all the energies are output with different dictRefs, looking like: siesta:KohnShamEnergy, or siesta:kineticEnergy, etc. By doing this, we can ensure that later on all these numbers can be usefully identified.

We hope that ultimately, dictionaries can be written for codes, which will explain what some of these names might mean. However, it is not in any way necessary that this be done - and using dictRef attributes will help merely by giving the ability to disambiguate otherwise indistinguishable quantities.

We strongly recommend this course of action - if you choose to follow our recommendation, then you should add a suitable Namespace to your code. That is, immediately after cmlBeginFile and before cmlStartCml, you should add something like:

call cmlAddNamespace(xf, 'nameOfCode', 'WebPageOfCode')

Again, for SIESTA, we add:

call cmlAddNamespace(xf, 'siesta', 'http://www.uam.es/siesta')

If you don't have a webpage for your code, don't worry; the address is only used as an identifier, so anything that looks like a URL, and which nobody else is using, will suffice.
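Putting the pieces together, a minimal sketch of tagging one output value with a dictRef might look as follows. The prefix myCode, its URL, the property name, and the value are all hypothetical. The filename and unit arguments of cmlBeginFile, the title argument of cmlAddProperty, and the closing calls cmlEndCml and cmlFinishFile are assumed to behave as in the API listing, so check them against your FoX version.

```fortran
program dictref_sketch
  use FoX_wcml
  implicit none
  type(xmlf_t) :: xf

  ! Assumption: unit=-1 asks wcml to pick a free unit number itself.
  call cmlBeginFile(xf, 'output.cml', unit=-1)

  ! The namespace must be added immediately after cmlBeginFile
  ! and before cmlStartCml.
  call cmlAddNamespace(xf, 'myCode', 'http://www.example.org/myCode')
  call cmlStartCml(xf)

  ! The dictRef identifies the quantity; numerical values also need units.
  call cmlAddProperty(xf=xf, title='Convergence ratio', value=0.85d0, &
                      dictRef='myCode:convergenceRatio', &
                      units='units:dimensionless')

  call cmlEndCml(xf)
  call cmlFinishFile(xf)
end program dictref_sketch
```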

Quantification

Secondly, we use dictionaries for units. This is compulsory (unlike dictRefs above). Any numerical quantity that is output through cmlAddProperty or cmlAddParameter is required to carry units. These are added with the units="" argument to the function. In addition, every other function below which takes numerical arguments also takes optional units, although defaults will be used if no units are supplied.

Further details are supplied in section Units below.

General naming conventions for functions.

Functions are named in the following way:

Conventions used below.

Note that where strings are passed in, they will be passed through entirely unchanged to the output file - no truncation of whitespace will occur.

Also note that wherever a real number can be passed in (including through anytype) then the formatting can be specified using the conventions described in StringFormatting.

Where an array is passed in, it may be passed either as an assumed-shape array; that is, as an F90-style array with no necessity for specifying bounds; thus:

integer :: array(50)
call cmlAddProperty(xf, 'coords', array)

or as an assumed-size array; that is, an F77-style array, in which case the length must be passed as an additional parameter:

integer :: array(*)
call cmlAddProperty(xf, 'coords', array, nitems=50)

Similarly, when a matrix is passed in, it may be passed in both fashions:

integer :: matrix(50, 50)
call cmlAddProperty(xf, 'coords', matrix)

or

integer :: matrix(3, *)
call cmlAddProperty(xf, 'coords', matrix, nrows=3, ncols=50)

All functions take as their first argument an XML file object, whose keyword is always xf. This file object is initialized by a cmlBeginFile function.

It is highly recommended that subroutines be called with keywords specified rather than relying on the implicit ordering of arguments. This is robust against changes in the library calling convention; and also sidesteps a significant cause of errors when using subroutines with large numbers of arguments.

Units

Note below that the functions cmlAddParameter and cmlAddProperty both require that units be specified for any numerical quantities output.

If you are trying to output a quantity that is genuinely dimensionless, then you should specify units="units:dimensionless"; or if you are trying to output a countable quantity (eg number of CPUs) then you may specify units="units:countable".

For other properties, all units should be specified as namespaced quantities. If you are using a very few common units, it may be easiest to borrow definitions from the provided dictionaries;

(These links do not resolve yet.)

cmlUnits: http://www.xml-cml.org/units/units
siUnits: http://www.xml-cml.org/units/siUnits
atomicUnits: http://www.xml-cml.org/units/atomic

A default units dictionary, containing only the very basic units that wcml needs to know about, is also provided. It has the namespace http://www.uszla.me.uk/FoX/units, and wcml assigns it automatically to the prefix units.

This is added automatically, so attempts to add it manually will fail.

The contents of all of these dictionaries, plus the wcml dictionary, may be viewed at: http://www.uszla.me.uk/unitsviz/units.cgi.

Otherwise, you should feel at liberty to construct your own namespace; declare it using cmlAddNamespace, and markup all your units as:

 units="myNamespace:myunit"

Functions for manipulating the CML file:

This takes care of all calls to open a CML output file.

This takes care of all calls to close an open CML output file, once you have finished with it. It is compulsory to call this - if your program finishes without calling this, then your CML file will be invalid.

This adds a namespace to a CML file.
NB This may only ever be called immediately after a cmlBeginFile call, before any output has been performed. Attempts to do otherwise will result in a runtime error.

This will be needed if you are adding dictionary references to your output. Thus for siesta, we do:

call cmlAddNamespace(xf, 'siesta', 'http://www.uam.es/siesta')

and then output all our properties and parameters with dictRef="siesta:something".

This pair of functions begins and ends the CML output to an existing CML file. They take care of namespaces.

Note that unless specified otherwise, there will be a convention attribute added to the cml tag specifying FoX_wcml-2.0 as the convention. (see http://www.uszla.me.uk/FoX for details)

Start/End sections

This pair of functions open & close a metadataList, which is a wrapper for metadata items.

This pair of functions open & close a parameterList, which is a wrapper for input parameters.

This pair of functions open & close a propertyList, which is a wrapper for output properties.

Start/end a list of k-points (added using cmlAddKpoint below)

Note that in most cases where you might want to use a serial number, you should probably be using the cmlStartStep subroutine below.

This pair of functions open & close a module of a computation which is unordered, or loosely-ordered. For example, METADISE uses one module for each surface examined.

This pair of functions open and close a module of a computation which is strongly ordered. For example, DLPOLY uses steps for each step of the simulation.

Adding items.

This adds a single item of metadata. Metadata vocabulary is completely uncontrolled within WCML. This means that metadata values may only be strings of characters. If you need your values to contain numbers, then you need to define the representation yourself, and construct your own strings.
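Since metadata values must be strings, numbers have to be converted before they are passed in. One possible way to do this is with an internal write. The module metadata_strings and the function int_to_string below are hypothetical helpers, not part of FoX.

```fortran
module metadata_strings
  implicit none
contains

  ! Hypothetical helper, not part of FoX: render an integer as a
  ! string with no leading or trailing blanks.
  function int_to_string(i) result(s)
    integer, intent(in) :: i
    character(len=:), allocatable :: s
    character(len=32) :: buffer

    write(buffer, '(i0)') i     ! i0 uses the minimum field width
    s = trim(buffer)
  end function int_to_string

end module metadata_strings
```

The resulting string can then be supplied as the value of a metadata item. Real numbers can be handled the same way with a format of your choosing, which is exactly the representation decision that WCML leaves to you.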

This function adds a tag representing an input parameter

This function adds a tag representing an output property

Adding geometry information

Outputs an atomic configuration. Bonds may be added using the optional arguments bondAtom1Refs, bondAtom2Refs and bondOrders. All these arrays must be the same length and all must be present if bonds are to be added. Optionally, bondIds can be used to add Ids to the bond elements. Some validity constraints are imposed (atom references used in the bonds must be defined, bonds cannot be added twice). The meaning of the terms "molecule", "bond" and "bond order" is left loosely defined.

Outputs information about a unit cell, in lattice-vector form

Outputs information about a unit cell, in crystallographic form

Adding eigen-information

Start a kpoint section.

End a kpoint section.

Add an empty kpoint section.

Start a section describing one band.

End a section describing one band.

Add a single eigenvalue to a band.

Add a list of eigenvalues for a kpoint

Add a phononic eigenpoint to the band - which has a single energy, and a 3xN matrix representing the eigenvector.

Echoing input files

It is often considered useful to include a direct representation of input data within an application's output files. FoX_wcml contains a number of procedures to allow this in a CML document, based on a specification described in a manuscript currently in review (de Jong, Walker and Hanwell, "From Data to Analysis: Linking NWChem and Avogadro with the Syntax and Semantics of Chemical Markup Language"). The approach is also designed to make data recovery using an XSL transform straightforward. File metadata such as the original filename is also accessible. We assume that only ASCII data must be stored; arbitrary binary files are out of scope, as are XML documents and non-ASCII textual data. Two methods are provided, with the most appropriate depending on the design of the application.

Using file names

In this approach the single subroutine, cmlDumpInputDec, is called with an array of file names as input arguments. Each file in turn is opened, its contents are written to the CML document in the appropriate form, and the file is then closed.

Line by line

Start an outer wrapper for input data.

End the outer wrapper.

Start a wrapper for a single "file" or similar concept.

End the file wrapper.

Put a line of text into the file wrapper.

The convoluted nature of file handling in Fortran combined with the way that some applications read their input data means that this approach is not always available (for example, if the input file is held open for the duration of the calculation, or if data is read from standard input) so an alternative interface with five subroutines (cmlStartDecList, cmlStartDec, cmlAddDecLine, cmlEndDec and cmlEndDecList) is provided. These must be called in order (and usually in two loops, one over files, and an inner loop over lines in each file).

Common arguments

All cmlAdd and cmlStart routines take the following set of optional arguments:


WKML

WKML is a library for creating KML documents. These documents are intended to be used for "expressing geographic annotation and visualization" for maps and Earth browsers such as Google Earth or Marble. WKML wraps all the necessary XML calls, such that you should never need to touch any WXML calls when outputting KML from a Fortran application.

WKML is intended to produce XML documents that conform to version 2.2 of the Open Geospatial Consortium's schema. However, the library offers no guarantee that documents produced will be valid, as only a small subset of the constraints are enforced. The API is designed to minimize the possibility of producing invalid KML in common use cases, and well-formedness is maintained by the underlying WXML library.

The available functions and their intended use are listed below. One useful reference to the use of KML is Google's KML documentation.

Use of WKML

wkml subroutines can be accessed from within a module or subroutine by inserting

 use FoX_wkml

at the start. This will import all of the subroutines described below, plus the derived type xmlf_t needed to manipulate a KML file.

No other entities will be imported; public/private Fortran namespaces are very carefully controlled within the library.

Conventions used below.

All functions take as their first argument an XML file object, whose keyword is always xf. This file object is initialized by a kmlBeginFile function.

It is highly recommended that subroutines be called with keywords specified rather than relying on the implicit ordering of arguments. This is robust against changes in the library calling convention; and also sidesteps a significant cause of errors when using subroutines with large numbers of arguments.

Functions for manipulating the KML file:

This takes care of all calls to open a KML output file.

This takes care of all calls to close an open KML output file, once you have finished with it. It is compulsory to call this - if your program finishes without calling this, then your KML file will be invalid.

This starts a new folder. Folders are used in KML to organize other objects into groups; the visibility of these groups can be changed in one operation within Google Earth. Folders can be nested.

This closes the current folder.

This starts a new document element at this point in the output. Note that no checks are currently performed to ensure that this is permitted, for example only one document is permitted to be a child of the kml root element. Most users should not need to use this subroutine.

This closes the current document element. Do not close the outermost document element created with kmlBeginFile; this must be closed with kmlFinishFile. Most users should not need to use this subroutine.

Functions for producing geometrical objects:

A single function, kmlCreatePoints accepts various combinations of arguments, and will generate a series of individual points to be visualized in Google Earth. In fact, the KML produced will consist of a Folder, containing Placemarks, one for each point. The list of points may be provided in any of the three ways specified above.

A single function, kmlCreateLine accepts various combinations of arguments, and will generate a series of individual points to be visualized as a (closed or open) path in Google Earth. In fact, the KML produced will consist of a LineString, or LinearRing, containing a list of coordinates. The list of points may be provided in any of the three ways specified above.

Creates a filled region with the outer boundary described by the list of points. May be followed by one or more calls to kmlAddInnerBoundary, and these must be followed by a call to kmlEndRegion.

Ends the specification of a region with or without inner boundaries.

Introduces an internal area that is to be excluded from the enclosing region.

2D fields

WKML also contains two subroutines to allow scalar fields to be plotted over a geographical region. Data is presented to WKML as a collection of values and coordinates and this data can be displayed as a set of coloured cells, or as isocontours.

Data input

For all 2-D field subroutines both position and value of the data must be specified. The data values must always be specified as a rank-2 array, values(:,:). The grid can be specified in three ways depending on grid type.

In all cases, single or double precision data may be used so long as all data is consistent in precision within one call.

Control over the third dimension

The third dimension of the data can be visualized in two (not mutually exclusive) ways: firstly by assigning colours according to the value of the third dimension, and secondly by using the altitude of the points as a (suitably scaled) proxy for the third dimension. The following optional arguments control this aspect of the visualization (both for cells and for contours):

Where no colormap is provided, one will be autogenerated with the appropriate number of levels as calculated from the provided contour_values. Where no contour_values are provided, they are calculated based on the size of the colormap provided. Where neither colormap nor contour_values are provided, a default of 5 levels with an autogenerated colormap will be used.

Subroutines

This subroutine generates a set of filled pixels over a region of the earth.

This subroutine creates a set of contour lines.

Colours

KML natively handles all colours as 32-bit values, expressed as 8-digit hexadecimal numbers in ABGR (alpha-blue-green-red) channel order. However, this is not very friendly. WKML provides a nicer interface to this, and all WKML functions which accept colour arguments will accept them in three ways:

A function and a subroutine are provided to manipulate the color_t derived type:

This function takes a single argument of type integer or string and returns a color_t derived type. If the argument is a string the colour is taken from the set of X11 colours; if it is an integer, i, the ith colour is selected from the X11 list.

This function takes a single argument of type string(len=8) representing an 8-digit ABGR hexadecimal number and returns a color_t derived type representing that colour.
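Because the channel order is alpha-blue-green-red rather than the more familiar red-green-blue, it is easy to get the 8-digit string wrong when writing it by hand. A small helper can assemble it from the individual channels. The module abgr_sketch and the function abgr_hex below are hypothetical and not part of WKML.

```fortran
module abgr_sketch
  implicit none
contains

  ! Hypothetical helper, not part of FoX: pack four channel values
  ! (each assumed to lie in 0-255) into an 8-digit hexadecimal
  ! string in ABGR order.
  function abgr_hex(alpha, blue, green, red) result(hex)
    integer, intent(in) :: alpha, blue, green, red
    character(len=8) :: hex

    ! z2.2 writes each channel as exactly two hex digits, zero-padded.
    write(hex, '(4z2.2)') alpha, blue, green, red
  end function abgr_hex

end module abgr_sketch
```

For instance, fully opaque red is alpha 255, blue 0, green 0, red 255, so abgr_hex(255, 0, 0, 255) gives a string that starts with the alpha digits and ends with the red digits.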

Several features of wkml make use of "colour maps", arrays of the color_t derived type, which are used to relate numerical values to colours when showing fields of data. These are created and used thus:

program colours
  use FoX_wkml
  implicit none
  type(color_t) :: colourmap(10)
  integer :: i

  ! Use X11 colours from 101 to 110:
  do i = 1, 10
    colourmap(i) = kmlGetCustomColor(100 + i)
  end do
  ! Except for number 5 which should be red:
  colourmap(5) = kmlGetCustomColor("indian red")
  ! And for number 6 which should be black
  call kmlSetCustomColor(colourmap(6), "00000000")

end program colours

Styles

Controlling styling in KML can be quite complex. Most of the subroutines in WKML allow some control of the generated style, but they do not provide access to the full KML vocabulary, which allows more complex styling. In order to access the more complex styles in KML it is necessary to create KML style maps: objects that are defined once and named with a styleURL. The styleURL is then used to refer to the style defined by the map.

Styles can be created using the following three subroutines. In each case one argument is necessary: id, which must be a string (starting with an alphabetic letter, and containing no spaces or punctuation marks) which is used later on to reference the style. All other arguments are optional.

Creates a style that can be used for points.

Creates a style that can be used for lines.

Creates a style that can be used for a polygon.


Debugging with FoX.

Following experience integrating FoX_wxml into several codes, here are a few tips for debugging any problems you may encounter.

Compilation problems

You may encounter problems at the compiling or linking stage, with error messages along the lines of: 'No Specific Function can be found for this Generic Function' (exact phrasing depending on compiler, of course.)

If this is the case, it is possible that you have accidentally got the arguments to the offending subroutine out of order. If so, then use the keyword form of the arguments to ensure correctness; that is, instead of doing:

call cmlAddProperty(file, name, value)

do:

call cmlAddProperty(xf=file, name=name, value=value)

This will prevent argument mismatches, and is recommended practice in any case.

Runtime problems

You may encounter run-time issues. FoX performs many run-time checks to ensure the validity of the resultant XML code. In so far as it is possible, FoX will either issue warnings about potential problems, or try to handle safely any errors it encounters. In both cases, warnings will be output on stderr, which will hopefully help diagnose the problem.

Sometimes, however, FoX will encounter a problem it can do nothing about, and must stop. In all cases, it will try and write out an error message highlighting the reason, and generate a backtrace pointing to the offending line. Occasionally though, the compiler will not generate this information, and the error message will be lost.

If this is the case, you can either investigate the coredump to find the problem, or (if you are on a Mac) look in ~/Library/Logs/CrashReporter to find a human-readable log.

If this is not enlightening, or you cannot find the problem, then some of the most common issues we have encountered are listed below. Many of them are general Fortran problems, but sometimes are not easily spotted in the context of FoX.

Incorrect formatting.

Make sure, whenever you are writing out a real number through one of FoX's routines, and specifying a format, that the format is correct according to StringFormatting. Fortran-style formats are not permitted, and will cause crashes at runtime.

Array overruns

If you are outputting arrays or matrices, and are doing so in the traditional Fortran style - by passing both the array and its length to the routine, like so:

 call xml_AddAttribute(xf=file, name=name, value=array, nvalue=n)

then if n is wrong, you may end up with an array overrun, and cause a crash.

We highly recommend wherever possible using the Fortran-90 style, like so:

 call xml_AddAttribute(xf=file, name=name, value=array)

where the array length will be passed automatically.

Uninitialized variables

If you are passing variables to FoX which have not been initialized, you may well cause a crash. This is especially true, and easy to cause if you are passing in an array which (due to a bug elsewhere) has been partly but not entirely initialized. To diagnose this, try printing out suspect variables just before passing them to FoX, and look for suspiciously wrong values.

Invalid floating point numbers.

If during the course of your calculation you accidentally generate Infinities, or NaNs, then passing them to any Fortran subroutine can result in a crash - therefore trying to pass them to FoX for output may result in a crash.

If you suspect this is happening, try printing out suspect variables before calling FoX.
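Rather than inspecting printed values by eye, the check can be automated with the standard ieee_arithmetic intrinsic module. The sketch below is independent of FoX: the module finite_check and the function all_finite are hypothetical names, and it assumes a compiler that supports the IEEE intrinsic modules.

```fortran
module finite_check
  use, intrinsic :: ieee_arithmetic, only: ieee_is_finite
  implicit none
contains

  ! Hypothetical helper, not part of FoX: true only if no element of
  ! values is an Infinity or a NaN.
  function all_finite(values) result(ok)
    double precision, intent(in) :: values(:)
    logical :: ok

    ok = all(ieee_is_finite(values))
  end function all_finite

end module finite_check
```

Calling all_finite on an array just before handing it to an output routine, and stopping with your own message if it returns .false., points at the offending quantity more directly than a crash inside the library would.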


SAX

SAX stands for Simple API for XML, and was originally a Java API for reading XML. (Full details at http://saxproject.org). SAX implementations exist for most common modern computer languages.

FoX includes a SAX implementation, which translates most of the Java API into Fortran, and makes it accessible to Fortran programs, enabling them to read in XML documents in a fashion as close and familiar as possible to other languages.

SAX is a stream-based, event callback API. Conceptually, running a SAX parser over a document results in the parser generating events as it encounters different XML components, and sending the events to the main program, which can read them and take suitable action.

Events

Events are generated when the parser encounters, for example, an element opening tag, or some text, and most events carry some data with them - the name of the tag, or the contents of the text.

The full list of events is quite extensive, and may be seen below. For most purposes, though, it is unlikely that most users will need more than the 5 most common events, documented here.

Given these events and accompanying information, a program can extract data from an XML document.

Invoking the parser.

Any program using the FoX SAX parser must a) use the FoX module, and b) declare a derived type variable to hold the parser, like so:

   use FoX_sax
   type(xml_t) :: xp

The FoX SAX parser then works by requiring the programmer to write a module containing subroutines to receive any of the events they are interested in, and passing these subroutines to the parser.

Firstly, the parser must be initialized, by passing it XML data. This can be done either by giving a filename, which the parser will manipulate, or by passing a string containing an XML document. Thus:

  call open_xml_file(xp, "input.xml", iostat)

The iostat variable will report back any errors in opening the file.

Alternatively,

  call open_xml_string(xp, XMLstring)

where XMLstring is a character variable.

To now run the parser over the file, you simply do:

 call parse(xp, list_of_event_handlers)

And once you're finished, you can close the file, and clean up the parser, with:

 call close_xml_t(xp)

Options to parser

It is unlikely that most users will need to operate any of these options, but the following are available for use; all are optional boolean arguments to parse.

Receiving events

To receive events, you must construct a module containing event handling subroutines. These are subroutines of a prescribed form - the input & output is predetermined by the requirements of the SAX interface, but the body of the subroutine is up to you.

The required forms are shown in the API documentation below, but here are some simple examples.

To receive notification of character events, you must write a subroutine which takes as input one string, which will contain the characters received. So:

module event_handling
  use FoX_sax
contains

  subroutine characters_handler(chars)
    character(len=*), intent(in) :: chars

    print*, chars
  end subroutine
end module

That does very little - it simply prints out the data it receives. However, since the subroutine is in a module, you can save the data to a module variable, and manipulate it elsewhere; alternatively you can choose to call other subroutines based on the input.

So, a complete program which reads in all the text from an XML document looks like this:

module event_handling
  use FoX_sax
contains

  subroutine characters_handler(chars)
    character(len=*), intent(in) :: chars

    print*, chars
  end subroutine
end module

program XMLreader
  use FoX_sax
  use event_handling
  type(xml_t) :: xp
  call open_xml_file(xp, 'input.xml')
  call parse(xp, characters_handler=characters_handler)
  call close_xml_t(xp)
end program
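
As a variation on the program above, the handler can store what it receives in a module variable rather than printing it. The sketch below simply counts the characters reported; the names text_counter and n_chars are illustrative and not part of FoX.

module text_counter
  implicit none
  integer, save :: n_chars = 0   ! module variable updated by the handler
contains

  subroutine characters_handler(chars)
    character(len=*), intent(in) :: chars

    ! Accumulate the length of every chunk of character data
    n_chars = n_chars + len(chars)
  end subroutine characters_handler
end module text_counter

program count_text
  use FoX_sax
  use text_counter
  implicit none
  type(xml_t) :: xp

  call open_xml_file(xp, 'input.xml')
  call parse(xp, characters_handler=characters_handler)
  call close_xml_t(xp)

  ! The total is available once parsing has finished
  print*, "Characters read: ", n_chars
end program count_text

Note that the count includes whitespace between tags, since all character data is reported.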

Attribute dictionaries.

The other event you are most likely to handle is the startElement event. Handling this involves writing a subroutine which takes as input three strings (the namespace URI, local name, and fully qualified name of the tag, in that order) and a dictionary of attributes.

An attribute dictionary is essentially a set of key:value pairs, where the key is the attribute's name, and the value is its value. (When considering namespaces, each attribute also has a URI and localName.)

Full details of all the dictionary-manipulation routines are given in AttributeDictionaries, but here we shall show the most common.

So, a simple subroutine to receive a startElement event would look like:

module event_handling
  use FoX_sax   ! provides dictionary_t, getLength, getQName and getValue
contains

 subroutine startElement_handler(URI, localname, name,attributes)
   character(len=*), intent(in)   :: URI  
   character(len=*), intent(in)   :: localname
   character(len=*), intent(in)   :: name 
   type(dictionary_t), intent(in) :: attributes

   integer :: i

   print*, name

   do i = 1, getLength(attributes)
      print*, getQName(attributes, i), '=', getValue(attributes, i)
   enddo

  end subroutine startElement_handler
end module

program XMLreader
 use FoX_sax
 use event_handling
 type(xml_t) :: xp
 call open_xml_file(xp, 'input.xml')
 call parse(xp, startElement_handler=startElement_handler)
 call close_xml_t(xp)
end program

Again, this does nothing but print out the name of the element, and the names and values of all of its attributes. However, by using module variables, or calling other subroutines, the data could be manipulated further.

Error handling

The SAX parser detects all XML well-formedness errors (and optionally validation errors). By default, when it encounters an error, it will simply halt the program with a suitable error message. However, it is possible to pass in an error handling subroutine if some other behaviour is desired - for example it may be nice to report the error to the user, finish parsing, and carry on with some other task.

In any case, once an error is encountered, the parser will finish. There is no way to continue reading past an error. (This means that all errors are treated as fatal errors, in the terminology of the XML standard).

An error handling subroutine works in the same way as any other event handler, with the event data being an error message. Thus, you could write:

subroutine fatalError_handler(msg)
  character(len=*), intent(in) :: msg

  print*, "The SAX parser encountered an error:"
  print*, msg
  print*, "Never mind, carrying on with the rest of the calculation."
end subroutine
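
A sketch of how such a handler might be wired into a complete program follows. It assumes that the handler is passed to parse under the keyword fatalError_handler, following the naming pattern of the other handlers; the module name and the parse_failed flag are illustrative.

module error_handling
  implicit none
  logical, save :: parse_failed = .false.   ! illustrative flag, not part of FoX
contains

  subroutine fatalError_handler(msg)
    character(len=*), intent(in) :: msg

    print*, "The SAX parser encountered an error:"
    print*, msg
    parse_failed = .true.
  end subroutine fatalError_handler
end module error_handling

program XMLchecker
  use FoX_sax
  use error_handling
  implicit none
  type(xml_t) :: xp

  call open_xml_file(xp, 'input.xml')
  ! With a handler supplied, an error no longer halts the program
  call parse(xp, fatalError_handler=fatalError_handler)
  call close_xml_t(xp)

  if (parse_failed) then
    print*, "input.xml could not be parsed; continuing without it."
  else
    print*, "input.xml was parsed without errors."
  end if
end program XMLchecker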

Stopping the parser.

The parser can be stopped at any time. Simply call the following from within one of the callback subroutines:

call stop_parser(xp)

(where xp is the XML parser object). The current callback function will be completed, then the parser will be stopped, and control will return to the main program, the parser having finished.
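
Since the callback interfaces are fixed, the handler needs some other way to reach xp. One possible arrangement, sketched here as an assumption rather than a prescribed pattern, is to keep the parser object in a module variable. The element name "title" and the found flag are hypothetical.

module find_first
  use FoX_sax
  implicit none
  type(xml_t), save :: xp          ! parser handle, visible to the handler
  logical, save :: found = .false.
contains

  subroutine startElement_handler(URI, localname, name, attributes)
    character(len=*), intent(in)   :: URI
    character(len=*), intent(in)   :: localname
    character(len=*), intent(in)   :: name
    type(dictionary_t), intent(in) :: attributes

    if (name == "title") then      ! "title" is a hypothetical element name
      found = .true.
      call stop_parser(xp)         ! parsing ends once this handler returns
    end if
  end subroutine startElement_handler
end module find_first

program XMLsearch
  use FoX_sax
  use find_first
  implicit none

  call open_xml_file(xp, 'input.xml')
  call parse(xp, startElement_handler=startElement_handler)
  call close_xml_t(xp)
  print*, "Found a title element: ", found
end program XMLsearch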


Full API

Derived types

There is one derived type, xml_t. This is entirely opaque, and is used as a handle for the parser.

Subroutines

There are four subroutines:

This opens a file. xp is initialized, and prepared for parsing. string must contain the name of the file to be opened. iostat reports on the success of opening the file. A value of 0 indicates success.

This closes down the parser (and closes the file, if input was coming from a file.) xp is left uninitialized, ready to be used again if necessary.

(Advanced: See above for the list of options that the parse subroutine may take.)

The full list of event handlers is in the next section. To use them, the interface must be placed in a module, and the body of the subroutine filled in as desired; then it should be specified as an argument to parse as:
name_of_event_handler = name_of_user_written_subroutine
Thus a typical call to parse might look something like:

  call parse(xp, startElement_handler = mystartelement, endElement_handler = myendelement, characters_handler = mychars)

where mystartelement, myendelement, and mychars are all subroutines written by you according to the interfaces listed below.


Callbacks.

All of the callbacks specified by SAX 2 are implemented. Documentation of the SAX 2 interfaces is available in the JavaDoc at http://saxproject.org, but as the interfaces needed adjustment for Fortran, they are listed here.

For documentation on the meaning of the callbacks and of their arguments, please refer to the Java SAX documentation.

Triggered when some character data is read from between tags.

NB Note that all character data is reported, including whitespace. Thus you will probably get a lot of empty characters events in a typical XML document.

NB Note also that it is not required that a single chunk of character data all come as one event - it may come as multiple consecutive events. You should concatenate the results of subsequent character events before processing.

Triggered when the parser reaches the end of the document.

Triggered by a closing tag.

Triggered when a namespace prefix mapping goes out of scope.

Triggered when whitespace is encountered within an element declared as having no PCDATA. (Only active in validating mode.)

Triggered by a Processing Instruction

Triggered when either an external entity, or an undeclared entity, is skipped.

Triggered when the parser starts reading the document.

Triggered when an opening tag is encountered. (See AttributeDictionaries for documentation on handling attribute dictionaries.)

Triggered when a namespace prefix mapping comes into scope.

Triggered when a NOTATION declaration is made in the DTD

Triggered when an unparsed entity is declared

Triggered when an error is encountered in parsing. Parsing will continue after this event.

Triggered when a fatal error is encountered in parsing. Parsing will cease after this event.

Triggered when a parser warning is generated. Parsing will continue after this event.

Triggered when an attribute declaration is encountered in the DTD.

Triggered when an element declaration is encountered in the DTD.

Triggered when a parsed external entity is declared in the DTD.

Triggered when an internal entity is declared in the DTD.

Triggered when a comment is encountered.

Triggered by the end of a CData section.

Triggered by the end of a DTD.

Triggered at the end of entity expansion.

Triggered by the start of a CData section.

Triggered by the start of a DTD section.

Triggered by the start of entity expansion.
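
As noted for the characters callback, one piece of text may arrive as several consecutive events. The sketch below shows one way of concatenating them: a fixed-size module buffer is reset at each opening tag, appended to on each characters event, and only inspected at the closing tag. The buffer size and all names other than the handler keywords are illustrative, and the endElement_handler argument list (namespace URI, local name, qualified name) is assumed to mirror that of startElement_handler without the attributes.

module text_collector
  use FoX_sax
  implicit none
  integer, parameter :: max_len = 1024       ! illustrative buffer size
  character(len=max_len), save :: buffer
  integer, save :: buf_len = 0
contains

  subroutine startElement_handler(URI, localname, name, attributes)
    character(len=*), intent(in)   :: URI, localname, name
    type(dictionary_t), intent(in) :: attributes

    buf_len = 0                              ! start a fresh chunk
  end subroutine startElement_handler

  subroutine characters_handler(chars)
    character(len=*), intent(in) :: chars
    integer :: n

    ! Append this event to the buffer, truncating if it would overflow
    n = min(len(chars), max_len - buf_len)
    if (n > 0) then
      buffer(buf_len+1:buf_len+n) = chars(1:n)
      buf_len = buf_len + n
    end if
  end subroutine characters_handler

  subroutine endElement_handler(URI, localname, name)
    character(len=*), intent(in) :: URI, localname, name

    ! Only now is the text complete; skip whitespace-only content
    if (len_trim(buffer(1:buf_len)) > 0) then
      print*, name, ": ", trim(adjustl(buffer(1:buf_len)))
    end if
    buf_len = 0
  end subroutine endElement_handler
end module text_collector

The three subroutines would then be passed to parse as startElement_handler, characters_handler and endElement_handler respectively. This simple scheme discards any text that is interrupted by a child element; a real application may want a stack of buffers instead.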


Exceptions.

The FoX SAX implementation implements all of XML 1.0 and 1.1; all of XML Namespaces 1.0 and 1.1; xml:id and xml:base.

Although FoX tries very hard to work to the letter of the XML and SAX standards, it falls short in a few areas.

(It is impossible to implement IO of non-ASCII documents in a portable fashion using standard Fortran 95, and it is impossible to handle non-ASCII data internally using standard Fortran strings. A fully unicode-capable FoX version is under development, but requires Fortran 2003. Please enquire for further details if you're interested.)

Beyond this, any aspect of the listed XML standards to which FoX fails to do justice is a bug.


What of Java SAX 2 is not included in FoX?

The differences between Java and Fortran mean that none of the SAX APIs can be copied directly. However, FoX offers data types, subroutines, and interfaces covering most of the facilities offered by SAX. Where it does not, this is mentioned here.

org.xml.sax:

org.xml.sax.ext:

org.xml.sax.helpers:


Attribute dictionaries.

When parsing XML using the FoX SAX module, attributes are returned contained within a dictionary object.

This dictionary object implements all the methods described by the SAX interfaces Attributes and Attributes2. Full documentation is available from the SAX Javadoc, but is reproduced here for ease of reference.

All of the attribute dictionary objects and functions are exported through FoX_sax - you must USE the module to enable them. The dictionary API is described here.

An attribute dictionary consists of a list of entries, one for each attribute. The entries all have the following pieces of data:

and for namespaced attributes:

In addition, the following pieces of data will be picked up from a DTD if present:


Derived types

There is one derived type of interest, dictionary_t.

It is opaque - that is, it should only be manipulated through the functions described here.

Functions

Inspecting the dictionary

Returns an integer with the length of the dictionary, ie the number of dictionary entries.

Returns a logical value according to whether the dictionary contains an attribute named key or not.

Returns a logical value according to whether the dictionary contains an attribute with the correct URI and localname.

Retrieving data from the dictionary

Return the full name of the ith dictionary entry.

If an integer is passed in - the value of the ith attribute.

If a single string is passed in, the value of the attribute with that name.

If two strings are passed in, the value of the attribute with that uri and localname.

Returns a string containing the nsURI of the ith attribute.

Returns a string containing the localName of the ith attribute.
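
As an illustration of lookup by name rather than by index, the following sketch checks for one particular attribute in a startElement handler. It assumes that the key-existence test described above is exported as hasKey; the attribute name "id" and the module name are hypothetical.

module id_lookup
  use FoX_sax
  implicit none
contains

  subroutine startElement_handler(URI, localname, name, attributes)
    character(len=*), intent(in)   :: URI
    character(len=*), intent(in)   :: localname
    character(len=*), intent(in)   :: name
    type(dictionary_t), intent(in) :: attributes

    ! Check for the attribute first, then fetch its value by name
    if (hasKey(attributes, "id")) then
      print*, name, " has id ", getValue(attributes, "id")
    else
      print*, name, " has no id attribute"
    end if
  end subroutine startElement_handler
end module id_lookup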

DTD-driven functions

The following functions are only of interest if you are using DTDs.

If an integer is passed in, returns the type of the ith attribute.

If a single string is passed in, returns the type of the attribute with that QName.

If two strings are passed in, returns the type of the attribute with that {uri,localName}.

If an integer is passed in, returns false unless the ith attribute is declared in the DTD.

If a single string is passed in, returns false unless the attribute with that QName is declared in the DTD.

If two strings are passed in, returns false unless the attribute with that {uri,localName} is declared in the DTD.

If an integer is passed in, returns true unless the ith attribute is a default value from the DTD.

If a single string is passed in, returns true unless the attribute with that QName is a default value from the DTD.

If two strings are passed in, returns true unless the attribute with that {uri,localName} is a default value from the DTD.


DOM

Overview

The FoX DOM interface exposes an API as specified by the W3C DOM Working group.

FoX implements essentially all of DOM Core Levels 1 and 2 (a number of minor exceptions are listed below) and a substantial portion of DOM Core Level 3.

Interface Mapping

FoX implements all objects and methods mandated in DOM Core Level 1 and 2. (A listing of supported DOM Core Level 3 interfaces is given below.)

In all cases, the mapping from DOM interface to Fortran implementation is as follows:

  1. All DOM objects are available as Fortran types, and should be referenced only as pointers (though see 7 and 8 below). Thus, to use a Node, it must be declared first as:
    type(Node), pointer :: aNode
  2. A flat (non-inheriting) object hierarchy is used. All DOM objects which inherit from Node are represented as Node types.
  3. All object method calls are modelled as functions or subroutines with the same name, whose first argument is the object. Thus:
    aNodelist = aNode.getElementsByTagName(tagName)
    should be converted to Fortran as:
    aNodelist => getElementsByTagName(aNode, tagName)
  4. All object method calls whose return type is void are modelled as subroutines. Thus:
    aNode.normalize()
    becomes call normalize(aNode)
  5. All object attributes are modelled as a pair of get/set calls (or only get where the attribute is readonly), with the naming convention being merely to prepend get or set to the attribute name. Thus:
    name = node.nodeName
    node.nodeValue = string
    should be converted to Fortran as
    name = getNodeName(node)
    call setNodeValue(node, string)
  6. Where an object method or attribute getter returns a DOM object, the relevant Fortran function must always be used as a pointer function. Thus:
    aNodelist => getElementsByTagName(aNode, tagName)
  7. No special DOMString object is used - all string operations are done on the standard Fortran character strings, and all functions that return DOMStrings return Fortran character strings.
  8. Exceptions are modelled by every DOM subroutine/function allowing an optional additional argument, of type DOMException. For further information see Exception handling below.
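
The rules above can be seen working together in a short sketch such as the following, which parses a file and lists the children of its root element. The file name is an assumption, and the calls used (getDocumentElement, getChildNodes, getLength, item) are the Fortran renderings of the corresponding W3C DOM methods under the mapping just described.

program dom_mapping
  use FoX_dom
  implicit none
  ! Rule 1: DOM objects are declared as pointers
  type(Node), pointer :: doc, root, child
  type(NodeList), pointer :: children
  integer :: i

  doc => parseFile("input.xml")

  ! Rules 3 and 6: a method returning an object becomes a pointer function
  root => getDocumentElement(doc)
  children => getChildNodes(root)

  ! Rules 5 and 7: an attribute getter returning a plain Fortran string
  print*, "Root element: ", getNodeName(root)

  ! NodeList items are indexed from 0, as in the DOM
  do i = 0, getLength(children) - 1
    child => item(children, i)
    print*, "  child node: ", getNodeName(child)
  end do

  ! Rule 4: a void method becomes a subroutine
  call normalize(root)

  call destroy(doc)
end program dom_mapping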

String handling

The W3C DOM requires that a DOMString object exist, capable of holding Unicode strings; and that all DOM functions accept and emit DOMString objects when string data is to be transferred.

FoX does not follow this model. Since (as mentioned elsewhere) it is impossible to perform Unicode I/O in standard Fortran, it would be obtuse to require users to manipulate additional objects merely to transfer strings. Therefore, wherever the DOM mandates use of a DOMString, FoX merely uses standard Fortran character strings.

All functions or subroutines which expect DOMString input arguments should be used with normal character strings.
All functions which should return DOMString objects will return Fortran character strings.

Using the FoX DOM library.

All functions are exposed through the module FoX_DOM. USE this in your program:

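program dom_example

  use FoX_DOM
  implicit none
  ! DOM objects must always be declared as pointers
  type(Node), pointer :: myDoc

  ! Parse the input file into a DOM tree, then write it back out
  myDoc => parseFile("fileIn.xml")
  call serialize(myDoc, "fileOut.xml")
  ! Release all memory associated with the document
  call destroy(myDoc)
end program dom_example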

Documenting DOM functions

This manual will not exhaustively document the functions available through the FoX_DOM interface. Primary documentation may be found in the W3C DOM specifications:

The systematic rules for translating the DOM interfaces to Fortran are given in the previous section. For completeness, though, there is a list here. The W3C specifications should be consulted for the use of each.

DOMImplementation:
type(DOMImplementation), pointer

Document:
type(Node), pointer

Node:
type(Node), pointer

NodeList:
type(NodeList), pointer

NamedNodeMap:
type(NamedNodeMap), pointer

CharacterData:
type(Node), pointer

Attr:
type(Node), pointer

Element:
type(Node), pointer

Text:
type(Node), pointer

DocumentType:
type(Node), pointer

Notation:
type(Node), pointer

Entity:
type(Node), pointer

ProcessingInstruction:
type(Node), pointer

In addition, the following DOM Core Level 3 functions are available:

Document:

Node:

Attr:

Entity:

Text:

DOMConfiguration:
type(DOMConfiguration), pointer

NB For details on DOMConfiguration, see below

Object Model

The DOM is written in terms of an object model involving inheritance, but also permits a flattened model. FoX implements this flattened model - all objects descending from the Node are of the opaque type Node. Nodes carry their own type, and attempts to call functions defined on the wrong nodetype (for example, getting the target of a node which is not a PI) will result in a FoX_INVALID_NODE exception.

The other types available through the FoX DOM are:

FoX DOM and pointers

All DOM objects exposed to the user may only be manipulated through pointers. Attempts to access them directly will result in compile-time or run-time failures according to your environment.

This should have little effect on the structure of your programs, except that you must always remember, when calling a DOM function, to perform pointer assignment, not direct assignment, thus:
child => getFirstChild(parent)
and not
child = getFirstChild(parent)

Memory handling

Fortran offers no garbage collection facility, so unfortunately a small degree of memory handling is necessarily exposed to the user.

However, this has been kept to a minimum. FoX keeps track of all memory allocated and used when calling DOM routines, and keeps references to all DOM objects created.

The only memory handling that the user needs to take care of is destroying any DOM Documents (whether created manually, or by the parseFile or parseString routines). All other nodes or node structures created will be destroyed automatically by the relevant destroy() call.

As a consequence of this, all DOM objects which are part of a given document will become inaccessible after the document object is destroyed.

Additional functions.

Several additional utility functions are provided by FoX.

Input and output of XML data

Firstly, to construct a DOM tree, from either a file or a string containing XML data.

filename should be an XML document. It will be opened and parsed into a DOM tree. The parsing is performed by the FoX SAX parser; if the XML document is not well-formed, a PARSE_ERR exception will be raised. configuration is an optional argument - see DOMConfiguration for its meaning.

XMLstring should be a string containing XML data. It will be parsed into a DOM tree. The parsing is performed by the FoX SAX parser; if the XML document is not well-formed, a PARSE_ERR exception will be raised. configuration is an optional argument - see DOMConfiguration for its meaning.

Both parseFile and parseString return a pointer to a Node object containing the Document Node.

Secondly, to output an XML document:

This will open fileName and serialize the DOM tree by writing into the file. If fileName already exists, it will be overwritten. If a problem arises in serializing the document, then a fatal error will result.

(Control over serialization options is done through the configuration of the arg's ownerDocument, see below.)

Finally, to clean up all memory associated with the DOM, it is necessary to call:

This will clear up all memory usage associated with the document (or documentType) node passed in.
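
The three steps (construct, output, clean up) might be combined as in the following sketch, which builds the tree from a string rather than a file. The XML content and the output file name are illustrative.

program dom_roundtrip
  use FoX_dom
  implicit none
  type(Node), pointer :: doc

  ! Build a DOM tree from a string containing XML data
  doc => parseString("<greeting>hello</greeting>")

  ! Write the tree out; an existing greeting.xml would be overwritten
  call serialize(doc, "greeting.xml")

  ! Release all memory associated with the document
  call destroy(doc)
end program dom_roundtrip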

Extraction of data from an XML file.

The standard DOM functions only deal with string data. When dealing with numerical (or logical) data, the following functions may be of use.

These extract data from, respectively, the text content of an element, from one of its attributes, or from one of its namespaced attributes. They are used like so:

(where p is an element which has been selected by means of the other DOM functions)

call extractDataContent(p, data)

The subroutine will look at the text contents of the element, and interpret them according to the type of data. That is, if data has been declared as an integer, then the contents of p will be read as such and placed into data.

data may be a string, logical, integer, real, double precision, complex or double complex variable.

In addition, if data is supplied as a rank-1 or rank-2 variable (ie an array or a matrix) then the data will be read in assuming it to be a space- or comma-separated list of such data items.

Thus, the array of integers within the XML document:

<element> 1 2 3 4 5 6 </element>

could be extracted by the following Fortran program:

type(Node), pointer :: doc, p
integer :: i_array(6)

doc => parseFile(filename)
p => item(getElementsByTagName(doc, "element"), 0)
call extractDataContent(p, i_array)

Contents and Attributes

For extracting data from text content, the example above suffices. For data in a non-namespaced attribute (in this case, a 2x2 matrix of real numbers)

<element att="0.1, 2.3 7.56e23, 93"> Some uninteresting text </element>

then use a Fortran program like:

type(Node), pointer :: doc, p
real :: r_matrix(2,2)

doc => parseFile(filename)
p => item(getElementsByTagName(doc, "element"), 0)
call extractDataAttribute(p, "att", r_matrix)

or for extracting from a namespaced attribute (in this case, a length-2 array of complex numbers):

<myml xmlns:ns="http://www.example.org">
  <element ns:att="0.1,2.3  3.4e2,5.34"> Some uninteresting text </element>
</myml>

then use a Fortran program like:

type(Node), pointer :: doc, p
complex :: c_array(2)

doc => parseFile(filename)
p => item(getElementsByTagName(doc, "element"), 0)
call extractDataAttributeNS(p, &
     namespaceURI="http://www.example.org", localName="att", &
     data=c_array)

Error handling

The extraction may fail of course, if the data is not of the sort specified, or if there are not enough elements to fill the array or matrix. In such a case, this can be detected by the optional arguments num and iostat.

num will hold the number of items successfully read. This should equal the expected number of items, but it may be smaller if reading failed for some reason, or if there were fewer items than expected in the element.

iostat will hold an integer: 0 if the extraction succeeded; -1 if too few elements were found; 1 if the read succeeded but some elements were left over; or 2 if the extraction failed because of a badly formatted number or because the wrong data type was found.
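
A sketch of how these two arguments might be checked follows. It reuses the integer-array example from earlier in this section; the file name and the messages are illustrative.

program extract_checked
  use FoX_dom
  implicit none
  type(Node), pointer :: doc, p
  integer :: i_array(6)
  integer :: num, iostat

  doc => parseFile("input.xml")
  p => item(getElementsByTagName(doc, "element"), 0)
  call extractDataContent(p, i_array, num=num, iostat=iostat)

  ! Interpret iostat according to the values listed above
  select case (iostat)
  case (0)
    print*, "Read all values: ", i_array
  case (-1)
    print*, "Too few values; only ", num, " were read"
  case (1)
    print*, "Array filled, but extra values were left over"
  case default
    print*, "Badly formatted or wrongly typed data after ", num, " values"
  end select

  call destroy(doc)
end program extract_checked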

String arrays

For all data types apart from strings, arrays and matrices are specified by space- or comma-separated lists. For strings, some additional options are available. By default, arrays will be extracted assuming that separators are spaces (and multiple spaces are ignored). So:

<element> one two     three </element>

will result in the string array (/"one", "two", "three"/).

However, you may specify an optional argument separator, which specifies another single-character separator to use (and does not ignore multiple spaces). So:

<element>one, two, three </element>

will result in the string array (/"one", " two", " three "/). (note the leading and trailing spaces).

Finally, you can also specify an optional logical argument, csv. In this case, the separator is ignored, and the extraction proceeds assuming that the data is a list of comma-separated values. (see: CSV)
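
For instance, the comma-separated example above might be read as in the following sketch. The string length of 10 is an arbitrary choice that is long enough for the values shown, and the brackets in the output are only there to make the leading and trailing spaces visible.

program extract_strings
  use FoX_dom
  implicit none
  type(Node), pointer :: doc, p
  character(len=10) :: s_array(3)
  integer :: i

  doc => parseString("<element>one, two, three </element>")
  p => getDocumentElement(doc)

  ! Split on commas only; spaces are kept as part of each item
  call extractDataContent(p, s_array, separator=",")

  do i = 1, size(s_array)
    print*, "[", trim(s_array(i)), "]"
  end do

  call destroy(doc)
end program extract_strings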

Other utility functions

This affects whether additional FoX-only checks are made (see DomExceptions below).

Retrieves the current setting of FoX_checks.

Note that FoX_checks can only be turned on and off globally, not on a per-document basis.

arg must be a Document Node. Calling this function affects whether any nodelists active on the document are treated as live - ie whether updates to the documents are reflected in the contents of nodelists (see DomLiveNodelists below).

Retrieves the current setting of liveNodeLists.

Note that the live-ness of nodelists is a per-document setting.

Exception handling

Exception handling is important to the DOM. The W3C DOM standards provide not only interfaces to the DOM, but also specify the error handling that should take place when invalid calls are made.

The DOM specifies these in terms of a DOMException object, which carries a numeric code whose value reports the kind of error generated. Depending upon the features available in a particular computer language, this DOMException object should be generated and thrown, to be caught by the end-user application.

Fortran of course has no mechanism for throwing and catching exceptions. However, the behaviour of an exception can be modelled using Fortran features.

FoX defines an opaque DOMException object. Every DOM subroutine and function implemented by FoX will take an optional argument, 'ex', of type DOMException.

If the optional argument is not supplied, any errors within the DOM will cause an immediate abort, with a suitable error message. However, if the optional argument is supplied, then the error will be captured within the DOMException object, and returned to the caller for inspection. It is then up to the application to decide how to proceed.

Functions for inspecting and manipulating the DOMException object are described below:

A function returning a logical value, according to whether ex is in exception - that is, whether the last DOM function or subroutine, from which ex returned, caused an error. Note that this will not change the status of the exception.

A function returning an integer value, describing the nature of the exception reported in ex. If the integer is 0, then ex does not hold an exception. If the integer is less than 200, then the error encountered was of a type specified by the DOM standard; for a full list, see below, and for explanations, see the various DOM standards. If the integer is 200 or greater, then the code represents a FoX-specific error. See the list below.

Note that calling getExceptionCode will clean up all memory associated with the DOMException object, and reset the object such that it is no longer in exception.

Exception handling and memory usage.

Note that when an Exception is thrown, memory is allocated within the DOMException object. Calling getExceptionCode on a DOMException will clean up this memory. If you use the exception-handling interfaces of FoX, then you must check every exception, and ensure you check its code, otherwise your program will leak memory.
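
A sketch of the resulting pattern is given below. It assumes that the logical inspection function described above is exported as inException, and that the optional argument is passed by the keyword ex; the file names are illustrative.

program dom_exceptions
  use FoX_dom
  implicit none
  type(Node), pointer :: doc
  type(DOMException) :: ex
  integer :: code

  ! Supplying ex means a parse error no longer aborts the program
  doc => parseFile("input.xml", ex=ex)

  if (inException(ex)) then
    ! Reading the code also frees the memory held by the exception
    code = getExceptionCode(ex)
    print*, "Parsing failed with exception code ", code
  else
    call serialize(doc, "output.xml")
    call destroy(doc)
  end if
end program dom_exceptions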

FoX exceptions.

The W3C DOM interface allows the creation of unserializable XML documents in various ways. For example, it permits characters to be added to a text node which would be invalid XML. FoX performs multiple additional checks on all DOM calls to prevent the creation of unserializable trees. These are reported through the DOMException mechanisms noted above, using additional exception codes. However, if for some reason, you want to create such trees, then it is possible to switch off all FoX-only checks. (DOM-mandated checks may not be disabled.) To do this, use the setFoX_checks function described in DomUtilityFunctions.

Note that FoX does not yet check for all ways that a tree may be made non-serializable.

List of exceptions.

The following is the list of all exception codes (both specified in the W3C DOM and those related to FoX-only checks) that can be generated by FoX:

Live nodelists

The DOM specification requires that all NodeList objects are live - that is, that any change in the document structure is immediately reflected in the contents of any nodelists.

For example, any nodelists returned by getElementsByTagName or getElementsByTagNameNS must be updated whenever nodes are added to or removed from the document; and the order of nodes in the nodelists must be changed if the document structure changes.

Though FoX does keep all nodelists live, this can impose a significant performance penalty when manipulating large documents. Therefore, FoX can be instructed to only use 'dead' nodelists - that is, nodelists which reflect a snapshot of the document structure at the point they were created. To do this, call setLiveNodeLists (see API documentation).

However, note that the nodes within the nodelist remain live - any changes made to the nodes will be reflected in accessing them through the nodelist.

Furthermore, since the nodelists are still associated with the document, they and their contents will be rendered inaccessible when the document is destroyed.
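
A sketch of switching to dead nodelists for one document follows. It assumes that setLiveNodeLists takes the Document Node followed by a logical value; the file and element names are hypothetical.

program dead_nodelists
  use FoX_dom
  implicit none
  type(Node), pointer :: doc
  type(NodeList), pointer :: items

  doc => parseFile("large.xml")

  ! Nodelists on this document are now snapshots, not live views
  call setLiveNodeLists(doc, .false.)

  items => getElementsByTagName(doc, "item")
  print*, "Number of item elements: ", getLength(items)

  ! The nodelist becomes inaccessible once the document is destroyed
  call destroy(doc)
end program dead_nodelists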

DOM Configuration

Multiple valid DOM trees may be produced from a single document. When parsing input, some of these choices are made available to the user.

By default, the DOM tree presented to the user will be produced according to the following criteria:

However, if another tree is desired, the user may change this. For example, very often you would rather be working with the fully canonicalized tree, with all cdata sections replaced by text nodes and merged, and all entity references replaced with their contents.

The mechanism for doing this is the optional configuration argument to parseFile and parseString. configuration is a DOMConfiguration object, which may be manipulated by setParameter calls.

Note that FoX's implementation of DOMConfiguration does not follow the specification precisely. One DOMConfiguration object controls all of parsing, normalization and serialization. It can be used like so:

use FoX_dom
implicit none
type(Node), pointer :: doc
! Declare a new configuration object
type(DOMConfiguration), pointer :: config
! Request full canonicalization
! ie convert CDATA sections to text sections, remove all entity references etc.
config => newDOMConfig()
call setParameter(config, "canonical-form", .true.)
! Turn on validation
call setParameter(config, "validate", .true.)
! parse the document
doc => parseFile("doc.xml", config)

! Do a whole lot of DOM processing ...

! change the configuration to allow cdata-sections to be preserved.
call setParameter(getDomConfig(doc), "cdata-sections", .true.)
! normalize the document again 
call normalizeDocument(doc)
! change the configuration to influence the output - make sure there is an XML declaration
call setParameter(getDomConfig(doc), "xml-declaration", .true.)
! and write the document out.
call serialize(doc, "doc_out.xml")
! once everything is done, destroy the doc and config
call destroy(doc)
call destroy(config)

The available configuration options are fully explained in:

and are all implemented, with the exceptions of: error-handler, schema-location, and schema-type.
In total there are 24 implemented configuration options (schema-location and schema-type are not implemented). The options known by FoX are as follows:

Setting canonical-form changes the value of entities, cdata-sections, discard-default-content, invalid-pretty-print, and xml-declaration to false and changes namespaces, namespace-declarations, and element-content-whitespace to true. Unsetting canonical-form causes these options to revert to the default settings. Changing the values of any of these options has the side effect of unsetting canonical-form (but does not cause the other options to be reset). Setting validate unsets validate-if-schema and vice versa.

DOM Miscellanea

Other issues

It was decided to implement W3C DOM interfaces primarily because they are specified in a language-agnostic fashion, and thus made Fortran implementation possible. A number of criticisms have been levelled at the W3C DOM, but many apply only from the perspective of Java developers. However, more importantly, the W3C DOM suffers from a lack of sufficient error checking so it is very easy to create a DOM tree, or manipulate an existing DOM tree into a state, that cannot be serialized into a legal XML document.

(Although the Level 3 DOM specifications finally addressed this issue, they did so in a fashion that was neither very useful, nor easily translatable into a Fortran API.)

Therefore, FoX will by default produce errors about many attempts to manipulate the DOM in such a way as would result in invalid XML. These errors can be switched off if standards-compliant behaviour is wanted. Although extensive, these checks are not complete. In particular, the way the W3C DOM mandates namespace handling makes it trivial to produce namespace non-well-formed document trees, and very difficult for the processor to automatically detect the non-well-formedness. Thus a fully well-formed tree is only guaranteed after a suitable normalizeDocument call.


UTILS

FoX_utils is a collection of general utility functions that the rest of FoX depends on, but which may be of independent use. They are documented here.

All functions are accessible from the FoX_utils module.

NB Unlike the APIs of WXML, WCML, and SAX, the UTILS APIs may not remain constant between FoX versions. While some effort will be expended to ensure they don't change unnecessarily, no guarantees are made.

For any end-users interested in the code who are worried about interface changes, it is recommended that the relevant code (all found in the utils/ directory) be lifted directly and imported into other projects, rather than accessed through the FoX interfaces.

Two sets of utility functions are provided; one concerned with UUIDs, and a set concerned with URIs.

UUID

UUIDs (see RFC 4122) are Universally Unique IDentifiers. Each is a 128-bit number, represented as a 36-character string. For example:

 f81d4fae-7dec-11d0-a765-00a0c91e6bf6

The intention of UUIDs is to enable distributed systems to uniquely identify information without significant central coordination. Thus, anyone can create a UUID and use it to identify something with reasonable confidence that the identifier will never be unintentionally used by anyone for anything else.

This property also makes them useful as Uniform Resource Names, to refer to a given document without requiring a position in a particular URI scheme. Thus the above UUID could be referred to as

urn:uuid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6
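The relationship between the 128-bit number, its 36-character string form, and the URN form can be checked outside FoX. The following is a minimal sketch using Python's standard uuid module (not the FoX API), applied to the example UUID above.

```python
import uuid

# The example UUID from the text, parsed from its 36-character form.
example = uuid.UUID("f81d4fae-7dec-11d0-a765-00a0c91e6bf6")

text_form = str(example)              # canonical 36-character string
bit_length = len(example.bytes) * 8   # 16 bytes, i.e. 128 bits
urn_form = example.urn                # the same UUID expressed as a URN

print(len(text_form), bit_length)
print(urn_form)
```

Parsing the URN form back with uuid.UUID gives the same value, so the two spellings are interchangeable as identifiers.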

UUIDs are used by WCML to ensure that every document generated has a unique ID. This enables users to go back later on and have confidence that they are examining the same document, regardless of where it might have ended up in file-system hierarchies or databases.

In addition, UUIDs come in several flavours, one of which (version 1) stores the time of creation to a resolution of 100 nanoseconds. This timestamp can later be extracted from the UUID (for example, with an online UUID decoding service) to verify the creation time.
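As an illustration of how a creation time can be recovered from a time-based UUID, here is a small sketch in Python's standard library; it is not part of FoX, and the helper name uuid1_creation_time is hypothetical. It relies on the RFC 4122 definition of the timestamp as a count of 100-nanosecond intervals since 15 October 1582.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Version 1 timestamps count 100-nanosecond intervals since the start of
# the Gregorian calendar, 15 October 1582 (RFC 4122, section 4.1.4).
GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)

def uuid1_creation_time(text):
    """Return the creation time stored in a version 1 UUID string."""
    parsed = uuid.UUID(text)
    if parsed.version != 1:
        raise ValueError("only version 1 UUIDs carry a timestamp")
    # parsed.time is the 60-bit timestamp; 10 ticks make one microsecond.
    return GREGORIAN_EPOCH + timedelta(microseconds=parsed.time // 10)

created = uuid1_creation_time("f81d4fae-7dec-11d0-a765-00a0c91e6bf6")
print(created.isoformat())
```

The integer division discards the final 100-nanosecond digit, since Python datetimes only resolve microseconds.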

UUIDs may well be useful for other XML document types, or indeed in non-XML applications. Thus, UUIDs may be generated by the following function, with one optional argument.

This function returns a 36-character string containing the UUID.

version identifies the version of UUID to be used (see section 4.1.3 of the RFC). Only versions 0, 1, and 4 are supported: version 0 generates a nil UUID, version 1 a time-based UUID, and version 4 a pseudo-randomly generated UUID.

Version 1 is the default, and is recommended.
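To make the difference between the three kinds of UUID concrete, the sketch below generates one of each with Python's standard uuid module. This only illustrates the UUID versions themselves; it does not call FoX, and Python's generators are not the ones FoX uses.

```python
import uuid

nil_uuid = uuid.UUID(int=0)   # "version 0": every bit is zero
time_uuid = uuid.uuid1()      # version 1: timestamp plus node identifier
random_uuid = uuid.uuid4()    # version 4: pseudo-random bits

for label, value in [("nil", nil_uuid), ("time", time_uuid), ("random", random_uuid)]:
    # UUID.version is None for the nil UUID, which has no version bits set.
    print(label, value, value.version)
```

All three print as 36-character strings; only the time-based and random ones differ from run to run.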

(Note: all pseudo-random-numbers are generated using the high-quality Mersenne Twister algorithm, using the Fortran implementation of Scott Robert Ladd.)

URI

URIs (see RFC 2396) are Uniform Resource Identifiers. A URI is a string, containing several components, which identifies a resource. Very often, this resource is a file, and the URI represents the local or network path to this file.

For example:

http://www.uszla.me.uk/FoX/DoX/index.html

is a URI pointing to the FoX documentation.

Equally, however:

FoX/configure

is a URI reference pointing to the FoX configure script (relative to the current directory, or base URI).

A string which is a URI reference contains several components, some of which are optional.

In addition, a URI reference may contain userinfo, host, port, query, and fragment information. (See the RFC for full details.)
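The way a URI reference breaks down into components can be seen with Python's standard urllib.parse module. This is a sketch of the general decomposition only, not of the FoX functions, and the first URI is a hypothetical one chosen so that every component is present.

```python
from urllib.parse import urlsplit

# Hypothetical URI chosen to exercise every component; not from the FoX docs.
parts = urlsplit("http://user@example.org:8080/FoX/DoX/index.html?lang=en#uuid")

print(parts.scheme)    # scheme
print(parts.netloc)    # authority (userinfo, host and port together)
print(parts.username, parts.hostname, parts.port)
print(parts.path, parts.query, parts.fragment)

# A relative reference has only a path; the other components are empty.
relative = urlsplit("FoX/configure")
print(repr(relative.scheme), repr(relative.netloc), relative.path)
```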

The FoX URI library provides the following features:

If the string provided is not a valid URI reference, then a null pointer is returned; thus this function can be used to check whether a URI is valid.

Thus, if the first URI were /FoX/DoX, and the second ../DoX2/index.html, then the resulting URI would be /FoX/DoX2/index.html
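Resolving a relative reference against a base can be tried out with Python's urllib.parse.urljoin, which follows the RFC 3986 resolution rules. This sketch illustrates the general technique rather than the FoX function; in particular, how FoX treats a base path without a trailing slash is not verified here, and under the RFC rules the trailing slash changes the result.

```python
from urllib.parse import urljoin

reference = "../DoX2/index.html"

# Base ending in "/": the last segment "DoX" is treated as a directory.
with_slash = urljoin("/FoX/DoX/", reference)
print(with_slash)

# Base without the trailing "/": "DoX" is treated as a file name and dropped
# before ".." is applied, so the result moves one level further up.
without_slash = urljoin("/FoX/DoX", reference)
print(without_slash)
```

When checking results against another resolver, it is worth writing directory bases with an explicit trailing slash to avoid this ambiguity.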

For each component a URI might have (scheme, authority, userinfo, host, port, path, query, fragment) there are two functions for extracting the component:

Thus, listing these functions in full:


Further information

FoX evolved from the initial codebase of xmlf90, which was written largely by Alberto Garcia <albertog@icmab.es> and Jon Wakelin <jon.wakelin@bristol.ac.uk>.

FoX is the work of Toby White <tow@uszla.me.uk>, and all bug reports/complaints/bouquets of roses should be sent to him. Andrew Walker <andrew.walker@bristol.ac.uk> currently looks after maintenance of FoX.

There is a FoX website at http://www1.gly.bris.ac.uk/~walker/FoX/.

There is also a mailing list for announcements/queries/bug reports. Information on how to subscribe may be found at http://groups.google.com/group/fox-discuss/. The archive of an older mailing list can be found at http://www.uszla.me.uk/pipermail/fox/.

This manual is © Toby White 2006-2008 with additional modifications by Andrew Walker 2008-2010.


Licensing

FoX is licensed under the agreement below. This is intended to make it as freely available as possible, subject only to retaining copyright notices and acknowledgements.

If for any reason this license causes issues with your intended use of the code, please contact the author.

The license can also be found within the distributed source, in the file FoX/LICENSE

Copyright:
© 2003, 2004, Alberto Garcia, Jon Wakelin
© 2005-2008, Toby White
© 2007-2009, Gen-Tao Chiang
© 2008-2012, Andrew Walker
All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Third-party code.

In addition, FoX includes a random number library, written by Scott Robert Ladd, which is licensed as follows:

! This computer program source file is supplied "AS IS". Scott Robert
! Ladd (hereinafter referred to as "Author") disclaims all warranties,
! expressed or implied, including, without limitation, the warranties
! of merchantability and of fitness for any purpose. The Author
! assumes no liability for direct, indirect, incidental, special,
! exemplary, or consequential damages, which may result from the use
! of this software, even if advised of the possibility of such damage.
!
! The Author hereby grants anyone permission to use, copy, modify, and
! distribute this source code, or portions hereof, for any purpose,
! without fee, subject to the following restrictions:
!
! 1. The origin of this source code must not be misrepresented.
!
! 2. Altered versions must be plainly marked as such and must not
! be misrepresented as being the original source.
!
! 3. This Copyright notice may not be removed or altered from any
! source or altered source distribution.
!
! The Author specifically permits (without fee) and encourages the use
! of this source code for entertainment, education, or decoration. If
! you use this source code in a product, acknowledgment is not required
! but would be appreciated.