Configuration and compilation

You will have received the FoX source code as a tar.gz file.

Unpack it as normal, and change directory into the top-level directory, FoX-$VERSION.

Requirements for use

FoX requires a Fortran 95 compiler - not just Fortran 90. All currently available versions of Fortran compilers claim to support F95. If your favoured compiler is not listed as working, I recommend the use of g95, which is free to download and use. And in such a case, please send a bug report to your compiler vendor.

In the event that you need to write a code targeted at multiple compilers, including some which have bugs preventing FoX compilation, please note the possibility of producing a dummy library.

Configuration

In order to generate the Makefile, make sure that you have a Fortran compiler in your PATH, and do:

./configure

This should suffice for most installations. However:

You may not be interested in all of the modules that FoX supplies. For example, you may only be interested in output, not input. If so, you can select which modules you want using --enable-MODULENAME, where MODULENAME is one of wxml, wcml, wkml, sax, dom. If none are explicitly enabled, then all will be built. (Alternatively, you can exclude modules one at a time with --disable-MODULENAME.) Thus, for example, if you only care about CML output, and not anything else:

./configure --enable-wcml

If you have more than one Fortran compiler available, or it is not on your PATH, you can force the choice by doing:

./configure FC=/path/to/compiler/of/choice
It is possible that the configuration fails. In this case:
- please tell me about it so I can fix it
- all relevant compiler details are placed in the file arch.make; you may be able to edit that file to allow compilation. Again, if so, please let me know what you needed to do.
By default the resultant files are installed under the objs directory. If you wish them to be installed elsewhere, you may do

./configure --prefix=/path/to/installation

Note that the configure process encodes the current directory location in several places. If you move the FoX directory later on, you will need to re-run configure.

You may be interested in dummy compilation. This is activated with the --enable-dummy switch (but only works for wxml/wcml currently).

./configure --enable-wcml --enable-dummy

Compilation

In order to compile the full library, now simply do:

make

This will build all the requested FoX modules, and the relevant examples.

Testing

In the full version of the FoX library, there are several testsuites included.

To run them all, simply run make check from the top-level directory. This will run the individual testsuites, and collate their results.

If any failures occur (unrelated to known compiler issues, see the up-to-date list), please send a message to the mailing list (fox-discuss@googlegroups.com) with details of compiler, hardware platform, and the nature of the failure.

The testsuites for the SAX and DOM libraries are very extensive, and are somewhat fragile, so are not distributed with FoX. Please contact the author for details.

Linking to an existing program

The files all having been compiled and installed, you need to link them into your program.

A script is provided which will provide the appropriate compiler and linker flags for you; this will be created after configuration, in the top-level directory, and is called FoX-config. It may be taken from there and placed anywhere.

FoX-config takes the following arguments:

--fcflags: return flags for compilation
--libs: return flags for linking
--wxml: return flags for compiling/linking against wxml
--wcml: return flags for compiling/linking against wcml
--sax: return flags for compiling/linking against sax

If it is called with no arguments, it will expand to both compile and link flags, thus:

   f95 -o program program.f90 `FoX-config`

For compiling only against FoX, do the following:

f95 -c `FoX-config --fcflags` sourcefile.f90

For linking only to the FoX library, do:

f95 -o program `FoX-config --libs` *.o

or similar, according to your compilation scheme.

Note that by default, FoX-config assumes you are using all modules of the library. If you are only using part, then this can be specified by also passing the name of each module required, like so:

FoX-config --fcflags --wcml

Compiling a dummy library

Because of the shortcomings in some compilers, it is not possible to compile FoX everywhere. Equally, sometimes it is useful to be able to compile a code both with and without support for FoX (perhaps to reduce executable size). Especially where FoX is being used only for additional output, it is useful to be able to run the code and perform computations even without the possibility of XML output.

For this reason, it is possible to compile a dummy version of FoX. This includes all public interfaces, so that your code will compile and link correctly - however none of the subroutines do anything, so you can retain the same version of your code without having to comment out all FoX calls.

Because this dummy version of FoX contains nothing except empty subroutines, it compiles and links with all known Fortran 95 compilers, regardless of compiler bugs.

To compile the dummy code, use the --enable-dummy switch. Note that currently the dummy mode is not yet available for the DOM module.

Using FoX in your own project.

The recommended way to use FoX is to embed the full source code as a subdirectory, into an existing project.

In order to do this, you need to do something like the following:

Put the full source code as a top-level subdirectory of the tree, called FoX.
Incorporate calls to FoX into the program.
Incorporate building FoX into your build process.

To incorporate into the program

It is probably best to isolate use of XML facilities to a small part of the program. This is easily accomplished for XML input, which will generally happen in only one or two places.

For XML output, this can be more complex. The easiest, and least intrusive, way is probably to create an F90 module for your program, looking something like example_xml_module.f90.

Then you must somewhere (probably in your main program), use this module, and call initialize_xml_output() at the start; and then end_xml_output() at the end of the program.

In any of the subroutines where you want to output data to the xml file, you should then insert use example_xml_module at the beginning of the subroutine. You can then use any of the xml output routines with no further worries, as shown in the examples.
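A minimal sketch of what such a wrapper module might look like is given below. It is not the example_xml_module.f90 shipped with FoX; the handle name xmlfile, the file name output.xml and the root element name output are assumptions made purely for illustration.

```fortran
module example_xml_module
  ! Using FoX_wxml here without a PRIVATE statement re-exports the wxml
  ! routines, so callers only need to use this one module.
  use FoX_wxml
  implicit none

  ! Shared file handle (hypothetical name); SAVE keeps it defined for the whole run.
  type(xmlf_t), save :: xmlfile

contains

  subroutine initialize_xml_output()
    ! Open the output file and start the (assumed) root element.
    call xml_OpenFile(filename="output.xml", xf=xmlfile, replace=.true.)
    call xml_NewElement(xmlfile, name="output")
  end subroutine initialize_xml_output

  subroutine end_xml_output()
    ! Close the root element, then the file itself.
    call xml_EndElement(xmlfile, name="output")
    call xml_Close(xmlfile)
  end subroutine end_xml_output

end module example_xml_module
```

Any subroutine that uses this module can then write to the shared handle, for instance with call xml_NewElement(xmlfile, name="step").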

It is easy to make the use of FoX optional, by the use of preprocessor defines. This can be done simply by wrapping each call to your XML wrapper routines in #ifdef XML, or similar. Alternatively, the use of the dummy FoX interfaces allows you to switch FoX on and off at compile time - see Compilation.
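As a sketch of the preprocessor approach, the main program below only touches the XML wrapper when the macro XML is defined. It assumes a wrapper module named example_xml_module that provides initialize_xml_output and end_xml_output as described above, and a compiler that runs the C preprocessor over the source (often triggered by a .F90 suffix or a compiler flag; check your compiler's documentation).

```fortran
program main
#ifdef XML
  ! Only pull in the XML wrapper when XML support is requested.
  use example_xml_module
#endif
  implicit none

#ifdef XML
  call initialize_xml_output()
#endif

  ! The real computation goes here; it runs with or without XML support.
  print *, "computing..."

#ifdef XML
  call end_xml_output()
#endif
end program main
```

Compiling with the macro defined (typically -DXML) enables the XML calls; compiling without it leaves a program with no dependence on FoX.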

To incorporate into the build process:

Configuration

First, FoX must be configured, to ensure that it is set up correctly for your compiler. (See Compilation) If your main code has a configure step, then run FoX's configure as part of it.

If your code doesn't have its own configure step, then the first thing that "make" does should be to configure FoX, if it's not already configured. But that should only happen once; every time you make your code thereafter, you don't need to re-configure FoX, because nothing has changed. To do that, put a target like the following in your Makefile.

FoX/.config:
        (cd FoX; ./configure FC=$(FC))

(Assuming that your Makefile already has a variable FC which sets the Fortran compiler)

When FoX configure completes, it "touch"es a file called FoX/.config. That means that whenever you re-run your own make, it checks to see if FoX/.config exists - if it does, then it knows FoX doesn't need to be re-configured, so it doesn't bother.

Compilation of FoX

Then, FoX needs to be compiled before your code (because your modules will depend on FoX's modules.) But again, it only needs to be compiled once. You won't be changing FoX, you'll only be changing your own code, so recompiling your code doesn't require recompiling FoX.

So, add another target like the following;

FoX/.FoX: FoX/.config
        (cd FoX; $(MAKE))

This has a dependency on the configure script as I showed above, but it will only run it if the configure script hasn't already been run.

When FoX is successfully compiled, the last thing its Makefile does is "touch" the file called FoX/.FoX. So the above target checks to see if that file exists; and if it does, then it doesn't bother recompiling FoX, because it's already compiled. On the very first time you compile your code, it will cd into the FoX directory and compile it - but then never again.

You then need to have that rule be a dependency of your main target; like so:

  MyExecutable: FoX/.FoX

(or whatever your default Makefile rule is).

which will ensure that before MyExecutable is compiled, make will check to see that FoX has been compiled (which most of the time it will be, so nothing further will happen). But the first time you compile your code, it will call the FoX target, and FoX will be configured & compiled.

Compiling/linking your code

You should add this to your FFLAGS (or equivalent - the variable that holds flags for compile-time use):

FFLAGS=-g -O2 -whatever-else $$(FoX/FoX-config --fcflags)

to make sure that you get the path to your modules. (Different compilers have different flags for specifying module paths; some use -I, some use -M, etc. If you use the above construction, it will pick the right one automatically for your compiler.)

Similarly, for linking, add the following to your LDFLAGS (or equivalent - the variable that holds flags for link-time use.)

LDFLAGS=-lwhatever $$(FoX/FoX-config --libs)

(For full details of the FoX-config script, see Compilation)

Cleaning up

Finally - you probably have a clean target in your makefile. Don't tie FoX into this target - most of the time when you make clean, you don't want to make clean with FoX as well, because there's no need - FoX won't have changed and it'll take a couple of minutes to recompile.

However, you can add a distclean (or something) target, which you use before moving your code to another machine, that looks like:

distclean: clean
        (cd FoX; $(MAKE) distclean)

and that will ensure that when you do make distclean, even FoX's object files are cleaned up. But of course that will mean that you have to reconfigure & recompile FoX next time you compile your code.

Standards compliance

FoX is written with reference to the following standards:

[XML10]: http://www.w3.org/TR/REC-xml/

[XML11]: http://www.w3.org/TR/xml11

[Namespaces10]: http://www.w3.org/TR/xml-names

[Namespaces11]: http://www.w3.org/TR/xml-names11

[xml:id]: http://www.w3.org/TR/xml-id/

[xml:base]: http://www.w3.org/TR/xmlbase/

[CanonicalXML]: http://www.w3.org/TR/xml-c14n

[SAX2]: http://saxproject.org

[DOM1]: http://www.w3.org/TR/REC-DOM-Level-1/level-one-core.html

[DOM2]: http://www.w3.org/TR/DOM-Level-2-Core/

[DOM3]: http://www.w3.org/TR/DOM-Level-3-Core/

In particular:

FoX_wxml knows about [XML10], [XML11], [Namespaces10], [Namespaces11], [CanonicalXML]
FoX_sax knows about [XML10], [XML11], [Namespaces10], [Namespaces11], [xml:id], [xml:base], [SAX2]
FoX_dom knows about [XML10], [XML11], [Namespaces10], [Namespaces11], [xml:id], [xml:base], [DOM1], [DOM2], [DOM3], [CanonicalXML]

For exceptions, please see the relevant parts of the FoX documentation.

FoX_common

FoX_common is a module exporting interfaces to a set of convenience functions common to all of the FoX modules, which are of more general use.

Currently, there are four publicly available functions and three subroutines:

The function str converts primitive datatypes into strings in a consistent fashion, conformant with the expectations of XML processors.

It is fully described in StringFormatting

The subroutine rts performs the reverse function, taking a string (obtained from an XML document) and converting it into a primitive Fortran datatype.
The function countrts examines a string and determines the size of array required to hold all its data, once converted to a primitive Fortran datatype.

It is fully described in StringConversion

The final four procedures change the way that errors and warnings are handled when encountered by any FoX module. Using these procedures it is possible to convert non-fatal warnings and fatal errors to calls to the internal abort routine. This generally has the effect of generating a stack trace or core dump of the program before termination. This is a global setting for all XML documents being manipulated. Two subroutines take a single logical argument to turn the feature on (true) or off (false) for warnings and errors respectively:

FoX_set_fatal_warnings for warnings
FoX_set_fatal_errors for errors

and two functions (without arguments) allow the state to be checked:

FoX_get_fatal_warnings for warnings
FoX_get_fatal_errors for errors

Both fatal warnings and errors are off by default. This corresponds to the previous behaviour.
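A short sketch of how these four procedures might be used together follows. It relies only on the names and argument conventions given above; the program name is hypothetical.

```fortran
program fatal_settings_demo
  use FoX_common, only: FoX_set_fatal_errors, FoX_get_fatal_errors, &
                        FoX_set_fatal_warnings, FoX_get_fatal_warnings
  implicit none

  ! Ask FoX to abort (with stack trace or core dump) on errors,
  ! while leaving warnings non-fatal.
  call FoX_set_fatal_errors(.true.)
  call FoX_set_fatal_warnings(.false.)

  ! The getter functions take no arguments and report the current state.
  print *, "errors fatal?   ", FoX_get_fatal_errors()
  print *, "warnings fatal? ", FoX_get_fatal_warnings()
end program fatal_settings_demo
```

Because the setting is global, it is probably best made once, near the start of the program, before any XML documents are opened.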

String handling in FoX

Many of the routines in wxml, and indeed in wcml which is built on top of wxml, are overloaded so that data may be passed to the same routine as string, integer, logical, real, or complex data.

In such cases, a few notes on the conversion of non-textual data to text are in order. The standard Fortran I/O formatting routines do not offer the control required for useful XML output, so FoX performs all its own formatting.

This formatting is done internally through a function which is also available publicly to the user, str.

To use this in your program, import it via:

use FoX_common, only: str

and use it like so:

 print*, str(data)

In addition, for ease of use, the // concatenation operator is overloaded, such that strings can easily be formed by concatenation of strings to other datatypes. To use this you must import it via:

 use FoX_common, only: operator(//)

and use it like so:

 integer :: data
 print*, "This is a number "//data

This will work for all native Fortran data types. Note, however, that the floating point formatting described below is not available with concatenation, only with str().

You may pass data of the following primitive types to str:

Scalar data

Character (default kind)

Character data is returned unchanged.

Logical (default kind)

Logical data is output such that True values are converted to the string 'true', and False to the string 'false'.

Integer (default kind)

Integer data is converted to the standard decimal representation.

Real numbers (single and double precision)

Real numbers, both single and double precision, are converted to strings with some control offered to the user. The output will conform to the real number formats specified by XML Schema Datatypes.

This may be done in one of two ways:

Exponential notation, with variable number of significant figures. Format strings of the form "sn" are accepted, where n is the number of significant figures.

Thus the number 111, when output with various formats, will produce the following output:

s1	1e2
s2	1.1e2
s3	1.11e2
s4	1.110e2

The number of significant figures should lie between 1 and the number of digits precision provided by the real kind. If a larger or smaller number is specified, output will be truncated accordingly. If unspecified, then a sensible default will be chosen.

This format is not permitted by XML Schema Datatypes 1.0, though it is in 2.0

Non-exponential notation, with variable number of digits after the decimal point. Format strings of the form "rn", where n is the number of digits after the decimal point.

Thus the number 3.14159, when output with various formats, will produce the following output:

r0	3
r1	3.1
r2	3.14
r3	3.142

The number of decimal places must lie between 0 and whatever would output the maximum digits precision for that real kind. If a larger or smaller number is specified, output will be truncated accordingly. If unspecified, then a sensible default will be chosen.

This format is the only one permitted by XML Schema Datatypes 1.0

If no format is specified, then a default of exponential notation will be used.

If a format is specified not conforming to either of the two forms above, a run-time error will be generated.
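The scalar conversions above can be tried out with a small program such as the following sketch. It assumes that the format string is passed to str as the second argument; the values are the ones used in the tables above, so the output should correspond to the s2 and r2 rows.

```fortran
program str_format_demo
  use FoX_common, only: str
  implicit none

  ! Exponential notation with two significant figures (the s2 row above).
  print *, str(111.0, "s2")

  ! Non-exponential notation with two decimal places (the r2 row above).
  print *, str(3.14159, "r2")

  ! No format given: the default (exponential notation) is used.
  print *, str(3.14159)

  ! Logicals and integers take no format string.
  print *, str(.true.)
  print *, str(42)
end program str_format_demo
```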

NB Since by using FoX or str, you are passing real numbers through various functions, this means that they must be valid real numbers. A corollary of this is that if you pass in +/-Infinity, or NaN, then the behaviour of FoX is unpredictable, and may well result in a crash. This is a consequence of the Fortran standard, which strictly disallows doing anything at all with such numbers, including even just passing them to a subroutine.

Complex numbers (single and double precision)

Complex numbers will be output as pairs of real numbers, in the following way:

(1.0e0)+i(1.0e0)

where the two halves can be formatted in the way described for 'Real numbers' above; only one format may be specified, and it will apply to both.

All the caveats described above apply for complex numbers as well; that is, output of complex numbers either of whose components are infinite or NaN is illegal in Fortran, and more than likely will cause a crash in FoX.

Arrays and matrices

All of the above types of data may be passed in as arrays and matrices as well. In this case, a string containing all the individual elements will be returned, ordered as they would be in memory, each element separated by a single space.

If the data is character data, then there is an additional option to str, delimiter, which may be any single-character string, and will replace the space as the delimiter.
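As an illustrative sketch of array conversion, the program below passes an integer array and a character array to str. It assumes that delimiter is accepted as a keyword argument, as the description above suggests; the array contents are arbitrary.

```fortran
program str_array_demo
  use FoX_common, only: str
  implicit none
  integer :: counts(3)
  character(len=1) :: labels(3)

  counts = (/ 1, 2, 3 /)
  labels = (/ "x", "y", "z" /)

  ! Elements are joined in memory order, separated by single spaces.
  print *, str(counts)

  ! For character arrays only, the separating character can be changed.
  print *, str(labels, delimiter=",")
end program str_array_demo
```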

wxml/wcml wrappers.

All functions in wxml which can accept arbitrary data (roughly, wherever you put anything that is not an XML name; attribute values, pseudo-attribute values, character data) will take scalars, arrays, and matrices of any of the above data types, with fmt= and delimiter= optional arguments where appropriate.

wcml functions which can accept varied data behave in the same way.

String conversion

Two procedures are provided to simplify reading data retrieved from XML documents into Fortran variables. The subroutine rts performs the data conversion step and the function countrts can be used to allocate an array of the correct size for the incoming data.

`rts` subroutine

The rts subroutine can be imported from FoX_common. In its simplest form, it is called in this fashion:

call rts(string, data)

string is a simple Fortran string (probably retrieved from an XML file.)

data is any native Fortran datatype: logical, character, integer, real, double precision, complex, double complex, and may be a scalar, 1D or 2D array.

rts will attempt to parse the contents of string into the appropriate datatype, and return the value in data.

Additional information or error handling is accomplished with the following optional arguments:

`num`

num is an integer; on return from the subroutine it indicates the number of data items read before either:

an error occurred
the string was exhausted of data items
data was filled.

`iostat`

iostat is an integer, which on return from the subroutine has the values:

0 for no problems
-1 if too few elements were found in string to fill up data
1 if data was filled, but there were still data items left in string
2 if the characters found in string could not be converted to the appropriate type for data.

NB if iostat is not specified, and a non-zero value is returned, then the program will stop with an error message.
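The following sketch shows one way the num and iostat arguments might be used to handle each documented outcome. The input string and the array size are arbitrary choices for illustration.

```fortran
program rts_demo
  use FoX_common, only: rts
  implicit none
  real :: values(3)
  integer :: n, ios

  ! Try to read three reals from a whitespace-separated string.
  call rts("1.0 2.5 4.0e1", values, num=n, iostat=ios)

  ! Passing iostat prevents rts from stopping the program on failure,
  ! so every documented return value is handled here.
  select case (ios)
  case (0)
    print *, "read all", n, "values:", values
  case (-1)
    print *, "string ran out after", n, "values"
  case (1)
    print *, "array filled; unread data remains in the string"
  case (2)
    print *, "conversion failed after", n, "values"
  end select
end program rts_demo
```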

String formatting

When string is expected to be an array of strings, the following options are used to break string into its constituent elements:

By default it is assumed that the elements are separated by whitespace, and that multiple whitespace characters are not significant. No zero-length elements are possible, nor are elements containing whitespace.

An optional argument, separator, may be specified, which is a single character. In this case, each element consists of all characters between subsequent occurrences of the separator. Zero-length elements are possible, but no escaping mechanism is possible.

Alternatively, an optional logical argument csv may be specified. In this case, the value of separator is ignored, and the string is parsed as a Comma-Separated-Value string, according to RFC 4180.
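A brief sketch of splitting a string into character elements with an explicit separator is given below. The input string, the element length of 8 and the array size are assumptions chosen for illustration; note the empty third element, which the separator form permits.

```fortran
program rts_separator_demo
  use FoX_common, only: rts
  implicit none
  character(len=8) :: words(4)
  integer :: n, ios, i

  ! Split on commas rather than whitespace; the third element is empty.
  call rts("alpha,beta,,delta", words, separator=",", num=n, iostat=ios)

  if (ios == 0) then
    do i = 1, n
      print *, i, "[" // trim(words(i)) // "]"
    end do
  else
    print *, "rts returned iostat =", ios, "after", n, "elements"
  end if
end program rts_separator_demo
```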

Numerical formatting.

Numbers are expected to be formatted according to the usual conventions for Fortran input.

Complex number formatting.

Complex numbers may be formatted according to either normal Fortran conventions (comma-separated pairs) or CMLComp conventions.

Logical variable formatting.

Logical variables must be encoded according to the conventions of XML Schema Datatypes - that is, True may be written as "true" or "1", and False may be written as "false" or "0".

`countrts` function

The countrts function can also be imported from FoX_common. In its simplest form, it is called in this fashion:

countrts(string, datatype)

string is a simple Fortran string (probably retrieved from an XML file)

datatype is a scalar argument of any native Fortran datatype (logical, character, integer, real, double precision, complex or double complex).

The function returns a default integer equal to the number of elements that rts would return if called with a sufficiently large array of the same type as datatype. countrts returns 0 to indicate that characters were found in the string that could not be converted. If datatype is a character, the optional arguments separator and csv are available as described in "String formatting" above. The countrts function is pure and can be used as a specification function.
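The two procedures are intended to be used together. The sketch below counts the items first, allocates an array of that size, and only then converts; the input string is an arbitrary example, and the literal 0 merely tells countrts that integers are wanted.

```fortran
program countrts_demo
  use FoX_common, only: rts, countrts
  implicit none
  character(len=*), parameter :: text = "3 1 4 1 5 9"
  integer, allocatable :: values(:)
  integer :: n

  ! The second argument is only inspected for its type.
  n = countrts(text, 0)

  if (n == 0) then
    ! Per the description above, 0 signals unconvertible characters.
    print *, "no convertible integer data found"
  else
    allocate(values(n))
    call rts(text, values)
    print *, "read", n, "integers:", values
    deallocate(values)
  end if
end program countrts_demo
```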

WXML

wxml is a general Fortran XML output library. It offers a Fortran interface, in the form of a number of subroutines, to generate well-formed XML documents. Almost all of the XML features described in XML11 and Namespaces are available, and wxml will diagnose almost all attempts to produce an invalid document. Exceptions below describes where wxml falls short of these aims.

First, Conventions describes the conventions used in this document.

Then, Functions lists all of wxml's publicly exported functions, in three sections:

Firstly, the very few functions necessary to create the simplest XML document, containing only elements, attributes, and text.
Secondly, those functions concerned with XML Namespaces, and how Namespaces affect the behaviour of the first tranche of functions.
Thirdly, a set of more rarely used functions required to access some of the more esoteric corners of the XML specification.

Please note that where the documentation below is not clear, it may be useful to look at some of the example files. There is a very simple example in the examples/ subdirectory, which nevertheless shows the use of most of the features you will use.

A more elaborate example, using almost all of the XML features found here, is available in the top-level directory as wxml_example.f90. It will be automatically compiled as part of the build process.

Conventions and notes:

Conventions used below.

Function names are in monospace
argument names are in bold
optional argument names are in (parenthesized bold)
argument types are in italic and may consist of:
string: string of arbitrary (unless otherwise specified) length
integer: default integer
real(sp): single precision real number
real(dp): double precision real number
logical: default logical
real: either of real(sp) or real(dp)
anytype: any of logical, integer, real(sp), real(dp), string

Note that where strings are passed in, they will be passed through entirely unchanged to the output file - no truncation of whitespace will occur.

It is strongly recommended that the functions be used with keyword arguments rather than relying on implicit ordering.

Derived type: `xmlf_t`

This is an opaque type representing the XML file handle. Each function requires this as an argument, so it knows which file to operate on. (And it is an output of the xml_OpenFile subroutine) Since all subroutines require it, it is not mentioned below.

Function listing

Frequently used functions

xml_OpenFile
filename: string: Filename to be opened
xf: xmlf_t: XML File handle
(channel): integer: What Fortran file handle should the XML file be attached to? default: picked by the library at runtime
(pretty_print): logical: Should the XML output be formatted to look pretty? (This implies that whitespace is not significant) default: false
(replace): logical: Should the file be replaced if it already exists? default: no, stop at runtime if file already exists
(addDecl): logical: Should an XML declaration be added at the start of the file? default: yes
(namespace): logical: Should wxml prevent the output of namespace-ill-formed documents? default: yes
(validate): logical: Should wxml carry out any checks on the optional VC constraints specified by XML? default: no
(warning): logical: Should wxml emit warnings when it is unable to guarantee well-formedness? default: no

Open a file for writing XML

By default, the XML will have no extraneous text nodes. This can have the effect of it looking slightly ugly, since there will be no newlines inserted between tags.

This behaviour can be changed to produce slightly nicer looking XML, by switching on pretty_print. This will insert newlines and spaces between some tags where they are unlikely to carry semantics. Note, though, that this does result in the XML produced being not quite what was asked for, since extra characters and text nodes have been inserted.

NB: The replace option should be noted. By default, xml_OpenFile will fail with a runtime error if you try to write to an existing file. If you are sure you want to continue in such a case, then you can specify replace=.true. and any existing files will be overwritten. If finer granularity is required over how to proceed in such cases, use the Fortran inquire statement in your code. There is no 'append' functionality by design - any XML file created by appending to an existing file would be invalid.
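As a sketch of the finer-grained approach, the program below uses inquire to check for an existing file before deciding how to proceed. The file name, the root element name and the overwrite policy flag are all hypothetical.

```fortran
program open_demo
  use FoX_wxml
  implicit none
  character(len=*), parameter :: fname = "results.xml"   ! assumed file name
  logical, parameter :: overwrite = .false.              ! assumed policy flag
  type(xmlf_t) :: xf
  logical :: exists

  ! Check for an existing file ourselves, rather than letting
  ! xml_OpenFile stop with a runtime error.
  inquire(file=fname, exist=exists)
  if (exists .and. .not. overwrite) then
    print *, "refusing to overwrite existing file ", fname
    stop
  end if

  call xml_OpenFile(filename=fname, xf=xf, replace=overwrite)
  call xml_NewElement(xf, name="results")
  ! xml_Close closes any still-open tags before closing the file.
  call xml_Close(xf)
end program open_demo
```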

xml_Close
xf: xmlf_t: XML File handle
(empty): logical: Can the file be empty? default: .false.

Close an opened XML file, closing all still-opened tags so that it is well-formed.

In the normal run of events, trying to close an XML file with no root element will cause an error, since this is not well-formed. However, an optional argument, empty, is provided in case it is desirable to close files which may be empty. In this case, a warning will still be emitted, but no fatal error generated.

xml_NewElement
name: string: Name of tag (for namespaced output, you need to include the prefix)

Open a new element tag

xml_EndElement
name: string: Name of tag to be closed (if it doesn't match currently open tag, you'll get an error)

Close an open tag

xml_AddAttribute
name: string: Name of attribute
value: anytype: Value of attribute
(escape): logical: if the attribute value is a string, should the attribute value be escaped? default: true
(type): string: the type of the attribute. This must be one of CDATA, ID, IDREF, IDREFS, NMTOKEN, NMTOKENS, ENTITY, ENTITIES, or NOTATION (always upper case). If specified, this must match any attribute declarations that have been previously declared in the DTD. If unspecified this (as the XML standard requires) defaults to CDATA.

Add an attribute to the currently open tag.

By default, if the attribute value contains markup characters, they will be escaped automatically by wxml before output.

However, in rare cases you may not wish this to happen - if you wish to output Unicode characters, or entity references. In this case, you should set escape=.false. for the relevant subroutine call. Note that if you do this, no checking on the validity of the output string is performed; the onus is on you to ensure well-formedness.

The value to be added may be of any type; it will be converted to text according to FoX's formatting rules, and if it is a 1- or 2-dimensional array, the elements will all be output, separated by spaces (except if it is a character array, in which case the delimiter may be changed to any other single character using an optional argument).

NB The type option is only provided so that in the case of an external DTD which FoX is unaware of, the attribute type can be specified (which gives FoX more information to ensure well-formedness and validity). Specifying the type incorrectly may result in spurious error messages.

xml_AddCharacters
chars: anytype: The text to be output
(parsed): logical: Should the output characters be parsed (ie should the library replace '&' with '&amp;' etc?) or unparsed (in which case the characters will be surrounded by CDATA tags)? default: yes
(delimiter): character(1): If data is a character array, what should the delimiter between elements be on output? default: a single space
(ws_significant): logical: Is any whitespace in the string significant? default: unknown

Add text data. The data to be added may be of any type; they will be converted to text according to FoX's formatting rules, and if they are a 1- or 2-dimensional array, the elements will all be output, separated by spaces (except if it is a character array, in which case the delimiter may be changed to any other single character using an optional argument).
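Putting the frequently used functions together, a complete document can be written with a program like the following sketch. The file name, element names, attribute name and values are invented for illustration; keyword arguments are used throughout, as recommended above.

```fortran
program wxml_minimal
  use FoX_wxml
  implicit none
  type(xmlf_t) :: xf

  call xml_OpenFile(filename="minimal.xml", xf=xf, &
                    pretty_print=.true., replace=.true.)

  ! Attributes must be added while the start tag is still open,
  ! i.e. before any child element or text is written.
  call xml_NewElement(xf, name="run")
  call xml_AddAttribute(xf, name="steps", value=10)

  call xml_NewElement(xf, name="energy")
  ! Non-character data is converted using FoX's formatting rules;
  ! fmt selects three decimal places here.
  call xml_AddCharacters(xf, chars=-1.5d0, fmt="r3")
  call xml_EndElement(xf, name="energy")

  call xml_EndElement(xf, name="run")
  call xml_Close(xf)
end program wxml_minimal
```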

xml_AddNewline

Within the context of character output, add a (system-dependent) newline character. This function can only be called wherever xml_AddCharacters can be called. (Newlines outside of character context are under FoX's control, and cannot be manipulated by the user.)

Namespace-aware functions:

xml_DeclareNamespace
nsURI: string: The URI of the namespace
(prefix): string: The namespace prefix to be used in the document. If absent, then the default namespace is affected.

Add an XML Namespace declaration. This function may be called at any time, and its precise effect depends on when it is called; see below.

xml_UndeclareNamespace
(prefix): string: The namespace prefix to be used in the document. If absent, then the default namespace is affected.

Undeclare an XML namespace. This is equivalent to declaring a namespace with an empty URI, and renders the namespace ineffective for the scope of the declaration. For explanation of its scope, see below.

NB Use of xml_UndeclareNamespace implies that the resultant document will be compliant with XML Namespaces 1.1, but not 1.0; wxml will issue an error when trying to undeclare namespaces under XML 1.0.

Scope of namespace functions

If xml_[Un]declareNamespace is called immediately prior to an xml_NewElement call, then the namespace will be declared in that next element, and will therefore take effect in all child elements.

If it is called prior to an xml_NewElement call, but that element has namespaced attributes

To explain by means of example: In order to generate the following XML output:

 <cml:cml xmlns:cml="http://www.xml-cml.org/schema"/>

then the following two calls are necessary, in the prescribed order:

  call xml_DeclareNamespace(xf, nsURI='http://www.xml-cml.org/schema', prefix='cml')
  call xml_NewElement(xf, name='cml:cml')

However, where the namespace applies to an attribute on the same element, the ordering is less strict: as long as the xml_DeclareNamespace call is made before the element tag is closed (either by xml_EndElement, or by a new element tag being opened, or some text being added, etc.), the correct XML will be generated.

Two previously mentioned functions are affected when used in a namespace-aware fashion.

xml_NewElement, xml_AddAttribute

The element or attribute name is checked, and if it is a QName (ie if it is of the form prefix:tagName) then wxml will check that prefix is a registered namespace prefix, and generate an error if not.
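A complete namespace-aware sketch follows. The namespace URI and the cml:cml element are taken from the example above, while the file name and the cml:convention attribute with its value are hypothetical.

```fortran
program namespace_demo
  use FoX_wxml
  implicit none
  type(xmlf_t) :: xf

  call xml_OpenFile(filename="ns.xml", xf=xf, replace=.true.)

  ! Declare the prefix first, so that it is registered by the time
  ! the QName cml:cml is checked.
  call xml_DeclareNamespace(xf, nsURI="http://www.xml-cml.org/schema", &
                            prefix="cml")
  call xml_NewElement(xf, name="cml:cml")

  ! A prefixed attribute name is checked against registered prefixes too.
  call xml_AddAttribute(xf, name="cml:convention", value="example")

  call xml_EndElement(xf, name="cml:cml")
  call xml_Close(xf)
end program namespace_demo
```

Using a prefix that has not been declared, for instance name="foo:cml", would be expected to produce an error, per the description above.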

More rarely used functions:

If you don't know the purpose of any of these, then you don't need to.

xml_AddXMLDeclaration
(version) string: XML version to be used. default: 1.0
(encoding) string: character encoding of the document default: absent
(standalone) logical: is this document standalone? default: absent

Add an XML declaration to the first line of output. If used, then the file must have been opened with addDecl = .false., and this must be the first wxml call to the document.

NB The only XML versions available are 1.0 and 1.1. Attempting to specify anything else will result in an error. Specifying version 1.0 results in additional output checks to ensure the resultant document is XML-1.0-conformant.

NB Note that if the encoding is specified, and is specified to not be UTF-8, then if the specified encoding does not match that supported by the Fortran processor, you may end up with output you do not expect.
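A short sketch of the ordering rule described above: the file is opened with addDecl=.false., and xml_AddXMLDeclaration is then the first wxml call on the document. The filename and element name are placeholders, and xml_Close is assumed to be the usual routine for closing the file.

```fortran
program xml_declaration_sketch
  use FoX_wxml
  implicit none

  type(xmlf_t) :: xf

  ! Suppress the automatic declaration so that we can write our own.
  call xml_OpenFile('declaration.xml', xf, addDecl=.false.)

  ! This must be the first wxml call made on the document.
  call xml_AddXMLDeclaration(xf, version='1.1', encoding='UTF-8')

  call xml_NewElement(xf, 'root')
  call xml_EndElement(xf, 'root')
  call xml_Close(xf)
end program xml_declaration_sketch
```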

xml_AddDOCTYPE
name string: DOCTYPE name
(system) string: DOCTYPE SYSTEM ID
(public) string: DOCTYPE PUBLIC ID

Add an XML document type declaration. If used, this must be used prior to first xml_NewElement call, and only one such call must be made.

xml_AddInternalEntity
name string: name of internal entity
value string: value of internal entity

Define an internal entity for the document. If used, this call must be made after xml_AddDOCTYPE and before the first xml_NewElement call.
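The ordering constraints for the DOCTYPE and entity calls might be sketched as follows. The DOCTYPE name, entity name and entity value are invented for illustration, and xml_AddCharacters and xml_Close are assumed to behave as described elsewhere in this manual.

```fortran
program doctype_sketch
  use FoX_wxml
  implicit none

  type(xmlf_t) :: xf

  call xml_OpenFile('doctype.xml', xf)

  ! The DOCTYPE must come before the first element, and only once.
  call xml_AddDOCTYPE(xf, 'greeting')

  ! Entity definitions go after the DOCTYPE and before the first element.
  call xml_AddInternalEntity(xf, 'who', 'world')

  call xml_NewElement(xf, 'greeting')
  call xml_AddCharacters(xf, 'Hello')
  call xml_EndElement(xf, 'greeting')
  call xml_Close(xf)
end program doctype_sketch
```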

xml_AddExternalEntity
name string: name of external entity
system string: SYSTEM ID of external entity
(public) string: PUBLIC ID of external entity default: absent
(notation) string: notation for external entity default: absent

Define an external entity for the document. If used, this call must be made after xml_AddDOCTYPE and before the first xml_NewElement call.

xml_AddParameterEntity
name string: name of parameter entity
(PEdef) string: definition of parameter entity default: absent
(system) string: SYSTEM ID of parameter entity default: absent
(public) string: PUBLIC ID of parameter entity default: absent

Define a parameter entity for the document. If used, this call must be made after xml_AddDOCTYPE and before the first xml_NewElement call.

xml_AddNotation
name string: name of notation
(system) string: SYSTEM ID of notation default: absent
(public) string: PUBLIC ID of notation default: absent

Define a notation for the document. If used, this call must be made after xml_AddDOCTYPE and before the first xml_NewElement call.

xml_AddElementToDTD
name string: name of element
declaration string: declaration of element

Add an ELEMENT declaration to the DTD. The syntax of the declaration is not checked in any way, nor does this affect how elements may be added in the content of the XML document.

If used, this call must be made after xml_AddDOCTYPE and before the first xml_NewElement call.

xml_AddAttlistToDTD
name string: name of element
declaration string: declaration of element

Add an ATTLIST declaration to the DTD. The syntax of the declaration is not checked in any way, nor does this affect how attributes may be added in the content of the XML document.

If used, this call must be made after xml_AddDOCTYPE and before the first xml_NewElement call.

xml_AddPEreferenceToDTD
name string: name of PEreference

Add a reference to a Parameter Entity in the DTD. No check is made according to whether the PE exists, has been declared, or may legally be used.

If used, this call must be made after xml_AddDOCTYPE and before the first xml_NewElement call.

xml_AddXMLStylesheet
href: string: address of stylesheet
type: string: type of stylesheet (generally "text/xsl")
(title): string: title of stylesheet default: none
(media): string: output media type default: none
(charset): string: charset of media type default: none
(alternate): string: alternate default: none

Add XML stylesheet processing instruction, as described in [Stylesheets]. If used, this call must be made before the first xml_NewElement call.

xml_AddXMLPI
name: string: name of PI
(data): string: data for PI
(xml): logical: (see below) default: false
(ws_significant): logical: if this is a PI containing only data, then is any whitespace in the data significant? default: unknown

Add an XML Processing Instruction.

If data is present, nothing further can be added to the PI. If it is not present, then pseudoattributes may be added using the call below. Normally, the name is checked to ensure that it is XML-compliant. This requires that PI targets not start with [Xx][Mm][Ll], because such names are reserved. However, some such targets are defined by later W3C specifications. If you wish to use such PI targets, then set xml=.true. when outputting them.

The output PI will look like: <?name data?>

xml_AddPseudoAttribute
name: string: Name of pseudoattribute
value: anytype: Value of pseudoattribute
(ws_significant): logical: If there is any whitespace in the value of this pseudoattribute, is it significant?

Add a pseudoattribute to the currently open PI.
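A minimal sketch of a PI built up from pseudoattributes, assuming the PI is opened without data so that it remains open for the following calls. The target name and pseudoattribute names are hypothetical, and xml_Close is assumed to be the usual closing routine.

```fortran
program pi_sketch
  use FoX_wxml
  implicit none

  type(xmlf_t) :: xf

  call xml_OpenFile('pi.xml', xf)

  ! No data argument is given, so pseudoattributes may still be added.
  call xml_AddXMLPI(xf, 'render')
  call xml_AddPseudoAttribute(xf, 'mode', 'draft')
  call xml_AddPseudoAttribute(xf, 'copies', 2)

  ! Opening the root element ends the PI.
  call xml_NewElement(xf, 'root')
  call xml_EndElement(xf, 'root')
  call xml_Close(xf)
end program pi_sketch
```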

xml_AddComment
comment: string Contents of comment
(ws_significant): logical: is any whitespace in the comment string significant? default: unknown

Add an XML comment.

xml_AddEntityReference
entityref: Entity reference.

This may be used anywhere that xml_AddCharacters may be, and will insert an entity reference into the contents of the XML document at that point. Note that if the entity inserted is a character entity, its validity will be checked according to the rules of XML-1.1, not 1.0.

If the entity reference is not a character entity, then no check is made of its validity, and a warning will be issued.

Functions to query XML file objects

These functions may be of use in building wrapper libraries:

xmlf_Name result(string)

Return the filename of an open XML file

xmlf_OpenTag result(string)

Return the currently open tag of the current XML file (or the empty string if none is open)

xmlf_GetPretty_print result(logical)

Return the current value of pretty_print.

xmlf_SetPretty_print NewValue: logical

Set the current value of pretty_print to the NewValue. This may be useful in a mixed namespace document where pretty printing the output may change the meaning under one of the namespaces.

Exceptions

Below are explained areas where wxml fails to implement the whole of XML 1.0/1.1. These are divided into two lists; where wxml does not permit the generation of a particular well-formed XML document, and where it does permit the generation of a particular non-well-formed document.

Ways in which wxml renders it impossible to produce a certain sort of well-formed XML document:

Unicode support is limited. Due to the limitations of Fortran, wxml is unable to manipulate characters outwith 7-bit US-ASCII. wxml will ensure that characters corresponding to those in 7-bit ASCII are output correctly within the constraints of the version of XML in use, for a UTF-8 encoding. Attempts to directly output any other characters will have undefined effects. Output of other Unicode characters is possible through the use of character entities.
Due to the constraints of the Fortran IO specification, it is impossible to output arbitrarily long strings without carriage returns. The limit varies between processors, but may be as low as 1024 characters. To avoid overrunning this limit, wxml will by default insert carriage returns before every new element, and if an unbroken string of attribute or text data longer than 1024 characters is requested, then carriage returns will be inserted as appropriate (within whitespace if possible) to break it up into sections that fit within the limit.

wxml will try very hard to ensure that output is well-formed. However, it is possible to fool wxml into producing ill-formed XML documents. Avoid doing so if possible; for completeness these ways are listed here. In all cases where ill-formedness is a possibility, a warning can be issued. These warnings can be verbose, so are off by default, but if they are desired, they can be switched on by manipulating the warning argument to xml_OpenFile.

If you specify a non-default text encoding, and then run FoX on a platform which does not use this encoding, then the result will be nonsense, and more than likely ill-formed. FoX will issue a warning in this case.
When adding any text, if any characters are passed in (regardless of character set) which do not have equivalents within 7-bit ASCII, then the results are processor-dependent, and may result in an invalid document on output. A warning will be issued if this occurs. If you need a guarantee that such characters will be passed correctly, use character entities.
If any parameter entities are referenced, no checks are made that the document after parameter-entity-expansion is well-formed. A warning will be issued.

Validity constraints

Finally, note that constraints on XML documents are divided into two sets - well-formedness constraints (WFC) and validity constraints (VC). The above only applies to WFC checks. wxml can make some minimal checks on VCs, but this is by no means complete, nor is it intended to be. These checks are off by default, but may be switched on by manipulating the validate argument to xml_OpenFile.

WCML

WCML is a library for outputting CML data. It wraps all the necessary XML calls, such that you should never need to touch any WXML calls when outputting CML.

The CML output is conformant to version 2.4 of the CML schema.

The available functions and their intended use are listed below. Quite deliberately, no reference is made to the actual CML output by each function.

WCML is not intended to be a generalized Fortran CML output layer. Rather, it is intended to be a library which allows the output of a limited set of well-defined syntactical fragments.

Further information on these fragments, and on the style of CML generated here, is available at http://www.uszla.me.uk/specs/subset.html.

This section of the manual will detail the available CML output subroutines.

Use of WCML

wcml subroutines can be accessed from within a module or subroutine by inserting

 use FoX_wcml

at the start. This will import all of the subroutines described below, plus the derived type xmlf_t needed to manipulate a CML file.

No other entities will be imported; public/private Fortran namespaces are very carefully controlled within the library.

Dictionaries.

The use of dictionaries with WCML is strongly encouraged. (For those not conversant with dictionaries, a fairly detailed explanation is available at http://www.xml-cml.org/information/dictionaries)

In brief, dictionaries are used in two ways.

Identification

Firstly, to identify and disambiguate output data. Every output function below takes an optional argument, dictRef="". It is intended that every piece of data output is tagged with a dictionary reference, which will look something like nameOfCode:nameOfThing.

So, for example, in SIESTA, all the energies are output with different dictRefs, looking like: siesta:KohnShamEnergy, or siesta:kineticEnergy, etc. By doing this, we can ensure that later on all these numbers can be usefully identified.

We hope that ultimately, dictionaries can be written for codes, which will explain what some of these names might mean. However, it is not in any way necessary that this be done - and using dictRef attributes will help merely by giving the ability to disambiguate otherwise indistinguishable quantities.

We strongly recommend this course of action. If you choose to follow our recommendation, then you should add a suitable Namespace to your code. That is, immediately after cmlBeginFile and before cmlStartCml, you should add something like:

call cmlAddNamespace(xf, 'nameOfCode', 'WebPageOfCode')

Again, for SIESTA, we add:

call cmlAddNamespace(xf, 'siesta', 'http://www.uam.es/siesta')

If you don't have a webpage for your code, don't worry; the address is only used as an identifier, so anything that looks like a URL, and which nobody else is using, will suffice.
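Putting the pieces together, a minimal sketch of a program that declares a dictionary namespace and tags one output value with a dictRef might look like this. The prefix mycode, its URI, the filename and the property are all invented for illustration; the calls use only arguments documented in this section.

```fortran
program dictref_sketch
  use FoX_wcml
  implicit none

  type(xmlf_t) :: xf

  ! unit=-1 lets wcml choose a unit number.
  call cmlBeginFile(xf, filename='output.cml', unit=-1)

  ! The namespace must be added straight after cmlBeginFile.
  ! 'mycode' and its URI are hypothetical identifiers.
  call cmlAddNamespace(xf, 'mycode', 'http://example.org/mycode')
  call cmlStartCml(xf)

  ! Tag the value so that it can be identified unambiguously later on.
  call cmlAddProperty(xf=xf, title='Number of SCF cycles', value=12, &
                      units='units:countable', dictRef='mycode:scfCycles')

  call cmlEndCml(xf)
  call cmlFinishFile(xf)
end program dictref_sketch
```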

Quantification

Secondly, we use dictionaries for units. This is compulsory (unlike dictRefs above). Any numerical quantity that is output through cmlAddProperty or cmlAddParameter is required to carry units. These are added with the units="" argument to the function. In addition, every other function below which takes numerical arguments also takes optional units, although defaults will be used if no units are supplied.

Further details are supplied in section Units below.

General naming conventions for functions.

Functions are named in the following way:

All functions begin cml
To begin and end a section of the CML file, a pair of functions will exist:
- cmlStartsomething
- cmlEndsomething
To output a given quantity/property/concept etc. a function will exist cmlAddsomething

Conventions used below.

Function names are in monospace
argument names are in bold
optional argument names are in (parenthesized bold)
argument types are in italic and may consist of:
string: string of arbitrary (unless otherwise specified) length
integer: default integer
real(sp): single precision real number
real(dp): double precision real number
logical: default logical
real: either of real(sp) or real(dp)
anytype: any of logical, integer, real(sp), real(dp), string

Note that where strings are passed in, they will be passed through entirely unchanged to the output file - no truncation of whitespace will occur.

Also note that wherever a real number can be passed in (including through anytype) then the formatting can be specified using the conventions described in StringFormatting

scalar: single item
array: one-dimensional array of items
matrix: two-dimensional array of items
anydim: any of scalar, array, matrix

Where an array is passed in, it may be passed either as an assumed-shape array; that is, as an F90-style array with no necessity for specifying bounds; thusly:

integer :: array(50)
call cmlAddProperty(xf, 'coords', array)

or as an assumed-size array; that is, an F77-style array, in which case the length must be passed as an additional parameter:

integer :: array(*)
call cmlAddProperty(xf, 'coords', array, nitems=50)

Similarly, when a matrix is passed in, it may be passed in both fashions:

integer :: matrix(50, 50)
call cmlAddProperty(xf, 'coords', matrix)

integer :: matrix(3, *)
call cmlAddProperty(xf, 'coords', matrix, nrows=3, ncols=50)

All functions take as their first argument an XML file object, whose keyword is always xf. This file object is initialized by a cmlBeginFile function.

It is highly recommended that subroutines be called with keywords specified rather than relying on the implicit ordering of arguments. This is robust against changes in the library calling convention, and also sidesteps a significant cause of errors when using subroutines with large numbers of arguments.

Units

Note below that the functions cmlAddParameter and cmlAddProperty both require that units be specified for any numerical quantities output.

If you are trying to output a quantity that is genuinely dimensionless, then you should specify units="units:dimensionless"; or if you are trying to output a countable quantity (eg number of CPUs) then you may specify units="units:countable".

For other properties, all units should be specified as namespaced quantities. If you are using a very few common units, it may be easiest to borrow definitions from the provided dictionaries:

(These links do not resolve yet.)

cmlUnits: http://www.xml-cml.org/units/units
siUnits: http://www.xml-cml.org/units/siUnits
atomicUnits: http://www.xml-cml.org/units/atomic

A default units dictionary, containing only the very basic units that wcml needs to know about, is also provided. It has the namespace http://www.uszla.me.uk/FoX/units, and wcml assigns it automatically to the prefix units.

This is added automatically, so attempts to add it manually will fail.

The contents of all of these dictionaries, plus the wcml dictionary, may be viewed at: http://www.uszla.me.uk/unitsviz/units.cgi.

Otherwise, you should feel at liberty to construct your own namespace; declare it using cmlAddNamespace, and markup all your units as:

 units="myNamespace:myunit"
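The three cases discussed in this section (dimensionless, countable, and user-defined units) might be sketched as follows. The namespace myUnits, its URI and the unit name furlong are hypothetical, as are the quantities being output.

```fortran
program units_sketch
  use FoX_wcml
  implicit none

  type(xmlf_t) :: xf

  call cmlBeginFile(xf, filename='units.cml', unit=-1)

  ! A user-defined units namespace (prefix and URI are invented).
  call cmlAddNamespace(xf, 'myUnits', 'http://example.org/myUnits')
  call cmlStartCml(xf)

  ! Genuinely dimensionless quantity.
  call cmlAddParameter(xf=xf, name='Damping factor', value=0.25d0, &
                       units='units:dimensionless')

  ! Countable quantity.
  call cmlAddParameter(xf=xf, name='Number of CPUs', value=8, &
                       units='units:countable')

  ! Quantity carrying a unit from the user-defined namespace.
  call cmlAddProperty(xf=xf, title='Track length', value=3.5d0, &
                      units='myUnits:furlong')

  call cmlEndCml(xf)
  call cmlFinishFile(xf)
end program units_sketch
```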

Functions for manipulating the CML file:

cmlBeginFile
filename: string scalar: Filename to be opened.
unit: integer scalar: what unit number should the file be opened on? If you don't care, you may specify -1 as the unit number, in which case wcml will make a guess
(replace): logical scalar: should the file be replaced if it already exists? default: yes

This takes care of all calls to open a CML output file.

cmlFinishFile

This takes care of all calls to close an open CML output file, once you have finished with it. It is compulsory to call this: if your program finishes without calling it, then your CML file will be invalid.

cmlAddNamespace
prefix string scalar: prefix to be used
nsURI string scalar: namespace URI to be used

This adds a namespace to a CML file.
NB This may only ever be called immediately after a cmlBeginFile call, before any output has been performed. Attempts to do otherwise will result in a runtime error.

This will be needed if you are adding dictionary references to your output. Thus for siesta, we do:

call cmlAddNamespace(xf, 'siesta', 'http://www.uam.es/siesta')

and then output all our properties and parameters with dictRef="siesta:something".

cmlStartCml
(fileId) string scalar: name of originating file. (default: current filename)
(version) string scalar: version of CML in use. (default: 2.4)
cmlEndCml

This pair of functions begins and ends the CML output to an existing CML file. They take care of namespaces.

Note that unless specified otherwise, there will be a convention attribute added to the cml tag specifying FoX_wcml-2.0 as the convention. (see http://www.uszla.me.uk/FoX for details)

Start/End sections

cmlStartMetadataList
(name) string scalar: name for the metadata list
(role) string scalar: role which the element plays
cmlEndMetadataList

This pair of functions open & close a metadataList, which is a wrapper for metadata items.

cmlStartParameterList
(ref) string scalar: Reference an id attribute of another element (generally deprecated)
(role) string scalar: role which the element plays
cmlEndParameterList

This pair of functions open & close a parameterList, which is a wrapper for input parameters.

cmlStartPropertyList
(ref) string scalar: Reference an id attribute of another element (generally deprecated)
(role) string scalar: role which the element plays
cmlEndPropertyList

This pair of functions open & close a propertyList, which is a wrapper for output properties.

cmlStartKpointList
cmlEndKpointList

Start/end a list of k-points (added using cmlAddKpoint below)

cmlStartModule
(serial) string scalar: serial id for the module
(role) string scalar: role which the element plays

Note that in most cases where you might want to use a serial number, you should probably be using the cmlStartStep subroutine below.

cmlEndModule

This pair of functions open & close a module of a computation which is unordered, or loosely-ordered. For example, METADISE uses one module for each surface examined.

cmlStartStep
(index) integer scalar: index number for the step. In the absence of an index, steps will be assumed to be consecutively numbered. Specifying this is useful if you wish to output eg every hundredth step.
(type) string scalar: what sort of step is this? This should be a namespaced string, for example: siesta:CG is a Conjugate Gradient step in siesta.
cmlEndStep

This pair of functions open and close a module of a computation which is strongly ordered. For example, DLPOLY uses steps for each step of the simulation.
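As a sketch of how steps might be used when only every hundredth iteration is written out, the loop below passes an explicit index to cmlStartStep. The filename, loop bounds and the property written inside each step are placeholders.

```fortran
program steps_sketch
  use FoX_wcml
  implicit none

  type(xmlf_t) :: xf
  integer :: i

  call cmlBeginFile(xf, filename='steps.cml', unit=-1)
  call cmlStartCml(xf)

  ! Output every hundredth step, labelling each with its true index.
  do i = 100, 300, 100
    call cmlStartStep(xf, index=i)
    call cmlAddProperty(xf=xf, title='Iterations completed', value=i, &
                        units='units:countable')
    call cmlEndStep(xf)
  end do

  call cmlEndCml(xf)
  call cmlFinishFile(xf)
end program steps_sketch
```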

Adding items.

cmlAddMetadata
name: string scalar: Identifying string for metadata
content: character scalar: Content of metadata

This adds a single item of metadata. Metadata vocabulary is completely uncontrolled within WCML. This means that metadata values may only be strings of characters. If you need your values to contain numbers, then you need to define the representation yourself, and construct your own strings.

cmlAddParameter
name: string scalar: Identifying title for parameter
value:anytype anydim: value of parameter
units: string scalar: units of parameter value (optional for logical/character values, compulsory otherwise; see note above)
(constraint) string scalar: Constraint under which the parameter is set (this can be an arbitrary string)
(ref) string scalar: Reference an id attribute of another element (generally deprecated)
(role) string scalar: role which the element plays

This function adds a tag representing an input parameter

cmlAddProperty
title: string scalar
value: anytype anydim
units: string scalar: units of property value (optional for logical/character values, compulsory otherwise; see note above)
(ref) string scalar: Reference an id attribute of another element (generally deprecated)
(role) string scalar: role which the element plays

This function adds a tag representing an output property

Adding geometry information

cmlAddMolecule
coords: real: a 3xn matrix of real numbers representing atomic coordinates (either fractional or Cartesian). These must be specified in Angstrom or fractional units (see style below.)
OR
x, y, z: real: 3 one-dimensional arrays containing the x, y, and z coordinates of the atoms in the molecule. These must be specified in Angstrom or fractional units (see style below.)
elements: string array: a length-n array of length-2 strings containing IUPAC chemical symbols for the atoms
(natoms) integer scalar: number of atoms in molecule (default: picked up from length of coords array)
(occupancies): real array : a length-n array of the occupancies of each atom.
(atomRefs): string array: a length-n array of strings containing references which may point to IDs elsewhere of, for example, pseudopotentials or basis sets defining the element's behaviour.
(atomIds): string array: a length-n array of strings containing IDs for the atoms.
(style): string scalar: cartesian - the coordinates are Cartesian, or fractional - the coordinates are fractional. The default is Cartesian.
(ref) string scalar: Reference an id attribute of another element (generally deprecated)
(formula) string scalar: An IUPAC chemical formula
(chirality) string scalar: The chirality of the molecule. No defined vocabulary.
(role) string scalar: Role of molecule. No defined vocabulary.
(bondAtom1Refs) string array: Length-m array of references to atomIds at one "end" of a list of bonds.
(bondAtom2Refs) string array: Length-m array of references to atomIds at another "end" of a list of bonds.
(bondOrders) string array: Length-m array of bond orders. See below.
(bondIds) string array: Length-m array of strings containing IDs for bonds.
(nobondcheck) logical scalar: Enable (.true., the default) or disable (.false.) bond validation.

Outputs an atomic configuration. Bonds may be added using the optional arguments bondAtom1Refs, bondAtom2Refs and bondOrders. All these arrays must be the same length and all must be present if bonds are to be added. Optionally, bondIds can be used to add Ids to the bond elements. Some validity constraints are imposed (atomRefs in the bonds must be defined, bonds cannot be added twice). The meaning of the terms "molecule", "bond" and "bond order" is left loosely defined.
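A minimal sketch of the coords form of the call, writing a three-atom molecule with Cartesian coordinates in Angstrom. The geometry is an approximate water molecule chosen purely for illustration, and the kind parameter wp is a local name, not part of the library.

```fortran
program molecule_sketch
  use FoX_wcml
  implicit none

  ! Local double precision kind (a name chosen for this sketch only).
  integer, parameter :: wp = kind(1.0d0)

  type(xmlf_t) :: xf
  real(wp) :: coords(3, 3)
  character(len=2) :: elements(3)

  ! One column per atom: x, y, z in Angstrom (illustrative values).
  coords = reshape((/  0.000_wp, 0.000_wp, 0.000_wp, &
                       0.757_wp, 0.586_wp, 0.000_wp, &
                      -0.757_wp, 0.586_wp, 0.000_wp /), (/ 3, 3 /))
  elements = (/ 'O ', 'H ', 'H ' /)

  call cmlBeginFile(xf, filename='molecule.cml', unit=-1)
  call cmlStartCml(xf)

  ! natoms is omitted, so it is taken from the size of coords.
  call cmlAddMolecule(xf=xf, coords=coords, elements=elements)

  call cmlEndCml(xf)
  call cmlFinishFile(xf)
end program molecule_sketch
```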

cmlAddLattice
cell: real matrix a 3x3 matrix of the unit cell
(spaceType): string scalar: real or reciprocal space.
(latticeType): string scalar Space group of the lattice. No defined vocabulary
(units): string scalar: units of (reciprocal) distance that the cell vectors are given in; default: Angstrom

Outputs information about a unit cell, in lattice-vector form

cmlAddCrystal
a: real scalar the 'a' parameter (must be in Angstrom)
b: real scalar the 'b' parameter
c: real scalar the 'c' parameter
alpha: real scalar the 'alpha' parameter
beta: real scalar the 'beta' parameter
gamma: real scalar the 'gamma' parameter
(z): integer scalar the 'z' parameter: number of molecules per unit cell.
(lenunits): string scalar: Units of length: default is units:angstrom
(angunits): string scalar: Units of angle: default is units:degrees
(lenfmt): string scalar: format for crystal lengths
(angfmt): string scalar: format for crystal angles
(spaceGroup): string scalar Space group of the crystal. No defined vocabulary.

Outputs information about a unit cell, in crystallographic form
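The two unit-cell routines might be used as in the following sketch, which writes the same cubic cell in both lattice-vector and crystallographic form. The cell dimension is an arbitrary example value, default units are relied upon, and wp is a local kind name.

```fortran
program cell_sketch
  use FoX_wcml
  implicit none

  ! Local double precision kind (a name chosen for this sketch only).
  integer, parameter :: wp = kind(1.0d0)

  type(xmlf_t) :: xf
  real(wp) :: cell(3, 3)
  real(wp), parameter :: a0 = 5.43_wp   ! illustrative cell edge in Angstrom

  ! Cubic cell: lattice vectors along the Cartesian axes.
  cell = 0.0_wp
  cell(1, 1) = a0
  cell(2, 2) = a0
  cell(3, 3) = a0

  call cmlBeginFile(xf, filename='cell.cml', unit=-1)
  call cmlStartCml(xf)

  ! Lattice-vector form.
  call cmlAddLattice(xf=xf, cell=cell)

  ! Crystallographic form of the same cell.
  call cmlAddCrystal(xf=xf, a=a0, b=a0, c=a0, &
                     alpha=90.0_wp, beta=90.0_wp, gamma=90.0_wp)

  call cmlEndCml(xf)
  call cmlFinishFile(xf)
end program cell_sketch
```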

Adding eigen-information

cmlStartKPoint
kpoint: real array-3 the reciprocal-space coordinates of the k-point
(weight): real scalar the weight of the kpoint
(kptfmt): string scalar numerical formatting for the k-point
(wtfmt): string scalar numerical formatting for the weight

Start a kpoint section.

cmlEndKPoint

End a kpoint section.

cmlAddKPoint
kpoint: real array-3 the reciprocal-space coordinates of the k-point
(weight): real scalar the weight of the kpoint
(kptfmt): string scalar numerical formatting for the k-point
(wtfmt): string scalar numerical formatting for the weight

Add an empty kpoint section.

cmlStartBand
(spin): string scalar the spin of this band. Must be either "up" or "down"
(label): the label of this band.

Start a section describing one band.

cmlEndBand

End a section describing one band.

cmlAddEigenValue
value: real scalar the eigenvalue
units: QName scalar the units of the eigenvalue

Add a single eigenvalue to a band.

cmlAddBandList
values: real array the eigenvalues
spin: string scalar the spin orientation ("up" or "down")
units: QName scalar the units of the eigenvalue

Add a list of eigenvalues for a kpoint

cmlAddEigenValueVector
value: real scalar the eigenvalue for this band
units: QName scalar the units of the eigenvalue
vector: real/complex 3xN matrix the eigenvectors for this band
(valfmt): string scalar numerical formatting for the eigenvalue
(vecfmt): string scalar numerical formatting for the eigenvector

Add a phononic eigenpoint to the band - which has a single energy, and a 3xN matrix representing the eigenvector.
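As a rough sketch of how the k-point routines could fit together, the program below opens one k-point section and attaches a list of eigenvalues to it. The eigenvalues are invented, and the unit name units:eV is an assumption: check that it exists in the units dictionary you are using, or substitute a unit from your own namespace.

```fortran
program kpoint_sketch
  use FoX_wcml
  implicit none

  ! Local double precision kind (a name chosen for this sketch only).
  integer, parameter :: wp = kind(1.0d0)

  type(xmlf_t) :: xf

  call cmlBeginFile(xf, filename='bands.cml', unit=-1)
  call cmlStartCml(xf)

  ! One k-point at the zone centre with unit weight.
  call cmlStartKpoint(xf, kpoint=(/ 0.0_wp, 0.0_wp, 0.0_wp /), weight=1.0_wp)

  ! Eigenvalues for this k-point (values and unit name are illustrative).
  call cmlAddBandList(xf, values=(/ -5.2_wp, -1.3_wp, 2.4_wp /), &
                      spin='up', units='units:eV')

  call cmlEndKpoint(xf)

  call cmlEndCml(xf)
  call cmlFinishFile(xf)
end program kpoint_sketch
```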

Common arguments

All cmlAdd and cmlStart routines take the following set of optional arguments:

id: Unique identifying string for element. (Uniqueness is not enforced, though duplicated ids on output are usually an error and may cause later problems)
title: Human-readable title of element for display purposes
dictRef: reference to disambiguate element. Should be a QName; a namespaced string. An actual dictionary entry may or may not exist. It is not an error for it not to.
convention: convention by which the element is to be read.
(The wording of the definitions for convention is deliberately loose.)

WKML

WKML is a library for creating KML documents. These documents are intended to be used for "expressing geographic annotation and visualization" for maps and Earth browsers such as Google Earth or Marble. WKML wraps all the necessary XML calls, such that you should never need to touch any WXML calls when outputting KML from a Fortran application.

WKML is intended to produce XML documents that conform to version 2.2 of the Open Geospatial Consortium's schema. However, the library offers no guarantee that documents produced will be valid, as only a small subset of the constraints are enforced. The API is designed to minimize the possibility of producing invalid KML in common use cases, and well-formedness is maintained by the underlying WXML library.

The available functions and their intended use are listed below. One useful reference to the use of KML is Google's KML documentation.

Use of WKML

wkml subroutines can be accessed from within a module or subroutine by inserting

 use FoX_wkml

at the start. This will import all of the subroutines described below, plus the derived type xmlf_t needed to manipulate a KML file.

No other entities will be imported; public/private Fortran namespaces are very carefully controlled within the library.

Conventions used below.

Function names are in monospace
argument names are in bold
optional argument names are in (parenthesized bold)
argument types are in italic and may consist of:
string: string of arbitrary (unless otherwise specified) length
integer: default integer
real(sp): single precision real number
real(dp): double precision real number
logical: default logical
real: either of real(sp) or real(dp)
arguments may be:
scalar: single item
array: one-dimensional array of items
matrix: two-dimensional array of items
anydim: any of scalar, array, matrix

All functions take as their first argument an XML file object, whose keyword is always xf. This file object is initialized by a kmlBeginFile function.

Functions for manipulating the KML file:

kmlBeginFile
fx: xmlf_t: An XML file object
filename: string scalar: Filename to be opened.
unit: integer scalar: what unit number should the file be opened on? If you don't care, you may specify -1 as the unit number, in which case wkml will make a guess
(replace): logical scalar: should the file be replaced if it already exists? default: yes
(docName): string scalar: an optional name for the outermost document element. If absent, "WKML output" will be used

This takes care of all calls to open a KML output file.

kmlFinishFile
fx: xmlf_t: An XML file object

This takes care of all calls to close an open KML output file, once you have finished with it. It is compulsory to call this: if your program finishes without calling it, then your KML file will be invalid.

kmlOpenFolder
fx: xmlf_t: An XML file object
(name): string scalar: an optional name for the new folder.
(id): string scalar: an optional xml id for the new folder.

This starts a new folder. Folders are used in KML to organize other objects into groups; the visibility of these groups can be changed in one operation within Google Earth. Folders can be nested.

kmlCloseFolder
fx: xmlf_t: An XML file object

This closes the current folder.

kmlOpenDocument
fx: xmlf_t: An XML file object
name: string scalar: a name for the new document element.
(id): string scalar: an optional xml id for the new document element.

This starts a new document element at this point in the output. Note that no checks are currently performed to ensure that this is permitted; for example, only one document is permitted to be a child of the kml root element. Most users should not need to use this subroutine.

kmlCloseDocument
fx: xmlf_t: An XML file object

This closes the current document element. Do not close the outermost document element created with kmlBeginFile; this must be closed with kmlFinishFile. Most users should not need to use this subroutine.

Functions for producing geometrical objects:

kmlCreatePoints
fx: xmlf_t: An XML file object
(extrude): logical scalar: If altitude is non-zero, should the point be connected to the ground?
(altitudeMode): logical scalar: If altitude is specified, is it relativeToGround or absolute?
(name): string scalar: A name for the collection of points
(color): color_t: Line colour as a kml color type (See Colours)
(colorname): string scalar: Line colour as a name (See Colours)
(colorhex): string(len=8) scalar: Line colour in hex (See Colours)
(scale): real scalar: Scaling size for the point icon.
(description): string array: A description for each point.
(description_numbers): real array: Numeric description for each point.
(styleURL): string scalar: Location of style specification (see Style Handling)
and:
longitude: real array: longitude of each point in degrees
latitude: real array: latitude of each point in degrees
(altitude): real array: altitude of each point in metres
or:
location: real matrix: rank-two 2xN array with the longitude of each point in the first row, and the latitude in the second row. In degrees.
(altitude): real array: altitude of each point in metres
or:
location: real matrix: rank-two 3xN array with the longitude of each point in the first row, the latitude in the second row, and the altitude in the third row. Longitude and latitude in degrees and altitude in metres.

A single function, kmlCreatePoints accepts various combinations of arguments, and will generate a series of individual points to be visualized in Google Earth. In fact, the KML produced will consist of a Folder, containing Placemarks, one for each point. The list of points may be provided in any of the three ways specified above.
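A minimal sketch of the first calling form (separate longitude and latitude arrays) follows. The coordinates, filename and collection name are illustrative only; the file object is passed positionally, and wp is a local kind name.

```fortran
program points_sketch
  use FoX_wkml
  implicit none

  ! Local double precision kind (a name chosen for this sketch only).
  integer, parameter :: wp = kind(1.0d0)

  type(xmlf_t) :: xf
  real(wp) :: lon(3), lat(3)

  ! Three example locations, in degrees.
  lon = (/ -0.1276_wp,  2.3522_wp, 13.4050_wp /)
  lat = (/ 51.5072_wp, 48.8566_wp, 52.5200_wp /)

  ! A unit number of -1 lets wkml pick one.
  call kmlBeginFile(xf, 'cities.kml', -1)

  ! One Placemark is produced per point.
  call kmlCreatePoints(xf, longitude=lon, latitude=lat, name='Cities')

  call kmlFinishFile(xf)
end program points_sketch
```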

kmlCreateLine
fx: xmlf_t: An XML file object
(closed): logical scalar: Should the last point be joined to the first point?
(extrude): logical scalar: If altitude is non-zero, should the point be connected to the ground?
(tessellate): logical scalar: If altitude is not specified, should the line produced follow the altitude of the ground below it?
(altitudeMode): logical scalar: If altitude is specified, is it relativeToGround or absolute?
(name): string scalar: A name for the collection of points
(color): color_t: Line colour as a kml color type (See Colours)
(colorname): string scalar: Line colour as a name (See Colours)
(colorhex): string(len=8) scalar: Line colour in hex (See Colours)
(width): integer scalar: Width of the lines.
(scale): real scalar: Scaling size for the point icon.
(description): string array: A description for each point.
(styleURL): string scalar: Location of style specification (see Style Handling)
and:
longitude: real array: longitude of each point in degrees
latitude: real array: latitude of each point in degrees
(altitude): real array: altitude of each point in metres
or:
location: real matrix: rank-two 2xN array with the longitude of each point in the first row, and the latitude in the second row. In degrees.
(altitude): real array: altitude of each point in metres
or:
location: real matrix: rank-two 3xN array with the longitude of each point in the first row, the latitude in the second row, and the altitude in the third row. Longitude and latitude in degrees and altitude in metres.

A single function, kmlCreateLine, accepts various combinations of arguments, and will generate a series of individual points to be visualized as a (closed or open) path in Google Earth. In fact, the KML produced will consist of a LineString, or LinearRing, containing a list of coordinates. The list of points may be provided in any of the three ways specified above.

kmlStartRegion
fx: xmlf_t: An XML file object
(extrude): logical scalar: If altitude is non-zero, should the point be connected to the ground?
(tessellate): logical scalar: If altitude is not specified, should the line produced follow the altitude of the ground below it?
(altitudeMode): logical scalar: If altitude is specified, is it relativeToGround or absolute?
(name): string scalar: A name for the region
(fillcolor): color_t: Region colour as a kml color type (See Colours)
(fillcolorname): string scalar: Region colour as a name (See Colours)
(fillcolorhex): string(len=8) scalar: Region colour in hex (See Colours)
(linecolor): color_t: Line colour as a kml color type (See Colours)
(linecolorname): string scalar: Line colour as a name (See Colours)
(linecolorhex): string(len=8) scalar: Line colour in hex (See Colours)
(linewidth): integer scalar: Width of the line.
(description): string scalar: A description for the region.
(styleURL): string scalar: Location of style specification (see Style Handling)
and:
longitude: real array: longitude of each point in degrees
latitude: real array: latitude of each point in degrees
(altitude): real array: altitude of each point in metres
or:
location: real matrix: rank-two 2xN array with the longitude of each point in the first row, and the latitude in the second row. In degrees.
(altitude): real array: altitude of each point in metres
or:
location: real matrix: rank-two 3xN array with the longitude of each point in the first row, the latitude in the second row, and the altitude in the third row. Longitude and latitude in degrees and altitude in metres.

Creates a filled region with the outer boundary described by the list of points. May be followed by one or more calls to kmlAddInnerBoundary, and these must be followed by a call to kmlEndRegion.
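
The calling sequence for a region with a hole might look like the following sketch: one call to open the region, one call per excluded area, and a final call to kmlEndRegion. The square coordinates and file name are invented for illustration, and kmlBeginFile is assumed to accept -1 as a request for an automatically chosen unit number.

```fortran
program region_sketch
  use FoX_wkml
  implicit none

  type(xmlf_t) :: fx
  ! Outer boundary: a 10 x 10 degree square (made-up coordinates).
  real :: outer_lon(4) = (/ 0.0, 10.0, 10.0,  0.0 /)
  real :: outer_lat(4) = (/ 0.0,  0.0, 10.0, 10.0 /)
  ! Inner boundary: a 2 x 2 degree square cut out of the middle.
  real :: inner_lon(4) = (/ 4.0, 6.0, 6.0, 4.0 /)
  real :: inner_lat(4) = (/ 4.0, 4.0, 6.0, 6.0 /)

  call kmlBeginFile(fx, "region_sketch.kml", -1)
  call kmlStartRegion(fx, outer_lon, outer_lat, name="Square with a hole")
  call kmlAddInnerBoundary(fx, inner_lon, inner_lat)
  ! The region must always be closed, with or without inner boundaries.
  call kmlEndRegion(fx)
  call kmlFinishFile(fx)
end program region_sketch
```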

kmlEndRegion
fx: xmlf_t: An XML file object

Ends the specification of a region with or without inner boundaries.

kmlAddInnerBoundary
fx: xmlf_t: An XML file object
and:
longitude: real array: longitude of each point in degrees
latitude: real array: latitude of each point in degrees
(altitude): real array: altitude of each point in metres
or:
location: real matrix: rank-two 2xN array with the longitude of each point in the first row, and the latitude in the second row. In degrees.
(altitude): real array: altitude of each point in metres
or:
location: real matrix: rank-two 3xN array with the longitude of each point in the first row, the latitude in the second row, and the altitude in the third row. Longitude and latitude in degrees and altitude in metres.

Introduces an internal area that is to be excluded from the enclosing region.

2D fields

WKML also contains two subroutines to allow scalar fields to be plotted over a geographical region. Data is presented to WKML as a collection of values and coordinates and this data can be displayed as a set of coloured cells, or as isocontours.

Data input

For all 2-D field subroutines both position and value of the data must be specified. The data values must always be specified as a rank-2 array, values(:,:). The grid can be specified in three ways depending on grid type.

Regular rectangular grid: Specify north, south, east, west. These specify the four corners of the grid (which must be aligned with lines of longitude and latitude).
Irregularly spaced rectangular grid. Specify two rank-one arrays, longitude(:) and latitude(:). The grid must be aligned with lines of longitude and latitude so that: Grid-point (i, j) = (longitude(i), latitude(j))
Entirely irregular (topologically rectangular) grid. Specify two rank-two arrays, longitude(:,:) and latitude(:,:). The grid may be of any form, aligned with no other projection: Grid-point (i, j) is taken as (longitude(i, j), latitude(i, j))

In all cases, single or double precision data may be used so long as all data is consistent in precision within one call.

Control over the third dimension

The third dimension of the data can be visualized in two (not mutually-exclusive) ways: firstly by assigning colours according to the value of the third dimension, and secondly by using the altitude of the points as a (suitably scaled) proxy for the third dimension. The following optional arguments control this aspect of the visualization (both for cells and for contours):

type(color_t) :: colormap(:): an array of colours (see Colours) which will be used for painting the various layers
real :: contour_values(:): an array of values which will be used to divide each layer of the third dimension. Single/double precision according to context.
integer :: numvalues: where contourvalues is not specified, this requests that the range of the values be divided into equal-sized layers such that there are this many divisors.
real :: height: where this is specified, the generated visualization will vary in height as well as colour. The value of this variable will be used as a multiplicative prefactor to scale the data before visualization.

Where no colormap is provided, one will be autogenerated with the appropriate number of levels as calculated from the provided contourvalues. Where no contourvalues are provided, they are calculated based on the size of the colormap provided. Where neither colormap nor contour_values are provided, a default of 5 levels with an autogenerated colormap will be used.

Subroutines

kmlCreateCells
fx: xmlf_t: An XML file object
and:
east: real scalar: east edge of data set.
west: real scalar: west edge of data set.
south: real scalar: south edge of data set.
north: real scalar: north edge of data set.
or:
longitude: real array: points in north-south direction where grid lines cross lines of longitude.
latitude: real array: points in east-west direction where grid lines cross lines of latitude.
or:
longitude: real matrix: longitude of each point in values matrix.
latitude: real matrix: latitude of each point in values matrix.
and:
values: real matrix: data values.
(colormap): color_t array: colours used to describe values.
(height): real(sp) scalar: where this is specified, the generated visualization will vary in height as well as colour. The value of this variable will be used as a multiplicative prefactor to scale the data before visualization.
(contourvalues): real(sp) array: values used to contour data.
(numlevels): integer scalar: number of data values to show.
(name): string scalar: name describing the cells.

This subroutine generates a set of filled pixels over a region of the earth.

kmlCreateContours
fx: xmlf_t: An XML file object
and:
east: real scalar: east edge of data set.
west: real scalar: west edge of data set.
south: real scalar: south edge of data set.
north: real scalar: north edge of data set.
or:
longitude: real array: points in north-south direction where grid lines cross lines of longitude.
latitude: real array: points in east-west direction where grid lines cross lines of latitude.
or:
longitude: real matrix: longitude of each point in values matrix.
latitude: real matrix: latitude of each point in values matrix.
and:
values: real matrix: data values.
(colormap): color_t array: colours used to describe values.
(height): real(sp) scalar: where this is specified, the generated visualization will vary in height as well as colour. The value of this variable will be used as a multiplicative prefactor to scale the data before visualization.
(contourvalues): real(sp) array: values used to contour data.
(numlevels): integer scalar: number of data values to show.
(name): string scalar: name describing the cells.
(lines): logical scalar: should contour lines be shown.
(regions): logical scalar: should contour regions be shown.

This subroutine creates a set of contour lines.

Colours

KML natively handles all colours as 32-bit values, expressed as 8-digit hexadecimal numbers in ABGR (alpha-blue-green-red) channel order. However, this is not very friendly. WKML provides a nicer interface to this, and all WKML functions which accept colour arguments will accept them in three ways:

(*color) color_t: the colour is passed as a wkml color_t derived type. This type is opaque and is created as described below.
(*colorname) string: a free-text string describing a colour. WKML understands any of the approximately 700 colour names used by X11.
(*colorhex) string(len=8): an 8-digit ABGR hexadecimal number as understood by Google Earth.
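
To make the ABGR channel order concrete, here is a small self-contained sketch that packs four channel values (each 0 to 255) into an 8-digit hexadecimal string. The helper abgr_hex is hypothetical and not part of WKML; it only illustrates the digit layout that a colorhex argument is expected to have.

```fortran
program abgr_sketch
  implicit none
  character(len=8) :: colorhex

  ! Opaque pure red: alpha=255, blue=0, green=0, red=255.
  colorhex = abgr_hex(255, 0, 0, 255)
  print *, colorhex

contains

  ! Hypothetical helper (not a WKML routine): write four 0-255 channel
  ! values as two hex digits each, in alpha-blue-green-red order.
  function abgr_hex(alpha, blue, green, red) result(hex)
    integer, intent(in) :: alpha, blue, green, red
    character(len=8) :: hex
    write(hex, '(4Z2.2)') alpha, blue, green, red
  end function abgr_hex

end program abgr_sketch
```

For the values above the string is FF0000FF: full alpha first, red last. Note that red comes at the end, the reverse of the RGB order used in most other contexts. Values outside 0 to 255 do not fit in two hex digits and would fill the field with asterisks.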

A function and a subroutine are provided to manipulate the color_t derived type:

kmlGetCustomColor

This function takes a single argument of type integer or string and returns a color_t derived type. If the argument is a string, the colour is taken from the set of X11 colours; if it is an integer, i, the ith colour is selected from the X11 list.

kmlSetCustomColor
myCI color_t: This intent(out) variable is set to the chosen colour.
colorhex string(len=8): an 8-digit ABGR hexadecimal number.

This subroutine takes a string(len=8) representing an 8-digit ABGR hexadecimal number and sets myCI to a color_t derived type representing that colour.

Several features of wkml make use of "colour maps", arrays of the color_t derived type, which are used to relate numerical values to colours when showing fields of data. These are created and used thus:

program colours
  use FoX_wkml
  implicit none
  type(color_t) :: colourmap(10)
  integer :: i

  ! Use X11 colours from 101 to 110:
  do i = 1, 10
    colourmap(i) = kmlGetCustomColor(100 + i)
  end do
  ! Except for number 5 which should be red:
  colourmap(5) = kmlGetCustomColor("indian red")
  ! And for number 6 which should be black
  call kmlSetCustomColor(colourmap(6), "00000000")

end program colours

Styles

Controlling styling in KML can be quite complex. Most of the subroutines in WKML allow some control of the generated style, but they do not provide access to the full KML vocabulary which allows more complex styling. In order to access the more complex styles in KML it is necessary to create KML style maps - objects that are defined and named with a styleURL. The styleURL is then used to refer to the style defined by the map.

Styles can be created using the following three subroutines. In each case one argument is necessary: id, which must be a string (starting with an alphabetic letter, and containing no spaces or punctuation marks) which is used later on to reference the style. All other arguments are optional.

kmlCreatePointStyle
fx: xmlf_t: An XML file object
id: string scalar: A URL for the style
(scale): real or integer scalar: A scale factor to set the size of the image displayed at the point (note, if both are present, scale and heading must be of the same type).
(color): color_t: Point colour as a kml color type (See Colours)
(colorname): string scalar: Point colour as a name (See Colours)
(colorhex): string(len=8) scalar: Point colour in hex (See Colours)
(colormode): string(len=6) scalar: A string, either normal or random - if random, the colour will be randomly changed. See the KML documentation
(heading): real or integer scalar: direction to "point" the point icon in (between 0 and 360 degrees; note, if both are present, scale and heading must be of the same type).
(iconhref): string scalar: URL of an icon used to draw the point (e.g. from an http server).

Creates a style that can be used for points.

kmlCreateLineStyle
fx: xmlf_t: An XML file object
id: string scalar: A URL for the style
(width): integer scalar: width of the line in pixels.
(color): color_t: Line colour as a kml color type (See Colours)
(colorname): string scalar: Line colour as a name (See Colours)
(colorhex): string(len=8) scalar: Line colour in hex (See Colours)
(colormode): string(len=6) scalar: A string, either normal or random - if random, the colour will be randomly changed. See the KML documentation

Creates a style that can be used for lines.

kmlCreatePolygonStyle
fx: xmlf_t: An XML file object
id: string scalar: A URL for the style
(fill): logical scalar: Should the polygon be filled?
(outline): logical scalar: Should the polygon have an outline?
(color): color_t: Polygon colour as a kml color type (See Colours)
(colorname): string scalar: Polygon colour as a name (See Colours)
(colorhex): string(len=8) scalar: Polygon colour in hex (See Colours)
(colormode): string(len=6) scalar: A string, either normal or random - if random, the colour will be randomly changed. See the KML documentation

Creates a style that can be used for a polygon.

Debugging with FoX.

Following experience integrating FoX_wxml into several codes, here are a few tips for debugging any problems you may encounter.

Compilation problems

You may encounter problems at the compiling or linking stage, with error messages along the lines of: 'No Specific Function can be found for this Generic Function' (exact phrasing depending on compiler, of course.)

If this is the case, it is possible that you have accidentally got the arguments to the offending subroutine out of order. If so, then use the keyword form of the arguments to ensure correctness; that is, instead of doing:

call cmlAddProperty(file, name, value)

do:

call cmlAddProperty(xf=file, name=name, value=value)

This will prevent argument mismatches, and is recommended practice in any case.

Runtime problems

You may encounter run-time issues. FoX performs many run-time checks to ensure the validity of the resultant XML code. In so far as it is possible, FoX will either issue warnings about potential problems, or try to handle safely any errors it encounters. In both cases, warnings will be output on stderr, which will hopefully help diagnose the problem.

Sometimes, however, FoX will encounter a problem it can do nothing about, and must stop. In all cases, it will try and write out an error message highlighting the reason, and generate a backtrace pointing to the offending line. Occasionally though, the compiler will not generate this information, and the error message will be lost.

If this is the case, you can either investigate the coredump to find the problem, or (if you are on a Mac) look in ~/Library/Logs/CrashReporter to find a human-readable log.

If this is not enlightening, or you cannot find the problem, then some of the most common issues we have encountered are listed below. Many of them are general Fortran problems, but sometimes are not easily spotted in the context of FoX.

Incorrect formatting.

Make sure, whenever you are writing out a real number through one of FoX's routines, and specifying a format, that the format is correct according to StringFormatting. Fortran-style formats are not permitted, and will cause crashes at runtime.

Array overruns

If you are outputting arrays or matrices, and are doing so in the traditional Fortran style - by passing both the array and its length to the routine, like so:

 call xml_AddAttribute(xf=file, name=name, value=array, nvalue=n)

then if n is wrong, you may end up with an array overrun, and cause a crash.

We highly recommend wherever possible using the Fortran-90 style, like so:

 call xml_AddAttribute(xf=file, name=name, value=array)

where the array length will be passed automatically.

Uninitialized variables

If you are passing variables to FoX which have not been initialized, you may well cause a crash. This is especially true, and easy to cause if you are passing in an array which (due to a bug elsewhere) has been partly but not entirely initialized. To diagnose this, try printing out suspect variables just before passing them to FoX, and look for suspiciously wrong values.

Invalid floating point numbers.

If during the course of your calculation you accidentally generate Infinities, or NaNs, then passing them to any Fortran subroutine can result in a crash - therefore trying to pass them to FoX for output may result in a crash.

If you suspect this is happening, try printing out suspect variables before calling FoX.
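
One way to automate that check is sketched below. The function is_finite is a hypothetical helper, not part of FoX; it relies on two properties of IEEE arithmetic: a NaN compares unequal to itself, and an infinity is larger in magnitude than huge(x). Be aware that aggressive floating-point optimisation flags may remove the self-comparison, so this is best used in a debugging build.

```fortran
program finite_check_sketch
  implicit none
  real :: zero, values(3)
  integer :: i

  ! Build a NaN and an infinity at run time (zero is a variable, so the
  ! compiler does not reject these divisions as constant expressions).
  zero = 0.0
  values(1) = 1.5
  values(2) = zero / zero
  values(3) = 1.0 / zero

  ! Check every element before it would be handed to an output routine.
  do i = 1, size(values)
    if (.not. is_finite(values(i))) then
      print *, "element", i, "is not finite:", values(i)
    end if
  end do

contains

  ! Hypothetical helper: .true. only for ordinary finite numbers.
  logical function is_finite(x)
    real, intent(in) :: x
    is_finite = (x == x) .and. (abs(x) <= huge(x))
  end function is_finite

end program finite_check_sketch
```

With default compiler settings this reports elements 2 and 3 and stays silent about element 1.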

SAX

SAX stands for Simple API for XML, and was originally a Java API for reading XML. (Full details at http://saxproject.org). SAX implementations exist for most common modern computer languages.

FoX includes a SAX implementation, which translates most of the Java API into Fortran, and makes it accessible to Fortran programs, enabling them to read in XML documents in a fashion as close and familiar as possible to other languages.

SAX is a stream-based, event callback API. Conceptually, running a SAX parser over a document results in the parser generating events as it encounters different XML components and sending those events to the main program, which can read them and take suitable action.

Events

Events are generated when the parser encounters, for example, an element opening tag, or some text, and most events carry some data with them - the name of the tag, or the contents of the text.

The full list of events is quite extensive, and may be seen below. For most purposes, though, it is unlikely that most users will need more than the 5 most common events, documented here.

startDocument - generated when the parser starts reading the document. No accompanying data.
endDocument - generated when the parser reaches the end of the document. No accompanying data.
startElement - generated by an element opening tag. Accompanied by tag name, namespace information, and a list of attributes
endElement - generated by an element closing tag. Accompanied by tag name, and namespace information.
characters - generated by text between tags. Accompanied by contents of text.

Given these events and accompanying information, a program can extract data from an XML document.

Invoking the parser.

Any program using the FoX SAX parser must a) use the FoX module, and b) declare a derived type variable to hold the parser, like so:

   use FoX_sax
   type(xml_t) :: xp

The FoX SAX parser then works by requiring the programmer to write a module containing subroutines to receive any of the events they are interested in, and passing these subroutines to the parser.

Firstly, the parser must be initialized, by passing it XML data. This can be done either by giving a filename, which the parser will manipulate, or by passing a string containing an XML document. Thus:

  call open_xml_file(xp, "input.xml", iostat)

The iostat variable will report back any errors in opening the file.

Alternatively,

  call open_xml_string(xp, XMLstring)

where XMLstring is a character variable.

To now run the parser over the file, you simply do:

 call parse(xp, list_of_event_handlers)

And once you're finished, you can close the file, and clean up the parser, with:

 call close_xml_t(xp)

Options to parser

It is unlikely that most users will need to operate any of these options, but the following are available for use; all are optional boolean arguments to parse.

namespaces
Does namespace processing occur? Default is .true., and if on, then any non-namespace-well-formed documents will be rejected, and namespace URI resolution will be performed according to the version of XML in question. If off, then documents will be processed without regard for namespace well-formedness, and no namespace URI resolution will be performed.
namespace_prefixes
Are xmlns attributes reported through the SAX parser? Default is .false.; all such attributes are removed by the parser, and transparent namespace URI resolution is performed. If on, then such attributes will be reported, and treated according to the value of xmlns_uris below. (If namespaces is .false., this flag has no effect.)
validate
Should validation be performed? Default is .false.: no validation checks are made, and the influence of the DTD on the XML Infoset is ignored. (Ill-formed DTDs will still cause fatal errors, of course.) If .true., then validation will be performed, and the Infoset modified accordingly.
xmlns_uris
Should xmlns attributes have a namespace of http://www.w3.org/2000/xmlns/? Default is .false.: if such attributes are reported, they have no namespace. If .true. then they are supplied with the appropriate namespace. (If namespaces or namespace_prefixes are .false., then this flag has no effect.)

Receiving events

To receive events, you must construct a module containing event handling subroutines. These are subroutines of a prescribed form - the input & output is predetermined by the requirements of the SAX interface, but the body of the subroutine is up to you.

The required forms are shown in the API documentation below, but here are some simple examples.

To receive notification of character events, you must write a subroutine which takes as input one string, which will contain the characters received. So:

module event_handling
  use FoX_sax
contains

  subroutine characters_handler(chars)
    character(len=*), intent(in) :: chars

    print*, chars
  end subroutine
end module

That does very little - it simply prints out the data it receives. However, since the subroutine is in a module, you can save the data to a module variable, and manipulate it elsewhere; alternatively you can choose to call other subroutines based on the input.

So, a complete program which reads in all the text from an XML document looks like this:

module event_handling
  use FoX_sax
contains

  subroutine characters_handler(chars)
    character(len=*), intent(in) :: chars

    print*, chars
  end subroutine
end module

program XMLreader
  use FoX_sax
  use event_handling
  type(xml_t) :: xp
  call open_xml_file(xp, 'input.xml')
  call parse(xp, characters_handler=characters_handler)
  call close_xml_t(xp)
end program

Attribute dictionaries.

The other event most likely to be needed is the startElement event. Handling this involves writing a subroutine which takes as input three strings (which are the namespace URI, local name, and fully qualified name of the tag) and a dictionary of attributes.

An attribute dictionary is essentially a set of key:value pairs - where the key is the attribute's name, and the value is its value. (When considering namespaces, each attribute also has a URI and localName.)

Full details of all the dictionary-manipulation routines are given in AttributeDictionaries, but here we shall show the most common.

getLength(dictionary) - returns the number of entries in the dictionary (the number of attributes declared)
hasKey(dictionary, qName) (where qName is a string) returns .true. or .false. depending on whether an attribute named qName is present.
hasKey(dictionary, URI, localname) (where URI and localname are strings) returns .true. or .false. depending on whether an attribute with the appropriate URI and localname is present.
getQName(dictionary, i) (where i is an integer) returns a string containing the key of the ith dictionary entry (i.e. the name of the ith attribute).
getValue(dictionary, i) (where i is an integer) returns a string containing the value of the ith dictionary entry (i.e. the value of the ith attribute).
getValue(dictionary, URI, localname) (where URI and localname are strings) returns a string containing the value of the attribute with the appropriate URI and localname (if it is present)

So, a simple subroutine to receive a startElement event would look like:

module event_handling
  use FoX_sax   ! provides dictionary_t, getLength, getQName and getValue
contains

  subroutine startElement_handler(URI, localname, name, attributes)
    character(len=*), intent(in)   :: URI
    character(len=*), intent(in)   :: localname
    character(len=*), intent(in)   :: name
    type(dictionary_t), intent(in) :: attributes

    integer :: i

    ! Print the element name, then each attribute as name=value.
    print*, name

    do i = 1, getLength(attributes)
      print*, getQName(attributes, i), '=', getValue(attributes, i)
    enddo

  end subroutine startElement_handler
end module

program XMLreader
 use FoX_sax
 use event_handling
 type(xml_t) :: xp
 call open_xml_file(xp, 'input.xml')
 call parse(xp, startElement_handler=startElement_handler)
 call close_xml_t(xp)
end program

Again, this does nothing but print out the name of the element, and the names and values of all of its attributes. However, by using module variables, or calling other subroutines, the data could be manipulated further.

Error handling

The SAX parser detects all XML well-formedness errors (and optionally validation errors). By default, when it encounters an error, it will simply halt the program with a suitable error message. However, it is possible to pass in an error handling subroutine if some other behaviour is desired - for example it may be nice to report the error to the user, finish parsing, and carry on with some other task.

In any case, once an error is encountered, the parser will finish. There is no way to continue reading past an error. (This means that all errors are treated as fatal errors, in the terminology of the XML standard).

An error handling subroutine works in the same way as any other event handler, with the event data being an error message. Thus, you could write:

subroutine fatalError_handler(msg)
  character(len=*), intent(in) :: msg

  print*, "The SAX parser encountered an error:"
  print*, msg
  print*, "Never mind, carrying on with the rest of the calculation."
end subroutine

Stopping the parser.

The parser can be stopped at any time. Simply do (from within one of the callback functions):

call stop_parser(xp)

(where xp is the XML parser object). The current callback function will be completed, then the parser will be stopped, and control will return to the main program, the parser having finished.
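
Since a callback only receives the event data, the parser object has to be reachable from inside it by some other route. The sketch below does this by keeping xp as a module variable, which is one possible arrangement rather than a prescribed one. The element name "target" and the file name are made up for illustration, and the program needs to be linked against the FoX SAX library.

```fortran
module stop_early_handlers
  use FoX_sax
  implicit none
  ! Kept at module level so that the callback can refer to the parser.
  type(xml_t), save :: xp
contains

  subroutine startElement_handler(URI, localname, name, attributes)
    character(len=*), intent(in)   :: URI
    character(len=*), intent(in)   :: localname
    character(len=*), intent(in)   :: name
    type(dictionary_t), intent(in) :: attributes

    ! "target" is a hypothetical element name chosen for this sketch.
    if (name == "target") then
      print *, "Found <target>; stopping the parser."
      call stop_parser(xp)
    end if
  end subroutine startElement_handler

end module stop_early_handlers

program stop_early
  use FoX_sax
  use stop_early_handlers
  implicit none

  call open_xml_file(xp, "input.xml")
  ! Parsing returns here as soon as the handler that called stop_parser ends.
  call parse(xp, startElement_handler=startElement_handler)
  call close_xml_t(xp)
end program stop_early
```

This is useful when only the first part of a large document is of interest, since nothing after the target element is read.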

Full API

Derived types

There is one derived type, xml_t. This is entirely opaque, and is used as a handle for the parser.

Subroutines

There are four subroutines:

open_xml_file
  type(xml_t), intent(inout) :: xp
  character(len=*), intent(in) :: string
  integer, intent(out), optional :: iostat

This opens a file. xp is initialized, and prepared for parsing. string must contain the name of the file to be opened. iostat reports on the success of opening the file. A value of 0 indicates success.

open_xml_string
  type(xml_t), intent(inout) :: xp
  character(len=*), intent(in) :: string

This prepares to parse a string containing XML data. xp is initialized. string must contain the XML data.

close_xml_t
  type(xml_t), intent(inout) :: xp

This closes down the parser (and closes the file, if input was coming from a file.) xp is left uninitialized, ready to be used again if necessary.

parse
  type(xml_t), intent(inout) :: xp
  external :: list of event handlers
  logical, optional, intent(in) :: validate

This tells xp to start parsing its document.

(Advanced: See above for the list of options that the parse subroutine may take.)

The full list of event handlers is in the next section. To use them, the interface must be placed in a module, and the body of the subroutine filled in as desired; then it should be specified as an argument to parse as:

  name_of_event_handler = name_of_user_written_subroutine

Thus a typical call to parse might look something like:

  call parse(xp, startElement_handler = mystartelement, endElement_handler = myendelement, characters_handler = mychars)

where mystartelement, myendelement, and mychars are all subroutines written by you according to the interfaces listed below.

Callbacks.

All of the callbacks specified by SAX 2 are implemented. Documentation of the SAX 2 interfaces is available in the JavaDoc at http://saxproject.org, but as the interfaces needed adjustment for Fortran, they are listed here.

For documentation on the meaning of the callbacks and of their arguments, please refer to the Java SAX documentation.

characters_handler
  subroutine characters_handler(chunk)
    character(len=*), intent(in) :: chunk
  end subroutine characters_handler

Triggered when some character data is read from between tags.

NB Note that all character data is reported, including whitespace. Thus you will probably get a lot of empty characters events in a typical XML document.

NB Note also that it is not required that a single chunk of character data all come as one event - it may come as multiple consecutive events. You should concatenate the results of subsequent character events before processing.
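
The concatenation advice can be sketched as follows: a characters handler appends each chunk to a module-level buffer, and an endElement handler processes the accumulated text and resets it. The two handlers follow the interfaces listed in this section, but the fixed buffer size is an arbitrary choice, and the main program stands in for the parser by calling the handlers directly, so the sketch runs without FoX.

```fortran
module text_accumulator
  implicit none
  integer, parameter :: max_len = 1024   ! arbitrary capacity for this sketch
  character(len=max_len) :: buffer = ""
  integer :: used = 0
contains

  ! Same interface as characters_handler: append the chunk to the buffer.
  subroutine characters_handler(chunk)
    character(len=*), intent(in) :: chunk
    integer :: n
    n = min(len(chunk), max_len - used)   ! excess text is dropped
    if (n > 0) then
      buffer(used+1:used+n) = chunk(1:n)
      used = used + n
    end if
  end subroutine characters_handler

  ! Same interface as endElement_handler: the text is now complete,
  ! so process it (here, just print it) and empty the buffer.
  subroutine endElement_handler(namespaceURI, localName, name)
    character(len=*), intent(in) :: namespaceURI
    character(len=*), intent(in) :: localName
    character(len=*), intent(in) :: name
    print *, "Text of <", name, ">: ", buffer(1:used)
    used = 0
  end subroutine endElement_handler

end module text_accumulator

program accumulate_sketch
  use text_accumulator
  implicit none
  ! Stand-in for the parser: one run of text delivered as two events.
  call characters_handler("Hello, ")
  call characters_handler("world")
  call endElement_handler("", "greeting", "greeting")
end program accumulate_sketch
```

The two chunks are joined into "Hello, world" before being processed. In a real document with nested elements the buffer would typically also be reset in a startElement handler, so that text belonging to a parent element is not mixed with that of its children.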

endDocument_handler
  subroutine endDocument_handler()
  end subroutine endDocument_handler

Triggered when the parser reaches the end of the document.

endElement_handler
  subroutine endElement_handler(namespaceURI, localName, name)
    character(len=*), intent(in) :: namespaceURI
    character(len=*), intent(in) :: localName
    character(len=*), intent(in) :: name
  end subroutine endElement_handler

Triggered by a closing tag.

endPrefixMapping_handler
  subroutine endPrefixMapping_handler(prefix)
    character(len=*), intent(in) :: prefix
  end subroutine endPrefixMapping_handler

Triggered when a namespace prefix mapping goes out of scope.

ignorableWhitespace_handler
  subroutine ignorableWhitespace_handler(chars)
    character(len=*), intent(in) :: chars
  end subroutine ignorableWhitespace_handler

Triggered when whitespace is encountered within an element declared as having no PCDATA. (Only active in validating mode.)

processingInstruction_handler
  subroutine processingInstruction_handler(name, content)
    character(len=*), intent(in) :: name
    character(len=*), intent(in) :: content
  end subroutine processingInstruction_handler

Triggered by a Processing Instruction

skippedEntity_handler
  subroutine skippedEntity_handler(name)
    character(len=*), intent(in) :: name
  end subroutine skippedEntity_handler

Triggered when either an external entity, or an undeclared entity, is skipped.

startDocument_handler
  subroutine startDocument_handler()
  end subroutine startDocument_handler

Triggered when the parser starts reading the document.

startElement_handler
  subroutine startElement_handler(namespaceURI, localName, name, attributes)
    character(len=*), intent(in) :: namespaceURI
    character(len=*), intent(in) :: localName
    character(len=*), intent(in) :: name
    type(dictionary_t), intent(in) :: attributes
  end subroutine startElement_handler

Triggered when an opening tag is encountered. (See AttributeDictionaries for documentation on handling attribute dictionaries.)

startPrefixMapping_handler subroutine startPrefixMapping_handler(namespaceURI, prefix) character(len=*), intent(in) :: namespaceURI character(len=*), intent(in) :: prefix end subroutine startPrefixMapping_handler

Triggered when a namespace prefix mapping comes into scope.

notationDecl_handler subroutine notationDecl_handler(name, publicId, systemId) character(len=*), intent(in) :: name character(len=*), intent(in) :: publicId character(len=*), intent(in) :: systemId end subroutine notationDecl_handler

Triggered when a NOTATION declaration is made in the DTD.

unparsedEntityDecl_handler subroutine unparsedEntityDecl_handler(name, publicId, systemId, notation) character(len=*), intent(in) :: name character(len=*), intent(in) :: publicId character(len=*), intent(in) :: systemId character(len=*), intent(in) :: notation end subroutine unparsedEntityDecl_handler

Triggered when an unparsed entity is declared.

error_handler subroutine error_handler(msg) character(len=*), intent(in) :: msg end subroutine error_handler

Triggered when an error is encountered in parsing. Parsing will continue after this event.

fatalError_handler subroutine fatalError_handler(msg) character(len=*), intent(in) :: msg end subroutine fatalError_handler

Triggered when a fatal error is encountered in parsing. Parsing will cease after this event.

warning_handler subroutine warning_handler(msg) character(len=*), intent(in) :: msg end subroutine warning_handler

Triggered when a parser warning is generated. Parsing will continue after this event.

attributeDecl_handler subroutine attributeDecl_handler(eName, aName, type, mode, value) character(len=*), intent(in) :: eName character(len=*), intent(in) :: aName character(len=*), intent(in) :: type character(len=*), intent(in) :: mode character(len=*), intent(in) :: value end subroutine attributeDecl_handler

Triggered when an attribute declaration is encountered in the DTD.

elementDecl_handler subroutine elementDecl_handler(name, model) character(len=*), intent(in) :: name character(len=*), intent(in) :: model end subroutine elementDecl_handler

Triggered when an element declaration is encountered in the DTD.

externalEntityDecl_handler subroutine externalEntityDecl_handler(name, publicId, systemId) character(len=*), intent(in) :: name character(len=*), intent(in) :: publicId character(len=*), intent(in) :: systemId end subroutine externalEntityDecl_handler

Triggered when a parsed external entity is declared in the DTD.

internalEntityDecl_handler subroutine internalEntityDecl_handler(name, value) character(len=*), intent(in) :: name character(len=*), intent(in) :: value end subroutine internalEntityDecl_handler

Triggered when an internal entity is declared in the DTD.

comment_handler subroutine comment_handler(comment) character(len=*), intent(in) :: comment end subroutine comment_handler

Triggered when a comment is encountered.

endCdata_handler subroutine endCdata_handler() end subroutine endCdata_handler

Triggered by the end of a CData section.

endDTD_handler subroutine endDTD_handler() end subroutine endDTD_handler

Triggered by the end of a DTD.

endEntity_handler subroutine endEntity_handler(name) character(len=*), intent(in) :: name end subroutine endEntity_handler

Triggered at the end of entity expansion.

startCdata_handler subroutine startCdata_handler() end subroutine startCdata_handler

Triggered by the start of a CData section.

startDTD_handler subroutine startDTD_handler(name, publicId, systemId) character(len=*), intent(in) :: name character(len=*), intent(in) :: publicId character(len=*), intent(in) :: systemId end subroutine startDTD_handler

Triggered by the start of a DTD section.

startEntity_handler subroutine startEntity_handler(name) character(len=*), intent(in) :: name end subroutine startEntity_handler

Triggered by the start of entity expansion.

Exceptions.

The FoX SAX implementation implements all of XML 1.0 and 1.1; all of XML Namespaces 1.0 and 1.1; xml:id and xml:base.

Although FoX tries very hard to work to the letter of the XML and SAX standards, it falls short in a few areas.

FoX will only process documents consisting of nothing but US-ASCII data. It will accept documents labelled with any character encoding which is identical to US-ASCII in its lower 7 bits (for example, any of the ISO-8859 charsets, or UTF-8), but an error will be generated as soon as any character outside US-ASCII is encountered. (This includes non-ASCII characters present only by character entity reference.)
As a corollary, UTF-16 documents of any endianness will also be rejected.

(It is impossible to implement IO of non-ASCII documents in a portable fashion using standard Fortran 95, and it is impossible to handle non-ASCII data internally using standard Fortran strings. A fully unicode-capable FoX version is under development, but requires Fortran 2003. Please enquire for further details if you're interested.)

FoX has no network capabilities. Therefore, when external entities are referenced, any entities not available on the local filesystem will not be accessed (specifically, any entities whose URI reference includes a scheme component, where that scheme is not file, will be skipped)

Beyond this, any aspects of the listed XML standards to which FoX fails to do justice are bugs.

What of Java SAX 2 is not included in FoX?

The differences between Java and Fortran mean that none of the SAX APIs can be copied directly. However, FoX offers data types, subroutines, and interfaces covering most of the facilities offered by SAX. Where it does not, this is mentioned here.

org.xml.sax:

Querying/setting of feature flags/property values for the XML parser. The effect of a subset of these may be accessed by options to the parse subroutine.
XML filters - Java SAX makes it possible to write filters to intercept the flow of events. FoX does not support this.
Entity resolution - SAX 2 exports an interface to the application for entity resolution, but FoX does not - all entities are resolved within the parser.
Locator - SAX 2 offers an interface to export information regarding object locations within the document; FoX does not.
XMLReader - FoX only offers the parse() method - no other methods really make sense in Fortran.
AttributeList/DocumentHandler/Parser - FoX only offers namespace aware attributes, not the pre-namespace SAX-1 versions.

org.xml.sax.ext:

EntityResolver2 - not implemented
Locator2 - not implemented

org.xml.sax.helpers:

None of these helper methods are implemented.

Attributes dictionaries.

When parsing XML using the FoX SAX module, attributes are returned contained within a dictionary object.

This dictionary object implements all the methods described by the SAX interfaces Attributes and Attributes2. Full documentation is available from the SAX Javadoc, but is reproduced here for ease of reference.

All of the attribute dictionary objects and functions are exported through FoX_sax - you must USE the module to enable them. The dictionary API is described here.

An attribute dictionary consists of a list of entries, one for each attribute. The entries all have the following pieces of data:

qName - the attribute's full name
value - the attribute's value

and for namespaced attributes:

uri - the namespace URI (if any) of the attribute
localName - the local name of the attribute

In addition, the following pieces of data will be picked up from a DTD if present:

declared - is the attribute declared in the DTD?
specified - is this instance of the attribute specified in the XML document, or is it a default from the DTD?
type - the type of the attribute (if declared)

Derived types

There is one derived type of interest, dictionary_t.

It is opaque - that is, it should only be manipulated through the functions described here.

Functions

Inspecting the dictionary

getLength type(dictionary_t), intent(in) :: dict

Returns an integer with the length of the dictionary, ie the number of dictionary entries.

hasKey type(dictionary_t), intent(in) :: dict character(len=*), intent(in) :: key

Returns a logical value according to whether the dictionary contains an attribute named key or not.

hasKey type(dictionary_t), intent(in) :: dict character(len=*), intent(in) :: uri character(len=*), intent(in) :: localname

Returns a logical value according to whether the dictionary contains an attribute with the correct URI and localname.

Retrieving data from the dictionary

getQName type(dictionary_t), intent(in) :: dict integer, intent(in) :: i

Return the full name of the ith dictionary entry.

getValue type(dictionary_t), intent(in) :: dict integer, intent(in) :: i

If an integer is passed in, returns the value of the ith attribute.

getValue type(dictionary_t), intent(in) :: dict character(len=*), intent(in) :: qName

If a single string is passed in, returns the value of the attribute with that name.

getValue type(dictionary_t), intent(in) :: dict character(len=*), intent(in) :: uri, localname

If two strings are passed in, returns the value of the attribute with that uri and localname.

getURI type(dictionary_t), intent(in) :: dict integer, intent(in) :: i

Returns a string containing the nsURI of the ith attribute.

getLocalName type(dictionary_t), intent(in) :: dict integer, intent(in) :: i

Returns a string containing the localName of the ith attribute.
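As an illustration of the inspection and retrieval functions above, here is a minimal sketch of a startElement handler that prints every attribute of each element. It assumes that dictionary entries are indexed from 1 to getLength (the indexing base is not stated in this section), and the module name attribute_printer and the attribute name "units" are hypothetical.

```fortran
module attribute_printer
  use FoX_sax
  implicit none
  private
  public :: startElement_handler

contains

  ! Same argument list as the startElement_handler interface documented above.
  subroutine startElement_handler(namespaceURI, localName, name, attributes)
    character(len=*), intent(in) :: namespaceURI
    character(len=*), intent(in) :: localName
    character(len=*), intent(in) :: name
    type(dictionary_t), intent(in) :: attributes

    integer :: i

    print *, "element: ", name

    ! ASSUMPTION: entries are numbered 1..getLength(attributes).
    do i = 1, getLength(attributes)
      print *, "  ", getQName(attributes, i), " = ", getValue(attributes, i)
    end do

    ! Look up a single attribute by name, checking first that it exists.
    ! "units" is a hypothetical attribute name used only for illustration.
    if (hasKey(attributes, "units")) then
      print *, "  units: ", getValue(attributes, "units")
    end if
  end subroutine startElement_handler

end module attribute_printer
```

The handler would then be passed to the parser in the usual way; checking hasKey before calling getValue avoids relying on whatever getValue returns for a missing attribute.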

DTD-driven functions

The following functions are only of interest if you are using DTDs.

getType type(dictionary_t), intent(in) :: dict integer, intent(in), optional :: i

If an integer is passed in, returns the type of the ith attribute.

getType type(dictionary_t), intent(in) :: dict character(len=*), intent(in) :: qName

If a single string is passed in, returns the type of the attribute with that QName.

getType type(dictionary_t), intent(in) :: dict character(len=*), intent(in) :: uri character(len=*), intent(in) :: localName

If two strings are passed in, returns the type of the attribute with that {uri,localName}.

isDeclared type(dictionary_t), intent(in) :: dict integer, intent(in), optional :: i

If an integer is passed in, returns false unless the ith attribute is declared in the DTD.

isDeclared type(dictionary_t), intent(in) :: dict character(len=*), intent(in) :: qName

If a single string is passed in, returns false unless the attribute with that QName is declared in the DTD.

isDeclared type(dictionary_t), intent(in) :: dict character(len=*), intent(in) :: uri character(len=*), intent(in) :: localName

If two strings are passed in, returns false unless the attribute with that {uri,localName} is declared in the DTD.

isSpecified type(dictionary_t), intent(in) :: dict integer, intent(in), optional :: i

If an integer is passed in, returns true unless the ith attribute is a default value from the DTD.

isSpecified type(dictionary_t), intent(in) :: dict character(len=*), intent(in) :: qName

If a single string is passed in, returns true unless the attribute with that QName is a default value from the DTD.

isSpecified type(dictionary_t), intent(in) :: dict character(len=*), intent(in) :: uri character(len=*), intent(in) :: localName

If two strings are passed in, returns true unless the attribute with that {uri,localName} is a default value from the DTD.

DOM

Overview

The FoX DOM interface exposes an API as specified by the W3C DOM Working group.

FoX implements essentially all of DOM Core Levels 1 and 2 (there are a number of minor exceptions, which are listed below) and a substantial portion of DOM Core Level 3.

Quick overview of how to map the DOM interface to Fortran
More detailed explanation of Fortran interface
Additional (non-DOM) utility functions
String handling
Exception handling
Live nodelists
DOM Configuration
Miscellanea

Interface Mapping

FoX implements all objects and methods mandated in DOM Core Level 1 and 2. (A listing of supported DOM Core Level 3 interfaces is given below.)

In all cases, the mapping from DOM interface to Fortran implementation is as follows:

1. All DOM objects are available as Fortran types, and should be referenced only as pointers (though see 7 and 8 below). Thus, to use a Node, it must be declared first as:
type(Node), pointer :: aNode
2. A flat (non-inheriting) object hierarchy is used. All DOM objects which inherit from Node are represented as Node types.
3. All object method calls are modelled as functions or subroutines with the same name, whose first argument is the object. Thus:
aNodelist = aNode.getElementsByTagName(tagName)
should be converted to Fortran as:
aNodelist => getElementsByTagName(aNode, tagName)
4. All object method calls whose return type is void are modelled as subroutines. Thus:
aNode.normalize()
becomes
call normalize(aNode)
5. All object attributes are modelled as a pair of get/set calls (or only get where the attribute is readonly), with the naming convention being merely to prepend get or set to the attribute name. Thus:
name = node.nodeName
node.nodeValue = string
should be converted to Fortran as
name = getnodeName(node)
call setnodeValue(node, string)
6. Where an object method or attribute getter returns a DOM object, the relevant Fortran function must always be used as a pointer function. Thus:
aNodelist => getElementsByTagName(aNode, tagName)
7. No special DOMString object is used - all string operations are done on the standard Fortran character strings, and all functions that return DOMStrings return Fortran character strings.
8. Exceptions are modelled by every DOM subroutine/function allowing an optional additional argument, of type DOMException. For further information see Exception handling below.

String handling

The W3C DOM requires that a DOMString object exist, capable of holding Unicode strings; and that all DOM functions accept and emit DOMString objects when string data is to be transferred.

FoX does not follow this model. Since (as mentioned elsewhere) it is impossible to perform Unicode I/O in standard Fortran, it would be obtuse to require users to manipulate additional objects merely to transfer strings. Therefore, wherever the DOM mandates use of a DOMString, FoX merely uses standard Fortran character strings.

All functions or subroutines which expect DOMString input arguments should be used with normal character strings.
All functions which should return DOMString objects will return Fortran character strings.

Using the FoX DOM library.

All functions are exposed through the module FoX_DOM. USE this in your program:

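program dom_example

  use FoX_DOM
  implicit none
  ! DOM objects must always be declared as pointers
  type(Node), pointer :: myDoc

  myDoc => parseFile("fileIn.xml")
  call serialize(myDoc, "fileOut.xml")
  ! Release all memory associated with the document
  call destroy(myDoc)
end program dom_example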

Documenting DOM functions

This manual will not exhaustively document the functions available through the FoX_DOM interface. Primary documentation may be found in the W3C DOM specifications.

The systematic rules for translating the DOM interfaces to Fortran are given in the previous section. For completeness, though, there is a list here. The W3C specifications should be consulted for the use of each.

DOMImplementation:
type(DOMImplementation), pointer

hasFeature(impl, feature, version)
createDocumentType(impl, qualifiedName, publicId, systemId)
createDocument(impl, namespaceURI, qualifiedName, docType)

Document: type(Node), pointer

getDocType(doc)
getImplementation(doc)
getDocumentElement(doc)
createElement(doc, tagname)
createDocumentFragment(doc)
createTextNode(doc, data)
createComment(doc, data)
createCDataSection(doc, data)
createProcessingInstruction(doc, target, data)
createAttribute(doc, name)
createEntityReference(doc, name)
getElementsByTagName(doc, tagname)
importNode(doc, importedNode, deep)
createElementNS(doc, namespaceURI, qualifiedName)
createAttributeNS(doc, namespaceURI, qualifiedName)
getElementsByTagNameNS(doc, namespaceURI, qualifiedName)
getElementById(doc, elementId)

Node:
type(Node), pointer

getNodeName(arg)
getNodeValue(arg)
setNodeValue(arg, value)
getNodeType(arg)
getParentNode(arg)
getChildNodes(arg)
getFirstChild(arg)
getLastChild(arg)
getPreviousSibling(arg)
getNextSibling(arg)
getAttributes(arg)
getOwnerDocument(arg)
insertBefore(arg, newChild, refChild)
replaceChild(arg, newChild, oldChild)
removeChild(arg, oldChild)
appendChild(arg, newChild)
hasChildNodes(arg)
cloneNode(arg, deep)
normalize(arg)
isSupported(arg, feature, version)
getNamespaceURI(arg)
getPrefix(arg)
setPrefix(arg, prefix)
getLocalName(arg)
hasAttributes(arg)

NodeList:
type(NodeList), pointer

item(arg, index)
getLength(arg)

NamedNodeMap:
type(NamedNodeMap), pointer

getNamedItem(map, name)
setNamedItem(map, arg)
removeNamedItem(map, name)
item(map, index)
getLength(map)
getNamedItemNS(map, namespaceURI, qualifiedName)
setNamedItemNS(map, arg)
removeNamedItemNS(map, namespaceURI, qualifiedName)

CharacterData:
type(Node), pointer

getData(np)
setData(np, data)
getLength(np)
substringData(np, offset, count)
appendData(np, arg)
deleteData(np, offset, count)
replaceData(np, offset, count, arg)

Attr:
type(Node), pointer

getName(np)
getSpecified(np)
getValue(np)
setValue(np, value)
getOwnerElement(np)

Element:
type(Node), pointer

getTagName(np)
getAttribute(np, name)
setAttribute(np, name, value)
removeAttribute(np, name)
getAttributeNode(np, name)
setAttributeNode(np, newAttr)
removeAttributeNode(np, oldAttr)
getElementsByTagName(np, name)
getAttributeNS(np, namespaceURI, qualifiedName)
setAttributeNS(np, namespaceURI, qualifiedName, value)
removeAttributeNS(np, namespaceURI, qualifiedName)
getAttributeNodeNS(np, namespaceURI, qualifiedName)
setAttributeNodeNS(np, newAttr)
getElementsByTagNameNS(np, namespaceURI, qualifiedName)
hasAttribute(np, name)
hasAttributeNS(np, namespaceURI, qualifiedName)

Text:
type(Node), pointer

splitText(np, offset)

DocumentType:
type(Node), pointer

getName(np)
getEntities(np)
getNotations(np)
getPublicId(np)
getSystemId(np)
getInternalSubset(np)

Notation:
type(Node), pointer

getPublicId(np)
getSystemId(np)

Entity:
type(Node), pointer

getPublicId(np)
getSystemId(np)
getNotationName(np)

ProcessingInstruction:
type(Node), pointer

getTarget(np)
getData(np)
setData(np, data)

In addition, the following DOM Core Level 3 functions are available:

Document:

getDocumentURI(np)
setDocumentURI(np, documentURI)
getDomConfig(np)
getInputEncoding(np)
getStrictErrorChecking(np)
setStrictErrorChecking(np, strictErrorChecking)
getXmlEncoding(np)
getXmlStandalone(np)
setXmlStandalone(np, xmlStandalone)
getXmlVersion(np)
setXmlVersion(np, xmlVersion)
adoptNode(np, source)
normalizeDocument(np)
renameNode(np, namespaceURI, qualifiedName)

Node:

getBaseURI(np)
getTextContent(np)
setTextContent(np, textContent)
isEqualNode(np, other)
isSameNode(np, other)
isDefaultNamespace(np, namespaceURI)
lookupPrefix(np, namespaceURI)
lookupNamespaceURI(np, prefix)

Attr:

getIsId(np)

Entity:

getInputEncoding(np)
getXmlVersion(np)
getXmlEncoding(np)

Text:

getIsElementContentWhitespace(np)

DOMConfiguration:
type(DOMConfiguration), pointer

canSetParameter(arg, name, value)
getParameter(arg, name)
getParameterNames(arg)
setParameter(arg, name, value)

NB For details on DOMConfiguration, see below

Object Model

The DOM is written in terms of an object model involving inheritance, but also permits a flattened model. FoX implements this flattened model - all objects descending from the Node are of the opaque type Node. Nodes carry their own type, and attempts to call functions defined on the wrong nodetype (for example, getting the target of a node which is not a PI) will result in a FoX_INVALID_NODE exception.

The other types available through the FoX DOM are:

DOMConfiguration
DOMException
DOMImplementation
NodeList
NamedNodeMap

FoX DOM and pointers

All DOM objects exposed to the user may only be manipulated through pointers. Attempts to access them directly will result in compile-time or run-time failures according to your environment.

This should have little effect on the structure of your programs, except that you must always remember, when calling a DOM function, to perform pointer assignment, not direct assignment, thus:
child => getFirstChild(parent)
and not
child = getFirstChild(parent)
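To illustrate pointer assignment in practice, the following is a minimal sketch that lists the names of the child nodes of the document element. It uses only functions listed in this manual; the file name "fileIn.xml" is hypothetical, and nodelist indices are taken to start at 0, as in the item(..., 0) calls used elsewhere in this manual.

```fortran
program list_children
  use FoX_dom
  implicit none
  ! Every DOM object is held through a pointer.
  type(Node), pointer :: doc, root, child
  type(NodeList), pointer :: children
  integer :: i

  ! "fileIn.xml" is a hypothetical input file.
  doc => parseFile("fileIn.xml")

  ! Each function returning a DOM object is used with pointer assignment (=>).
  root => getDocumentElement(doc)
  children => getChildNodes(root)

  ! Nodelist indices run from 0 to getLength - 1.
  do i = 0, getLength(children) - 1
    child => item(children, i)
    ! getNodeName returns an ordinary Fortran string, so no pointer
    ! assignment is involved here.
    print *, getNodeName(child)
  end do

  ! Destroying the document releases all nodes belonging to it.
  call destroy(doc)
end program list_children
```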

Memory handling

Fortran offers no garbage collection facility, so unfortunately a small degree of memory handling is necessarily exposed to the user.

However, this has been kept to a minimum. FoX keeps track of all memory allocated and used when calling DOM routines, and keeps references to all DOM objects created.

The only memory handling that the user needs to take care of is destroying any DOM Documents (whether created manually, or by the parseFile() or parseString() routines). All other nodes or node structures created will be destroyed automatically by the relevant destroy() call.

As a consequence of this, all DOM objects which are part of a given document will become inaccessible after the document object is destroyed.

Additional functions.

Several additional utility functions are provided by FoX.

Input and output of XML data

Firstly, to construct a DOM tree, from either a file or a string containing XML data.

parseFile
filename: string
(configuration): DOMConfiguration
(ex): DOMException

filename should be an XML document. It will be opened and parsed into a DOM tree. The parsing is performed by the FoX SAX parser; if the XML document is not well-formed, a PARSE_ERR exception will be raised. configuration is an optional argument - see DOMConfiguration for its meaning.

parseString
XMLstring: string
(configuration): DOMConfiguration
(ex): DOMException

XMLstring should be a string containing XML data. It will be parsed into a DOM tree. The parsing is performed by the FoX SAX parser; if the XML document is not well-formed, a PARSE_ERR exception will be raised. configuration is an optional argument - see DOMConfiguration for its meaning.

Both parseFile and parseString return a pointer to a Node object containing the Document Node.

Secondly, to output an XML document:

serialize
arg: Node, pointer fileName: string

This will open fileName and serialize the DOM tree by writing into the file. If fileName already exists, it will be overwritten. If a problem arises in serializing the document, then a fatal error will result.

(Control over serialization options is done through the configuration of the arg's ownerDocument, see below.)

Finally, to clean up all memory associated with the DOM, it is necessary to call:

destroy
np: Node, pointer

This will clear up all memory usage associated with the document (or documentType) node passed in.

Extraction of data from an XML file.

The standard DOM functions only deal with string data. When dealing with numerical (or logical) data, the following functions may be of use.

extractDataContent
extractDataAttribute
extractDataAttributeNS

These extract data from, respectively, the text content of an element, from one of its attributes, or from one of its namespaced attributes. They are used like so:

(where p is an element which has been selected by means of the other DOM functions)

call extractDataContent(p, data)

The subroutine will look at the text contents of the element, and interpret them according to the type of data. That is, if data has been declared as an integer, then the contents of p will be read as such and placed into data.

data may be a string, logical, integer, real, double precision, complex or double complex variable.

In addition, if data is supplied as a rank-1 or rank-2 variable (ie an array or a matrix) then the data will be read in assuming it to be a space- or comma-separated list of such data items.

Thus, the array of integers within the XML document:

<element> 1 2 3 4 5 6 </element>

could be extracted by the following Fortran program:

type(Node), pointer :: doc, p
integer :: i_array(6)

doc => parseFile(filename)
p => item(getElementsByTagName(doc, "element"), 0)
call extractDataContent(p, i_array)

Contents and Attributes

For extracting data from text content, the example above suffices. For data in a non-namespaced attribute (in this case, a 2x2 matrix of real numbers)

<element att="0.1, 2.3 7.56e23, 93"> Some uninteresting text </element>

then use a Fortran program like:

type(Node), pointer :: doc, p
real :: r_matrix(2,2)

doc => parseFile(filename)
p => item(getElementsByTagName(doc, "element"), 0)
call extractDataAttribute(p, "att", r_matrix)

or for extracting from a namespaced attribute (in this case, a length-2 array of complex numbers):

<myml xmlns:ns="http://www.example.org">
  <element ns:att="0.1,2.3  3.4e2,5.34"> Some uninteresting text </element>
</myml>

then use a Fortran program like:

type(Node), pointer :: doc, p
complex :: c_array(2)

doc => parseFile(filename)
p => item(getElementsByTagName(doc, "element"), 0)
call extractDataAttributeNS(p, &
     namespaceURI="http://www.example.org", localName="att", &
     data=c_array)

Error handling

The extraction may fail of course, if the data is not of the sort specified, or if there are not enough elements to fill the array or matrix. In such a case, this can be detected by the optional arguments num and iostat.

num will hold the number of items successfully read. This should be equal to the expected number of items, but it may be less if reading failed for some reason, or if there were fewer items than expected in the element.

iostat will hold an integer: 0 if the extraction went ok; -1 if too few elements were found; 1 if the read went ok but there were still some elements left over; or 2 if the extraction failed due to either a badly formatted number or the wrong data type being found.
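The checks described above might be used as in the following minimal sketch. It assumes that num and iostat can be passed by keyword under those names; the file name "fileIn.xml" is hypothetical, and the input is taken to contain an element like the integer example above.

```fortran
program extract_with_checks
  use FoX_dom
  implicit none
  type(Node), pointer :: doc, p
  integer :: i_array(6)
  integer :: n, ios

  ! "fileIn.xml" is a hypothetical input file.
  doc => parseFile("fileIn.xml")
  p => item(getElementsByTagName(doc, "element"), 0)

  ! ASSUMPTION: the optional arguments are passed by keyword as num= and iostat=.
  call extractDataContent(p, i_array, num=n, iostat=ios)

  select case (ios)
  case (0)
    print *, "read all values: ", i_array
  case (-1)
    print *, "too few values; only ", n, " were read"
  case (1)
    print *, "array filled, but some values were left over"
  case default
    print *, "badly formatted data; ", n, " values read before the failure"
  end select

  call destroy(doc)
end program extract_with_checks
```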

String arrays

For all data types apart from strings, arrays and matrices are specified by space- or comma-separated lists. For strings, some additional options are available. By default, arrays will be extracted assuming that separators are spaces (and multiple spaces are ignored). So:

<element> one two     three </element>

will result in the string array (/"one", "two", "three"/).

However, you may specify an optional argument separator, which specifies another single-character separator to use (and does not ignore multiple spaces). So:

<element>one, two, three </element>

will result in the string array (/"one", " two", " three "/). (note the leading and trailing spaces).

Finally, you can also specify an optional logical argument, csv. In this case, the separator is ignored, and the extraction proceeds assuming that the data is a list of comma-separated values. (see: CSV)
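A minimal sketch of extracting a string array with an explicit separator follows. It assumes that the optional argument can be passed by keyword as separator=; the string length of 20 is an arbitrary choice for illustration.

```fortran
program extract_strings
  use FoX_dom
  implicit none
  type(Node), pointer :: doc, p
  ! Length 20 is arbitrary; it only needs to be long enough for each item.
  character(len=20) :: words(3)
  integer :: i

  doc => parseString("<element>one, two, three </element>")
  p => getDocumentElement(doc)

  ! ASSUMPTION: the separator is passed by keyword as separator=.
  ! With an explicit separator, surrounding spaces are kept in each item.
  call extractDataContent(p, words, separator=",")

  ! Brackets make any leading spaces in the extracted items visible.
  do i = 1, size(words)
    print *, "[", trim(words(i)), "]"
  end do

  call destroy(doc)
end program extract_strings
```

If the leading and trailing spaces are unwanted, the csv argument described above may be a better fit than a bare separator.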

Other utility functions

setFoX_checks
FoX_checks: logical

This affects whether additional FoX-only checks are made (see FoX exceptions below).

getFoX_checks
arg: DOMImplementation, pointer

Retrieves the current setting of FoX_checks.

Note that FoX_checks can only be turned on and off globally, not on a per-document basis.

setLiveNodeLists
arg: Node, pointer
liveNodeLists: logical

arg must be a Document Node. Calling this function affects whether any nodelists active on the document are treated as live - ie whether updates to the document are reflected in the contents of nodelists (see Live nodelists below).

getLiveNodeLists
arg: Node, pointer

Retrieves the current setting of liveNodeLists.

Note that the live-ness of nodelists is a per-document setting.

Exception handling

Exception handling is important to the DOM. The W3C DOM standards provide not only interfaces to the DOM, but also specify the error handling that should take place when invalid calls are made.

The DOM specifies these in terms of a DOMException object, which carries a numeric code whose value reports the kind of error generated. Depending upon the features available in a particular computer language, this DOMException object should be generated and thrown, to be caught by the end-user application.

Fortran of course has no mechanism for throwing and catching exceptions. However, the behaviour of an exception can be modelled using Fortran features.

FoX defines an opaque DOMException object. Every DOM subroutine and function implemented by FoX will take an optional argument, 'ex', of type DOMException.

If the optional argument is not supplied, any errors within the DOM will cause an immediate abort, with a suitable error message. However, if the optional argument is supplied, then the error will be captured within the DOMException object, and returned to the caller for inspection. It is then up to the application to decide how to proceed.

Functions for inspecting and manipulating the DOMException object are described below:

inException:
ex: DOMException

A function returning a logical value, according to whether ex is in exception - that is, whether the last DOM function or subroutine, from which ex returned, caused an error. Note that this will not change the status of the exception.

getExceptionCode
ex: DOMException

A function returning an integer value, describing the nature of the exception reported in ex. If the integer is 0, then ex does not hold an exception. If the integer is less than 200, then the error encountered was of a type specified by the DOM standard; for a full list, see below, and for explanations, see the various DOM standards. If the integer is 200 or greater, then the code represents a FoX-specific error. See the list below.

Note that calling getExceptionCode will clean up all memory associated with the DOMException object, and reset the object such that it is no longer in exception.

Exception handling and memory usage.

Note that when an Exception is thrown, memory is allocated within the DOMException object. Calling getExceptionCode on a DOMException will clean up this memory. If you use the exception-handling interfaces of FoX, then you must check every exception, and ensure you check its code, otherwise your program will leak memory.
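The exception-handling pattern can be sketched as follows. This is a minimal example, assuming that the optional argument is passed by keyword as ex=; the file names are hypothetical.

```fortran
program dom_exception_example
  use FoX_dom
  implicit none
  type(Node), pointer :: doc
  type(DOMException) :: ex
  integer :: code

  ! Supplying ex means an error is captured rather than aborting the program.
  ! "fileIn.xml" is a hypothetical input file.
  doc => parseFile("fileIn.xml", ex=ex)

  if (inException(ex)) then
    ! Retrieving the code also frees the memory held by ex and resets it,
    ! so this should be done for every exception that is caught.
    code = getExceptionCode(ex)
    print *, "parsing failed, DOM exception code: ", code
    stop
  end if

  call serialize(doc, "fileOut.xml")
  call destroy(doc)
end program dom_exception_example
```

Because getExceptionCode resets the exception, the code is stored in a local variable before it is reported.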

FoX exceptions.

The W3C DOM interface allows the creation of unserializable XML documents in various ways. For example, it permits characters to be added to a text node which would be invalid XML. FoX performs multiple additional checks on all DOM calls to prevent the creation of unserializable trees. These are reported through the DOMException mechanisms noted above, using additional exception codes. However, if for some reason you want to create such trees, then it is possible to switch off all FoX-only checks. (DOM-mandated checks may not be disabled.) To do this, use the setFoX_checks function described under Other utility functions.

Note that FoX does not yet check for all ways that a tree may be made non-serializable.

Live nodelists

The DOM specification requires that all NodeList objects are live - that is, that any change in the document structure is immediately reflected in the contents of any nodelists.

For example, any nodelists returned by getElementsByTagName or getElementsByTagNameNS must be updated whenever nodes are added to or removed from the document; and the order of nodes in the nodelists must be changed if the document structure changes.

Though FoX does keep all nodelists live, this can impose a significant performance penalty when manipulating large documents. Therefore, FoX can be instructed to only use 'dead' nodelists - that is, nodelists which reflect a snapshot of the document structure at the point they were created. To do this, call setLiveNodeLists (see API documentation).

However, note that the nodes within the nodelist remain live - any changes made to the nodes will be reflected in accessing them through the nodelist.

Furthermore, since the nodelists are still associated with the document, they and their contents will be rendered inaccessible when the document is destroyed.
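The effect of switching off live nodelists might be observed with a sketch like the one below. It assumes that setLiveNodeLists is invoked as a subroutine taking the document and a logical; the file name and the tag name "element" are hypothetical.

```fortran
program dead_nodelists
  use FoX_dom
  implicit none
  type(Node), pointer :: doc, root, newElem, dummy
  type(NodeList), pointer :: elems

  ! "fileIn.xml" is a hypothetical input file.
  doc => parseFile("fileIn.xml")

  ! ASSUMPTION: setLiveNodeLists is called as a subroutine.
  call setLiveNodeLists(doc, .false.)

  elems => getElementsByTagName(doc, "element")
  print *, "matches before insertion: ", getLength(elems)

  ! Add one more matching element to the document.
  root => getDocumentElement(doc)
  newElem => createElement(doc, "element")
  dummy => appendChild(root, newElem)

  ! A dead nodelist is a snapshot, so this count would be expected to be
  ! unchanged; a live nodelist would be expected to include the new element.
  print *, "matches after insertion:  ", getLength(elems)

  call destroy(doc)
end program dead_nodelists
```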

DOM Configuration

Multiple valid DOM trees may be produced from a single document. When parsing input, some of these choices are made available to the user.

By default, the DOM tree presented to the user will be produced according to the following criteria:

there will be no adjacent text nodes
Cdata nodes will appear as such in the DOM tree
EntityReference nodes will appear in the DOM tree.

However, if another tree is desired, the user may change this. For example, very often you would rather be working with the fully canonicalized tree, with all cdata sections replaced by text nodes and merged, and all entity references replaced with their contents.

The mechanism for doing this is the optional configuration argument to parseFile and parseString. configuration is a DOMConfiguration object, which may be manipulated by setParameter calls.

Note that FoX's implementation of DOMConfiguration does not follow the specification precisely. One DOMConfiguration object controls all of parsing, normalization and serialization. It can be used like so:

use FoX_dom
implicit none
type(Node), pointer :: doc
! Declare a new configuration object
type(DOMConfiguration), pointer :: config
! Create the configuration object
config => newDOMConfig()
! Request full canonicalization,
! i.e. convert CDATA sections to text nodes, remove all entity references, etc.
call setParameter(config, "canonical-form", .true.)
! Turn on validation
call setParameter(config, "validate", .true.)
! parse the document
doc => parseFile("doc.xml", config)

! Do a whole lot of DOM processing ...

! change the configuration to allow cdata-sections to be preserved.
call setParameter(getDomConfig(doc), "cdata-sections", .true.)
! normalize the document again 
call normalizeDocument(doc)
! change the configuration to influence the output - make sure there is an XML declaration
call setParameter(getDomConfig(doc), "xml-declaration", .true.)
! and write the document out.
call serialize(doc)
! once everything is done, destroy the doc and config
call destroy(doc)
call destroy(config)

The available configuration options are fully explained in:

and are all implemented, with the exceptions of: error-handler, schema-location, and schema-type.
In total there are 24 implemented configuration options (schema-location and schema-type are not implemented). The options known by FoX are as follows:

canonical-form default: false, can be set to true. See note below.
cdata-sections default: true, can be changed.
check-character-normalization default: false, cannot be changed.
comments default: true, can be changed.
datatype-normalization default: false, cannot be changed.
element-content-whitespace default: true, can be changed.
entities default: true, can be changed.
error-handler default: false, cannot be changed. This is a breach of the DOM specification.
namespaces default: true, can be changed.
namespace-declarations default: true, can be changed.
normalize-characters default: false, cannot be changed.
split-cdata-sections default: true, can be changed.
validate default: false, can be changed. See note below.
validate-if-schema default: false, can be changed.
well-formed default: true, cannot be changed.
charset-overrides-xml-encoding default: false, cannot be changed.
disallow-doctype default: false, cannot be changed.
ignore-unknown-character-denormalizations default: true, cannot be changed.
resource-resolver default: false, cannot be changed.
supported-media-types-only default: false, cannot be changed.
discard-default-content default: true, can be changed.
format-pretty-print default: false, cannot be changed.
xml-declaration default: true, can be changed.
invalid-pretty-print default: false, can be changed. This is a FoX-specific extension which works like format-pretty-print but does not preserve the validity of the document.

Setting canonical-form changes the value of entities, cdata-sections, discard-default-content, invalid-pretty-print, and xml-declaration to false, and changes namespaces, namespace-declarations, and element-content-whitespace to true. Unsetting canonical-form causes these options to revert to their default settings. Changing the value of any of these options has the side effect of unsetting canonical-form (but does not cause the other options to be reset). Setting validate unsets validate-if-schema and vice versa.
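
The interactions between these options can be observed directly on a configuration object. The sketch below assumes that FoX provides a getParameter function taking the configuration and an option name and returning a logical; that function is not described in this section, so treat its use here as an assumption to be checked against the API documentation.

```fortran
program config_interactions
  use FoX_dom
  implicit none
  type(DOMConfiguration), pointer :: config

  config => newDOMConfig()

  ! Setting canonical-form should, per the text above, force entities to false.
  call setParameter(config, "canonical-form", .true.)
  print *, "entities:           ", getParameter(config, "entities")

  ! Changing one of the dependent options should unset canonical-form,
  ! without resetting the other dependent options.
  call setParameter(config, "entities", .true.)
  print *, "canonical-form:     ", getParameter(config, "canonical-form")
  print *, "cdata-sections:     ", getParameter(config, "cdata-sections")

  ! validate and validate-if-schema are described as mutually exclusive.
  call setParameter(config, "validate", .true.)
  print *, "validate-if-schema: ", getParameter(config, "validate-if-schema")

  call destroy(config)
end program config_interactions
```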

DOM Miscellanea

Other issues

As mentioned in the documentation for WXML, it is impossible within Fortran to reliably output lines longer than 1024 characters. While text nodes containing such lines may be created in the DOM, on serialization newlines will be inserted as described in the documentation for WXML.
All caveats with regard to the FoX SAX processor apply to reading documents through the DOM interface. In particular, note that documents containing characters beyond the US-ASCII set will not be readable.

It was decided to implement W3C DOM interfaces primarily because they are specified in a language-agnostic fashion, and thus made Fortran implementation possible. A number of criticisms have been levelled at the W3C DOM, but many apply only from the perspective of Java developers. However, more importantly, the W3C DOM suffers from a lack of sufficient error checking so it is very easy to create a DOM tree, or manipulate an existing DOM tree into a state, that cannot be serialized into a legal XML document.

(Although the Level 3 DOM specifications finally addressed this issue, they did so in a fashion that was neither very useful, nor easily translatable into a Fortran API.)

Therefore, FoX will by default produce errors about many attempts to manipulate the DOM in such a way as would result in invalid XML. These errors can be switched off if standards-compliant behaviour is wanted. Although extensive, these checks are not complete. In particular, the way the W3C DOM mandates namespace handling makes it trivial to produce namespace non-well-formed document trees, and very difficult for the processor to automatically detect the non-well-formedness. Thus a fully well-formed tree is only guaranteed after a suitable normalizeDocument call.

UTILS

FoX_utils is a collection of general utility functions that the rest of FoX depends on, but which may be of independent use. They are documented here.

All functions are accessible from the FoX_utils module.

NB Unlike the APIs of WXML, WCML, and SAX, the UTILS APIs may not remain constant between FoX versions. While some effort will be expended to ensure they don't change unnecessarily, no guarantees are made.

For any end-users interested in the code who are worried about interface changes, it is recommended that the relevant code (all found in the utils/ directory) be lifted directly and imported into other projects, rather than accessed through the FoX interfaces.

Two sets of utility functions are provided: one concerned with UUIDs, and one concerned with URIs.

UUID

UUIDs (see RFC 4122) are Universally Unique IDentifiers. A UUID is a 128-bit number, represented as a 36-character string. For example:

 f81d4fae-7dec-11d0-a765-00a0c91e6bf6

The intention of UUIDs is to enable distributed systems to uniquely identify information without significant central coordination. Thus, anyone can create a UUID and use it to identify something with reasonable confidence that the identifier will never be unintentionally used by anyone for anything else.

This property also makes them useful as Uniform Resource Names, to refer to a given document without requiring a position in a particular URI scheme. Thus the above UUID could be referred to as

urn:uuid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6

UUIDs are used by WCML to ensure that every document generated has a unique ID. This enables users to go back later on and have confidence that they are examining the same document, regardless of where it might have ended up in file-system hierarchies or databases.

In addition, UUIDs come in several flavours, one of which stores the time of creation to 100-nanosecond accuracy. This can later be extracted (see, for example this service) to verify creation time.

This may well be useful for other XML document types, or indeed in non-XML applications. Thus, UUIDs may be generated by the following function, with one optional argument.

generate_UUID
version: integer

This function returns a 36-character string containing the UUID.

version identifies the version of UUID to be used (see section 4.1.3 of the RFC). Only versions 0, 1, and 4 are supported. Version 0 generates a nil UUID; version 1 a time-based UUID, and version 4 a pseudo-randomly-generated UUID.

Version 1 is the default, and is recommended.

(Note: all pseudo-random-numbers are generated using the high-quality Mersenne Twister algorithm, using the Fortran implementation of Scott Robert Ladd.)
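
A minimal sketch of generating UUIDs follows. The program and variable names are illustrative; the only FoX call used is generate_UUID as described above.

```fortran
program uuid_example
  use FoX_utils
  implicit none
  character(len=36) :: id

  ! No argument: the default (version 1, time-based) UUID.
  id = generate_UUID()
  print '(a)', "time-based: "//id

  ! Version 4: pseudo-randomly generated UUID.
  id = generate_UUID(4)
  print '(a)', "random:     "//id

  ! A UUID can be written as a Uniform Resource Name.
  print '(a)', "urn:uuid:"//id
end program uuid_example
```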

URI

URIs (see RFC 2396) are Uniform Resource Identifiers. A URI is a string, containing several components, which identifies a resource. Very often, this resource is a file, and the URI represents the local or network path to this file.

For example:

http://www.uszla.me.uk/FoX/DoX/index.html

is a URI pointing to the FoX documentation.

Equally, however:

FoX/configure

is a URI reference pointing to the FoX configure script (relative to the current directory, or base URI).

A string which is a URI reference contains several components, some of which are optional.

scheme - eg, http
authority - eg, www.uszla.me.uk
path - eg, /FoX/DoX/index.html

In addition, a URI reference may contain userinfo, host, port, query, and fragment information. (see the RFC for full details.)

The FoX URI library provides the following features:

type(URI) This is an opaque Fortran type which is used to hold URI information. The functions described below use this type.
parseURI This takes one argument, a URI reference, and returns a pointer to a newly-allocated URI object.

If the string provided is not a valid URI reference, then a null pointer is returned; thus this function can be used to check whether a URI is valid.

expressURI This takes one argument, a URI object, and returns the (fully-escaped) string representing that URI.
rebaseURI This takes two arguments, both URI objects, and returns a pointer to a third URI object. It calculates the location of the second URI with reference to the first.

Thus, if the first URI were /FoX/DoX, and the second ../DoX2/index.html, then the resulting URI would be /FoX/DoX2/index.html

destroyURI This takes one argument, a pointer to a URI object, and clears up all memory associated with it.

For each component a URI might have (scheme, authority, userinfo, host, port, path, query, fragment) there are two functions for extracting the component:

hasXXX will return a logical value according to whether the component is defined (except for path, which is always defined, but may be empty).
getXXX will return a string containing the value of the component (except for port, which is returned as an integer).

Thus, listing these functions in full:

hasScheme Is there a scheme associated with the URI?
getScheme Return the value of the scheme
hasAuthority Is there an authority associated with the URI?
getAuthority Return the value of the authority
hasUserinfo Is there userinfo associated with the URI?
getUserinfo Return the value of the userinfo
hasHost Is there a host associated with the URI?
getHost Return the value of the host
hasPort Is there a port associated with the URI?
getPort Return the value of the port
getPath Return the value of the path
hasQuery Is there a query associated with the URI?
getQuery Return the value of the query
hasFragment Is there a fragment associated with the URI?
getFragment Return the value of the fragment
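
The functions above might be combined as in the following sketch, which parses two URI references, resolves one against the other, and inspects the result. The program and variable names are illustrative, and the input strings are the ones used in the rebaseURI description above; the output is printed rather than assumed.

```fortran
program uri_example
  use FoX_utils
  implicit none
  type(URI), pointer :: base, ref, resolved

  base => parseURI("/FoX/DoX")
  if (.not. associated(base)) then
    print '(a)', "base is not a valid URI reference"
    stop
  end if

  ref => parseURI("../DoX2/index.html")
  if (.not. associated(ref)) then
    print '(a)', "ref is not a valid URI reference"
    call destroyURI(base)
    stop
  end if

  ! Locate the second URI with reference to the first.
  resolved => rebaseURI(base, ref)
  print '(a)', "resolved: "//expressURI(resolved)

  ! Optional components must be tested before they are read.
  if (hasScheme(resolved)) then
    print '(a)', "scheme:   "//getScheme(resolved)
  end if
  if (hasPort(resolved)) then
    print '(a,i0)', "port:     ", getPort(resolved)
  end if
  ! The path is always defined, though it may be empty.
  print '(a)', "path:     "//getPath(resolved)

  ! Each URI object is separately allocated and must be destroyed.
  call destroyURI(resolved)
  call destroyURI(ref)
  call destroyURI(base)
end program uri_example
```

Because parseURI returns a null pointer for an invalid URI reference, the associated() checks double as validity tests.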

FoX documentation.

Introduction

Other documentation

iFaX workshops

Tutorials

API documentation

COMMON interfaces

OUTPUT interfaces

INPUT interface

Other things

FoX versioning

FoX Changes

Configuration/compilation

Configuration and compilation

Requirements for use

Configuration

Compilation

Testing

Linking to an existing program

Compiling a dummy library

Using FoX in your own project.

To incorporate into the program

To incorporate into the build process:

Configuration

Compilation of FoX

Compiling/linking your code

Cleaning up

Standards compliance

FoX_common

String handling in FoX

Scalar data

Character (default kind)

Logical (default kind)

Integer (default kind)

Real numbers (single and double precision)

Complex numbers (single and double precision)

Arrays and matrices

wxml/wcml wrappers.

String conversion

rts subroutine

num

iostat

String formatting

Numerical formatting.

Complex number formatting.

Logical variable formatting.

countrts function

WXML

Conventions and notes:

Conventions used below.

Derived type: xmlf_t

Function listing

Frequently used functions

Namespace-aware functions:

Scope of namespace functions

More rarely used functions:

Functions to query XML file objects

Exceptions

Validity constraints

WCML

Use of WCML

Dictionaries.

Identification

Quantification

General naming conventions for functions.

Conventions used below.

Units

Functions for manipulating the CML file:

Start/End sections

Adding items.

Adding geometry information

Adding eigen-information

Common arguments

WKML

Use of WKML

Conventions used below.

Functions for manipulating the KML file:

Functions for producing geometrical objects:

2D fields

Data input

`rts` subroutine

`num`

`iostat`

`countrts` function

Derived type: `xmlf_t`