The record model described in this chapter applies to the fundamental, structured XML record type DOM, introduced in Section 2.5.1, “DOM XML Record Model and Filter Module”. The DOM XML record model is experimental, and its inner workings might change in future releases of the Zebra Information Server.
The DOM XML filter uses a standard DOM XML structure as its internal data model, and can therefore parse, index, and display any XML document type. It is well suited to standardized XML-based formats such as Dublin Core, MODS, METS, MARCXML, OAI-PMH, and RSS, and performs equally well on any non-standard XML format.
A parser for binary MARC records based on the ISO 2709 library standard is provided; it transforms these records into the internal MARCXML DOM representation. Other binary document parsers are planned to follow.
The DOM filter architecture consists of four different pipelines, each being a chain of arbitrarily many successive XSLT transformations of the internal DOM XML representations of documents.
Table 7.1. DOM XML filter pipelines overview
Name | When | Description | Input | Output |
---|---|---|---|---|
input | first | input parsing and initial transformations to common XML format | Raw XML record buffers, XML streams, and binary MARC buffers | Common XML DOM |
extract | second | indexing term extraction transformations | Common XML DOM | Indexing XML DOM |
store | second | transformations before internal document storage | Common XML DOM | Storage XML DOM |
retrieve | third | document retrieval transformations from storage to output formats; multiple transformations, targeting different output formats, are possible | Storage XML DOM | Output XML syntax in requested formats |
The DOM XML filter pipelines use XSLT (and, if supported on your platform, EXSLT). This brings full XPath support to the indexing, storage, and display rules, not only for XML documents but also for binary MARC records.
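The data flow between the four pipelines in Table 7.1 can be sketched in a few lines of Python. This is a conceptual illustration only, not Zebra code and not the filter's configuration syntax: plain Python functions stand in for XSLT stylesheets, and all names (`normalize`, `to_index`, `to_storage`, `to_brief`, `run_pipeline`, `process`) as well as the element names they produce are hypothetical. It shows each pipeline as a chain of DOM-to-DOM transformations, with `extract` and `store` both starting from the common DOM produced by `input`.

```python
import xml.etree.ElementTree as ET
from functools import reduce

# Hypothetical stand-ins for XSLT stylesheets: each step maps a DOM tree to a DOM tree.

def normalize(record):
    """input step: rewrite the raw record into a common <record> form."""
    common = ET.Element("record")
    for child in record:
        field = ET.SubElement(common, "field", name=child.tag)
        field.text = (child.text or "").strip()
    return common

def to_index(common):
    """extract step: emit one lower-cased <index> element per non-empty field."""
    out = ET.Element("indexes")
    for field in common.findall("field"):
        if field.text:
            idx = ET.SubElement(out, "index", name=field.get("name"))
            idx.text = field.text.lower()
    return out

def to_storage(common):
    """store step: keep the common form unchanged for later retrieval."""
    return common

def to_brief(stored):
    """retrieve step: render a brief view that contains only the title."""
    brief = ET.Element("brief")
    title = stored.find("field[@name='title']")  # XPath-style selection
    if title is not None:
        ET.SubElement(brief, "title").text = title.text
    return brief

def run_pipeline(steps, dom):
    """Apply each transformation in order, feeding its output to the next."""
    return reduce(lambda tree, step: step(tree), steps, dom)

# Each pipeline is a list, so it may hold arbitrarily many successive steps.
PIPELINES = {
    "input": [normalize],
    "extract": [to_index],
    "store": [to_storage],
    "retrieve": [to_brief],
}

def process(raw_xml):
    """Run one record through all four pipelines in the order given in Table 7.1."""
    common = run_pipeline(PIPELINES["input"], ET.fromstring(raw_xml))
    index_dom = run_pipeline(PIPELINES["extract"], common)  # second: indexing
    stored = run_pipeline(PIPELINES["store"], common)       # second: storage
    output = run_pipeline(PIPELINES["retrieve"], stored)    # third: display
    return index_dom, stored, output
```

Calling `process("<doc><title> Zebra Guide </title></doc>")` returns three trees: the indexing DOM, the storage DOM, and the brief output record. In the real filter each step would be an XSLT transformation rather than a Python function, but the chaining and the hand-off between pipelines follow the same pattern.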