Most current implementations of XQuery are either "toy" (they break on large documents) or "fake" (they are SQL masquerading as XQuery). We shall describe some recent work on a native XML store and an interpreter for a useful subset of XQuery that scales in the way one would expect of a database query language. Preliminary results on a large-ish (80GB) data set show that the techniques yield performance comparable to that of well-tuned SQL queries running on the same data in a commercial RDBMS.
The technique combines two existing ideas. The first is to extend the very old idea of column-based storage of tabular data to the storage of XML. An XML document is separated into a "skeleton", which describes the structure of the document, and a set of "vectors", each of which is the sequence of data values appearing under all paths bearing a given sequence of tag names. The second idea is to generate a query-friendly compressed version of the skeleton.
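To make the decomposition concrete, the following is a minimal Python sketch, not the implementation described here. The names `vectorize` and `share_subtrees` are hypothetical, the representation of a skeleton node as a `(tag, has_text, children)` tuple is an assumption, and mixed content (text following a child element) is ignored. The compression step shown, sharing identical subtrees, is only one plausible way to shrink a skeleton and is assumed for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def vectorize(xml_text):
    """Split a document into (skeleton, vectors).

    vectors maps each root-to-node tag path to the list of text values
    found under that path, in document order.
    """
    vectors = defaultdict(list)

    def walk(elem, path):
        path = path + (elem.tag,)
        text = (elem.text or "").strip()
        if text:
            vectors[path].append(text)
        children = tuple(walk(child, path) for child in elem)
        # Assumed skeleton node shape: tag, "holds a value" flag, children.
        return (elem.tag, bool(text), children)

    skeleton = walk(ET.fromstring(xml_text), ())
    return skeleton, dict(vectors)


def share_subtrees(node, pool):
    """Return an equal skeleton in which identical subtrees are one object.

    pool is a dict used to intern subtrees; this is an illustrative
    compression scheme, assumed rather than taken from the text.
    """
    tag, has_text, children = node
    shared = (tag, has_text, tuple(share_subtrees(c, pool) for c in children))
    return pool.setdefault(shared, shared)


doc = """<bib>
  <book><title>A</title><price>10</price></book>
  <book><title>B</title><price>20</price></book>
</bib>"""

skeleton, vectors = vectorize(doc)
pool = {}
compressed = share_subtrees(skeleton, pool)
```

In this sketch the two `book` elements differ only in their data values, so once the values have been moved into the vectors their skeleton subtrees are identical and can be stored once. A query that touches only titles then reads the single vector for the path `bib/book/title` without scanning prices.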