ISO/IEC 19776-1:2009 – Information technology — Computer graphics, image processing and environmental data representation — Extensible 3D (X3D) encodings — Part 1: Extensible Markup Language (XML) encoding.
4 XQuery regular expressions

4.1 Context of XQuery regular expressions

The requirements for the material discussed in this document shall be as specified in ISO/IEC 9075-2, ISO/IEC 9075-14, XML 1.0, XML 1.1, XML Schema Part 2: Datatypes, XQuery and XPath Functions and Operators 3.1, and Unicode Technical Standard #18.

4.2 Introduction to XQuery regular expressions

This document explains the manner in which XQuery regular expressions are used by database language SQL in ISO/IEC 9075-2 and in ISO/IEC 9075-14. Both ISO/IEC 9075-2 and ISO/IEC 9075-14 specify requirements for the material discussed in this document.

XQuery regular expression syntax is specified in XQuery and XPath Functions and Operators 3.1, section 5.6.1, “Regular expression syntax”. This paper references the XQuery specification, with two small modifications (required since character strings in an RDBMS are not necessarily normalized according to XML conventions). The following subsections provide an overview of this syntax. The XQuery regular expression syntax is itself a modification of another regular expression syntax found in XML Schema Part 2: Datatypes.

This section presents an overview of the capabilities of XQuery regular expression syntax. In the process, this section will illustrate some of the SQL operators. The SQL operators themselves are presented in Clause 5, “Operators using regular expressions”. The following discussion does not cover every aspect of XQuery regular expressions; for this, XQuery and XPath Functions and Operators 3.1 is the reference (though hardly a tutorial; a variety of popular works contain detailed treatments of regular expressions).
Notice that some of the matches are substrings of other matches. The rules of XQuery regular expressions are designed to ignore certain matches, so that the recognized matches are mutually disjoint. Obviously there are many ways to do this, so the rules provide priorities in determining the recognized matches. There are three priorities:

1) The top priority is to find a match as early in the string as possible. This is commonly called the leftmost rule.

2) The second priority is to find the first alternative of an alternation, if possible. There does not appear to be a common name for this rule.

3) The last priority is to find the longest possible match for greedy quantifiers, and the shortest match for reluctant quantifiers. In the case of greedy quantifiers, this is commonly called the longest rule; there does not appear to be a common name for the rule regarding reluctant quantifiers.

[Historical note: POSIX only has a leftmost longest rule. There were no reluctant quantifiers, and the priority for matching alternations was the longest match rather than the first alternative.]

These rules are illustrated by examples in Table 1, “Match priorities”.
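
The three priorities can also be observed informally outside SQL. The following is a minimal sketch that uses Python's standard re module as a stand-in. It is an assumption of this sketch, not a statement of this document, that Python's engine resolves these particular simple patterns with the same priorities as an XQuery regular expression engine; Python's re is not an XQuery implementation and differs from it in other respects. The helper name first_match and the sample strings are hypothetical and chosen only for illustration.

```python
import re


def first_match(pattern, text):
    """Return (start offset, matched text) of the first recognized match, or None."""
    m = re.search(pattern, text)
    return (m.start(), m.group()) if m else None


# Priority 1 (leftmost): "a" at offset 1 is recognized even though the
# longer candidate "bbb" starts later, at offset 2.
leftmost = first_match(r"b+|a", "xabbb")

# Priority 2 (first alternative): both "a" and "ab" start at offset 0;
# the first alternative "a" is chosen. A POSIX leftmost-longest engine
# would choose "ab" instead.
first_alternative = first_match(r"a|ab", "ab")

# Priority 3: a greedy quantifier takes the longest match, a reluctant
# quantifier takes the shortest.
greedy = first_match(r"a+", "aaa")
reluctant = first_match(r"a+?", "aaa")

# The recognized matches are mutually disjoint: scanning resumes after
# the end of each recognized match, so no substring of a match is reported.
disjoint_spans = [m.span() for m in re.finditer(r"a+", "aaxaaa")]

print(leftmost)           # (1, 'a')
print(first_alternative)  # (0, 'a')
print(greedy)             # (0, 'aaa')
print(reluctant)          # (0, 'a')
print(disjoint_spans)     # [(0, 2), (3, 6)]
```

The second case is the one that separates these rules from the POSIX behaviour described in the historical note: the order in which the alternatives are written decides the outcome, not their length.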