on using XQuery for the first time

[Sat, 12 Sep 2009 16:26:13 +0000]
Obviously, I've been playing around with XSLT lately. So naturally, the next logical step was to delve into XQuery, the XML query language de jure. Eventually I want to run queries on MusicXML documents, but I need to start small.

While the W3Schools tutorial on XQuery is a great introduction, there's one little problem. It doesn't really tell you how to implement XQuery: i.e. how to actually run a query and retrieve results. So after some random perusing and downloading, I - like the fool I am - was made aware by Dr. Michael Kay's "Learn XQuery in 10 Minutes: An XQuery Tutorial" that the Saxon XSLT processor I was already using for XSLT transformations had an XQuery engine built in. That's to say that the .NET version of Saxon has 2 command line executables:

1. Transform.exe, which I'd already used for XSLT transformations
2. Query.exe, which allows one to run XQuery queries

So much for paying attention to what I download ...

From there, it was a simple matter to use XQuery for the first time. Here are the steps:

1. I downloaded the books.xml file provided by W3Schools and placed it into the "bin" directory of Saxon on my drive. This is the same directory where the 2 aforementioned executables reside.

2. Using the kick-tail text editor jEdit, I copy/pasted/saved this query example from W3Schools as "test.xquery" (also saved in the "bin" directory):

    <ul>
    {
    for $x in doc("books.xml")/bookstore/book/title
    order by $x
    return <li>{data($x)}</li>
    }
    </ul>

This query simply lists all the titles from "books.xml" in alphabetical order.

3. Then using jEdit's command line plug-in called "Console", I set Console to the Saxon "bin" directory where "query.exe", "books.xml", and "test.xquery" reside.
The easiest way to set the directory in Console is to type:

    cd "C:\Documents and Settings\nitin\Desktop\saxon\bin"

Of course, you might extract Saxon elsewhere, but the important thing is to type cd + opening quotation mark + full path to Saxon's "bin" folder + ending quotation mark.

4. Now I was in the correct folder and could run the XQuery with the following command line syntax:

    query test.xquery

And my results looked like this:

[IMAGE]

I know what you're thinking: no line breaks! Sure, the computer doesn't care, but this is really hard for humans to read! Yes, that's true. But I went ahead and pasted the following:

    <?xml version="1.0" encoding="UTF-8"?><ul><li>Everyday Italian</li><li>Harry Potter</li><li>Learning XML</li><li>XQuery Kick Start</li></ul>

into a new document in jEdit anyway. We're gonna take care of those line breaks now ...

5. One of the many great things about jEdit is the ability to run Beanshell commands, which despite my attempts to sound authoritative, I only learned about roughly 30 minutes ago. This means that a search and replace can be done in jEdit using simple Java syntax to fix that line break issue.

The first step is identifying where to insert the line break. I needed it in between > and <. Specifically, I needed a line break at every spot where a > is immediately followed by a <:

    <?xml version="1.0" encoding="UTF-8"?><ul><li>Everyday Italian</li><li>Harry Potter</li><li>Learning XML</li><li>XQuery Kick Start</li></ul>

So I just invoked the jEdit search/replace box and did the following:

[xquery_jedit_replace.jpg]

This simply says: find all instances of >< and replace each one with ">\n<" - i.e. the text between the quotation marks. The \n is, by the way, the line break syntax. When I hit "Replace All", this was the result:

    <?xml version="1.0" encoding="UTF-8"?>
    <ul>
    <li>Everyday Italian</li>
    <li>Harry Potter</li>
    <li>Learning XML</li>
    <li>XQuery Kick Start</li>
    </ul>

Problem solved.

6. Now I simply saved this document as "test.html" and opened it in a browser.

Anyway, that's my very simple start to XQuery, but I'm feeling pretty good about it nonetheless.
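
For anyone who wants to see the two ideas from the steps above (the query itself and the line break fix) without Saxon or jEdit at hand, here is a rough sketch in Python. It is not what I actually ran. The inline BOOKS_XML is a hypothetical, cut-down stand-in for the W3Schools books.xml that keeps only the four titles from the output above, and the function names are made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical, abbreviated stand-in for W3Schools' books.xml: only the
# four titles seen in the query output are included, in unsorted order.
BOOKS_XML = """<bookstore>
  <book><title>Everyday Italian</title></book>
  <book><title>Harry Potter</title></book>
  <book><title>XQuery Kick Start</title></book>
  <book><title>Learning XML</title></book>
</bookstore>"""


def titles_as_list(xml_text: str) -> str:
    """Mimic the query: take every bookstore/book/title, sort, wrap in <li>."""
    root = ET.fromstring(xml_text)  # root element is <bookstore>
    titles = sorted(t.text for t in root.findall("book/title"))
    items = "".join(f"<li>{escape(title)}</li>" for title in titles)
    return f"<ul>{items}</ul>"


def break_between_tags(markup: str) -> str:
    """Same idea as the search/replace step: '><' becomes '>' + newline + '<'."""
    return markup.replace("><", ">\n<")


one_line = titles_as_list(BOOKS_XML)   # everything on a single line
pretty = break_between_tags(one_line)  # one tag per line
print(pretty)
```

The plain string replace is only safe here because the output is tiny and contains no >< inside text or attributes; for anything bigger, letting the processor indent its own output is the better bet.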


  1. nitin [2009-09-13 21:15:00]

    The indent option works! The output is well-nested XML now, too. Thanks for the great tip.

    I haven't formulated my thoughts on MusicXML/XQuery too much, but I'd really like to see digital libraries start OCRing their scanned sheet music and then work toward search capabilities for those collections. Furthermore, I'd like to see a non-textual search interface for querying pitches, patterns, etc., i.e. typing in search queries on a staff line. Maybe advanced searches would require text-based input, but it would be good from the "cool" factor POV to have staff entry for basic queries. So if universities and even open-web projects got underway, we could start playing catch-up with the article repositories. I haven't had much luck running Audiveris, but we're gonna need a free, open-source music OCR program if this kind of thing is ever going to take root on the web.

    BTW: thanks for being my first commenter! :)

    Update 091509: jEdit has an XML plug-in with an "Indent XML" command. But it's still likely best to use the indent option while running the query, as suggested by the great comment above.

  2. lasconic [2009-09-13 08:58:10]

    Nice! For line breaks, the option name is "indent". You can get them right from query.exe, I think. Try: query test.xquery !indent=yes

    Out of curiosity, what's your plan with MusicXML and XQuery?