One of the continuing quandaries I’ve had with XML is the management of multiple XML documents. If I have one big XML document, then it’s easy to work with: to parse with an API, to transform with XSLT, to query with XPath.
But what if I have many documents? For instance, what if I have all my blog entries (400+ at last count) as individual XML documents in a directory somewhere and I want to find all entries containing the word “cuisinart”? What do you do then? Iterate through all the documents firing off XPath queries, somehow persist the ones that match, then go back and get them when the loop is done? This seems ugly, but the alternative, having everything in one monolithic XML document, seems worse.
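For what it's worth, the brute-force loop is short to write, even if it doesn't scale. Here is a rough sketch in Python, assuming the entries are `*.xml` files sitting in a single directory. The function name and the directory layout are made up for illustration. It also checks the text of each document directly rather than firing a real XPath `contains()` query, because the standard-library ElementTree module only supports a subset of XPath.

```python
from pathlib import Path
import xml.etree.ElementTree as ET


def find_matching_entries(directory, word):
    """Return the paths of XML files in `directory` whose text contains `word`.

    Hypothetical helper: assumes one entry per *.xml file, and matches
    case-insensitively against element text only (not attribute values).
    """
    needle = word.lower()
    matches = []
    for path in sorted(Path(directory).glob("*.xml")):
        try:
            root = ET.parse(path).getroot()
        except ET.ParseError:
            # Skip files that are not well-formed XML rather than abort the scan.
            continue
        # itertext() yields every text node under the root, in document order.
        text = " ".join(root.itertext()).lower()
        if needle in text:
            matches.append(path)
    return matches
```

Calling something like `find_matching_entries("entries", "cuisinart")` would hand back the list of matching files, which covers the "persist the matches and go back for them" step. Every document still gets parsed on every search, though, which is exactly the ugliness in question.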
I’ve heard that Oracle 8 will let you do an XPath query on an individual field in the WHERE clause. I’m trying to figure out if SQL Server 2000 will let you do the same thing. MySQL would be even better, but perhaps that’s hoping for too much.
There are some XML databases out there (Xindice, for instance; more here), but they’re very new and I don’t know of any that have Windows binaries or that will work without me getting all geeked out.
Is the relational model of data storage the best, most efficient way to store data? I'm talking about the traditional database model of tables, fields, rows, foreign keys, etc. What are the other ways? There's object oriented, where you have a table of classes and attributes, object instances and…
Thoughts on Content Management: This guy and I think alike. In the beginning of the article he touches on the same things I talked about when I compared open and closed content management systems. Then, he runs into the same problem: there are too many types of content, each storing…
Check out Virtuoso. http://www.openlinksw.com/virtuoso/
It's not free (well, it's got a 30-day extendable evaluation license), but it's powerful...
It's a SQL-92 database (both virtual and relational DBMS) blended with an XML database, with built-in XPath and XQuery support, as well as XSLT, and bunches more, including Blogger, MetaWeblog, and Movable Type API support (from both client and server perspectives) in the latest update (v3.2)!
I can't do it justice in a comment -- and it'll probably seem like marketing speak, since I work at OpenLink Software, which publishes Virtuoso... I don't benefit directly from downloads or sales...
Check it out. Let me know what you think.
Our open source XML extensions for PostgreSQL do exactly what you want: provide XPath support within your SQL statements.
SELECT title FROM documents WHERE xpath_string(xml,'/document/title') = 'Konrad Lorenz';
or
SELECT xpath_string(xml,'/document/title') AS title FROM documents WHERE id = 22;
The extensions are based on the very fast and lightweight libxml2 library. See:
http://www.throwingbeans.org/tech/postgresqlandxml.html