Tech Archaeology: Unearthing the Artifacts of a False Prediction

Greetings. This is going to be a shorter rant. New year, new me! Anyway, I was inspired to write this after I caught myself falling into a usual habit: investigating the validity of a prediction which claims that a technology (it could be anything) will take over in the future. I'll start from the beginning.

It all started when I was dutifully studying for my Databases class. While reading the textbook, Database Processing (13th edition) by Kroenke and Auer, I came across a passage summarizing the history of database processing. Given that this book first came out around 1977, it has probably witnessed very few shifts in the popularity of database technology over its existence; namely, the rise of the Relational Model and its subsequent dominance. Nevertheless, in a table that describes the emergence of database technology, there is a row for the "XML and Web Services" era (after "Open-Source DBMS" and right before the "Big Data and NoSQL" era). In the "Important Products" column it lists "XML, SOAP, WSDL, UDDI, and other standards", and in the "Remarks" column it casually states:

"XML provides tremendous benefits to web-based database applications. Very important today. May replace relational databases within your career. See chapter 12." (p 22).

How should I feel after reading this remark? Well, I feel how I always do when I see a remark about the future importance of some technology: I'm immediately suspicious of it being bullshit! Then the claim becomes an irresistible urge to dig through the lost treasures of the Internet in hopes of refuting it.

For this particular case, I can remember hearing a similar claim made by the inspirational Steve Yegge. Luckily, I found an instance of Yegge himself recently addressing the claim that XML databases will overtake relational databases in popularity, but before I go there I'd like to return to the history table in the Database Processing textbook.

The row right after the "XML and Web Services" era is the "Big Data and the NoSQL movement", which according to the text began in 2009 and continues to the present. In the remarks column, the book states:

"Web applications such as Facebook and Twitter use Big Data technologies, often using Hadoop and related products. The NoSQL movement is really a NoRelationalDB movement that replaces relational databases with non-relational data structures. See Chapter 12." (p 22).

There's no mention of the previous claim about XML. Nothing like, "Oh, but by the way, XML is going to replace Relational Databases, so don't worry about NoSQL." I wonder if they overlooked that line when they added the NoSQL row, and therefore didn't remove the claim. Or perhaps they still believe in it? Turning to Chapter 12 may be the only way to find out... (*Flips pages*).

To my surprise, Chapter 12 is entirely focused on Big Data and NoSQL. XML is only mentioned as a potential format for document-based storage. The book even says to look up dbXML, but the link it cites (www.dbxml.com) is dead. This leads me to believe that the remark above (which says both that XML Web Services will be king, and to read Chapter 12 for more info) contains some logical errors. Perhaps in an earlier edition, Chapter 12 was about XML. Or perhaps when they say dbXML, they mean Berkeley DB XML. I can't be sure. Either way, Chapter 11 is about XML, and Chapter 12 is about NoSQL. So I flip through Chapter 11 to see what the fuss over XML is about.

...And so I find a reasonable argument in the chapter summary, namely that:

"The confluence of database processing and document processing is one of the most important developments in information systems technology today. Database processing and document processing need each other. Database processing needs document processing for the representation and materialization of database views. Document processing needs database processing for the permanent storage of data" (p. 523).

This is an interesting claim, and my BS meter isn't detecting anything suspicious. By document processing, I believe the book is referring to pulling data from a relational database and generating XML documents from it with a language like PHP or JavaServer Pages (p. 523). The summary continues with:

"Although XML can be used to materialize Web pages, this one of the least important uses. Most important is its use for describing, representing, and materializing database views. XML is on the leading edge of database processing" (p. 523).

This is a little more suspect, though still quite interesting. Perhaps XML was in fact the leading edge of document-based database processing, and it may still be killer, but from what I can tell JSON has become the de facto format for the popular NoSQL databases of today.
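To make the idea of "materializing a database view" a bit more concrete, here is a minimal sketch in Python. Everything in it is my own assumption for illustration: the `customer` and `orders` tables, the `customerOrders` element name, and the helper functions are hypothetical and do not come from the textbook. It runs a join against an in-memory SQLite database and renders the same result once as XML and once as JSON.

```python
import json
import sqlite3
import xml.etree.ElementTree as ET


def build_demo_db():
    """Create a tiny in-memory database (hypothetical schema, for illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER,
                             item TEXT, qty INTEGER);
        INSERT INTO customer VALUES (1, 'Ada'), (2, 'Grace');
        INSERT INTO orders VALUES (1, 1, 'widget', 3), (2, 2, 'gadget', 1);
    """)
    return conn


def fetch_view(conn):
    """Run a join that plays the role of a database view; return rows as dicts."""
    cur = conn.execute(
        "SELECT c.name AS customer, o.item AS item, o.qty AS qty "
        "FROM customer c JOIN orders o ON o.customer_id = c.id "
        "ORDER BY c.name, o.item"
    )
    columns = [d[0] for d in cur.description]
    return [dict(zip(columns, row)) for row in cur.fetchall()]


def view_to_xml(rows):
    """Materialize the rows as an XML document: one <order> element per row."""
    root = ET.Element("customerOrders")
    for row in rows:
        order = ET.SubElement(root, "order")
        for column, value in row.items():
            ET.SubElement(order, column).text = str(value)
    return ET.tostring(root, encoding="unicode")


def view_to_json(rows):
    """Materialize the same rows as a JSON document."""
    return json.dumps({"customerOrders": rows})


if __name__ == "__main__":
    rows = fetch_view(build_demo_db())
    print(view_to_xml(rows))
    print(view_to_json(rows))
```

The relational data is identical in both cases; only the serialization step differs, which is part of why swapping XML for JSON was so easy for the newer document stores.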

So after digging in and following my suspicion, I believe I've uncovered a brief resolution that has dovetailed into a new curiosity. I'm now more curious about learning XML and its related technologies, but not because I believe it's "The Next Big Thing". The claim that XML Web Services may replace Relational Databases seems to be an outdated remnant from a previous edition of the book, since it points to the chapter on NoSQL databases. This speculation can be reinforced by looking at a similar claim made by Steve Yegge, and his recent assessment of its accuracy.

During a rather feverish digging session, I came across this thread on Hacker News that points to a revision Mr. Yegge made to an old blog post. Yegge originally stated, in 2004, that "XML databases will surpass relational databases in popularity by 2011". Looking back in 2015, he comments on his prediction. Was he correct?

"Nope. I had the right problem and right solution space, but called the wrong solution. NoSQL data stores in their various incarnations apparently have [equaled] or slightly surpassed RDBMS as of 2014-2015, according to some reports I just read."

What Yegge means by "surpassed" is open to interpretation, and yes, it gives me a tingle of BS, but following that isn't my point here. I just want to understand whether or not XML databases are currently relevant to the field of software engineering (and it seems like there are a few proprietary products that are, like Oracle's version of Berkeley DB XML, and MarkLogic). Really though, Yegge's claim doesn't give me much relief; thus the endless need for one to continually perform archaeological digs through the ruins of technological trends.

A point I'm trying to make is that an enormous effort is necessary for a student of technology to weigh the claims made by various experts in the field, and to keep up with the fashions of the time. It often seems that there are great upsurges in the popularity of a technology, regardless of its technological merit. XML is a great example, because it seems like a less perfect manifestation of what LISP S-expressions elegantly achieved in the 1960s.
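For a rough feel of the XML versus S-expression comparison, here is a small Python sketch that rewrites an XML fragment as an S-expression. The `to_sexpr` helper and the `(@name "value")` notation for attributes are conventions I made up for this example, not any standard mapping, and the sketch ignores escaping, namespaces, and text that trails a child element.

```python
import xml.etree.ElementTree as ET


def to_sexpr(element):
    """Render an XML element as an S-expression string (illustrative convention only)."""
    parts = [element.tag]
    # Attributes become (@name "value") pairs; this notation is an assumption.
    for name, value in element.attrib.items():
        parts.append(f'(@{name} "{value}")')
    # Leading text content, if any, becomes a quoted string.
    text = (element.text or "").strip()
    if text:
        parts.append(f'"{text}"')
    # Child elements are converted recursively.
    for child in element:
        parts.append(to_sexpr(child))
    return "(" + " ".join(parts) + ")"


doc = '<book id="42"><title>Database Processing</title><edition>13</edition></book>'
print(to_sexpr(ET.fromstring(doc)))
# prints: (book (@id "42") (title "Database Processing") (edition "13"))
```

The nested structure carries over one-to-one; what the S-expression drops is mostly the closing-tag ceremony.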

What is notable is the huge amount of effort poured into XML (developing schemas, XSL, XPath, etc.) in order to make it useful to the thousands of engineers working on the JVM and .NET platforms. This effort has left a body of work, a legacy, that remains in the engineering culture and continues to influence future generations. We have to cope with this technology and embrace it despite its shortcomings (and despite its strengths) if only because it was adopted by such a large body of engineers. What I'm concerned with is how likely it is that I'll have to use XML, and in which scenarios it will be worth sticking with it rather than embracing a newer (re)invention.

Either way, I can say that this archaeological dig has turned up some interesting artifacts about using XML as a materialization format for database views. Now that my interest has been piqued, I'll have to see what I can dig up about it!
