The Database Tool Trap
In my last post about NoSQL, I talked about the changing nature of NoSQL. Data management and business professionals tend to view NoSQL as an “unstructured” data source suited to batch processing.
Much of this is due to the “success” of Hadoop. It also has to do with the “failures” of Hadoop to describe the business functionality of NoSQL platforms. This time, I want to open up the concept that NoSQL platforms are not just about “unstructured” data. Rather, they are about “multi-structured” data: the changing nature of the schemas associated with the data in our data management environments.
When Henry David Thoreau ventured out to the “wilderness” to write in the 1800s, he came back with this observation:
“But lo! Men have become the tools of their tools. The man who independently plucked the fruits when he was hungry is become a farmer.”
For SQL platforms, this concept might be applied to how we as technology professionals have become “tools” of the relational data schema that was created/institutionalized almost 40 years ago. All data was pushed into this “box,” or tabular format, because that was the structure of relational data stores. The opportunities provided to us by the relational data schema superseded many of the opportunities that the data itself provided. We had become “tools” of the relational database rather than expanding into the “nature” of the data. Unfortunately, for NoSQL platforms, we might be falling into the same trap.
Are technologists and business stakeholders becoming beholden to the perceived unstructured nature of NoSQL platforms as we did to the structure of the relational database?
The architects of NoSQL platforms (such as Hadoop, Neo4j, and CouchDB) have focused a lot on the lack of structure of the data being stored within their systems. Fixating on that lack of structure is similar to using relational platforms and forcing data into the relational “box”: in both cases, we become tools of the platform.
I think the future of NoSQL platforms is going to reside in the ability of those systems to apply different operational or analytical schemas to multi-structured data sets rather than letting the data reside in a schema-free format. Merely storing multi-structured data sets will not be enough for a NoSQL platform to meet business objectives. The true business value will be in the ability to apply the structure of a particular schema for analysis or for operational workloads in real time or near real time.
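To make the idea of applying different schemas to the same multi-structured data a little more concrete, here is a minimal sketch in plain Python. It is only an illustration under my own assumptions, not how any particular NoSQL platform works: the sample records, the field names, and the functions `operational_view` and `analytical_view` are all hypothetical. The records are stored as-is with differing shapes, and each "schema" is applied only when the data is read.

```python
from collections import defaultdict

# Hypothetical multi-structured records: each one has a different shape.
RAW_EVENTS = [
    {"type": "order", "customer": "c1", "total": 40.0, "items": ["a", "b"]},
    {"type": "click", "customer": "c1", "page": "/home"},
    {"type": "order", "customer": "c2", "total": 15.5},
    {"type": "review", "customer": "c2", "stars": 4, "text": "fine"},
]


def operational_view(records, customer):
    """Operational schema: one customer's orders as
    (customer, total, item_count) rows."""
    return [
        (r["customer"], r["total"], len(r.get("items", [])))
        for r in records
        if r.get("type") == "order" and r.get("customer") == customer
    ]


def analytical_view(records):
    """Analytical schema: number of events per (customer, type) pair."""
    counts = defaultdict(int)
    for r in records:
        counts[(r.get("customer"), r.get("type"))] += 1
    return dict(counts)


if __name__ == "__main__":
    # The same stored records, read through two different schemas.
    print(operational_view(RAW_EVENTS, "c1"))
    print(analytical_view(RAW_EVENTS))
```

The point of the sketch is that nothing about the stored records changes between the two calls. The structure lives in the view that is applied at read time, so an operational workload and an analytical workload can each impose the schema they need on the same data.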
What say the readers?
Do NoSQL platform technologists see themselves falling into the trap of serving the platform more than the platform serving them? Do NoSQL technologists see themselves avoiding the scenario that I have attributed to relational database technologists? Again, are my concepts off-base?
Post your comments below or contact me directly via Twitter at @JohnLMyers44 using the hashtag #noodlingNoSQL.