Max Schireson, president of 10gen (the makers of MongoDB), has made the case for document-based data systems – such as MongoDB and CouchDB – by arguing against the heavily-normalized relational model.
Max offers up his entry as a challenge to the “relational-is-always-best set”, asking them to prove that the complexity of storing data in a relational form is worth the trouble, at least for the scenario he describes.
Given that I’ve been anointed as an anti-NoSQL crusader on a number of occasions, I feel obligated to argue on behalf of the relational model, which I will do in a later entry.
That is despite my being a big fan of MongoDB. As I have done many times in the past, I encourage everyone to download and play around with the excellent MongoDB product. Do yourself a favour by running through the tutorial.
All things have a place.
I once sat in a meeting where a peer described the purportedly intractable complexity of a task they were failing at. They did this by drawing the various actors on the whiteboard and then detailing their many complex relationships.
Imagine the best path-finding algorithm. Now imagine the opposite: the least efficient, most unnecessarily sloppy routing imaginable.
That was how complexity was deceptively exaggerated, with absurdly circuitous relationship lines weaving to and fro. It was comical.
That memory, along with the realization that the deception goes both ways, came to mind while reading Max’s entry, and again when reading the linked entry by MongoDBer Kyle Banker.
When comparing the document model with the relational model, many if not all examples seem to contrast a complex relational model – one that encapsulates an end-to-end platform for a whole domain – against a trivial island of a tiny subset of data in a document structure. The former is usually built to support entire operations and systems, while the latter tends to be crafted for one single purpose (like “allow customer services to look at an order”, as was used in Max’s scenario).
Max highlights relational complexity by pointing to an Oracle end-to-end order reference platform containing “126 tables”. Kyle does the same thing when comparing a simple could-be-one-single-row document against a complex catalogue schema. Humorously, that document includes four relationships, and resolving them would require four expensive round-trips to the MongoDB server given the platform’s bizarre lack of server joins. Both explain their arguably deceptive comparisons with statements like “Of course, this is not a complete representation of a product”…
I would argue that in such a case the comparison shouldn’t be made at all. Why contrast an incomplete example of a document-based implementation – simplistic in its useless innocence – against a fully scoped relational platform?
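To make the round-trip point about Kyle’s example concrete, here is a rough sketch of what resolving four references on the client side looks like. Everything in it is an assumption for illustration: `FakeStore` is a hypothetical in-memory stand-in rather than the real MongoDB driver, and the collection names and data are invented. It only counts how many separate fetches the “join” costs.

```python
class FakeStore:
    """Hypothetical in-memory stand-in for a document store (not a real driver API)."""

    def __init__(self, collections):
        self.collections = collections
        self.round_trips = 0  # each find_one call stands in for one network round-trip

    def find_one(self, collection, doc_id):
        self.round_trips += 1
        return self.collections[collection][doc_id]


# Invented data: one product document holding four references to other collections.
store = FakeStore({
    "products": {1: {"name": "Widget", "refs": {
        "brands": 10, "categories": 20, "suppliers": 30, "warehouses": 40}}},
    "brands": {10: {"name": "Acme"}},
    "categories": {20: {"name": "Tools"}},
    "suppliers": {30: {"name": "Globex"}},
    "warehouses": {40: {"name": "East"}},
})


def resolve_references(store, product):
    """Client-side 'join': fetch each referenced document with its own query."""
    return {coll: store.find_one(coll, ref_id)
            for coll, ref_id in product["refs"].items()}


product = store.find_one("products", 1)      # initial fetch of the document itself
before = store.round_trips
resolved = resolve_references(store, product)
trips_for_join = store.round_trips - before  # fetches spent purely on resolving references
print(trips_for_join)
```

In a relational database the same lookup would typically be a single query with four joins, which is the contrast the “simple” document example glosses over.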
It is the “MySpace angle” used to hide the ugly reality of technology. If you have a MongoDB equivalent of the compared product, have at it, but simply hiding the ugly details and zooming in on a non-functional, cherry-picked subset just misleads potential suitors.
Realtors use this trick when taking photos of homes, showing just enough of the grass while avoiding nearby structures. Your mind naturally extrapolates, imagining expanses of lush green fields, when in reality there’s usually another house imposing itself four feet over.
I have a full workload right now, but in the near future, during a mental lull, I will respond to Max. There is a very compelling counterargument to be made.