The future of NSF? Part 1
This is a follow-up to a discussion on Volker's blog. Things got a bit technical (and lengthy) when I shared my thoughts on possible future directions for NSF, so I decided to bring it here. Besides, my blog's been fairly empty and it needs some content. Here's the original discussion: Volker's blog

First, my original post:

Technically speaking, there's not much more IBM needs to do to bring Domino up to the next level DB-wise. Domino 8.0 finally got rid of the last of the 16-bit memory handles. If IBM continues to push into the 64-bit realm with Domino, one would assume they'll widen the file handles as well, and therefore we should see DB sizes expand past the 64GB limit.

All of the latest compression options obviously help to reduce size. DAOS helps move attachments out of an NSF, which is a huge help in many cases. But more importantly, the abstraction introduced with DAOS lays the groundwork for other abstractions. For example, getting the $Collation item on view design docs (i.e. a view index) OUT of the database would be a huge help. It's not replicated, so why even store it there? Many complex Notes apps have over 50% of their space consumed by view indexes.

And with complex indexes comes a serious performance hit. Domino locks an entire NSF any time ANY write is performed to it. This includes updating view indexes, a doc update, etc. As a result, getting view indexes out of the NSF files themselves would shrink the file size further, likely allow for faster simultaneous index updating, and possibly allow for truly complex JOIN indexes to be stored on disk.

Throw in the capabilities NSF currently has (solid replication, readers/authors security, an ultra-flexible storage model), and it's not far from being the obvious db of choice for 90% of modern applications, provided that somebody knows the query language (@Formula). After that, flesh out the SQL querying capabilities and it's a slam-dunk. But let's hope IBM wants to do this work.
Then the question was asked: "why can't Notes just store its data in a relational db?" Here's my response:

You *can* store your data in an SQL database, but you lose the flexibility of the Notes model. Each record (document) is self-defining, and fields that don't exist on a record (document) don't consume any additional space. Each document is also completely encapsulated, so when you hit Mail -> Forward on a document it's easily wrapped up and shipped off. You can shove it into another Notes db with no prior prep work on the destination database. Take all of those capabilities and add multi-value fields, field-level security and field-level replication, and it really becomes something that is quite difficult to reproduce in a relational db.

Approaching it from the other direction, it would probably be (or have been) easier for IBM to try to squeeze more SQL-like capabilities into NSF:

1. SQL queries - This has been somewhat doable for years with NotesSQL, and now with NSFDB2. It needs to become a core part of the product.

2. JOIN capabilities - This is partially addressed with #1, and you can write backend code or use XPages in 8.5 to do the UI equivalent, but again, indexing speed is key. There is currently NO way to *store* the index that results from a JOIN -- at least not without NSFDB2, and again, it needs to become a core part of the product. The lack of JOIN capabilities is really the main architectural disadvantage of NSF (besides just getting big, as we've discussed). There has been a *tiny* bit of progress here in 8.0 with the new NotesDocumentCollection.Intersect() method and a few others.

3. Enforcing referential integrity - If a doc is "dependent" on another doc, not much is keeping you from deleting the 2nd doc, therefore causing an invalid relationship in the database. This would likely be a lot of work, and when taking replication and doc security into account I'm not even sure it's completely doable. But for many applications this is a "nice to have" that can be addressed in other ways and not an absolute requirement at the db level.

4. Transactional model - IMHO, NSF can't really go there while maintaining its current capabilities, and this would be the #1 reason to stick with a relational database. People don't enjoy replication/save conflicts in their payment systems. :-)
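
To make points 2 and 3 a bit more concrete, here is a rough Python sketch. It is not Domino code: the documents are plain dicts standing in for Notes documents, and the field names and the `join_on` and `dangling` helpers are made up for illustration. It shows self-defining documents with multi-value fields, a JOIN that has to be recomputed in application code because nothing stores the resulting index, and a "dependent" doc whose parent is missing without anything complaining.

```python
# Hypothetical, simplified stand-ins for Notes documents: each one is a
# self-defining dict, and a field that is absent is simply not stored.
orders = [
    {"unid": "A1", "Form": "Order", "CustomerId": "C1", "Items": ["bolt", "nut"]},
    {"unid": "A2", "Form": "Order", "CustomerId": "C2", "Items": ["gear"], "Rush": "yes"},
    {"unid": "A3", "Form": "Order", "CustomerId": "C9", "Items": ["cog"]},
]
customers = [
    {"unid": "B1", "Form": "Customer", "CustomerId": "C1", "Name": "Acme"},
    {"unid": "B2", "Form": "Customer", "CustomerId": "C2", "Name": "Globex"},
]


def join_on(left, right, key):
    """Hash join done in application code; the result exists only in memory."""
    index = {}
    for doc in right:
        index.setdefault(doc[key], []).append(doc)
    return [(l, r) for l in left for r in index.get(l.get(key), [])]


def dangling(children, parents, key):
    """Return child docs whose key value matches no parent doc."""
    known = {p[key] for p in parents}
    return [c for c in children if c.get(key) not in known]


joined = join_on(orders, customers, "CustomerId")
orphans = dangling(orders, customers, "CustomerId")

print([(o["unid"], c["Name"]) for o, c in joined])  # [('A1', 'Acme'), ('A2', 'Globex')]
print([o["unid"] for o in orphans])                 # ['A3']
```

Order A3 points at a customer that does not exist, and the only way to notice is to go looking for it. A relational db would refuse that row (or the delete that orphaned it) via a foreign key constraint.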
Comments (1)
Well said. To see what a "relational" Notes storage model would look like, go here: http://www.wissel.net/blog/d6plinks/SHWL-78N5LS and
http://www.wissel.net/blog/d6plinks/SHWL-7KFP9S. And I agree with
you that this doesn't look like an awfully good idea.
However, *all* of the big RDBMSs offer XML capabilities. So a
DominoDocumentRecord could have some internal fields (all the
Document Properties) as columns and then one column with the DXL
and one column each for the combined Reader/Author information (so
security becomes easier).
You then can use XPath to design any queries you fancy and surface
them as views (including JOIN).
The difference to the current NSFDB2?
a) It would be a mechanism built on the standard DB/2
capabilities
b) XPath is more capable of asking all the "questions" you want to ask
and is a viable alternative to @Formula for querying
c) Since it is a standard mechanism you then could add whatever you
fancy in the DB/2 backend. Stuff like referential integrity,
triggers etc.
d) Even without defining an access view, data can be written into
documents from any app that can issue DB/2-style SQL (containing
XPath statements). The Domino document wouldn't be such a black box
anymore.
:-) stw
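
For readers who want to picture the layout described in this comment, here is a minimal Python sketch. Everything in it is an assumption made for illustration: SQLite stands in for DB/2, Python's ElementTree (which supports only a small XPath subset) stands in for the database's own XML engine, the `docs` table and the `item_values` and `query` helpers are invented names, and the XML is a heavily simplified shape rather than real DXL. The idea it shows is the one from the comment: a few document properties as ordinary columns, one column for reader information, and one column holding the document as XML that is queried with XPath.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
# Hypothetical table: document properties as columns, readers as a column,
# and the whole document as XML in one text column.
conn.execute(
    "CREATE TABLE docs (unid TEXT PRIMARY KEY, form TEXT, readers TEXT, dxl TEXT)"
)

rows = [
    ("A1", "Order", "[Sales]",
     "<document><item name='CustomerId'><text>C1</text></item>"
     "<item name='Items'><text>bolt</text><text>nut</text></item></document>"),
    ("A2", "Order", "[Admin]",
     "<document><item name='CustomerId'><text>C2</text></item>"
     "<item name='Items'><text>gear</text></item></document>"),
]
conn.executemany("INSERT INTO docs VALUES (?, ?, ?, ?)", rows)


def item_values(dxl, item_name):
    """Return all values of one (possibly multi-value) item via XPath."""
    root = ET.fromstring(dxl)
    return [t.text for t in root.findall(f"./item[@name='{item_name}']/text")]


def query(conn, role, item_name, wanted):
    """SQL filters on the readers column; XPath filters on the XML column."""
    hits = []
    sql = "SELECT unid, dxl FROM docs WHERE readers = ? ORDER BY unid"
    for unid, dxl in conn.execute(sql, (role,)):
        if wanted in item_values(dxl, item_name):
            hits.append(unid)
    return hits


print(query(conn, "[Sales]", "Items", "nut"))  # ['A1']
print(query(conn, "[Admin]", "Items", "nut"))  # []
```

In a database with native XML support, the XPath test would run inside the SQL statement itself instead of in application code, which is what would let such queries be surfaced as views.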