Friday 5 January 2018

What is SmartDB

What is #SmartDB ?

This question came up in the Twitter discussion about #SmartDB and all the advantages it brings (link to Twitter).

Over the last year or so (and way before that, with the Helsinki-declaration in 2008), Toon Koppelaars has given us the reasons and guidelines for #SmartDB, and it boils down to “do the work in the database” (correct?)
So the shortest description, IMHO, would be:

SmartDB: 
Any IT system using a database should do as much as possible of its processing inside the Database, and as little as possible of its processing in other layers.

(Agreed?)

TL;DR ? Read no further.  ;-)


The background of this approach is that it should lead to the least complexity, the fewest components, the fewest round-trips, the least overhead, and the least complicated troubleshooting (only one component to examine and fix…).

Also, my too-short definition doesn’t (yet) include the need to apply sound database practices. Good IT systems start with good (system) design, based on requirements. That also includes things like 3NF, ACID, notably resilience, and adequate security based on minimum privileges and a minimum exposed surface. Then there are “scalability”, “upgrades” and “monitoring”, which allow the system to remain in action over longer periods of time and under various loads. Sustainability, if you like.

To me, all of the above still makes sense. And I feel comfortable designing and building an IT system given those guidelines.
Of course, some are not content with an extremely short definition, and others demand a more elaborate description or a how-to cookbook-guide. All of that, Toon and Bryn are trying to provide in various presentations, videos and white papers on the subject.

I’m going to add a suggestion.

To better define and describe SmartDB, I suggest following these steps:
- Requirements, 
- Reasoning, 
- Recommendations.
Those three steps should lead us to a better description and thus to a better way to “evangelise” the SmartDB concept. 

To elaborate on each step:

The (list of) Requirements should state why SmartDB is needed, and which problems the concept tries to solve.

By following logical Reasoning, with the knowledge and technology available, we can explain how each requirement is addressed in what is the most efficient way known to us.

To finalize, we create a list of Recommendations (if not: Directives!), on how to implement SmartDB. The recommendations should, at first, not be connected to any particular database or programming language. Those details can be filled in later. Each "vendor" should do this for his own product, and hopefully stay within the concept of SmartDB.

The result of this exercise should be (yet another) white paper, 10 pages max, and some presentation material that we could throw at Developers, Architects, Managers and even Dev-Ops and UX wizards to explain the concept and convince them.

Some history: 
A long time ago, in a cubicle-space far far away, an OFA standard was defined via a similar process, in the early 90s.
I am that old. I recall how this Oracle Flexible Architecture was created and "explained" (OFA was the mother of all Oracle Standards - link needed).
OFA was created roughly in those three steps: Requirements, Reasoning and Recommendations (Directives!). I’ve not checked the original paper for years, but this is what I recall…

This process of problem identification and reasoned solutions was clear and could be explained to most users (Sysadmins, DBAs) at the time. The standard was widely adopted and cited in the Oracle-DBA world.


So, to recapitulate, I think we should follow a similar path to define the SmartDB concept: Requirements, Reasoning, Recommendations. Those three steps should lead us to a better description and thus to a better way to explain, promote, and verify the SmartDB concept. 

(I know, I know, Bryn is going to say that all of the above is covered 11.2 years ago with EBR, and we only need 4 schemas… but still, it won’t hurt to re-assert some items…)


Additional notes:

note1: I have barely mentioned Oracle (or PostgreSQL, or MySQL, sukkel-Srvr, or even sp-hna). The SmartDB concept should not be locked into a particular product.
note2: I have not mentioned any theoretical or academic knowledge, but I would assume IT ppl to be familiar with handling requirements, IS-design, ACID, 0+12,  UX, some OOP, various methods of testing, etc… 
note3: I have not mentioned any procedural language, but SmartDB does imply the use of some stored-procedure dialect.

note4:… there is much more, but it needs to be discussed. 

Tuesday 2 January 2018

Quick notes on SmartDB (temporary post)

Since there is a discussion going on, mostly on Twitter (link) about SmartDB, I'm going to put my own notes here. That way I have a public-place to "think out loud" about the concept of SmartDB.

Smart-DB is, in my view:
A concept where an IT system is built In the DataBase.
Constructed so that most of the work (and processing, and maintenance) 
is done IN THE DATABASE.

This makes for a single point where all (99%) of the logic can be found, and where all (99%) of the code can be found and maintained.

Maximum logic/work/processing on the DB.
Minimum processing in front-end-tier (browser?).

Eliminate other layers and components.

Key point, in my view: Eliminate layers, eliminate components, and eliminate sources of problems.
Also eliminates round-trips between components, a notable source of delay.
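
To make the round-trip point concrete, here is a minimal sketch in Python with the standard-library sqlite3 module. The order_lines table and both function names are hypothetical, made up for this note. SQLite runs in-process, so there is no real network hop here; the statement count only stands in for the round-trips a remote database would cost.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_lines (order_id INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO order_lines VALUES (?, ?)",
    [(1, 10), (1, 15), (2, 7), (2, 3), (3, 20)],
)


def totals_in_client(conn):
    """Chatty style: one statement per order, summing in the application."""
    statements = 1  # the statement that lists the order ids
    order_ids = [
        row[0]
        for row in conn.execute("SELECT DISTINCT order_id FROM order_lines")
    ]
    totals = {}
    for order_id in order_ids:
        rows = conn.execute(
            "SELECT amount FROM order_lines WHERE order_id = ?", (order_id,)
        ).fetchall()
        statements += 1
        totals[order_id] = sum(amount for (amount,) in rows)
    return totals, statements


def totals_in_database(conn):
    """SmartDB style: one statement, the database does the aggregation."""
    rows = conn.execute(
        "SELECT order_id, SUM(amount) FROM order_lines GROUP BY order_id"
    ).fetchall()
    return dict(rows), 1


client_totals, client_statements = totals_in_client(conn)
db_totals, db_statements = totals_in_database(conn)
```

Both functions return the same totals, but the chatty version needs one statement per order plus one, while the in-database version needs exactly one, no matter how many orders exist.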

note: SmartDB is more than just using PL/SQL or pl/pgsql, but those "stored procedure" constructs are an essential part of the SmartDB concept. Hence the focus often shifts to these tools.

Keep in mind the overall goal: 
A Working and Sustainable IT system (to process data for some end-user-business purpose).

Data and the data model tend to outlive the tools and even the "processes" of the system.

SmartDB methodology, as I was taught years ago, would go more or less like this....

 - Ensure you have a notion of IT technology and Design. (e.g. design of process and data)
 - define the goal of your system: why will this system exist, and who are the users or “consumers”?
 - define the data model: what data needs to go in and out of the system, and what needs to be stored.
 - define processes, functionality, functions.
 - refine the data model and table design. One possible quality check is 3NF.
 - define the interaction between processes and data (CRUD matrix)
 - list/define the interfaces : which external actors bring/consume data to/from the system
 - define/determine the format of the interfaces (e.g. message-definitions)
 - program the interfaces to receive and return their data and statuses (e.g. error-messages)
 - check for completeness: determine the life-cycle of all entities/records.
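
The CRUD matrix and the completeness check from the steps above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the process names, the entity names, and the rule that every entity needs all four operations are hypothetical, and a real design may well accept entities that are never deleted.

```python
# Hypothetical CRUD matrix: process -> {entity: operations it performs}.
CRUD_MATRIX = {
    "register_customer": {"customer": "C"},
    "place_order": {"customer": "R", "order": "C"},
    "ship_order": {"order": "RU"},
    "archive_order": {"order": "RD"},
}


def lifecycle_gaps(matrix, required="CRUD"):
    """Return {entity: missing operations} for entities with an incomplete life-cycle."""
    seen = {}
    for ops_by_entity in matrix.values():
        for entity, ops in ops_by_entity.items():
            # Collect every operation that any process performs on this entity.
            seen.setdefault(entity, set()).update(ops)
    return {
        entity: "".join(op for op in required if op not in ops)
        for entity, ops in sorted(seen.items())
        if not set(required) <= ops
    }


gaps = lifecycle_gaps(CRUD_MATRIX)
```

In this made-up matrix the order entity is created, read, updated and deleted, but nothing ever updates or deletes a customer, so the check reports that gap as a question to take back to the design.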


 - logic and data in 1 place. (look no further)
 - back end can survive multiple versions of "front-end" or UX (if a front-end exists at all!)
 - troubleshooting in 1 place (look no further, no discussion, no hiding, no escape.... )
 - potential to minimize round-trips. Every "interaction" should be a message, preferably a single round trip.
 - Messages can be de-coupled (most, not all...), and queueing mechanisms can be used for resilience.
 - messages should be constructed to allow re-play (re-submit) without damage to the integrity of the system or database.
 - ACID only required in the Database. 
 - The database-replication mechanisms, and only the DB-replica!, provide DR capability.
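
One way to read the points about single-round-trip messages and re-play is sketched below, again in Python with sqlite3. It rests on assumptions of mine: the accounts and processed_messages tables and the handle_deposit function are hypothetical, and because SQLite has no stored procedures, the Python function merely stands in for what would be a stored procedure in a real SmartDB setup.

```python
import sqlite3

# Hypothetical schema: one account, plus a log of message ids already handled.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL);
    CREATE TABLE processed_messages (message_id TEXT PRIMARY KEY);
    INSERT INTO accounts VALUES (1, 100);
""")


def handle_deposit(conn, message_id, account_id, amount):
    """Apply a deposit message once; a replay of the same message_id is a no-op."""
    with conn:  # one transaction: commit on success, roll back on error
        cur = conn.execute(
            "INSERT OR IGNORE INTO processed_messages (message_id) VALUES (?)",
            (message_id,),
        )
        if cur.rowcount == 0:
            # The message id was already recorded, so change nothing.
            return "duplicate"
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?",
            (amount, account_id),
        )
        return "applied"


first = handle_deposit(conn, "msg-001", 1, 50)
replay = handle_deposit(conn, "msg-001", 1, 50)
balance = conn.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]
```

Recording the message id and changing the balance happen in the same transaction, so a re-submitted message cannot be applied twice, and only the database needs to be ACID for that to hold.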


Use/sale of Tools or Products is not a goal
Use/sale of Cloud is not a goal
Use/sale of "big data" is not a goal.
Use/sale of UX / Microservices / ClientServer / Cobol / R-sharp / Dockernetis / etc... is not a goal...



Future work:
- include links to proven methods of data-design (Heli?) . Include data-design in text.
- include links to ACID and 12+1 rules for Databases, include in text.
- include links to proven methods of Requirements and how-to Design... 
 - Do we have to start "Teaching IT from Scratch" again ??? 

So far my notes.
I also have a life, you know...