Sunday, November 22, 2020

Low-Code works best with a No-Code Model

One of the most common arguments against No-Code / Low-Code development is that software is just too complex to fit into a "database."  This perspective makes sense: there are definitely elements of code that do not fit comfortably into a database.  The scaffolding and basic architecture of most systems, however, certainly do.  Let me try to persuade you that writing code without a No-Code model is like walking a tightrope without a safety net!

The Argument 

  1. Hand Code (aka “source code”) isn’t going anywhere.

  2. Applications At Rest (i.e., in the DB) are usually great Single Sources of Truth.

  3. An SSoT is essentially a Cross-Platform Interface

  4. Defining Right and Wrong

  5. A quick example shows WHAT might go in a Specification Database.

  6. Production DB (at rest) vs. Specification Database

  7. Traditional Vs. Low Code User (developer) Experience

  8. Low Code Development Flow

  9. In Conclusion


1. Hand Code (aka “source code”) isn’t going anywhere

In the low-code model that I'm proposing, there is still very much "source code" - though I think it's more accurate to call it Hand Code.  I'm a developer - I want to write code.  I just don't want to write even a single line of code that a tool could write for me instead.  In most cases, I still want to write the actual function body.

Still, the context in which I'm writing the code - the names of things, the structure of parameters and return types, the purpose (at least for top-level behaviors) - should be predetermined by the spec, and I simply have to provide the implementation.  And when I'm writing that implementation code, I should always have code completion for the vast majority of the structure of the system I'm operating within.  On any project, in any language, today, tomorrow, and in 5 years.  By the time I'm writing code by hand, I want my IDE to already have most or all of the scaffolding in place, so that I always have code completion.

All that said, if we're writing a function to convert text ToUpperCase, at some point some code has to loop over the string and create a new version of the string with uppercase letters.  I would never, not in a hundred years, attempt to put that actual logic, or any such function body, into a database.  That is clearly something that belongs in source code.

When first considering low-code technology, it is quite common for developers to quickly find several specific examples of "source code" that do not easily fit into a database - and to incorrectly conclude that this justifies dismissing the entire approach as infeasible.


2. Applications At Rest (i.e., in the DB) are usually great Single Sources of Truth.

Yet consider the fact that we can stop most applications.  They can somehow always come to rest, and then, when they restart, they can simply reload their state - usually from a database of some kind - regardless of how sophisticated or complex the application or underlying ideas are.  This fact seems to imply that virtually any concept, independent of its complexity, can be unambiguously represented in a well-defined, well-structured database.

In other words, when the system is turned off/at rest, many aspects of its behavior can be saved to a database.  As a result, the specific size/shape of that database almost always conveys a massive amount of context about what the system is, what it does, and what it is and is not capable of doing.


Even simple questions, like the relationship between contacts and phone numbers, may or may not be quickly answered by looking at the Python or Java code for the system.  But if you look at the database and see something called PhoneNumbers, with a single link/relationship to a specific Contact, then you know each contact can have N phone numbers.

If, by contrast, the PhoneNumbers entity has no direct link to any contact, then the relationships might be more complicated.  Suppose instead you find that there is a ContactPhoneNumber entity along with a CompanyPhoneNumber entity.  In that case, we can probably infer that each Contact can have many phone numbers, and that multiple contacts and companies can potentially even share the same phone numbers.
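To make the schema-reading exercise concrete, here is a minimal sketch in Python using SQLite.  The table and column names are illustrative (they are not from any real system), but the point stands: the shape of the tables alone answers the cardinality question, without reading a line of application code.

```python
# Hypothetical schema for the second scenario discussed above: link tables
# mean a phone number can be shared by many contacts and many companies.
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE Contact     (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Company     (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE PhoneNumber (id INTEGER PRIMARY KEY, number TEXT);

    -- Link tables: each Contact (or Company) can have N phone numbers,
    -- and a single PhoneNumber can be shared across contacts and companies.
    CREATE TABLE ContactPhoneNumber (
        contact_id INTEGER REFERENCES Contact(id),
        phone_id   INTEGER REFERENCES PhoneNumber(id));
    CREATE TABLE CompanyPhoneNumber (
        company_id INTEGER REFERENCES Company(id),
        phone_id   INTEGER REFERENCES PhoneNumber(id));
""")

# Reading the schema back is enough to infer the relationship rules.
tables = [row[0] for row in db.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
```

Anyone who queries this schema, in any language, can see immediately that phone numbers are shared entities rather than attributes of a single contact.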


All of this logic - the entities involved, their structure, their names, attributes, and relationships - usually provides a detailed context for understanding the limits of what the system can and cannot do.  Any system which allows you to export all of your data (to, for example, a JSON file) will inevitably leak a massive amount of detail about its internal design/model.  Not because they accidentally made you a contributor to their repository, but because they simply exported your data.


3. An SSoT is essentially a Cross-Platform Interface

What fits comfortably in a database is not the implementation - but rather the architecture.  What tends to fit comfortably are the names, attributes, and relationships between things the code is responsible for handling.  It's a little bit like an interface that crosses all of the languages in a given project.  It is a single source of truth - and is an essential ingredient needed to fill in an entire layer of code that is entirely absent from most “traditional” project repositories at this point. 


It's a question of where we choose to save all of the many decisions that we make when designing and implementing software.  Many aspects of what the software does just fit much more comfortably in a database than in "source code."


So on one end of the spectrum, we do (and always will) have code that is, and deserves to be, written by hand.  That's Hand Code - and it is one component of the "source code" - so in that way, we entirely agree.  Even the functions that we should be writing by hand, though, almost always belong to and exist as part of a larger system.  We really should define at least the actual size/shape of as much of that system as possible outside of code.


We honestly can pull the description of WHAT the system needs to do apart from the details of HOW to do it in a specific context.  As it turns out, the WHAT parts tend to be those that fit comfortably into a database, and the remaining "source code" then fits comfortably in code.

In "traditional development," the square "WHAT" and the round "HOW" get squeezed into one triangular hole called "SOURCE CODE" - and that is just a costly way to approach the problem.

4. Defining Right and Wrong
A Single Source of Truth, by contrast - one which authoritatively defines the system's behavior, and which is entirely absent in most "traditional" models - allows for many other benefits.  By far the biggest is that it enables us to look at any part of the system and apply the label "RIGHT" or "WRONG" in a way that is almost always utterly unrealistic in most "traditional" environments.

Specifically, we can label a part "RIGHT" if it matches the single source of truth, and "WRONG" if it does not.  This process is simply not possible in a traditional model, at least in my experience, because there is often considerable variation between the Specification and what gets implemented in the end.  Additionally, there is often variation between the Back-end, the API, the UI, the Documentation, and the Sales Materials within the final implementation.

Without rigorous process management and oversight, the "source code" is usually a complete hodgepodge of inconsistency between all these different layers.  So in a traditional model, there simply is no "Right" and "Wrong."  There is no right or wrong because the spec usually says one thing, but that gets changed during implementation, and changed again when presented in the UI.  And we wrote the docs against a beta version, so they don't match how the final version works.  I.e., there's no "right" or "wrong" - there's just the way it works in each layer of the stack.



5. A Quick Example shows WHAT might go in a SpecDB:

Picture a thought experiment where we are building a simple String Library.  We can decide that it will include a ToUpperCase and a ToLowerCase function - and describe what those functions will do in different situations, and what their parameters and return types will be - before we pick which language or languages to implement the library in.

We can define many of these parameters before we write a single line of code.  In fact, once we pick a language, a simple tool can provide stub functions that just throw "not implemented" exceptions until we write the implementation code by hand.  Another conversion tool can write simple unit tests that call each of those functions.
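A tiny sketch of what that could look like in practice - the spec format and the generator tool here are both hypothetical, not any particular product.  The function descriptions live as plain data, and a trivial generator derives Python stubs that throw until a human fills them in.

```python
# Hypothetical specification data for the string library: names, parameter
# lists, return types, and English descriptions - all defined before (and
# independently of) any implementation language.
SPEC = {
    "library": "StringLib",
    "functions": [
        {"name": "ToUpperCase", "params": ["text"], "returns": "string",
         "doc": "Return a copy of the input with every letter upper-cased."},
        {"name": "ToLowerCase", "params": ["text"], "returns": "string",
         "doc": "Return a copy of the input with every letter lower-cased."},
    ],
}

def generate_stubs(spec):
    """Derive a Python module: one NotImplementedError stub per spec'd function."""
    lines = [f"class {spec['library']}:"]
    for fn in spec["functions"]:
        args = ", ".join(fn["params"])
        lines += [
            "    @staticmethod",
            f"    def {fn['name']}({args}):",
            f'        """{fn["doc"]}"""',
            "        raise NotImplementedError",
        ]
    return "\n".join(lines)

# "Compile" the derived module and expose the generated class.
namespace = {}
exec(generate_stubs(SPEC), namespace)
StringLib = namespace["StringLib"]
```

The same SPEC data could just as easily feed a C# or Java stub generator; only the last function, `generate_stubs`, is language-specific.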


It's important to note, though, that if we choose to call the function ToUpperCase, we are going to want that function, with that name, to convert the input string to upper case in Python, C#, Java, and JavaScript - today, tomorrow, in 5 years, in 100 years.  And if there's an entirely new language created in 5 years, and we port our library over to the brand-new language, we're still going to want the function in that language to be called ToUpperCase, to take a string, and to return a string.


Maybe that language will instead call a String a TokenArray - but the function will take an input TokenArray, and will likely also return a TokenArray.  The general notion of what it should do will not change, though.  And we should define this description in a format that we can easily share between any language, any operating system, any technical environment/context.

PHP is just not such an animal.  A JSON file, an XML file, a CSV file, a DBMS, a NoSQL Data Store, or virtually any other easily queryable datastore is a perfect place to put the structure and shared static contents of the system.  Not the implementation - the architecture.  The moving parts.  The actors.  The User Stories.  Shared Static Data.


6. Production DB (at rest) vs. Specification Database

We should model the production system off of a non-production, single, shared Specification Database - and any change made in "the system" should always start in the specification database.  If there's a new thing, we probably need an additional table to represent that thing.  If one or more entities have additional or different attributes, we probably need to add or change columns, or relationships to other things.  Suppose there are new instances of things (states, categories, roles, activities, conditions, limits, configuration values).  In that case, we probably need to add or modify rows in one of the specification database tables.  And even if the rules change about who can access what and when, it should be possible to encode those details into the project specification DB.
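As a sketch of that table/column/row progression - with illustrative, hypothetical names throughout - each kind of change maps onto a different, purely declarative edit to the specification database:

```python
# Sketch: new "things" become tables, new attributes become columns, and new
# instances (roles, limits, categories) are just rows in the spec database.
import sqlite3

spec = sqlite3.connect(":memory:")
spec.executescript("""
    CREATE TABLE Role        (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE ConfigValue (id INTEGER PRIMARY KEY, key TEXT, value TEXT);
""")

# A new role and a new limit require no code changes - only new rows.
spec.execute("INSERT INTO Role (name) VALUES ('Auditor')")
spec.execute("INSERT INTO ConfigValue (key, value) VALUES ('max_logins', '5')")

# A brand-new kind of thing gets a new table instead.
spec.execute("CREATE TABLE Invoice (id INTEGER PRIMARY KEY, due_date TEXT)")

role_count = spec.execute("SELECT COUNT(*) FROM Role").fetchone()[0]
```

Every one of these edits is queryable, diffable, and language-neutral - which is exactly what makes the spec database usable as a Single Source of Truth.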


One way or the other, though, any idea, however complex - whatever its implementation languages, and whatever final operating environments it will run in (iOS, Android, Web, Windows, Mac, Embedded System, SaaS tool - whatever) - is representable at rest.  In other words, the vast majority of the complexity of such systems can be represented, at rest, unambiguously, in a well-defined database.  And this can be designed and thought out before we start to write any "code."  A Single Source of Truth.


We get unmatched cross-platform symmetry by sharing one machine-readable, easily queryable specification database across the N technologies in the project.  By doing this, the implementations across even expansive projects will all tend to match each other in precisely the same way that the N technologies in most "traditional" environments tend not to match each other - at least until they are each vigorously tested and validated.  Trying to achieve that symmetry "by hand" is just an extraordinarily costly way to do it - and it gets even more expensive as the project grows.  With every additional "player," whether it's a new language, a new system, or a new API, the complexity increases.


7. Traditional Vs. Low Code User (developer) Experience

Specifically, the parts of the system that will tend to fit comfortably in a database are those that remain true completely independently of which language or operating environment they are running in.  For example, the ToUpperCase function is part of a library that also includes a ToLowerCase method.  We can put the existence of those two functions - along with a precise (English) description of what they will do - into a database.  And we can add additional metadata about our string library in other rows.  In this way, before we've even decided whether this will be a JavaScript, C#, or Python library - or maybe all three - we can enumerate the specific details of what the library will include.  Perhaps we can split the list of 50 possible string functions into 3 phases/versions: the essential functions in version 1, and then the less critical parts in versions 1.1 and 1.2.


A simple report written against this metadata lists the three planned versions - and describes which specific functions to include in each version of the software.  
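That version report really can be a one-liner over the metadata.  A minimal sketch (the spec rows and their format are hypothetical): grouping the spec'd functions by their planned version yields the release plan directly.

```python
# Hypothetical spec rows: (function name, planned version).
from itertools import groupby

SPEC_ROWS = [
    ("ToUpperCase", "1.0"), ("ToLowerCase", "1.0"),
    ("SubString", "1.1"), ("Trim", "1.1"),
    ("PadLeft", "1.2"),
]

def release_plan(rows):
    """Group the spec'd functions by the version they are planned for."""
    ordered = sorted(rows, key=lambda r: r[1])
    return {version: [name for name, _ in group]
            for version, group in groupby(ordered, key=lambda r: r[1])}
```

Because the plan is derived rather than hand-maintained, moving a function to a later version is a one-row edit in the spec, and the report "follows along."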


A simple, often reusable low-code tool can convert that same metadata into, say, a Python module, with a template for each function that simply throws a "Not Implemented" exception.  The human developer then only has to write the actual "source code" for what happens when you call StringLib.ToUpperCase("foo").  Or what happens when StringLib.ToLowerCase("FOO") is called.  Or - and this is actually the important bit - what happens when you call StringLib.DoFoo("abc").


8. Low Code Development Flow

When we add "SubString" to the list of supported functions in the SSoT, here's what the low-code development flow looks like - most of which would simply not be possible in a "traditional" development environment.


  1. 5 Unit Tests would immediately start failing

    1. Python StringLib.SubString(...) fails with error: "Not Implemented Exception."

    2. C# StringLib.SubString(...) fails with the error "NotImplementedException thrown by the target of the invocation."

    3. Javascript/TypeScript fails....

    4. PHP fails...

    5. Java fails...

  2. The next time the human developers log in on any of those platforms, they see their CI errors - and now have, each in their own language, an empty "SubString(...)" function which simply throws a not-implemented exception.

  3. Once each developer writes their version of the function and checks in that code, the unit test resolves itself.

  4. The documentation now shows four functions - at least for those languages which have passed the unit test.  The documentation can list those that do not yet pass as functions that are "Still in development.”

  5. Everywhere that previously mentioned the three functions that the string lib supports would now list 4.

  6. We can link bug reports and feature improvements to the specific item in the specification database (the actual "source" of each of these functions) to get metrics, per function, of how they are doing.  
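Step 1 of the flow above can be sketched in a few lines of Python.  The class below stands in for the generated SDK (function names from the SSoT; the stub/implementation split is illustrative), and the derived test runner reports exactly the kind of per-function CI summary described:

```python
# Generated SDK sketch: three functions already implemented by hand,
# plus the SubString stub just derived from the updated SSoT.
class StringLib:
    @staticmethod
    def ToUpperCase(text):               # hand code - already written
        return text.upper()

    @staticmethod
    def ToLowerCase(text):               # hand code - already written
        return text.lower()

    @staticmethod
    def SubString(text, start, length):  # derived stub - just added to the SSoT
        raise NotImplementedError

def run_derived_tests():
    """Return {function_name: 'pass' | 'Not Implemented'}, like a CI summary."""
    calls = {
        "ToUpperCase": lambda: StringLib.ToUpperCase("abc"),
        "ToLowerCase": lambda: StringLib.ToLowerCase("ABC"),
        "SubString":   lambda: StringLib.SubString("abc", 0, 1),
    }
    results = {}
    for name, call in calls.items():
        try:
            call()
            results[name] = "pass"
        except NotImplementedError:
            results[name] = "Not Implemented"
    return results
```

The moment a developer replaces the `raise NotImplementedError` line with a real body and checks it in, the derived test flips to "pass" with no test code changing at all.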

9. In Conclusion
The key is that the entire "system" is defined and managed outside of the code - i.e., in a No-Code model.  Ideally, then, only the actual creative bits (how do we convert text from one form to another?) need to be written by hand.

For as much as possible of the entire system, though, the definition and planning details for the specific behaviors are all stored outside of the code.  With this abstract model in hand, low-code-tools can then create most/all of the infrastructure/plumbing/scaffolding along with much of the testing/documentation, i.e., the connective tissue - which is essential in virtually every system I’ve ever encountered.


In addressing your concerns about future-proofing, the key is that it is possible to add a 6th language at any time - and that new language can automatically start with placeholders for ToUpperCase, ToLowerCase, DoFoo, and SubString.  Additionally, there would immediately be four failing unit tests - one for each of the functions we have not yet implemented in that new language.


We can attempt to manage these problems with appropriate oversight and things like agile development methodologies. However, it is still a really expensive way to do it, like, an order of magnitude more costly than it needs to be.

Thursday, October 29, 2020

Low-Code, No-Code, and why "Source Code" is a terrible place to put software

What if there were an entire layer of code simply missing from most "traditional" development projects?  What if software developers could be spending their time focusing on just a small, interesting, innovative fraction of today’s “source code,” while having the other 70-90% automatically derived directly from the Specification on their behalf, regardless of which languages or tech stack is involved?

What if the very term itself "Source Code" was in fact a deeply flawed mangling of two completely different things?   

Step 1 - Admitting that we have a problem!

As an industry, our problem starts with the very term "Source Code".  The name incorrectly implies that "Source Code" is actually the "source" for a given project... but rarely is that actually true - which is why it belongs in quotes.

The real source is - or rather, should always be - the "Specification," as determined by the ongoing, ever-changing specific needs of the business in question: the constraints, rules, inputs, user stories, features, wants, needs, etc., which are all subject to change at any point, and frequently do.  Perhaps the only constant is "change" itself, right?

The problem with the traditional approach to development though, is that every time the rules change, the "source code" is immediately stale, and will tend to get more and more out of date over time, because most of the code does not automatically update itself in response to even completely predictable changes in the requirements. 

Instead, the project stakeholders typically have to wait for the Python developer to get to their part of the project to update their "source code".  And for the Android team to update the Java "source code".  And for the DB team to update their "source code".

But you can't really have 3 "sources", can you?  That's just not what the word "source" means - right?

Instead, there should always be just 1 (one) source. 

A Single Source of Truth

An SSoT for an entire system/platform is one which authoritatively defines as much of the final solution as possible, in a machine-readable, easily queryable format that is not code.  I.e., a No-Code "Model".

But importantly, that's not where the story ends.

I'm a developer.  If No-code actually worked today, I'd be out of a job.  Luckily for me though - I think we're probably still at least 5-10+ years away from that reality.  In the meantime however, I still want to write code.  I still want my team of developers to be able to write code. 

But I want them to write as little code as possible in order to accomplish their objectives.  I.e. I want Low-code.  Specifically, even 5 years from now, I will want the human team members to write about 20% of the code, and for the remaining 80% to be derived from specification data (SSoT) - in whatever language/tech-stack we are working in at that point.

Redefining "Source Code"

To understand how this really can be possible, we unfortunately need to start by completely redefining the term "Source Code" itself.  Specifically, we need to pull it apart into the two completely separate things it actually is.

Source Code = Derivative Code + Hand Code

Once we've done this - it allows us to consider a slight variation from the traditional development stack:

"Traditional" Development Stack

  • Framework (node, angular, react, .net, django, etc)
  • Static Libraries (shared code, npm, internal tools, etc)
  • "Source Code" (mostly written by hand)

With a Single Source of Truth (SSoT) as a machine-readable specification document, we can add a whole extra layer before the human developers get involved - one which can usually encode the vast majority of the business rules/logic for a given system.  So essentially, by always starting with language-specific, mostly derived SDKs, we can have this stack.

Development based on a Single Source of Truth

  • Framework (node, angular, react, .net, django, etc)
  • Static Libraries (shared code, npm, internal tools, etc)
  • "Source Code" =
    • Derivative Code     ~80%    (60-90%?)
      SDKs - mostly derived from the Single Source of Truth, in many cases able to encode the vast majority of the business rules/logic for the underlying system, and having little or no “hand code”.
    • Hand Code             ~20%
      i.e. the actual Source Code

The result is a completely different development environment where right from the start, human developers always get to focus on the creative, interesting bits of the project - the last 20%, rather than the other 80% which is usually rote, derivative, patternistic, completely predictable code - i.e. the platform, plumbing, scaffolding, etc. - all the "connective tissue" - regardless of language, environment or technical context.

With this model, the bottom 70-90% of the code that is typically written and maintained "by hand" in a traditional environment is instead derived - so when changes occur (as they reliably always do), as much as 8 out of every 10 lines of code can automatically update itself.  It can simply "follow along".  And only the last 15-20% of the code, written by hand, may still have to be updated by "real" developers.

This results in a dramatically more dynamic, flexible, responsive code base, top to bottom, on which to build - resilient to even substantial changes in direction, or the more extreme pivots that projects so frequently have to address.  And it's all possible because we're working with as much as 70-90% less "tech debt" - on an ongoing basis, even years into a large-scale project's life-cycle.

No-Code vs. Low-code

At the end of the day, I largely agree with Linus Lee about No-Code specifically.  For the foreseeable future, it will continue to be limited in its ability to fully address the needs of large, long lived, enterprise level systems.  So I'm pretty sure that we'll continue to need "real" developers for at least the next decade.

But in my Open Response to Linus Lee, I argue that No-Code tools are actually already really well suited to providing highly detailed, very scalable, machine readable specification data - which can then be leveraged as the source input for Low-code tools, to supply the vast majority of production ready source code for a given system.  These Tools make the Single Source of Truth...

  • Language Agnostic.
  • Platform Agnostic.  
  • Context Agnostic.  
  • Future Proof.

In this way, even when a new, never-before-seen language "Wombatz" comes out tomorrow (or next year), a single tool will allow us to immediately, seamlessly integrate that language into any of our projects in development for which we have a well-defined Single Source of Truth.

With an SSoT in hand, the newfangled Wombatz code will literally "catch up" to the rest of the languages already in the project - on day 1 - and this can all happen before you've even hired the new human Wombatz developer.

So while I would agree that No-Code tools are still definitely not yet viable on their own as a complete/final solution, they can definitely be used already today to very quickly/flexibly/predictably define the rules for even very complex and sophisticated systems - in a machine-readable format: json, xml, swagger, dbml, uml, etc.  Those rules can then be used to generate production-ready code, which simply "follows along" as the rules change over time.

That process looks like this, even years into a large scale project:

  1. Changes are requested by end users (business users)

  2. The No-Code Model is updated to match the changes requested

  3. All of the SDKs derived from the No-Code model, across the entire project stack, are regenerated - including APIs, UI, Back-End, Front-End, Databases, Documentation, Unit Tests, Python, Swift, Java, TypeScript, SQL, etc. - and they will always all tend to match each other, as a set... because they are all derived from a common description - an SSoT - and are not each treated as separate silos of "Source Code".

  4. So by the time the human developers get into the source code (in virtually any technical context), 80% or more of the work is potentially already done, and they can immediately begin to focus on the important/creative/non-intuitive bits - like what the UX should be for the new feature, service, behavior, rule, etc.

Tuesday, October 27, 2020

An Open Response to Linus Lee's thoughts

 Re: When is No-Code useful?

and Conservation of Complexity

Hi Linus,

I very much enjoyed reading your articles on Conservation of Complexity and When No-Code is useful - and I agree completely with many of the specific conclusions and observations that you make. But... I’m glad it’s still in the “notes” section of your site, as I'd love a chance to convince you that when it comes to no-code, you might be missing the forest for the trees.


I haven’t seen any no-code company or product that allows source control (and I’ve seen many no-code companies, but you’re welcome to prove me wrong.)  

I would love the opportunity to try - and in this post I will present what my development team and I do, with the objective of hopefully finding a new perspective in response to your challenge.

Related Article

This response uses language and a number of terms related to the general notion of Derivative Code - and more specifically, Why "Source Code" is a terrible place to put software.   If these ideas are not familiar to you, it might be helpful to look at that article first.  

So, with those ideas as background, I will also try to provide some alternative perspectives on a few of the specific points in your article about when no-code might be useful:

1. Transitionary, ephemeral software
  

We agree that for things like brainstorming, prototyping, developing UX, etc., no-code is often great - but these solutions also typically come with problems such as:

1) You typically can't manage it through common/essential tools like source control, ci, defect management, etc. 

2) It's often not sufficiently scalable.

3) It's usually a black box (internally), for which you typically only have selective knobs/levers to adjust.

4) You're "stuck" as soon as the no-code product doesn't do exactly what is needed. 

These are all true of most No-Code providers - but not so much for Low-Code tools.  Even if you use No-Code tools just to sketch out/brainstorm an idea first, then (depending a little bit on precisely which tool(s) you're using) this work will almost inevitably create a really well-defined, machine-readable description of those rules - i.e., a Single Source of Truth or SSoT.  Low-Code tools can then turn those "specification" documents into production-ready code (derivative code) in the language/tech stack of your choice.

With an SSoT providing a common foundation - we can now pick out the production stack, and specific tools that we want to use.

LAMP,  JAM, MEAN, WIMP, Native Java/Swift/Typescript, Windows, Web, Mac, iOS, Android, etc.  

Importantly though, these decisions can all come after the No-Code model has already been defined.  Possibly even years after it's been defined - and after 100's or even 1000's of changes have been made to it over that time.  All this is possible because the decisions are not "buried" in hand-written "source code", possibly in some long-forgotten language like Fortran or COBOL.

After close to 2 decades of research in this area, what I have found in practice is that most of the requirements of virtually any technology can be well defined in a database, even without knowing - and ultimately completely decoupled from - the specific languages or technical contexts that you're going to need in the production environment.

So we can apply Low-code tools to No-Code models - such that developers can actually start on day 1 with much/most of the code needed for the scaffolding/framework for the project already present, regardless of the tech stack involved.  As a result, literally right out of the gates, the developers can begin addressing the needs of the end users, on the actual, final production versions of the code, most of which is just as flexible/responsive as the no-code model used to describe the rules in the first place.

2. High-churn code  

Here again, if the code is in a high-churn environment, I think that we both agree that No-Code might be a good option.  And if the goal of No-Code is to be the final or finished solution, then I completely agree with this assessment.

But change resiliency over time is not the focus of no-code tools.

By contrast, however, Low-Code tools are actually super resilient to change - and I'd even go so far as to say that they actually future-proof your code in a way that is virtually impossible to replicate in a "traditional" development environment.

By putting most/all of the decisions about the solution into a No-Code model first (rather than into hand-written code in a specific language), these decisions can be abstractly leveraged even years in the future - possibly against languages completely unknown today, with different operating environments, technical contexts, etc. - all because the decisions were not baked into hand-coded "python" or "java" as the very first step, back in the day.

3. Avoiding the same mistakes  

 

After all, the world is complex. And when we build software against the complexity of the world, that complexity needs to go somewhere. Software is complex, but only as much as the world it attempts to make sense of.  

This is precisely the crux of it, imho.  The world is complex.  And that complexity needs to go somewhere.  And most of it simply should not end up in "source code" as its first destination.  That's basically helpful to an audience of 2: the compiler for the language in question, and developers of that specific language.  This is a really expensive & brittle place to record those decisions, with a narrow audience of people who can definitively answer the question: what does this "system" actually do?

Instead, that complexity should be captured in a specification database (i.e. a No-Code model) which can be abstractly queried and reported against now, and in the future.  The human readable, English "specification" then simply becomes a report against that database.  And any time we update the specification database, we simply re-run that report - making it the first target to "follow along" as changes occur. 

Then, when we re-run, say, the "Python report", it updates the foundational Python libraries and code to also reflect the new rules/changes.  With most of the "plumbing" largely maintaining itself over time, the code that we do still have to write by hand ends up being extraordinarily efficient, because we're not constantly re-inventing the wheel and calling that "source code".
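To make the "report" idea concrete, here is a minimal sketch (the spec rows and output format are hypothetical): the human-readable spec is itself just a query over the spec rows, so re-running it after any change keeps the prose in sync automatically.

```python
# Hypothetical spec rows; the English "specification" is derived from them.
SPEC_ROWS = [
    {"name": "ToUpperCase", "doc": "Convert the input string to upper case."},
    {"name": "ToLowerCase", "doc": "Convert the input string to lower case."},
]

def english_spec(rows):
    """Render the human-readable spec as a report over the spec database."""
    return "\n".join(f"- {row['name']}: {row['doc']}" for row in rows)

# Any change to the rows only requires re-running the report:
SPEC_ROWS.append({"name": "SubString", "doc": "Return a slice of the input."})
report = english_spec(SPEC_ROWS)
```

A "python report" works the same way - it is just a different rendering of the same rows, emitting stubs and scaffolding instead of English sentences.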

Does the distinction I'm trying to draw make any sense?

The grain of abstractions

but we as a technical industry have learned how to build and evolve software systems against changing requirements and constraints that span years and decades.

We agree that changes are inevitable - but however good our "source code" is, it still inevitably mixes multiple things and treats them as one.  Specifically, traditional "source code" mixes the description of WHAT needs to happen into the same place as the description of HOW to do the work in question, in a very specific language, context, or environment.

Instead, systems can be dramatically more resilient to change if the definition of WHAT needs to happen is consciously isolated and defined separately from the description of HOW to actually do that in a particular language.  The types of information which fit comfortably in the "specification database" that I've mentioned are the rules that encode what needs to happen.  Tools can then convert those details into a specific language, with as much specificity as is available between the Single Source of Truth and the tool which produces the final output code.