The Truth, The Whole Truth, and Nothing But The Truth

As a tester I have always believed in the idea of a single source of truth, and as a consultant I have always espoused it. Until recently, that is. I now find myself conflicted, because I am currently in a situation where there are multiple sources of truth.

The concept of a single source of truth for your data is still the ideal solution, and I will continue to advocate for it within organisations where it is practical. However, on one of my recent engagements there were multiple sources of truth, and it was impractical to change this: apart from the size of the data sets, there were external factors, such as legislation, that blocked us from doing so.

This all started with a simple conversation around ‘how do you DevOps databases?’ The easiest answer was ‘Outsource it to the DBA team... or use a tool that does it for you.’ My counter-argument was that it is easy to do DevOps on a greenfield system, but once you start adding legacy systems it becomes harder. Over the course of the discussion my argument morphed into ‘how do you handle multiple sources of truth?’ (something I never thought I would ask). Not surprisingly, the response was that you should only have one source of truth.

I responded with a real-life example from a client situation. Suppose you are working in a government department (at any level) and the system you are developing, testing or validating within that department contains a certain set of data. Under the Australian Privacy Principles (APP), APP 3.1 states: ‘If an APP entity is an agency, the entity must not collect personal information (other than sensitive information) unless the information is reasonably necessary for, or directly related to, one or more of the entity's functions or activities.’ For example, the Department of Health does not need to know about your income or taxation to provide medical care to you, so it is unable to ask for or store that information. The reverse is true at tax time: the Australian Taxation Office (ATO) needs to know some health details about you (such as your private health insurance) in order to ensure that you are taxed or levied the correct amounts based on your income and coverage status.

The same applies to numerous other departments. As such, there is a single identity source of truth which contains the most basic information that every department needs to identify you, such as name, last known address, phone and date of birth. This is source of truth 1, which I shall refer to as Base Identity. Each department is then responsible for the data that it manages in relation to you. When you first interact with any department, it tries to validate you against the Base Identity system, or to create you in it; this would typically happen at birth, when your parents do the interaction on your behalf. Once the department has your Base Identity, it can gather the additional information it requires from its internal systems. This is source 2, which we shall call Extended Identity. To complete its 'function', the department may data-match some of your Extended Identity against the department that is the primary manager of certain aspects of your data. This is source 3, which we shall call Advanced Identity. For example, the ATO is the primary manager of people’s Tax File Numbers, and other departments consume that data.
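To make the three sources easier to picture, here is a minimal sketch of how they might be modelled. It is an illustration only: the class names mirror the labels used above, but every field name (including `person_id`, `collected` and `primary_manager`) is an assumption of mine and not the schema of any real government system.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class BaseIdentity:
    """Source 1: the minimal details every department needs to identify a person."""
    person_id: str            # hypothetical shared identifier
    name: str
    last_known_address: str
    phone: str
    date_of_birth: date


@dataclass
class ExtendedIdentity:
    """Source 2: Base Identity plus the data one department collects itself."""
    base: BaseIdentity
    department: str
    collected: dict[str, str] = field(default_factory=dict)


@dataclass(frozen=True)
class AdvancedAttribute:
    """Source 3: one attribute owned by its primary manager and consumed by others."""
    person_id: str
    attribute: str
    value: str
    primary_manager: str
```

The point of separating the types is that each one has a different owner: the Base Identity system, the consuming department, and the primary manager respectively.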

You Can't Handle The Truth!

So we now have a single source of truth for a department (the Extended Identity), which enables it to perform its function and service the customer. The trick comes when the data provided by you, and stored in the Extended Identity, does not match the primary manager's data. We are now faced with two sources of truth. Ultimately the primary manager should be considered the source of truth for the data it manages, but what do you do if the data it provides, and you consume, no longer matches the data you have previously collected and, more importantly, validated?

If we collect the data ourselves and also consume it from the source of truth, and the two don’t match, who is wrong? In this instance a data feed from one of the providers was corrupted, so our Extended Identity did not match the Advanced Identity. There are valid situations where this can occur, for example when certain aspects of the data have legitimately changed. The example I am talking about, however, was a coding mistake in the API.
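As a rough illustration of the problem, the sketch below compares a previously validated record with an incoming feed record and reports the differences without overwriting anything. It is not what was done on the engagement; the `Mismatch` type, the `find_mismatches` function and the field names are all hypothetical, and the choice to flag rather than auto-correct is my assumption about a safe default when you cannot yet tell which side is wrong.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mismatch:
    field_name: str
    held_value: str   # what we previously collected and validated
    feed_value: str   # what the primary manager's feed now says


def find_mismatches(held: dict[str, str], feed: dict[str, str]) -> list[Mismatch]:
    """Compare fields present in both records and report differences.

    Nothing is modified: deciding which side is correct is left to a
    later (possibly manual) review step.
    """
    shared = sorted(held.keys() & feed.keys())
    return [Mismatch(k, held[k], feed[k]) for k in shared if held[k] != feed[k]]
```

A sudden spike in the number of mismatches across many people at once would point towards a corrupted feed, whereas isolated mismatches are more likely to be legitimate changes.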


I would like to know if you have had situations where there are two sources of truth, and how you handled them.

  • How do you validate the truth is the truth?
  • What do you do when the source of truth is wrong?

I find this interesting and would like to hear from people who have had similar situations, and how you handled them, especially given the prevalence of DevOps.


How do you do DevOps of Data?




