Building and maintaining sources of truth

Brian Levine

Co-Founder, CEO

Expected vs Actual

I've done a fair bit of work at this point in my life, in and around tech but also outside of it. I spent 12 years as an engineer for the Department of Defense before getting into the Silicon Valley tech world. Since then, I've spent another ten years in tech as a Support Engineer, Support Lead, and now as a founder of a support tools company. All that to say that I've worked with a lot of people in different industries and in different capacities and have run into this strange use of jargon over and over again.

When I first heard the term "source of truth" at a tech startup, I thought it was a specific term of art used in the industry that differed from what I had learned working for the government. Maybe people use it in some oddly particular way in this industry. With that in mind, I asked someone what they meant when they used the phrase "source of truth" in conversation.

Turns out, it isn't an industry-specific term of art. It's that people were misusing the phrase "source of truth" pretty frequently and to everyone's detriment.

Since first running into this disconnect years ago, I've encountered it over and over. It's a commonly used phrase that, in my experience, doesn't mean what people think it means. Or, at the very least, shouldn't refer to the things it's often used to refer to.

But I'm getting ahead of myself.

Sources of truth matter

The point of having any sort of "source of truth" about anything is to keep everyone aligned in some way. I know that "keeping teams aligned" is kind of buzzword-y, but the point is to make sure everyone has the same information about the business.

We want everyone on the team, and everyone at the company, to have the same basic information at any given time. That's why we care about having a source for this information, for this "truth".

We also care because sometimes we're just told to deal with it.

Several times in my career as a support professional I've been told to maintain some source of truth for the company. A list of client accounts, monthly revenue, customer contact information. For whatever reason (and I have some guesses that I won't get into yet), Support is often tasked with maintaining some source of truth for the company.

So sources of truth are important to have and Support is asked to maintain some or all of them. Those are good reasons for us to care about what they are and what they are not.

Defining a "source of truth"

Here's my short definition:

A source of truth is where a data set is stored.

That's kind of it in a nutshell.

The data can be any information that people at the company need. The source of that data is where it's kept. It seems kind of simple and literal. Not a lot of room for negotiation or error, right? And yet we're here talking about it, so there must be more to it.

Let's put it in more philosophical terms and see if that helps:

A source of truth is a canonical depiction of reality inside your company.

Less technical and literal here, but this is the part that's important.

A source of truth is not ambitious or hopeful. It isn't "leading" or "trailing" data. It's a picture of something real that everyone in your company who needs to can look at to understand something as it is in reality. Revenue, expenditures, customer churn, support ticket volume -- whatever bit of information people are looking for -- the source of truth will be the place to find the real and complete picture of it.

The term "source of truth" is often used in conversation as a singular thing. But there can be more than one source of truth within a company if and only if those sources of truth represent different pieces of information.

I once worked for a company where the CEO said that the product should be "the customer's source of truth." Unqualified. But the source of truth of what? There are a lot of things happening within a company. A lot of things happening within a team. It's difficult for any one tool to be the source of truth for everything the team or company needs.

The idea this CEO had is not unique. A lot of vendors promote their products as the "source of truth" for your company. They want you to store all your information in that tool so that you're using their tool as a reference for everything you need. The benefit to you seems clear: you have one place to find any information you could want.

The benefit to the vendor is greater, though, because once they're your primary reference point it becomes difficult to tear that out. It's a way to prevent customer attrition from their point of view more than anything.

You don't have to live like that. You don't need one single source of truth for everything. You can and should have a bunch of primary sources to work with.

You can have a source of truth for your list of active customers and a separate source of truth for which customers have contacted your support team in the last 30 days. The data is related, but not so tightly coupled that they need to exist in the same place. You can have a source of truth for your customer onboarding process for the professional services team and a separate source of truth for your customer offboarding process for the account management team.

These might seem obvious to you. So let's think of a counter-example.

You cannot have more than one source of truth for the same information. You can't have a spreadsheet of client accounts as the source of truth for one team and a Salesforce report of client accounts as the source of truth for another team. This is, in fact, a real situation that made me angry enough to write this. It's the reason I thought all those years ago that I was confused about how people were using the term. Because many people do keep multiple "sources of truth" for one data set.

If you store your list of active customers in more than one place, only one of them can be considered the source of truth. Additionally, if you have multiple sources of truth like that, it's likely that none of those could be considered a source of truth at all.

Why and how a source of truth is useful

I believe that a good source of truth has three qualities:

  1. Correctness
  2. Accessibility
  3. Uniqueness

Correctness

A source of truth can be considered "correct" if it changes as reality changes.

Your source of truth should accurately reflect the data in real time as much as possible. For example, if you process customer refunds, then your source of truth for which customers have been given refunds should be updated when a refund is processed.

This seems like an obvious example, maybe. But in many cases, people have a bunch of intermediate steps. Imagine a team that keeps a spreadsheet of customer refunds for each month, and they update the spreadsheet manually at the end of every week by going into their payment system, finding the refunds for the week, and copying/pasting those numbers into the spreadsheet.

Is that spreadsheet the most correct source of truth? In the middle of the week, there are refunds that aren't yet reflected in the spreadsheet. A more correct source of truth would be the payment system in which the refunds are being processed. It has the data in real time, since that's where the refunds are happening.
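To make the gap concrete, here is a minimal sketch of the refund example. The refund records, customer names, and dates are all made up for illustration, and the two helper functions are hypothetical. The point is only that a copy made on Friday cannot know about a refund processed the following Wednesday.

```python
from datetime import date

# Hypothetical refund records as they might exist in a payment system.
payment_system_refunds = [
    {"customer": "acme", "amount": 40.0, "processed": date(2024, 3, 4)},
    {"customer": "globex", "amount": 15.5, "processed": date(2024, 3, 6)},
    {"customer": "initech", "amount": 99.0, "processed": date(2024, 3, 13)},
]


def snapshot_as_of(refunds, last_copy_date):
    """Return the refunds a manually updated spreadsheet would contain."""
    return [r for r in refunds if r["processed"] <= last_copy_date]


def missing_from_snapshot(refunds, last_copy_date):
    """Return refunds that exist in the primary source but not in the copy."""
    return [r for r in refunds if r["processed"] > last_copy_date]


# Assume the spreadsheet was last updated at the end of the previous week.
last_copy = date(2024, 3, 8)
spreadsheet = snapshot_as_of(payment_system_refunds, last_copy)
stale = missing_from_snapshot(payment_system_refunds, last_copy)

print(len(spreadsheet), "refunds in the spreadsheet")
print([r["customer"] for r in stale], "not yet reflected")
```

Anyone reading the spreadsheet mid-week sees two refunds, while the payment system already knows about three.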

Accessibility

A source of truth is accessible if it is available to anybody who needs to know the information when they need to know it.

If the information is not directly accessible, or it's difficult to access in some way (going through layers of permission granting and authorization from other people, for example), then it isn't useful. And if a source of truth is not accessible to people when they need it, they will start to create their own source of truth, leading to the problem of multiple data sets. Rather than trying to have a group of documents that teams try to keep in sync (which inevitably fails), give people access to the information they need to do their jobs.

I know that sounds easier than it is in practice. This often involves lengthy discussions with legal and finance. But the discussion is worth having.

Uniqueness

Lastly, a source of truth for any individual data set can only be a source of truth if it is the only one people go to for that information. This is a recurring theme for a reason.

If people look for the information in multiple places, you're keeping the data updated in multiple places. This increases the likelihood of those non-primary sources getting out of sync.

These three qualities are often closely related, especially in how they fail. If a source of truth is not accessible, it will be recreated by other people in ways that are more accessible to them. This creates a situation where there are non-unique "sources of truth". If sources of truth are not unique, there is a very high likelihood that they are not correct.

Okay, that's all well and good. But what can we do about it? We're supposed to be talking about building and maintaining sources of truth here, right? How do we do it?

What to do, what to do

First, do an audit.

This isn't fun, but it's often enlightening. Check all of your information. Data about customers, about the business, process documentation, etc. Make sure that the things you refer to regularly are the same things that other teams are referring to. Talk to those other teams to see what they're doing and what information they're using. You might be surprised by what you discover. You might be angry or sad. That's okay; this is a chance to correct past mistakes.

You may disagree with another team on what a source of truth should be. If you're looking at customer account data in a backend admin tool and the sales team is getting their customer account data in Salesforce, how do you know which one is more correct? Dig into each source and see where they are getting their information. Try not to get too attached to one over another. In the end, it's more important to use the most correct information than it is to use your information.
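One way to start that digging is to export the list from each place and compare them. The sketch below assumes you can get a plain list of account names out of both tools; the function name, the sample names, and the normalization rule (trim whitespace, ignore case) are my own assumptions, not features of any particular product.

```python
def normalize(name):
    """Trim whitespace and ignore case so cosmetic differences don't count."""
    return name.strip().lower()


def compare_sources(source_a, source_b):
    """Report which accounts appear in only one source and which are shared."""
    a = {normalize(n) for n in source_a}
    b = {normalize(n) for n in source_b}
    return {
        "only_in_a": sorted(a - b),
        "only_in_b": sorted(b - a),
        "in_both": sorted(a & b),
    }


# Hypothetical exports from an admin tool and from a CRM report.
admin_tool_accounts = ["Acme", "Globex ", "Initech"]
crm_accounts = ["acme", "Globex", "Umbrella"]

report = compare_sources(admin_tool_accounts, crm_accounts)
for key, names in report.items():
    print(key, names)
```

The accounts that show up on only one side are the ones worth a conversation with the other team. The comparison doesn't tell you which source is right, only where they disagree.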

Now that you've audited your sources and made them unique, let's make sure they're correct.

If you want to make your sources of truth a direct reflection of reality in real time, you should be automating them as much as possible. Every manual data entry in a "source of truth" takes it further from the source. If you're copying and pasting any information at any step, stop and figure out how and where to automate that data transfer. Stripe, Salesforce, Zendesk, Help Scout, Redshift… the list goes on. They all have APIs you can use to set up automatic updates. When in doubt, see if Zapier has an integration.
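As a rough sketch of what "automate the transfer" can look like, the report below is computed from the primary source every time it's needed instead of being typed into a spreadsheet. The `fetch_refunds` parameter is a stand-in for whatever API client you actually use; `fake_payment_api` and the record shape are invented for this example and don't correspond to any real vendor's API.

```python
from collections import defaultdict
from typing import Callable, Iterable


def monthly_refund_totals(fetch_refunds: Callable[[], Iterable[dict]]) -> dict:
    """Build refund totals per month directly from the primary source.

    fetch_refunds is any callable that returns refund records shaped like
    {"customer": str, "amount": float, "month": "YYYY-MM"} (assumed shape).
    """
    totals = defaultdict(float)
    for refund in fetch_refunds():
        totals[refund["month"]] += refund["amount"]
    return dict(totals)


def fake_payment_api():
    """Hypothetical stand-in for a real payment system API call."""
    return [
        {"customer": "acme", "amount": 40.0, "month": "2024-03"},
        {"customer": "globex", "amount": 15.5, "month": "2024-03"},
        {"customer": "initech", "amount": 99.0, "month": "2024-04"},
    ]


totals = monthly_refund_totals(fake_payment_api)
print(totals)
```

Because nothing is stored by hand, there is no second copy to drift. Swapping `fake_payment_api` for a real client is the only part that depends on your tools.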

The caveat to this is that sources of truth that contain processes and internal docs are usually written by people. They still have a primary source, whether it's a Jira issue or a Notion doc or a Google doc. If those docs need to be used in other places, make sure that the secondary places are updated automatically when the primary source changes.
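Here is a minimal sketch of that idea, assuming you can read the primary doc and each secondary copy as plain text. The doc names and contents are hypothetical, and in practice the "overwrite" step would be a call to whatever tool hosts the copy.

```python
import hashlib


def fingerprint(text):
    """Return a stable hash of a document's content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def sync_copies(primary_text, copies):
    """Overwrite every copy that differs from the primary.

    copies maps a location name to that location's current text.
    Returns the names of the locations that were updated.
    """
    target = fingerprint(primary_text)
    updated = []
    for name, text in list(copies.items()):
        if fingerprint(text) != target:
            copies[name] = primary_text
            updated.append(name)
    return updated


# Hypothetical primary doc and two places it has been copied to.
primary = "Refunds over $100 need manager approval."
copies = {
    "internal_wiki": "Refunds over $100 need manager approval.",
    "help_center": "Refunds over $50 need manager approval.",
}

updated = sync_copies(primary, copies)
print("updated:", updated)
```

The copies never get edited directly. They only ever receive what the primary says, so there's still exactly one place where the truth is written.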

You should make each source of truth accessible to the people who need it. Again, this is not always in your power. Some will be a matter of automation, but the tools need permission to access that data for the automation to work. This is another chance to talk to other teams and see who needs what information, when they need it, and why they need it. Make it a company-wide project to get everyone the access they need so that everyone can do their best work.

All together now

In summary, sources of truth are a depiction of reality.

They're correct, accessible, and unique.

You may notice that I never talked about "building and maintaining" any of these sources of truth. The reason is that I think you generally shouldn't.

Some sources of truth are written by people, so building and maintaining them is a matter of keeping the primary source updated. But the way sources of truth get misused is when a primary data source is copied repeatedly until people are looking at a recreation of a recreation that no longer tells them the truth at all.

This is why I think that sources of truth are best when they are discovered and cleaned rather than built and maintained.

Treat your sources of truth less like craft projects and more like archeological surveys. We'll all be better for it.