Get a Good Start with Data Governance

Are you having trouble capturing metrics associated with your Data Governance program? Do you find yourself wondering where those metrics came from in the first place?

To manage a successful Data Governance program, you need to be able to assess the functions critical to your business, the data elements necessary to complete those functions, each system that processes each data element, and the volume of these data elements that each system holds. To manage an outrageously successful Data Governance program, you will also need to incorporate context and timeframes into those measurements.

In other words, a successful Data Governance program must have a deep understanding of:

  • Cardinality
  • Complexity
  • Volatility
  • Reuse

Cardinality is the easiest to calculate. This value is gleaned from the number of unique instances of that data element within your information environment. If the data element is only stored in one data store, congratulations! You need to celebrate, because most information systems contain multiple data stores housing the very same data element.

If you are dealing with a more ubiquitous information system, keep track of the name of each data store you examine and the date on which the numbers you’re using were current. Business is dynamic; expect your data elements to be dynamic as well.
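If it helps to picture that record keeping, here is a minimal Python sketch. The class name, field names, and sample numbers are all made up for illustration; they are not taken from any particular tool. Each snapshot records which data store was examined, the date the count was valid, and how many unique instances were found.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CardinalitySnapshot:
    """One cardinality reading (hypothetical structure, for illustration)."""
    data_element: str
    data_store: str
    as_of: date            # the date the count was valid
    unique_instances: int  # unique instances found in this store


def snapshots_by_element(snapshots):
    """Group snapshots by data element name."""
    grouped = {}
    for snap in snapshots:
        grouped.setdefault(snap.data_element, []).append(snap)
    return grouped


def counts_differ(snapshots):
    """True when the examined stores report different counts."""
    return len({snap.unique_instances for snap in snapshots}) > 1


# Illustrative values only.
snapshots = [
    CardinalitySnapshot("customer_id", "billing_db", date(2024, 1, 31), 10_000),
    CardinalitySnapshot("customer_id", "crm", date(2024, 1, 31), 10_250),
    CardinalitySnapshot("invoice_number", "billing_db", date(2024, 1, 31), 48_000),
]

grouped = snapshots_by_element(snapshots)
for element, snaps in grouped.items():
    stores = sorted(snap.data_store for snap in snaps)
    print(element, "stores:", stores, "counts differ:", counts_differ(snaps))
```

Keeping the date on every reading is what lets you compare like with like when you revisit the numbers later.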

Scoring Complexity is critical to understanding how much maintenance cost to anticipate for a data element. Now for another car analogy (see “Does Your Information Engine Run on Diesel?” and “Data Quality is a Conversation” for more car analogies): it’s the difference between the cost of upkeep for the family sedan versus this year’s Maserati. The trick is to get the performance without the cost, like the Lotus Elise. It’s a great performer (0-60 in 4.8 seconds; it stops, goes and turns when you want it to) with reasonable maintenance costs. Oh, and it’s a 4-cylinder, so it gets CRAZY gas mileage. Yes, I’m biased about the Elise.

The higher the complexity, the harder you will need to work to maintain that data element. As with volatility, if your data elements are stored in multiple locations, you can expect your maintenance costs for that data element to compound.

HINT: If there is a variation in the number of data elements in each identified data store, conduct further analysis.

Volatility is a measure of how frequently the data element changes in each of its data stores. As with cardinality, if only one data store houses the data element, calculating volatility is simpler. You can count the number of data elements in the data store at each interval during the defined timeframe. Use those numbers to calculate the volatility. If, like many long-lived organizations, the data elements critical to your business functions are stored in multiple locations, scoring volatility gets a lot more interesting.

HINT: Score the volatility for each data store during the timeframe. Any differences should spark further analysis.
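One way to turn those interval counts into a score is sketched below in Python. The formula is an assumption made for this example, not a standard definition: it takes the average relative change between consecutive counts. The store names and counts are invented.

```python
def volatility(counts):
    """Mean absolute relative change between consecutive interval counts.

    This formula is an assumption chosen for illustration; use whatever
    definition your program has agreed on.
    """
    changes = []
    for previous, current in zip(counts, counts[1:]):
        if previous == 0:
            continue  # relative change is undefined; skipped in this sketch
        changes.append(abs(current - previous) / previous)
    return sum(changes) / len(changes) if changes else 0.0


# Illustrative counts taken at four intervals in the same timeframe.
interval_counts = {
    "billing_db": [1000, 1000, 1000, 1000],
    "crm": [1000, 1100, 990, 1089],
}

scores = {store: volatility(counts) for store, counts in interval_counts.items()}
spread = max(scores.values()) - min(scores.values())

for store, score in scores.items():
    print(f"{store}: {score:.3f}")
print(f"spread between stores: {spread:.3f}")
```

A large spread between stores holding the same data element is exactly the kind of difference that deserves a closer look.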

Think of Reuse as the “green” factor of the data world. The 2nd law of data science states that the value of information increases with use. The more you use your information, the more information you have to use. Successful organizations reuse their internal information for a variety of purposes. Billing, Customer Service and Marketing all use the same core customer information to complete their functions.

Many of the most successful organizations have either completed or are working on data-centric efforts, like Master Data Management or Single View of the Customer. The underlying goal is to get the organization to share the same version of that data; to reuse the data element rather than creating multiple versions of it and watching IT support costs skyrocket.
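A simple way to put a number on Reuse is to count how many business functions rely on each data element. The Python sketch below assumes that definition. The element names are hypothetical, and the mapping is invented for the example.

```python
from collections import Counter

# Hypothetical mapping of business functions to the data elements they use.
elements_by_function = {
    "Billing": {"customer_name", "billing_address", "account_number"},
    "Customer Service": {"customer_name", "account_number", "phone_number"},
    "Marketing": {"customer_name", "email_address"},
}


def reuse_counts(elements_by_function):
    """Count the business functions that use each data element."""
    counts = Counter()
    for elements in elements_by_function.values():
        counts.update(elements)  # each function counts once per element
    return counts


reuse = reuse_counts(elements_by_function)
for element, count in sorted(reuse.items()):
    print(f"{element}: used by {count} function(s)")
```

Elements used by only one function are not necessarily a problem, but they are good candidates to check for duplicates living under another name elsewhere.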

Quantifying metrics will improve your chances of Data Governance success. I developed a tool that calculates the 4 variables discussed here – Cardinality, Complexity, Volatility, and Reuse – and helps make sense of the business functions, their supporting data elements, and the information systems on which they reside.

After entering values for each metric described above for the 11 in-scope data elements shared across (at most) 3 Critical Business Functions – Customer Service, Billing, and Marketing – this is what’s displayed as the summary:

[Image: DG Eval Summary]

Taken as an average of the scores for each data element, this provides a summary view of the range of values to be expected for data elements associated with the defined business functions. This tool also provides a summary view of each metric.
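The averaging itself is straightforward. The Python sketch below shows the general idea with invented scores on an arbitrary scale. It is not the tool’s actual implementation, and the element names and values are hypothetical.

```python
from statistics import mean

METRICS = ("cardinality", "complexity", "volatility", "reuse")

# Invented scores for three data elements (scale is arbitrary).
scores_by_element = {
    "customer_name": {"cardinality": 4.0, "complexity": 2.0, "volatility": 1.0, "reuse": 5.0},
    "billing_address": {"cardinality": 2.0, "complexity": 3.0, "volatility": 2.0, "reuse": 1.0},
    "email_address": {"cardinality": 3.0, "complexity": 1.0, "volatility": 3.0, "reuse": 3.0},
}


def summarize(scores_by_element):
    """Average each metric across all data elements."""
    return {
        metric: mean(scores[metric] for scores in scores_by_element.values())
        for metric in METRICS
    }


summary = summarize(scores_by_element)
for metric, value in summary.items():
    print(f"{metric}: {value:.2f}")
```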

Here’s the view of Volatility:

[Image: DG Eval Volatility]

What questions do you have based on these two simple views? Is there an opportunity to increase Reuse, or decrease Volatility and/or Complexity? Are there unexpected variances in Cardinality that need to be understood before moving forward?

Having this visualization on hand when discussing your Data Governance program can help focus a wide audience quickly. Any questions that arise regarding the scores can be addressed with the hard numbers that back them up. Most importantly, it enables the team to review trends over time, which is critical to the success of any Data Governance initiative.

If you’d like to learn more about this tool, including how you can use it to guide your Data Governance conversations, please contact me.
