Do you really know what data quality is?
We all know the old adage about data quality: “garbage in, garbage out” (GIGO), which implies that most data issues result from incorrectly created or poorly maintained information. Often, however, data issues stem from a lack of clarity about the data rules that must be complied with for successful business processing, or about the expectations around reporting needs. Indeed, issues may arise when data is incomplete or simply used (or not used) in unexpected ways during application design.
Thus, poor data quality arises from multiple areas. While the classic technical dimensions such as consistency, conformity, completeness, de-duplication, and auditability all apply, the reality is that if the business can’t use or rely on the data, there is a quality issue. Since many of the underlying business issues stem from design, processing rules, or reporting, a more functional assessment of data quality must occur alongside the expected technical one. This reality has a significant impact on how you define, measure, achieve, and maintain data quality. To put it bluntly: don’t assume that fixing an issue involves a minor correction at the source and that things will automatically improve in the next solution. Data quality often won’t improve unless it’s a key focus area and you apply some proactive effort.
Data quality: A working definition
Data quality is, to my mind, not just a mix of the technical and functional items noted above. It also involves physically recording data quality measures against critical data rules, with the aim of improving quality over time. Unless you link these data quality measures to actual business impacts, specifically the cost impact of poor data, it will always be hard to build business cases that justify improving data quality, because you won’t be able to show the linked financial improvement as data quality increases.
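To make that linkage concrete, here is a minimal sketch of recording quality measures and converting them into an estimated cost figure. The rule names, record counts, and per-failure costs are purely illustrative assumptions, not real benchmarks:

```python
# Minimal sketch: turning recorded data quality measures into an estimated
# cost impact. All rule names, counts, and cost figures are hypothetical.

# Each measure: (rule, records checked, records failing, est. cost per failure)
measures = [
    ("customer_address_complete", 120_000, 4_800, 3.50),      # e.g. returned mail
    ("material_unit_of_measure_valid", 85_000, 1_700, 12.00), # e.g. order rework
    ("vendor_bank_details_verified", 9_000, 270, 40.00),      # e.g. failed payments
]

for rule, checked, failing, cost_per_failure in measures:
    pass_rate = 1 - failing / checked
    impact = failing * cost_per_failure
    print(f"{rule}: pass rate {pass_rate:.1%}, estimated impact ${impact:,.0f}")

# The aggregate figure is what a business case can anchor to.
total = sum(failing * cost for _, _, failing, cost in measures)
print(f"Total estimated cost of poor quality: ${total:,.0f}")
```

Even a rough figure like this gives a data quality business case something financial to anchor to, and tracking it over time shows whether quality work is paying off.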
This blog discusses why companies need to invest in data quality to avoid potentially significant impacts over time or, at the very least, why they need to invest tactically as larger system implementations, upgrades, changes, or developments occur.
It won’t fix itself!
The simple truth is that even with the best intentions and hard work, data quality may still be a stubborn item to address. The actual state of your business data usually only comes to light as large-scale changes occur. The triggering event can be as simple as the implementation of new business processes, or as complex as a large-scale digital transformation project that has been delayed, taken significantly more effort, or failed to realize expected benefits due to data quality issues. Thus, while many data issues may be accommodated in current designs or working practices, or simply accepted by end users, when a larger-scale change occurs these same issues become much more apparent and often delay, restrict, or even block the business’s ability to adopt change and drive innovation.
The bottom line is that unless data quality is directly addressed, it often won’t be resolved. Additionally, the design, processing, reporting, governance, and management of data all play a large part in data quality.
The following list provides some key points to start your thinking around data quality and help drive the need for broader quality management, which is required to support future business transformations or large-scale changes.
Understand the impacts of data quality: Link data quality measures to the actual time and cost impact on the business and, when applicable, on the project or support teams. Alongside these tangible costs, there are intangible benefits to improving data quality. Key examples are security, auditability, and data compliance, all of which are fundamental items that can’t always be fully ‘costed’ but can have a significant impact if problems occur.
Detail data requirements and rules: The solution designs within and across systems can embody many data rules. These may be known to specialists or discovered via data rule mining. Either way, the critical task is identifying the key data objects and attributes that would benefit most from tighter control via data rules and reporting linked to business costs and impacts (a minimal sketch of such executable rule checks follows this list). Additionally, general requirements for data quality and the associated management, reporting, and control are vital to ensuring that suitable tooling, processing, project delivery, and operating solutions are agreed upon and linked back to key requirements.
Assess the current status and areas for improvement: Understanding your own data quality and usage is a critical step in any assessment of your data management and operating model. Once an assessment has been made, you can plan much more effectively how improvements will be instigated, not just to the data but also to the wider aspects of the data solution. Assessing data quality, though, is not a one-off event; it’s an ongoing process. Similarly, potential improvements to data quality and delivery are moving targets, so while this may require a kick-off or an update to an existing periodic process, it is a crucial stage when planning the way forward.
Identify your governance and delivery approach: While products and frameworks exist for master data governance, along with toolchains for delivery, adopting them wholesale may be too large a step to complete in one pass. Agreeing on your own governance and delivery approach is therefore key, maximizing the use of available tools and applications within your skills and resources. This approach may involve multiple steps, from initial focus areas through procedural or narrowly focused solutions, which can then provide a route to more in-depth ones. Each step delivers something concrete, validates the approach, and enables wider and larger deployments of governance and operational data solutions.
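As promised above, here is a minimal sketch of what codifying data rules as executable, measurable checks can look like. The customer-master fields, rules, and sample records are hypothetical illustrations, not a reference to any particular system or tool:

```python
# Minimal sketch: data rules expressed as executable checks over a
# hypothetical customer master extract. Fields and rules are illustrative.
import pandas as pd

records = pd.DataFrame([
    {"customer_id": "C001", "country": "DE", "vat_number": "DE811907980", "email": "a@b.com"},
    {"customer_id": "C002", "country": "DE", "vat_number": None,          "email": "not-an-email"},
    {"customer_id": "C003", "country": "US", "vat_number": None,          "email": "c@d.org"},
])

# Each rule returns a boolean Series: True where the record passes.
rules = {
    # Completeness: every customer needs an ID.
    "customer_id_present": lambda df: df["customer_id"].notna(),
    # Conformity: e-mail addresses must match a basic pattern.
    "email_format_valid": lambda df: df["email"].str.match(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    # Conditional completeness: German customers need a VAT number.
    "vat_present_for_de": lambda df: (df["country"] != "DE") | df["vat_number"].notna(),
}

# Record the measures so quality can be tracked (and costed) over time.
for name, check in rules.items():
    passed = check(records)
    print(f"{name}: {passed.mean():.0%} pass rate, {int((~passed).sum())} failing record(s)")
```

Recording pass rates per rule, run after run, is what turns one-off checks into the ongoing assessment described above and feeds the cost linkage discussed earlier.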
The points above are just some areas to consider. The challenge is to proactively ensure that data does not become an obstacle to change, transformation, or innovation. Most companies have some level of central data governance, quality management, and delivery. However, you can re-invigorate your approach to data by using profiling tools and data mining solutions to understand your data and its potential rules, quality, and governance and operational opportunities. Adopting a more business-focused view of the data also helps ground the solution in real requirements, costs, and benefits. The business focus will typically come from within the company through an end-user perspective; however, do not forget to consider external factors, such as adopting process simplifications to adhere to standard software solutions or using cloud-native delivery to move technical debt outside of applications.
So what should your next steps be?
In summary, my main recommendation is to address data quality early in any change program and on an ongoing basis. Re-assess your current quality and governance, linking them to data requirements and to concrete cost- and risk-based impacts around data, and utilize the available tools, platforms, and services that can offer more comprehensive input into this process. Remember that this will also have to include your own business and technical perspectives. Next, I would suggest that key stakeholders get together and agree on a direction forward, depending on where you are in your data quality journey and requirements. This could involve some specific, tactical steps and/or a larger strategic initiative. Overall, data quality is simply not a topic you can ignore, since you can be sure that it will eventually catch up with you!
Data quality management is a much wider topic that I will be covering in forthcoming blogs. HCLTech would be keen to talk to companies with large on-premises, SAP-focused systems that seek to understand how they can transform their delivery capability and maximize cloud utilization.