Given that it is World Data Privacy Day, and having spent this week working in this field, I thought now would be a good time to publish a post about data quality and risks. While this article does not directly relate to user privacy, it does connect back to the data issues organisations may need to address and the strategies for addressing them.
Data is a key asset to businesses of any size, from large multinationals with vast data centers looking to gain a competitive advantage, to regional organisations aiming to make important decisions around their data, to smaller businesses looking to understand their customers better. Low quality data can create issues that run through an entire business, which makes managing it more and more important in today’s world.
This is not a new topic, and organisations are well aware of the problems that can exist with data. However, it’s important to review data governance procedures regularly to ensure data is trustworthy and fit for purpose. High quality data can help an organisation perform at its maximum efficiency, while bad quality data can lead to trust issues with partners or internal teams, legal problems with customers, or damage to a business’s strategy and reputation.
As business systems grow, integrations are made or operations are scaled up, data is consumed in many different ways. It may be aggregated or merged from various channels such as social media, web forms, manual data entry or alternative data sources, making the problem more complex and difficult to control. As the data problem domain grows, it therefore becomes even more important to control data quality risk.
We can all agree that inaccurate, incorrect or incomplete data will not serve your organisation well and will most likely have a negative impact on business decisions at many different levels. There are several ways that an organisation can improve data quality in its operations. Having a good data strategy and maintaining a method of continuous improvement is vital to achieving your data quality targets. There is no single fix to the problem. It requires a process that monitors, improves and reviews data in cycles, so that you manage and maintain your data quality and the effect it has on your organisation.
1. Profile the data
- Discover the strengths and weaknesses of your data
- Identify structure, attributes and rules associated with your data
- Do you have bad data? What work needs to be done?
- Generate metadata to describe it
2. Metrics and Defining Targets
- Measure the quality of your data
- Set targets based on data dimensions (defined by DAMA) – Accuracy, Completeness, Consistency, Conformity, Uniqueness and Timeliness are some of those suggested
- Remember it’s not always essential to populate all null values in your dataset, but this should be determined with your business stakeholders
3. Define and Implement Data Quality Rules
- Define validations for your data – Identify rules for specific types of data
- How can your data be reported if something is missing? An alert or warning from a report, digital dashboard or associated system
- Create data quality scorecards
4. Fix and Prevent Data Issues
- Fix issues already reported
- Implement mechanisms and validations, signed off by business stakeholders, to prevent data issues
- Do you have ETL requirements? An ETL (Extract, Transform, Load) process extracts data from its sources, transforms it into a required format and loads it into a target system.
5. Test & Review
- Test the development fixes to data in a staging environment
- Calculate data quality percentages to review the status of data quality
6. Continuous Monitoring
- Use tools to monitor both the data entering your organisation and your existing data
- Repeat your data profiling process, moving back to step one, to continuously improve your data sets and control quality
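To make the cycle above a little more concrete, here is a minimal Python sketch of profiling, measuring, validating, fixing and re-testing a small data set. Everything in it is an assumption made for illustration: the sample records, the field names, the two rules, the targets and the function names are all hypothetical, and a real implementation would more likely use a dedicated data quality tool or library.

```python
import re

# Hypothetical sample records; the field names are illustrative only.
RECORDS = [
    {"id": 1, "email": "ann@example.com", "country": "IE"},
    {"id": 2, "email": None, "country": "IE"},
    {"id": 3, "email": "bob@example", "country": "ie"},
    {"id": 3, "email": "cat@example.com", "country": "GB"},
]


def profile(records):
    """Profile: summarise nulls, distinct values and types for each field."""
    fields = sorted({key for row in records for key in row})
    summary = {}
    for name in fields:
        values = [row.get(name) for row in records]
        present = [v for v in values if v is not None]
        summary[name] = {
            "nulls": len(values) - len(present),
            "distinct": len(set(present)),
            "types": sorted({type(v).__name__ for v in present}),
        }
    return summary


def completeness(records, name):
    """Metric: share of records where the field is populated (0.0 to 1.0)."""
    if not records:
        return 0.0
    return sum(1 for row in records if row.get(name) is not None) / len(records)


def uniqueness(records, name):
    """Metric: ratio of distinct values to populated values (0.0 to 1.0)."""
    present = [row.get(name) for row in records if row.get(name) is not None]
    if not present:
        return 0.0
    return len(set(present)) / len(present)


# A deliberately simple pattern; real email validation is more involved.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

# Rules: each one takes a record and returns True when the record passes.
# A missing email passes the format rule; completeness is measured separately.
RULES = {
    "email_format": lambda row: row.get("email") is None
    or bool(EMAIL_PATTERN.match(row["email"])),
    "country_upper": lambda row: isinstance(row.get("country"), str)
    and row["country"].isupper(),
}

# Hypothetical targets (percent), to be agreed with business stakeholders.
TARGETS = {"email_format": 95.0, "country_upper": 100.0}


def scorecard(records, rules):
    """Scorecard: percentage of records passing each rule."""
    return {
        name: 100.0 * sum(1 for row in records if check(row)) / len(records)
        for name, check in rules.items()
    }


def below_target(scores, targets):
    """Report: names of the rules whose score is under the agreed target."""
    return sorted(n for n, s in scores.items() if s < targets.get(n, 100.0))


def normalise_country(row):
    """Fix: return a copy of the record with an upper-cased country code."""
    fixed = dict(row)
    if isinstance(fixed.get("country"), str):
        fixed["country"] = fixed["country"].strip().upper()
    return fixed


if __name__ == "__main__":
    print(profile(RECORDS))
    print(completeness(RECORDS, "email"), uniqueness(RECORDS, "id"))
    before = scorecard(RECORDS, RULES)
    print(before, below_target(before, TARGETS))
    # Apply the fix, then re-run the same scorecard to review the result.
    after = scorecard([normalise_country(r) for r in RECORDS], RULES)
    print(after, below_target(after, TARGETS))
```

Running the same scorecard before and after the fix is the point of the sketch: the rules and targets stay fixed while the data improves, which is what lets you compare one cycle with the next.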
It is important to note that data quality should run through your organisation, and your stakeholders will need to have a say in how the metadata is defined and what they expect in terms of data quality. It may be necessary to implement a metadata repository to maintain, manage and govern your organisation’s data, and to provide visibility and traceability to all stakeholders and roles that may need to see how your data is defined, who owns the data and how ownership works.
This article has been a light introduction to the topic, but hopefully you can see how setting up a good framework like this in your organisation will help to govern and maintain the quality of your data.
If you would like to know more about this process, ways to implement it and tools that may help to achieve this in your business, please get in touch.
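As a rough illustration of what a metadata repository records, the sketch below keeps a definition, an owner and agreed quality targets for each data element in memory. The class names, fields and sample values are hypothetical and chosen only for this example; a production repository would normally be a governed, persistent system rather than a Python dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataEntry:
    """One data element as described to stakeholders (illustrative fields)."""
    name: str
    description: str
    owner: str
    quality_targets: dict = field(default_factory=dict)


class MetadataRepository:
    """A minimal in-memory registry of metadata entries, keyed by name."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        # Re-registering a name replaces the earlier definition.
        self._entries[entry.name] = entry

    def describe(self, name):
        # Raises KeyError if the element has never been defined.
        return self._entries[name]

    def owned_by(self, owner):
        # Lets any stakeholder see which elements a given owner answers for.
        return sorted(e.name for e in self._entries.values() if e.owner == owner)


repo = MetadataRepository()
repo.register(MetadataEntry(
    name="customer.email",
    description="Primary contact email address for a customer",
    owner="Sales Operations",
    quality_targets={"completeness": 0.95},
))
repo.register(MetadataEntry(
    name="customer.country",
    description="Two-letter country code of the billing address",
    owner="Finance",
))
print(repo.describe("customer.email").owner)
print(repo.owned_by("Finance"))
```

Even a structure this small answers the questions raised above: how an element is defined, who owns it and what quality level was agreed for it.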