Data validation may not be one of the hot topics in data management, but it plays a critical role in ensuring that data sent from one place to another for operational and analytic purposes is still the same data when it arrives. For many organizations, reconciling and auditing such data is a painful task of manual comparison or, worse, a tedious exercise in sampling data from two systems and examining the differences to create exceptions; in many cases people transfer the data into spreadsheets, where even more mistakes can be made. Many organizations have gone to the trouble of custom-coding SQL and applications to compare the results of data integration and synchronization processes and ensure the integrity of their data.
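To make the custom-coded comparison concrete, here is a minimal Python sketch of a source-to-target reconciliation. It is an illustration only, not how any particular product works: the function name `reconcile`, the key column and the shape of the exception report are all assumptions made for this example, and rows are assumed to be plain dictionaries with a unique key.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def reconcile(source: List[Row], target: List[Row], key: str) -> Dict[str, list]:
    """Compare two row sets by key and return an exception report.

    Hypothetical helper: assumes every row carries a unique value in `key`.
    """
    src = {row[key]: row for row in source}
    tgt = {row[key]: row for row in target}

    # Keys present on one side only.
    missing_in_target = sorted(k for k in src if k not in tgt)
    unexpected_in_target = sorted(k for k in tgt if k not in src)

    # Keys present on both sides whose column values differ.
    mismatched = []
    for k in sorted(src.keys() & tgt.keys()):
        for col in sorted(src[k].keys() | tgt[k].keys()):
            if src[k].get(col) != tgt[k].get(col):
                mismatched.append((k, col, src[k].get(col), tgt[k].get(col)))

    return {
        "missing_in_target": missing_in_target,
        "unexpected_in_target": unexpected_in_target,
        "mismatched": mismatched,
    }
```

Even a sketch this small shows why hand-built checks become a burden: every new table, key and data type needs its own handling, and the comparison logic itself has to be tested.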
An alternative to this complexity is a product from DVO Software called DataValidator that is designed for data validation and confirmation. For many organizations it can reduce the cost of manual labor, the time needed to finish the tasks and the risk of incorrect data spreading through the organization. DVO Software provides a critical link in automating these tasks, which are usually conducted in the second phase – testing and quality assurance – of just about any IT data or application development project, before the system goes into production. This can be a large issue when an organization makes changes to its data-related infrastructure, such as upgrading a database or application, initiates a data migration project or brings a new data warehouse or business intelligence project into production.
It is critical to ensure that data in the source and the target has the same types of fields, with matching format, length and even range of values. Our benchmark research on data governance found that the largest impact of manual efforts that miss the data validation step is data inconsistency between applications and services – a serious risk when business depends on having the right data at all points. This is the case in product information management, as we found in our benchmark research on that topic; almost two-thirds of organizations use custom coding and manual processes in their efforts to ensure they have consistent, high-quality product data.
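As a rough illustration of what checking format, length and range of values involves, the following Python sketch applies a few field-level rules to a single record. Everything in it is hypothetical: the `FieldRule` class, the `check_record` function and the sample product fields are assumptions for this example and do not reflect any vendor's rule format.

```python
import re
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class FieldRule:
    """Hypothetical rule describing what one field is expected to look like."""
    name: str
    expected_type: type
    max_length: Optional[int] = None   # applies to string fields only
    min_value: Optional[float] = None  # inclusive lower bound
    max_value: Optional[float] = None  # inclusive upper bound
    pattern: Optional[str] = None      # regex the whole string must match


def check_record(record: Dict[str, Any], rules: List[FieldRule]) -> List[str]:
    """Return a list of human-readable problems; an empty list means the record passed."""
    problems = []
    for rule in rules:
        if rule.name not in record:
            problems.append(f"{rule.name}: missing")
            continue
        value = record[rule.name]
        if not isinstance(value, rule.expected_type):
            problems.append(
                f"{rule.name}: expected {rule.expected_type.__name__}, "
                f"got {type(value).__name__}"
            )
            continue  # remaining checks assume the right type
        if isinstance(value, str):
            if rule.max_length is not None and len(value) > rule.max_length:
                problems.append(
                    f"{rule.name}: length {len(value)} exceeds {rule.max_length}"
                )
            if rule.pattern is not None and not re.fullmatch(rule.pattern, value):
                problems.append(f"{rule.name}: value does not match expected format")
        if rule.min_value is not None and value < rule.min_value:
            problems.append(f"{rule.name}: {value} is below minimum {rule.min_value}")
        if rule.max_value is not None and value > rule.max_value:
            problems.append(f"{rule.name}: {value} is above maximum {rule.max_value}")
    return problems
```

A product record could then be checked with rules such as `FieldRule("sku", str, max_length=8, pattern=r"[A-Z]{3}-\d{4}")` and `FieldRule("price", float, min_value=0.0)`, with any returned messages feeding an exception list for review.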
It is interesting that data validation capabilities are not readily available in some prominent data integration technologies; validation there requires customization and coding. Addressing this situation somewhat, DVO Software has initially made its software easy to embed in Informatica PowerCenter so that it can be invoked readily from within that environment. It generates a set of rules to be used in data mapping and produces an exception-based report for analysis. DVO Software claims that heavy manual testing of data validity can take weeks in most enterprises and that its product reduces the time spent on these manual efforts by one-third or more. This streamlining speeds source-to-target testing and regression testing of data-related systems and simplifies upgrades, which happen in IT on a regular basis.
DataValidator is in its third version and is available directly and as an option from Informatica. While there are several products on the market for low-end data validation and in most cases deduping, matching and cleansing of lists, more is needed at the enterprise level where DVO Software competes and companies use Informatica PowerCenter. This application was designed for the purpose of data validation. If organizations look at the cost of resources and time spent in validation, they will easily establish a business case for automation and improvement. While you might think that you have done everything you can to improve and automate data-related tasks, this product might make you want to take a second look.
Mark also blogs at VentanaResearch.com/blog.