Checking Data Completeness is done to verify that the data in the target system is as per expectation after loading.
The common tests that can be performed for this are as follows −
- Checking Aggregate functions (sum, max, min, count),
- Checking and validating the counts and the actual data between the source and the target for columns without transformations or with simple transformations.
Count Validation
Compare the count of number of records in the source and the target tables. It can be done by writing the following queries −
SELECT count (1) FROM employee; SELECT count (1) FROM emp_dim;
Data Profile Validation
It involves checking the aggregate functions such as count, sum, and max in the source and target tables (fact or dimension).
Column Data Profile Validation
It involves comparing the distinct values and the count of rows for each distinct value.
SELECT city, count(*) FROM employee GROUP BY city; SELECT city_id, count(*) FROM emp_dim GROUP BY city_id;
Duplicate Data Validation
It involves validating the primary key and the unique key in a column or in combination of columns that should be unique as per the business requirements. You can use the following query to perform duplicate data validation −
SELECT first_name, last_name, date_of_joining, count (1) FROM employee GROUP BY first_name, last_name HAVING count(1)>1;
ETL Testing – Backup Recovery
Backup recovery for a system is planned to ensure that system is restored as soon as possible from a failure and operations are resumed as early as possible without losing any important data.
ETL Backup recovery testing is used to ensure that the Data Warehouse system recovers successfully from hardware, software, or from a network failure with losing any data.
A proper backup plan must be prepared to ensure maximum system availability. Backup systems should be able to restore with ease and should take over the failed system without any data loss.
ETL Testing Backup recovery involves exposing the application or the DW system to extreme conditions for any hardware component, software crash, etc. The next step is to ensure that recovery process is initiated, system verification is done, and data recovery is achieved.
I really like and appreciate your blog.Really thank you! Great.
I think this is a real great article post.Thanks Again. Really Cool.