ETL Testing / Data
Warehouse Testing
Why do organizations need Data
Warehouse?
Organizations with organized IT practices are looking forward to create a next level of technology transformation. They are now trying to make themselves much more operational with easy-to-interoperate data. Having said that data is most important part of any organization, it may be everyday data or historical data. Data is backbone of any report and reports are the baseline on which all the vital management decisions are taken.
Most of the companies are taking a
step forward for constructing their data warehouse to store and monitor real
time data as well as historical data. Crafting an efficient data warehouse is
not an easy job.
Many organizations have distributed departments with different
applications running on distributed technology. ETL tool is employed in order
to make a flawless integration between different data sources from different
departments.
ETL tool will work as an integrator, extracting data from
different sources; transforming it in preferred format based on the business
transformation rules and loading it in cohesive DB known are Data Warehouse.
Well planned, well defined and
effective testing scope guarantees smooth conversion of the project to the
production. A business gains the real buoyancy once the ETL processes are
verified and validated by independent group of experts to make sure that data
warehouse is concrete and robust.
ETL or Data warehouse testing is
categorized into four different engagements
irrespective of technology or ETL tools used:
- New Data Warehouse Testing – New DW is built and verified from scratch. Data input is taken from customer requirements and different data sources and new data warehouse is build and verified with the help of ETL tools.
- Migration Testing – In this type of project customer will have an existing DW and ETL performing the job but they are looking to bag new tool in order to improve efficiency.
- Change Request – In this type of project new data is added from different sources to an existing DW. Also, there might be a condition where customer needs to change their existing business rule or they might integrate the new rule.
- Report Testing – Report are the end result of any Data Warehouse and the basic propose for which DW is build. Report must be tested by validating layout, data in the report and calculation.
ETL
Testing Techniques:
1) Verify that data is transformed
correctly according to various business requirements and rules.
2) Make sure that all projected data is loaded into the data warehouse without any data loss and truncation.
3) Make sure that ETL application appropriately rejects, replaces with default values and reports invalid data.
4) Make sure that data is loaded in data warehouse within prescribed and expected time frames to confirm improved performance and scalability.
2) Make sure that all projected data is loaded into the data warehouse without any data loss and truncation.
3) Make sure that ETL application appropriately rejects, replaces with default values and reports invalid data.
4) Make sure that data is loaded in data warehouse within prescribed and expected time frames to confirm improved performance and scalability.
Apart from these 4 main ETL testing
methods other testing methods like integration testing and user acceptance
testing is also carried out to make sure everything is smooth and reliable.
ETL
Testing Process:
Similar to any other testing that
lies under Independent Verification and Validation, ETL also go through the
same phase.
- Business and requirement understanding
- Validating
- Test Estimation
- Test planning based on the inputs from test estimation and business requirement
- Designing test cases and test scenarios from all the available inputs
- Once all the test cases are ready and are approved, testing team proceed to perform pre-execution check and test data preparation for testing
- Lastly execution is performed till exit criteria are met
- Upon successful completion summary report is prepared and closure process is done.
It is necessary to define test
strategy which should be mutually accepted by stakeholders before starting
actual testing. A well defined test strategy will make sure that correct
approach has been followed meeting the testing aspiration. ETL testing might
require writing SQL statements extensively by testing team or may be tailoring
the SQL provided by development team. In any case testing team must be aware of
the results they are trying to get using those SQL statements.
Difference between Database and Data
Warehouse Testing
There is a popular misunderstanding that database testing and data warehouse is similar while the fact is that both hold different direction in testing.
- Database testing is done using smaller scale of data normally with OLTP (Online transaction processing) type of databases while data warehouse testing is done with large volume with data involving OLAP (online analytical processing) databases.
- In database testing normally data is consistently injected from uniform sources while in data warehouse testing most of the data comes from different kind of data sources which are sequentially inconsistent.
- We generally perform only CRUD (Create, read, update and delete) operation in database testing while in data warehouse testing we use read-only (Select) operation.
- Normalized databases are used in DB testing while demoralized DB is used in data warehouse testing.
There are number of universal
verifications that have to be carried out for any kind of data warehouse
testing. Below is the list of objects that are treated as essential for
validation in ETL testing:
- Verify that data transformation from source to destination works as expected
- Verify that expected data is added in target system
- Verify that all DB fields and field data is loaded without any truncation
- Verify data checksum for record count match
- Verify that for rejected data proper error logs are generated with all details
- Verify NULL value fields
- Verify that duplicate data is not loaded
- Verify data integrity
ETL
Testing Challenges:
ETL testing is quite different from
conventional testing. There are many challenges we faced while performing data
warehouse testing. Here is the list of few ETL testing challenges I experienced
on my project:
- Incompatible and duplicate data.
- Loss of data during ETL process.
- Unavailability of inclusive test bed.
- Testers have no privileges to execute ETL jobs by their own.
- Volume and complexity of data is very huge.
- Fault in business process and procedures.
- Trouble acquiring and building test data.
- Missing business flow information.
Data is important for businesses to
make the critical business decisions. ETL testing plays a significant role
validating and ensuring that the business information is exact, consistent and
reliable.
No comments:
Post a Comment