Wednesday, December 12, 2012

[BI Testing] What Is Business Intelligence Testing




     What is Business Intelligence Testing



What is a BI software application all about ?

 
Source data, target data, data warehouse, data mart, reports, ETL - Extract Transform Load, etl logic - incremental or full , integration services, reporting services, analysis services, database engine, cubes, measures, measure groups,dimensions, facts, reports testing - drill down, drill through, build deployment. I mean there are just so many of them .

How does the chain look like ? May be something like this :

Source  -> Staging -> Datawarehouse -> Cube -> Reports

 
Source :
  
Multiple data sources  such as MSSql Server, Oracle, MySql, SAP, Teradata, DB2, Sybase. It can be any combination of the listed as well as non listed ones for example as simple as flat text files. These multiple data sources can be an input data source to an application software that might help the end user take future strategic decisions.

 
Staging : 

An intermediary state of source data that act as an input to the datawarehouse where in data is loaded by the ETL logic using integration services from the source data. It is more or less the source data in its original form with those tables getting discarded that might not be needed for the report generation within the application.

 
Datawarehouse : 

It is the final entity developed within the database engine that gets created in the sequence mentioned above. It is here that the final objects get created and based on these objects the cubes are created for the analysis purpose. Here the object types that the complete data is organized into is called dimensions and facts. They as well as just like simple tables but the attributes within each dimensions and facts have a specific association within itself to resemble some realistic facts and associated information as per the business requirement.

 
Cube :  

This is the Analysis services objects within the complete  BI application development phase. As the name suggests it is not just the two dimension tables that hold its relevance in a typical RDBMS . It is indeed a three dimensional data modeling technique wherein we analyze a data set in more than just two dimensions. The facts that are associated with an application are analyzed as per the association they have with the various parameters which in BI terms are called as measures, or measure groups. Measure groups are actually a combination of more than one related measure. Thus we get greater insights into how a specific aspect of any business decision making gets impacted with a variation in various parameters.

 
Reports : 

Reports are nothing but the cubes data getting represented onto a user friendly interface with the option to parameterise the reports as per the business needs. In simple terms it is the end product that gets developed and the data we look for in the cubes are available for view purpose in them. We can drill down and drill through them based on the scenario we need.

For example we can by default see the report for any specific fiscal year as to how much sales have been materialized and then drill down onto the quarter basis , and then monthly basis, then the weekly basis and finally on the daily basis . Similarly drill through also gets applied on to the reports and the data can be visualized as per needs.

Authorization and authentication is another feature that has its role to be played in the reporting services but then the authorization of the cubes over ride the privileges granted on the reports.

I hope by now we have gain some insights on what these applications resemble like and how is it different from any other typical web application . Its more of an application that delves into data and data modeling and management techniques.


What is BI testing all about ?
 
 BI testing is pretty different from any other application in ways more than we can just list out. Domain expertise can only make the tester's life easy in an otherwise unknown nature that such application develops . There is no way one can wait till the reports get developed that we have any UI to start testing with !!! And thats what makes it even more interesting and challenging . 


Do we wait till the last phase of development cycle where in the reports get deployed onto a web application and then we start testing or is it that we have scopes of starting with our testing activities well within the initial development cycle or may be even before the full flow development activities start ?

There is a scope for automation as well, loads of scope for performance testing and of course the manual has to be the core of any testing activity.

 Things really start once the Integration services packages gets into a shaped state wherein the data from the raw format gets extracted ,transformed and loaded into the new environment.

That is what we call as ETL execution as well in short.Test cases authoring can be started once the ETL packages gets developed because that is where a tester's activity gets eased out. This is also the area of high automation ROI wherein the etl related test cases  can be automated using the various database automation techniques.

 Once integration services are in place we target the Analysis services section wherein the  cube testing starts at the base of which exists the dimensions and measures and measure groups. These cubes have the raw data organized in a very effectively designed and inter related manner that the real time data analysis is possible. These cubes form the basis of the reporting service that is the base of the UI testing .

 UI testing or the Reporting services testing is more on the filter and various combination of filter that can be provided with a set of inter - related measures and measure groups for the various facts as per the dimensions supplied. Within the same lies the validation of the drill down and drill through capabilities of the reports by just right click and left clicks in the reports on its graphs and the various axes.The three terms that makes sense during the data validation for the complete application is the dimensions, and fact tables . These are the actual tables that form the basis of the cube data.

 As per the data quality is concerned , the major target area is the Integration services packages that is the ETL packages and the logic with which the data gets extracted , transformed and  loaded into the target systems . It is here that we design the data loading logic that shall ensure that the junk data or data that may in no way be utilized for the reports getting generated do not load into the staging system. 

It is simply because the data mart is the next stage within the complete BI application development logic and for the data mart to have deal with only relevant data , staging environment must have only needed data with quality in tact. Then stored procedures come into play to arrange the data from the staging tables into the data mart tables and two different categories of data tables get created namely the dimensions and the facts.

 The data arrangement logic has to be validated in these tables with the major target areas being the stored procedure. These also get created during the   ETL execution phase that is the Integration services phase.Once the mart is finalized the cubes need to be tested wherein the testing involves no more the Sql queries execution but the MDX queries .

 Here in the measures and measure groups and their data for the various inter-related entities come up.It is done by browsing the cubes within the Analysis service engine. Finally these data becomes the base for showing the relevant reports in the UI as per the business user's requirements.






No comments:

Post a Comment