Thing which seemed very Thingish inside you is quite different when it gets out into the open and has other people looking at it

Thursday, August 21, 2008

Introduction to Data Warehousing

Data warehousing is a very useful process to support business and organizational decision making activities to store organizational historical data which can be later presented using a portal.

Data warehousing is mainly done in three basic steps

Data warehouse mainly depends on the Extraction Transformation Load process

Extraction – Extract the data from the source systems and converts them into a format for transformation process

Resources used for extraction task.

DW can use source system to get data to the system. Such as employee allocation information, employee human resource information, and financial information ect. These resources can come in many forms such as excel/csv/rdbms ect.

These data can be extracted to the warehouses using DTS packages or SSIS packages.

Transformation – Apply set of rules and conditions to the source data and transform filtered data to the data warehouse.

Set of rules which can be applied in the transformation tasks.

    • Aggregation

    • Filtering

    • Joining

    • Sorting

    • Pivoting -Turning multiple columns into multiple rows or vice versa

Loading – Load the filtered data in to the tables arrange them accordingly to the data warehousing schema with the usage of facts and dimensions. Facts store measures calculated, usually business process is represented by a fact table. Dimension tables contain component attributes, which represents descriptive information about a fact.

Data warehouse schema is put to data cubes and they are access by the internal portal

Useful Technologies for Data Warehousing

Data can be extracted through DTS Packages and SSIS packages.

Schedule tasked can be assigned to extract data daily using scheduled job using SQL server 2000/2005.

Data transformation can be done using views and stored procedures designed accordingly and executed those using DTS packages as well as SSIS packages

Loading can be done using data cubes architecture . Table data is loaded to the cubes using SQL 2005 Business Intelligent Studio which are arrange accordingly to the relevant DW schema. These cubes can be access in three ways

  1. Access through the reports which uses SSRS 2005 MDX coding and VB.Net custom codes.

  2. Access data through the portal which uses share point technology and .Net framework.

  3. Generating pivots using Microsoft Excel

No comments:

Post a Comment