Data warehousing is a very useful process to support business and organizational decision making activities to store organizational historical data which can be later presented using a portal.
Data warehousing is mainly done in three basic steps
Data warehouse mainly depends on the Extraction Transformation Load process
Extraction – Extract the data from the source systems and converts them into a format for transformation process
Resources used for extraction task.
DW can use source system to get data to the system. Such as employee allocation information, employee human resource information, and financial information ect. These resources can come in many forms such as excel/csv/rdbms ect.
These data can be extracted to the warehouses using DTS packages or SSIS packages.
Transformation – Apply set of rules and conditions to the source data and transform filtered data to the data warehouse.
Set of rules which can be applied in the transformation tasks.
Pivoting -Turning multiple columns into multiple rows or vice versa
Loading – Load the filtered data in to the tables arrange them accordingly to the data warehousing schema with the usage of facts and dimensions. Facts store measures calculated, usually business process is represented by a fact table. Dimension tables contain component attributes, which represents descriptive information about a fact.
Data warehouse schema is put to data cubes and they are access by the internal portal
Useful Technologies for Data Warehousing
Data can be extracted through DTS Packages and SSIS packages.
Schedule tasked can be assigned to extract data daily using scheduled job using SQL server 2000/2005.
Data transformation can be done using views and stored procedures designed accordingly and executed those using DTS packages as well as SSIS packages
Loading can be done using data cubes architecture . Table data is loaded to the cubes using SQL 2005 Business Intelligent Studio which are arrange accordingly to the relevant DW schema. These cubes can be access in three ways
Access through the reports which uses SSRS 2005 MDX coding and VB.Net custom codes.
Access data through the portal which uses share point technology and .Net framework.
Generating pivots using Microsoft Excel