Data warehouse - An Idea
Subject Oriented - A data warehouse is subject oriented because it provides information around a subject rather than the organization's ongoing operations.
These subjects can be products, customers, suppliers, sales, revenue, etc.
A data warehouse does not focus on ongoing operations; rather, it focuses on modelling and analysing data for decision making.
Integrated - A data warehouse is constructed by integrating data from heterogeneous sources such as relational databases, flat files, etc.
Because the integrated data is held in one consistent form, it can be analysed more effectively than the separate sources could be.
Time Variant - The data collected in a data warehouse is identified with a particular time period.
The data in a data warehouse provides information from the historical point of view.
Non-volatile - Non-volatile means the previous data is not erased when new data is added to it.
A data warehouse is kept separate from the operational database, and therefore frequent changes in the operational database are not reflected in the data warehouse.
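The time-variant and non-volatile properties above can be illustrated with a minimal Python sketch. The names `PriceFact`, `Warehouse`, `load`, `history`, and `price_as_of`, as well as the sample product and dates, are hypothetical and chosen only for illustration; a real warehouse would use a database rather than an in-memory list.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PriceFact:
    """One warehouse row: a subject (product), a value, and the date it applies from."""
    product: str
    price: float
    valid_on: date


class Warehouse:
    """Append-only store: loads add rows, nothing is updated or erased."""

    def __init__(self):
        self._rows = []

    def load(self, product, price, valid_on):
        # Non-volatile: a new price is appended next to the older ones.
        self._rows.append(PriceFact(product, price, valid_on))

    def history(self, product):
        # Time variant: every row carries its date, so history can be replayed.
        rows = [r for r in self._rows if r.product == product]
        return sorted(rows, key=lambda r: r.valid_on)

    def price_as_of(self, product, day):
        # Latest price known on or before the given day, or None if there is none.
        known = [r for r in self.history(product) if r.valid_on <= day]
        return known[-1].price if known else None


# The operational system keeps only the current value and overwrites it.
operational = {"widget": 9.99}
warehouse = Warehouse()
warehouse.load("widget", 9.99, date(2023, 1, 1))

operational["widget"] = 12.49                      # old price is lost here
warehouse.load("widget", 12.49, date(2023, 7, 1))  # old price is kept here

print(len(warehouse.history("widget")))                    # 2
print(warehouse.price_as_of("widget", date(2023, 3, 15)))  # 9.99
```

The operational dictionary can answer only "what is the price now", while the warehouse can also answer "what was the price on a given day", which is the historical point of view described above.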
Functions of data warehouse tools and utilities
Data Extraction - Involves gathering data from multiple heterogeneous sources.
Data Cleaning - Involves finding and correcting errors in the data.
Data Transformation - Involves converting the data from the legacy format to the warehouse format.
Data Loading - Involves sorting, summarizing, consolidating, checking integrity, and building indices and partitions.
Refreshing - Involves propagating updates from the data sources to the warehouse.
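The five functions above can be sketched as one small pipeline. This is only an illustration using Python's standard `csv` and `sqlite3` modules. The sample sources (`FLAT_FILE`, `DB_ROWS`), the `sales` table, the legacy DD/MM/YYYY date format, and every function name are assumptions made for the example, not part of any particular warehouse tool. Partitioning is left out, and summarizing is shown as a query after loading.

```python
import csv
import io
import sqlite3
from datetime import datetime

# Two hypothetical sources with different layouts.
FLAT_FILE = "customer,amount,date\n alice ,10.50,03/01/2024\nbob,,04/01/2024\n"
DB_ROWS = [{"cust": "Alice", "amt": "4.50", "day": "2024-01-05"}]


def extract():
    """Data extraction: gather rows from both sources into one common shape."""
    rows = [(r["customer"], r["amount"], r["date"])
            for r in csv.DictReader(io.StringIO(FLAT_FILE))]
    rows += [(r["cust"], r["amt"], r["day"]) for r in DB_ROWS]
    return rows


def clean(rows):
    """Data cleaning: trim and normalize names, drop rows with no amount."""
    return [(c.strip().lower(), a.strip(), d.strip())
            for c, a, d in rows if a.strip()]


def to_iso(text):
    """Convert the assumed legacy DD/MM/YYYY format; ISO dates pass through."""
    if "/" in text:
        return datetime.strptime(text, "%d/%m/%Y").date().isoformat()
    return text


def transform(rows):
    """Data transformation: legacy formats to the warehouse format."""
    return [(c, float(a), to_iso(d)) for c, a, d in rows]


def load(conn, rows):
    """Data loading: sort, check integrity, insert, and build an index."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales "
                 "(customer TEXT NOT NULL, amount REAL NOT NULL, day TEXT NOT NULL)")
    for customer, amount, day in sorted(rows, key=lambda r: r[2]):
        if amount < 0:
            raise ValueError(f"negative amount for {customer}")
        conn.execute("INSERT INTO sales VALUES (?, ?, ?)", (customer, amount, day))
    conn.execute("CREATE INDEX IF NOT EXISTS idx_sales_customer ON sales (customer)")
    conn.commit()


def refresh(conn, new_rows):
    """Refreshing: push later source changes through the same steps."""
    load(conn, transform(clean(new_rows)))


conn = sqlite3.connect(":memory:")
load(conn, transform(clean(extract())))
refresh(conn, [("Bob", "7.00", "06/01/2024")])

# Summarizing: total sales per customer across both sources.
totals = dict(conn.execute(
    "SELECT customer, SUM(amount) FROM sales GROUP BY customer"))
print(totals)
```

The row for bob in the flat file has no amount, so the cleaning step drops it; bob appears in the totals only after the later refresh. Refreshing here simply appends new rows, which is consistent with the non-volatile property of a warehouse.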