Defining A Data Warehouse
Data Warehouse
A data warehouse is the main storage server hosting an organizations
historical data, its corporate memory.
Defining A Data Warehouse
A data warehouse can be defined in the following terms:
- Subject-oriented - The data in the database
is organized so that all the data elements relating to the same
real-world event or object are linked together.
- Time-variant - The changes to the data in
the database are tracked and recorded so that reports can be produced
showing changes over time.
- Non-volatile - Data in the database is never
over-written or deleted - once committed, the data is static,
read-only, but retained for future reporting; and
- Integrated - The database contains data from
most or all of an organization's operational applications, and
that this data is made consistent.
While operational systems are optimized for simplicity and speed
of modification [Online Transactional Processing, or OLTP] through
heavy use of database normalization and an entity-relationship model,
the data warehouse is optimized for reporting and analysis [online
analytical processing, or OLAP].
Data in data warehouses are often heavily denormalised, summarised
or stored in a dimension-based model. This is not always required
to achieve acceptable query response times.
NEXT: Types of Data Warhouses
Types of Data Warehouses
With technology improvements supporting lower cost for more performance
and to support user requirements for faster data load cycle times
and more features, data warehouses have evolved through different
stages:
Off line Operational Databases - Data warehouses
developed by simply copying the database of an operational system
to an off-line server where the processing load of reporting does
not impact on the operational system's performance.
Off line Data Warehouse - Data warehouses are
updated on a regular time cycle (usually daily, weekly or monthly)
from the operational systems and the data is stored in an integrated
reporting-oriented data structure.
Real Time Data Warehouse - Data warehouses are
updated on a transaction or event basis, every time an operational
system performs a transaction [e.g. an order or a delivery or a
booking etc.]
Integrated Data Warehouse - Data warehouses are
used to generate activity or transactions that are passed back into
the operational systems for use in the daily activity of the organization.
These are now fully integrated to the various types of business
intelligence tools, such as MS Excel, Scorecards, Dashboards, Analytic
Services.
Advantages Of Data Warehouses
There are many advantages to using a data warehouse, including:
- Enhances end-user access to a wide variety of data.
- Decision support system users can obtain specified trend reports,
e.g. the item with the most sales in a particular area/country
within the last two years.
- A data warehouse can be a significant enabler of commercial
business applications, most notably customer relationship management
(CRM).
Data Warehouse Concerns
There are some general concerns regarding the use of data warehouses:
- Extracting, transforming and loading data consumes a lot of
time and computational resources.
- Data warehousing project scope must be actively managed to
deliver a release of defined content and value.
- Compatibility problems with systems already in place.
- Security could develop into a serious issue, especially if
the data warehouse is web accessible.
- Data Storage design controversy warrants careful consideration
and perhaps prototyping of the data warehouse solution for each
project's environments.
Back To Top
Data Index | Data Governance
| Data Quality | MDM
| ODS | Data
Warehouse
|