Data Warehousing
Posted by : Unknown
Sunday, June 30, 2013
Abstract
Organisations today suffer from data overflow. Developments in transaction
processing technology have given rise to a situation where the amount and rate
of data capture is very high, but the processing of this data into information
that can be used for decision making is not developing at the same pace. Data
warehousing and data mining (of both data and text) provide technology that
enables decision-makers in the corporate sector and government to process this
huge amount of data in a reasonable amount of time and to extract intelligence
and knowledge in near real time.
The data warehouse allows the storage of data in a format that facilitates its
access, but if the tools for deriving information and knowledge and presenting
them in a format that is useful for decision making are not provided, the whole
rationale for the existence of the warehouse disappears. Various technologies
for extracting new insight from the data warehouse have emerged, which we
classify loosely as "Data Mining Techniques".
Our paper focuses on the need for information repositories and the discovery
of knowledge, and then gives an overview of the much-hyped Data Warehousing
and Data Mining.
Introduction
“Knowledge [no longer Information] is not only power,
but also a significant competitive advantage”
Organizations have lately realized that just processing transactions and
information faster and more efficiently no longer gives them a competitive
advantage over their rivals in achieving business excellence. Information
technology (IT) tools that are oriented towards knowledge processing can
provide the edge that organizations need to survive and thrive in the current
era of fierce competition. Increasing competitive pressures and the desire to
leverage information technology have led many organizations to explore the
benefits of an emerging technology, viz. "Data Warehousing and Data Mining".
What is needed today is not just the latest information, updated to the
nanosecond, but cross-functional information that can support decision making
as an "on-line" process.
Evolution of Information Technology Tools
The evolution of information systems is characterized by a move from data
maintenance systems to systems that transform the data into "information" for
use in the decision-making process. These systems supported information
acquisition from the database of transactional data. The managerial knowledge
acquisition function was not directly supported by these systems. They could
not directly reveal the new patterns emerging in a changing scenario; the
planner was expected to infer these from experience.
Warehouse with a database
One thing that remains constant, especially in the corporate world, is “change”.
And, these days, change is occurring at an ever-increasing rate. A key
challenge is implementing an information infrastructure that allows your
company to respond rapidly to change. One solution to this challenge is the
data warehouse.
Data warehousing is an information infrastructure based on detailed data that
supports the decision-making process and gives businesses the ability to
access and analyze data to increase their competitive advantage.
Data warehousing is a process, not an off-the-shelf solution you buy: hardware,
database and tools are integrated into an evolving information infrastructure
that changes with the dynamics of the business.
The data warehouse attempts to figure out "what we need" before we know we
need it.
What is it actually?
* A data warehouse stores current and historical data.
* This data is taken from various, perhaps incompatible, sources and stored in
a uniform format.
* Several tools transform this data into meaningful business information for
the purpose of comparisons, trends and forecasting.
* Data in a warehouse is not updated or changed in any way, but is only loaded
and accessed later on.
* Data is organized according to subject instead of application.
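The properties listed above can be illustrated with a minimal Python sketch.
Everything named here is an assumption made for illustration only: the two
source systems ("billing" and "web shop"), their field names and formats, and
the SaleFact and Warehouse classes do not come from any particular product.
The sketch shows two incompatible sources being converted to one uniform,
subject-oriented record, and a store that is only loaded and read, never
updated.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)  # frozen: a loaded record cannot be modified
class SaleFact:
    """Uniform, subject-oriented record (subject: sales)."""
    customer_id: str
    sale_date: date
    amount: float
    source: str


def from_billing(row: dict) -> SaleFact:
    # Hypothetical billing system: DD/MM/YYYY dates, amounts as strings.
    return SaleFact(
        customer_id=row["cust"].strip().upper(),
        sale_date=datetime.strptime(row["dt"], "%d/%m/%Y").date(),
        amount=float(row["amt"]),
        source="billing",
    )


def from_web_shop(row: dict) -> SaleFact:
    # Hypothetical web shop: ISO dates, amounts in integer cents.
    return SaleFact(
        customer_id=row["customer"].strip().upper(),
        sale_date=date.fromisoformat(row["date"]),
        amount=row["cents"] / 100,
        source="web_shop",
    )


class Warehouse:
    """Load-and-read store: there is deliberately no update or delete."""

    def __init__(self) -> None:
        self._facts = []

    def load(self, facts) -> None:
        self._facts.extend(facts)

    def total_by_customer(self) -> dict:
        totals = {}
        for fact in self._facts:
            totals[fact.customer_id] = totals.get(fact.customer_id, 0.0) + fact.amount
        return totals


warehouse = Warehouse()
warehouse.load([from_billing({"cust": " c001 ", "dt": "30/06/2013", "amt": "120.50"})])
warehouse.load([from_web_shop({"customer": "C001", "date": "2013-06-29", "cents": 7950})])
totals = warehouse.total_by_customer()
print(totals)
```

Both source rows end up under the same customer key even though the systems
spell the identifier, the date and the amount differently, which is the
reconciliation step the list describes.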
In general, a database is not a data warehouse unless it has the following two
features:
· It collects information from a number of disparate sources and is the place
where this disparity is reconciled, and
· It allows several different applications to make use of the same information.
Information Sources always include the
core operational systems which form the backbone of day-to-day activities. It
is these systems which have traditionally provided management information to
support decision making.
Decision Support Tools are used to analyze
the information stored in the warehouse, typically to identify trends and new
business opportunities.
The Data Warehouse itself is the bridge
between the operational systems and the decision support tools. It holds a copy
of much of the operational system data in a logical structure which is more
conducive to analysis. The Data Warehouse, which will be refreshed in scheduled
bursts from operational systems and from relevant external data sources,
provides a single, consistent view of corporate data, leaving operational
systems unaffected.
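As a rough illustration of a refresh "in scheduled bursts", the sketch below
uses Python's built-in sqlite3 module with two in-memory databases standing in
for an operational system and the warehouse. The table names, the columns and
the idea of tracking the highest id already copied are assumptions made for
this example; the text above does not prescribe how a refresh is implemented.

```python
import sqlite3

ops = sqlite3.connect(":memory:")  # stands in for an operational system
dwh = sqlite3.connect(":memory:")  # stands in for the data warehouse

ops.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)")
dwh.execute("CREATE TABLE orders_copy (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)")


def refresh(ops_conn, dwh_conn) -> int:
    """Copy rows added to the operational table since the previous burst."""
    (last_id,) = dwh_conn.execute(
        "SELECT COALESCE(MAX(id), 0) FROM orders_copy"
    ).fetchone()
    new_rows = ops_conn.execute(
        "SELECT id, customer, amount FROM orders WHERE id > ? ORDER BY id",
        (last_id,),
    ).fetchall()
    with dwh_conn:  # commit the whole burst as one transaction
        dwh_conn.executemany("INSERT INTO orders_copy VALUES (?, ?, ?)", new_rows)
    return len(new_rows)


# Day-to-day activity on the operational side.
ops.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("acme", 100.0), ("globex", 250.0)],
)
first = refresh(ops, dwh)   # first scheduled burst

ops.execute("INSERT INTO orders (customer, amount) VALUES (?, ?)", ("acme", 40.0))
second = refresh(ops, dwh)  # next burst copies only the new row

print(first, second)
```

The operational table is only read, never altered, by the refresh, so
analysis runs against the copy and the operational system is left unaffected.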
Data Warehouse Functions
The main function of a data warehouse is to deliver enterprise-wide data in a
format that is most useful to end users, regardless of their location. Data
warehousing is used for:
* Increasing the speed and flexibility of analysis.
* Providing a foundation for enterprise-wide integration and access.
* Improving or re-inventing business processes.
* Gaining a clear understanding of customer behavior.
Data Warehouse Architecture
Each implementation of a data warehouse is different in its detailed design (a
high-level schematic of the architecture and its components is given in the
figure below), but all are characterized by the following key components:
· A data model to define the warehouse contents.
· A carefully designed warehouse database, whether hierarchical, relational, or
multidimensional. When choosing a DBMS, it must be kept in mind that the
database management system should be powerful enough to handle huge amounts of
data, running up to terabytes.
· A Decision Support System (DSS) front end for reporting and for structured
and unstructured analysis.
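The three components can be sketched together in a few lines of Python using
the built-in sqlite3 module. This is only an illustration under stated
assumptions: the dimension/fact table layout, the table and column names, the
sample rows and the sales_by_region function are all hypothetical, and a
relational in-memory database stands in for the warehouse database.

```python
import sqlite3

# Warehouse database: a relational store, here SQLite in memory.
db = sqlite3.connect(":memory:")

# Data model: one dimension table and one fact table (hypothetical layout).
db.executescript("""
CREATE TABLE dim_customer (
    customer_key INTEGER PRIMARY KEY,
    name         TEXT,
    region       TEXT
);
CREATE TABLE fact_sales (
    customer_key INTEGER REFERENCES dim_customer(customer_key),
    sale_year    INTEGER,
    amount       REAL
);
""")

db.executemany(
    "INSERT INTO dim_customer VALUES (?, ?, ?)",
    [(1, "Acme", "North"), (2, "Globex", "South"), (3, "Initech", "North")],
)
db.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?)",
    [(1, 2012, 100.0), (1, 2013, 150.0), (2, 2013, 80.0), (3, 2013, 50.0)],
)


def sales_by_region(conn, year: int) -> dict:
    """Front end: a structured report a decision-support tool might run."""
    rows = conn.execute(
        """
        SELECT c.region, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_customer AS c ON c.customer_key = f.customer_key
        WHERE f.sale_year = ?
        GROUP BY c.region
        ORDER BY c.region
        """,
        (year,),
    ).fetchall()
    return dict(rows)


report = sales_by_region(db, 2013)
print(report)
```

Keeping descriptive attributes such as region in a separate table from the
measured values is one common way to organize data by subject; a hierarchical
or multidimensional database would express the same model differently.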
Compendium
A data warehouse takes the organisation's operational data, historical data
and external data, and:
a) consolidates it into a separately designed database (which can be either
relational or multi-dimensional in nature)
b) manages it in a format that is optimised for end users to access and
analyse.
When a data warehouse has
been constructed, it provides a complete picture of the enterprise. It provides
an unparalleled opportunity to the management to learn about their customers.
Data warehouse technology, together with online transaction processing and
data mining, allows management to provide better customer service, create
greater customer loyalty and activity, focus customer acquisition and
retention on the most profitable customers, increase revenue and reduce
operating costs. It provides tools that facilitate sounder decision making;
improves worker and management knowledge and productivity; spares the
operational database from ad-hoc queries and the resulting performance
degradation; and clears the legacy database system, while moving the corporate
system architecture forward.
With the incorporation of new data delivery and presentation techniques, such
as Hypertext Markup Language (HTML) and Open Database Connectivity (ODBC),
database mining (of data and text) has gained widespread recognition as a
viable tool for business intelligence gathering. Advances in document mining
technology (database mining of free-form text, in contrast to the “classical”
approach of mining fixed-length records) are making data mining technology
more powerful.
Last but not least, the Internet has emerged as the largest data warehouse of
unstructured, free-form data. New technologies are geared towards mining this
great data warehouse.