Docsity
Docsity

Prepare for your exams
Prepare for your exams

Study with the several resources on Docsity


Earn points to download
Earn points to download

Earn points by helping other students or get them with a premium plan


Guidelines and tips
Guidelines and tips

Comparison of Three-tier Architecture and ROLAP, MOLAP, HOLAP in Data Warehouses, Study notes of Data Warehousing

The importance of data warehouses in business and their role in providing competitive advantage, productivity enhancement, and customer relationship management. It outlines the steps for designing and constructing a data warehouse, including the top-down view, data source view, data warehouse view, and business query view. The document also covers the three-tier architecture of data warehouses, consisting of the bottom tier warehouse database server, middle tier OLAP servers, and top tier client tools. The document further compares and contrasts ROLAP (Relational Online Analytical Processing), MOLAP (Multidimensional Online Analytical Processing), and HOLAP (Hybrid Online Analytical Processing) OLAP servers.

What you will learn

  • What are the differences between ROLAP, MOLAP, and HOLAP OLAP servers?
  • What are the benefits of using a data warehouse in business?
  • What are the three views of a data warehouse and what do they represent?

Typology: Study notes

2020/2021

Uploaded on 05/10/2021

maithili-rathod
maithili-rathod 🇮🇳

6 documents

1 / 10

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
CHAPTER 4
Data Warehouse Architecture
4.1 Data Warehouse Architecture
4.2 Three-tier data warehouse architecture
4.3 Types of OLAP servers: ROLAP versus MOLAP versus HOLAP
4.4 Further development of Data Cube Technology
4.5 Complex Aggregation at Multiple Granularity: Multi-feature cubes
4.6 Review questions
4.7 References
pf3
pf4
pf5
pf8
pf9
pfa

Partial preview of the text

Download Comparison of Three-tier Architecture and ROLAP, MOLAP, HOLAP in Data Warehouses and more Study notes Data Warehousing in PDF only on Docsity!

CHAPTER 4

Data Warehouse Architecture

4.1 Data Warehouse Architecture

4.2 Three-tier data warehouse architecture

4.3 Types of OLAP servers: ROLAP versus MOLAP versus HOLAP

4.4 Further development of Data Cube Technology

4.5 Complex Aggregation at Multiple Granularity: Multi-feature cubes

4.6 Review questions

4.7 References

4. Data Warehouse Architecture

4.1 Steps for the design and construction of warehouse:

A data warehouse provides competitive advantage by presenting relevant information from which to measure performance and make critical judgments in order to win in the competitive market space. A data warehouse can enhance productivity since it can quickly gather information about the activities of the organization. Thirdly, a data warehouse provides enhanced customer relationship management since it provides consistent view of the customers and items along all lines of the business, all departments and all markets. Finally, it is possible that a data warehouse brings about cost reduction by tracking trends, patterns and exceptions over long periods of time in a reliable and a consistent manner.

The business analysis framework must be understood in order to create an efficient data warehouse. Four different views of the data warehouse must be consisted: the top-down view, the data source view, the data warehouse view and the business query view.

  • The top-down view allows the selection of relevant information necessary for the data warehouse. The information matches the current and the forthcoming needs of the business.
  • The data source view exposes the way in which the data is captured, stored and managed by a data warehouse system. For individual source tables to integrated source tables information may be detailed at various levels of accuracy. Normally CASE tools or entity relationship models hold the traditional data sources.
  • The data warehouse view includes the fact tables and dimension tables. It represents the information that is stored with in the data warehouse as well as information regarding the data, time of origin of the source data to provide the historical context.

Since data warehouse construction is a difficult and a long term task, its implementation scope should be clearly defined in the beginning. The goals of an initial data warehouse should be specific, achievable and measurable

4.2 Three-tier data warehouse architecture

Data warehouses normally adopt three-tier architecture:

1. The bottom tiers is a warehouse database server that is almost always a relational database stsyem.Data from operational databases and from external sources are extracted using application program interfaces known as gateways. A gateway is supported by the underlying DBMS and allows client programs to execute code.

  1. The middle tier is an OLAP server that is typically implemented using a relational OLAP (ROLAP) model.
  2. The top tier is a client , which contains query and and reporting tools, analysis tools and/or data mining tools.

From the architecture point of view there are three data warehouse models: the enterprise warehouse, the data mart, and the virtual warehouse.

  1. Enterprise Warehouse: An enterprise warehouse collects all details comprising of all information about subjects spanning the entire organization. It provides corporate wide data

integration, usually from one or more operational systems and from external information providers. It takes extensive business modeling and it takes many years to design and build.

  1. Data Mart: A data mart consists of a subset of corporate wide data that is of value to specific group of users. The scope is confined to specific selected subjects. The data contained in a data mart tend to be summarized.
  2. Virtual Warehouse: A virtual warehouse is a set of views over operational databases. For efficient query processing, only some of the possible summary views may be materialized. A virtual warehouse is easy to build and it requires excess capacity on the operational database servers.

4.3 Types of OLAP servers: ROLAP versus MOLAP versus HOLAP

Normally business users are presented with multidimensional data from data warehouses or data marts without them being aware of the way in which i. e. how or where the data are stored. However, the physical architecture and implementation of OLAP servers need to take into consideration the issues regarding data storage. Various implementations are possible:

  1. Relational OLAP(ROLAP)servers: These are intermediate servers that stand in between a relational back-end server and client front-end tools. They use a relational or an extended relational DBMS to store and manage warehouse data, OLAP middle ware to support missing pieces. 2. Multidimensional OLAP (MOLAP) servers: These servers support multidimensional view of data through array based multi dimensional storage engines. They map multidimensional views directly to data cube array structures.

Data can be summarized and stored in a variety of ways in a multidimensional cube of an OLAP system. A user or analyst can search for interesting patterns in a cube by specifying a number of OLAP operations, such as drill down, roll up, slice, and dice. While tools are there to help the discovery process is not automated. The user follows his or her own intuition or hypotheses, tries to recognize exceptions or anomalies in the data. Discovery driven exploration is an alternative approach in which pre-computed measures indicating data exceptions are used to guide the user in the data analysis process. Exception indicators indicate cell values that are significantly different from the anticipated structural model. Three measures are used as exception indicators to help identify data anomalies. These measures indicate the degree of surprise that the quantity in a cell holds, with respect to its expected value. The measures are computed and associated with every cell, for all levels of aggregation. They are

SelfExp: This indicates the degree of surprise of the cell value, relative to other cells t the same level of aggregation.

InExp: This indicates the degree of surprise somewhere beneath the cell, if we were to drill down from it.

PathExp: This indicates the degree of surprise for each drill-down path from the cell.

4.5 Complex Aggregation at Multiple Granularity: Multi-feature cubes

Data cubes facilitate the answering of data mining queries as they allow the computation of aggregate data at multiple levels of granularity. Multifeature cubes compute complex queries involving multiple dependent aggregates at multiple granularities. These cubes are very useful in practice. Many complex data mining queries can be answered be answered by multifeature cubes without any significant increase in computing cost, in comparison to cube computation for simple queries with standard data cubes.

From Data Warehousing to Data Mining

In this section we study the usage of data warehousing for information processing, analytical processing and data mining. We also introduce on-line analytical mining (OLAM), a powerful paradigm that integrates OLAP with data mining technology.

Data warehouses and data marts and data marts are used in a wide range of applications. Business executives in almost every industry use the data collected, integrated, preprocessed and stored in data warehouses. Data warehouses are used extensively in banking and financial services, consumer goods and retail distribution sectors and controlled manufacturing, such as demand based production.

The more a data warehouse has been in use the more it would have evolved. This evolution takes place through a number of stages. Initially data warehouses are used for generating reports and answering predefined queries. Progressively, it used to analyze summarized and detailed data. Later, the data warehouses are put to strategic use. Finally the data warehouse may be put to use for strategic decision-making and knowledge discovery using data mining tools.

Business users need to know what exists in the data warehouse (through metadata), how to access the contents of the data warehouse, how to examine the contents using analytical tools and how to present the results of such an analysis. There are three kinds of data warehouse applications:

  • Information Processing supports querying, basic statistical analysis, and reporting using cross tabs, tables charts and graphs current trend in information processing is to construct low-cost Web based accessing tools that are integrated with the web browsers.
  • Analytical Processing supports basic OLAP operations, including slice and dice, drill-down, roll up and pivoting. It generally operates on historical data in both summarized and detailed forms.