What are the outstanding features of a data warehouse compared to an operational database?

青灯夜游 (Original)
2022-07-19 16:15:49

Compared with an operational database, the outstanding features of a data warehouse are “massive data support” and “fast retrieval technology”. A data warehouse is a structured data environment that serves as the data source for decision support systems and online analytical applications. The database is the core of the entire data warehouse environment: it stores the data and supports data retrieval.

The operating environment of this tutorial: Windows 7 system, Dell G3 computer.

Compared with operational databases, the outstanding features of data warehouses are “massive data support” and “fast retrieval technology”.

A data warehouse (abbreviated DW or DWH) is a strategic collection of data that supports decision-making at all levels of an enterprise. It is a single data store created for analytical reporting and decision support. For enterprises that need business intelligence, it provides guidance for improving business processes and for monitoring time, cost, quality, and control.

The data warehouse is a structured data environment that serves as the data source for decision support systems (DSS) and online analytical applications. Data warehousing studies and solves the problem of obtaining information from databases. A data warehouse is subject-oriented, integrated, non-volatile (stable), and time-variant.

Characteristics of data warehouse

A data warehouse arises from the need to mine data resources further and to support decision-making once a large number of databases already exist; it is not simply a "large database". The purpose of building a data warehouse is to serve as the basis for front-end query and analysis. Because it holds a great deal of redundant data, it also requires a large amount of storage. To serve front-end applications well, a data warehouse usually has the following characteristics:

1. Sufficiently high efficiency.

Analysis data in a data warehouse is generally produced by day, week, month, quarter, or year. The daily cycle is the most demanding: customers expect to see the analysis of yesterday's data within 24 hours, or even within 12 hours. Because some enterprises generate a large amount of data every day, a poorly designed data warehouse often runs into problems and can deliver data only after a delay of 1-3 days, which is clearly unacceptable.
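The reporting periods described above can be illustrated with a minimal sketch that rolls daily totals up into coarser periods. The function name `rollup`, the `grain` parameter, and the sample figures are hypothetical and chosen only for illustration; a real warehouse would normally do this inside the database.

```python
from collections import defaultdict
from datetime import date


def rollup(daily_totals, grain):
    """Sum daily totals into coarser periods.

    grain is "week" (ISO year, ISO week), "month" (year, month) or "year".
    """
    buckets = defaultdict(float)
    for day, amount in daily_totals.items():
        if grain == "week":
            iso = day.isocalendar()
            key = (iso[0], iso[1])
        elif grain == "month":
            key = (day.year, day.month)
        elif grain == "year":
            key = (day.year,)
        else:
            raise ValueError(f"unknown grain: {grain}")
        buckets[key] += amount
    return dict(buckets)


# Hypothetical daily figures, keyed by calendar day.
daily = {
    date(2022, 7, 17): 100.0,  # Sunday, ISO week 28
    date(2022, 7, 18): 250.0,  # Monday, ISO week 29
    date(2022, 7, 19): 150.0,
    date(2022, 8, 1): 75.0,
}

monthly = rollup(daily, "month")
weekly = rollup(daily, "week")
```

Keeping the daily grain as the base and deriving the coarser periods from it is one common design choice; it means only the daily load has to meet the tightest deadline.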

2. Data quality.

The information a data warehouse provides must be accurate. However, the data warehouse process is usually divided into multiple steps, including data cleaning, loading, querying, and presentation, and a complex architecture has even more layers. Dirty data in a source or imprecise code at any step can therefore distort the data. Customers who see wrong information may make wrong decisions based on it, causing losses rather than benefits.
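One way to keep dirty source data from reaching reports is to validate rows before loading them. The following is a minimal sketch under stated assumptions: the function `check_rows`, the field names, and the two rules (required fields, non-negative numbers) are hypothetical examples, not a complete quality framework.

```python
def check_rows(rows, required, non_negative):
    """Split rows into clean rows and rejected rows with their problems."""
    clean, rejected = [], []
    for row in rows:
        # A required field is a problem if it is absent, None, or empty.
        problems = [f"missing {name}" for name in required
                    if row.get(name) in (None, "")]
        # A numeric field listed in non_negative must not be below zero.
        problems += [f"negative {name}" for name in non_negative
                     if isinstance(row.get(name), (int, float)) and row[name] < 0]
        if problems:
            rejected.append((row, problems))
        else:
            clean.append(row)
    return clean, rejected


# Hypothetical source rows.
rows = [
    {"customer_id": 1, "amount": 20.0},
    {"customer_id": None, "amount": 5.0},
    {"customer_id": 3, "amount": -4.0},
]
clean, rejected = check_rows(rows, required=["customer_id"], non_negative=["amount"])
```

Rejected rows are kept together with the reasons, so they can be reviewed and corrected at the source instead of being silently dropped.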

3. Extensibility.

Some large data warehouse architectures are complex because they are designed for scalability over the next 3-5 years, so that the system does not have to be rebuilt at great cost soon afterward and can keep running stably. This shows mainly in sound data modeling: additional intermediate layers in the data warehouse solution give massive data flows enough buffering, so that the system does not stop working once the data volume grows considerably.

As the introduction above shows, data warehouse technology can awaken the data an enterprise has accumulated over many years. It not only manages this massive data for the enterprise but also taps its potential value, making it one of the highlights of operation and maintenance systems for telecommunications enterprises.

Broadly speaking, a decision support system based on a data warehouse consists of three components: data warehouse technology, online analytical processing (OLAP) technology, and data mining technology, with data warehouse technology at the core. The following articles in this series focus on data warehouse technology, introduce the main technologies of modern data warehouses and the main steps of data processing, and discuss how these technologies can help operation and maintenance in telecommunications operation and maintenance systems.

4. Subject-oriented.

Data in an operational database is organized around transaction processing tasks, and the business systems are separated from one another. Data in a data warehouse, by contrast, is organized by subject area. This subject-oriented organization contrasts with the application-oriented organization of traditional databases. A subject is an abstract concept: a higher-level synthesis, classification, and analysis of the data in enterprise information systems, with each subject corresponding to a macro-level analysis area. The data warehouse excludes data that is not useful for decision-making and provides a concise view of a specific subject.
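As a rough illustration of the difference, the sketch below combines rows from two separate business systems into one view organized around a single subject, the customer. The function `build_customer_subject`, the two source systems, and all field names are hypothetical.

```python
from collections import defaultdict


def build_customer_subject(billing_rows, support_rows):
    """Combine rows from separate systems into one record per customer."""
    subject = defaultdict(lambda: {"total_billed": 0.0, "support_tickets": 0})
    for row in billing_rows:
        subject[row["customer_id"]]["total_billed"] += row["amount"]
    for row in support_rows:
        subject[row["customer_id"]]["support_tickets"] += 1
    return dict(subject)


# Hypothetical rows from a billing system and a support system.
billing = [
    {"customer_id": "C1", "amount": 40.0, "terminal": "T9"},
    {"customer_id": "C1", "amount": 60.0, "terminal": "T2"},
    {"customer_id": "C2", "amount": 15.0, "terminal": "T2"},
]
support = [{"customer_id": "C1", "ticket": 501}]

customers = build_customer_subject(billing, support)
```

Details that matter only to transaction processing, such as the `terminal` field here, are left out of the subject view.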

Composition of data warehouse

Data extraction tool

A data extraction tool takes data out of various storage systems, performs the necessary transformation and organization, and stores the result in the data warehouse. The key capability of a data extraction tool is access to many kinds of data storage: it should be able to generate COBOL programs, MVS job control language (JCL), UNIX scripts, and SQL statements to access different data. Data transformation includes deleting data fields that are not meaningful for decision-making applications; converting to unified data names and definitions; calculating statistics and derived data; assigning default values to missing data; and unifying different data definition methods.
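The transformation steps listed above can be sketched for a single record as follows. This is only an illustration under stated assumptions: the mapping tables (`RENAME`, `DROP`, `DEFAULTS`, `GENDER_CODES`), the field names, and the derived `unit_price` are hypothetical, and a real extraction tool would drive them from metadata rather than hard-coded constants.

```python
# Hypothetical rules; in practice these would come from metadata.
RENAME = {"cust_no": "customer_id", "custId": "customer_id", "amt": "amount"}
DROP = {"operator_terminal"}          # not meaningful for decision-making
DEFAULTS = {"region": "UNKNOWN"}      # default values for missing data
GENDER_CODES = {"M": "male", "1": "male", "F": "female", "0": "female"}


def transform(record):
    """Apply the transformation steps to one source record."""
    out = {}
    for name, value in record.items():
        if name in DROP:
            continue                              # delete unneeded fields
        out[RENAME.get(name, name)] = value       # unify data names
    for name, default in DEFAULTS.items():
        if out.get(name) in (None, ""):
            out[name] = default                   # fill missing data
    if "gender" in out:
        # Unify different definitions of the same attribute.
        out["gender"] = GENDER_CODES.get(str(out["gender"]), "unknown")
    if "amount" in out and out.get("quantity"):
        out["unit_price"] = out["amount"] / out["quantity"]  # derived data
    return out


source = {"cust_no": 7, "amt": 30.0, "quantity": 3, "gender": 1,
          "operator_terminal": "T9", "region": ""}
loaded = transform(source)
```

Keeping each rule in its own table makes it easier to see which transformation produced a given value when a report looks wrong.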

Database

The database is the core of the entire data warehouse environment: it stores the data and supports data retrieval. Compared with operational databases, its outstanding features are support for massive data and fast retrieval technology.

Metadata

Metadata is the data that describes the structure and creation method of the data in the data warehouse. It falls into two categories according to its use: technical metadata and business metadata.

Technical metadata is the data that data warehouse designers and administrators use to develop and manage the data warehouse day to day. It includes: data source information; descriptions of data transformations; definitions of the objects and data structures in the data warehouse; rules for data cleaning and data updating; mappings from source data to target data; user access rights; and the history of data backups, data imports, and information releases.

Business metadata describes the data in the data warehouse from a business perspective. It includes descriptions of business subjects and of the data, queries, and reports they contain.

Metadata provides an information directory for accessing the data warehouse. This directory comprehensively describes what data is in the data warehouse, how the data was obtained, and how to access it. It is the center of data warehouse operation and maintenance: the data warehouse server uses it to store and update data, and users use it to understand and access data.
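A minimal sketch of such an information directory might look like the following. The classes `TableMetadata` and `InformationDirectory`, their fields, and the sample entry are hypothetical; they only show how technical and business metadata can sit side by side in one catalog entry.

```python
from dataclasses import dataclass, field


@dataclass
class TableMetadata:
    name: str
    source: str                 # technical: where the data comes from
    transformation: str         # technical: how the data was transformed
    business_description: str   # business: what the data means
    columns: dict = field(default_factory=dict)  # column name -> description


class InformationDirectory:
    """A tiny in-memory catalog of what is in the warehouse."""

    def __init__(self):
        self._tables = {}

    def register(self, meta):
        self._tables[meta.name] = meta

    def describe(self, name):
        return self._tables[name]

    def search(self, keyword):
        """Return names of tables whose description or columns match."""
        kw = keyword.lower()
        return sorted(
            name for name, meta in self._tables.items()
            if kw in meta.business_description.lower()
            or any(kw in column.lower() for column in meta.columns)
        )


directory = InformationDirectory()
directory.register(TableMetadata(
    name="daily_sales",
    source="billing system export",
    transformation="summed per customer per day",
    business_description="Revenue per customer and day",
    columns={"customer_id": "unified customer key", "amount": "billed amount"},
))
```

A user could then call `directory.search("revenue")` to find relevant tables and `directory.describe("daily_sales")` to see where the data came from.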

Data Mart

A data mart is a part of the data separated out of the data warehouse for a specific application purpose or scope; it can also be called departmental data or subject data (subject area). When implementing a data warehouse, you can often start with the data mart of one department and then combine several data marts into a complete data warehouse. Note that when implementing different data marts, field definitions with the same meaning must be compatible, so that they do not cause serious trouble when the data warehouse is built later.
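The compatibility requirement can be checked mechanically. The sketch below assumes each data mart is described as a mapping from field name to a type string; the function `find_conflicts`, the mart names, and the type strings are hypothetical.

```python
def find_conflicts(marts):
    """Return fields that are defined with different types in different marts.

    marts maps a mart name to a {field name: type string} dictionary.
    """
    seen = {}
    for mart, fields in marts.items():
        for field_name, field_type in fields.items():
            seen.setdefault(field_name, {}).setdefault(field_type, []).append(mart)
    # A field is in conflict when more than one distinct type was recorded.
    return {name: by_type for name, by_type in seen.items() if len(by_type) > 1}


# Hypothetical field definitions from two departmental data marts.
marts = {
    "sales": {"customer_id": "INTEGER", "amount": "DECIMAL(12,2)"},
    "billing": {"customer_id": "VARCHAR(20)", "amount": "DECIMAL(12,2)"},
}
conflicts = find_conflicts(marts)
```

Running a check like this whenever a new data mart is added surfaces incompatible definitions early, before the marts are combined.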

In the well-known Gartner report on data mart products, the agile business intelligence products in the first quadrant include QlikView, Tableau, and SpotView, all of which are fully in-memory computing data mart products. In the big data area they pose a challenge to the traditional business intelligence giants. BI products in China started late; well-known agile business intelligence products there include PowerBI, Yonghong Technology's Z-Suite, SmartBI, and FineBI. Among them, Yonghong Technology's Z-Data Mart is a popular in-memory computing data mart product. Deon Information, also in China, is a system integrator of data mart products.

Data warehouse management

Data warehouse management covers security and privilege management; tracking data updates; data quality checks; managing and updating metadata; auditing and reporting data warehouse usage and status; deleting data; copying, splitting, and distributing data; backup and recovery; and storage management.

Information publishing system

An information publishing system sends data in the data warehouse, or other related data, to different locations or users. A web-based information publishing system is the most effective way to handle multi-user access.

Access tool

Access tools provide users with the means to access the data warehouse. They include data query and reporting tools; application development tools; executive information system (EIS) tools; online analytical processing (OLAP) tools; and data mining tools.

Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn