What technology is needed for data warehouse

anonymity
2019-05-07 10:24:51

Data warehousing is a family of application technologies that grew out of database system technology in response to the business needs of information systems, and that has gradually become a discipline in its own right. Two processing models matter most when discussing data warehouses: OLTP and OLAP. They are analyzed below.

## 1. OLTP and OLAP

OLTP stands for Online Transaction Processing. OLTP systems mainly use traditional relational databases for transaction processing. Their core requirement is fast, efficient handling of individual records. Techniques such as indexing and sharding (splitting data across multiple databases and tables) exist fundamentally to meet this requirement.
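As a minimal sketch of the two techniques named above, the following uses Python's built-in `sqlite3` module. The `orders` table, the `idx_orders_user` index, and the `shard_for` routing function are hypothetical names chosen for illustration, and the hash-modulo scheme is only one of several possible sharding strategies.

```python
import sqlite3
import zlib

# Hypothetical OLTP table: one row per order (illustrative schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (order_id, user_id, amount) VALUES (?, ?, ?)",
    [(i, i % 100, float(i)) for i in range(1, 1001)],
)

# Secondary index so that lookups by user_id do not scan the whole table.
conn.execute("CREATE INDEX idx_orders_user ON orders (user_id)")


def plan_for(sql, params=()):
    """Return SQLite's query plan text for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " ".join(row[-1] for row in rows)


# Plan for a typical single-key OLTP lookup.
point_lookup_plan = plan_for("SELECT * FROM orders WHERE user_id = ?", (42,))


def shard_for(user_id: int, num_shards: int = 4) -> int:
    """Route a key to a shard with a stable hash (assumed scheme)."""
    return zlib.crc32(str(user_id).encode()) % num_shards
```

Printing `point_lookup_plan` should show the index being used for the lookup. `shard_for` always maps the same `user_id` to the same shard, which is the property a router needs to find a record again.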

OLAP stands for Online Analytical Processing. OLAP systems process and aggregate large volumes of data. Unlike an OLTP database, which must handle inserts, deletes, updates, and concurrency control, an OLAP system generally only has to serve queries, and its data is loaded in batches. Query responses can therefore be sped up considerably with techniques such as columnar storage, column compression, and bitmap indexes.
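The three techniques can be illustrated with a toy, pure-Python sketch. The sample rows and the helper names `run_length_encode` and `build_bitmap_index` are assumptions made for this example. Real OLAP engines implement these ideas far more elaborately.

```python
from collections import defaultdict
from itertools import groupby

# Hypothetical rows loaded in one batch: (region, product, amount).
rows = [
    ("east", "book", 10.0),
    ("east", "pen", 2.5),
    ("west", "book", 12.0),
    ("west", "book", 11.0),
    ("east", "book", 9.5),
]

# Columnar storage: keep each column in its own list instead of row by row.
region, product, amount = (list(col) for col in zip(*rows))


def run_length_encode(values):
    """Compress a column into (value, run_length) pairs."""
    return [(value, sum(1 for _ in group)) for value, group in groupby(values)]


def build_bitmap_index(values):
    """Map each distinct value to an int whose bit i is set if row i has it."""
    bitmaps = defaultdict(int)
    for pos, value in enumerate(values):
        bitmaps[value] |= 1 << pos
    return dict(bitmaps)


region_index = build_bitmap_index(region)
product_index = build_bitmap_index(product)

# WHERE region = 'east' AND product = 'book' becomes a single bitwise AND.
match = region_index["east"] & product_index["book"]
matching_rows = [pos for pos in range(len(rows)) if (match >> pos) & 1]

# Only the amount column is read to compute the aggregate.
total = sum(amount[pos] for pos in matching_rows)
```

Here `matching_rows` is `[0, 4]` and `total` is `19.5`. Because repeated values sit next to each other in a column, `run_length_encode(region)` shrinks five entries to three pairs.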

## 2. Simple comparison of OLTP and OLAP data

[Figure: comparison of OLTP and OLAP]

## 3. Data warehouse logical architecture design

Offline data warehouses are usually built on dimensional modeling theory and are usually divided into logical layers. Layering is motivated mainly by the following considerations:

1. Isolation: Users should work with data that the data team has carefully processed, rather than raw data from the business systems. This has two advantages. First, users get well-prepared, standardized, clean data organized from a business perspective, which is easy to understand and use. Second, if an upstream business system changes or is even rebuilt (table structures, fields, business meaning, and so on), the data team absorbs those changes and minimizes the impact on downstream users.

2. Performance and maintainability: Specialized work is best left to specialists. With layering, nearly all data processing happens within the data team, so the same business logic does not have to be executed repeatedly, which saves the corresponding storage and compute overhead. Layering also makes the warehouse clearer and easier to maintain: each layer is responsible only for its own tasks, so if processing goes wrong in one layer, only that layer needs to be fixed.

3. Standardization: For any company or organization, consistent metric definitions are very important. When people discuss a metric, they must be relying on a clear, agreed definition. Tables, fields, and metrics must all be standardized.

4. ODS layer: Tables from the warehouse's source systems are usually stored unchanged in what is called the ODS (Operational Data Store) layer, often also called the staging area. It is the source for the downstream warehouse layers, that is, the fact and dimension tables produced through Kimball dimensional modeling and the summary data derived from those fact and detail tables. The ODS layer also keeps historical incremental or full data.

5. DWD and DWS layers: The Data Warehouse Detail (DWD) and Data Warehouse Summary (DWS) layers are the core of the data warehouse. Their data is produced from the ODS layer through ETL (cleaning, transformation, and loading). They are usually built on Kimball's dimensional modeling theory, and conformed dimensions together with the data bus keep dimensions consistent across subject areas.

6. Application layer (ADS): The application layer consists mainly of the data marts (DM) that each business line or department builds on top of DWD and DWS. The term data mart is used in contrast to the data warehouse (DW), which here means the DWD and DWS layers. Application-layer data generally comes from the DW layer, and in principle direct access to the ODS layer is not allowed. Compared with the DW layer, the application layer contains only the detail and summary data that a given department or party cares about.
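The flow from ODS through DWD and DWS to the application layer can be sketched with an in-memory SQLite database. Every table name, column, and sample value below is hypothetical. A production warehouse would normally run on a dedicated engine with scheduled ETL jobs rather than a single script.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- ODS: raw copy of a hypothetical source table, stored unchanged
CREATE TABLE ods_orders (
  order_id INTEGER, user_id INTEGER, region TEXT, amount TEXT, order_date TEXT
);
INSERT INTO ods_orders VALUES
  (1, 10, ' East', '10.0', '2019-05-01'),
  (2, 11, 'west',  '12.5', '2019-05-01'),
  (3, 10, 'EAST',  '7.5',  '2019-05-02'),
  (4, 12, 'west',  NULL,   '2019-05-02');

-- Dimension table shared by the layers built on top of it
CREATE TABLE dim_region (region_key INTEGER PRIMARY KEY, region_name TEXT UNIQUE);
INSERT INTO dim_region (region_name)
  SELECT DISTINCT lower(trim(region)) FROM ods_orders ORDER BY 1;

-- DWD: cleaned detail fact table (typed amounts, normalized region keys)
CREATE TABLE dwd_fact_orders AS
  SELECT o.order_id, o.user_id, d.region_key,
         CAST(o.amount AS REAL) AS amount, o.order_date
  FROM ods_orders o
  JOIN dim_region d ON d.region_name = lower(trim(o.region))
  WHERE o.amount IS NOT NULL;

-- DWS: summary per region and day, computed once and reused downstream
CREATE TABLE dws_region_daily AS
  SELECT region_key, order_date,
         COUNT(*) AS order_count, SUM(amount) AS total_amount
  FROM dwd_fact_orders
  GROUP BY region_key, order_date;

-- ADS: data mart view for one department, reading only from the DW layers
CREATE VIEW ads_sales_by_region AS
  SELECT d.region_name, SUM(s.total_amount) AS total_amount
  FROM dws_region_daily s
  JOIN dim_region d USING (region_key)
  GROUP BY d.region_name;
""")

report = conn.execute(
    "SELECT region_name, total_amount FROM ads_sales_by_region ORDER BY region_name"
).fetchall()
```

With the sample data, `report` is `[('east', 17.5), ('west', 12.5)]`. The cleaning (trimming, lowercasing, dropping the row with a missing amount) happens once in the DWD step, so the application view never touches `ods_orders` directly.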
