The basic functions of a data warehouse include: 1. ETL design, covering data extraction and synchronization, data cleaning, and data conversion; 2. data layering, generally into an ODS layer, a CM layer, and an ML layer; 3. preliminary data modeling.
The operating environment of this tutorial: Windows 7 system, Dell G3 computer.
A data warehouse (Data Warehouse, abbreviated DW or DWH) is a strategic collection of data that supports decision-making processes at all levels of an enterprise with all types of data. It is a single data store created for analytical reporting and decision support. For enterprises that need business intelligence, it provides guidance on business process improvement and on monitoring time, cost, quality, and control.
Basic functions of a data warehouse
ETL design: data extraction and synchronization, data cleaning, and data conversion. The data sources involved include relational databases (MySQL, MariaDB, Oracle, etc.) and document databases (MongoDB, Elasticsearch, etc.).
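As a rough illustration of the three ETL steps named above, the following sketch runs extraction, cleaning, and conversion over a few in-memory rows. The rows, field names, and function names (`RAW_ROWS`, `extract`, `clean`, `convert`, `run_etl`) are hypothetical and stand in for records that would really be pulled from a source database; the cleaning rules are only one possible choice.

```python
from datetime import datetime

# Hypothetical source rows, standing in for records read from MySQL, MongoDB, etc.
RAW_ROWS = [
    {"order_id": "1001", "customer": " Alice ", "amount": "19.90", "created": "2023-01-05"},
    {"order_id": "1002", "customer": "Bob", "amount": "", "created": "2023-01-06"},
    {"order_id": "1001", "customer": " Alice ", "amount": "19.90", "created": "2023-01-05"},
    {"order_id": "1003", "customer": "carol", "amount": "5.00", "created": "2023-01-07"},
]


def extract():
    """Extraction: copy the rows out of the (simulated) source."""
    return [dict(row) for row in RAW_ROWS]


def clean(rows):
    """Cleaning: trim whitespace, drop rows without an amount, drop repeated order ids."""
    seen = set()
    cleaned = []
    for row in rows:
        row = {key: value.strip() for key, value in row.items()}
        if not row["amount"] or row["order_id"] in seen:
            continue
        seen.add(row["order_id"])
        cleaned.append(row)
    return cleaned


def convert(rows):
    """Conversion: cast text fields to typed values and normalize name casing."""
    return [
        {
            "order_id": int(row["order_id"]),
            "customer": row["customer"].title(),
            "amount": float(row["amount"]),
            "created": datetime.strptime(row["created"], "%Y-%m-%d").date(),
        }
        for row in rows
    ]


def run_etl():
    """Chain the three steps in order."""
    return convert(clean(extract()))
```

Calling `run_etl()` on these sample rows keeps two orders: the duplicate of order 1001 and the row with an empty amount are discarded during cleaning.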
Data layering: generally divided into an ODS layer, a CM layer, and an ML layer. The ODS layer holds raw, unprocessed data. The CM layer holds data that has been cleaned and merged. The ML layer holds the modeled data described in the next item.
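The layering can be pictured as a chain of tables, each derived from the one before it. The sketch below uses Python's built-in `sqlite3` module only as a stand-in for a real warehouse engine; the table names (`ods_orders`, `cm_orders`, `ml_customer_totals`), columns, and sample rows are assumptions made for illustration.

```python
import sqlite3


def build_layers():
    """Build three illustrative layers in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # ODS layer: raw data stored exactly as received (text values, duplicates, gaps).
    cur.execute("CREATE TABLE ods_orders (order_id TEXT, customer TEXT, amount TEXT)")
    cur.executemany(
        "INSERT INTO ods_orders VALUES (?, ?, ?)",
        [
            ("1", " alice ", "10.5"),
            ("1", " alice ", "10.5"),  # duplicate record
            ("2", "bob", None),        # missing amount
            ("3", "bob", "4.5"),
        ],
    )

    # CM layer: cleaned and merged data (typed, trimmed, de-duplicated).
    cur.execute(
        """
        CREATE TABLE cm_orders AS
        SELECT DISTINCT
               CAST(order_id AS INTEGER) AS order_id,
               TRIM(customer)            AS customer,
               CAST(amount AS REAL)      AS amount
        FROM ods_orders
        WHERE amount IS NOT NULL
        """
    )

    # ML layer: modeled data prepared for downstream consumers.
    cur.execute(
        """
        CREATE TABLE ml_customer_totals AS
        SELECT customer, COUNT(*) AS order_count, SUM(amount) AS total_amount
        FROM cm_orders
        GROUP BY customer
        """
    )
    conn.commit()
    return conn
```

Keeping the raw ODS table untouched means the downstream layers can be rebuilt whenever a cleaning rule changes.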
Preliminary data modeling: this corresponds to the ML layer of the data layering. A relational model (snowflake model) or a star model is generally used to build wide tables that provide data support to external consumers.
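A minimal sketch of the star-model case, assuming one fact table and two dimension tables that are joined into a single wide table. SQLite is used here only for convenience, and every table name, column, and row (`fact_sales`, `dim_product`, `dim_store`, `wide_sales`) is hypothetical. In a snowflake model the dimension tables would themselves be split into further normalized tables, which this example does not show.

```python
import sqlite3


def build_wide_table():
    """Create a tiny star schema and flatten it into one wide table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Dimension tables: descriptive attributes.
        CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, product_name TEXT, category TEXT);
        CREATE TABLE dim_store (store_id INTEGER PRIMARY KEY, store_name TEXT, city TEXT);

        -- Fact table: measures plus foreign keys to the dimensions.
        CREATE TABLE fact_sales (
            sale_id INTEGER PRIMARY KEY,
            product_id INTEGER,
            store_id INTEGER,
            quantity INTEGER,
            revenue REAL
        );

        INSERT INTO dim_product VALUES (1, 'Keyboard', 'Peripherals'), (2, 'Monitor', 'Displays');
        INSERT INTO dim_store VALUES (10, 'Downtown', 'Shanghai'), (20, 'Airport', 'Beijing');
        INSERT INTO fact_sales VALUES (100, 1, 10, 2, 59.8), (101, 2, 20, 1, 199.0);

        -- Wide table: the fact rows joined with the attributes of both dimensions.
        CREATE TABLE wide_sales AS
        SELECT f.sale_id, p.product_name, p.category, s.store_name, s.city, f.quantity, f.revenue
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        JOIN dim_store AS s ON s.store_id = f.store_id;
        """
    )
    return conn
```

Consumers can then query `wide_sales` directly without repeating the joins, at the cost of storing the dimension attributes redundantly.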
Technologies involved: HDFS, Hive, HBase, MapReduce (MR), Spark, YARN, etc.
Data warehouse architecture
The following figure shows a data architecture planned with reference to the data architectures of several companies the author encountered at work. It is for reference only.