The differences between a data warehouse and a database: 1. A database stores raw data without further processing, while a data warehouse is designed for data analysis, so the source data goes through an ETL process before it is stored. That process consists of extraction, cleaning, transformation, and loading. 2. A data warehouse holds a much larger volume of data than a database.
1. Data warehouse
- What is a data warehouse?
A data warehouse (abbreviated DW or DWH) is a strategic collection of data of all types that supports decision-making and planning at every level of an enterprise. It is built for analytical reporting and decision support. Enterprises that need business intelligence use it to guide business process improvement and to monitor and control time, cost, and quality.
- What can a data warehouse do? (A few examples)
- Setting annual sales targets: targets must be decided on the basis of past historical reports and cannot be set arbitrarily.
- Optimizing business processes
For example: for a certain brand of mobile phone on an e-commerce platform, which age group made up the main buyers over the past five years, and in which season were purchases highest? Based on these patterns, you can identify the main needs of the target group and dynamically adjust production volume and warehouse inventory.
- Characteristics of a data warehouse
- The data warehouse is subject-oriented.
- Unlike traditional databases, data warehouses are subject-oriented. So what is a subject? A subject is a high-level concept: it is the object around which data in an enterprise information system is summarized, classified, and analyzed at a higher level. Logically, it corresponds to the object of analysis in some macro-level analysis domain of the enterprise. In plain terms, subjects are the key aspects that users care about when they use the data warehouse to make decisions. A subject usually relates to several operational information systems, whereas the data in operational databases is organized around transaction-processing tasks, each isolated from the others.
- The data warehouse is integrated.
- The data in a data warehouse is extracted from originally scattered databases (MySQL and other relational databases). Operational databases differ greatly from DSS (decision support system) analytical databases. First, the source data for each subject is spread across the scattered databases with many duplicates and inconsistencies, and data from different online systems is tied to different application logic. Second, the summarized data in the data warehouse cannot be obtained directly from the original database systems. Therefore, data must be unified and summarized before it enters the data warehouse. This is the most critical and complex step in building a data warehouse. The work to be done is:
- Reconcile all inconsistencies in the source data, such as fields with the same name but different meanings, fields with different names but the same meaning, inconsistent units, and inconsistent field lengths.
- Perform data summarization and calculation. Summaries can be generated while data is extracted from the original databases, but many are generated inside the data warehouse, after the data has been loaded.
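The two integration tasks described here (reconciling names and units, then summarizing) can be sketched roughly as follows. This is only an illustration: the two source systems, the field names (`cust_name`, `customer`, `amount_cents`, `amount_yuan`) and the choice of yuan as the common unit are all hypothetical and do not come from any real system.

```python
from collections import defaultdict

# Hypothetical extracts from two source systems that describe the same
# kind of fact with different field names and different units.
ORDERS_SYSTEM_A = [
    {"cust_name": "alice", "amount_cents": 1250},
    {"cust_name": "bob", "amount_cents": 800},
]
ORDERS_SYSTEM_B = [
    {"customer": "Alice", "amount_yuan": 30.0},
]


def unify_a(row):
    """Map a system A row to the common schema (amount in yuan)."""
    return {
        "customer": row["cust_name"].lower(),
        "amount": row["amount_cents"] / 100,
    }


def unify_b(row):
    """Map a system B row to the same common schema."""
    return {
        "customer": row["customer"].lower(),
        "amount": row["amount_yuan"],
    }


def integrate(rows_a, rows_b):
    """Step 1: reconcile names and units. Step 2: total the amounts per customer."""
    unified = [unify_a(r) for r in rows_a] + [unify_b(r) for r in rows_b]
    totals = defaultdict(float)
    for row in unified:
        totals[row["customer"]] += row["amount"]
    return unified, dict(totals)


unified_rows, totals_by_customer = integrate(ORDERS_SYSTEM_A, ORDERS_SYSTEM_B)
```

In this sketch the unification happens at extraction time and the per-customer totals are computed afterwards, which mirrors the point that many summaries are produced only once the data is inside the warehouse.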
- The data in the data warehouse changes with time.
- Applications cannot update the data in a data warehouse. That is, users of the data warehouse do not perform update operations while analyzing and processing data. This does not mean, however, that the data never changes during its whole life cycle, from the moment it is integrated into the data warehouse until it is finally deleted.
- The data in the data warehouse changes with time, which is one of the characteristics of a data warehouse. This shows up in three ways:
- The data warehouse keeps adding new data as time passes. The data warehouse system must continuously capture changed data in the OLTP databases and append it to the data warehouse. In other words, it continuously generates snapshots of the OLTP databases and, after unified integration, adds them to the data warehouse. Existing snapshots are never changed: when new changes are captured, a new snapshot is generated and added, and the original snapshots are left unmodified.
- The data warehouse keeps deleting old data as time passes. Data in the data warehouse also has a retention period, and once that period expires the data is deleted. The retention period in a data warehouse is simply much longer than in an operational environment. An operational environment generally keeps only 60 to 90 days of data, while a data warehouse needs to keep data for a much longer period (for example, 5 to 10 years) to meet the DSS requirement for trend analysis.
- The data warehouse contains a large amount of summarized data, and much of it is time-related. For example, data is often summarized by time period or sampled at certain time slices. Such data has to be re-summarized continuously as time passes. For this reason, the key of data in a data warehouse includes a time element that indicates the historical period the data belongs to.
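The first two behaviours (append new snapshots without touching old ones, and delete data past its retention period) can be sketched with Python's built-in `sqlite3` module. The table name `balance_snapshot`, its columns, and the month-level retention cutoff are assumptions made for illustration only; a real warehouse would use its own schema and loading tools.

```python
import sqlite3
from datetime import date


def create_warehouse():
    """Create an in-memory table whose key includes the snapshot date."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE balance_snapshot ("
        " snapshot_date TEXT NOT NULL,"
        " account TEXT NOT NULL,"
        " balance REAL NOT NULL,"
        " PRIMARY KEY (snapshot_date, account))"
    )
    return conn


def append_snapshot(conn, snapshot_date, balances):
    """Add a new snapshot; rows of earlier snapshots are never updated."""
    conn.executemany(
        "INSERT INTO balance_snapshot VALUES (?, ?, ?)",
        [(snapshot_date.isoformat(), acct, bal) for acct, bal in balances.items()],
    )
    conn.commit()


def purge_expired(conn, today, keep_years):
    """Delete snapshots older than the retention period (month granularity)."""
    cutoff = date(today.year - keep_years, today.month, 1).isoformat()
    cur = conn.execute(
        "DELETE FROM balance_snapshot WHERE snapshot_date < ?", (cutoff,)
    )
    conn.commit()
    return cur.rowcount


conn = create_warehouse()
append_snapshot(conn, date(2012, 1, 1), {"acct-1": 100.0})
append_snapshot(conn, date(2023, 1, 1), {"acct-1": 250.0})
append_snapshot(conn, date(2024, 1, 1), {"acct-1": 300.0})

removed = purge_expired(conn, today=date(2024, 6, 1), keep_years=10)
remaining = [
    row[0]
    for row in conn.execute(
        "SELECT snapshot_date FROM balance_snapshot ORDER BY snapshot_date"
    )
]
```

Because `snapshot_date` is part of the primary key, the same account can appear once per snapshot, which is one simple way to carry the time element in the key.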
- The data in the data warehouse cannot be modified.
- The data in the data warehouse is mainly used for enterprise decision analysis. The operations involved are mostly queries, and modifications are generally not performed. The data reflects historical content over a long period: it is a collection of database snapshots taken at different points in time, plus derived data produced by aggregating, summarizing, and reorganizing those snapshots, rather than data being processed online. Data from the online-processing databases is integrated and loaded into the data warehouse. Once data has exceeded the warehouse's retention period, it is deleted from the current data warehouse. Because a data warehouse mostly performs query operations, its management system is much simpler than that of a database. Many of the hard problems in database management systems, such as integrity protection and concurrency control, are largely avoided in data warehouse management. However, because queries in a data warehouse often touch large amounts of data, query performance requirements are higher, which calls for a variety of sophisticated indexing techniques. At the same time, the data warehouse serves the senior management of commercial enterprises, who place higher demands on the friendliness of the query interface and on how the data is presented.
2. The difference between data warehouse and database
- Before looking at the difference, we need to understand three concepts: database software, databases, and data warehouses.
- Database software: a kind of software (not the graphical client that connects to the database). It implements the logical processes of the database and belongs to the physical layer.
- Database: a logical concept, a store for data that is implemented through database software. A database is composed of many tables. Tables are two-dimensional: each table has many fields arranged in a row, and data is written into the table row by row. Database tables can express multidimensional relationships in two dimensions. Examples: Oracle, DB2, MySQL, Sybase, MS SQL Server.
- Data warehouse: an upgrade of the database concept. Logically, there is no difference between a database and a data warehouse: both are places where data is stored through database software. In terms of data volume, however, a data warehouse is much larger than a database. A data warehouse is mainly used for data mining and data analysis, to help leaders make decisions.
- In an IT architecture a database must exist, because there has to be a place to store data. Take online shopping and other e-commerce: item inventory, item prices, and user account balances are all stored in a back-end database. An even simpler example is the account and password we use for WeChat, Weibo, or QQ. The back-end database must contain a user table with at least two fields, username and password, and the data is stored in that table row by row. When we log in, we enter the username and password, and they are sent to the back end and matched against the data in the table. If the match succeeds, we are logged in; if it fails, an error is reported. This is a database, and a database is used for work in the production environment. All business-related applications use databases.
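The login flow just described can be sketched with Python's built-in `sqlite3` module. The table name `users` and the function names are hypothetical. One deliberate deviation from the two-field description: instead of a plain `password` column, this sketch stores a random salt and a PBKDF2 hash, which is an assumption added here because storing plain-text passwords is unsafe.

```python
import hashlib
import hmac
import os
import sqlite3


def hash_password(password, salt):
    """Derive a hash from the password and salt with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def create_user_db():
    """Create an in-memory database with a single user table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users ("
        " username TEXT PRIMARY KEY,"
        " salt BLOB NOT NULL,"
        " password_hash BLOB NOT NULL)"
    )
    return conn


def register(conn, username, password):
    """Store one row per user: the data is written into the table row by row."""
    salt = os.urandom(16)
    conn.execute(
        "INSERT INTO users VALUES (?, ?, ?)",
        (username, salt, hash_password(password, salt)),
    )
    conn.commit()


def login(conn, username, password):
    """Return True if the submitted credentials match the stored row."""
    row = conn.execute(
        "SELECT salt, password_hash FROM users WHERE username = ?", (username,)
    ).fetchone()
    if row is None:
        return False  # unknown user
    salt, stored_hash = row
    return hmac.compare_digest(stored_hash, hash_password(password, salt))


conn = create_user_db()
register(conn, "alice", "s3cret")
ok = login(conn, "alice", "s3cret")
wrong_password = login(conn, "alice", "guess")
unknown_user = login(conn, "bob", "s3cret")
```

Each login reads a single row by its key, which is the typical access pattern of an operational database.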
- The data warehouse is one of the technologies under BI. Because a database is tied to business applications, a single database cannot contain all of a company's data. Database tables are usually designed for a particular application. In the login function above, for example, the user table has only those two fields and no others. For that application the table does what it should, and there is no problem, but it is not suitable for analysis. For example: in which time period are the most users active? Which user purchases the most in a year? To answer questions like these, we would need to redesign the table structure of the database. The concept of the data warehouse was introduced for data analysis and data mining. The table structure of a data warehouse is designed around the analysis requirements, analysis dimensions, and analysis metrics.
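To make the idea of an analysis-oriented table structure concrete, here is a minimal sketch using `sqlite3`. It uses one dimension table and one fact table, a very small instance of a star-style layout. The table names (`dim_user`, `fact_purchase`), the columns, and the sample rows are hypothetical; they were chosen only so that the two example questions can be answered with simple aggregate queries.

```python
import sqlite3


def build_warehouse():
    """Create a tiny dimension table and fact table with sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dim_user (
            user_id  INTEGER PRIMARY KEY,
            username TEXT NOT NULL
        );
        CREATE TABLE fact_purchase (
            user_id      INTEGER NOT NULL REFERENCES dim_user(user_id),
            purchased_at TEXT NOT NULL,   -- 'YYYY-MM-DD HH:MM:SS'
            amount       REAL NOT NULL
        );
        """
    )
    conn.executemany(
        "INSERT INTO dim_user VALUES (?, ?)", [(1, "alice"), (2, "bob")]
    )
    conn.executemany(
        "INSERT INTO fact_purchase VALUES (?, ?, ?)",
        [
            (1, "2023-03-01 09:15:00", 20.0),
            (2, "2023-03-01 09:40:00", 35.0),
            (1, "2023-07-12 20:05:00", 50.0),
            (1, "2024-01-05 09:30:00", 10.0),
        ],
    )
    conn.commit()
    return conn


def busiest_hour(conn):
    """Hour of day (00-23) in which the most distinct users purchased."""
    row = conn.execute(
        "SELECT strftime('%H', purchased_at) AS hour,"
        "       COUNT(DISTINCT user_id) AS users"
        " FROM fact_purchase"
        " GROUP BY hour"
        " ORDER BY users DESC, hour"
        " LIMIT 1"
    ).fetchone()
    return row[0]


def top_buyer(conn, year):
    """Username and total spend of the user who purchased the most in a year."""
    return conn.execute(
        "SELECT u.username, SUM(f.amount) AS total"
        " FROM fact_purchase AS f"
        " JOIN dim_user AS u ON u.user_id = f.user_id"
        " WHERE strftime('%Y', f.purchased_at) = ?"
        " GROUP BY u.user_id, u.username"
        " ORDER BY total DESC"
        " LIMIT 1",
        (str(year),),
    ).fetchone()


conn = build_warehouse()
peak_hour = busiest_hour(conn)
best_customer = top_buyer(conn, 2023)
```

Neither question could be answered from a table that holds only username and password, which is the point of designing warehouse tables around the analysis dimensions (here: user and time) and metrics (here: purchase amount).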
- The difference between a database and a data warehouse is really the difference between OLTP and OLAP.
- Operational processing, called On-Line Transaction Processing (OLTP), can also be described as transaction-oriented processing. It covers the day-to-day online operations of a specific business on the database, usually querying and modifying a small number of records. Users care most about operation response time, data security, integrity, and the number of concurrent users supported. As the main means of data management, traditional database systems are used mainly for operational processing.
- Analytical processing, called On-Line Analytical Processing (OLAP), generally analyzes historical data on certain subjects to support management decisions.
Aspect | Analytical processing (OLAP)
---|---
Data granularity | Summarized or refined
Data model | Star schema or snowflake schema
Data stored | Historical data, excluding recent data
Update pattern | Read only, append only
Unit of operation | One set of records at a time
Performance requirements | Loose
Orientation | Analysis-oriented
Purpose | Supports decision-making needs
Data volume | Large
Typical applications | Customer revenue analysis, market segmentation
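The contrast between the two processing styles can be sketched in a few lines with `sqlite3`. The `account` and `order_history` tables and their contents are hypothetical. The OLTP-style function modifies two records inside one transaction and relies on an integrity constraint. The OLAP-style function only reads and aggregates the whole history.

```python
import sqlite3


def create_db():
    """Create sample tables for the two workloads."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE account (
            id      INTEGER PRIMARY KEY,
            balance REAL NOT NULL CHECK (balance >= 0)
        );
        CREATE TABLE order_history (
            order_month TEXT NOT NULL,   -- 'YYYY-MM'
            revenue     REAL NOT NULL
        );
        """
    )
    conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100.0), (2, 20.0)])
    conn.executemany(
        "INSERT INTO order_history VALUES (?, ?)",
        [("2023-01", 500.0), ("2023-01", 250.0), ("2023-02", 400.0)],
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """OLTP style: update two rows atomically; roll back if a constraint fails."""
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def revenue_by_month(conn):
    """OLAP style: read-only aggregate over the full history."""
    return conn.execute(
        "SELECT order_month, SUM(revenue) FROM order_history"
        " GROUP BY order_month ORDER BY order_month"
    ).fetchall()


conn = create_db()
first_transfer = transfer(conn, 1, 2, 30.0)
overdraft = transfer(conn, 2, 1, 500.0)
balances = conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
monthly = revenue_by_month(conn)
```

The transfer depends on transactions and integrity checks, the concerns listed for OLTP, while the monthly revenue query touches every historical row but changes nothing.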
This may seem like a small topic, but I take it seriously and keep it as my own notes and experience so that I can keep improving.