Introduction to OLAP (Based on My Own Work)

WBOY

Jun 07, 2016 pm 04:11 PM

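OLTP and OLAP

Traditional database systems are mostly OLTP systems: they are built for basic operations on raw data and are not designed for analytical work.

OLTP system: performs online transaction and query processing. A typical example is a supermarket purchase, sales, and inventory system, with functions such as registration, bookkeeping, and inventory and sales records.

OLAP system: provides data analysis and decision support, organizing data in different formats to meet the needs of different users.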

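Differences:

Orientation: OLTP is customer-oriented and is used by operators such as supermarket cashiers and bank tellers. OLAP is market-oriented and is used for data analysis by data analysts, business managers who make decisions, and strategy departments.

Data content: OLTP handles current data. OLAP handles summaries and aggregations of historical data.

Database design: OLTP uses an ER model and an application-oriented database design. OLAP uses a star or snowflake model and a subject-oriented database design.

Access pattern: OLTP runs operational transactions, while OLAP runs read-only analytical computations.

And so on.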
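
The difference in access pattern can be sketched with Python's built-in sqlite3 module. This is only an illustration: the `sales` table, its columns, and the sample rows are hypothetical and do not come from any system described here.

```python
import sqlite3

# Hypothetical supermarket table; all names and values are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (id INTEGER PRIMARY KEY, day TEXT, product TEXT, amount REAL)"
)

def record_sale(day, product, amount):
    """OLTP-style access: one short read-write transaction on current data."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT INTO sales (day, product, amount) VALUES (?, ?, ?)",
            (day, product, amount),
        )

def sales_by_product():
    """OLAP-style access: a read-only aggregation over accumulated data."""
    rows = conn.execute(
        "SELECT product, SUM(amount) FROM sales GROUP BY product ORDER BY product"
    ).fetchall()
    return dict(rows)

record_sale("2016-06-01", "milk", 3.5)
record_sale("2016-06-01", "bread", 2.0)
record_sale("2016-06-02", "milk", 4.0)
totals = sales_by_product()
```

In a real deployment the analytical query would normally run against a separate warehouse rather than the transactional database; a single in-memory table is used here only to keep the sketch short.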

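Multidimensional data model:

Data cube:

Each subset of the given dimensions yields a cuboid. The cuboids present the data at different levels of summarization, that is, under different groupings (group by). The lattice of all cuboids is called the data cube.

The cuboid at the lowest level of summarization is called the base cuboid. Once any dimension has been aggregated away, the result is a non-base cuboid.

The cuboid at the highest level of summarization is called the apex cuboid, i.e. the 0-D cuboid: all dimensions have been aggregated together, only a single total remains, and no further summarization is possible.

The apex cuboid is the most generalized cuboid. The base cuboid is the least generalized (most specific) one.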
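
As a small sketch of the lattice described above, the following enumerates every cuboid for three dimensions. The dimension names `time`, `item`, and `location` are assumptions chosen for illustration.

```python
from itertools import combinations

# Hypothetical dimensions of a sales cube.
dimensions = ("time", "item", "location")

def cuboids(dims):
    """Return every subset of dims, ordered from the apex (no dimensions
    kept) to the base cuboid (all dimensions kept)."""
    return [combo for k in range(len(dims) + 1) for combo in combinations(dims, k)]

lattice = cuboids(dimensions)
apex_cuboid = lattice[0]    # 0-D: everything aggregated into one total
base_cuboid = lattice[-1]   # 3-D: no dimension aggregated
```

Each tuple in `lattice` corresponds to one possible group by over the fact data, so three dimensions give 2³ = 8 cuboids.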

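Coarser and finer granularity correspond to different degrees of summarization. The operations involved are:

Roll-up (vendors also call it drill-up): moves up the concept hierarchy of a dimension.

Drill-down: moves down the concept hierarchy of a dimension to find finer-grained data.

Slice: fixes the value of one dimension and extracts the subset of the cube for that value.

Dice: selects multiple values on multiple dimensions and extracts the sub-cube they define.

Rotate (also called pivot): changes the axes. Put simply, in a two-dimensional table it transposes rows and columns. With three or more dimensions it is more involved and swaps the positions of different axes. In fancier terms, it changes the angle from which the data is viewed.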
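
The operations above can be sketched on a tiny in-memory fact list. The rows and column names are made up for illustration, and the roll-up here is done by dropping a dimension, which is the simplest case; climbing a hierarchy such as city to country works the same way with a mapping applied first.

```python
from collections import defaultdict

# Hypothetical fact rows: (quarter, city, product, amount)
facts = [
    ("Q1", "Beijing", "milk", 10),
    ("Q1", "Shanghai", "milk", 20),
    ("Q2", "Beijing", "bread", 5),
    ("Q2", "Shanghai", "milk", 15),
    ("Q3", "Shanghai", "bread", 8),
]
COLUMNS = ("quarter", "city", "product")

def group_sum(rows, keep):
    """Sum the amount over the dimensions named in keep (a group by)."""
    idx = [COLUMNS.index(name) for name in keep]
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[i] for i in idx)] += row[3]
    return dict(totals)

detail = group_sum(facts, ("quarter", "city"))  # finer granularity
rolled_up = group_sum(facts, ("quarter",))      # roll-up: city is aggregated away
# Drill-down is the reverse step: going from rolled_up back to detail.

# Slice: fix one dimension to a single value.
slice_q1 = [r for r in facts if r[0] == "Q1"]

# Dice: choose value sets on several dimensions at once.
dice = [r for r in facts if r[0] in {"Q1", "Q2"} and r[1] in {"Shanghai"}]

# Pivot: swap the two axes of the 2-D view (quarter, city) -> (city, quarter).
pivoted = {(city, quarter): v for (quarter, city), v in detail.items()}
```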

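Concept hierarchy: maps low-level concepts (such as city) to higher-level concepts (such as country). Going from low to high is called generalization; going from high to low is called specialization.

Schema hierarchy: a concept hierarchy that is a total or partial order among attributes in the database schema.

Set-grouping hierarchy: a discretization or grouping of the attribute values of a given dimension. For example, the age attribute can be discretized into the three subsets young, mid, and old, and group by sex yields the male and female subsets.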
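
A minimal sketch of both kinds of hierarchy follows. The city-to-country mapping and the age cut-offs (30 and 60) are assumptions made up for this example; the text above does not specify them.

```python
# Hypothetical schema hierarchy: city -> country.
CITY_TO_COUNTRY = {"Beijing": "China", "Shanghai": "China", "Paris": "France"}

def generalize_city(city):
    """Generalization: map a low-level concept to its higher-level concept."""
    return CITY_TO_COUNTRY[city]

def age_group(age):
    """Set-grouping hierarchy: discretize age into young / mid / old.
    The thresholds are arbitrary assumptions."""
    if age < 30:
        return "young"
    if age < 60:
        return "mid"
    return "old"

# Rolling sales up from the city level to the country level.
sales_by_city = {"Beijing": 10, "Shanghai": 20, "Paris": 7}
sales_by_country = {}
for city, amount in sales_by_city.items():
    country = generalize_city(city)
    sales_by_country[country] = sales_by_country.get(country, 0) + amount
```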

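Implementing the data cube:

Data warehouses use a multidimensional model. The commonly used variants are:

Star schema: one large, comprehensive fact table with no redundancy, plus dimension tables for the different analysis dimensions. The dimension tables surround the fact table and are joined to it through each dimension's own dimension key (which covers all possible values of that dimension).

Snowflake schema: a further refinement of the star schema in which dimension tables that contain multi-valued attributes are normalized (an attribute of the dimension table is extracted into a new dimension table) to reduce redundancy.

Decomposing the data further into additional tables makes it easier to maintain and saves space (helping to guard against the curse of dimensionality), but queries then need more joins, which slows them down.

Fact constellation (also called galaxy schema): multiple fact tables share dimension tables.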

For example, the data warehouse I designed (Workbench):

Cube definition

Dimension definition

Data warehouses generally use a fact constellation.
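
The star and snowflake schemas described above can be sketched side by side with Python's sqlite3 module. All table and column names (`fact_sales`, `dim_product`, `dim_product_sf`, `dim_category`) are hypothetical and are not taken from the author's warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Star schema: the product dimension stores the category name directly.
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    amount REAL
);

-- Snowflake variant: category is normalized out into its own table.
CREATE TABLE dim_category (category_key INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_product_sf (
    product_key INTEGER PRIMARY KEY,
    name TEXT,
    category_key INTEGER REFERENCES dim_category(category_key)
);

INSERT INTO dim_product VALUES (1, 'milk', 'dairy'), (2, 'cheese', 'dairy'), (3, 'bread', 'bakery');
INSERT INTO dim_category VALUES (1, 'dairy'), (2, 'bakery');
INSERT INTO dim_product_sf VALUES (1, 'milk', 1), (2, 'cheese', 1), (3, 'bread', 2);
INSERT INTO fact_sales VALUES (1, 3.0), (2, 5.0), (3, 2.0), (1, 4.0);
""")

# Star: one join from the fact table to the dimension table.
star_query = """
SELECT p.category, SUM(f.amount) FROM fact_sales f
JOIN dim_product p ON p.product_key = f.product_key
GROUP BY p.category ORDER BY p.category
"""

# Snowflake: the same question needs one extra join.
snowflake_query = """
SELECT c.category, SUM(f.amount) FROM fact_sales f
JOIN dim_product_sf p ON p.product_key = f.product_key
JOIN dim_category c ON c.category_key = p.category_key
GROUP BY c.category ORDER BY c.category
"""

star_result = conn.execute(star_query).fetchall()
snowflake_result = conn.execute(snowflake_query).fetchall()
```

Both queries return the same totals. The snowflake form stores each category name once but pays for it with the additional join, which is the trade-off noted above.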

Indicator (index)

Measure

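Curse of dimensionality: when there are too many dimensions (a very complex feature space), the number of cross-dimension computations becomes very large, and concept hierarchies on the dimensions make the problem worse. In a cube, this shows up as the huge volume of data produced by combining dimensions: precomputing every cuboid (sub-cube) makes storage grow explosively. With n dimensions there are 2ⁿ cuboids, and once dimension i also has a concept hierarchy of L_i levels the total number of cuboids is larger still.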
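
The growth can be made concrete with a short calculation. The count used below, the product of (L_i + 1) over all dimensions, is the formula commonly given in data warehousing textbooks, where L_i excludes the virtual top level "all" and the +1 accounts for it. It is stated here as an assumption and is not quoted from this article.

```python
from math import prod

def cuboid_count_no_hierarchy(n):
    """Number of cuboids for n dimensions without concept hierarchies."""
    return 2 ** n

def cuboid_count_with_hierarchies(levels):
    """Number of cuboids when dimension i has levels[i] hierarchy levels.
    The +1 adds the virtual top level 'all' (an assumed convention)."""
    return prod(level + 1 for level in levels)

plain = cuboid_count_no_hierarchy(10)                # 10 dimensions, no hierarchies
with_levels = cuboid_count_with_hierarchies([4] * 10)  # 10 dimensions, 4 levels each
```

With one level per dimension the second function reduces to 2ⁿ, so the two counts agree in the hierarchy-free case. Ten dimensions with four levels each already give 5¹⁰, close to ten million cuboids.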

Precomputation: (1) no materialization, (2) full materialization, (3) partial materialization.
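
The three strategies can be sketched as one small class that differs only in which cuboids it precomputes. The `Cube` class, the fact rows, and the choice of which cuboid to materialize in the partial case are all assumptions for illustration; real systems choose the materialized set with cost-based heuristics.

```python
from collections import defaultdict
from itertools import combinations

COLUMNS = ("quarter", "city")
# Hypothetical fact rows: (quarter, city, amount)
facts = [("Q1", "Beijing", 10), ("Q1", "Shanghai", 20), ("Q2", "Beijing", 5)]

def compute_cuboid(keep):
    """Aggregate the fact rows by the dimensions named in keep."""
    idx = [COLUMNS.index(name) for name in keep]
    totals = defaultdict(int)
    for row in facts:
        totals[tuple(row[i] for i in idx)] += row[-1]
    return dict(totals)

def all_group_bys():
    """Every subset of the dimensions, i.e. every cuboid."""
    return [c for k in range(len(COLUMNS) + 1) for c in combinations(COLUMNS, k)]

class Cube:
    def __init__(self, materialized):
        # Precompute only the cuboids listed in materialized.
        self.store = {keep: compute_cuboid(keep) for keep in materialized}
        self.computed_on_demand = 0

    def query(self, keep):
        if keep in self.store:
            return self.store[keep]      # answered from precomputed data
        self.computed_on_demand += 1     # answered by scanning the facts
        return compute_cuboid(keep)

no_mat = Cube([])                    # no materialization
full_mat = Cube(all_group_bys())     # full materialization
partial_mat = Cube([("quarter",)])   # partial materialization

by_quarter = no_mat.query(("quarter",))
by_city = full_mat.query(("city",))
partial_hit = partial_mat.query(("quarter",))
partial_miss = partial_mat.query(("city",))
```

Full materialization answers every query from storage at the cost of keeping all 2ⁿ cuboids, no materialization stores nothing and recomputes every time, and partial materialization sits between the two.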
