深入理解Oracle表(5)：三大表连接方式详解之Hash Join的定义，原-Mysql Tutorial-php.cn

Home

Database

Mysql Tutorial

深入理解Oracle表(5)：三大表连接方式详解之Hash Join的定义，原

WBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWB

Jun 07, 2016 pm 03:37 PM

hashoracleWaygo deepunderstandDetailed explanationconnect

Hash Join只能用于相等连接，且只能在CBO优化器模式下。相对于nested loop join，hash join更适合处理大型结果集
Hash Join的执行计划第1个是hash表(build table)，第2个探查表(probe table)，一般不叫内外表，nested loop才有内外表
Hash表也就是所谓的内表，探查表所谓的外表
两者的执行计划形如：
nested loop
outer table --驱动表
inner table

hash join
build table (inner table) --驱动表

probe table (outer table)

先看一张图片，大致了解Hash Join的过程：

深入理解Oracle表(5)：三大表连接方式详解之Hash Join的定义，原

下面详细了解一下Hash Join

㈠ Hash join概念

  Hash join算法的一个基本思想就是根据小的row sources(称作build input 也就是前文提到的build table，我们记较小的表为S，较大的表为B)
建立一个可以存在于hash area内存中的hash table
然后用大的row sources(称作probe input，也就是前文提到的probe table) 来探测前面所建的hash table
如果hash area内存不够大，hash table就无法完全存放在hash area内存中
针对这种情况，Oracle在连接键利用一个hash函数将build input和probe input分割成多个不相连的分区
分别记作Si和Bi，这个阶段叫做分区阶段；然后各自相应的分区，即Si和Bi再做Hash join，这个阶段叫做join阶段
如果HASH表太大，无法一次构造在内存中，则分成若干个partition，写入磁盘的temporary segment，则会多一个写的代价，会降低效率
至于小表的概念，对于 hash join 来说，能容纳在 pga 中的 hash table 都可以叫小表，通常比如：
pga_aggregate_target big integer 1073741824
hash area size 大体能使用到40多 M ，这样的话通常可能容纳几十万的记录
hash area size缺省是2*sort_area_size,我们可以直接修改SORT_AREA_SIZE 的大小，HASH_AREA_SIZE也会跟着改变的
如果你的workarea_size_policy=auto，那么我们只需设定pga_aggregate_target
但请记住，这是一个session级别的参数，有时，我们更倾向于把hash_area_size的大小设成驱动表的1.6倍左右
驱动表仅仅用于nested loop join 和 hash join，但Hash join不需要在驱动表上存在索引，而nested loop join则迫切需求
一两百万记录的表 join上千万记录的表，hash join的通常表现非常好
不过，多与少，大与小，很多时候很难量化，具体情况还得具体分析
如果在分区后，针对某个分区所建的hash table还是太大的话，oracle就采用nested loop hash join
所谓的nested-loops hash join就是对部分Si建立hash table，然后读取所有的Bi与所建的hash table做连接
然后再对剩余的Si建立hash table，再将所有的Bi与所建的hash table做连接，直至所有的Si都连接完了

   ㈡ Hash Join原理

  考虑以下两个数据集：
S={1,1,1,3,3,4,4,4,4,5,8,8,8,8,10}
B={0,0,1,1,1,1,2,2,2,2,2,2,3,8,9,9,9,10,10,11}
Hash Join的第一步就是判定小表（即build input）是否能完全存放在hash area内存中
如果能完全存放在内存中，则在内存中建立hash table，这是最简单的hash join
如果不能全部存放在内存中，则build input必须分区。分区的个数叫做fan-out
Fan-out是由hash_area_size和cluster size来决定的。其中cluster size等于db_block_size * _hash_multiblock_io_count
hash_multiblock_io_count是个隐藏参数,在9.0.1以后就不再使用了

[sql] view plaincopyprint?

sys@ORCL> ed
Wrote file afiedt.buf
1 select a.ksppinm name,b.ksppstvl value,a.ksppdesc description
2 from x$ksppi a,x$ksppcv b
3 where a.indx = b.indx
4* and a.ksppinm like '%hash_multiblock_io_count%'
sys@ORCL> /
NAME VALUE DESCRIPTION
------------------------------ ----- ------------------------------------------------------------
_hash_multiblock_io_count 0 number of blocks hash join will read/write at once

Oracle采用内部一个hash函数作用于连接键上，将S和B分割成多个分区
在这里我们假设这个hash函数为求余函数，即Mod(join_column_value,10)
这样产生十个分区，如下表:

深入理解Oracle表(5)：三大表连接方式详解之Hash Join的定义，原

经过这样的分区之后，只需要相应的分区之间做join即可（也就是所谓的partition pairs）
如果有一个分区为NULL的话，则相应的分区join即可忽略
在将S表读入内存分区时，oracle即记录连接键的唯一值，构建成所谓的位图向量
它需要占hash area内存的5%左右。在这里即为{1,3,4,5,8,10}
当对B表进行分区时，将每一个连接键上的值与位图向量相比较，如果不在其中，则将其记录丢弃
在我们这个例子中，B表中以下数据将被丢弃{0,0,2,2,2,2,2,2,9,9,9,9,9}
这个过程就是位图向量过滤
当S1,B1做完连接后，接着对Si,Bi进行连接
这里oracle将比较两个分区，选取小的那个做build input，就是动态角色互换
这个动态角色互换发生在除第一对分区以外的分区上面

   ㈢ Hash Join算法

  第1步：判定小表是否能够全部存放在hash area内存中，如果可以，则做内存hash join。如果不行，转第二步
第2步：决定fan-out数
(Number of Partitions) * C
其中C为Cluster size，其值为DB_BLOCK_SIZE*HASH_MULTIBLOCK_IO_COUNT
Favm为hash area内存可以使用的百分比，一般为0.8左右
M为Hash_area_size的大小
第3步：读取部分小表S，采用内部hash函数(这里称为hash_fun_1)
将连接键值映射至某个分区，同时采用hash_fun_2函数对连接键值产生另外一个hash值
这个hash值用于创建hash table用，并且与连接键值存放在一起
第4步：对build input建立位图向量
第5步：如果内存中没有空间了，则将分区写至磁盘上
第6步：读取小表S的剩余部分，重复第三步，直至小表S全部读完
第7步：将分区按大小排序，选取几个分区建立hash table(这里选取分区的原则是使选取的数量最多)
第8步：根据前面用hash_fun_2函数计算好的hash值，建立hash table
第9步：读取表B，采用位图向量进行位图向量过滤
第10步：对通过过滤的数据采用hash_fun_1函数将数据映射到相应的分区中去，并计算hash_fun_2的hash值
第11步：如果所落的分区在内存中，则将前面通过hash_fun_2函数计算所得的hash值与内存中已存在的hash table做连接
将结果写致磁盘上。如果所落的分区不在内存中，则将相应的值与表S相应的分区放在一起
第12步：继续读取表B，重复第9步，直至表B读取完毕
第13步：读取相应的(Si,Bi)做hash连接。在这里会发生动态角色互换
第14步：如果分区过后，最小的分区也比内存大，则发生nested-loop hash join

㈣ Hash Join的成本

⑴ In-Memory Hash Join
Cost(HJ)=Read(S)+ build hash table in memory(CPU)+Read(B) + Perform In memory Join(CPU)
忽略cpu的时间，则：
Cost(HJ)=Read(S)+Read(B)

⑵ On-Disk Hash Join
根据上述的步骤描述，我们可以看出：
Cost(HJ)=Cost(HJ1)+Cost(HJ2)
其中Cost(HJ1)的成本就是扫描S,B表，并将无法放在内存上的部分写回磁盘，对应前面第2步至第12步
Cost(HJ2)即为做nested-loop hash join的成本，对应前面的第13步至第14步
其中Cost(HJ1)近似等于Read(S)+Read(B)+Write((S-M)+(B-B*M/S))
因为在做nested-loop hash join时，对每一chunk的build input，都需要读取整个probe input，因此
Cost(HJ2)近似等于Read((S-M)+n*(B-B*M/S))，其中n是nested-loop hash join需要循环的次数：n=(S/F)/M
一般情况下，如果n大于10的话，hash join的性能将大大下降
从n的计算公式可以看出，n与Fan-out成反比例，提高fan-out，可以降低n
当hash_area_size是固定时，可以降低cluster size来提高fan-out
从这里我们可以看出，提高hash_multiblock_io_count参数的值并不一定提高hash join的性能

㈤ Hash Join的过程

  一次完整的hash join如下：
  1 计算小表的分区（bucket）数--Hash分桶
决定hash join的一个重要因素是小表的分区（bucket）数
这个数字由hash_area_size、hash_multiblock_io_count和db_block_size参数共同决定
Oracle会保留hash area的20%来存储分区的头信息、hash位图信息和hash表
因此，这个数字的计算公式是：
Bucket数=0.8*hash_area_size/(hash_multiblock_io_count*db_block_size)

  2 Hash计算
读取小表数据（简称为R），并对每一条数据根据hash算法进行计算
Oracle采用两种hash算法进行计算，计算出能达到最快速度的hash值（第一hash值和第二hash值）
而关于这些分区的全部hash值（第一hash值）就成为hash表

  3 存放数据到hash内存中
将经过hash算法计算的数据，根据各个bucket的hash值（第一hash值）分别放入相应的bucket中
第二hash值就存放在各条记录中

  4 创建hash位图
与此同时，也创建了一个关于这两个hash值映射关系的hash位图

  5 超出内存大小部分被移到磁盘
如果hash area被占满，那最大一个分区就会被写到磁盘（临时表空间）上去
任何需要写入到磁盘分区上的记录都会导致磁盘分区被更新
这样的话，就会严重影响性能，因此一定要尽量避免这种情况
2-5一直持续到整个表的数据读取完毕

  6 对分区排序
为了能充分利用内存，尽量存储更多的分区，Oracle会按照各个分区的大小将他们在内存中排序

  7 读取大表数据，进行hash匹配
接下来就开始读取大表（简称S）中的数据
按顺序每读取一条记录，计算它的hash值，并检查是否与内存中的分区的hash值一致
如果是，返回join数据
如果内存中的分区没有符合的，就将S中的数据写入到一个新的分区中，这个分区也采用与计算R一样的算法计算出hash值
也就是说这些S中的数据产生的新的分区数应该和R的分区集的分区数一样。这些新的分区被存储在磁盘（临时表空间）上

  8 完全大表全部数据的读取
一直按照7进行，直到大表中的所有数据的读取完毕

  9 处理没有join的数据
这个时候就产生了一大堆join好的数据和从R和S中计算存储在磁盘上的分区

  10 二次hash计算
从R和S的分区集中抽取出最小的一个分区，使用第二种hash函数计算出并在内存中创建hash表
采用第二种hash函数的原因是为了使数据分布性更好

  11 二次hash匹配
在从另一个数据源（与hash在内存的那个分区所属数据源不同的）中读取分区数据，与内存中的新hash表进行匹配。返回join数据

  12 完成全部hash join
继续按照9-11处理剩余分区，直到全部处理完毕

   ㈥ Hash Join的模式
  Oracle中，Hash Join也有三种模式：optimal，one-pass，multi-pass
  ⑴ optimal

当驱动结果集生成的hash表全部可以放入PGA的hash area时，称为optimal，大致过程如下：
① 先根据驱动表，得到驱动结果集
② 在hash area生成hash bulket，并将若干bulket分成一组，成为一个partition，还会生成一个bitmap的列表，每个bulket在上面占一位
③ 对结果集的join键做hash运算，将数据分散到相应partition的bulket中
当运算完成后，如果键值唯一性较高的话，bulket里的数据会比较均匀，也有可能有的桶里面数据会是空的
这样bitmap上对应的标志位就是0，有数据的桶，标志位会是1
④ 开始扫描第二张表，对jion键做hash运算，确定应该到某个partition的某个bulket去探测
探测之前，会看这个bulket的bitmap是否会1，如果为0，表示没数据，这行就直接丢弃掉
⑤ 如果bitmap为1，则在桶内做精确匹配，判断OK后，返回数据
这个是最优的hash join，他的成本基本是两张表的full table scan，在加微量的hash运算
博客开篇的那幅图描述的也就是这种情况

⑵ one-pass
如果进程的pga很小，或者驱动表结果集很大，超过了hash area的大小，会怎么办？
当然会用到临时表空间，此时oracle的处理方式稍微复杂点需奥注意上面提到的有个partition的概念
可以这么理解，数据是经过两次hash运算的，先确定你的partition，再确定你的bulket
假设hash area小于整个hash table，但至少大于一个partition的size，这个时候走的就是one-pass
当我们生成好hash表后，状况是部分partition留在内存中，其他的partition留在磁盘临时表空间中
当然也有可能某个partition一半在内存，一半在磁盘，剩下的步骤大致如下：
① 扫描第二张表，对join键做hash运算，确定好对应的partition和bulket
② 查看bitmap，确定bulket是否有数据，没有则直接丢弃
③ 如果有数据，并且这个partition是在内存中的，就进入对应的桶去精确匹配，能匹配上，就返回这行数据，否则丢弃
④ 如果partition是在磁盘上的，则将这行数据放入磁盘中暂存起来，保存的形式也是partition，bulket的方式
⑤ 当第二张表被扫描完后，剩下的是驱动表和探测表生成的一大堆partition，保留在磁盘上
⑥ 由于两边的数据都按照相同的hash算法做了partition和bulket，现在只要成对的比较两边partition数据即可
并且在比较的时候，oracle也做了优化处理，没有严格的驱动与被驱动关系
他会在partition对中选较小的一个作为驱动来进行，直到磁盘上所有的partition对都join完
可以发现，相比optimal，他多出的成本是对于无法放入内存的partition，重新读取了一次，所以称为one-pass
只要你的内存保证能装下一个partition，oracle都会腾挪空间，每个磁盘partition做到one-pass

⑶ multi-pass
这是最复杂，最糟糕的hash join
此时hash area小到连一个partition也容纳不下，当扫描好驱动表后
可能只有半个partition留在hash area中，另半个加其他的partition全在磁盘上
剩下的步骤和one-pass比价类似，不同的是针对partition的处理
由于驱动表只有半个partition在内存中，探测表对应的partition数据做探测时
如果匹配不上，这行还不能直接丢弃，需要继续保留到磁盘，和驱动表剩下的半个partition再做join
这里举例的是内存可以装下半个partition，如果装的更少的话，反复join的次数将更多
当发生multi-pass时，partition物理读的次数会显著增加

㈦ Hash Join的位图
这个位图包含了每个hash分区是否有有值的信息。它记录了有数据的分区的hash值
这个位图的最大作用就是，如果probe input中的数据没有与内存中的hash表匹配上
先查看这个位图，以决定是否将没有匹配的数据写入磁盘
那些不可能匹配到的数据（即位图上对应的分区没有数据）就不再写入磁盘

㈧小结
① 确认小表是驱动表
② 确认涉及到的表和连接键分析过了
③ 如果在连接键上数据不均匀的话，建议做柱状图
④ 如果可以，调大hash_area_size的大小或pga_aggregate_target的值
⑤ Hash Join适合于小表与大表连接、返回大型结果集的连接.

Statement

The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

How to Grant Permissions to New MySQL UsersMay 09, 2025 am 12:16 AM

TograntpermissionstonewMySQLusers,followthesesteps:1)AccessMySQLasauserwithsufficientprivileges,2)CreateanewuserwiththeCREATEUSERcommand,3)UsetheGRANTcommandtospecifypermissionslikeSELECT,INSERT,UPDATE,orALLPRIVILEGESonspecificdatabasesortables,and4)

How to Add Users in MySQL: A Step-by-Step GuideMay 09, 2025 am 12:14 AM

ToaddusersinMySQLeffectivelyandsecurely,followthesesteps:1)UsetheCREATEUSERstatementtoaddanewuser,specifyingthehostandastrongpassword.2)GrantnecessaryprivilegesusingtheGRANTstatement,adheringtotheprincipleofleastprivilege.3)Implementsecuritymeasuresl

MySQL: Adding a new user with complex permissionsMay 09, 2025 am 12:09 AM

ToaddanewuserwithcomplexpermissionsinMySQL,followthesesteps:1)CreatetheuserwithCREATEUSER'newuser'@'localhost'IDENTIFIEDBY'password';.2)Grantreadaccesstoalltablesin'mydatabase'withGRANTSELECTONmydatabase.TO'newuser'@'localhost';.3)Grantwriteaccessto'

MySQL: String Data Types and CollationsMay 09, 2025 am 12:08 AM

The string data types in MySQL include CHAR, VARCHAR, BINARY, VARBINARY, BLOB, and TEXT. The collations determine the comparison and sorting of strings. 1.CHAR is suitable for fixed-length strings, VARCHAR is suitable for variable-length strings. 2.BINARY and VARBINARY are used for binary data, and BLOB and TEXT are used for large object data. 3. Sorting rules such as utf8mb4_unicode_ci ignores upper and lower case and is suitable for user names; utf8mb4_bin is case sensitive and is suitable for fields that require precise comparison.

MySQL: What length should I use for VARCHARs?May 09, 2025 am 12:06 AM

The best MySQLVARCHAR column length selection should be based on data analysis, consider future growth, evaluate performance impacts, and character set requirements. 1) Analyze the data to determine typical lengths; 2) Reserve future expansion space; 3) Pay attention to the impact of large lengths on performance; 4) Consider the impact of character sets on storage. Through these steps, the efficiency and scalability of the database can be optimized.

MySQL BLOB : are there any limits?May 08, 2025 am 12:22 AM

MySQLBLOBshavelimits:TINYBLOB(255bytes),BLOB(65,535bytes),MEDIUMBLOB(16,777,215bytes),andLONGBLOB(4,294,967,295bytes).TouseBLOBseffectively:1)ConsiderperformanceimpactsandstorelargeBLOBsexternally;2)Managebackupsandreplicationcarefully;3)Usepathsinst

MySQL : What are the best tools to automate users creation?May 08, 2025 am 12:22 AM

The best tools and technologies for automating the creation of users in MySQL include: 1. MySQLWorkbench, suitable for small to medium-sized environments, easy to use but high resource consumption; 2. Ansible, suitable for multi-server environments, simple but steep learning curve; 3. Custom Python scripts, flexible but need to ensure script security; 4. Puppet and Chef, suitable for large-scale environments, complex but scalable. Scale, learning curve and integration needs should be considered when choosing.

MySQL: Can I search inside a blob?May 08, 2025 am 12:20 AM

Yes,youcansearchinsideaBLOBinMySQLusingspecifictechniques.1)ConverttheBLOBtoaUTF-8stringwithCONVERTfunctionandsearchusingLIKE.2)ForcompressedBLOBs,useUNCOMPRESSbeforeconversion.3)Considerperformanceimpactsanddataencoding.4)Forcomplexdata,externalproc

See all articles

Hot AI Tools

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress images for free

Clothoff.io

AI clothes remover

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

How to fix KB5055523 fails to install in Windows 11?

4 weeks agoByDDD

How to fix KB5055518 fails to install in Windows 10?

4 weeks agoByDDD

Roblox: Grow A Garden - Complete Mutation Guide

3 weeks agoByDDD

Roblox: Bubble Gum Simulator Infinity - How To Get And Use Royal Keys

3 weeks agoBy尊渡假赌尊渡假赌尊渡假赌

How to fix KB5055612 fails to install in Windows 10?

3 weeks agoByDDD

Hot Tools

Safe Exam Browser

Safe Exam Browser is a secure browser environment for taking online exams securely. This software turns any computer into a secure workstation. It controls access to any utility and prevents students from using unauthorized resources.

SublimeText3 Linux new version

SublimeText3 Linux latest version

DVWA

Damn Vulnerable Web App (DVWA) is a PHP/MySQL web application that is very vulnerable. Its main goals are to be an aid for security professionals to test their skills and tools in a legal environment, to help web developers better understand the process of securing web applications, and to help teachers/students teach/learn in a classroom environment Web application security. The goal of DVWA is to practice some of the most common web vulnerabilities through a simple and straightforward interface, with varying degrees of difficulty. Please note that this software