Oracle 11g Statistics Gathering: Collecting Multi-Column Statistics

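When writing SQL, we sometimes end up with several conditions in the WHERE clause, filtering rows on more than one column. By default, Oracle multiplies the selectivities of the individual columns to get the selectivity of the whole WHERE clause. When the columns are correlated, this product can be inaccurate and lead the optimizer to a wrong decision. Car manufacturer and car model, for example, are related: once you know the model, you know which manufacturer made the car. Hotel star rating and hotel rate category have a similar relationship. To let the optimizer make accurate estimates and generate accurate execution plans in such cases, Oracle introduced multi-column statistics in Oracle Database 11g.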

Selectivity: in this example, 1 / (number of distinct values in the column).

We have a table BOOKS with two columns, hotel_id and rate_category. Let's look at how the data in these two columns is distributed:
SQL>  select hotel_id,rate_category,count(1) from books
2  group by  hotel_id,rate_category
3  order by hotel_id;

  HOTEL_ID RATE_CATEGORY   COUNT(1)
---------- ------------- ----------
        10            11      19943
        10            12      39385
        10            13      20036
        20            21       5106
        20            22      10041
        20            23       5039

6 rows selected.

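Look closely at the data: for hotel_id 10 the rate_category column contains only 11, 12, and 13, while for hotel_id 20 it contains only 21, 22, and 23 (and none of 11, 12, or 13). Why? The reason is probably the hotels' star ratings. Hotel 20 is a higher-priced hotel, and rate categories 11, 12, and 13 are lower tiers, so they do not apply to an expensive hotel. Likewise, 21, 22, and 23 are higher rate categories, so they do not apply to a budget hotel such as hotel 10. In addition, hotel 10 has more room bookings than hotel 20.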
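The correlation described above can be checked mechanically. The following is a minimal Python sketch, not part of the original test case: it copies the six grouped rows from the query output into a list and confirms that every rate_category occurs with exactly one hotel_id. The names BOOKS_COUNTS, hotels_per_rate, and rows_per_hotel are illustrative assumptions.

```python
from collections import defaultdict

# (hotel_id, rate_category, row count), copied from the GROUP BY output.
BOOKS_COUNTS = [
    (10, 11, 19943),
    (10, 12, 39385),
    (10, 13, 20036),
    (20, 21, 5106),
    (20, 22, 10041),
    (20, 23, 5039),
]


def hotels_per_rate(counts):
    """Map each rate_category to the set of hotel_ids it appears with."""
    mapping = defaultdict(set)
    for hotel_id, rate_category, _ in counts:
        mapping[rate_category].add(hotel_id)
    return dict(mapping)


def rows_per_hotel(counts):
    """Sum the row counts for each hotel_id."""
    totals = defaultdict(int)
    for hotel_id, _, n in counts:
        totals[hotel_id] += n
    return dict(totals)


# If every rate_category maps to a single hotel, rate_category determines
# hotel_id, so the two columns are far from independent.
for rate, hotels in sorted(hotels_per_rate(BOOKS_COUNTS).items()):
    print(rate, sorted(hotels))

print(rows_per_hotel(BOOKS_COUNTS))
```

Because knowing rate_category already fixes hotel_id, treating the two predicates as independent filters counts the same restriction twice.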

Create an index on each of the two columns of the BOOKS table, then gather statistics on the table.
SQL> create index book_idx1 on books(hotel_id);
Index created.

SQL> create index book_idx2 on books(rate_category);
Index created.

SQL> analyze table books compute statistics;
Table analyzed.

What does the execution plan look like if we query for the rows where the hotel is 20 and the rate category is 21?
SQL> set autotrace trace exp
SQL> select hotel_id,rate_category from books where hotel_id=20 and rate_category=21;

Execution Plan
----------------------------------------------------------
Plan hash value: 2688610195

---------------------------------------------------------------------------
| Id  | Operation         | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |       |  8296 | 33184 |    47   (3)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| BOOKS |  8296 | 33184 |    47   (3)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("RATE_CATEGORY"=21 AND "HOTEL_ID"=20)

SQL> set autotrace off

SQL> select count(1) from books;

COUNT(1)
----------
99550

SQL> select 99550/8296 from dual;

99550/8296
----------
11.9997589

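As the example shows, Oracle chose a full table scan and estimated 8296 rows, whereas the table actually contains 5106 matching rows. Against a table of 99550 rows, a query returning that few rows should be able to use an index. Oracle did not use one because it considers the two columns separately: it computes a selectivity of 1/2 for hotel_id and 1/6 for rate_category, which gives a statement selectivity of 1/12. That is why the execution plan shows 8296 (99550 * 1/12) rows.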
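The arithmetic behind the 8296-row estimate can be reproduced in a few lines. This is a simplified Python sketch of the independence assumption only, not of the optimizer's full costing logic. The function name independent_estimate is an assumption for illustration; the numbers come from the output above.

```python
from fractions import Fraction

TOTAL_ROWS = 99550        # select count(1) from books
NDV_HOTEL_ID = 2          # distinct hotel_id values: 10, 20
NDV_RATE_CATEGORY = 6     # distinct rate_category values: 11-13, 21-23
ACTUAL_ROWS = 5106        # real count for hotel_id=20 and rate_category=21


def independent_estimate(total_rows, *ndvs):
    """Row estimate when each equality predicate has selectivity 1/NDV
    and the selectivities are multiplied as if the columns were independent."""
    selectivity = Fraction(1)
    for ndv in ndvs:
        selectivity *= Fraction(1, ndv)
    return round(total_rows * selectivity)


estimate = independent_estimate(TOTAL_ROWS, NDV_HOTEL_ID, NDV_RATE_CATEGORY)
print(estimate)                 # 8296, the Rows value in the plan
print(estimate - ACTUAL_ROWS)   # how far the estimate overshoots
```

The estimate depends only on the number of distinct values in each column, so it is the same 8296 for every value pair, including pairs such as (20, 11) that never occur in the table.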

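To help Oracle arrive at an accurate execution plan, we can use either of two methods:
1. Use the new function create_extended_stats in the dbms_stats package to create a virtual column (a column group), then gather statistics on the table.
Roughly like this:
select dbms_stats.create_extended_stats('SCOTT', 'BOOKS', '(HOTEL_ID, RATE_CATEGORY)') from dual;
The next time statistics are gathered on the table, multi-column statistics for the column group are collected automatically.

2. Specify method_opt directly when calling dbms_stats, so that the column group is treated as a single column when statistics are gathered.

We use the second method here.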
SQL> begin
2  dbms_stats.gather_table_stats (
3  ownname  => 'SCOTT',
4  tabname  => 'BOOKS',
5  estimate_percent=> 100,
6  method_opt  => 'FOR ALL COLUMNS SIZE SKEWONLY FOR COLUMNS  (HOTEL_ID,RATE_CATEGORY)',
7  cascade  => TRUE
8  );
9  end;
10  /

PL/SQL procedure successfully completed.

With the column group statistics gathered, let's look at the execution plan again:
SQL> set autotrace trace exp
SQL> select hotel_id,rate_category from books where hotel_id=20 and rate_category=21;

Execution Plan
----------------------------------------------------------
Plan hash value: 1484887743

-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |  5106 | 30636 |    19   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS BY INDEX ROWID| BOOKS     |  5106 | 30636 |    19   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | BOOK_IDX2 |  5106 |       |    11   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("HOTEL_ID"=20)
2 - access("RATE_CATEGORY"=21)

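The output clearly shows that index BOOK_IDX2 is used. Why is the index used now? Note the value under the "Rows" column (5106). The optimizer now correctly estimates the number of rows for the combination of values, rather than combining separate estimates for each value.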
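To illustrate why statistics on the value combination can give an exact figure, here is a simplified Python model. It rests on an assumption the article does not show: that the gathered column group statistics record a frequency for each distinct (hotel_id, rate_category) pair, as a frequency histogram would. The function names are hypothetical and the model ignores value pairs that are absent from the table.

```python
# Row counts per (hotel_id, rate_category) pair, from the GROUP BY output.
COMBO_COUNTS = {
    (10, 11): 19943,
    (10, 12): 39385,
    (10, 13): 20036,
    (20, 21): 5106,
    (20, 22): 10041,
    (20, 23): 5039,
}


def estimate_uniform(counts):
    """Estimate using only the number of distinct pairs:
    total rows divided by the number of distinct combinations."""
    total = sum(counts.values())
    return round(total / len(counts))


def estimate_frequency(counts, combo):
    """Estimate when a frequency is recorded for each pair (assumption):
    with a 100% sample this is simply the pair's real row count."""
    return counts[combo]


print(estimate_uniform(COMBO_COUNTS))              # same for every pair
print(estimate_frequency(COMBO_COUNTS, (20, 21)))  # 5106, as in the plan
```

Under this model, a distinct-pair count alone would still spread the rows evenly over the six combinations, while per-pair frequencies reproduce the 5106 rows shown in the plan.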

Of course, Oracle can make accurate estimates for other conditions as well:

SQL> set autotrace trace exp
SQL> select hotel_id,rate_category from books where hotel_id=10 and rate_category=12;

Execution Plan
----------------------------------------------------------
Plan hash value: 2688610195
