Research on optimizing SQL efficiency

王林

Jan 28, 2024 am 08:09 AM


This case was shared by Chen Hongyi (Old K) at the Shanghai MOORACLE Conference in August 2016: rewriting a MERGE statement as PL/SQL greatly improved its execution efficiency. When Tiger Liu first saw the case, he overlooked the actual row counts for each table shown in the execution plan, so he doubted that the PL/SQL rewrite could be more efficient than a rewrite based on analytic functions, and he exchanged several emails with Chen about it. Only later did he take a closer look at the execution plan.

The original SQL is as follows:

merge into t_customer c using
(
  select a.cstno, a.amount
  from t_trade a,
       (select cstno, max(trade_date) trade_date
        from t_trade
        group by cstno) b
  where a.cstno = b.cstno and a.trade_date = b.trade_date
) m
on (c.cstno = m.cstno)
when matched then
  update set c.amount = m.amount;

This statement uses MERGE to copy each customer's most recent transaction amount from the transaction detail table (t_trade) into the amount column of the customer table (t_customer).

Execution plan:

[Figure: execution plan of the original SQL]

Tiger Liu's note:

Before learning analytic functions, developers commonly fetch other columns alongside a GROUP BY result the way the part highlighted in red in the original does (the join of t_trade back to its own GROUP BY subquery). This is also the root cause of the statement's poor execution efficiency.

The original SQL has another hidden hazard: if a cstno in t_trade has more than one row with its maximum trade_date, the MERGE source returns several rows for that customer and the statement fails with ORA-30926 (unable to get a stable set of rows in the source tables).
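The duplicate-date hazard can be reproduced on a small scale. The following is a minimal sketch, assuming SQLite (through Python's sqlite3 module) as a stand-in for Oracle; the table and column names follow the article, while the sample rows are invented for illustration. SQLite has no MERGE, so the sketch only runs the source query and counts how many source rows each customer would be matched against.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle; the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table t_trade (cstno integer, trade_date text, amount integer);
    insert into t_trade values (1, '2016-08-01', 100);
    insert into t_trade values (1, '2016-08-05', 250);
    insert into t_trade values (2, '2016-08-03', 70);
    -- customer 2 has two trades on its latest date (a tie on max(trade_date))
    insert into t_trade values (2, '2016-08-09', 300);
    insert into t_trade values (2, '2016-08-09', 450);
""")

# The MERGE source from the original SQL: t_trade joined back to its own
# per-customer max(trade_date).
source_rows = conn.execute("""
    select a.cstno, a.amount
    from t_trade a,
         (select cstno, max(trade_date) trade_date
          from t_trade group by cstno) b
    where a.cstno = b.cstno and a.trade_date = b.trade_date
    order by a.cstno, a.amount
""").fetchall()

# Count the source rows per customer.
rows_per_customer = {}
for cstno, _amount in source_rows:
    rows_per_customer[cstno] = rows_per_customer.get(cstno, 0) + 1

print(source_rows)        # [(1, 250), (2, 300), (2, 450)]
print(rows_per_customer)  # {1: 1, 2: 2}
```

Customer 2 appears twice in the source, which is the situation in which Oracle's MERGE cannot decide which amount to apply.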

If you do not look closely at the execution plan (that is, at the actual data volumes of the two tables), the usual way to optimize this kind of SQL is to rewrite it with an analytic function:

Rewriting method 1:

merge into t_customer c using
(
  select a.cstno, a.amount from
    (select trade_date, cstno, amount,
            row_number() over (partition by cstno order by trade_date desc) RNO
     from t_trade) a
  where RNO = 1
) m
on (c.cstno = m.cstno)
when matched then
  update set c.amount = m.amount;

This rewrite is much more efficient than the original SQL, and it cannot fail when a cstno has duplicate maximum trade_date values, because row_number() assigns RNO = 1 to exactly one row per cstno.
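The one-row-per-customer property of the row_number() source can be checked with a small sketch. As an assumption, SQLite (via Python's sqlite3, which needs SQLite 3.25 or later for window functions) stands in for Oracle, and the sample rows are invented; only the inner query of rewriting method 1 is run.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle; the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table t_trade (cstno integer, trade_date text, amount integer);
    insert into t_trade values (1, '2016-08-01', 100);
    insert into t_trade values (1, '2016-08-05', 250);
    -- customer 2 has a tie on its latest trade_date
    insert into t_trade values (2, '2016-08-09', 300);
    insert into t_trade values (2, '2016-08-09', 450);
""")

# The source query of rewriting method 1: number each customer's trades
# from newest to oldest and keep only the first one.
latest = conn.execute("""
    select cstno, amount from
      (select cstno, amount,
              row_number() over (partition by cstno order by trade_date desc) rno
       from t_trade) a
    where rno = 1
    order by cstno
""").fetchall()

cstnos = [cstno for cstno, _amount in latest]
print(cstnos)  # [1, 2]
```

Each cstno appears once even though customer 2 has a tie. Which of the tied rows is kept is not defined unless the ORDER BY inside the window includes a tiebreaker column.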

However, Chen did not use the analytic-function rewrite. Instead, relying on the large difference in data volume between the two tables, he rewrote the SQL as a more efficient PL/SQL block:

Rewriting method 2:

declare
  vamount number;
begin
  for v in (select * from t_customer)
  loop
    -- most recent trade amount for this customer
    select amount into vamount from
      (select amount from t_trade where cstno = v.cstno order by trade_date desc)
    where rownum <= 1;
    update t_customer set amount = vamount where cstno = v.cstno;
  end loop;
  commit;
end;

The execution plan of the original SQL shows that t_customer is small, with only a little over 1,000 rows, while t_trade holds 10 million rows, a ratio of 1:10,000. (I do not know whether this is real or test data; just over 1,000 customers averaging 10,000 transactions each does not look like real data.)

In this special case, where the two tables differ so much in size, the PL/SQL version is indeed more efficient than the analytic-function version, because it performs only about 1,000 index lookups instead of processing all 10 million rows of t_trade. It is a very clever rewrite.

Let’s analyze the advantages and disadvantages of these two rewritings:

1. The PL/SQL rewrite pays off only when t_customer is small and the row counts of t_customer and t_trade differ by a large ratio; under those conditions it runs faster than the analytic-function rewrite. If t_customer in this example had 100,000 rows instead, the analytic-function version would be tens to hundreds of times faster than the PL/SQL version.

2. The PL/SQL rewrite requires a composite index on the cstno and trade_date columns of t_trade. The analytic-function rewrite needs no index support.

3. For a table with tens of millions of rows such as t_trade, the analytic-function version can be sped up by enabling parallel execution. To speed up the PL/SQL version, you must first split t_customer into groups by cstno and run the groups concurrently in multiple sessions.
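The per-customer index lookup behind rewriting method 2 can be sketched as follows. This is only an illustration under stated assumptions: SQLite (via Python's sqlite3) stands in for Oracle, `limit 1` replaces the rownum filter, the index name idx_t_trade and the sample rows are chosen for the example, and skipping customers without trades is a choice made here rather than something the original PL/SQL does.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle; the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table t_customer (cstno integer primary key, amount integer);
    create table t_trade (cstno integer, trade_date text, amount integer);
    -- composite index that the loop-based rewrite depends on
    create index idx_t_trade on t_trade (cstno, trade_date);
    insert into t_customer values (1, null), (2, null), (3, null);
    insert into t_trade values
        (1, '2016-08-01', 100), (1, '2016-08-05', 250),
        (2, '2016-08-03', 70),  (2, '2016-08-09', 300);
""")

# Top-1 lookup for one customer; "limit 1" plays the role of the rownum filter.
top1 = "select amount from t_trade where cstno = ? order by trade_date desc limit 1"

# Ask SQLite how it would run the lookup (the detail text is the 4th column).
plan = " ".join(row[3] for row in conn.execute("explain query plan " + top1, (1,)))

# One index probe per customer, mirroring the PL/SQL loop.
for (cstno,) in conn.execute("select cstno from t_customer").fetchall():
    row = conn.execute(top1, (cstno,)).fetchone()
    if row is not None:  # customer 3 has no trades; its amount is left untouched
        conn.execute("update t_customer set amount = ? where cstno = ?",
                     (row[0], cstno))
conn.commit()

result = conn.execute("select cstno, amount from t_customer order by cstno").fetchall()
print(plan)
print(result)  # [(1, 250), (2, 300), (3, None)]
```

The work done is proportional to the number of customers rather than the number of trades, which is why the approach depends on both the index and a small t_customer.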

Can Chen's PL/SQL be expressed as a single SQL statement? I made an attempt; the SQL is as follows:

merge into t_customer c using
(
  select tc.cstno,
         (select amount
          from t_trade td1
          where td1.cstno = tc.cstno
            and td1.trade_date = (select max(trade_date) from t_trade td2 where tc.cstno = td2.cstno)
            and rownum = 1) as amount
  from t_customer tc
) m
on (c.cstno = m.cstno)
when matched then
  update set c.amount = m.amount;

The execution plan is roughly as follows:

[Figure: execution plan of the single-SQL rewrite]

Like the PL/SQL version, this statement requires the composite index on cstno and trade_date (IDX_T_TRADE) to exist on t_trade, and it assumes that t_customer holds far fewer rows than t_trade.

Judging from the execution plan, its execution efficiency should be comparable to that of the PL/SQL version.

Summary:

Beyond avoiding inefficient ways of writing SQL, optimization depends mainly on the data volume and data distribution of the tables involved. The PL/SQL rewrite is more efficient only in a few special cases; under some data distributions it may even be slower than the original SQL. The idea behind it is still worth learning from.

The analytic-function rewrite is more efficient than the original SQL regardless of how the data is distributed, and it is more generally applicable.

Many developers and DBAs probably still write SQL the way the original statement in this example was written. Once you understand analytic functions, that inefficient style should be abandoned completely.

The final rewrite of the PL/SQL into a single SQL statement has complicated logic that is hard to follow, and such a rewrite is generally not used. It is enough to understand how it works.

Again, there is no fixed formula for optimization. The optimizer follows fixed rules, but the human mind is flexible. Only by mastering the underlying principles can you keep improving SQL execution efficiency.


Statement

This article is reproduced from Linux就该这么学. If there is any infringement, please contact admin@php.cn to have it removed.
