Why do code specifications require SQL statements not to have too many joins?

A giveaway question

Interviewer: Have you ever operated Linux?

Me: Yes

Interviewer: What command should I use to check the memory usage?

Me: free or top

Interviewer: Then tell me what information you can see using the free command

Me: Then, as shown in the figure below, you can see the usage of memory and cache.

  • total: total memory

  • used: used memory

  • free: free memory

  • buff/cache: memory used by buffers and cache

  • available: available memory

Interviewer: Then do you know how to clear the used cache (buff/cache)?

Me: Um... I don't know

Interviewer: Running sync; echo 3 > /proc/sys/vm/drop_caches clears buff/cache. Can you tell me whether I can run this command in production?

Me: (A giveaway question, overjoyed) The benefits are huge. After clearing the cache, we will have more available memory. Just like the little rocket in xx Guardian on a PC, one click releases a lot of memory.

Interviewer: Um... go home and wait to hear from us

Let’s talk about SQL Join

Interviewer: Let's change the topic and talk about your understanding of join

Me: Okay (if I answer wrong again, it's over; seize the opportunity)

Review

join in SQL can combine specified tables according to certain conditions and return data to the client

Join methods include:

  • inner join

  • left join

  • right join

  • full join


Picture source: https://www.cnblogs.com/reaptomorrow-flydream/p/8145610.html

Interviewer: If you need to use join statements during project development, how would you optimize them to improve performance?

Me: It splits into two cases: the data size is small, or the data size is large.

Interviewer: Then?

Me: For each case:

1. The data size is small: put it all into memory.

2. The data size is large:

  • You can speed up the join statement by adding indexes

  • You can use redundant information to reduce the number of joins

  • Reduce the number of table joins as much as possible; a single SQL statement should contain no more than 5 joins

Interviewer: So to summarize, the join statement is relatively expensive in terms of performance, right?

Me: Yes

Interviewer: Why?

Buffer

Me: There must be a comparison process when executing the join statement

Interviewer: Yes

Me: Comparing the rows of two tables one by one is slow, so we can read the data of the two tables into a block of memory in turn. Taking MySQL's InnoDB engine as an example, we can find the relevant memory areas with the following statement: show variables like '%buffer%'

Among the variables listed is join_buffer_size, whose size affects the execution performance of our join statements

Interviewer: What else?

A major premise

Me: Every project eventually goes into production, where it inevitably generates data, and the data size will not stay small

Interviewer: Yes, that's right

Me: Most of the data in the database is eventually saved to the hard disk and stored in the form of files.

Take MySQL's InnoDB engine as an example

  • InnoDB uses the page as its basic I/O unit, and each page is 16 KB by default

  • InnoDB creates an .ibd file for each table to store its data (with the default file-per-table setting)
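To get a feel for what a 16 KB page means for I/O, here is a rough back-of-the-envelope sketch in Python. The function name `pages_needed` and the 200-byte row size are assumptions made for illustration; the estimate ignores page headers, free space inside pages, and index pages, so it is only a lower bound.

```python
import math

PAGE_SIZE = 16 * 1024  # the 16 KB InnoDB page size mentioned above


def pages_needed(row_count: int, avg_row_bytes: int, page_size: int = PAGE_SIZE) -> int:
    """Lower bound on the pages needed to hold the rows.

    Ignores page headers and partially filled pages (a simplification).
    """
    rows_per_page = page_size // avg_row_bytes
    if rows_per_page == 0:
        raise ValueError("a row does not fit into a single page")
    return math.ceil(row_count / rows_per_page)


# 100,000 rows of 200 bytes: 81 rows fit in one page, so at least 1,235 pages
print(pages_needed(100_000, 200))
```

Every page that is not already in memory costs a disk read, which is why the number of tables (and therefore files and pages) touched by a join matters.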

Verification

Me: This means that we need to read as many files as there are tables to join. Although indexes can be used, frequently moving the hard disk head is still unavoidable

Interviewer: In other words, frequent head movement will affect performance, right?

Me: Yes. Don't current open source frameworks like to say that they have greatly improved performance through sequential reads and writes, such as HBase and Kafka?

Interviewer: That's right. Then do you think Linux has optimized for this? Hint: you can run the free command again and take a look

Me: Strange, why does the cache occupy more than 1.2 GB?

Image source: https://www.linuxatemyram.com/

Interviewer: Have you ever thought about:

  • What is stored in buff/cache?

  • Why does buff/cache occupy so much memory while the available memory is still 1.1 GB?

  • Why can the memory occupied by buff/cache be cleared with two commands, while used memory can only be released by ending processes?

Think it over carefully

After thinking for a few minutes

Me: If the memory occupied by buff/cache can be released so casually, it must not be important, and clearing it will not affect the operation of the system

Interviewer: Not entirely true

Me: Is that so? I'm reminded of a sentence in CSAPP ("Computer Systems: A Programmer's Perspective")

The essence of the memory hierarchy is that each layer of storage device is the cache of the lower layer device

In layman’s terms, it means that Linux will treat the memory as the cache of the hard disk

Related information: http://tldp.org/LDP/sag/html/buffer-cache.html
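The idea that memory acts as a cache for the disk can be illustrated with a toy model. This is only a sketch: the class `CachedDisk` and its methods are made up for this example, the "disk" is just a dictionary, and writes (dirty pages, which is why sync is run first) are ignored entirely.

```python
class CachedDisk:
    """Toy read-through cache: memory sits in front of a slow 'disk'."""

    def __init__(self, disk):
        self.disk = disk      # stand-in for blocks stored on the hard disk
        self.cache = {}       # stand-in for buff/cache in memory
        self.disk_reads = 0   # how many reads actually reached the disk

    def read(self, block_id):
        if block_id not in self.cache:
            # Cache miss: pay for a disk read, then keep the block in memory.
            self.disk_reads += 1
            self.cache[block_id] = self.disk[block_id]
        return self.cache[block_id]

    def drop_caches(self):
        """Throw the cache away: no data is lost, but later reads hit the disk again."""
        self.cache.clear()


disk = CachedDisk({0: "page-0", 1: "page-1"})
disk.read(0)
disk.read(0)            # served from memory, no disk access
disk.read(1)
print(disk.disk_reads)  # 2
disk.drop_caches()
disk.read(0)            # has to go back to the disk
print(disk.disk_reads)  # 3
```

In this model, clearing the cache frees memory immediately, but every block that was cached has to be fetched from the slow disk again the next time it is needed. That is the cost the first answer overlooked.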

Interviewer: Now you know how to answer the giveaway question

Me: I….

Join Algorithm

Interviewer: I'll give you one more chance. What would you do if you were asked to implement the join algorithm?

Me: If there is no index, just use a nested loop and be done with it. If there is an index, you can use the index to improve performance.

Interviewer: Back to join_buffer, what do you think is stored in join_buffer?

Me: During the scan, the database picks one table and puts the data that needs to be returned and compared against the other tables into join_buffer

Interviewer: How to deal with it when there is an index?

Me: This is relatively simple. Just read the index trees of the two tables directly for comparison and that's it. Let me introduce the non-index processing method here
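Before moving on to the non-index case, here is a minimal Python sketch of how an index helps. It is not MySQL's implementation: a sorted list probed with binary search stands in for the index tree, and the function name `index_join` and the sample rows are assumptions made for illustration.

```python
from bisect import bisect_left


def index_join(outer_rows, inner_rows, key):
    """Join two lists of dicts on `key`, probing a sorted index on the inner side."""
    # Sorted (key value, row position) pairs play the role of the index.
    index = sorted((row[key], pos) for pos, row in enumerate(inner_rows))
    keys = [k for k, _ in index]
    result = []
    for outer in outer_rows:
        # Binary search instead of scanning every inner row.
        i = bisect_left(keys, outer[key])
        # Walk forward over all index entries that share the same key.
        while i < len(keys) and keys[i] == outer[key]:
            inner = inner_rows[index[i][1]]
            result.append({**outer, **inner})
            i += 1
    return result


users = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
orders = [{"id": 1, "item": "x"}, {"id": 1, "item": "y"}, {"id": 3, "item": "z"}]
print(index_join(users, orders, "id"))
```

Each outer row costs one logarithmic lookup plus the matching rows, rather than a full pass over the inner table.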

Nested Loop Join

Nested loop reads only one row of a table at a time. That is to say, if the outerTable has 100,000 rows and the innerTable has 100 rows, 100,000 × 100 = 10,000,000 reads are needed (assuming the files of these two tables are not cached in memory by the operating system; we call them cold data tables)

Of course, no database engine uses this naive algorithm as-is today (it is too slow)
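The read count above can be reproduced with a small Python sketch of the naive algorithm. The function name `nested_loop_join` and the in-memory lists standing in for tables are assumptions for illustration; the example uses 1,000 × 100 rows so that it runs instantly.

```python
def nested_loop_join(outer_rows, inner_rows, key):
    """Return (matches, row_reads) for a naive nested-loop join on `key`."""
    matches = []
    row_reads = 0
    for outer in outer_rows:
        for inner in inner_rows:
            row_reads += 1  # each comparison reads one inner row
            if outer[key] == inner[key]:
                matches.append((outer, inner))
    return matches, row_reads


outer_table = [{"k": i} for i in range(1_000)]
inner_table = [{"k": i} for i in range(100)]
matches, reads = nested_loop_join(outer_table, inner_table, "k")
print(reads)         # 100000, i.e. 1,000 x 100
print(len(matches))  # 100
```

The number of reads is always the product of the two row counts, regardless of how many rows actually match.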

Block nested loop

"Block" means that a block of data is fetched into memory each time to reduce I/O overhead

MySQL InnoDB uses this algorithm when no index can be used (in MySQL versions before 8.0.20; later versions use a hash join in these cases)

Consider the following two tables t_a and t_b

When an index cannot be used to perform the join, InnoDB automatically uses the Block nested loop algorithm
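A minimal Python sketch of the block idea follows. It is a simplification, not MySQL's code: the parameter `buffer_rows` is a hypothetical stand-in for the join buffer capacity (the real join_buffer_size is measured in bytes), and lists of dicts stand in for tables.

```python
def block_nested_loop_join(outer_rows, inner_rows, key, buffer_rows):
    """Return (matches, inner_scans); `buffer_rows` models the join buffer capacity."""
    if buffer_rows < 1:
        raise ValueError("buffer_rows must be at least 1")
    matches = []
    inner_scans = 0
    for start in range(0, len(outer_rows), buffer_rows):
        # Fill the join buffer with the next block of outer rows.
        block = outer_rows[start:start + buffer_rows]
        inner_scans += 1
        # One pass over the inner table serves the whole block.
        for inner in inner_rows:
            for outer in block:
                if outer[key] == inner[key]:
                    matches.append((outer, inner))
    return matches, inner_scans


outer_table = [{"k": i} for i in range(1_000)]
inner_table = [{"k": i} for i in range(100)]
_, scans_small = block_nested_loop_join(outer_table, inner_table, "k", buffer_rows=1)
_, scans_large = block_nested_loop_join(outer_table, inner_table, "k", buffer_rows=100)
print(scans_small)  # 1000 passes over the inner table
print(scans_large)  # 10 passes over the inner table
```

The number of comparisons stays the same, but the inner table is scanned once per buffer fill instead of once per outer row. A larger buffer therefore means fewer passes over the inner table, which is why join_buffer_size affects join performance.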

Summary

When I was in school, my database teacher liked nothing better than studying database normal forms, and it wasn't until I started working that I learned that everything should be driven by performance. If redundancy is possible, use redundancy. If it is not, use join. If join really hurts performance, try increasing your join_buffer_size, or switch to a solid-state drive.

Reference materials

"Computer Systems: A Programmer's Perspective" (CSAPP), Chapter 6: The Memory Hierarchy
"Experiments and fun with the Linux disk cache": the author uses several examples to illustrate the impact of the disk cache on program performance
"Linux ate my ram": explanation of the fields shown by the free command
"How to clear the buffer/pagecache (disk cache) under Linux": explains the command from the giveaway question at the beginning of the article
"How MySQL runs: Understand MySQL from the root"
"Block nested loop": the official MariaDB documentation explaining the implementation of the Block-Nested-Loop algorithm

Source: this article is reproduced from Java学习指南.