Should a single MySQL table stay under 5 million rows: is it an empirical value or a golden rule?

Today, let's discuss an interesting question: how large can a single MySQL table grow before you need to consider splitting it across databases and tables (sharding)? Some say 20 million rows, others say 5 million. Which figure do you think is appropriate?

A claim once circulated widely in China's Internet technology circles: MySQL performance drops significantly once a single table exceeds 20 million rows. The claim is said to have originated at Baidu. As the story goes, a DBA testing MySQL performance found that SQL operations slowed sharply once a single table reached 20 million rows, and the conclusion was drawn from that test. Baidu engineers then reportedly moved to other companies and took the figure with them, which is how it spread through the industry.

Later, Alibaba's "Java Development Manual" recommended sharding databases and tables only when a single table exceeds 5 million rows or 2 GB in size. Because this guidance carries the weight of an Alibaba "iron rule", many people treat it as the standard for when to split tables when designing storage for large data sets.

So which value is appropriate? Why 5 million rows rather than 3 million or 8 million? You might say it is simply the best value Alibaba found in production. But that raises another question: how was the value arrived at? Take a moment to think about it.

In fact, the threshold is not really about the number of records; it depends on MySQL's configuration and the machine's hardware. To improve performance, MySQL loads a table's indexes into memory. When the InnoDB buffer pool is large enough, the indexes fit entirely in memory and queries run without trouble. Once a single table grows beyond a certain size, however, memory can no longer hold its indexes, so subsequent SQL queries incur disk I/O and performance degrades. The exact point also depends on the table's schema design, but the underlying constraint is memory. In this situation, upgrading the hardware can bring an immediate performance improvement.
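To make the memory argument concrete, here is a rough back-of-the-envelope sketch in Python. It is not how MySQL itself accounts for memory. The helper names, the 20-byte index entry, the 70% page fill factor, and the 75% usable share of the buffer pool are all illustrative assumptions that you should replace with figures measured on your own server.

```python
GIB = 1024 ** 3


def estimated_index_bytes(row_count: int, entry_bytes: int,
                          fill_factor: float = 0.7) -> int:
    """Rough size of one index: one entry per row, with pages assumed
    to be only partially full (fill_factor is an assumption)."""
    return int(row_count * entry_bytes / fill_factor)


def fits_in_memory(index_bytes: int, buffer_pool_bytes: int,
                   usable_fraction: float = 0.75) -> bool:
    """True if the index fits in the share of the buffer pool we assume
    is available for it (usable_fraction is an assumption)."""
    return index_bytes <= buffer_pool_bytes * usable_fraction


if __name__ == "__main__":
    buffer_pool = 1 * GIB  # hypothetical buffer pool size
    for rows in (5_000_000, 20_000_000, 200_000_000):
        size = estimated_index_bytes(rows, entry_bytes=20)
        print(f"{rows:>11,} rows: ~{size / GIB:.2f} GiB index, "
              f"fits in memory: {fits_in_memory(size, buffer_pool)}")
```

The point of the sketch is that the answer changes with the entry size and the buffer pool size, not with any fixed row count: the same table can be comfortable on one machine and disk-bound on another.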

So my view on sharding databases and tables is that it should be driven by actual needs and not over-designed. Do not adopt a sharded design at the start of a project. Instead, as the business grows, consider sharding to improve system performance only once further optimization is no longer possible. On this point, Alibaba's "Java Development Manual" adds: if the data volume is not expected to reach this level within three years, do not shard when creating the table. So, back to the original question: which value is appropriate? My suggestion is to evaluate it against the configuration of your own machine. If you have no baseline in mind, use 5 million rows as a provisional standard; it is a reasonable compromise.
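The three-year guideline can be checked with simple arithmetic. The sketch below assumes linear growth, which real workloads rarely follow exactly. The function names and the sample growth rates are hypothetical; the 5 million row default is only the provisional figure discussed above.

```python
def projected_rows(current_rows: int, rows_per_month: int,
                   months: int = 36) -> int:
    """Row count after `months`, assuming constant (linear) growth."""
    return current_rows + rows_per_month * months


def should_plan_sharding(current_rows: int, rows_per_month: int,
                         threshold_rows: int = 5_000_000,
                         months: int = 36) -> bool:
    """True if the projected row count exceeds the threshold within
    the planning horizon (three years by default)."""
    return projected_rows(current_rows, rows_per_month, months) > threshold_rows


if __name__ == "__main__":
    # Hypothetical table: 1 million rows today, two possible growth rates.
    for monthly in (50_000, 200_000):
        total = projected_rows(1_000_000, monthly)
        print(f"{monthly:,} rows/month: {total:,} rows in three years, "
              f"plan sharding: {should_plan_sharding(1_000_000, monthly)}")
```

If the projection stays under the threshold you settled on for your own hardware, the guideline says to create the table unsharded and revisit the question later.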
