What Is Database Table Splitting (MySQL)


WBOY

Jun 01, 2016, 01:31 PM


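I. Overview

Table splitting is a much-discussed and popular technique at the moment. Under heavy load in particular, it is an effective way to spread the pressure on a database.

First we should understand why tables are split and what the benefits are. Let us look roughly at how a database executes an SQL statement:

Receive the SQL --> place it in the SQL execution queue --> parse the SQL with the parser --> fetch or modify data according to the parse result --> return the result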

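Of course, this flow is not necessarily accurate; it is only my own subjective understanding. So where is this process most likely to run into trouble? If an earlier SQL statement holds a lock on a table and has not finished, later statements that need a conflicting lock on that table cannot run, because the table's data file has to be locked to guarantee data integrity. There are two kinds of lock: shared locks and exclusive locks. While a shared lock is held, other threads may still read the data file but may not modify it. Under an exclusive lock, the whole file belongs to a single thread and no other thread can access it. MyISAM, generally considered the fastest storage engine in MySQL, uses table-level locking: once a table is write-locked, nothing else can access its data file, and the next operation is accepted only after the previous one completes. The situation in which an earlier operation has not finished and a later one waits in the queue is called blocking, and it is commonly referred to as "table locking".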

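What is the direct consequence of table locking? A large number of SQL statements cannot run immediately and must wait until every statement ahead of them in the queue has finished. Statements that cannot run return no result or are severely delayed, which hurts the user experience.

This matters most for heavily used tables, such as the user information table in an SNS system or the post table in a forum system, all of which receive very high traffic. To make sure data is fetched and returned to users quickly, something has to be done about the problem, and that something is the table-splitting technique discussed here.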

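As the name suggests, table splitting means dividing a table that stores one type of data into several tables. When data is fetched, different users hit different tables without conflicting, which reduces the chance of table locking. For example, suppose user data is currently split into two tables, user_1 and user_2, which hold different users: user_1 stores the first 100,000 users and user_2 stores the next 100,000. If we query the users heiyeluren1 and heiyeluren2 at the same time and they live in different tables, each is fetched from its own table, which reduces the likelihood of lock contention.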
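The user_1 / user_2 split above can be sketched as a small PHP helper. This is only an illustration under stated assumptions: it assumes users have sequential numeric IDs starting at 1, and the function name user_table_for_id is hypothetical; only the "user_" prefix and the 100,000 rows per table come from the example.

```php
<?php
// Hypothetical helper: maps a 1-based numeric user ID to a range-split table.
// The "user_" prefix and the 100,000 rows per table mirror the article's
// example; numeric sequential IDs are an assumption made for illustration.
function user_table_for_id(int $userId, int $rowsPerTable = 100000): string
{
    if ($userId < 1) {
        throw new InvalidArgumentException('user ID must be a positive integer');
    }
    // IDs 1..100000 -> user_1, IDs 100001..200000 -> user_2, and so on.
    $tableNo = intdiv($userId - 1, $rowsPerTable) + 1;
    return 'user_' . $tableNo;
}

echo user_table_for_id(42), "\n";      // user_1
echo user_table_for_id(100000), "\n";  // user_1
echo user_table_for_id(100001), "\n";  // user_2
```

Two queries whose IDs map to different table names touch different data files, so a table-level lock taken by one does not block the other.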

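I have not actually tested either of the two methods described below, so I cannot guarantee that they work as written; they are offered only as design ideas. The examples assume a Tieba-style (post bar) forum system. (If you have never used Tieba, look it up on Google.)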

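II. Table splitting based on a base table

The general idea of this approach is as follows: one main table holds all the basic information. To find the table in which a given item is stored, you look up the corresponding table name and related details in this base table and then access that table directly. If the base table is not fast enough, the whole of it can be kept in a cache or in memory for efficient lookups.

For the Tieba scenario, we assume the following 3 tables:

1. Board table: stores information about the boards

2. Topic table: stores the topics within each board, used for browsing

3. Reply table: stores the original content of each topic and its replies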

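The board table contains the following fields:

Board ID | board_id | int(10)
Board name | board_name | char(50)
Sub-table ID | table_id | smallint(5)
Created at | created | datetime

The topic table contains the following fields:

Topic ID | topic_id | int(10)
Topic name | topic_name | char(255)
Board ID | board_id | int(10)
Created at | created | datetime

The reply table contains the following fields:

Reply ID | reply_id | int(10)
Reply content | reply_text | text
Topic ID | topic_id | int(10)
Board ID | board_id | int(10)
Created at | created | datetime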

The above is the table structure for the whole Tieba system. The three tables relate as follows:

Board --> many topics

Topic --> many replies

In other words, the table file sizes relate as follows:

Board table file < topic table file < reply table file

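So we can be fairly sure that the topic table and the reply table are the ones to split, in order to improve speed and performance when retrieving, querying, and updating data.

Looking at the structure above, you will notice that the board table stores a "table_id" field. This field records which sub-tables hold the topics and replies of a given board.

For example, suppose we have a board called "PHP" whose board_id is 1 and whose sub-table ID is also 1. Its record is: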

board_id | board_name | table_id | created

1 | PHP | 1 | 2007-01-19 00:30:12

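Accordingly, to fetch all topics in the "PHP" board, we must combine the table_id stored in that record with a prefix to build the name of the table that stores the topics. If the topic table prefix is "topic_", the topic table for the "PHP" board is "topic_1", and that is the table we run the query against.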
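The lookup described above can be sketched in PHP roughly as follows. This is a minimal illustration rather than the author's code: the base table is represented by an in-memory array (in line with the suggestion to keep it in a cache), the helper name sub_table_for_board and the column list in the query are assumptions, and no database connection is opened.

```php
<?php
// In-memory stand-in for the board (base) table, keyed by board_id.
// The single row mirrors the "PHP" board record from the article.
$boards = [
    1 => ['board_name' => 'PHP', 'table_id' => 1, 'created' => '2007-01-19 00:30:12'],
];

// Hypothetical helper: resolve the sub-table name for a board and prefix.
function sub_table_for_board(array $boards, int $boardId, string $prefix): string
{
    if (!isset($boards[$boardId])) {
        throw new OutOfBoundsException("unknown board_id: $boardId");
    }
    // table_id comes from our own base table, but cast it anyway so that
    // only digits can ever end up in the table name.
    return $prefix . (int) $boards[$boardId]['table_id'];
}

$topicTable = sub_table_for_board($boards, 1, 'topic_');
$replyTable = sub_table_for_board($boards, 1, 'reply_');

// Table names cannot be bound as query parameters, so the name is
// interpolated; the board_id value would still be bound by the driver.
$sql = "SELECT topic_id, topic_name, created FROM {$topicTable} WHERE board_id = ?";
echo $sql, "\n";
```

Because every board's table is found through one extra lookup, moving a board to a different sub-table only requires updating its table_id in the base table.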

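III. Table splitting based on a hash algorithm

As we know, a hash table uses a hash algorithm to compute a value from a key and then uses that computed value to locate the stored item. Ideally different keys produce different values, and any collisions have to be resolved.

The hashing we use for table splitting follows a similar idea: from the ID or name of the original object, a hash algorithm computes the name of the table where the data is stored, and we then access that table.

Continuing with the Tieba example: each board has a name and an ID, both fixed and unique, so we can run some computation on one of them to derive the name of the target table.

Now suppose our Tieba system allows at most 100 million records and each table holds 1 million records; the whole system then needs no more than 100 tables. On that basis, we hash the board ID to obtain a key, use that key to form the table name, and then access the corresponding table.

We construct a simple hash algorithm: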

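function get_hash($id)
{
    // Hex-encode the ID's string form, e.g. "1" becomes "31".
    $str = bin2hex((string) $id);
    // Keep only the first four hex characters.
    $hash = substr($str, 0, 4);
    // Right-pad with "0" when the result is shorter than four characters.
    if (strlen($hash) < 4) {
        $hash = str_pad($hash, 4, "0");
    }
    return $hash;
}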

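Roughly, the algorithm takes a board ID and returns a 4-character string; if the string is too short, it is padded on the right with 0.

For example, get_hash(1) returns "3100" and get_hash(23819) returns "3233". Combining the result with a table prefix gives the table to access. So when we need the content for ID 1, the tables are topic_3100 and reply_3100, and we can access them directly.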
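The combination of hash and prefix can be sketched as below. The helper name tables_for_board is hypothetical and get_hash is repeated, in slightly condensed form, only so that the snippet runs on its own; the behaviour is meant to match the function given above.

```php
<?php
// get_hash() as given in the article, condensed and repeated here so the
// sketch is self-contained.
function get_hash($id)
{
    $hash = substr(bin2hex((string) $id), 0, 4);
    // str_pad() pads on the right by default and leaves 4-char input alone.
    return str_pad($hash, 4, "0");
}

// Hypothetical helper: the topic and reply tables for one board ID.
function tables_for_board($boardId): array
{
    $hash = get_hash($boardId);
    return ['topic' => 'topic_' . $hash, 'reply' => 'reply_' . $hash];
}

print_r(tables_for_board(1));      // topic_3100, reply_3100
print_r(tables_for_board(23819));  // topic_3233, reply_3233
// Only the first two characters of the ID survive the substr() call, so
// board 23 lands in the same tables as board 23819.
print_r(tables_for_board(23));     // topic_3233, reply_3233
```

With numeric IDs this hash depends only on the first two digits, which keeps the number of tables to roughly one hundred but also means the distribution follows the leading digits of the IDs rather than being uniform.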

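Of course, once a hash algorithm is used, some data may end up in the same table. This differs from a hash table: a hash table tries to resolve collisions, whereas here we do not need to. We do, however, still need to anticipate and analyze which table names the data may be stored under.

If more data needs to be stored, you can likewise hash the board name, for example with the same binary-to-hexadecimal conversion used above. Because there are far more Chinese characters than digits and letters, repeats are less likely, but the number of resulting tables may be larger, so other issues have to be considered accordingly.

In the end, the hash approach depends on choosing a good hash algorithm, so that more tables can be generated and data can be queried faster.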

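[Advantages] The hash algorithm yields the target table name directly, which is very efficient.

[Disadvantages] Poor scalability: once a hash algorithm has been chosen and a data volume fixed, the system can only run within that volume and cannot exceed it.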

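IV. Other issues

1. Search

Now that the tables are split, you can no longer search a table directly, because you cannot run a search across the dozens or hundreds of tables that may already exist in the system. Search therefore has to rely on a third-party component; Lucene, for example, is a good choice as a site search engine.

2. Table files

As we know, MySQL's MyISAM engine creates three files for each table, *.frm, *.MYD, and *.MYI, which store the table structure, the table data, and the table indexes respectively. On Linux it is best to keep the number of files in a directory below 1,000, otherwise lookups become slower. Since each table produces three files, more than about 300 split tables would make lookups very slow, so at that point a further split is needed, for example across separate databases.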

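With the base-table approach, we can add a field that records which database a table is stored in. With the hash approach, we take one particular character of the hash value as the database name. This neatly solves the problem.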
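For the hash approach, picking a database from one character of the hash might look like the sketch below. Everything beyond get_hash is an assumption made for illustration: the "tieba_" database prefix, the helper name locate_topic_table, and the choice of the last hash character are not specified in the article.

```php
<?php
// get_hash() as given in the article, condensed and repeated here so the
// sketch is self-contained.
function get_hash($id)
{
    $hash = substr(bin2hex((string) $id), 0, 4);
    return str_pad($hash, 4, "0");
}

// Hypothetical helper: choose a database from one character of the hash
// and return a qualified "database.table" name for the topic table.
// The "tieba_" prefix and the default character position are assumptions.
function locate_topic_table($boardId, int $dbCharIndex = 3): string
{
    $hash = get_hash($boardId);
    $database = 'tieba_' . $hash[$dbCharIndex];
    return $database . '.topic_' . $hash;
}

echo locate_topic_table(1), "\n";      // tieba_0.topic_3100
echo locate_topic_table(23819), "\n";  // tieba_3.topic_3233
```

With numeric board IDs, the last hash character is the second digit of the ID (or the padding "0"), so this spreads the tables over at most ten databases and keeps each database directory well under the file count mentioned above.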
