bitsCN.com
How to deal with MySQL master-slave replication lag
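High-traffic, high-concurrency sites such as Facebook, Kaixin001, Renren, Youku, Douban, and Taobao cannot be carried by a single database. Most Web 2.0 sites run MySQL. One option is MySQL's own MySQL NDB Cluster (supported in MySQL 5.0 and later) or its built-in partitioning (supported in MySQL 5.1 and later), but as far as I know few sites use either. The usual approach is master-slave replication plus MySQL Proxy for load balancing and read/write splitting, with vertical and horizontal partitioning layered on top of replication. Alternatively, a site can skip replication entirely and rely on vertical plus horizontal partitioning together with a Memcached-like system.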
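1. Youku's experience
The database was scaled horizontally with master-slave replication. As the number of slaves grew, replication lag kept getting worse until it became intolerable.
Youku eventually moved to database sharding, placing the tables and data for a group of users on one group of databases.
SSDs were used to optimize MySQL I/O, with a clear performance gain: 16 GB per drive, six SSDs in a RAID array.
MyISAM was chosen as the storage engine.
Splitting strategy: first split vertically by business area or module, then split especially large tables horizontally.
Shard by user and avoid cross-shard queries where possible. If a cross-shard query is really needed, consider a search-based approach: index first, then search.
A distributed database solution was judged too complex and rejected.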
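Youku adopted database sharding and thereby sidestepped the replication lag that grows with data volume. Sharding is by user_id, so a global table is needed to manage the mapping between users and shards: user_id yields shard_id, and shard_id then identifies the shard to query.
Suppose this table is named sharding_manager. If the site has a very large user base, say tens of millions or even hundreds of millions of users, this table may itself become a bottleneck, because lookups are very frequent: every dynamic request has to read it. In that case another solution can be used, such as Memcached, Tokyo Cabinet, Berkeley DB, or some other higher-performance store.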
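The idea of putting a faster store in front of sharding_manager can be sketched as a cache-aside lookup. This is only an illustration: the class name ShardDirectory is hypothetical, and two in-memory dictionaries stand in for the sharding_manager table and for Memcached.

```python
from typing import Dict, Optional


class ShardDirectory:
    """Looks up a user's shard, consulting a cache before the global table."""

    def __init__(self, table: Dict[int, int]) -> None:
        # `table` stands in for the sharding_manager table (user_id -> shard_id).
        self._table = table
        # `_cache` stands in for Memcached or a similar key-value store.
        self._cache: Dict[int, int] = {}
        # Counts how often the global table had to be read.
        self.table_reads = 0

    def shard_for(self, user_id: int) -> Optional[int]:
        cached = self._cache.get(user_id)
        if cached is not None:
            return cached
        # Cache miss: fall back to the global table and remember the answer.
        self.table_reads += 1
        shard_id = self._table.get(user_id)
        if shard_id is not None:
            self._cache[user_id] = shard_id
        return shard_id


directory = ShardDirectory({101: 2, 105: 0})
first = directory.shard_for(101)   # miss: reads the table, then caches
second = directory.shard_for(101)  # hit: served from the cache
```

Because a user's shard assignment rarely changes, the cached entry can live for a long time, so most requests never touch the global table.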
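Youku's architecture document does not say in much detail how a request is routed to a particular db server, database, and shard (that is, userN, msgN, videoN), so what follows is only a guess.
According to Youku's architecture diagram, there are 2 db servers, each with 2 databases, and each database has 3 shards, for a total of 2 * 2 * 3 = 12 shards.
user_id is usually an auto-increment field generated when the user registers. Suppose there are m db servers: user_id % m selects a db server (for example, 0 maps to 192.168.1.100, 1 maps to 192.168.1.101, and so on), which fixes the value of the mysql_server_ip field. Suppose each server has n databases: user_id % n selects the database, which fixes database_name. Suppose each database has i shards: user_id % i selects the shard, which fixes shard_id. With these three values the actual database operations can proceed.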
user_id  shard_id  mysql_server_ip  database_name
101      2         192.168.1.100    shard_db1
105      0         192.168.1.100    shard_db2
108      0         192.168.1.101    shard_db3 (or shard_db1)
110      1         192.168.1.101    shard_db4 (or shard_db2)
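For example, the user with user_id 101 above connects to database server 192.168.1.100, uses the database shard_db1 on it, and within that database uses the table set user2, msg2, video2.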
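The three modulo steps can be sketched as follows. This is a guess at one possible implementation, in the same spirit as the guess in the text: the function names locate and table_names are hypothetical, and naming the databases shard_db1..shard_dbN on every server is an assumption (the text allows the same database names on different servers). Note that applying the formulas literally does not reproduce every sample row in the table above, which suggests those rows are illustrative.

```python
from typing import List, NamedTuple, Sequence


class ShardLocation(NamedTuple):
    server_ip: str
    database_name: str
    shard_id: int


def locate(user_id: int,
           servers: Sequence[str],
           databases_per_server: int,
           shards_per_database: int) -> ShardLocation:
    """Apply the three modulo steps described in the text."""
    server_index = user_id % len(servers)             # user_id % m
    database_index = user_id % databases_per_server   # user_id % n
    shard_id = user_id % shards_per_database          # user_id % i
    return ShardLocation(
        server_ip=servers[server_index],
        # Assumed naming: shard_db1, shard_db2, ... reused on each server.
        database_name=f"shard_db{database_index + 1}",
        shard_id=shard_id,
    )


def table_names(shard_id: int) -> List[str]:
    """Tables that make up one shard, e.g. user2, msg2, video2."""
    return [f"{prefix}{shard_id}" for prefix in ("user", "msg", "video")]


SERVERS = ["192.168.1.100", "192.168.1.101"]
loc = locate(108, SERVERS, databases_per_server=2, shards_per_database=3)
tables = table_names(loc.shard_id)
```

One consequence worth checking before adopting such a scheme: because all three steps use the same user_id, m = n = 2 always pairs server index 0 with database index 0 and server index 1 with database index 1, so some server/database combinations would never be chosen.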
If m, n, or i changes, for example because the user base keeps growing and more db servers have to be added, the data must be migrated.
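To get a feel for how much data such a migration touches, the following sketch counts how many users land on a different server index when m changes under plain modulo placement. The function name moved_fraction and the ID range are made up for illustration.

```python
def moved_fraction(old_m: int, new_m: int, user_ids: range) -> float:
    """Fraction of users whose server index changes when m changes."""
    moved = sum(1 for uid in user_ids if uid % old_m != uid % new_m)
    return moved / len(user_ids)


# Growing from 2 to 3 db servers relocates about two thirds of the users.
fraction = moved_fraction(old_m=2, new_m=3, user_ids=range(1, 600_001))
```

This is why the global user-to-shard table matters: with an explicit mapping, existing users can stay where they are and only new or deliberately moved users need new entries.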
Because the tables live in different databases, table names can be the same across databases.
server1 (192.168.1.100)
    shard_db1
        user0
        msg0
        video0
        user1
        msg1
        video1
        ...
        userN
        msgN
        videoN
    shard_db2
        user0
        msg0
        video0
        user1
        msg1
        video1
        ...
        userN
        msgN
        videoN
Because the databases live on different database servers, database names can be the same across servers.
server2 (192.168.1.101)
    shard_db3 (shard_db1 could also be used here)
        user0
        msg0
        video0
        user1
        msg1
        video1
        ...
        userN
        msgN
        videoN
    shard_db4 (shard_db2 could also be used here)
        user0
        msg0
        video0
        user1
        msg1
        video1
        ...
        userN
        msgN
        videoN
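2. Douban's experience
Replication from the master to the slave takes time.
After the master is updated, the next request is often a read of that same data (the page is refreshed after the update).
Reading from the slave at that point puts stale data into the cache. (I am not sure exactly which cache is meant. If it is Memcached and a large amount of data is updated, would all of the updated data really be kept in Memcached?)
Solution: after updating the database, proactively refresh the cache whenever the data is expected to be read right away.
Not perfect, but it works.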
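The proactive refresh can be sketched like this. It is an illustration only, not Douban's code: ProfileStore and its methods are hypothetical names, and dictionaries stand in for the master, the lagging slave, and the cache.

```python
from typing import Dict


class ProfileStore:
    def __init__(self) -> None:
        self.master: Dict[int, str] = {}
        self.replica: Dict[int, str] = {}  # lags behind the master
        self.cache: Dict[int, str] = {}

    def update_name(self, user_id: int, name: str) -> None:
        self.master[user_id] = name
        # Proactive refresh: store the new value in the cache right away,
        # so the next read does not fall through to the stale replica.
        self.cache[user_id] = name

    def read_name(self, user_id: int) -> str:
        if user_id in self.cache:
            return self.cache[user_id]
        value = self.replica[user_id]
        self.cache[user_id] = value
        return value

    def replicate(self) -> None:
        """Simulate replication catching up."""
        self.replica.update(self.master)


store = ProfileStore()
store.master[1] = store.replica[1] = "old name"
store.update_name(1, "new name")
fresh = store.read_name(1)  # served from the refreshed cache
```

The read returns the new value even though the replica has not caught up yet. The weak spot is the one the text calls "not perfect": if the cache entry is evicted before replication finishes, the next read repopulates it from the stale replica.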
Douban later switched to dual MySQL masters plus slaves, which reportedly solves the replication delay problem. I do not know the details of how.
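3. Facebook's experience
The following passage is quoted from www.dbanotes.net.
A large number of MySQL + Memcached servers, deployed roughly as follows:
California (primary, Write/Read)............. Virginia (Read Only)
The primary data center is in California and the remote one is in Virginia. Network latency between the two is 70 ms, and MySQL replication lag sometimes reaches 20 ms. If read-only requests are to be served from the Virginia side, consistency of the data cached in Memcached becomes a problem.
1 A user issues an update, changing the name "Jason" to "Monkey";
2 The master database writes "Monkey" and the name value is deleted from Memcached on the master side, but not from Memcached on the Virginia side (a small trick in SQL parsing "signals" the update to the remote side);
3 Someone in Virginia views the user's profile;
4 The key is found in Memcached and the value "Jason" is returned;
5 Replication catches up and sets the user's name to "Monkey" in the slave database, and the key is deleted from Virginia's Memcached;
6 Someone in Virginia views the user's profile;
7 The key is not found in Memcached, so the value is read from the slave, which returns the correct "Monkey".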
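Steps 3 and 4 above show that stale reads still occur. Also, when the master database is updated, the Memcached on the slave side is not updated at that moment; the slave side is only notified that the data has changed.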
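The seven-step sequence above can be replayed in a small simulation. This is a toy model under stated assumptions, not Facebook's implementation: the Region class is hypothetical, dictionaries stand in for MySQL and Memcached, and a list named pending stands in for the replication stream.

```python
from typing import Dict, List, Tuple


class Region:
    """One data center: a database plus a look-aside cache."""

    def __init__(self, db: Dict[str, str]) -> None:
        self.db = dict(db)
        self.cache: Dict[str, str] = {}

    def read(self, key: str) -> str:
        if key not in self.cache:
            self.cache[key] = self.db[key]
        return self.cache[key]


california = Region({"name": "Jason"})
virginia = Region({"name": "Jason"})
virginia.read("name")  # warm Virginia's cache with "Jason"

# Updates written on the master but not yet applied in Virginia.
pending: List[Tuple[str, str]] = []

# Steps 1-2: write on the master and invalidate only the master-side cache.
california.db["name"] = "Monkey"
california.cache.pop("name", None)
pending.append(("name", "Monkey"))

# Steps 3-4: a Virginia reader still gets the cached old value.
before = virginia.read("name")

# Step 5: replication catches up; applying the update also deletes the key.
for key, value in pending:
    virginia.db[key] = value
    virginia.cache.pop(key, None)
pending.clear()

# Steps 6-7: cache miss, read from the slave, get the new value.
after = virginia.read("name")
```

Here before holds "Jason" and after holds "Monkey": the window of staleness is exactly the time the update spends in the replication stream.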
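Could it work like this instead: when data on the master is updated, immediately update the corresponding data in the slave side's Memcached? There would still be some delay, but it should be much shorter, short enough to be negligible.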
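4. Netlog's experience
For data that is important and must be current, replication lag causes visible errors. For example, a user changes their password (written to the Master) and then logs in with the new password (read from the Slaves); the passwords do not match, and login fails for a short time. For reads that need real-time data it is best to go straight to the Master and so avoid stale Slaves. Fortunately such reads are uncommon: when a user changes their email address, for instance, nothing needs to read it back immediately. So the Master-Slaves architecture remains effective in most cases.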
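A minimal sketch of this routing rule, assuming the application knows which reads must be current: the class ReplicatedUsers and the realtime flag are hypothetical, and dictionaries stand in for the Master and a lagging Slave.

```python
from typing import Dict, Optional


class ReplicatedUsers:
    def __init__(self) -> None:
        self.master: Dict[str, str] = {}
        self.slave: Dict[str, str] = {}  # only updated by replicate()

    def write(self, key: str, value: str) -> None:
        """All writes go to the Master."""
        self.master[key] = value

    def read(self, key: str, realtime: bool = False) -> Optional[str]:
        """Read from the Master only when the caller needs current data."""
        source = self.master if realtime else self.slave
        return source.get(key)

    def replicate(self) -> None:
        """Simulate the Slave catching up with the Master."""
        self.slave.update(self.master)


users = ReplicatedUsers()
users.write("alice:password_hash", "old-hash")
users.replicate()

# The user changes the password; the Slave has not caught up yet.
users.write("alice:password_hash", "new-hash")
login_check = users.read("alice:password_hash", realtime=True)  # Master
lagging = users.read("alice:password_hash")                     # Slave
```

The login check sees the new value because it is routed to the Master, while ordinary reads keep going to the Slave and keep the Master's load low.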