
How to query after sharding a MySQL database

(*-*)浩 (Original)
2019-05-28 17:38:07, 7368 views

The strategy for splitting databases and tables depends on the project's requirements. The conventional approach is used here: sharding by modulus. Assume we have 2 horizontally split databases, each holding 2 horizontally split tables, for 4 tables in total. By default, queries are ordered by id only and by no other condition. Suppose we want to fetch page 41, with 10 records per page.
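A minimal sketch of modulus routing for the 2 databases x 2 tables layout described above. The function names, the `db_`/`brand_` naming pattern, and the exact way the remainder is split into a database index and a table index are assumptions made for illustration; the article only says that the location is obtained by taking the modulus of the id.

```python
from typing import Tuple

DB_COUNT = 2        # horizontally split databases
TABLES_PER_DB = 2   # horizontally split tables in each database


def route(record_id: int) -> Tuple[int, int]:
    """Return (database index, table index) for an integer id."""
    slot = record_id % (DB_COUNT * TABLES_PER_DB)   # 0..3, one slot per physical table
    return slot // TABLES_PER_DB, slot % TABLES_PER_DB


def physical_name(record_id: int, logical: str = "brand") -> str:
    """Build a hypothetical physical table name such as 'db_1.brand_0'."""
    db, table = route(record_id)
    return f"db_{db}.{logical}_{table}"


print(physical_name(6))
```

Any deterministic mapping from id to one of the 4 tables works equally well, as long as writes and reads use the same one.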


The first approach:

This is also the simplest approach: add an extra association (index) table. It must have an id column. Database-id and table-id columns (that is, which database and which table a row lives in) are optional, because both can be derived from the id by taking the modulus. Note that this table holds every row, but with fewer columns: it carries only the few columns needed for indexing. In this case we only need select * from brand_temp where ... limit 400,10 (the data for page 41, with 10 records per page); once we have the ids, we look up the full rows in the corresponding tables.
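The association-table lookup can be sketched as follows. This is only an illustration under stated assumptions: Python's built-in sqlite3 stands in for MySQL, the 4 shard tables (hypothetically named brand_0 to brand_3) live in one in-memory database instead of 2 separate ones, and the `name`/`detail` columns are invented sample data. Only the table name brand_temp comes from the article.

```python
import sqlite3

SHARDS = 4  # 2 databases x 2 tables, flattened into 4 tables for this demo

conn = sqlite3.connect(":memory:")
# The association table holds every id but only the columns needed for paging.
conn.execute("CREATE TABLE brand_temp (id INTEGER PRIMARY KEY, name TEXT)")
for s in range(SHARDS):
    conn.execute(
        f"CREATE TABLE brand_{s} (id INTEGER PRIMARY KEY, name TEXT, detail TEXT)"
    )

for i in range(1, 1001):
    conn.execute(
        f"INSERT INTO brand_{i % SHARDS} VALUES (?, ?, ?)",
        (i, f"brand-{i}", f"detail-{i}"),
    )
    conn.execute("INSERT INTO brand_temp VALUES (?, ?)", (i, f"brand-{i}"))


def fetch_page(page_num, page_size):
    offset = (page_num - 1) * page_size
    # Step 1: page through the association table only.
    ids = [
        row[0]
        for row in conn.execute(
            "SELECT id FROM brand_temp ORDER BY id LIMIT ? OFFSET ?",
            (page_size, offset),
        )
    ]
    # Step 2: group the ids by shard (modulus) and fetch the full rows.
    by_shard = {}
    for i in ids:
        by_shard.setdefault(i % SHARDS, []).append(i)
    rows = []
    for shard, shard_ids in by_shard.items():
        marks = ",".join("?" * len(shard_ids))
        rows += conn.execute(
            f"SELECT id, name, detail FROM brand_{shard} WHERE id IN ({marks})",
            shard_ids,
        ).fetchall()
    return sorted(rows)


page = fetch_page(41, 10)
print([row[0] for row in page])
```

The cost of this design is that every insert and delete has to be applied to the association table as well as to the shard.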

The second approach:

This is the most expensive approach. To fetch the first page on a single database and a single table, the SQL is: select * from db limit 0, 10; After sharding, the statement stays the same, but we now have to merge the records returned by the 4 tables in memory, sort them by id in ascending order, and return the first 10. This is fine when the data volume and the page number are small. For page 2, the single-database SQL is: select * from db limit 10,10; That does not work on a sharded database, because rows would obviously be lost. The fix is to query everything up to the requested page: select * from db_x limit 0,10+10 // the number of records wanted on a single database plus all the records before them. The records returned by all tables are then merged and sorted in memory, and the page is taken starting from offset 10. So for page n with m records per page, each table has to return (n-1)*m + m = n*m records, and with t tables t * n * m records have to be sorted in memory. If the CPU does not explode, I lose.
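A small sketch of this merge-everything approach, with each shard modelled as an id-sorted Python list instead of a real table. The helper names are hypothetical, and `query_shard` merely stands in for running the limit 0, n*m statement against one table.

```python
import heapq
from itertools import islice


def query_shard(rows, limit):
    # Stands in for: SELECT * FROM db_x ORDER BY id LIMIT 0, limit
    return rows[:limit]


def global_page(shards, page_num, page_size):
    per_shard_limit = page_num * page_size            # (n-1)*m + m = n*m rows per table
    partials = [query_shard(rows, per_shard_limit) for rows in shards]
    merged = heapq.merge(*partials)                   # in-memory merge of t sorted lists
    start = (page_num - 1) * page_size
    return list(islice(merged, start, start + page_size))


# 100 ids spread over 4 tables by modulus.
shards = [[i for i in range(1, 101) if i % 4 == k] for k in range(4)]
print(global_page(shards, 2, 10))
```

Every table has to return n*m rows because, in the worst case, the whole requested page and everything before it sits in a single table.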

The third approach:

This approach is business-driven: it forbids users from jumping to an arbitrary page. In other words, users can only browse by clicking next page or previous page. The method is to record the largest unique id on the current page and add it as a where condition on the next query. Starting from the beginning: the first query has pageNum=1, pageSize=10, maxId=0 -> sql: select * from db_x where id>0 limit 10; The statement is sent to every table of every database, the resulting 4*10 records are merged and sorted in memory, and the first 10 are taken. The id of the 10th record is stored as maxId, rendered into the front-end page and kept there. When the user clicks next page, that value (for example maxId=10) is submitted as well, and the SQL becomes select * from db_x where id>10 limit 10; the merge and save steps then repeat. The data returned this way is stable and consistently ordered.
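The next-page flow above can be sketched like this, again with each shard modelled as an id-sorted list. The function names are hypothetical; `query_shard` stands in for the where id > maxId limit m statement, assumed to run with ascending id order.

```python
import heapq
from bisect import bisect_right
from itertools import islice


def query_shard(rows, max_id, limit):
    # Stands in for: SELECT * FROM db_x WHERE id > max_id ORDER BY id LIMIT limit
    start = bisect_right(rows, max_id)
    return rows[start:start + limit]


def next_page(shards, max_id, page_size):
    partials = [query_shard(rows, max_id, page_size) for rows in shards]
    page = list(islice(heapq.merge(*partials), page_size))   # keep the smallest m ids
    new_max_id = page[-1] if page else max_id                # value the front end keeps
    return page, new_max_id


shards = [[i for i in range(1, 101) if i % 4 == k] for k in range(4)]
page1, max_id = next_page(shards, 0, 10)
page2, max_id = next_page(shards, max_id, 10)
print(page1, page2, max_id)
```

Each table returns at most m rows no matter how deep the user pages, which is why this approach stays cheap.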

The fourth approach:

This is the legendary best approach: it supports jumping to an arbitrary page. Its core is two rounds of SQL queries. Specifically:

Assumption: we query page 1001, with 10 records per page

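1): First record the range of rows to fetch: it starts at (1001-1)*10=10000 and ends at 10010 -> 10000-10010
The single-database SQL is: select * from db limit 10000,10;
We have 4 tables in total, so each table's start should be 10000/4=2500, and the SQL becomes:
select * from db_x limit 2500,10;	// assumes the rows are evenly distributed, so we can divide equally; an uneven split is fine too, the later steps compensate
We get records from the 4 tables: (I have not written the demo yet, so these are written by hand)
T1:(1,"a"),.......
T2:(2,"b"),.......
T3:(3,"c"),.......
T4:(4,"d"),.......
Real data on page 1001 could not start at 1, so bear with it. The demo will be published in a few days, together with a write-up on RabbitMQ distributed consistency
OK, the first-phase SQL query is done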

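2): Compare the ids of the records returned by the 4 tables (if the id is not an integer, compare by hashCode yourself). Because the query is ascending, we only need to compare the id of the first record from each table.
This gives the smallest value, minId=1, and for each table its largest value, maxId. OK, now switch the SQL approach and use a conditional query (the first compensation step):
select * from db_x where id between minId and maxId This way we obtain the rows that were missed (along with some extra ones)
The 4 tables now return possibly different numbers of records, and step two is done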

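3):
Next, record the position at which minId falls in each table: for example 2500 in T1, 2500-2=2498 in T2, 2500-3=2497 in T3, and 2500-3=2497 in T4.
The global position of minId is then 2500+2498+2497+2497=10000-2-3-3=9992. So, starting from position 9992 in the merged result, we skip the next
8 records, and the 10 records after that are the requested page. This approach returns exact results; its only drawback is that every request needs a second round of SQL queries.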
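The three steps can be sketched as below, with each shard modelled as an id-sorted list of integers. The function names are hypothetical. The sketch assumes, as the text does, that rows are spread fairly evenly; the final check is an addition of this sketch, not part of the article, and raises when a shard's second-phase window cannot be shown to cover the whole page.

```python
import heapq
from bisect import bisect_left, bisect_right


def first_query(rows, start, limit):
    # Stands in for: SELECT * FROM db_x ORDER BY id LIMIT start, limit
    return rows[start:start + limit]


def second_query(rows, min_id, max_id):
    # Stands in for: SELECT * FROM db_x WHERE id BETWEEN min_id AND max_id ORDER BY id
    return rows[bisect_left(rows, min_id):bisect_right(rows, max_id)]


def two_phase_page(shards, offset, limit):
    start = offset // len(shards)                  # e.g. 10000 / 4 = 2500
    # Step 1: every table is queried with the divided offset.
    first = [first_query(rows, start, limit) for rows in shards]
    if any(len(part) < limit for part in first):
        raise ValueError("sketch assumes every table can fill its first window")
    # Step 2: minId is the smallest first id; re-query each table up to its own maxId.
    min_id = min(part[0] for part in first)
    second = [second_query(rows, min_id, part[-1])
              for rows, part in zip(shards, first)]
    # Step 3: a table's first-phase head sits at local offset `start`; each extra
    # row returned in front of it moves minId's local position back by one.
    global_offset = sum(start - sec.index(part[0])
                        for sec, part in zip(second, first))
    merged = list(heapq.merge(*second))
    skip = offset - global_offset                  # e.g. 10000 - 9992 = 8
    page = merged[skip:skip + limit]
    # Added safety check: below the smallest per-table maxId nothing can be missing.
    safe_upper = min(part[-1] for part in first)
    if len(page) < limit or page[-1] > safe_upper:
        raise LookupError("tables too uneven for the second-phase window")
    return page


shards = [[i for i in range(1, 201) if i % 4 == k] for k in range(4)]
print(two_phase_page(shards, 100, 10))
```

When the check raises, a caller would have to fall back to one of the other approaches, for instance the merge-everything query of the second approach.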
