
The amount of data is too large and paging query becomes slow. Is there any way to optimize the query?

WBOY (Original) · 2016-08-04 09:20:00

Because there are too many user records (3 million), paginated queries have become very slow. How should the statement or the indexes be optimized? Do you have any optimization plans?

Supplementary content:
Thank you everyone for your suggestions. I found them very inspiring, but my situation is special and they don't quite apply. The specifics are as follows:
1. When a user opens the participation record page, it should display the records of the 10 most recent participants. (These are not simply the latest ten rows in the user participation record table, but the ten rows selected after applying WHERE conditions.)
2. The latest data for those 10 participants comes from two tables (the user information table and the user participation record table).
3. Users must be able to page forward to the next page and back to the previous page.
Right now the query joins the two tables, applies WHERE conditions on both, and then applies LIMIT. This is too slow, so are there any optimization solutions?

Replies:

1. Don't use LIMIT without conditions: LIMIT 1000, 20 still scans rows 0 through 1020.
2. Remember the last primary key on the previous page, and paginate with SELECT * FROM user WHERE uid > lastUid LIMIT pagesize (see the sketch after this list).
3. If the data-consistency requirements are not strict, consider adding a caching layer in front of the database.
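
A minimal sketch of the keyset ("seek") approach from point 2, assuming a user table whose auto-increment primary key is uid, a page size of 10, and 12345 standing in for the last uid of the previous page:

```sql
-- Keyset pagination: there is no offset to scan past; the primary-key
-- index seeks straight to the first row of the next page.
SELECT *
FROM user
WHERE uid > 12345   -- last uid shown on the previous page
ORDER BY uid ASC
LIMIT 10;           -- page size
```

The trade-off is that you can only step page by page; jumping directly to an arbitrary page number is no longer possible.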

<code class="sql">EXPLAIN SELECT * FROM lagou where id <=100</code>

(Screenshot: EXPLAIN output, type = range)

<code class="sql">EXPLAIN SELECT * FROM lagou LIMIT 100</code>

(Screenshot: EXPLAIN output, type = ALL)

Note the type column: one is range and the other is ALL.

Summary: What a painful realization~

MySQL LIMIT paging optimization (where id is the auto-increment primary key of the posts table):

SELECT * FROM posts ORDER BY id DESC LIMIT 20 OFFSET 10000

This scans 10,020 rows, skips the first 10,000, and returns the last 20, which is slow.
A better design is to replace explicit page numbers with a "Next Page" button.
Assuming each page displays 20 records, each query asks LIMIT for 21 records but displays only 20.
If the 21st record exists, show the "Next Page" or "AJAX Load More" button (see the sketch below).
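
A sketch of that N+1 trick in SQL, using the same posts table:

```sql
-- Ask for 21 rows but render only the first 20; the 21st row's existence
-- merely tells us whether a next page exists.
SELECT * FROM posts ORDER BY id DESC LIMIT 21;
```

If 21 rows come back, render 20 of them plus the button; if 20 or fewer come back, this is the last page.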

Use "previous page" and "next page" for paging:
Previous page (id: 60~41)
Current page (id: 40~21)
Next page (id: 20~01)

Previous page (20 newer posts):
//page.php?id=40 — a positive number means the previous page of newer posts; the 40 here is the id of the top post on the current page.
SELECT * FROM posts WHERE id > 40 ORDER BY id ASC LIMIT 20;
This returns the ids in ascending order; PHP then uses array_reverse to reverse the array so the output is descending.

Next page (20 older posts):
//page.php?id=-21 — a negative number means the next page of older posts; the 21 here is the id of the bottom post on the current page.
SELECT * FROM posts WHERE id < 21 ORDER BY id DESC LIMIT 20;
If you think negative numbers look ugly, you can add an extra parameter instead, such as page.php?id=21&next

Personally I think this is still the classic pagination optimization problem; yours just paginates after joining two tables.

Typical pagination optimization is:

Normal writing:

```sql
SELECT * FROM buyer WHERE sellerid = 100 LIMIT 100000, 20;
```

With the usual LIMIT M, N page turning, the deeper you page the slower it gets, because MySQL reads the first M+N rows of the result; the larger M is, the worse the performance.

Optimized writing:

```sql
SELECT t1.*
FROM buyer t1,
     (SELECT id FROM buyer WHERE sellerid = 100 LIMIT 100000, 20) t2
WHERE t1.id = t2.id;
```

You need to create an index on the sellerid column of the buyer table; id is the table's primary key (see the sketch below).
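
A sketch of that supporting index (the index name is made up):

```sql
-- Composite index so the inner subquery runs as an index-only scan:
-- it filters on sellerid and pages over id without touching row data.
ALTER TABLE buyer ADD INDEX idx_sellerid_id (sellerid, id);
```

On InnoDB the primary key is already appended to every secondary index, so an index on sellerid alone would work too; the composite form just makes the intent explicit.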

If the id primary key is auto-incremented and contiguous (no records have ever been deleted), you can switch to WHERE ... BETWEEN (see the sketch below).
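
A sketch of the BETWEEN variant, with a page size of 10; page n then maps directly to an id range:

```sql
-- Rows 100001..100010 (page 10001 of size 10); valid only while the ids
-- are contiguous, i.e. nothing has ever been deleted.
SELECT * FROM user WHERE id BETWEEN 100001 AND 100010;
```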

If the ids are not contiguous, assuming user_id is the auto-increment primary key:

<code class="sql">SELECT * FROM user
WHERE user_id >= (
SELECT user_id FROM user
LIMIT pageIndex, 1
)
LIMIT pageSize;</code></p>
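
A concrete instance of the pattern above (page 1001 with 10 rows per page, so pageIndex = 10000), with ORDER BY added to make the primary-key order explicit:

```sql
-- The subquery walks only the primary-key index to find the first id of
-- the page; the outer query then reads just the 10 rows actually shown.
SELECT * FROM user
WHERE user_id >= (
    SELECT user_id FROM user ORDER BY user_id LIMIT 10000, 1
)
ORDER BY user_id
LIMIT 10;
```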

Reference: https://mp.weixin.qq.com/s?__biz=MjM5NzMyMjAwMA==&mid=2651477279&idx=1&sn=4a799a4aa2f8563d482973e262e906b8&scene=1&srcid=0629Qe4RKqmpBhwOUzFJeK1B&key=77421cf58af4a65302d0d5987b2f5d1610469f10ba9f2ac129747c2578d1a68463e33f8c7f388537de80a7bc38d00c05&ascene=0&uin=ODYxODM1MDE4&devicetype=iMac+MacBookAir6%2C2+OSX+OSX+10.9.5+build(13F34)&version=11020012&pass_ticket=aZDru5jCHRXbpBvmRG0m%2BkUwb4sJZ2hIfzTt6LDC1xnW2ZMj0mgDjwuyZgisQJxA

Define the WHERE clause according to your own query conditions, and don't repeatedly COUNT the number of records.
1. Remember the last record of the current page; on the next query, add a WHERE clause matching the ORDER BY instead of skipping N rows.
2. Once the data volume is large, the total number of pages loses practical meaning, but COUNT stays expensive; if you really need it, fetch it once on the first query (see the sketch below).
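
A sketch of point 2, against a hypothetical participation table and filter:

```sql
-- Run the expensive full count once on the first page, cache the result
-- in the application (session, Redis, ...), and reuse it while paging.
SELECT COUNT(*) AS total
FROM participation_record          -- hypothetical table
WHERE activity_id = 42;            -- hypothetical filter
```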

I agree with the answers above.

1. Improve the search and the indexes you build.
2. Design the table's default ordering as the sort rule, to cut query-time sorting cost.
3. When id is an ordered number, use the last seen id as the next query's condition, which drastically shrinks the returned data set.
4. Reduce the frequency and precision of COUNT; for very large data sets, cache the COUNT value.
5. For still larger data sets, split the tables by pagination rule, type, etc., to reduce the amount of data per query (see the partitioning sketch below).
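
Point 5 could look like MySQL range partitioning; a sketch with an entirely assumed schema (table, column, and partition names are made up):

```sql
-- Split the participation record table by month so a typical page query
-- touches only one partition. Note that MySQL requires the partition key
-- to be part of every unique key on the table.
ALTER TABLE participation_record
PARTITION BY RANGE (TO_DAYS(created_at)) (
    PARTITION p201607 VALUES LESS THAN (TO_DAYS('2016-08-01')),
    PARTITION p201608 VALUES LESS THAN (TO_DAYS('2016-09-01')),
    PARTITION pmax    VALUES LESS THAN MAXVALUE
);
```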

1. Reduce the use of functions in your SQL statements.
2. Reduce ORDER BY and GROUP BY.
3. Add indexes on the columns used in the query conditions.
4. With 3 million rows these should be operation records, not data that must be displayed in real time, so you can cache the data.
5. On the JS side, use $(obj).load("url #id") for AJAX partial-refresh paging, which avoids reloading the page's resources.
That's all I know.

As I understand it, your filter conditions live in the user participation record table, while the user information table really only needs to be read by the ids taken from that first table. I assume you already have indexes in place; what can still make a big difference is partitioning and pre-sorting: partition the participation record table by its main query condition and the user information table by id. And the participation record table must be pre-sorted; if you sort at query time, it will never be fast.

Add a WHERE condition of primary key > 0 and select only the primary key for the pagination step; then fetch the content with WHERE ... IN on those primary keys. That is much faster than your current query (see the sketch below).
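
A sketch of this two-step pattern, with assumed table and column names:

```sql
-- Step 1: paginate over primary keys only; with a suitable index this is
-- an index-only scan.
SELECT id
FROM participation_record
WHERE id > 0                    -- plus your real filter conditions
ORDER BY id
LIMIT 0, 10;

-- Step 2: fetch the full rows for exactly those ids.
SELECT *
FROM participation_record
WHERE id IN (101, 102, 103);    -- the ids returned by step 1
```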

Cache each user's information, and run the paged query only against the user participation record table.

You are joining two tables and displaying the records of the 10 most recent participants. With that join, the SQL gets complicated, so I suggest implementing it with two SQL statements plus a PHP loop: one SQL for the personal records you need, one SQL for the users, then loop over the arrays to build the final array. In short, split the complex SQL into simple SQLs (a sketch follows).
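
A sketch of the split, with assumed table and column names; the application then stitches the two result sets together by user_id:

```sql
-- Query 1: the 10 most recent participation records matching the filter.
SELECT user_id, created_at
FROM participation_record
WHERE status = 1                 -- hypothetical filter
ORDER BY created_at DESC
LIMIT 10;

-- Query 2: the matching users, fetched in one shot by primary key.
SELECT user_id, nickname
FROM user_info
WHERE user_id IN (7, 12, 35);    -- the user_ids from query 1
```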

Build suitable indexes.
Paginate over the primary keys first, then use the resulting primary keys to fetch the content. I once built queries over 70 million rows of log data this way, and pagination was fast.

If the data doesn't need to be fresh, you can use Sphinx for the queries.

When the LIMIT offset is large, use BETWEEN; when fetching data near the end of the table, use DESC to search in reverse (see the sketch below).
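
A sketch of the reverse lookup: to show a page near the end of 3,000,000 rows, flip the sort so the offset stays small:

```sql
-- Equivalent to OFFSET 2999980 in ascending order, but scans only 20
-- rows; reverse the rows in the application if ascending display order
-- is needed.
SELECT * FROM user ORDER BY id DESC LIMIT 0, 20;
```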

You could write a scheduled script that runs every 10 minutes or so and saves the fetched data; when users open the page, read from that table instead. Since you're only showing the latest participants, a delay of a few minutes should be acceptable.

First carefully review whether your SQL still has room for optimization; then consider database tuning, caching, and read/write splitting.
