集算器如何优化SQL计算（3）序运算

Home

Database

Mysql Tutorial

集算器如何优化SQL计算（3）序运算_MySQL

PHP中文网

May 27, 2016 pm 07:11 PM

mysqlOptimize sql

跨行引用

早期SQL不直接支持跨行引用，要生成序号后再JOIN，极其繁琐困难。引入窗口函数后的SQL能够较方便地引用其它行数据，但写法仍不简洁，有多个跨行引用项时代码会很长。而且如前所述，窗口函数在其它运算结果集基础上再实施，对窗口函数计算值的再引用就要写成子查询的形式，仍然繁琐。

MySQL不支持窗口函数，但支持在SQL中使用变量，可以引用到前面的行，但无法引用到后面的行。

集算器提供了方便自然的跨行引用语法。

各产品月销售表结构为：产品、月份、销量；现要找出销量比上月多10%的记录。

	A
1	=db.query("select * from 销售表 order by 产品,月份")
2	=A1.select(if(产品==产品[-1],销量/销量[-1])>1.1)

排序后可以简单用[-1]就可以引用前一月的数据，且可以直接基于跨行计算值过滤。使用SQL窗口函数则要用子查询，MySQL则要定义两个临时变量。

再计算上表中各月前后一个月的销量移动平均值：

	A
1	=db.query("select * from 销售表 order by 产品,月份")
2	=A1.derive(if(产品==产品[-1]&&产品==产品[1],销量{-1:1}.avg()):移动平均)

计算移动平均涉及到向后引用和集合引用，用[1]可引用下一行数据，{-1:1}可引用从上一行到下一行的字段值集合。类似地，SQL窗口函数也需要子查询先把相应行计算出来再做移动平均；而MySQL的变量不能后向引用，就很难直接计算了。

再看一例，简化的事件表结构为：序号，时刻，…；时刻应当和序号同步递增，但可能有错误，需要找出时刻没有和序号同步递增的记录。

	A
1	=db.query("select * from 事件表 order by 序号")
2	=A1.select(时刻!=max(时刻{:0})\|\|时刻!=min(时刻{0:}))	和前后所有记录对比

取集合时还可以从头取后或取到尾。SQL窗口函数也支持类似的写法，但两次比较要做两个不同方向的排序，当然了必须要用子查询。

有序分组

SQL只提供与次序无关的等值分组，但有时分组的键值并不能在每条记录中找到，而是和记录的次序有关，这种情况，用SQL又需要使用窗口函数（或其它更麻烦的手段）制造出序号才能实现。

集算器提供了与次序相关的分组机制，方便用于与连续区间相关的计算。

收支表结构为：月份、收入、支出；找出连续亏损达三月或以上的那些月份的记录。

	A
1	=db.query("select * from 收支表 order by 月份")
2	=A1.group@o(收入>支出).select(~.收入

group@o表示在分组时只比较相邻记录，如果相邻值发生变化则会分出一个新组。这样就可以根据收入支出的比较把收支记录分成赢利、亏损、赢利、…这样的组，然后取出其中亏损且成员不少于3的组再合并起来。

还是这个表，希望计算收入最长连续增长了几个月。可以设计这样的分组机制：收入增长时和上月分作一个组，收入下降时则分出一个新组，最后统计组成员的最大值。

	A
1	=db.query("select * from 收支表 order by 月份")
2	=A1.group@i(收入<收入[-1]).max(~.len())

group@i将在条件变化时分出一个新组，即收入降低时。

在窗口函数的支持下，SQL也能实现本例和上例的思路，但写法非常难懂。

区间合并也是常见的有序分组运算。设有事件发生区间表T有字段：S（开始时刻）、E（结束时刻）；现在要将这些区间中重叠部分去除后再计算该事件实际发生的总时长。

	A
1	$select S,E from T order by S
2	=A1.select(E>max(E{:-1}))	去除被包含的条目
3	=A2.run(max(S,E[-1]):S)	去除重叠时间段
4	=A2.sum(interval@s(max(S,E[-1]),E))	计算总时长
5	=A2.run(if(S	合并有重叠的时间段

这里给了多种目标的处理方法，充分利用了跨行运算和有序分组的特点。SQL要实现这种运算简单用窗口函数已经做不到了，需要用到很难理解的递归查询。

位置访问

对于有序的集合，有时我们需要直接用序号访问成员。SQL延用了数学上的无序集合概念，要生成序号再用条件过滤才能访问指定位置的成员，这对许多运算造成很大的麻烦。

集算器采用了有序集合机制，允许直接用序号访问成员，这类运算要方便得多。

比如经济统计中常用到的在众多价格中找出中位数：

	A
1	=db.query@i("select 价格 from T order by 价格")
2	=A1([(A1.len()+1)\2,A1.len()\2+1]).avg()

位置还可以用于分组。事件表结构为：序号、时刻、动作，动作有开始、结束两种，现在要统计事件持续的总时长，即每一对开始和结束之间的时间之和。

	A
1	=db.query@i("select 时刻 from 事件表 order by 时刻")
2	=A1.group((#-1)\2).sum(interval@s(~(1),~(2))

#表示记录序号，group((#-1)\2)即将数据每两个分成一组，然后针对每组计算时长再合计即可。

根据位置还能进行相邻跨行引用。设有股价表结构为：交易日、收盘价；现列出计算出股价超过100元的交易日及当日涨幅。

	A
1	=db.query("select * from 股价表 order by 交易日")
2	=A1.pselect@a(收盘价>100).select(~>1)
3	=A2.new(A1(~).交易日:交易日,A1(~).收盘价-A1(~-1).收盘价:涨幅)

pselect函数将返回满足条件的成员位置，使用这些位置就可以方便地计算涨幅，而不必象使用窗口函数时事先计算出所有涨幅再过滤。

---恢复内容结束---

以上就是集算器如何优化SQL计算（3）序运算_MySQL的内容，更多相关内容请关注PHP中文网（www.php.cn）！

Statement

The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Explain the role of InnoDB redo logs and undo logs.Apr 15, 2025 am 12:16 AM

InnoDB uses redologs and undologs to ensure data consistency and reliability. 1.redologs record data page modification to ensure crash recovery and transaction persistence. 2.undologs records the original data value and supports transaction rollback and MVCC.

What are the key metrics to look for in an EXPLAIN output (type, key, rows, Extra)?Apr 15, 2025 am 12:15 AM

Key metrics for EXPLAIN commands include type, key, rows, and Extra. 1) The type reflects the access type of the query. The higher the value, the higher the efficiency, such as const is better than ALL. 2) The key displays the index used, and NULL indicates no index. 3) rows estimates the number of scanned rows, affecting query performance. 4) Extra provides additional information, such as Usingfilesort prompts that it needs to be optimized.

What is the Using temporary status in EXPLAIN and how to avoid it?Apr 15, 2025 am 12:14 AM

Usingtemporary indicates that the need to create temporary tables in MySQL queries, which are commonly found in ORDERBY using DISTINCT, GROUPBY, or non-indexed columns. You can avoid the occurrence of indexes and rewrite queries and improve query performance. Specifically, when Usingtemporary appears in EXPLAIN output, it means that MySQL needs to create temporary tables to handle queries. This usually occurs when: 1) deduplication or grouping when using DISTINCT or GROUPBY; 2) sort when ORDERBY contains non-index columns; 3) use complex subquery or join operations. Optimization methods include: 1) ORDERBY and GROUPB

Describe the different SQL transaction isolation levels (Read Uncommitted, Read Committed, Repeatable Read, Serializable) and their implications in MySQL/InnoDB.Apr 15, 2025 am 12:11 AM

MySQL/InnoDB supports four transaction isolation levels: ReadUncommitted, ReadCommitted, RepeatableRead and Serializable. 1.ReadUncommitted allows reading of uncommitted data, which may cause dirty reading. 2. ReadCommitted avoids dirty reading, but non-repeatable reading may occur. 3.RepeatableRead is the default level, avoiding dirty reading and non-repeatable reading, but phantom reading may occur. 4. Serializable avoids all concurrency problems but reduces concurrency. Choosing the appropriate isolation level requires balancing data consistency and performance requirements.

MySQL vs. Other Databases: Comparing the OptionsApr 15, 2025 am 12:08 AM

MySQL is suitable for web applications and content management systems and is popular for its open source, high performance and ease of use. 1) Compared with PostgreSQL, MySQL performs better in simple queries and high concurrent read operations. 2) Compared with Oracle, MySQL is more popular among small and medium-sized enterprises because of its open source and low cost. 3) Compared with Microsoft SQL Server, MySQL is more suitable for cross-platform applications. 4) Unlike MongoDB, MySQL is more suitable for structured data and transaction processing.

How does MySQL index cardinality affect query performance?Apr 14, 2025 am 12:18 AM

MySQL index cardinality has a significant impact on query performance: 1. High cardinality index can more effectively narrow the data range and improve query efficiency; 2. Low cardinality index may lead to full table scanning and reduce query performance; 3. In joint index, high cardinality sequences should be placed in front to optimize query.

MySQL: Resources and Tutorials for New UsersApr 14, 2025 am 12:16 AM

The MySQL learning path includes basic knowledge, core concepts, usage examples, and optimization techniques. 1) Understand basic concepts such as tables, rows, columns, and SQL queries. 2) Learn the definition, working principles and advantages of MySQL. 3) Master basic CRUD operations and advanced usage, such as indexes and stored procedures. 4) Familiar with common error debugging and performance optimization suggestions, such as rational use of indexes and optimization queries. Through these steps, you will have a full grasp of the use and optimization of MySQL.

Real-World MySQL: Examples and Use CasesApr 14, 2025 am 12:15 AM

MySQL's real-world applications include basic database design and complex query optimization. 1) Basic usage: used to store and manage user data, such as inserting, querying, updating and deleting user information. 2) Advanced usage: Handle complex business logic, such as order and inventory management of e-commerce platforms. 3) Performance optimization: Improve performance by rationally using indexes, partition tables and query caches.

See all articles