MongoDB: Aggregation Pipeline

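New in MongoDB 2.2.

The aggregation pipeline is a data aggregation framework modeled on the concept of data processing pipelines. Documents enter a multi-stage pipeline that transforms them into an aggregated result.

The aggregation pipeline provides an alternative to map-reduce and is the preferred approach for many aggregation tasks, where the complexity of map-reduce may be unwarranted.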

[Figure: an annotated aggregation pipeline operation. The pipeline has two stages: $match and $group.]
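
As a minimal sketch of the two stages named above, the following pure-Python code imitates what a $match stage followed by a $group stage does to a handful of documents. The sample `orders` data, the field names (`cust_id`, `amount`, `status`), and the helper functions are assumptions made up for illustration; this is a model of the behavior, not MongoDB's implementation.

```python
from collections import defaultdict

# Hypothetical sample documents (not from the original article).
orders = [
    {"cust_id": "A123", "amount": 500, "status": "A"},
    {"cust_id": "A123", "amount": 250, "status": "A"},
    {"cust_id": "B212", "amount": 200, "status": "A"},
    {"cust_id": "A123", "amount": 300, "status": "D"},
]


def match_stage(docs, criteria):
    """Model of $match with simple equality criteria: keep matching documents."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]


def group_sum_stage(docs, key_field, sum_field):
    """Model of $group with a $sum accumulator: one output document per key."""
    totals = defaultdict(int)
    for d in docs:
        totals[d[key_field]] += d[sum_field]
    return [{"_id": key, "total": total} for key, total in totals.items()]


# The same pipeline written as MongoDB stage documents, for comparison.
pipeline = [
    {"$match": {"status": "A"}},
    {"$group": {"_id": "$cust_id", "total": {"$sum": "$amount"}}},
]

matched = match_stage(orders, {"status": "A"})
results = group_sum_stage(matched, "cust_id", "amount")
print(results)
```

The first stage drops the one document whose status is not "A"; the second collapses the remaining three documents into one result per customer.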

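The aggregation pipeline has several restrictions on value types and result size, which are briefly described below.

Aggregation operations that use the aggregate command have the following restrictions: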

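Type Restrictions

The aggregation pipeline cannot operate on values of the following types: Symbol, MinKey, MaxKey, DBRef, Code, and CodeWScope.

(MongoDB 2.4 removed the restriction on the Binary type. In MongoDB 2.2, the pipeline could not operate on Binary data.)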

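Result Size Restrictions

If the aggregate command returns a single document that contains the complete result set, the command produces an error when the result set exceeds the BSON Document Size limit, which is currently 16 megabytes. To manage result sets that exceed this limit, the aggregate command can return result sets of any size if it returns a cursor or stores the results in a collection.

(Changed in MongoDB 2.6: the aggregate command is not subject to this size limit when it returns a cursor or stores the results in a collection. db.collection.aggregate() returns a cursor and can return result sets of any size.)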

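Memory Restrictions

Changed in MongoDB 2.6.

Pipeline stages have a limit of 100 megabytes of RAM. If a stage exceeds this limit, MongoDB produces an error. To handle large datasets, use the allowDiskUse option so that aggregation pipeline stages can write data to temporary files.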
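
The options mentioned in the size and memory restrictions can be sketched as an aggregate command document. The helper `build_aggregate_command` and the collection names `orders` and `order_totals` are hypothetical; the field names `cursor` and `allowDiskUse` and the `$out` stage are the MongoDB 2.6 options described above. The code only builds the document and does not contact a server.

```python
def build_aggregate_command(collection, pipeline, out_collection=None):
    """Build an aggregate command document using MongoDB 2.6+ options."""
    stages = list(pipeline)
    if out_collection is not None:
        # $out stores the results in a collection; it goes last in the pipeline.
        stages.append({"$out": out_collection})
    return {
        "aggregate": collection,   # the command name comes first
        "pipeline": stages,
        "cursor": {},              # ask for a cursor instead of one result document
        "allowDiskUse": True,      # let stages write to temporary files
    }


# Hypothetical usage: group order totals and store them in another collection.
command = build_aggregate_command(
    "orders",
    [{"$group": {"_id": "$cust_id", "total": {"$sum": "$amount"}}}],
    out_collection="order_totals",
)
print(command)
```

Assuming a running server and the PyMongo driver, a document like this could presumably be passed to `Database.command`; that step is left out here so the sketch stays self-contained.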

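Pipeline

As the name suggests, a pipeline is a journey that documents from a collection take through an aggregation pipeline, which transforms these objects as they pass through. For those familiar with Unix shells (such as bash), the concept is analogous to the pipe.

A MongoDB aggregation pipeline starts with the documents of a collection and streams them from one pipeline operator to the next. Each operator in the pipeline transforms the documents as they pass through. Pipeline operators need not produce one output document for every input document: operators may generate new documents or filter out documents. Pipeline operators can be repeated within a pipeline.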
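
The streaming behavior described above can be illustrated with Python generators chained like commands in a shell pipe. This is only a sketch: the `articles` data and the functions `match` and `unwind` are hypothetical stand-ins (the latter loosely modeled on MongoDB's $unwind operator), chosen to show one stage that emits fewer documents than it receives, one that emits more, and a stage that appears twice.

```python
def match(docs, predicate):
    """Filtering stage: may emit fewer documents than it receives."""
    for doc in docs:
        if predicate(doc):
            yield doc


def unwind(docs, field):
    """Expanding stage: emits one document per element of an array field."""
    for doc in docs:
        for item in doc[field]:
            yield {**doc, field: item}


# Hypothetical input collection.
articles = [
    {"title": "a", "tags": ["db", "nosql"], "published": True},
    {"title": "b", "tags": ["web"], "published": False},
    {"title": "c", "tags": ["db"], "published": True},
]

# Each stage consumes the output of the previous one; match is used twice.
stage1 = match(articles, lambda d: d["published"])
stage2 = unwind(stage1, "tags")
stage3 = match(stage2, lambda d: d["tags"] == "db")
output = list(stage3)
print(output)
```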

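Pipeline Expressions

Each pipeline operator takes a pipeline expression as its operand. A pipeline expression specifies the transformation to apply to the input documents. Expressions have a document structure and contain fields, values, and operators.

Pipeline expressions can only operate on the current document in the pipeline and cannot refer to data from other documents: expressions provide in-memory transformation of documents.

Generally, expressions are stateless and are evaluated only when the aggregation process encounters them, with one exception: accumulator expressions. Accumulators, used with the $group pipeline operator, maintain their state (for example, totals, maximums, minimums, and related data) as documents progress through the pipeline.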
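
The difference between a stateless expression and an accumulator can be sketched in plain Python. Everything here is illustrative: the `sales` data, `project_total`, and the `GroupAccumulator` class are assumptions, not MongoDB APIs. The stateless function sees only the current document, while the accumulator keeps a running total, maximum, and minimum per group.

```python
def project_total(doc):
    """Stateless expression: the result depends only on the current document."""
    return {"item": doc["item"], "total": doc["price"] * doc["qty"]}


class GroupAccumulator:
    """Accumulator state for one group, updated as documents flow past."""

    def __init__(self):
        self.total = 0
        self.maximum = None
        self.minimum = None

    def add(self, value):
        self.total += value
        self.maximum = value if self.maximum is None else max(self.maximum, value)
        self.minimum = value if self.minimum is None else min(self.minimum, value)


# Hypothetical input documents.
sales = [
    {"item": "pen", "price": 2, "qty": 10},
    {"item": "pen", "price": 2, "qty": 5},
    {"item": "ink", "price": 7, "qty": 3},
]

groups = {}
for doc in map(project_total, sales):
    # One accumulator per group key, created on first use.
    acc = groups.setdefault(doc["item"], GroupAccumulator())
    acc.add(doc["total"])

summary = {key: (a.total, a.maximum, a.minimum) for key, a in groups.items()}
print(summary)
```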

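Aggregation Pipeline Behavior

In MongoDB, the aggregate command operates on a single collection and logically passes the entire collection into the aggregation pipeline. To optimize the operation, use the following strategies wherever possible to avoid scanning the entire collection.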

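Pipeline Operators and Indexes

The $match and $sort pipeline operators can take advantage of an index when they occur at the beginning of the pipeline.

(New in MongoDB 2.4: the $geoNear pipeline operator takes advantage of a geospatial index. When used, $geoNear must be the first stage in an aggregation pipeline.)

Even when the pipeline uses an index, aggregation still requires access to the actual documents; that is, indexes cannot fully cover an aggregation pipeline.

(In versions before MongoDB 2.6, an index could cover a pipeline in a very small set of cases.)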

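Early Filtering

If your aggregation operation requires only a subset of the data in a collection, use the $match, $limit, and $skip stages to restrict the documents that enter at the beginning of the pipeline. When placed at the beginning of a pipeline, $match operations use suitable indexes to scan only the matching documents in a collection.

Placing a $match pipeline stage followed by a $sort stage at the start of the pipeline is logically equivalent to a single query with a sort, and it can use an index. When possible, place $match operators at the beginning of the pipeline.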
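
The equivalence between a leading $match plus $sort and a single sorted query can be checked on a small in-memory example. This is a sketch under stated assumptions: the `docs` data and both helper functions are hypothetical, only equality criteria and a single sort key are modeled, and no index is involved, so it shows the logical equivalence rather than the performance benefit.

```python
# Hypothetical documents.
docs = [
    {"name": "d", "score": 40, "active": True},
    {"name": "a", "score": 90, "active": False},
    {"name": "c", "score": 70, "active": True},
    {"name": "b", "score": 55, "active": True},
]

# Pipeline with $match first and $sort second, written as stage documents.
pipeline = [
    {"$match": {"active": True}},
    {"$sort": {"score": 1}},
]


def run_match_then_sort(documents, stages):
    """Apply an equality $match followed by a single-key $sort."""
    criteria = stages[0]["$match"]
    (sort_field, direction), = stages[1]["$sort"].items()
    matched = [d for d in documents
               if all(d.get(k) == v for k, v in criteria.items())]
    return sorted(matched, key=lambda d: d[sort_field],
                  reverse=(direction == -1))


def run_query_with_sort(documents, criteria, sort_field):
    """The equivalent single query: filter, then sort ascending."""
    return sorted((d for d in documents
                   if all(d.get(k) == v for k, v in criteria.items())),
                  key=lambda d: d[sort_field])


pipeline_result = run_match_then_sort(docs, pipeline)
query_result = run_query_with_sort(docs, {"active": True}, "score")
print([d["name"] for d in pipeline_result])
```

Because the filter runs first, only three of the four documents reach the sort, which is the point of placing $match at the start.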
