MongoDB Map Reduce function



MongoDB Map Reduce function syntax

Function: Map-Reduce is a computing model. Put simply, it breaks a large batch of work (data) into pieces for execution (MAP) and then merges the partial results into a final result (REDUCE). The Map-Reduce facility provided by MongoDB is flexible and practical for large-scale data analysis.

Syntax:

>db.collection.mapReduce(
   function() { emit(key, value); },  // map function
   function(key, values) { return reduceFunction },  // reduce function
   {
      out: collection,
      query: document,
      sort: document,
      limit: number
   }
)

Using MapReduce means implementing two functions, the map function and the reduce function. The map function is run for each document in the collection (or for each document that matches query) and calls emit(key, value) to pass a key and a value on to the reduce function for processing. The map function returns its key-value pairs by calling emit(key, value).
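To make the map, emit, and reduce flow concrete, here is a minimal in-memory sketch in plain JavaScript. It is not MongoDB's implementation: the helper name runMapReduce and the sample documents are assumptions made for illustration, and emit is passed to the map function as an argument, whereas MongoDB provides it for you.

```javascript
// Hypothetical helper: imitates the map -> emit -> reduce flow in memory.
// Illustration only; this is not how MongoDB executes mapReduce internally.
function runMapReduce(docs, mapFn, reduceFn) {
  const groups = new Map(); // key -> array of emitted values

  // In MongoDB, emit is provided by the server; here it is passed to map
  // as an argument so that the sketch stays self-contained.
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };

  // Call map once per document, with the document bound to `this`.
  for (const doc of docs) mapFn.call(doc, emit);

  // Turn each key's values array into a single value.
  const results = [];
  for (const [key, values] of groups) {
    // A key with a single value is passed through without calling reduce.
    const value = values.length === 1 ? values[0] : reduceFn(key, values);
    results.push({ _id: key, value: value });
  }
  return results;
}

// Sample documents (assumed for this sketch).
const sampleDocs = [
  { user_name: "mark", status: "active" },
  { user_name: "mark", status: "active" },
  { user_name: "php", status: "active" },
];

const totals = runMapReduce(
  sampleDocs,
  function (emit) { emit(this.user_name, 1); },
  function (key, values) { return values.reduce((a, b) => a + b, 0); }
);
console.log(totals);
```

The reduce function only ever sees one key together with all the values emitted for it, which is why it can be written as a simple sum.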

Parameters:

map: the mapping function (generates a sequence of key-value pairs that serve as the input of the reduce function).

reduce: the reducing (statistics) function; its job is to turn a key and its values array into a key and a single value, that is, to collapse the values array into one value.

out: where the results are stored, typically the name of an output collection. (When out was omitted, older MongoDB releases fell back to a temporary collection that was deleted automatically after the client disconnected; later releases require out to be specified.)

query: a filter condition; only documents that match it are passed to the map function. (query, sort, and limit can be combined freely.)

sort: sorts the input documents before they are sent to the map function; combined with limit, it can optimize the grouping mechanism.

limit: the upper limit on the number of documents sent to the map function (without limit, sort alone is of little use).
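The way query, sort, and limit narrow down the input before the map function runs can be sketched in plain JavaScript as follows. The helper selectInput and the sample data are hypothetical, and the sketch supports only equality matching in query and 1 / -1 directions in sort; it is meant to show the order of the steps, not MongoDB's actual query engine.

```javascript
// Hypothetical helper: picks the documents that would reach the map function.
// Supports only equality matching in `query` and 1 / -1 directions in `sort`.
function selectInput(docs, options) {
  const query = options.query || {};
  const sort = options.sort || {};

  // Step 1: query keeps only the matching documents.
  let selected = docs.filter((doc) =>
    Object.keys(query).every((field) => doc[field] === query[field])
  );

  // Step 2: sort orders the matching documents.
  const sortFields = Object.keys(sort);
  if (sortFields.length > 0) {
    selected.sort((a, b) => {
      for (const field of sortFields) {
        if (a[field] < b[field]) return -sort[field];
        if (a[field] > b[field]) return sort[field];
      }
      return 0;
    });
  }

  // Step 3: limit caps how many documents are handed to map.
  if (typeof options.limit === "number") {
    selected = selected.slice(0, options.limit);
  }
  return selected;
}

// Sample documents (assumed for this sketch).
const docs = [
  { user_name: "php", status: "active" },
  { user_name: "mark", status: "disabled" },
  { user_name: "mark", status: "active" },
  { user_name: "php", status: "disabled" },
  { user_name: "mark", status: "active" },
];

const mapInput = selectInput(docs, {
  query: { status: "active" },
  sort: { user_name: 1 },
  limit: 2,
});
console.log(mapInput);
```

With these options, only the first two active documents in user_name order would be passed to the map function.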

MongoDB Map Reduce function example

>db.posts.insert({
   "post_text": "php中文网,最全的技术文档。",
   "user_name": "mark",
   "status":"active"
})
WriteResult({ "nInserted" : 1 })
>db.posts.insert({
   "post_text": "php中文网,最全的技术文档。",
   "user_name": "mark",
   "status":"active"
})
WriteResult({ "nInserted" : 1 })
>db.posts.insert({
   "post_text": "php中文网,最全的技术文档。",
   "user_name": "mark",
   "status":"active"
})
WriteResult({ "nInserted" : 1 })
>db.posts.insert({
   "post_text": "php中文网,最全的技术文档。",
   "user_name": "mark",
   "status":"active"
})
WriteResult({ "nInserted" : 1 })
>db.posts.insert({
   "post_text": "php中文网,最全的技术文档。",
   "user_name": "mark",
   "status":"disabled"
})
WriteResult({ "nInserted" : 1 })
>db.posts.insert({
   "post_text": "php中文网,最全的技术文档。",
   "user_name": "php",
   "status":"disabled"
})
WriteResult({ "nInserted" : 1 })
>db.posts.insert({
   "post_text": "php中文网,最全的技术文档。",
   "user_name": "php",
   "status":"disabled"
})
WriteResult({ "nInserted" : 1 })
>db.posts.insert({
   "post_text": "php中文网,最全的技术文档。",
   "user_name": "php",
   "status":"active"
})
WriteResult({ "nInserted" : 1 })
Now we use the mapReduce function on the posts collection to select the published posts (status:"active"), group them by user_name, and count the number of posts for each user:

>db.posts.mapReduce( 
   function() { emit(this.user_name,1); }, 
   function(key, values) {return Array.sum(values)}, 
      {  
         query:{status:"active"},  
         out:"post_total" 
      }
)
The mapReduce call above produces the following output:

{
        "result" : "post_total",
        "timeMillis" : 23,
        "counts" : {
                "input" : 5,
                "emit" : 5,
                "reduce" : 1,
                "output" : 2
        },
        "ok" : 1
}
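The result shows that 5 documents matched the query (status:"active"), that the map function generated 5 key-value pairs, and that the pairs with the same key were finally combined into 2 groups by the reduce function.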



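Details of the output fields:

result: the name of the collection that stores the results. Here it is post_total, the collection named in out; a collection named this way remains after the connection is closed.

timeMillis: the time the execution took, in milliseconds

input: the number of documents that matched the condition and were sent to the map function

emit: the number of times emit was called in the map function, that is, the total number of key-value pairs generated

output: the number of documents in the result collection (the counts are very helpful for debugging)

ok: whether the call succeeded; 1 means success

err: if the call fails, the reason may appear here, although in practice the reason tends to be vague and of little use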
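The counts reported above can be reproduced by hand. The following plain JavaScript sketch walks through the eight documents inserted earlier (reduced to the two fields that matter) and tallies input, emit, reduce, and output in the same way. It is an approximation for this small example only: the variable names are hypothetical, and on larger inputs MongoDB may call reduce more than once per key, so the reduce count can differ.

```javascript
// The eight documents inserted above, reduced to the fields used here.
const posts = [
  { user_name: "mark", status: "active" },
  { user_name: "mark", status: "active" },
  { user_name: "mark", status: "active" },
  { user_name: "mark", status: "active" },
  { user_name: "mark", status: "disabled" },
  { user_name: "php", status: "disabled" },
  { user_name: "php", status: "disabled" },
  { user_name: "php", status: "active" },
];

const counts = { input: 0, emit: 0, reduce: 0, output: 0 };
const groups = new Map(); // key -> array of emitted values

for (const post of posts) {
  if (post.status !== "active") continue; // query: {status: "active"}
  counts.input += 1;

  // map: emit(this.user_name, 1)
  counts.emit += 1;
  if (!groups.has(post.user_name)) groups.set(post.user_name, []);
  groups.get(post.user_name).push(1);
}

const postTotal = [];
for (const [key, values] of groups) {
  let value = values[0];
  if (values.length > 1) {
    // reduce is only needed for keys that have more than one value
    counts.reduce += 1;
    value = values.reduce((a, b) => a + b, 0);
  }
  postTotal.push({ _id: key, value: value });
  counts.output += 1;
}

console.log(counts);    // { input: 5, emit: 5, reduce: 1, output: 2 }
console.log(postTotal); // mark has 4 active posts, php has 1
```

Only the key mark has more than one emitted value, which is why reduce is reported as 1 even though there are two output documents.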

Use find() to view the results of the mapReduce call:

>db.posts.mapReduce( 
   function() { emit(this.user_name,1); }, 
   function(key, values) {return Array.sum(values)}, 
      {  
         query:{status:"active"},  
         out:"post_total" 
      }
).find()
The query above displays the following results: user mark has 4 published posts and user php has 1:

{ "_id" : "mark", "value" : 4 }
{ "_id" : "php", "value" : 1 }
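In a similar way, MapReduce can be used to build large, complex aggregation queries.

The Map and Reduce functions are written in JavaScript, which makes MapReduce very flexible and powerful to use.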
