CouchGO!: Enhancing CouchDB with a Query Server Written in Go

PHPz, 2024-07-19 12:38:41

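Over the past month, I have been actively working on proof-of-concept projects related to CouchDB, exploring its features and preparing for future tasks. During this period, I read the CouchDB documentation several times to make sure I understood how everything works. While reading it, I came across a statement that, although CouchDB ships with a default Query Server written in JavaScript, creating a custom implementation is relatively simple and custom solutions already exist.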

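I did some quick research and found implementations written in Python, Ruby, or Clojure. Since a whole implementation didn’t seem too long, I decided to experiment with CouchDB by writing my own custom Query Server. I chose Go as the language. I hadn’t had much experience with it before, apart from using Go templates in Helm charts, but I wanted to try something new and thought this project would be a great opportunity.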

Understanding the Query Server

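Before starting work, I revisited the CouchDB documentation once more to understand how the Query Server actually works. According to the documentation, the high-level overview of the Query Server is quite simple: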

The Query server is an external process that communicates with CouchDB by JSON protocol through stdio interface and processes all design functions calls [...].

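The structure of commands sent by CouchDB to the Query Server can be expressed as [<command>, <*arguments>] or, in the case of design documents, ["ddoc", <design_doc_id>, [<subcommand>, <funcname>], [<argument1>, <argument2>, …]].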

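So basically, all I had to do was write an application capable of parsing this kind of JSON from STDIO, performing the expected operations, and returning the responses specified in the documentation. Handling the wide range of commands involved a lot of type casting in the Go code. Specific details about each command can be found in the Query Server Protocol section of the documentation.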
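As a rough illustration of that read, dispatch, respond loop, here is a minimal sketch. It is not the CouchGO! source: the Command type and the parseCommand helper are hypothetical names, and only the reset command is handled. It assumes one JSON array per line on standard input and one JSON response per line on standard output.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"os"
)

// Command is a decoded Query Server message: a name plus its raw arguments.
// This type is hypothetical and is not taken from the CouchGO! source.
type Command struct {
	Name string
	Args []json.RawMessage
}

// parseCommand decodes one line of the form [<command>, <*arguments>].
func parseCommand(line []byte) (Command, error) {
	var parts []json.RawMessage
	if err := json.Unmarshal(line, &parts); err != nil {
		return Command{}, fmt.Errorf("invalid message: %w", err)
	}
	if len(parts) == 0 {
		return Command{}, fmt.Errorf("empty message")
	}
	var name string
	if err := json.Unmarshal(parts[0], &name); err != nil {
		return Command{}, fmt.Errorf("command name is not a string: %w", err)
	}
	return Command{Name: name, Args: parts[1:]}, nil
}

func main() {
	scanner := bufio.NewScanner(os.Stdin)
	// Design documents can be large, so raise the default 64 KiB line limit.
	scanner.Buffer(make([]byte, 0, 64*1024), 64*1024*1024)
	out := json.NewEncoder(os.Stdout) // Encode writes one JSON value per line

	for scanner.Scan() {
		cmd, err := parseCommand(scanner.Bytes())
		if err != nil {
			out.Encode([]string{"error", "bad_request", err.Error()})
			continue
		}
		switch cmd.Name {
		case "reset":
			out.Encode(true) // the protocol expects a bare true here
		default:
			// A real server would also handle add_fun, map_doc, reduce, ddoc, ...
			out.Encode([]string{"error", "unknown_command", cmd.Name})
		}
	}
}
```

Keeping the arguments as json.RawMessage defers decoding until the command name is known, which is one way to contain the type casting mentioned above.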

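One problem I faced here was that the Query Server must be able to interpret and execute arbitrary code provided in design documents. Knowing that Go is a compiled language, I expected to get stuck at this point. Thankfully, I quickly found the Yaegi package, which can interpret Go code with ease. It allows creating a sandbox and controlling which packages can be imported by the interpreted code. In my case, I decided to expose only my own package, called couchgo, but other standard packages can easily be added as well.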

Introducing CouchGO!

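As a result of my work, an application called CouchGO! emerged. Although it follows the Query Server Protocol, it is not a one-to-one reimplementation of the JavaScript version, as it has its own approach to handling design document functions.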

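For example, in CouchGO! there is no helper function like emit. To emit values, you simply return them from the map function. Additionally, every function in a design document follows the same pattern: it takes a single argument, an object containing function-specific properties, and is expected to return a single value as its result. This value doesn’t have to be a primitive; depending on the function, it may be an object, a map, or even an error.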

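To start working with CouchGO!, you just need to download the executable binary from my GitHub repository, place it somewhere in the CouchDB instance, and add an environment variable that allows CouchDB to start the CouchGO! process.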

For instance, if you place the couchgo executable in the /opt/couchdb/bin directory, you would add the following environment variable to make it work.

export COUCHDB_QUERY_SERVER_GO="/opt/couchdb/bin/couchgo"

Writing Functions with CouchGO!

To get a quick idea of how to write functions with CouchGO!, let’s explore the following function interface:

func Func(args couchgo.FuncInput) couchgo.FuncOutput { ... }

Every function in CouchGO! follows this pattern, with Func replaced by the appropriate function name. Currently, CouchGO! supports the following function types:

  • Map
  • Reduce
  • Filter
  • Update
  • Validate (validate_doc_update)

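Let’s examine an example design document that specifies a view with map and reduce functions, as well as a validate_doc_update function. Additionally, we need to specify that we are using Go as the language.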

{
  "_id": "_design/ddoc-go",
  "views": {
    "view": {
      "map": "func Map(args couchgo.MapInput) couchgo.MapOutput {\n\tout := couchgo.MapOutput{}\n\tout = append(out, [2]interface{}{args.Doc[\"_id\"], 1})\n\tout = append(out, [2]interface{}{args.Doc[\"_id\"], 2})\n\tout = append(out, [2]interface{}{args.Doc[\"_id\"], 3})\n\t\n\treturn out\n}",
      "reduce": "func Reduce(args couchgo.ReduceInput) couchgo.ReduceOutput {\n\tout := 0.0\n\n\tfor _, value := range args.Values {\n\t\tout += value.(float64)\n\t}\n\n\treturn out\n}"
    }
  },
  "validate_doc_update": "func Validate(args couchgo.ValidateInput) couchgo.ValidateOutput {\n\tif args.NewDoc[\"type\"] == \"post\" {\n\t\tif args.NewDoc[\"title\"] == nil || args.NewDoc[\"content\"] == nil {\n\t\t\treturn couchgo.ForbiddenError{Message: \"Title and content are required\"}\n\t\t}\n\n\t\treturn nil\n\t}\n\n\tif args.NewDoc[\"type\"] == \"comment\" {\n\t\tif args.NewDoc[\"post\"] == nil || args.NewDoc[\"author\"] == nil || args.NewDoc[\"content\"] == nil {\n\t\t\treturn couchgo.ForbiddenError{Message: \"Post, author, and content are required\"}\n\t\t}\n\n\t\treturn nil\n\t}\n\n\tif args.NewDoc[\"type\"] == \"user\" {\n\t\tif args.NewDoc[\"username\"] == nil || args.NewDoc[\"email\"] == nil {\n\t\t\treturn couchgo.ForbiddenError{Message: \"Username and email are required\"}\n\t\t}\n\n\t\treturn nil\n\t}\n\n\treturn couchgo.ForbiddenError{Message: \"Invalid document type\"}\n}",
  "language": "go"
}

Now, let’s break down each function, starting with the map function:

func Map(args couchgo.MapInput) couchgo.MapOutput {
  out := couchgo.MapOutput{}
  out = append(out, [2]interface{}{args.Doc["_id"], 1})
  out = append(out, [2]interface{}{args.Doc["_id"], 2})
  out = append(out, [2]interface{}{args.Doc["_id"], 3})

  return out
}

In CouchGO!, there is no emit function; instead, you return a slice of key-value tuples where both key and value can be of any type. The document object isn't directly passed to the function as in JavaScript; rather, it's wrapped in an object. The document itself is simply a hashmap of various values.
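To make the "return instead of emit" idea concrete, here is a small standalone sketch that runs outside CouchDB. MapInput and MapOutput are local stand-ins whose shapes are inferred from the example above, not copied from the couchgo package, and the tags field is a hypothetical document property used only for illustration.

```go
package main

import "fmt"

// MapInput and MapOutput are local stand-ins for the couchgo types; their
// definitions are assumed from the article's example, not taken from the package.
type MapInput struct {
	Doc map[string]interface{}
}

type MapOutput [][2]interface{}

// Map returns one row per tag, keyed by the tag, with the document ID as value.
func Map(args MapInput) MapOutput {
	out := MapOutput{}
	tags, ok := args.Doc["tags"].([]interface{})
	if !ok {
		return out // documents without a tags array produce no rows
	}
	for _, tag := range tags {
		out = append(out, [2]interface{}{tag, args.Doc["_id"]})
	}
	return out
}

func main() {
	doc := map[string]interface{}{
		"_id":  "post-1",
		"tags": []interface{}{"go", "couchdb"},
	}
	for _, row := range Map(MapInput{Doc: doc}) {
		fmt.Printf("key=%v value=%v\n", row[0], row[1])
	}
}
```

Returning an empty slice plays the role of a JavaScript map function that never calls emit for a given document.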

Next, let’s examine the reduce function:

func Reduce(args couchgo.ReduceInput) couchgo.ReduceOutput {
  out := 0.0
  for _, value := range args.Values {
    out += value.(float64)
  }
  return out
}

Similar to JavaScript, the reduce function in CouchGO! takes keys, values, and a rereduce parameter, all wrapped into a single object. This function should return a single value of any type that represents the result of the reduction operation.
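The sketch below shows how a sum behaves across a reduce pass and a rereduce pass. ReduceInput is a local stand-in: only Values appears in the example above, so the Keys and Rereduce field names are assumptions. It also swaps the bare type assertion for the two-value form, so a non-numeric value is skipped rather than causing a panic. Numbers decoded by Go's encoding/json into interface{} arrive as float64, which is why the assertion targets that type.

```go
package main

import "fmt"

// ReduceInput is a local stand-in for couchgo.ReduceInput. Only Values is
// shown in the article; the Keys and Rereduce field names are assumptions.
type ReduceInput struct {
	Keys     []interface{}
	Values   []interface{}
	Rereduce bool
}

type ReduceOutput interface{}

// Reduce sums numeric values and skips anything that is not a float64,
// instead of panicking on a failed type assertion.
func Reduce(args ReduceInput) ReduceOutput {
	out := 0.0
	for _, value := range args.Values {
		if n, ok := value.(float64); ok {
			out += n
		}
	}
	return out
}

func main() {
	// Reduce pass: values come straight from the map function.
	first := Reduce(ReduceInput{Values: []interface{}{1.0, 2.0, 3.0}})
	second := Reduce(ReduceInput{Values: []interface{}{4.0, "oops"}})

	// Rereduce pass: values are the results of earlier reduce calls.
	total := Reduce(ReduceInput{
		Values:   []interface{}{first, second},
		Rereduce: true,
	})
	fmt.Println(first, second, total)
}
```

A sum can ignore the Rereduce flag because adding partial sums gives the same result as adding the raw values; a function such as a count would need to branch on it.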

Finally, let’s look at the Validate function, which corresponds to the validate_doc_update property:

func Validate(args couchgo.ValidateInput) couchgo.ValidateOutput {
  if args.NewDoc["type"] == "post" {
    if args.NewDoc["title"] == nil || args.NewDoc["content"] == nil {
      return couchgo.ForbiddenError{Message: "Title and content are required"}
    }

    return nil
  }

  if args.NewDoc["type"] == "comment" {
    if args.NewDoc["post"] == nil || args.NewDoc["author"] == nil || args.NewDoc["content"] == nil {
      return couchgo.ForbiddenError{Message: "Post, author, and content are required"}
    }

    return nil
  }

  return nil
}

In this function, we receive parameters such as the new document, old document, user context, and security object, all wrapped into one object passed as a function argument. Here, we’re expected to validate if the document can be updated and return an error if not. Similar to the JavaScript version, we can return two types of errors: ForbiddenError or UnauthorizedError. If the document can be updated, we should return nil.
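The per-type checks above repeat the same shape, so one possible refactoring is a table of required fields. This is only a sketch under assumptions: the types are local stand-ins for the couchgo ones (only NewDoc and Message are taken from the example), and requiredFields is a hypothetical helper. Like the function above, it accepts document types it does not know about.

```go
package main

import "fmt"

// Local stand-ins for the couchgo types. Only NewDoc and Message appear in
// the article; the remaining input fields are omitted here.
type ValidateInput struct {
	NewDoc map[string]interface{}
}

type ValidateOutput interface{}

type ForbiddenError struct {
	Message string
}

// requiredFields is a hypothetical lookup from document type to the fields
// that a document of that type must contain.
var requiredFields = map[string][]string{
	"post":    {"title", "content"},
	"comment": {"post", "author", "content"},
}

// Validate rejects documents of a known type that lack a required field.
func Validate(args ValidateInput) ValidateOutput {
	docType, _ := args.NewDoc["type"].(string)
	for _, field := range requiredFields[docType] {
		if args.NewDoc[field] == nil {
			msg := fmt.Sprintf("%s: field %q is required", docType, field)
			return ForbiddenError{Message: msg}
		}
	}
	return nil
}

func main() {
	incomplete := map[string]interface{}{"type": "post", "title": "Hello"}
	if err, ok := Validate(ValidateInput{NewDoc: incomplete}).(ForbiddenError); ok {
		fmt.Println("rejected:", err.Message)
	}

	complete := map[string]interface{}{"type": "post", "title": "Hello", "content": "World"}
	fmt.Println("accepted:", Validate(ValidateInput{NewDoc: complete}) == nil)
}
```

Adding a new document type then means adding one map entry rather than another if block.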

More detailed examples can be found in my GitHub repository. One important thing to note is that the function names are not arbitrary; they should always match the type of function they represent, such as Map, Reduce, Filter, etc.

CouchGO! Performance

Even though writing my own Query Server was a really fun experience, it wouldn’t make much sense if I didn’t compare it with existing solutions. So, I prepared a few simple tests in a Docker container to check how much faster CouchGO! can perform the following operations:

  • Index 100k documents (indexing in CouchDB means executing map functions from views)
  • Execute reduce function for 100k documents
  • Filter change feed for 100k documents
  • Perform update function for 1k requests

I seeded the database with the expected number of documents and, using dedicated shell scripts, either measured response times or computed the difference between timestamps in the Docker container logs. The details of the implementation can be found in my GitHub repository. The results are presented in the table below.

Test         CouchGO!     CouchJS      Boost
Indexing     141.713s     421.529s     2.97x
Reducing     7.672s       15.642s      2.04x
Filtering    28.928s      80.594s      2.79x
Updating     7.742s       9.661s       1.25x

As you can see, the boost over the JavaScript implementation is significant: almost three times faster for indexing (2.97x) and more than twice as fast for reduce and filter functions (2.04x and 2.79x). The boost for update functions is relatively small (1.25x), but CouchGO! is still faster than the JavaScript version.
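For reference, the Boost column is simply the CouchJS time divided by the CouchGO! time. The short sketch below recomputes it from the reported timings, with the reduce timings converted from milliseconds to seconds; the result type and boost function are names made up for this illustration.

```go
package main

import "fmt"

// result holds the reported timings, in seconds, for one benchmark.
type result struct {
	name    string
	couchgo float64
	couchjs float64
}

// boost reports how many times faster CouchGO! was than CouchJS.
func boost(couchgo, couchjs float64) float64 {
	return couchjs / couchgo
}

func main() {
	results := []result{
		{"Indexing", 141.713, 421.529},
		{"Reducing", 7.672, 15.642}, // reported as 7672ms and 15642ms
		{"Filtering", 28.928, 80.594},
		{"Updating", 7.742, 9.661},
	}
	for _, r := range results {
		fmt.Printf("%-10s %.2fx\n", r.name, boost(r.couchgo, r.couchjs))
	}
}
```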

Conclusion

As the author of the documentation promised, writing a custom Query Server wasn’t that hard when following the Query Server Protocol. Even though CouchGO! does not implement a few deprecated functions, it provides a significant boost over the JavaScript version even at this early stage of development. I believe there is still plenty of room for improvement.

If you need all the code from this article in one place, you can find it in my GitHub repository.

Thank you for reading this article. I would love to hear your thoughts about this solution. Would you use it with your CouchDB instance, or maybe you already use some custom-made Query Server? I would appreciate hearing about it in the comments.

Don’t forget to check out my other articles for more tips, insights, and other parts of this series as they are created. Happy hacking!
