用于安全编写原始 SQL 的库-Python教程-PHP中文网

首页

后端开发

Python教程

用于安全编写原始 SQL 的库

DDD

Sep 14, 2024 am 06:21 AM

我在 PropelAuth 的职责之一是用各种语言/框架编写示例应用程序/指南。这确实是我工作中最有趣的部分之一。我开始尝试不同的堆栈，无论是新的还是旧的，并找出支持客户的最佳方法。

因此，我最终从头开始创建了很多项目。每当我开始一个新项目时，我都必须做出一些重要的选择，而我倾向于花费大量时间的一个决定是：

我应该使用什么数据库库？

对我来说，我喜欢的库是那些我正在编写的代码和 SQL 查询本身之间具有很少抽象层的库。

这样做的一个原因是实用性——因为我经常切换语言，所以我没有太多时间来熟练掌握任何特定的 ORM。我过去也从事过包含大量数据科学成分的工作，所以 SQL 是我非常熟悉的东西。

但我也是一个不喜欢“魔法”的开发人员 - 所以我避免使用那些我无法轻易判断生成的 SQL 是什么样子的库，或者我觉得我花了所有时间的库谷歌搜索“如何加入 X”，然后搜索“如何在两个条件下加入 X”。

在这篇文章中，我想强调一些我经常使用的库，以及我很高兴尝试的库，它们都试图最小化我编写的代码和 SQL 之间的差异被处决。

我个人最喜欢的：SQLx

我最喜欢的 Rust 箱子之一是 SQLx。

用他们自己的话说：

SQLx 支持 编译时检查查询。然而，它并没有通过提供 Rust API 或 DSL（特定于域的语言）来构建查询来实现这一点。相反，它提供了将常规 SQL 作为输入的宏，并确保它对您的数据库有效。其工作方式是 SQLx 在编译时连接到您的开发数据库，让数据库本身验证您的 SQL 查询（并返回一些信息）。

换句话来说，SQLx 让您可以编写如下查询：

let row = sqlx::query!(r#"
    SELECT enail
    FROM user
    WHERE user_id = ?
"#, user_id)
  .fetch_one(&pool)
  .await?;

这似乎是标准的，但是当你编译代码时，你会得到这样的错误：

error returned from database: column "enail" of relation "user" does not exist

同样的编译时检查也适用于复杂查询：

SELECT job_id, job_data 
FROM job_queue
WHERE job_status = 'Queued' AND run_at >= NOW()
ORDER BY run_at ASC
FOR UPDATE SKIP LOCKE -- oops
LIMIT 1

error returned from database: syntax error at or near "LOCKE"

由于查询是根据数据库进行检查的，因此这也适用于您安装的任何扩展。

为什么这么酷？

令人难以置信的是我们实际上只是在编写 SQL。但与像 postgres 这样的包不同，它还允许您编写原始 SQL，SQLx 可以防止我们犯愚蠢的错误。

这确实需要付出很小的代价——我们现在对数据库有编译时依赖，但 SQLx 通过“离线模式”解决了这个问题。当您的数据库可用时，您可以生成一个包含所有经过验证的查询的文件，然后在您的构建中，SQLx 将检查该文件而不是数据库。

在我寻求最小化我编写的代码和执行的 SQL 之间的差异时，使用 SQLx 既没有差异，而且我不必牺牲安全性来获得它。

使用 PgTyped 为您的 SQL 生成 TS 接口

正如 JavaScript/TypeScript 生态系统中经常出现的那样，这里有很多选项。

像 Kysely 这样的选项可以从数据库生成 TS 类型，然后提供查询生成器和编写原始 SQL 的方法。 Drizzle 是一个查询生成器，但它的既定目标是减少您编写的 TS 代码与生成的 SQL 之间的差异。甚至还有一个 SQLx 端口我还没有机会尝试。

但是最符合我在这里寻找的库是 PgTyped。使用 PgTyped，您可以在单独的文件中定义查询，如下所示：

/* @name FindEmailById */
SELECT email FROM user WHERE user_id = :userId;

然后运行命令 npx pgtyped -c config.json，它会根据您的架构生成具有正确类型的函数：

export interface IFindEmailByIdParams {
    userId?: string | null;
}

export interface IFindEmailByIdResult {
    email: string
}export const findEmailById = new PreparedQuery



<p>您可以调用该函数从数据库获取结果。重要的是，如果您的查询错误（假设它引用了不存在的列），您会收到如下错误：<br>
</p>

<pre class="brush:php;toolbar:false">Error in query. Details: {
  errorCode: 'errorMissingColumn',
  hint: 'Perhaps you meant to reference the column "user.email".',
  message: 'column "enail" does not exist',
  position: '7'
}

这意味着您不仅可以安全地编写原始 SQL — 您的应用程序代码还可以获得一个很好的 TS 抽象来调用（或在测试中模拟）。

PgTyped 的最大缺点是 Github 问题——类型的可为空性不受尊重，这可能非常令人沮丧，因为这意味着您可能会合理地为必填字段传入 null。另一个缺点是它特定于 Postgres……稍后将在“可移植性”部分详细介绍。

Prisma recently released TypedSQL — a “a new way to write raw SQL queries in a type-safe way.” They mention that part of the inspiration was both SQLx and PgTyped, so I am excited to try it out!

Something for the Python world: PugSQL

A library I enjoy when I switch to Python is PugSQL (Python). Similar to PgTyped, you create separate SQL files for your queries like this:

-- :name find_email :one
select email from users where user_id = :user_id

which will create a function that you can call:

email = queries.find_email(user_id=42)

The downside (relative to the previous libraries) is these queries aren’t automatically checked for issues. That being said, some tests can surface most (all?) the issues with the query itself — you just need to write them.

If you are feeling fancy, it’s possible to add your own automation which will check the queries. There are ways to verify a query against a DB without running the query — it’s just some additional work. Each query being in its own file makes it a bit easier to automate since you don’t need to go parse out the queries in the first place.

Mark up your interfaces with JDBI

Whenever I talk about how much I liked Dropwizard, I usually get met with blank stares. It’s a bit of a deeper cut in the Java world relative to Spring (either that or normal people don’t discuss Dropwizard at parties).

One of the reasons I liked Dropwizard so much was just because it came with JDBI. That library allowed you to annotate the functions on an interface with SQL queries and it would generate the implementation for you.

public interface UserDAO {
  @SqlQuery("select email from user where user_id = :user_id")
  String fetchEmail(@Bind("user_id") String userId);
}

final UserDAO userDao = database.onDemand(UserDAO.class);

Again though, this would require additional testing to find issues in the queries.

I should also mention that Spring Data JPA does also have the same concept with it’s @Query annotation. It’s been a very long time, but back when I was comparing JDBI and Spring Data JPA - I always felt like Spring was trying to get me to use it’s more magical “function name to sql query” methods. Upon re-reading the docs recently though, I was wrong, and it does mention that you can fallback to @Query pretty frequently.

Other considerations

“Use it sparingly”

If you followed some of the links in this post, you’ll find that some of these libraries don’t advocate for this approach as the primary way to query the database.

TypedSQL describes it as an escape hatch for when querying via their ORM isn’t sufficient. Same for Spring Data JPA which describes it as “fine for a small number of queries”.

This isn’t an unfounded claim — if you go down the path of writing raw SQL for every query, it can be pretty verbose. There are absolutely times where I am making a simple, boring table that’s basically just a key-value store, and the exercise in writing INSERT INTO boring_table VALUES (...) and SELECT * FROM boring_table WHERE ... etc is just a typing exercise.

A library that provides the best of both worlds seems great! The devil is really in the details, as it depends on what you consider to be complex enough to warrant writing raw SQL and how frequently those queries come up.

Portability

One issue with the raw SQL approach is it’s less portable. If you are using an ORM, that ORM often will be compatible with more than just the database you are currently working with.

This can mean small things like running sqlite locally and a different DB in production — or big things like making it easier to migrate your database to something else.

Again, your mileage may vary here — it’s really dependent on how much you care about this.

Use a query builder instead

Going back to the java ecosystem, a popular library is jOOQ. With jOOQ, you aren’t writing raw SQL, but it’s very close:

Libraries for writing raw SQL safely

To me, this is great! My stated goal was just keeping the delta between my code and the generated SQL as little as possible, so query builders like jOOQ or Drizzle do a good job of keeping that delta small.

Not all query builders are made equal here, as I tend to dislike ones like Knex which have a larger delta.

Summary

Raw SQL libraries like SQLx, PgTyped, and JDBI allow writing SQL directly while providing safety and type checking.
These libraries aim to minimize the gap between code and executed SQL, with some offering benefits like compile-time checking and generated type interfaces.
替代方案包括查询构建器，例如 jOOQ 和 Drizzle，您可以直接编写 SQL，但差距仍然很小。
选择数据库库时的考虑因素包括可移植性、冗长性以及复杂查询与简单 CRUD 操作的需求。

以上是用于安全编写原始 SQL 的库的详细内容。更多信息请关注PHP中文网其他相关文章！

声明

本文内容由网友自发贡献，版权归原作者所有，本站不承担相应法律责任。如您发现有涉嫌抄袭侵权的内容，请联系admin@php.cn

Python中的合并列表：选择正确的方法May 14, 2025 am 12:11 AM

Tomergelistsinpython，YouCanusethe操作员，estextMethod，ListComprehension，Oritertools

如何在Python 3中加入两个列表？May 14, 2025 am 12:09 AM

在Python3中，可以通过多种方法连接两个列表：1)使用运算符，适用于小列表，但对大列表效率低；2)使用extend方法，适用于大列表，内存效率高，但会修改原列表；3)使用*运算符，适用于合并多个列表，不修改原列表；4)使用itertools.chain，适用于大数据集，内存效率高。

Python串联列表字符串May 14, 2025 am 12:08 AM

使用join()方法是Python中从列表连接字符串最有效的方法。1)使用join()方法高效且易读。2)循环使用运算符对大列表效率低。3)列表推导式与join()结合适用于需要转换的场景。4)reduce()方法适用于其他类型归约，但对字符串连接效率低。完整句子结束。

Python执行，那是什么？May 14, 2025 am 12:06 AM

pythonexecutionistheprocessoftransformingpypythoncodeintoExecutablestructions.1）InternterPreterReadSthecode，ConvertingTingitIntObyTecode，whepythonvirtualmachine（pvm）theglobalinterpreterpreterpreterpreterlock（gil）the thepythonvirtualmachine（pvm）

Python：关键功能是什么May 14, 2025 am 12:02 AM

Python的关键特性包括：1.语法简洁易懂，适合初学者；2.动态类型系统，提高开发速度；3.丰富的标准库，支持多种任务；4.强大的社区和生态系统，提供广泛支持；5.解释性，适合脚本和快速原型开发；6.多范式支持，适用于各种编程风格。

Python：编译器还是解释器？May 13, 2025 am 12:10 AM

Python是解释型语言，但也包含编译过程。1）Python代码先编译成字节码。2）字节码由Python虚拟机解释执行。3）这种混合机制使Python既灵活又高效，但执行速度不如完全编译型语言。

python用于循环与循环时：何时使用哪个？May 13, 2025 am 12:07 AM

useeAforloopWheniteratingOveraseQuenceOrforAspecificnumberoftimes; useAwhiLeLoopWhenconTinuingUntilAcIntiment.ForloopSareIdeAlforkNownsences，而WhileLeleLeleLeleLoopSituationSituationSituationsItuationSuationSituationswithUndEtermentersitations。