在你的职业生涯中,你会遇到薛定谔的猫问题,这种情况有时有效,有时无效。竞争条件是这些挑战之一(是的,只是其中之一!)。
在这篇博文中,我将展示一个真实的示例,演示如何重现问题并讨论使用 PostgreSQL 的可序列化事务隔离和咨询锁来处理竞争条件的策略。
受到“设计数据密集型应用程序”第 7 章 - 事务“弱隔离级别”的启发
Github 存储库和实际示例
该应用程序管理医院医生的值班轮班。为了关注竞争条件问题,让我们简化我们的场景。我们的应用程序围绕这个表进行解析:
CREATE TABLE shifts ( id SERIAL PRIMARY KEY, doctor_name TEXT NOT NULL, shift_id INTEGER NOT NULL, on_call BOOLEAN NOT NULL DEFAULT FALSE );
我们有一条重要的业务规则:
正如您可能已经猜到的,实现简单的 API 可能会导致竞争条件场景。考虑这个假设的情况:
杰克和约翰在同一班次期间都在医院待命。几乎同时,他们决定请假。一种成功了,但另一种依赖于有关有多少医生轮班的过时信息。结果,两人最终都下班了,违反了业务规则,并在没有值班医生的情况下离开了特定的班次:
John --BEGIN------doctors on call: 2-------leave on call-----COMMIT--------> (t) \ \ \ \ \ \ \ \ Database ------------------------------------------------------------------> (t) / / / / / / / / Jack ------BEGIN------doctors on call: 2-----leave on call----COMMIT-------> (t)
该应用程序是一个用 Golang 实现的简单 API。查看 GitHub 存储库,了解有关如何运行和执行脚本以重现此竞争条件场景的说明。总之,您需要:
测试尝试同时叫停两名医生,使用不同的方法到达端点:shiftId=1使用咨询锁,shiftId=2使用可序列化事务隔离,并且shiftId=3是一个简单的实现,没有并发控制。
k6结果将输出自定义指标来指示哪个shiftId违反了业务规则:
✓ at least one doctor on call for shiftId=1 ✓ at least one doctor on call for shiftId=2 ✗ at least one doctor on call for shiftId=3 ↳ 36% — ✓ 123 / ✗ 217
您将需要 Yarn、Go、K6 和 Docker 等工具,或者您可以使用 DevBox 来更轻松地设置存储库依赖项。
当我们的应用程序根据陈旧数据做出决策时,就会出现问题。如果两项交易几乎同时运行并且两者都试图叫停医生轮班,则可能会发生这种情况。一笔交易按预期成功,但另一笔交易由于依赖过时的信息,也错误地成功。我们怎样才能防止这种不良行为呢?有几种方法可以实现这一点,我将探索 PostgreSQL 支持的两个选项,尽管在其他数据库管理系统中也可以找到类似的解决方案。
可序列化快照隔离会自动检测并防止异常情况,例如我们的应用程序表现出的写入偏差。
我不会深入探讨事务隔离背后的理论,但它是许多流行数据库管理系统中的常见主题。您可以通过搜索快照隔离来找到很好的材料,例如 PostgreSQL 官方文档中关于事务隔离的这篇文章。此外,这是多年前提出此解决方案的论文。空谈是廉价的,所以让我们看一下代码:
首先,启动事务并将隔离级别设置为Serialized:
// Init transaction with serializable isolation level tx, err := db.BeginTxx(c.Request().Context(), &sql.TxOptions{ Isolation: sql.LevelSerializable, })
然后,继续执行操作。在我们的例子中它执行这个函数:
CREATE OR REPLACE FUNCTION update_on_call_status_with_serializable_isolation(shift_id_to_update INT, doctor_name_to_update TEXT, on_call_to_update BOOLEAN) RETURNS VOID AS $$ DECLARE on_call_count INT; BEGIN -- Check the current number of doctors on call for this shift SELECT COUNT(*) INTO on_call_count FROM shifts s WHERE s.shift_id = shift_id_to_update AND s.on_call = TRUE; IF on_call_to_update = FALSE AND on_call_count = 1 THEN RAISE EXCEPTION '[SerializableIsolation] Cannot set on_call to FALSE. At least one doctor must be on call for this shiftId: %', shift_id_to_update; ELSE UPDATE shifts s SET on_call = on_call_to_update WHERE s.shift_id = shift_id_to_update AND s.doctor_name = doctor_name_to_update; END IF; END; $$ LANGUAGE plpgsql;
每当由于并发执行而出现不一致的情况时,可序列化隔离级别将允许一个事务成功,并自动回滚其他事务并显示此消息,因此您可以安全地重试:
ERROR: could not serialize access due to read/write dependencies among transactions
确保执行业务规则的另一种方法是显式锁定特定班次的资源。我们可以在交易级别使用咨询锁来实现这一点。这种类型的锁完全由应用程序控制。您可以在这里找到更多相关信息。
需要注意的是,锁可以在会话级别和事务级别应用。您可以探索此处提供的各种功能。在我们的例子中,我们将使用 pg_try_advisory_xact_lock(key bigint) → boolean,它在提交或回滚后自动释放锁:
BEGIN; -- Attempt to acquire advisory lock and handle failure with EXCEPTION IF NOT pg_try_advisory_xact_lock(shift_id_to_update) THEN RAISE EXCEPTION '[AdvisoryLock] Could not acquire advisory lock for shift_id: %', shift_id_to_update; END IF; -- Perform necessary operations -- Commit will automatically release the lock COMMIT;
这是我们应用程序中使用的完整函数:
-- Function to Manage On Call Status with Advisory Locks, automatic release when the trx commits CREATE OR REPLACE FUNCTION update_on_call_status_with_advisory_lock(shift_id_to_update INT, doctor_name_to_update TEXT, on_call_to_update BOOLEAN) RETURNS VOID AS $$ DECLARE on_call_count INT; BEGIN -- Attempt to acquire advisory lock and handle failure with NOTICE IF NOT pg_try_advisory_xact_lock(shift_id_to_update) THEN RAISE EXCEPTION '[AdvisoryLock] Could not acquire advisory lock for shift_id: %', shift_id_to_update; END IF; -- Check the current number of doctors on call for this shift SELECT COUNT(*) INTO on_call_count FROM shifts s WHERE s.shift_id = shift_id_to_update AND s.on_call = TRUE; IF on_call_to_update = FALSE AND on_call_count = 1 THEN RAISE EXCEPTION '[AdvisoryLock] Cannot set on_call to FALSE. At least one doctor must be on call for this shiftId: %', shift_id_to_update; ELSE UPDATE shifts s SET on_call = on_call_to_update WHERE s.shift_id = shift_id_to_update AND s.doctor_name = doctor_name_to_update; END IF; END; $$ LANGUAGE plpgsql;
Dealing with race conditions, like the write skew scenario we talked about, can be pretty tricky. There's a ton of research and different ways to solve these problems, so definitely check out some papers and articles if you're curious.
These issues can pop up in real-life situations, like when multiple people try to book the same seat at an event or buy the same spot in a theater. They tend to appear randomly and can be hard to figure out, especially if it's your first time dealing with them.
When you run into race conditions, it's important to look into what solution works best for your specific situation. I might do a benchmark in the future to compare different approaches and give you more insights.
I hope this post has been helpful. Remember, there are tools out there to help with these problems, and you're not alone in facing them!
以上是处理竞争条件:一个实际示例的详细内容。更多信息请关注PHP中文网其他相关文章!