
Consistent Backends and UX: What Can Go Wrong?

Article series

  1. Why should you care?
  2. What problems may occur?
  3. What are the barriers to adopting new technologies?
  4. How does the new algorithm help?

In the previous article, we explained what strong consistency is (as opposed to eventual consistency). This article is the second part of the series, in which we explain how a lack of strong consistency makes it harder to deliver a good end-user experience, can cause serious engineering overhead, and can leave you open to attacks. This part is longer because we explain different database anomalies, walk through several example scenarios, and briefly highlight which kind of database suffers from each anomaly.

User experience is the driving factor in the success of any application, and relying on an inconsistent backend makes it harder to deliver a good experience. More importantly, building application logic on top of inconsistent data can lead to vulnerabilities. One paper calls this type of attack "ACIDRain." Its authors investigated 12 of the most popular self-hosted e-commerce applications and found at least 22 possible serious attacks. One affected website was a Bitcoin wallet service that had to shut down because of these attacks. When you choose a distributed database that is not 100% ACID, there will be trouble. As explained in one of our previous examples, misunderstandings, badly defined terminology, and aggressive marketing make it very hard for an engineer to determine which guarantees a specific database delivers.

What kind of trouble? Your application may show wrong account balances, fail to deliver user rewards, execute trade transactions twice, display messages out of order, or violate application rules. For a quick introduction to why distributed databases are needed and why they are difficult, see our first post or this wonderful video explanation. In short, a distributed database is a database that holds copies of your data in multiple locations for reasons of scale, latency, and availability.

We will cover four of these potential issues (there are more) and illustrate them with examples from game development. Game development is complex, and game developers face many problems that closely resemble serious real-life problems. A game has trading systems, messaging systems, rewards that require conditions to be met, and so on. Remember how angry (or happy?) gamers can be when things go wrong or appear to go wrong. In games, user experience is everything, so game developers are often under great pressure to make sure their systems are fault-tolerant.

Ready? Let's dive into the first potential problem!

1. Stale reads

A stale read is a read that returns old data. In other words, the returned value has not yet been updated according to the latest write. Many distributed databases, including traditional databases that scale up with replicas (read part 1 to learn how these databases work), suffer from stale reads.

Impact on end users

First of all, stale reads affect the end user, and in more than one way.

Frustrating experience and unfair advantages

Imagine that two users in the game encounter a treasure chest with gold coins. The first user receives data from one database server, and the second user connects to the second database server. The sequence of events is as follows:

  1. User 1 (through database server 1) sees and opens the treasure chest and takes away the gold coins.
  2. User 2 (via the database server 2) sees a full treasure chest, opens it, and fails.
  3. User 2 still sees a full treasure chest and doesn't understand why it fails.

Although this may seem like a minor issue, the result is a frustrating experience for the second player. Not only is he at a disadvantage, but he will also often see things in the game that appear to be there yet are not. Next, let's look at an example where the player takes action on a stale read!

Stale reads lead to duplicate writes

Imagine a character in the game trying to buy a shield and a sword in a shop. If the data lives in multiple locations and there is no intelligent system in place to provide consistency, one node will contain older data than another. In that case, the user may buy the items (which contacts the first node) and then check his inventory (which contacts the second node), only to find that the items are not there. The user will probably be confused and may think that the transaction did not go through. What would most people do in that case? Well, they would try to buy the items again. By the time the second node has caught up, the user has already bought duplicates, and he suddenly finds that he has run out of money and owns two of each item. He is left thinking that our game is broken.
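The shop scenario above can be sketched as a small simulation. This is only an illustration under stated assumptions: the `Node` class, the `replication_log` list, the single sword, and the starting balance of 10 gold are all hypothetical and do not correspond to any particular database.

```python
# Hypothetical two-node setup: writes go to the primary, reads may hit a lagging replica.
class Node:
    def __init__(self, gold, inventory):
        self.gold = gold
        self.inventory = list(inventory)

primary = Node(gold=10, inventory=[])
replica = Node(gold=10, inventory=[])
replication_log = []  # writes accepted by the primary, not yet applied to the replica

def buy(item, price):
    """Apply a purchase on the primary and queue it for replication."""
    primary.gold -= price
    primary.inventory.append(item)
    replication_log.append((item, price))

def replicate():
    """Apply all pending writes to the replica (the replica catches up)."""
    while replication_log:
        item, price = replication_log.pop(0)
        replica.gold -= price
        replica.inventory.append(item)

buy("sword", 5)

# The player checks his inventory on the stale replica and sees nothing.
seen_before_retry = list(replica.inventory)
if "sword" not in seen_before_retry:
    buy("sword", 5)  # the player assumes the purchase failed and retries

replicate()
print(replica.inventory, replica.gold)
```

Once the replica has caught up, the inventory holds two swords and the balance is zero, even though the player only ever wanted one sword.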

In this case, the user has spent resources that he did not want to spend. If we write an email client on top of such a database, a user might send an email, refresh the browser, and fail to find the email he just sent, so he sends it again. It is very difficult to provide a good user experience and implement secure transactions (such as bank transactions) on top of such systems.

Impact on developers

When coding, you always have to expect that something is not there (yet) and code accordingly. Writing fault-proof code becomes very challenging when reads are eventually consistent, and chances are that users will run into problems in your application. With eventually consistent reads, the problem has often disappeared by the time you are able to investigate it. Basically, you end up chasing ghosts.

Developers still often choose eventually consistent databases or distribution approaches, because it usually takes time before the problems are noticed. Then, once problems show up in their application, they try to get creative and build solutions (1, 2) on top of their traditional databases to fix stale reads. The fact that there are many such guides, and that databases like Cassandra have implemented some consistency features, shows that these problems are real and cause issues in production systems more often than you might think. Custom solutions built on top of systems that were not built for consistency are complex and fragile. Why would anyone go through such trouble if there are databases that provide strong consistency out of the box?

Databases that exhibit this anomaly

Traditional databases (PostgreSQL, MySQL, SQL Server, etc.) that use primary/read-replica replication typically suffer from stale reads. Many newer distributed databases also started out as eventually consistent, or in other words, without protection against stale reads. This was due to a strong belief in the developer community that this was necessary in order to scale. The most famous database that started out this way has since recognized how its users struggled to deal with this anomaly and has provided additional measures to avoid it. Older databases, or databases that are not designed to provide strong consistency efficiently (such as Cassandra, CouchDB, and DynamoDB), are eventually consistent by default. Other approaches such as Riak are also eventually consistent but take a different path by implementing a conflict resolution system to reduce the chance of outdated values. However, this does not guarantee that your data is safe, since conflict resolution is not foolproof.

2. Lost writes

In the field of distributed databases, an important choice has to be made when writes happen at the same time. One option (the safe one) is to make sure that all database nodes agree on the order of these writes. This is far from trivial, since it requires either synchronized clocks (for which specific hardware is needed) or an intelligent algorithm such as Calvin that does not rely on clocks. The second, less safe option is to let each node write locally and decide later how to handle conflicts. Databases that choose the second option can lose your writes.

Impact on end users

Consider two trade transactions in the game. We start with 11 gold coins and buy two items: first a sword for 5 gold coins and then a shield for 5 gold coins. The two transactions are directed to different nodes of our distributed database. Each node reads the balance, which in this case is still 11 on both nodes. Both nodes decide to write 6 as the result (11 - 5) because neither is aware of the other's write. Since the second transaction has not seen the value written by the first, the player ends up buying both the sword and the shield for a total of 5 gold coins instead of 10. Good for the user, but bad for the system! To remedy this behavior, distributed databases have several strategies, some better than others.
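The arithmetic of this lost update can be reproduced in a few lines. This is a minimal sketch in which two plain dictionaries stand in for the two database nodes; the names `node_a`, `node_b`, and `purchase` are hypothetical.

```python
# Both nodes start from the same replicated state and know nothing of each other's writes.
node_a = {"gold": 11, "items": []}
node_b = {"gold": 11, "items": []}

def purchase(node, item, price):
    """Read-modify-write against a single node's local copy."""
    balance = node["gold"]          # both nodes read 11
    node["gold"] = balance - price  # both nodes write 6
    node["items"].append(item)

purchase(node_a, "sword", 5)
purchase(node_b, "shield", 5)

items_bought = node_a["items"] + node_b["items"]
print(node_a["gold"], node_b["gold"], items_bought)
```

Both nodes end up with a balance of 6 while two items were handed out, so only 5 of the 10 gold coins were actually deducted.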

Resolution strategies include "last write wins" (LWW) and "longest version history" (LVH) wins. LWW has long been Cassandra's strategy, and it is still the default behavior if you do not configure it differently.

If we apply LWW conflict resolution to our previous example, the player still has 6 gold coins left, but only one item will have been purchased. This is a bad user experience because the application confirmed his purchase of the second item, even though the database does not recognize it as being in his inventory.
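To make the effect of LWW concrete, here is a record-level simplification. It is not how any specific database implements LWW; the `last_write_wins` function and the timestamps are assumptions chosen for illustration.

```python
def last_write_wins(versions):
    """Keep the version with the highest timestamp and discard the rest."""
    return max(versions, key=lambda version: version["timestamp"])

# Two conflicting versions of the same player record, written on different nodes.
# The timestamps are made-up values for illustration.
from_node_a = {"timestamp": 100, "gold": 6, "items": ["sword"]}
from_node_b = {"timestamp": 101, "gold": 6, "items": ["shield"]}

resolved = last_write_wins([from_node_a, from_node_b])
print(resolved)
```

The resolved record keeps the shield and silently drops the sword, although both purchases were confirmed to the player.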

Unpredictable security

As you might imagine, it is not safe to write security rules on top of such systems. Many applications rely on complex security rules in the backend (or directly in the database where possible) to determine whether a user can access a resource. When these rules are based on stale data that is updated unreliably, how can we be sure that there will never be a breach? Imagine that a user of a PaaS application calls his administrator and asks: "Can you make this public group private so we can reuse it for internal data?" The administrator applies the change and tells him that it is done. However, since the administrator and the user may be on different nodes, the user may start adding sensitive data to a group that is technically still public.

Impact on developers

Debugging user problems is a nightmare when writes are lost. Imagine that a user reports that he lost data in your application, and a day goes by before you have time to respond. How will you find out whether the problem was caused by your database or by faulty application logic? In databases that allow you to track data history, such as FaunaDB or Datomic, you can travel back in time to see how the data was manipulated. However, neither of these databases suffers from lost writes, and databases that do suffer from this anomaly typically do not have a time-travel feature.

Databases that suffer from lost writes

All databases that use conflict resolution instead of conflict avoidance will lose writes. Cassandra and DynamoDB use last write wins (LWW) by default; MongoDB used to use LWW but has since moved away from it. Master-master distribution approaches in traditional databases such as MySQL offer different conflict resolution strategies. Many distributed databases that were not built for consistency suffer from lost writes. Riak's simplest conflict resolution is driven by LWW, but it also offers more intelligent systems. Even with intelligent systems, however, there is sometimes no obvious way to resolve a conflict. Riak and CouchDB place the responsibility for choosing the correct write on the client or application, allowing them to choose manually which version to keep.

Because distribution is complex and most databases use imperfect algorithms, lost writes are common in many databases when nodes crash or network partitions occur. Even MongoDB, which does not distribute writes (all writes go to one node), can have write conflicts in the rare case that a node goes down immediately after a write.

3. Write skew

Write skew is something that can happen under a type of guarantee that database vendors call snapshot consistency. With snapshot consistency, a transaction reads from a snapshot that was taken when the transaction started. Snapshot consistency prevents many anomalies. In fact, many people thought it was completely safe until a paper (PDF) appeared proving that it is not. It is therefore no surprise that developers struggle to understand why certain guarantees are just not good enough.

Before we discuss what does not work in snapshot consistency, let's first discuss what does work. Suppose we have a battle between a knight and a mage, each of whom has a life of four hearts.

When either character is attacked, the transaction is a function that calculates how many hearts have been removed:

 damageCharacter(character, damage){
  character.hearts = character.hearts - damage
  character.dead = isCharacterDead(character)
}

And, after each attack, another function, isCharacterDead, runs to check whether the character has any hearts left:

 isCharacterDead(character){
  if (character.hearts <= 0) { return true }
  else { return false }
}

In a simple case, the knight's attack removes three hearts from the mage, and then the mage's spell removes four hearts from the knight and brings the mage's own health back to four hearts. If one transaction runs after the other, the two transactions behave correctly in most databases.

But what if we add a third transaction, another attack from the knight, that runs concurrently with the mage's spell?

Is the knight dead, and is the mage still alive?

To deal with this confusion, snapshot consistency systems typically implement a rule called "first committer wins." A transaction can only commit if no other transaction has already written to the same row; otherwise, it is rolled back. In this example, since both transactions try to write to the same row (the mage's health), only the life leech spell takes effect, and the knight's second attack is rolled back. The end result is then the same as in the previous example: a dead knight and a mage with full health.
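The "first committer wins" rule can be sketched with a toy multi-version store. This is a simplified model under stated assumptions rather than the implementation of any real database: `SnapshotStore`, `Transaction`, and `ConflictError` are hypothetical names, and the starting state assumes the knight's first attack has already left the mage with one heart.

```python
class ConflictError(Exception):
    pass

class SnapshotStore:
    """Toy store: every row remembers the commit number of its last write."""
    def __init__(self, rows):
        self.rows = dict(rows)
        self.versions = {key: 0 for key in rows}
        self.commit_counter = 0

    def begin(self):
        return Transaction(self)

class Transaction:
    def __init__(self, store):
        self.store = store
        self.snapshot = dict(store.rows)   # all reads come from this snapshot
        self.start = store.commit_counter  # later commits are invisible to us
        self.writes = {}

    def read(self, key):
        return self.writes.get(key, self.snapshot[key])

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # First committer wins: abort if a row we wrote was committed by
        # another transaction after our snapshot was taken.
        for key in self.writes:
            if self.store.versions[key] > self.start:
                raise ConflictError(key)
        self.store.commit_counter += 1
        for key, value in self.writes.items():
            self.store.rows[key] = value
            self.store.versions[key] = self.store.commit_counter

store = SnapshotStore({"knight_hearts": 4, "mage_hearts": 1})

spell = store.begin()   # the mage's life leech
attack = store.begin()  # the knight's second attack, concurrent with the spell

spell.write("knight_hearts", spell.read("knight_hearts") - 4)
spell.write("mage_hearts", 4)
attack.write("mage_hearts", attack.read("mage_hearts") - 3)

spell.commit()
try:
    attack.commit()
    attack_rolled_back = False
except ConflictError:
    attack_rolled_back = True

print(store.rows, attack_rolled_back)
```

Because the spell commits first and both transactions wrote the mage's health, the attack is rejected, leaving a dead knight and a mage with four hearts.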

However, some databases, such as MySQL with InnoDB, do not consider "first committer wins" to be part of snapshot isolation. In that case, we would have a lost write: the mage is now dead, although he should have received the health from the life leech before the knight's attack took effect. (We did mention badly defined terminology and loose interpretations, right?)

Snapshot consistency that includes the "first committer wins" rule does handle some things well, which is not surprising, since it was long considered a good solution. It is still the approach taken by PostgreSQL, Oracle, and SQL Server, but they all have different names for it. PostgreSQL calls this guarantee "repeatable read," Oracle calls it "serializable" (which is incorrect by our definition), and SQL Server calls it "snapshot isolation." No wonder people get lost in this forest of terminology. Let's look at an example where it does not behave as you would expect!

Impact on end users

The next battle takes place between two armies, and an army is considered dead if all of its characters are dead:

 isArmyDead(army){
  if (all characters are dead) { return true }
  else { return false }
}

After each attack, the following function determines whether the character dies, and then runs the above function to see if the army dies:

 damageArmyCharacter(army, character, damage){
  character.hearts = character.hearts - damage
  character.dead = isCharacterDead(character)
  armyDead = isArmyDead(army)
  if (army.dead != armyDead){
    army.dead = armyDead
  }
}

First, the character's hearts are reduced by the damage received. We then check whether the army is dead by checking whether each of its characters is out of hearts. Then, if the state of the army has changed, we update the army's "dead" boolean.

Three mages each attack once, resulting in three "life leech" transactions. Snapshots are taken at the beginning of each transaction, and since all transactions start at the same time, the snapshots are identical. Each transaction has a copy of the data in which all knights still have full health.

Let's see how the first "life leech" transaction resolves. In this transaction, mage1 attacks knight1, the knight loses 4 health points, and the attacking mage regains full health. The transaction decides that the army of knights is not dead, since it can only see a snapshot in which two knights still have full health and one knight is dead. The other two transactions act on another mage and another knight but proceed in a similar way. Each of these transactions initially had three living knights in its copy of the data and only saw one knight die. Therefore, every transaction decides that the army of knights is still alive.

When all transactions have finished, none of the knights are still alive, yet the boolean that indicates whether the army is dead is still set to false. Why? Because when the snapshots were taken, none of the knights were dead. Each transaction sees its own knight die but knows nothing about the other knights in the army. Although this is an anomaly in our system (called write skew), the writes went through because each of them wrote to a different character and the write to the army never changed. Great, we now have a ghost army!
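The ghost army can be reproduced with a short simulation of snapshot isolation. This is a sketch under stated assumptions: snapshots are modeled as deep copies of a dictionary, `life_leech` mirrors damageArmyCharacter with a fixed damage of 4, and only the conflict check on overlapping write sets is modeled.

```python
import copy
from itertools import combinations

# Committed state: three knights with four hearts each, plus the army flag.
db = {
    "knight1": {"hearts": 4, "dead": False},
    "knight2": {"hearts": 4, "dead": False},
    "knight3": {"hearts": 4, "dead": False},
    "army": {"dead": False},
}
KNIGHTS = ["knight1", "knight2", "knight3"]

def life_leech(snapshot, target):
    """Run the attack against a private snapshot and return the rows it wrote."""
    writes = {}
    character = snapshot[target]
    character["hearts"] -= 4
    character["dead"] = character["hearts"] <= 0
    writes[target] = character
    army_dead = all(snapshot[name]["dead"] for name in KNIGHTS)
    if snapshot["army"]["dead"] != army_dead:
        snapshot["army"]["dead"] = army_dead
        writes["army"] = snapshot["army"]
    return writes

# All three transactions start at the same time, so their snapshots are identical.
snapshots = [copy.deepcopy(db) for _ in KNIGHTS]
write_sets = [life_leech(snap, knight) for snap, knight in zip(snapshots, KNIGHTS)]

# "First committer wins" only rejects overlapping write sets; these are disjoint.
overlap = set()
for first, second in combinations(write_sets, 2):
    overlap |= set(first) & set(second)
for write_set in write_sets:
    db.update(write_set)

all_knights_dead = all(db[name]["dead"] for name in KNIGHTS)
print(all_knights_dead, db["army"]["dead"], overlap)
```

No write sets overlap, so every transaction commits, and the final state has three dead knights in an army that is still marked as alive.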

Impact on developers

Data quality

What if we want to make sure that users have unique names? The transaction that creates a user checks whether the name exists; if it does not, we write a new user with that name. However, if two users try to register with the same name at the same time, snapshot isolation will not notice anything, because the users are written to different rows and therefore do not conflict. We now have two users with the same name in our system.
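The same pattern can be shown for user registration. In this minimal sketch, the `users` dictionary, the `register` function, the row ids, and the name "brecht" are hypothetical; snapshots are again modeled as plain copies.

```python
users = {}  # committed rows, keyed by a generated row id

def register(snapshot, row_id, name):
    """Check uniqueness against the snapshot, then return the row to insert."""
    if any(row["name"] == name for row in snapshot.values()):
        raise ValueError("name already taken")
    return {row_id: {"name": name}}

# Both registrations start from the same (empty) snapshot.
snapshot_1 = dict(users)
snapshot_2 = dict(users)
write_1 = register(snapshot_1, "row-1", "brecht")
write_2 = register(snapshot_2, "row-2", "brecht")

# The write sets touch different rows, so neither transaction is rolled back.
users.update(write_1)
users.update(write_2)

names = [row["name"] for row in users.values()]
print(names)
```

Both uniqueness checks pass because neither transaction can see the other's insert, and the name ends up in the system twice.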

Many other anomalies can appear because of write skew. If you are interested, Martin Kleppmann's book Designing Data-Intensive Applications describes them in more detail.

Writing code in different ways to avoid rollbacks

Now, let's consider a different approach where an attack is not directed at a specific character in the army. In this case, the database is responsible for choosing which knight is attacked first.

 damageArmy(army, damage){
  character = getFirstHealthyCharacter(army)
  character.hearts = character.hearts - damage
  character.dead = isCharacterDead(character)
  // ...
}

If we perform multiple attacks in parallel as in the previous example, getFirstHealthyCharacter will always target the same knight, which causes multiple transactions to write to the same row. This is blocked by the "first committer wins" rule, which rolls back the other two attacks. Although this prevents the anomaly, developers need to understand these issues and code around them creatively. But wouldn't it be easier if the database did this for you out of the box?

Databases that suffer from write skew

Any database that provides snapshot isolation rather than serializability can suffer from write skew. For an overview of databases and their isolation levels, see this article.

4. Out-of-order writes

To avoid lost writes and stale reads, distributed databases aim for something called "strong consistency." We mentioned that databases can either choose to agree on a global order (the safe choice) or decide to resolve conflicts (the choice that leads to lost writes). If we decide on a global order, it means that although the sword and the shield are purchased in parallel, the end result should behave as if we bought the sword first and then the shield. This is also commonly called "linearizability," since you can linearize the database operations. Linearizability is the gold standard for making sure your data is safe.

Different vendors offer different isolation levels, which you can compare here. A term that comes up often is serializability, which is a slightly less strict version of strong consistency (or linearizability). Serializability is already quite strong and covers most anomalies, but it still leaves room for one very subtle anomaly caused by the reordering of writes. In that case, the database is free to switch the order even after the transaction has been committed. Simply put, linearizability is serializability plus a guaranteed order. When the database lacks this guaranteed order, your application is vulnerable to out-of-order writes.

Impact on end users

Reordering of conversations

If someone sends a second message because of a mistake, the conversation can end up ordered in a confusing way.

Reordering of user actions

If our player has 11 gold coins and simply buys items in order of importance without actively checking how many gold coins he has, the database can reorder these purchases. If he did not have enough money, he may end up having bought the least important item first.

In this case, there was a database check that verified whether we had enough gold. Now imagine that we did not have enough money and that letting the account drop below zero costs us money, just as a bank charges you overdraft fees when you go below zero. You might quickly sell an item to make sure you have enough money to buy all three items. However, the sale that was meant to increase your balance might be reordered to the end of the transaction list, effectively pushing your balance below zero. If this were a bank, you would likely incur charges that you should never have had to pay.
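A worked example shows how the order alone changes the outcome. The item names, the prices, the sale amount, and the `OVERDRAFT_FEE` of 2 gold are assumptions made for illustration; the source only specifies the starting balance of 11 gold and three purchases.

```python
OVERDRAFT_FEE = 2  # hypothetical fee charged when a purchase leaves the balance below zero

def apply_in_order(balance, transactions):
    """Apply transactions one by one and return the final balance and total fees."""
    fees = 0
    for _, amount in transactions:
        balance += amount
        if amount < 0 and balance < 0:
            balance -= OVERDRAFT_FEE
            fees += OVERDRAFT_FEE
    return balance, fees

sale = ("sell helmet", +5)
purchases = [("buy sword", -5), ("buy shield", -5), ("buy potion", -5)]

# Order the player submitted: sell first, then buy.
intended = apply_in_order(11, [sale] + purchases)
# Order after the database moved the sale to the end of the list.
reordered = apply_in_order(11, purchases + [sale])
print(intended, reordered)
```

In the intended order, the player ends with 1 gold and no fees. In the reordered sequence, the third purchase pushes the balance to -4, a fee is charged, and the player ends at -1 gold after the sale finally lands.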

Unpredictable security

After configuring security settings, a user expects these settings to apply to all subsequent actions, but problems can arise when users talk to each other through different channels. Remember the example we discussed, in which an administrator is on the phone with a user who wants to make a group private and then add sensitive data to it. Although the time window in which this can happen becomes smaller in databases that provide serializability, it can still occur, because the administrator's action might not be completed until after the user's action. Things go wrong when users communicate through different channels and expect the database to be ordered in real time.

This anomaly can also occur if a user is redirected to a different node because of load balancing. In that case, two consecutive operations end up on different nodes and may be reordered. If a girl adds her parents to a Facebook group with limited viewing permissions and then posts her spring break photos, the images might still end up in her parents' feed.

In another example, an automated trading bot might have settings such as the highest purchase price, expenditure limit, and a list of stocks to focus on. If the user changes the list of stocks the bot should buy and then changes the expenditure limit, he will not be happy if the transactions are reordered and the bot has used the newly allocated budget for the old stock.

Impact on developers

Vulnerabilities

Some exploits depend on transactions potentially being applied in reverse order. Imagine that a gamer receives a trophy as soon as he owns 1,000 gold coins, and he really wants that trophy. The game calculates how many gold coins a player has by adding together the gold in multiple containers, such as his storage and what he is carrying (his inventory). If the player quickly moves money back and forth between his storage and his inventory, he can actually cheat the system.

In the following figure, a second player acts as a partner in crime to make sure that the transfers between storage and inventory happen in different transactions, which increases the chance that these transactions are routed to different nodes. A more serious real-world example is a bank where money is transferred through a third account; the bank may miscalculate whether someone is eligible for a loan because the various transactions were sent to different nodes and did not have enough time to sort themselves out.

Databases that suffer from out-of-order writes

Any database that does not provide linearizability can suffer from out-of-order writes. See this article for an overview of which databases provide linearizability. Spoiler: not many.

All anomalies can return when consistency is bounded

One final relaxation of strong consistency to discuss is guaranteeing it only within certain bounds. Typical bounds are a datacenter region, a partition, a node, a collection, or a row. If you program on top of a database that imposes these kinds of boundaries on strong consistency, you need to keep them in mind to avoid accidentally opening Pandora's box again.

Below is an example of strong consistency that is only guaranteed within one collection. The example contains three collections: one for the players, one for the blacksmiths (who repair the players' items), and one for the items. Each player and each blacksmith has a list of ids that point to items in the items collection.

If you want to trade a shield between two players (for example, from Brecht to Robert), everything is fine, since you stay within one collection and your transaction therefore stays within the bounds where consistency is guaranteed. But what if Robert's sword is at the blacksmith for repairs and he wants to retrieve it? The transaction then spans two collections, the blacksmiths collection and the players collection, and the guarantees are lost. This limitation is often found in document databases such as MongoDB. You then need to change the way you program in order to find creative solutions around the limitation. For example, you could encode the location of the item on the item itself.
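The workaround of storing the location on the item can be sketched as a data structure. This is one possible layout and not a recommendation for any particular document database: the field names `owner` and `location`, the item ids, and the id `smithy-1` are hypothetical.

```python
# A single 'items' collection: ownership and location are fields on the item itself,
# so moving an item never has to touch the players or blacksmiths collections.
items = {
    "shield-1": {"owner": "Brecht", "location": ("player", "Brecht")},
    "sword-1": {"owner": "Robert", "location": ("blacksmith", "smithy-1")},
}

def move_item(item_id, new_owner, new_location):
    """Update one document in one collection, staying within collection-scoped guarantees."""
    item = items[item_id]
    item["owner"] = new_owner
    item["location"] = new_location

def inventory_of(player):
    """Derive a player's inventory by querying the items collection."""
    return sorted(
        item_id for item_id, item in items.items()
        if item["location"] == ("player", player)
    )

move_item("shield-1", "Robert", ("player", "Robert"))  # trade from Brecht to Robert
move_item("sword-1", "Robert", ("player", "Robert"))   # pick up the repaired sword
print(inventory_of("Robert"), inventory_of("Brecht"))
```

The trade-off is that inventories are no longer stored as id lists on players and blacksmiths but have to be derived by querying the items.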

Of course, real games are complex. You might want to be able to drop items on the floor or place them in a market, so that an item can be owned by a player without being in the player's inventory. As things become more complex, these workarounds add significant technical complexity and change the way you code in order to stay within the guarantees of your database.

Conclusion

We've seen examples of different problems that can arise when your database behavior doesn't match expectations. While some situations may seem trivial at first glance, they all have a significant impact on developer productivity, especially when systems are scaled. More importantly, they leave you vulnerable to unpredictable security vulnerabilities – which can cause irreparable damage to the reputation of your application.

We discussed several levels of consistency, but now that we have seen these examples, let's put them together:

Also remember that each of these correctness guarantees may have boundaries:

Finally, realize that we have only mentioned a few anomalies and consistency guarantees, and there are actually more. For interested readers, I highly recommend Martin Kleppmann's Designing Data-Intensive Applications.

We live in a time where we no longer have to care, as long as we choose a strongly consistent database without limitations. Thanks to new approaches such as Calvin (FaunaDB) and Spanner (Google Spanner, FoundationDB), we now have multi-region distributed databases that deliver very low latency and behave as expected in each scenario. So why would you still risk shooting yourself in the foot by choosing a database that does not provide these guarantees?

In the next article in this series, we will cover the impact on the developer experience. Why is it so hard to convince developers that consistency matters? Spoiler: most people need to experience it in person before they see the necessity. But think about it: "If there is a bug, is it a bug in your application or in your data? How would you know?" Once the limitations of your database show up as bugs or a bad user experience, you need to work around those limitations, which leads to inefficient glue code that does not scale. Of course, by then you are heavily invested and the realization comes too late.
