
After the Cancun upgrade, what is the performance bottleneck of Rollups?

王林 · 2024-03-28 14:51:22

Compiled by: Azuma, Odaily Planet Daily

At noon on March 26, Beijing time, Monad co-founder Keone Hon published an in-depth article on the performance status of Rollups. In it, Keone detailed how the theoretical TPS ceiling of Rollups should be calculated after the Blob-enabling Cancun upgrade, and explained why, even after the upgrade, a single transaction on some Layer 2 networks (notably Base) can still cost several dollars. Keone also outlined some of the bottlenecks Rollups face and potential improvements.

The following is Keone's original content, compiled by Odaily Planet Daily. For readers' convenience, the translator has added certain annotations to the original text.

There have been some discussions in the market recently about Rollup execution bottlenecks and gas limits, which involve not only Layer 1 but also Layer 2. I discuss these bottlenecks below.

Data Availability (DA)

The Cancun upgrade introduced the Blob data structure defined by EIP-4844, greatly improving Ethereum's data availability (DA): Layer 2 data-publication transactions no longer need to bid in the same fee market as ordinary Layer 1 transactions.

Currently, Blob capacity is three 125 KB Blobs per block (every 12 seconds), i.e. about 31.25 KB per second. Given that a compressed transaction is roughly 100 bytes, this means the shared TPS of all Rollups is about 300.
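The arithmetic above can be checked in a few lines. This is a back-of-the-envelope sketch using the figures from the text (3 Blobs of 125 KB per 12-second block, ~100 bytes per transaction); it is illustrative only, not a precise protocol constant.

```python
# Shared-TPS estimate for all Rollups under EIP-4844, per the text's figures.
BLOBS_PER_BLOCK = 3
BLOB_SIZE_BYTES = 125 * 1000       # ~125 KB per Blob
BLOCK_TIME_SECONDS = 12
TX_SIZE_BYTES = 100                # rough size of one compressed Rollup tx

bytes_per_second = BLOBS_PER_BLOCK * BLOB_SIZE_BYTES / BLOCK_TIME_SECONDS
shared_tps = bytes_per_second / TX_SIZE_BYTES

print(f"DA bandwidth: {bytes_per_second / 1000:.2f} KB/s")  # ~31.25 KB/s
print(f"Shared Rollup TPS: ~{shared_tps:.0f}")              # ~300
```

Better compression (smaller `TX_SIZE_BYTES`) or more Blobs per block raises this ceiling directly, which is the first remark below.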

Of course, there is some information here that requires special remarks.

  • First, if Rollups adopt better transaction-data compression to shrink individual transactions, TPS can grow.
  • Second, in theory Rollups can also keep publishing data via calldata (the old approach from before the Cancun upgrade) in addition to Blobs, although doing so adds complexity.
  • Third, different ZK-Rollups publish state in different ways (zkSync Era and Starknet in particular), so for those Rollups both the calculation method and the result will differ.

Gas limit of Rollup

Recently, Base has attracted great attention due to a surge in gas fees, with some ordinary transactions on the network costing several dollars.

Why did gas fees on Base fall only briefly after the Cancun upgrade, then return to or even exceed pre-upgrade levels? Because Base blocks have a total gas limit, enforced through a parameter in its code.

The gas parameters Base currently uses are the same as Optimism's: a total limit of 5 million gas per Layer 2 block (every 2 seconds). When demand on the network (total transactions) exceeds supply (block space), the fee mechanism reprices blockspace upward, causing gas fees on the network to surge.
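The repricing works like an EIP-1559-style base-fee update: when blocks are persistently full against the limit, the base fee compounds upward block after block. A minimal sketch, assuming L1 Ethereum's parameter shape (target = half the limit, ±12.5% max step); OP Stack chains use the same mechanism with their own constants.

```python
# Minimal EIP-1559-style base-fee update, illustrating how a fixed per-block
# gas limit turns sustained demand into rising fees.
GAS_LIMIT = 5_000_000          # per 2-second L2 block, per the text
GAS_TARGET = GAS_LIMIT // 2    # assumed: target is half the limit
MAX_CHANGE_DENOMINATOR = 8     # assumed: +/-12.5% max step per block

def next_base_fee(base_fee: int, gas_used: int) -> int:
    """Raise the base fee when a block is above target, lower it below."""
    delta = base_fee * (gas_used - GAS_TARGET) // GAS_TARGET // MAX_CHANGE_DENOMINATOR
    return max(base_fee + delta, 0)

# Sustained full blocks: the base fee compounds ~12.5% per block.
fee = 1_000_000_000  # 1 gwei
for _ in range(10):
    fee = next_base_fee(fee, GAS_LIMIT)
print(f"After 10 full blocks: {fee / 1e9:.2f} gwei")  # more than 3x in 20 seconds
```

With 2-second blocks, ten consecutive full blocks take only 20 seconds, which is why demand spikes translate into fee spikes so quickly on an L2.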

Why doesn't Base simply raise the total gas limit? Put differently, why does a Rollup need a total gas limit at all?

In addition to the data-availability TPS ceiling mentioned above, there are two other major reasons: the execution-throughput bottleneck and the hidden danger of state growth.

Question 1: Bottleneck of execution throughput

Generally speaking, EVM Rollups run an EVM forked from Geth, which means they have performance characteristics similar to the Geth client.

The Geth client is single-threaded (it can only handle one task at a time) and stores its state in a merkle patricia trie (MPT) kept in LevelDB/PebbleDB, a general-purpose database that itself uses another tree structure (an LSM tree) underneath to store data on a solid-state drive (SSD).

For Rollups, "state access" (reading values from the merkle trie) and "state update" (updating the merkle trie at the end of each block) are the most expensive parts of execution. This is because a single read from the SSD costs 40-100 microseconds, and because the merkle trie is embedded in another data structure (the LSM tree), each access requires many unnecessary extra lookups.

Imagine this as finding a specific file in a complex file system: you must walk from the root directory (the trie's root node) all the way to the target file (a leaf node). Each step of that walk means looking up a specific key in LevelDB, and inside LevelDB the actual data storage goes through yet another data structure, the LSM tree. This layering produces many additional lookup steps, making reads and updates quite slow and inefficient.
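A rough cost model makes the layering concrete. Every trie node on the root-to-leaf path is its own database lookup, and every database lookup may touch several LSM-tree levels on the SSD. The state size and reads-per-lookup figures below are assumptions for illustration; only the 40-100 microsecond SSD read cost comes from the text.

```python
# Rough model of why MPT-on-LevelDB state reads are expensive.
import math

ACCOUNTS = 250_000_000          # assumed order of magnitude of state entries
TRIE_FANOUT = 16                # hexary merkle patricia trie
LSM_READS_PER_LOOKUP = 3        # assumed SSD touches per LevelDB get
SSD_READ_US = 70                # midpoint of the 40-100 us range in the text

trie_depth = math.ceil(math.log(ACCOUNTS, TRIE_FANOUT))   # root-to-leaf hops
us_per_state_read = trie_depth * LSM_READS_PER_LOOKUP * SSD_READ_US

print(f"Trie depth: ~{trie_depth} nodes")
print(f"One state read: ~{us_per_state_read} us "
      f"({trie_depth} node lookups x {LSM_READS_PER_LOOKUP} SSD reads each)")
```

Even under these mild assumptions, a single state read lands in the millisecond range, which is why serial, blocking state access dominates execution time.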

In Monad's design, we solve this problem with MonadDb, a custom database that stores the merkle trie directly on disk, avoiding LevelDB's overhead; supports asynchronous IO, allowing multiple reads to be processed in parallel; and bypasses the file system.

In addition, the "optimistic parallel execution" mechanism adopted by Monad allows multiple transactions to be processed in parallel and their status can be extracted from MonadDb in parallel.
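The payoff of asynchronous IO is latency hiding: many slow reads overlap instead of queuing. The sketch below simulates the "disk" with a sleep to show the effect; it illustrates the concurrency model the text attributes to MonadDb, not Monad's actual implementation.

```python
# Blocking, one-at-a-time reads (the Geth/LevelDB model) vs issuing the same
# reads concurrently (the asynchronous-IO model). The disk is simulated.
import asyncio
import time

READ_LATENCY_S = 0.01   # pretend each state read costs 10 ms of IO wait

async def read_state(key: str) -> str:
    await asyncio.sleep(READ_LATENCY_S)        # simulated SSD access
    return f"value-of-{key}"

async def benchmark() -> tuple[float, float]:
    keys = [f"account-{i}" for i in range(20)]

    t0 = time.perf_counter()
    for k in keys:                              # serial: latencies add up
        await read_state(k)
    serial = time.perf_counter() - t0

    t0 = time.perf_counter()
    await asyncio.gather(*(read_state(k) for k in keys))  # overlapped IO
    concurrent = time.perf_counter() - t0
    return serial, concurrent

serial, concurrent = asyncio.run(benchmark())
print(f"serial: {serial:.3f}s, concurrent: {concurrent:.3f}s")
```

With 20 reads, the serial loop pays roughly 20 latencies while the concurrent batch pays roughly one, which is the same win optimistic parallel execution seeks across transactions.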

However, Rollup does not have these optimizations and therefore has a bottleneck in execution throughput.

It should be noted that the Erigon/Reth clients include certain database-efficiency optimizations, and some Rollup clients are built on them (such as OP-Reth). Erigon/Reth use a flat data layout, which reduces query cost on reads to some extent; however, they do not support asynchronous reads or multi-threaded processing. Additionally, the merkle root must be recomputed after each block, which is also a rather slow process.
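The flat-layout win can be stated in one comparison: the account value is keyed directly by address, so a read is a single database lookup rather than one lookup per trie node. Both counts below are illustrative assumptions, not measurements of Erigon/Reth.

```python
# Why a flat state layout cuts read cost: one key-value get per state read,
# instead of one get per trie node on the root-to-leaf path. The trie is
# still needed for the state root, which is why root recomputation after
# each block remains slow even with a flat layout.
TRIE_DEPTH = 7              # assumed hops per MPT read at current state size
DB_READS_PER_LOOKUP = 3     # assumed SSD touches per key-value get

mpt_reads = TRIE_DEPTH * DB_READS_PER_LOOKUP   # one DB get per trie node
flat_reads = 1 * DB_READS_PER_LOOKUP           # value keyed by address

print(f"MPT-style state read:   ~{mpt_reads} SSD touches")
print(f"Flat-layout state read: ~{flat_reads} SSD touches")
```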

Question 2: Hidden dangers of state growth

Like other blockchains, Rollups also limit their throughput to prevent their active state from growing too fast.

A common argument in the market is that state growth is worrying because if state data grows significantly, solid-state drive (SSD) capacity requirements must also be adjusted upward. I think this is somewhat inaccurate: SSDs are relatively cheap (a high-quality 2 TB SSD costs about $200), and Ethereum's full state has grown to "only" about 200 GB in its nearly ten-year history. From a pure storage perspective, there is still plenty of room for growth.

The bigger hidden danger is that as state grows, querying a given piece of state takes longer. The current merkle patricia trie uses a "shortcut" whenever a node has only one child, which reduces the trie's effective depth and thereby speeds up queries. But as the merkle trie fills up, fewer and fewer of these shortcuts remain.
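The shortcut decay can be demonstrated directly: insert random hex keys into a naive hexary trie, collapse single-child runs when counting path length, and watch the average depth grow with the key count. A toy sketch, not Ethereum's actual MPT encoding.

```python
# As a trie fills, shared prefixes force more branching nodes, so fewer
# single-child runs ("shortcuts") survive and lookup paths get longer.
import random

def avg_compressed_depth(num_keys: int, key_len: int = 8, seed: int = 0) -> float:
    """Average root-to-leaf hops after collapsing single-child runs."""
    rng = random.Random(seed)
    keys = {"".join(rng.choice("0123456789abcdef") for _ in range(key_len))
            for _ in range(num_keys)}
    root: dict = {}
    for k in keys:                      # build a naive hexary trie
        node = root
        for ch in k:
            node = node.setdefault(ch, {})
    total = 0
    for k in keys:                      # only branching nodes cost a hop; a
        node, hops = root, 0            # run of single-child nodes compresses
        for ch in k:
            if len(node) > 1:
                hops += 1
            node = node[ch]
        total += max(hops, 1)
    return total / len(keys)

sparse = avg_compressed_depth(100)      # mostly-empty trie: short paths
full = avg_compressed_depth(50_000)     # fuller trie: fewer shortcuts left
print(f"100 keys: ~{sparse:.1f} hops; 50,000 keys: ~{full:.1f} hops")
```

Since each hop is a disk lookup, longer paths mean slower state access, which is why the state-growth problem is ultimately an access-efficiency problem.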

In summary, the hidden danger of state growth is ultimately a matter of state access efficiency. Therefore, accelerating state access is the key to making state growth more sustainable.

Why just optimizing the hardware doesn’t work?

Layer 2s are currently still fairly centralized: the network relies on a single sequencer to maintain state and produce blocks. One might ask, why not run the sequencer on hardware with very large RAM (random-access memory) so that all state can be held in memory?

There are two reasons for this.

First, this does not solve the data-availability bottleneck on the Ethereum mainnet. Judging from Base's current situation, the gas surge is not caused by insufficient mainnet DA capacity, but in the long run DA will eventually become the major bottleneck limiting Rollups.

The second issue is decentralization. Although the sequencer remains highly centralized, the other roles involved in operating the network also matter: they need to be able to run nodes independently, replay the same transaction history, and maintain the same state.

The raw transaction data and state commitments on Layer 1 are not enough on their own to reconstruct the full state. Any actor that needs access to the full state (such as a merchant, exchange, or automated trader) should run a full Layer 2 node that processes transactions and keeps an up-to-date copy of the state.

Rollups are still blockchains, and blockchains are interesting because of their ability to achieve global coordination through shared global state. For all blockchains, powerful software is necessary, and optimizing hardware alone is not enough to solve the problem.

Community Interaction

After Keone posted the article, key people from several leading Layer 2 projects interacted in the replies.


In response to the line "the merkle root needs to be recalculated after each block," zkSync co-founder Alex Gluchowski asked whether Monad is different in this regard.

Keone replied that Monad uses an optimized algorithm for computing the merkle root after each block.


Base lead Jesse Pollak also used the article to explain why gas fees on Base rose rather than fell after the Cancun upgrade. He said that EIP-4844 significantly reduced Layer 1 DA costs, and gas fees should have fallen; but network transaction demand has grown more than fivefold, and Base blocks are subject to a gas-throughput limit (5 million gas per 2-second block, i.e. 2.5 million gas per second), so demand exceeds supply and gas fees rise.


Statement: This article is reproduced from panewslab.com.