Possible futures of the Ethereum protocol, part 4: The Verge

2024 Oct 23 See all posts


Possible futures of the Ethereum protocol, part 4: The Verge

Special thanks to Justin Drake, Hsiao-wei Wang, Guillaume Ballet, Ignacio, Josh Rudolf, Lev Soukhanov, Ryan Sean Adams and Uma Roy for feedback and review.

One of the most powerful things about a blockchain is the fact that anyone can run a node on their computer and verify that the chain is correct. Even if 95% of the nodes running the chain consensus (PoW, PoS...) all immediately agreed to change the rules, and started producing blocks according to the new rules, everyone running a fully-verifying node would refuse to accept the chain. The stakers who are not part of such a cabal would automatically converge on, and continue building, a chain that continues to follow the old rules, and fully-verifying users would follow that chain.

This is a key difference between blockchains and centralized systems. However, for this property to hold, running a fully-verifying node needs to be actually feasible for a critical mass of people. This applies both to stakers (as if stakers are not verifying the chain, they are not actually contributing to enforcing the protocol rules), and to regular users. Today, running a node is possible on a consumer laptop (including the one being used to write this post), but doing so is difficult. The Verge is about changing this, and making fully-verifying the chain so computationally affordable that every mobile wallet, browser wallet, and even smart watch is doing it by default.


The Verge, 2023 roadmap.


Originally, the "Verge" referred to the idea of moving Ethereum state storage to Verkle trees - a tree structure that allows for much more compact proofs, enabling stateless validation of Ethereum blocks. A node could verify an Ethereum block without having any of the Ethereum state (account balances, contract code, storage...) on its hard drive, at a cost of spending a few hundred kilobytes of proof data and a few hundred extra milliseconds verifying a proof. Today, the Verge represents a much larger vision focused on enabling maximally resource-efficient verification of the Ethereum chain, which includes not just stateless validation technology, but also verifying all Ethereum execution with SNARKs.

In addition to the added long-term focus on SNARK-verifying the whole chain, another new question has to do with whether or not Verkle trees are even the best technology in the first place. Verkle trees are vulnerable to quantum computers, and so if we replace the current KECCAK Merkle Patricia tree with Verkle trees, we will later have to replace the trees again. The natural alternative to Merkle trees is skipping straight to using a STARK of Merkle branches in a binary tree. Historically, this has been considered non-viable due to overhead and technical complexity. More recently, however, we have seen Polygon prove 1.7 million Poseidon hashes per second on a laptop with circle STARKs, and proving times for more "conventional" hashes are also rapidly improving thanks to techniques like GKR.

As a result, over the past year the Verge has become much more open-ended, and there are several possibilities.


The Verge: key goals

In this chapter

Stateless verification: Verkle or STARKs

What problem are we trying to solve?

Today, an Ethereum client needs to store hundreds of gigabytes of state data in order to verify blocks, and this amount continues to increase with each passing year. The raw state data increases by ~30 GB per year, and individual clients have to store some extra data on top to be able to update the trie efficiently.



This reduces the number of users who can run fully-verifying Ethereum nodes: even though hard drives large enough to store all Ethereum state and even history for many years are readily available, the computers that people buy by default tend to only have a few hundred gigabytes of storage. The state size also introduces a great friction into the process of setting up a node for the first time: the node needs to download the entire state, which can take hours or days. This has all kinds of knockon effects. For example, it makes it significantly harder for a staker to upgrade their staking setup. Technically, it's possible to do this with no downtime - start a new client, wait for it to sync, then shut down the old client and transfer the key - but in practice it's technically complicated.

What is it and how does it work?

Stateless verification is a technology that allows nodes to verify blocks without having the entire state. Instead, each block comes with a witness, which includes (i) the values (eg. code, balances, storage) at the specific locations in the state that the block will access, and (ii) a cryptographic proof that those values are correct.

Actually implementing stateless verification requires changing the Ethereum state tree structure. This is because the current Merkle Patricia tree is extremely unfriendly to implementing any cryptographic proof scheme, especially in the worst case. This is true for both "raw" Merkle branches, and the possibility of "wrapping" the Merkle branches in a STARK. The key difficulties stem from two weaknesses of the MPT:

  1. It's a hexary tree (ie. each node has 16 children). This means that on average, a proof in a size-N tree has 32 * (16 - 1) * log16(N) = 120 * log2(N) bytes, or about 3840 bytes in a 232-item tree. With a binary tree, you only need 32 * (2 - 1) * log2(N) = 32 * log2(N) bytes, or about 1024 bytes.
  2. The code is not Merkelized. This means that proving any access of account code requires providing the entire code, which is a maximum of 24000 bytes.



We can compute a worst-case scenario as follows:

30,000,000 gas / 2,400 ("cold" account read cost) * (5 * 480 + 24,000) = 330,000,000 bytes

The branch cost is slightly decreased ( 5 * 480 instead of 8 * 480 ) because the top parts of branches are repeated when there are many of them. But even still, this works out to a completely unrealistic amount of data to download within one slot. If we try to wrap it in a STARK, we get two problems: (i) KECCAK is relatively STARK-unfriendly, and (ii) 330 MB of data means we have to prove 5 million calls to the KECCAK round function, which is way too much to prove on all but the most powerful consumer hardware, even if we could make STARK-proving KECCAK much more efficient.

If we just replace the hexary tree with a binary tree, and we additionally Merkelize code, then the worst case becomes roughly 30,000,000 / 2,400 * 32 * (32 - 14 + 8) = 10,400,000 bytes (the 14 is a subtraction for redundant bits of ~214 branches, and the 8 is the length of a proof going into a leaf in a chunk). Note that this requires a change in gas costs, to charge for accessing each individual chunk of code; EIP-4762 does this. 10.4 MB is much better, but it's still too much data for many nodes to download within one slot. And so we need to introduce some more powerful technology. For this, there are two leading solutions: Verkle trees, and STARKed binary hash trees.

Verkle trees

Verkle trees use elliptic curve-based vector commitments to make much shorter proofs. The key unlock is that the piece of the proof corresponding to each parent-child relationship is only 32 bytes, regardless of the width of the tree. The only limit to the width of the tree is that if the tree gets too wide, proofs become computationally inefficient. The implementation proposed for Ethereum has a width of 256.



The size of a single branch in a proof thus becomes 32 * log256(N) = 4 * log2(N) bytes. The theoretical max proof size thus becomes roughly 30,000,000 / 2,400 * 32 * (32 - 14 + 8) / 8 = 1,300,000 bytes (the math works out slightly differently in practice because of uneven distribution of state chunks, but this is fine as a first approximation).

As an additional caveat, note that in all of the above examples, this "worst case" is not quite a worst case: an even worse case is when an attacker intentionally "mines" two addresses to have a long common prefix in the tree and reads from one of them, which can extend the worst-case branch length by another ~2x. But even with this caveat, Verkle trees get us to ~2.6 MB worst-case proofs, which roughly matches today's worst-case calldata.

We also take advantage of this caveat to do another thing: we make it very cheap to access "adjacent" storage: either many chunks of code of the same contract, or adjacent storage slots. EIP-4762 provides a definition of adjacency, and charges only 200 gas for adjacent access. With adjacent accesses, the worst-case proof size becomes 30,000,000 / 200 * 32 = 4,800,800 bytes, which is still roughly within tolerances. If we want this value decreased for safety, we can increase the adjacent access costs slightly.

STARKed binary hash trees

The technology here is very self-explanatory: you do a binary tree, take the max-10.4-MB proof that you need to prove the values in a block, and replace the proof with a STARK of the proof. This gets us to the point where the proof itself consists of only the data being proven, plus ~100-300 kB fixed overhead from the actual STARK.

The main challenge here is prover time. We can make basically the same calculations as above, except instead of counting bytes, we count hashes. A 10.4 MB block means 330,000 hashes. If we add in the possibility of an attacker "mining" addresses with a long common prefix in the tree, the real worst case becomes around 660,000 hashes. So if we can prove ~200,000 hashes per second, we're fine.

These numbers have already been reached on a consumer laptop with the Poseidon hash function, which has been explicitly designed for STARK-friendliness. However, Poseidon is relatively immature, and so many do not yet trust it for security. There are thus two realistic paths forward:

  1. Do lots and lots of security analysis on Poseidon quickly, and get comfortable enough with it to deploy it at L1
  2. Use a more "conservative" hash function, such as SHA256 or BLAKE

Starkware's circle STARK provers at the time of this writing can only prove ~10-30k hashes per second on a consumer laptop if they are proving conservative hash functions. However, STARK technology is improving quickly. Even today, GKR-based techniques show promise in potentially increasing this to the ~100-200k range.

Use cases of witnesses other than verifying blocks

In addition to verifying blocks, there are three other key use cases for more efficient stateless validation:

One thing that all of these use cases have in common is that they require a fairly large number of proofs, but each proof is small. For this reason, STARK proofs do not actually make sense for them; instead, it's most realistic to just use Merkle branches directly. Another advantage of Merkle branches is that they are updateable: given a proof of a state object X, rooted in block B, if you receive a child block B2 with its witness, you can update the proof to make it rooted in block B2. Verkle proofs are also natively updateable.

What is left to do, and what are the tradeoffs?

The main remaining work to do is:

  1. More analysis on the consequences of EIP-4762 (statelessness gas cost changes)
  2. More work finalizing and testing the transition procedure, which is a large part of the complexity of any statelessness EIP
  3. More security analysis of Poseidon, Ajtai and other "STARK-friendly" hash functions
  4. More development of ultra-efficient STARK protocols for "conservative" (or "traditional") hash functions, eg. based on ideas from Binius or GKR.

We also will soon have a decision point of which of three options to take: (i) Verkle trees, (ii) STARK-friendly hash functions, and (iii) conservative hash functions. Their properties can be approximately summarized in this table:

Algorithm Proof size Security assumptions Worst-case prover time (today)
Verkle Data plus ~100-2,000 kB Elliptic curve (not quantum-resistant) < 1s
STARK over conservative hash functions (eg. SHA256, BLAKE) Data plus ~100-300 kB Conservative hash functions > 10 s
STARK over aggressive hash functions (Poseidon, Ajtai) Data plus ~100-300 kB Relatively new and less-tested hash functions 1-2s

In addition to these "headline numbers", there are a few other important considerations:

If we want the Verkle witness updateability properties in a way that's quantum-safe, and reasonably efficient, one other possible path is lattice-based Merkle trees.

If proof systems end up being not efficient enough in the worst case, one other "rabbit out of a hat" that we could use to compensate for such an inadequacy is multidimensional gas: have separate gas limits for (i) calldata, (ii) computation, (iii) state accesses, and possibly other distinct resources. Multidimensional gas adds complexity, but in exchange it much more tightly bounds the ratio between average case and worst case. With multidimensional gas, the theoretical max number of branches to prove could plausibly decrease from 30,000,000 / 2400 = 12,500 to eg. 3000. This would make BLAKE3 (just barely) sufficient even today, with no further prover improvements.


Multidimensional gas allows the resource limits of a block to much more closely replicate the resource limits of the underlying hardware.


Another "rabbit out of a hat" is this proposal to delay state root computation until the slot after a block. This would give us a full 12 seconds to compute the state root, meaning that only ~60,000 hashes/sec proving time is sufficient even in the most extreme cases, again putting us in the range of BLAKE3 being just barely sufficient.

This approach has the downside that it would increase light client latency by a slot, though there are more clever versions of the technique that reduce this delay to just the proof generation latency. For example, the proof could be broadcasted across the network as soon as any node generates it, instead of waiting for the next block.

How does it interact with other parts of the roadmap?

Solving statelessness greatly increases the ease of solo staking. This becomes much more valuable if technologies that reduce the minimum balance of solo staking, such as Orbit SSF or application-layer strategies like squad staking, become available.

Multidimensional gas becomes easier if EOF is also introduced. This is because a key complexity of multidimensional gas for execution is handling sub-calls which don't pass along the parent call's full gas, and EOF makes this problem trivial by simply making such sub-calls illegal (and native account abstraction would provide an in-protocol alternative for the current primary use case of partial-gas sub-calls).

One other important synergy is between stateless validation and history expiry. Today, clients have to store nearly a terabyte of history data; this data is several times larger than the state. Even if clients are stateless, the dream of nearly-storage-free clients will not be realized unless we can relieve clients of the responsibility to store history as well. The first step in this regard is EIP-4444, which also implies storing historical data in torrents or the Portal network.

Validity proofs of EVM execution

What problem are we trying to solve?

The long-term goal for Ethereum block verification is clear: you should be able to verify an Ethereum block by (i) downloading the block, or perhaps even only small parts of the block with data availability sampling, and (ii) verifying a small proof that the block is valid. This would be an extremely low-resource operation, and could be done on a mobile client, inside a browser wallet, or even (without the data availability part) in another chain.

Getting to this point requires having SNARK or STARK proofs of (i) the consensus layer (ie. the proof of stake), and (ii) the execution layer (ie. the EVM). The former is itself a challenge, and it should be addressed during the process of making further ongoing improvements to the consensus layer (eg. for single slot finality). The latter requires proofs of EVM execution.

What is it and how does it work?

Formally, in the Ethereum specifications, the EVM is defined as a state transition function: you have some pre-state S, a block B, and you are computing a post-state S' = STF(S, B). If a user is using a light client, they do not have S and S' or even B in their entirety; instead, they have a pre-state root R, and a post-state root R' and a block hash H. The full statement that needs to be proven is approximately:

If this exists, you can have a light client that fully verifies the Ethereum EVM execution. This allows clients to be quite low-resource already. To get to true fully-verifying Ethereum clients, you also need to do the same for the consensus side.

Implementations of validity proofs for EVM computation already exist, and are being used heavily by layer 2s. However, there is still a lot to do to make EVM validity proofs viable for L1.

What is left to do, and what are the tradeoffs?

Today, validity proofs for the EVM are inadequate in two dimensions: security and prover time.

A secure validity proof involves having an assurance that the SNARK actually verifies the EVM computation, and does not have a bug in it. The two leading techniques for increasing security are multi-provers and formal verification. Multi-provers means having multiple independently-written validity proof implementations, much like there are multiple clients, and having clients accept a block if it's proven by a sufficiently large subset of these implementations. Formal verification involves using tools often used to prove mathematical theorems, such as Lean4, to prove that a validity proof only accepts inputs that are correct executions of the underlying EVM specification (eg. the EVM K Semantics or the Ethereum execution layer specifications (EELS) written in python).

Sufficiently fast prover time means that any Ethereum block can be proven in less than ~4 seconds. Today, we are still far from this, though we are much closer than was imagined possible even two years ago. To get to this goal, we need to advance in three directions:

In addition to this, the two "rabbits out of a hat" mentioned in the previous section (multidimensional gas, and delayed state root) can also help here. However, it's worth noting that unlike stateless validation, where using either rabbit out of a hat means that we have sufficient technology to do what we need today, even with these techniques full ZK-EVM validation will take more work - it will just take less work.

One thing that was not mentioned above is prover hardware: using GPUs, FPGAs and ASICs to generate proofs faster. Fabric Cryptography, Cysic and Accseal are three companies pushing ahead on this. This will be extremely valuable for layer 2s, but it's unlikely to become a decisive consideration for layer 1, because there is a strong desire to keep layer 1 highly decentralized, which implies that proof generation must be within the capabilities of a reasonably large subset of Ethereum users, and should not be bottlenecked by a single company's hardware. Layer 2s can make more aggressive tradeoffs.

More work remains to be done in each of these areas:

Likely tradeoffs here include:

How does it interact with other parts of the roadmap?

The core tech needed to make EVM validity proofs at layer 1 happen is heavily shared with two other areas:

A successful implementation of validity proofs at layer 1 allows for the ultimate in easy solo staking: even the weakest computer (including phone or smart watch) would be able to stake. This further increases the value of addressing the other limitations to solo staking (eg. the 32 ETH minimum).

Additionally, EVM validity proofs at L1 can enable considerable L1 gas limit increases.

Validity proofs of consensus

What problem are we trying to solve?

If we want it to be possible to fully verify an Ethereum block with a SNARK, then the EVM execution is not the only part we need to prove. We also need to prove the consensus: the part of the system that handles deposits, withdrawals, signatures, validator balance updates, and other elements of the proof-of-stake part of Ethereum.

The consensus is considerably simpler than the EVM, but it has the challenge that we don't have layer 2 EVM rollups as a reason why most of the work is going to be done anyway. Hence, any implementation of proving Ethereum consensus would need to be done "from scratch", although the proof systems themselves, are shared work that can be built on top of.

What is it and how does it work?

The beacon chain is defined as a state transition function, just like the EVM. The state transition function is dominated by three things:

In each block, we need to prove 1-16 BLS12-381 ECADDs per validator (potentially more than one, because signatures can get included in multiple aggregates). This can be compensated for by subset precomputation techniques, so altogether we can say that it's one BLS12-381 ECADD per validator. Today, there are ~30,000 validators signing in each slot. In the future, with single slot finality, this could change in either direction (see the exposition here): if we take the "brute force" route, this could increase to 1 million validators per slot. Meanwhile, with Orbit SSF, it would stay at 32,768, or even decrease to 8,192.


How BLS aggregation works. Verifying the aggregate signature only requires an ECADD per participant, instead of an ECMUL. But 30,000 ECADDs is still a lot to prove.


For pairings, there is a current maximum of 128 attestations per slot, implying the need to verify 128 pairings. With EIP-7549 and further changes, this can plausibly decrease to 16 per slot. Pairings are few in number, but they are extremely expensive: each one takes thousands of times longer to run (or prove) than an ECADD.

A major challenge with proving BLS12-381 operations is that there is no convenient curve whose curve order equals the BLS12-381 field size, which adds considerable overhead to any proving system. The Verkle trees proposed for Ethereum, on the other hand, were built with the Bandersnatch curve, which makes BLS12-381 itself the natural curve to use in a SNARK system to prove a Verkle branch. A fairly naive implementation can prove ~100 G1 additions per second; clever techniques like GKR would almost certainly be required to make proving fast enough.

For SHA256 hashes, today the worst case is the epoch transition block, where the entire validator short-balance tree, and a significant number of validator balances, gets updated. The validator short-balance tree has one byte per validator, so ~1 MB of data gets re-hashed. This corresponds to 32,768 SHA256 calls. If a thousand validators' balances fall above or below a threshold that requires the effective balance, in the validator record, to be updated, that corresponds to a thousand Merkle branches, so perhaps another ten thousand hashes. The shuffling mechanism requires 90 bits per validator (so, 11 MB of data), but this can be computed at any time over the course of an epoch. With single-slot finality, these numbers may again increase or decrease depending on the details. Shuffling becomes unnecessary, though Orbit may bring back the need for some degree of it.

Another challenge is the need to read all validator state, including public keys, in order to verify a block. Reading the public keys alone takes 48 million bytes for 1 million validators, together with Merkle branches. This requires millions of hashes per epoch. If we had to prove the proof-of-stake validation today, a realistic approach would be some form of incrementally verifiable computation: store a separate data structure inside the proof system that is optimized for efficient lookups, and prove updates to this structure.

To summarize, there are a lot of challenges.

Fixing these challenges most efficiently may well require a deep redesign of the beacon chain, which could happen at the same time as a switch to single-slot finality. Features of this redesign could include:

What is left to do, and what are the tradeoffs?

Realistically, it will take years before we have a validity proof of the Ethereum consensus. This is roughly the same timeline that we need to implement single slot finality, Orbit, changes to the signature algorithm, and potentially security analysis needed to become sufficiently confident to use "aggressive" hash functions like Poseidon. Hence, it makes the most sense to work on these other problems, and while doing that work keep STARK-friendliness in mind.

The main tradeoff may well be in order of operations, between a more incremental approach to reforming the Ethereum consensus layer and a more radical "many changes at once" approach. For the EVM, an incremental approach makes sense, because it minimizes disruption to backwards compatibility. For the consensus layer, backwards compatibility concerns are smaller, and there are benefits to a more "holistic" re-think in various details of how the beacon chain is constructed, to best optimize for SNARK-friendliness.

How does it interact with other parts of the roadmap?

STARK-friendliness needs to be a primary concern in long-term redesigns of the Ethereum proof of stake consensus, most notably single-slot finality, Orbit, changes to the signature scheme, and signature aggregation.