Original title: "Possible futures of the Ethereum protocol, part 1: The Merge"
Written by: Vitalik Buterin
Compiled by: Tia, Techub News
Initially, “the Merge” referred to the transition from proof of work to proof of stake. Today, Ethereum has been running as a proof of stake system for nearly two years, and proof of stake has performed excellently in terms of stability, performance, and avoiding centralization risks. However, there are still some areas where proof of stake needs improvement.
The roadmap I outlined in 2023 includes the following parts: improving technical features, such as increasing stability, performance, and accessibility for small validators, as well as economic reforms to address centralization risks. The former is part of “the Merge,” while the latter is part of “the Scourge.”
This article will focus on the “the Merge” part: What improvements can be made to the technical design of proof of stake, and what are the ways to achieve these improvements?
Please note that this is a list of ideas, not an exhaustive checklist of things that need to be accomplished for proof of stake.
Single Slot Finality and Staking Democratization
What problem are we trying to solve?
Currently, it takes 2-3 epochs (about 15 minutes) to finalize a block, and 32 ETH is required to become a validator. This is due to the need to balance three objectives:
- Maximizing the number of validators participating in staking (which directly means minimizing the minimum ETH required to stake)
- Minimizing finality time
- Minimizing the overhead of running a node
These three objectives conflict: to achieve economic finality (i.e., an attacker must destroy a large amount of ETH to revert a finalized block), every validator needs to sign two messages each time finality is achieved. Therefore, if there are many validators, either it takes a long time to process all the signatures, or it requires very powerful nodes to process them all within a single slot.
Ultimately, all of this serves one goal: attackers should have to incur significant costs to succeed. This is what the term "economic finality" means. If we disregard this goal, we could simply randomly select a committee (as Algorand does) to finalize each slot. However, the problem with that approach is that if an attacker does control 51% of the validators, they can attack at very low cost (reverting finalized blocks, censoring, or delaying finality): only the portion of their nodes in the committee can be detected as participating in the attack and punished, whether through slashing or a minority soft fork. This means attackers could repeatedly attack the chain. Therefore, if we want economic finality, a simple committee-based approach will not work.
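To make the trade-off concrete, here is a rough back-of-the-envelope sketch; the ETH price, total stake, and 1% committee share are hypothetical illustration values, not protocol constants:

```python
# Illustrative comparison (hypothetical numbers, not protocol constants):
# how much stake an attacker must put at risk to revert finality when
# (a) every validator signs, versus (b) only a sampled committee signs.

ETH_PRICE_USD = 2_500          # assumed price, for illustration only
TOTAL_STAKE_ETH = 33_000_000   # assumed total ETH staked

def slashable_stake(participating_stake_eth: float) -> float:
    """To finalize two conflicting blocks, more than 1/3 of the
    participating stake must sign both, so roughly 1/3 is provably
    slashable."""
    return participating_stake_eth / 3

# (a) The full validator set participates in finality.
full_cost = slashable_stake(TOTAL_STAKE_ETH) * ETH_PRICE_USD

# (b) Only a committee holding 1% of the stake is sampled per slot.
committee_cost = slashable_stake(TOTAL_STAKE_ETH * 0.01) * ETH_PRICE_USD

print(f"full-set attack cost:  ${full_cost / 1e9:.1f}B")
print(f"committee attack cost: ${committee_cost / 1e9:.2f}B")
```

Under these made-up numbers, sampling a small committee cuts the provable attack cost by two orders of magnitude, which is exactly why a naive committee design forfeits economic finality.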
At first glance, we do need all validators to participate.
But in an ideal scenario, we can still achieve economic finality while improving the following two aspects:
- Finalizing blocks within a single slot (ideally keeping or even reducing the current 12-second slot length), instead of in 15 minutes.
- Allowing validators to stake with 1 ETH (down from the current 32 ETH).
These goals can be seen as “bringing Ethereum’s performance closer to (more centralized) performance-focused L1s.”
However, it will still use a finality mechanism with stronger security guarantees to protect all Ethereum users. Currently, most users do not benefit from this level of security because they are unwilling to wait 15 minutes; with a single-slot finality mechanism, users get finality almost as soon as their transaction is confirmed. Furthermore, if users and applications do not have to worry about chain rollbacks, the protocol and its surrounding infrastructure can be simplified.
The second goal is aimed at supporting solo stakers (users staking independently rather than relying on institutions). The main factor preventing more people from solo staking is the 32 ETH minimum requirement. Lowering the minimum requirement to 1 ETH would address this issue, making other factors the main limitations for solo staking.
But there is a challenge: faster finality and more democratized staking conflict with minimizing overhead. This is why we did not adopt single slot finality at the beginning. However, recent research has proposed some possible solutions to this problem.
What is it and how does it work?
Single-slot finality refers to a consensus algorithm that finalizes blocks within a single slot. This is not an inherently difficult goal: many algorithms (such as Tendermint consensus) already achieve it with optimal properties. However, Tendermint lacks Ethereum's unique "inactivity leak" property, which allows Ethereum to keep operating and eventually recover even when more than 1/3 of validators are offline. Fortunately, there are now solutions for this: there are proposals to modify Tendermint-style consensus to accommodate inactivity leaks.
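The inactivity-leak dynamic can be illustrated with a toy simulation; the 1%-per-epoch leak rate and the stake split below are made-up numbers for illustration, not actual protocol parameters:

```python
# Toy model of an "inactivity leak" (hypothetical parameters): offline
# validators' balances decay each epoch until the online validators once
# again hold a 2/3 supermajority and finality can resume.

LEAK_RATE = 0.01  # fraction of offline balance leaked per epoch (made up)

online = 600.0    # total stake of online validators
offline = 400.0   # total stake of offline validators (>1/3, so no finality)

epochs = 0
while online / (online + offline) < 2 / 3:
    offline *= (1 - LEAK_RATE)  # offline stake leaks away
    epochs += 1

print(f"finality recovers after {epochs} epochs")  # prints: 29 epochs
```

The key point is qualitative: the chain never halts permanently, because the non-participating stake shrinks until the remaining participants form a supermajority.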
Single Slot Finality Proposal
The hardest part is figuring out how to make single slot finality work effectively with a very high number of validators without incurring extremely high overhead for node operators. Currently, there are several solutions:
- **Option 1: Brute force.** Strive for better signature aggregation protocols, possibly using ZK-SNARKs, which would allow us to process signatures from millions of validators in each slot.
Horn, one of the optimized aggregation protocol designs
- **Option 2: Orbit committees.** A new mechanism that randomly selects a medium-sized committee to finalize the chain while keeping the cost of attacking it high.
One way to think about Orbit SSF is that it opens up a compromise: unlike Algorand-style committees, it does not give up economic finality entirely, but it still achieves a meaningfully high economic attack cost, allowing Ethereum to retain enough economic finality for extreme security while also gaining single-slot efficiency.
Orbit leverages the pre-existing heterogeneity in validator deposit sizes to achieve as much economic finality as possible while still giving small validators a corresponding role in participation. Additionally, Orbit uses a slow committee rotation mechanism to ensure high overlap between adjacent quorums, thereby ensuring that its economic finality remains applicable even during committee rotations.
- **Option 3: Dual-layer staking.** A mechanism that divides stakers into two tiers, one with a higher deposit requirement and one with a lower deposit requirement. Only the higher-deposit tier participates directly in economic finality. There have been proposals (see the Rainbow Staking post) specifying the rights and responsibilities of the lower-deposit tier. Common ideas include:
  - Delegating attestation rights to higher-tier stakers
  - Randomly sampling lower-tier stakers to attest to and finalize each block
  - The right to generate inclusion lists
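To make Option 2 more concrete, here is a toy sketch of stake-weighted committee sampling with slow rotation; the committee size, rotation fraction, and balance tiers are all hypothetical choices that only illustrate the quorum-overlap idea, not the actual Orbit design:

```python
import random

# Toy sketch of Orbit-style committee selection: validators are sampled
# with probability proportional to their balance, and the committee
# rotates slowly, so adjacent committees share most of their members.

def sample_committee(balances: dict, size: int, rng: random.Random) -> set:
    """Stake-weighted sampling without replacement (illustrative only)."""
    pool = dict(balances)
    committee = set()
    while pool and len(committee) < size:
        total = sum(pool.values())
        r = rng.uniform(0, total)
        acc = 0.0
        for v, bal in pool.items():
            acc += bal
            if r <= acc:          # v is chosen with probability bal/total
                committee.add(v)
                del pool[v]
                break
    return committee

def rotate(committee: set, balances: dict, fraction: float,
           rng: random.Random) -> set:
    """Swap out only a small fraction per slot, keeping overlap high."""
    n_out = max(1, int(len(committee) * fraction))
    leaving = set(rng.sample(sorted(committee), n_out))
    stayers = committee - leaving
    outside = {v: b for v, b in balances.items() if v not in committee}
    return stayers | sample_committee(outside, n_out, rng)

rng = random.Random(42)
# Heterogeneous deposit sizes, as the text describes.
balances = {f"v{i}": rng.choice([1, 8, 32, 256]) for i in range(1000)}
c0 = sample_committee(balances, 128, rng)
c1 = rotate(c0, balances, fraction=0.05, rng=rng)
overlap = len(c0 & c1) / len(c0)
print(f"overlap between adjacent committees: {overlap:.0%}")
```

Because only ~5% of members rotate per step, any two adjacent committees share a large quorum, which is what lets the economic-finality argument carry across rotations.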
What are the connections to existing research?
What else needs to be done? What are the trade-offs?
There are four paths to choose from (we can also take a hybrid path):
- Maintain the status quo
- Orbit SSF
- Brute-force SSF
- SSF with a dual-layer staking mechanism
(1) means doing nothing and keeping things as they are, but this leaves Ethereum's security experience and staking-centralization properties worse than they could be.
(2) avoids “high-tech” solutions by cleverly rethinking protocol assumptions: we relax the requirements for “economic finality,” so we require that attacks are expensive, but the attack cost could be 10 times lower than it is now (e.g., an attack cost of $2.5 billion instead of $25 billion). It is widely believed that Ethereum’s current economic finality far exceeds what is necessary, and its main security risks lie elsewhere, so this could be considered an acceptable sacrifice.
The main work is to verify that the Orbit mechanism is secure and has the desired properties, then fully formalize and implement it. Additionally, EIP-7251 (Increase Maximum Effective Balance) allows validators to voluntarily merge their balances, which would immediately reduce the chain's verification overhead and serve as an effective initial phase for the launch of Orbit.
(3) forcibly solves the problem with high-tech solutions. Achieving this requires collecting a large number of signatures (over 1 million) in a very short time (5-10 seconds).
(4) creates a dual-layer staking system without overthinking the mechanism or using high-tech solutions, but it still carries centralization risks. The risks largely depend on the specific rights granted to the lower staking tier. For example:
- If lower-tier stakers must delegate their attestation rights to higher-tier stakers, the delegation could become centralized, and we would end up with two highly centralized staking tiers.
- If lower-tier stakers are randomly sampled to approve each block, an attacker could spend a minimal amount of ETH to block finality.
- If lower-tier stakers can only create inclusion lists, the attestation layer could remain centralized, at which point a 51% attack on that layer could censor the inclusion lists themselves.
Multiple strategies can be combined, such as:
(1 + 2): Add Orbit but do not implement single slot finality.
(1 + 3): Use brute force techniques to reduce the minimum deposit requirement without implementing single slot finality. The required aggregation amount is 64 times less than in pure (3), making the problem easier.
(2 + 3): Execute Orbit SSF with conservative parameters (e.g., a 128k validator committee instead of 8k or 32k) and use brute force techniques to make it super efficient.
(1 + 4): Add rainbow staking but do not implement single slot finality.
How does it interact with other parts of the roadmap?
In addition to its other benefits, single-slot finality reduces the risk of certain types of multi-block MEV attacks. Furthermore, in a single-slot finality world, attester-proposer separation (APS) and other in-protocol block production mechanisms would need to be designed differently.
The weakness of achieving goals through brute force is that reducing slot time becomes more challenging.
Single Secret Leader Election
What problem are we trying to solve?
Today, which validator will propose the next block can be known in advance. This creates a security vulnerability: attackers can monitor the network, determine which validators correspond to which IP addresses, and launch DoS attacks against them when they are about to propose a block.
What is it and how does it work?
The best way to solve the DoS problem is to hide which validator will produce the next block, at least until the block is actually produced. If we drop the "single" requirement (that only one party may produce the next block), one solution is to allow anyone to create the next block, provided their randao reveal is less than 2^256 / N. On average, only one validator can meet this requirement (but sometimes there are two or more, and sometimes none). Therefore, combining the "secret" requirement with the "single" requirement has long been a challenge.
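The threshold lottery described above can be sketched as follows; the hash-based `reveal` stand-in is a simplification, since real randao reveals are BLS signatures rather than plain hashes:

```python
import hashlib
import os

# Sketch of the "anyone whose randao reveal is below 2**256 / N may
# propose" lottery. Each validator passes with probability 1/N, so in
# expectation exactly one validator qualifies per slot -- but sometimes
# two or more do, and sometimes none, which is the "single" problem.

N = 10_000                # number of validators (illustrative)
THRESHOLD = 2**256 // N   # in expectation, one reveal falls below this

def reveal(validator_secret: bytes, slot: int) -> int:
    """Stand-in for a randao reveal: a hash of a secret and the slot."""
    h = hashlib.sha256(validator_secret + slot.to_bytes(8, "big")).digest()
    return int.from_bytes(h, "big")

secrets = [os.urandom(32) for _ in range(N)]
winners = [i for i in range(N) if reveal(secrets[i], slot=1) < THRESHOLD]
print(f"{len(winners)} validator(s) may propose this slot")
```

Running this repeatedly shows the Poisson-like behavior: usually one winner, occasionally zero or several, which is why "secret" alone is easy but "single secret" is hard.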
The single secret leader election protocol creates a “blind” validator ID for each validator using some cryptographic techniques, then allows many proposers to have the opportunity to reshuffle and re-blind the blind ID pool (similar to how mix networks work), thereby solving this problem. In each slot, a random blind ID is selected. Only the owner of that blind ID can generate a valid proof to propose a block, but no one knows which validator corresponds to that blind ID.
Whisk SSLE Protocol
What are the connections to existing research?
What remains to be done? What are the trade-offs?
In fact, what remains is to find and implement a sufficiently simple protocol so that we can easily deploy it on the mainnet. We place a high value on simplicity in Ethereum, and we do not want complexity to increase further. The SSLE implementations we have seen have added hundreds of lines of specification code and introduced new assumptions in complex cryptography. Finding a sufficiently efficient quantum-resistant SSLE implementation is also an outstanding issue.
Ultimately, it may be the case that the “marginal additional complexity” of SSLE will only decrease to a sufficiently low level when we boldly attempt to introduce mechanisms for executing general zero-knowledge proofs into the Ethereum protocol at L1 for other reasons (such as state trees, ZK-EVM).
Another option is to completely ignore SSLE and use off-protocol mitigations (such as at the p2p layer) to address the DoS issue.
How does it interact with other parts of the roadmap?
If we add an attester-proposer separation (APS) mechanism, such as execution tickets, then execution blocks (i.e., blocks containing Ethereum transactions) will not require SSLE, as we can rely on dedicated block builders. However, for consensus blocks (i.e., blocks containing protocol messages such as attestations and possibly inclusion lists), we will still benefit from SSLE.
Faster Transaction Confirmation
What problem are we trying to solve?
Shortening Ethereum's transaction confirmation time from 12 seconds to 4 seconds would be valuable. Doing so would significantly improve the user experience on L1 and for based rollups, and would make DeFi protocols more efficient. It would also make it easier for L2s to decentralize, because it would allow a large class of L2 applications to run on based rollups, reducing the need for L2s to build their own committee-based decentralized sequencing.
What is it and how does it work?
There are roughly two technologies here:
- Reducing the slot time, for example to 8 seconds or 4 seconds. This does not necessarily mean 4-second finality: finality inherently requires three rounds of communication, so we can make each round of communication a separate block, which would give at least a preliminary confirmation after 4 seconds.
- Allowing proposers to publish pre-confirmations during the slot. In the extreme case, a proposer could incorporate transactions into its block in real time as it sees them and immediately publish a pre-confirmation message for each one ("My first transaction is 0x1234…", "My second transaction is 0x5678…"). The case where a proposer publishes two conflicting confirmations can be handled in two ways: (i) slashing the proposer, or (ii) having attesters vote on which one came first.
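Approach (i), slashing a proposer who equivocates, might be sketched like this; the message format is hypothetical and signatures are stubbed out, whereas a real implementation would verify BLS signatures and use the two signed messages as the slashing proof:

```python
from typing import NamedTuple

# Sketch of detecting pre-confirmation equivocation: a proposer who
# publishes two conflicting pre-confirmations for the same slot and
# transaction index is slashable. (Hypothetical structures; signatures
# are omitted for brevity.)

class PreConfirmation(NamedTuple):
    proposer: str
    slot: int
    index: int       # position within the block ("my first tx is ...")
    tx_hash: str

seen: dict = {}      # (proposer, slot, index) -> first pre-confirmation

def process(pc: PreConfirmation) -> str:
    key = (pc.proposer, pc.slot, pc.index)
    prior = seen.get(key)
    if prior is None:
        seen[key] = pc
        return "accepted"
    if prior.tx_hash != pc.tx_hash:
        # The two signed, conflicting messages would themselves
        # constitute the slashing proof.
        return "equivocation: slashable"
    return "duplicate"

a = PreConfirmation("proposer1", slot=100, index=0, tx_hash="0x1234")
b = PreConfirmation("proposer1", slot=100, index=0, tx_hash="0x5678")
print(process(a))  # accepted
print(process(b))  # equivocation: slashable
```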
What are the connections to existing research?
What remains to be done? What are the trade-offs?
The feasibility of shortening slot times is currently unclear. Even today, many validators in various parts of the world struggle to receive attestations quickly enough. Attempting a 4-second slot time risks centralizing the validator set: due to latency, it would be impractical to run a validator outside a few privileged regions.
The weakness of the proposer pre-confirmation approach is that it can significantly improve the average-case inclusion time but not the worst case: if the current proposer is running well, your transaction will be pre-confirmed in 0.5 seconds instead of being included after (on average) 6 seconds, but if the current proposer is offline or running poorly, you still have to wait up to a full 12 seconds for the next slot to begin with a new proposer.
Additionally, there is an open question of how to incentivize pre-confirmations. Proposers have an incentive to maximize their optionality for as long as possible. If attesters sign off on the timeliness of pre-confirmations, then transaction senders could condition part of the fee on an immediate pre-confirmation, but this places an additional burden on attesters and may make it harder for them to continue acting as a neutral "dumb pipe."
On the other hand, if we do not attempt this and keep confirmation times at 12 seconds (or longer), the ecosystem will place more weight on L2 pre-confirmation mechanisms, and cross-L2 interactions will take longer.
How does it interact with other parts of the roadmap?
Proposer-based pre-confirmation in practice relies on an attester-proposer separation (APS) mechanism, such as execution tickets. Otherwise, the pressure of providing real-time pre-confirmations may be too centralizing for regular validators.
Other Research Areas
51% Attack Recovery
It is generally believed that if a 51% attack occurs (including attacks that cannot be proven cryptographically, such as censorship), the community will coordinate to implement a minority soft fork to ensure that good actors win and bad actors are penalized or slashed for inactivity. However, this level of reliance on the social layer can be considered unhealthy. We can try to reduce reliance on the social layer by making the recovery process as automated as possible.
Complete automation is impossible, because if it were fully automated, it would be equivalent to a consensus algorithm with a fault tolerance of >50%, and we already know the very strict mathematical limitations of such algorithms. But we can achieve partial automation: for example, if a client has seen a transaction for long enough and a chain still does not include it, the client can automatically refuse to accept that chain as finalized, or even refuse to accept it as the head of the fork choice. A key goal is to ensure that attackers at least cannot achieve a quick, complete victory.
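The partial-automation idea might look something like this inside a client; the threshold value and data structures below are hypothetical, chosen only to illustrate the rule "a long-visible transaction that a chain keeps excluding makes that chain suspect":

```python
# Sketch of automated censorship detection (hypothetical parameters):
# the client records when it first saw each pending transaction, and
# refuses to treat a chain as finalized if that chain still excludes a
# transaction that has been visible for too long.

CENSORSHIP_THRESHOLD = 600  # seconds a tx may stay visible-but-excluded

class Client:
    def __init__(self) -> None:
        self.first_seen: dict = {}  # tx_hash -> time first observed

    def observe_tx(self, tx_hash: str, now: float) -> None:
        self.first_seen.setdefault(tx_hash, now)

    def accept_as_finalized(self, chain_txs: set, now: float) -> bool:
        for tx, t0 in self.first_seen.items():
            if tx not in chain_txs and now - t0 > CENSORSHIP_THRESHOLD:
                return False  # suspected censorship: withhold finality
        return True

c = Client()
c.observe_tx("0xabc", now=0)
print(c.accept_as_finalized({"0xother"}, now=100))   # True: within threshold
print(c.accept_as_finalized({"0xother"}, now=1000))  # False: excluded too long
```

Note that this only withholds the client's own view of finality; it does not (and cannot) automatically pick the winning fork, which is why some social-layer coordination remains necessary.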
Raising the Quorum Threshold
Currently, a block will be finalized as long as 67% of stakers support it. Some believe this practice is too aggressive. Throughout Ethereum’s history, there has only been one (very brief) failure of finality. If this ratio is raised to 80%, the number of additional non-finality periods will be relatively low, but Ethereum will gain security: specifically, many more controversial situations will lead to temporary halts in finality. This seems to be a healthier situation than allowing the “wrong side” to win immediately, whether the wrong side is an attacker or a client with a bug.
This also answers the question, "What is the point of solo stakers?" Today, most stakers stake through pools, and it seems unlikely that solo stakers will ever reach 51% of staked ETH. However, with some effort, it does seem achievable for solo stakers to reach a quorum-blocking minority, especially with an 80% quorum (where a blocking minority needs only 21%). As long as solo stakers do not go along with a 51% attack (whether a finality reversal or censorship), such an attack cannot achieve a "clean victory," and solo stakers would have an incentive to help organize a minority soft fork.
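The quorum arithmetic above is simple enough to state in code: a blocking minority is anything above one minus the quorum threshold.

```python
# Arithmetic behind the quorum discussion: the share of stake needed to
# block finality is whatever pushes participation below the quorum.

def blocking_minority(quorum: float) -> float:
    """Fraction of stake that can prevent finalization by abstaining."""
    return 1 - quorum

for quorum in (0.67, 0.80):
    share = blocking_minority(quorum)
    print(f"quorum {quorum:.0%}: more than {share:.0%} of stake can block")
```

With a 67% quorum, blocking takes more than 33% of stake; raising the quorum to 80% lowers that bar to just over 20%, which is what puts a blocking minority within plausible reach of solo stakers.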
Quantum Attack Resistance
Metaculus currently believes that, although the error margin is large, quantum computers may begin to break cryptography at some point in the 2030s:
Quantum computing experts, such as Scott Aaronson, have recently begun to take the possibility of quantum computers working in the medium term more seriously. This will impact the entire Ethereum roadmap: it means that every part of the Ethereum protocol currently relying on elliptic curves will need some alternative based on hashes or other quantum resistance. This particularly means we cannot assume we will be able to rely on the excellent performance of BLS aggregation forever to handle signatures from large validator sets. This underscores the reasonableness of being conservative in performance assumptions in proof of stake design and the need to more actively develop quantum-resistant alternatives.