Ethereum consensus P2P networking

The Footprint While Walking Through the Labyrinth (Ethereum Consensus Networking)

It seems that people pay more attention to smart contract audits and less to Ethereum itself. One reason may be the complexity and sheer size of Ethereum's codebase. Another may be the low probability of finding issues in it during an audit. However, as I understand it, Ethereum evolved from Bitcoin by introducing a quasi-Turing-complete machine, which laid the foundation for smart contracts. It serves as a benchmark for the evolution of blockchain architecture and remains a dominant design. So, whether or not issues are found, I believe that delving into Ethereum itself is a worthwhile investment.

The Ethereum ecosystem consists of four main components: the execution specifications and their clients; the consensus specifications and their clients; the dominant languages, Solidity and Vyper, and their compilers; and the staking deposit contract, which allows nodes to become validators by depositing 32 ETH.

After skimming all of these, the part that intrigued and motivated me most was peer-to-peer (P2P) networking, and specifically the consensus P2P networking that I dug into.

P2P networking reminds me of old software like eMule and Qvod, which focused on content downloads. What is amazing about blockchain networks, compared with that older software, is that they can do so much more, such as providing composable, Lego-like financial services. From the perspective of human history, this is the first time an open network has been able to do such things. Some may think the Internet is a similar innovation.

The Internet is fundamental infrastructure; anyone can connect to it by paying for bandwidth. Blockchain networks like Ethereum, however, define goals and rules for node operators, who join the network by putting up capital in exchange for rewards, so that everyone can interact with on-chain smart contracts using their private key. The immutable logic of the code gives participants something to trust, which brings huge potential opportunities. Compared with the Internet, the connections in a blockchain network are tighter.

The amazing thing about the network is its openness: anyone who satisfies the requirements can become a validator (or miner). For PoS consensus, the requirement is staking 32 ETH. Compare this with existing network systems: a giant technology company's serving platform is a closed network, and I can't think of an open network of this type earlier in human history. Openness also means that anyone can join or exit, at any time and from anywhere. How is the network guaranteed to work well across so many different situations, such as bad actors becoming validators, unexpected failures of honest validators (network disconnections, long periods offline), or a 51% attack? The questions go on and on, and my brain becomes overwhelmed. But despite these complexities, everything seems to be working well. As of now, Ethereum's TVL (Total Value Locked) has reached approximately $60 billion.

[Figure: Ethereum TVL. Source: https://defillama.com/chain/Ethereum]

After digging into the Ethereum P2P network, I realized how many aspects and considerations surround it. What I am doing now is like marking footprints while walking through a labyrinth. Maybe next time I can quickly recall the necessary information when I encounter the same situation, find the hints I need when solving related problems, help people like me, or understand the vulnerabilities discovered by others.

Of course, the most effective way to grasp all this is to read the Ethereum specifications and check the related client implementations, such as the Prysm client; the more you verify by running the test tools, the deeper your understanding becomes.

Given the complexity of the codebase and architecture, even small components carry dense information and well-reasoned design, such as the introduction of SSZ and how it is applied in different places. This article certainly leaves out many aspects and does not go into detail. I only comment on the points I think deserve more attention when reading the P2P network specification and the Prysm codebase, along with some related security considerations.

Maybe I am like a miner digging for gold without knowing how far away the gold is, but each ongoing effort increases the chance of success.

Outline

1. Preface

2. Node

3. Subnets

4. Communication

5. The used libraries or protocols

6. API

7. Verification

8. Design or mechanism

9. Test tools and methods

10. Other references

1. Preface

Although this article focuses on networking, you should also be aware of other components, such as the beacon chain data structures and fork choice. For example, besides the BeaconBlock data structure, there is AttestationData, which a validator uses to fulfil its responsibility of voting for a block. For fork choice, what mechanism is triggered when a node receives a SignedBeaconBlock? This also includes the frequently used time units, slot and epoch, which are the fundamental building blocks for defining when a block is finalized.
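Since slots and epochs come up everywhere, here is a minimal sketch of the arithmetic behind them. It assumes the mainnet values of 12 seconds per slot and 32 slots per epoch; the function names are my own and are not taken from any client.

```python
SECONDS_PER_SLOT = 12  # mainnet value (assumption: other networks may differ)
SLOTS_PER_EPOCH = 32   # mainnet value


def slot_at(genesis_time: int, now: int) -> int:
    """Return the slot that contains the Unix timestamp `now`."""
    if now < genesis_time:
        raise ValueError("timestamp is before genesis")
    return (now - genesis_time) // SECONDS_PER_SLOT


def epoch_of(slot: int) -> int:
    """Return the epoch that a slot belongs to."""
    return slot // SLOTS_PER_EPOCH


def start_slot_of(epoch: int) -> int:
    """Return the first slot of an epoch."""
    return epoch * SLOTS_PER_EPOCH
```

With these values an epoch lasts 32 * 12 = 384 seconds, which is the time scale on which attestations are counted toward justification and finality.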

From the consensus algorithm perspective, most nodes (including validators) are designed to serve the consensus goal, and validators should fulfil their responsibility of voting for a block during each epoch; otherwise, they face the defined penalties. Under the surface, this relies on networking, such as the gossip protocol with its defined topics: for example, the beacon_block topic propagates new signed beacon blocks to all nodes on the network.

2. Node

The first concept is definitely the node. A complete node includes an execution client, such as Geth; a consensus client, such as Prysm; and a validator, which is always bound to the consensus client. This article focuses on the consensus network, which involves the interactions among consensus clients as well as the interaction between a consensus client and its validator.

2.1 Node identification

How do we identify the node?

For consensus clients, the node ID represents the client, for example:

001c5c174a5c58b971dcee142ffac3c88e9e1b5ab47cafcfa539a5c9636f402e51****

For validators, each has its own public key. Just as a regular user interacts with a smart contract, a validator fulfils its responsibility by signing AttestationData with its private key.

How do we get the node information? An ENR (Ethereum Node Record) includes basic information such as the IP address and ports, plus other necessary fields such as attnets and eth2, which relate to subnets and fork versions. For more details about this design, see https://eips.ethereum.org/EIPS/eip-778.
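To make the eth2 and attnets fields less opaque, here is a minimal sketch that decodes the values printed by the ENR dump in this section. It assumes that the printed hex still carries a one-byte RLP string prefix (0x90 for 16 bytes, 0x88 for 8 bytes), that eth2 holds fork digest, next fork version, and next fork epoch in that order, and that attnets is a 64-bit bitfield. The helper names are my own.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EnrForkId:
    fork_digest: str        # 4 bytes, hex
    next_fork_version: str  # 4 bytes, hex
    next_fork_epoch: int    # uint64, little-endian on the wire


def strip_rlp_string_prefix(raw: bytes) -> bytes:
    """Drop a short-string RLP prefix (0x80 + length) if one is present."""
    if raw and 0x80 <= raw[0] <= 0xB7 and len(raw) == 1 + (raw[0] - 0x80):
        return raw[1:]
    return raw


def decode_eth2_field(hex_value: str) -> EnrForkId:
    payload = strip_rlp_string_prefix(bytes.fromhex(hex_value))
    if len(payload) != 16:
        raise ValueError("expected 16 bytes, got %d" % len(payload))
    return EnrForkId(
        fork_digest=payload[0:4].hex(),
        next_fork_version=payload[4:8].hex(),
        next_fork_epoch=int.from_bytes(payload[8:16], "little"),
    )


def decode_attnets(hex_value: str) -> List[int]:
    """Return the subnet ids whose bit is set (least significant bit first)."""
    payload = strip_rlp_string_prefix(bytes.fromhex(hex_value))
    return [i for i in range(len(payload) * 8) if (payload[i // 8] >> (i % 8)) & 1]


if __name__ == "__main__":
    print(decode_eth2_field("9032d6ef186000003800e1f50500000000"))
    print(decode_attnets("880000000000000000"))
```

For the record dumped in this section, the attnets bitfield is all zeros, meaning the node advertises no long-lived attestation subnets.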

// Parse an ENR
devp2p enrdump enr:-Mq4QD7nuyX-7M5YKc9lnRgOx-NLAdg2OFevkR5aX7nW0qzIZfe9Hum-DH3Spd2E1vqI1LuCgbSnPs4RDD-3JcJ9jpOGAZR4p8sNh2F0dG5ldHOIAAAAAAAAAACEZXRoMpAy1u8YYAAAOADh9QUAAAAAgmlkgnY0gmlwhKwQAB2EcXVpY4IyyIlzZWNwMjU2azGhAthB2r4qRDzu_tTMDJmVY1Rqon03BbXT2IIDZxrw-fSPiHN5bmNuZXRzAIN0Y3CCMsiDdWRwgi7g

//Result
Node ID: bbdbf50b64ce4609dead8bce313f5fda5041c0e661e2eee709dc493fa18448e3
URLv4:   enode://d841dabe2a443ceefed4cc0c999563546aa27d3705b5d3d88203671af0f9f48f9c1d18220827dd75aa1d7caa5eb5f47ca847cf13ba11f3639022e389649ce9f6@172.16.0.29:13000?discport=12000
Record has sequence number 1737191049997 and 9 key/value pairs.
  "attnets"   880000000000000000
  "eth2"      9032d6ef186000003800e1f50500000000
  "id"        "v4"
  "ip"        172.16.0.29
  "quic"      8232c8
  "secp256k1" a102d841dabe2a443ceefed4cc0c999563546aa27d3705b5d3d88203671af0f9f48f
  "syncnets"  00
  "tcp"       13000
  "udp"       12000

If you are curious about node metrics across the network, see: Tracking nodes in the network

2.2 Node discovery protocol

Once node identification is defined, the next question is how to find nodes on the consensus network. It uses discv5 for discovery, and the consensus layer implementation uses libp2p instead of devp2p.

There are many considerations under the hood. As a critical component, this is worth investigating further, for example whether discv5 is used correctly.

Now, the security considerations for the node:

  1. Whether or not a node is honest, other nodes should be able to get the information necessary to make that judgment.
  2. Privacy: can leaking node information lead to a vulnerability?
  3. Does each piece of node information play an appropriate role wherever it is used, such as when finding a node, scoring a node, or evaluating node behavior?

3. Subnets

When a validator submits attestation data, it does not mean that this data will be broadcast to the entire network immediately. Instead, it is submitted to a specific subnet. Currently, Ethereum has 64 attestation subnets, and each validator must submit its attestation data to its corresponding subnet.
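The mapping from a committee to its subnet is simple arithmetic. Below is a minimal sketch that follows my reading of compute_subnet_for_attestation in the phase0 validator specification; the constants are the mainnet values, and committees_per_slot would normally be derived from the beacon state rather than passed in by hand.

```python
SLOTS_PER_EPOCH = 32           # mainnet value
ATTESTATION_SUBNET_COUNT = 64  # number of attestation subnets


def compute_subnet_for_attestation(committees_per_slot: int, slot: int, committee_index: int) -> int:
    """Map (slot, committee index) to one of the attestation subnets."""
    slots_since_epoch_start = slot % SLOTS_PER_EPOCH
    committees_since_epoch_start = committees_per_slot * slots_since_epoch_start
    return (committees_since_epoch_start + committee_index) % ATTESTATION_SUBNET_COUNT
```

The resulting subnet id selects the gossip topic beacon_attestation_{subnet_id} on which the unaggregated attestation is published.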

Security considerations:

How is randomness guaranteed when selecting a validator as an aggregator? This randomness prevents anyone from predicting the aggregator and attempting to crash the node.
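As far as I understand the phase0 validator specification, the answer is that selection depends on a hash of the validator's own BLS signature over the slot, so nobody else can predict it before the signature is revealed. The sketch below shows only the selection arithmetic; it takes the signature as opaque bytes and does not perform any real BLS signing. TARGET_AGGREGATORS_PER_COMMITTEE = 16 is the spec constant as I recall it.

```python
import hashlib

TARGET_AGGREGATORS_PER_COMMITTEE = 16


def is_aggregator(committee_size: int, slot_signature: bytes) -> bool:
    """Decide whether the holder of `slot_signature` aggregates for its committee."""
    modulo = max(1, committee_size // TARGET_AGGREGATORS_PER_COMMITTEE)
    digest = hashlib.sha256(slot_signature).digest()
    # The first 8 bytes of the hash are read as a little-endian integer.
    return int.from_bytes(digest[0:8], "little") % modulo == 0
```

With a committee of 128 validators the modulo is 8, so on average about 16 members select themselves. In a small committee of 16 or fewer, every member is an aggregator.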

4. Communication

From a dynamic perspective, many messages spread among all nodes every second, involving different message types and services for different goals. This is the most complex part of networking. Under the hood, many researchers have made contributions, and the architecture is still evolving.

Some designs and tricks matter well beyond the networking layer itself, such as how to guarantee that node A can talk to the whole network instead of only specific groups, as shown below.

[Figure: network communication. Source: https://eth2book.info/capella/part2/consensus/preliminaries/#you-cant-have-both]

4.1 Supported transports

To ensure interoperability and future compatibility, the specification defines which transport protocols clients must minimally support, such as TCP, QUIC, or WebSockets.

For example, Prysm previously had an issue where it did not support IPv6: https://github.com/prysmaticlabs/prysm/pull/6363

4.2 Two main domains

There are two main domains: gossip and Req/Resp.

Gossip deals with information that spreads across the whole network, including beacon blocks, aggregates and proofs, attestations, exits, and slashings.

https://github.com/ethereum/consensus-specs/blob/dev/specs/phase0/p2p-interface.md#topics-and-messages.

You can also see how Prysm defines these topics in this file:

https://github.com/prysmaticlabs/prysm/blob/develop/beacon-chain/p2p/topics.go.

The implementation applies the gossipsub v1 libp2p Protocol, including the gossipsub v1.1 extension.

For gossipsub, the beacon network applies the following default configuration:

[Figure: default gossipsub configuration]
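For readers who cannot see the figure, here is a minimal sketch of the main gossipsub parameters as I recall them from the phase0 p2p specification. Please verify the numbers against the linked specification before relying on them. The class and its check method are my own illustration, not a libp2p API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GossipsubParams:
    d: int = 8                          # target number of mesh peers per topic
    d_low: int = 6                      # lower bound before grafting more peers
    d_high: int = 12                    # upper bound before pruning peers
    d_lazy: int = 6                     # peers that only receive gossip (IHAVE)
    heartbeat_interval_s: float = 0.7   # seconds between heartbeats
    fanout_ttl_s: int = 60              # seconds to keep fanout state
    mcache_len: int = 6                 # history windows kept in the message cache
    mcache_gossip: int = 3              # history windows advertised via gossip

    def check(self) -> None:
        """Raise ValueError if the parameters are mutually inconsistent."""
        if not (self.d_low <= self.d <= self.d_high):
            raise ValueError("expected d_low <= d <= d_high")
        if self.mcache_gossip > self.mcache_len:
            raise ValueError("mcache_gossip cannot exceed mcache_len")
```

The relationship d_low <= d <= d_high is what keeps each topic mesh dense enough to propagate messages quickly without flooding every peer.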

Under the hood, gossip is also a challenging topic: how do we make sure data is transferred across the network in the expected way? I don't know the whole mechanism, but it plays a critical role.

I also think there is a deep relationship with topology, and the goal may be achieved by applying knowledge of topology. Beyond the consensus network, network topology itself, such as the mesh type, seems to be a complex and sophisticated area of computer science and mathematics.

Security considerations:

  1. Is gossip applied appropriately? Are the configs correct?
  2. For each topic, is the validation appropriate? Is the clients' implementation right?
  3. When a new topic is added, does it achieve the expected results? Are there negative side effects?

Req/Resp

This domain requests specific information from other nodes, such as fetching beacon blocks from a peer.

https://github.com/ethereum/consensus-specs/blob/dev/specs/phase0/p2p-interface.md#messages

You can also see how the Prysm client defines these topics:

https://github.com/prysmaticlabs/prysm/blob/develop/beacon-chain/p2p/rpc_topic_mappings.go

Security considerations:

  1. The responding side should defend against malicious requests that would exhaust the node's resources, for example by introducing a rate-limiting mechanism and peer scoring.
  2. Guarantee data correctness. For example, does the returned data actually match the correct fork version?
  3. The requesting side should check each request before sending it, so that it does not exceed the limits.
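To make the rate-limiting idea in point 1 concrete, here is a minimal token-bucket sketch with one bucket per peer. This is a generic technique and not Prysm's actual limiter; the capacity, refill rate, and class names are hypothetical. Time is passed in explicitly so the behaviour is easy to reason about.

```python
from typing import Dict


class TokenBucket:
    def __init__(self, capacity: float, refill_per_second: float, now: float) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = capacity
        self.last = now

    def allow(self, cost: float, now: float) -> bool:
        """Refill for the elapsed time, then spend `cost` tokens if available."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.last = now
        if cost <= self.tokens:
            self.tokens -= cost
            return True
        return False


class PeerLimiter:
    """One bucket per peer id, so a greedy peer cannot starve the others."""

    def __init__(self, capacity: float, refill_per_second: float) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.buckets: Dict[str, TokenBucket] = {}

    def allow(self, peer_id: str, cost: float, now: float) -> bool:
        bucket = self.buckets.get(peer_id)
        if bucket is None:
            bucket = TokenBucket(self.capacity, self.refill_per_second, now)
            self.buckets[peer_id] = bucket
        return bucket.allow(cost, now)
```

A natural choice is to charge a cost equal to the number of blocks requested, and to lower the peer's score whenever allow returns False.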

The Prysm technical walkthrough also gives some tips about P2P vulnerabilities.

4.3 Communication data format

The beacon network uses a serialization method called Simple Serialize (SSZ) instead of RLP (Recursive Length Prefix), which the execution network uses.

As a new serialization for consensus, it was discussed for a long time. It aims to fit both consensus and communication. For more details and usage, see the following.

https://eth2book.info/capella/part2/building_blocks/ssz/#ssz-simple-serialize

https://github.com/ethereum/consensus-specs/blob/dev/ssz/simple-serialize.md

https://eth2book.info/capella/part2/building_blocks/ssz/
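To give a feel for SSZ, here is a minimal sketch that serializes a Checkpoint-like container by hand. It assumes the container has two fixed-size fields, epoch (uint64) and root (32 bytes); for fixed-size fields SSZ simply concatenates the encodings, with integers in little-endian order. Real clients use a full SSZ library, and variable-size fields need offsets that are not shown here.

```python
from typing import Tuple


def serialize_uint64(value: int) -> bytes:
    """SSZ encodes unsigned integers as fixed-width little-endian bytes."""
    return value.to_bytes(8, "little")


def serialize_checkpoint(epoch: int, root: bytes) -> bytes:
    """Serialize a container with fields (epoch: uint64, root: Bytes32)."""
    if len(root) != 32:
        raise ValueError("root must be exactly 32 bytes")
    return serialize_uint64(epoch) + root


def deserialize_checkpoint(data: bytes) -> Tuple[int, bytes]:
    """Inverse of serialize_checkpoint; rejects input of the wrong length."""
    if len(data) != 40:
        raise ValueError("a checkpoint is exactly 40 bytes")
    return int.from_bytes(data[0:8], "little"), data[8:40]
```

Because the encoding has a known, fixed length, a receiver can reject malformed input by length alone before doing any further work.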

Security considerations:

SSZ is obviously used in many places, so we should check whether each use of SSZ is right and whether each language's SSZ implementation is correct.

https://github.com/ethereum/consensus-specs/issues/2138

Besides the serialization method, messages are also compressed before transmission, using Snappy: https://github.com/google/snappy

Snappy has two formats: "block" and "frames" (streaming).

Gossip: block compression

Req/Resp: frame compression

One example:
   /eth2/446a7232/beacon_aggregate_and_proof/ssz_snappy
   446a7232: fork digest (derived from the fork version and the genesis validators root)
   beacon_aggregate_and_proof: topic name
   ssz_snappy: encoding and compression method
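The structure of such a topic string can be checked mechanically. Below is a minimal parsing sketch, assuming the layout /eth2/<fork digest>/<name>/<encoding> with an 8-character lowercase hex digest and ssz_snappy as the only accepted encoding. The function and type names are my own.

```python
from typing import NamedTuple


class GossipTopic(NamedTuple):
    fork_digest: str
    name: str
    encoding: str


def parse_gossip_topic(topic: str) -> GossipTopic:
    """Split a gossip topic string into its parts, rejecting malformed input."""
    parts = topic.split("/")
    # "/eth2/<digest>/<name>/<encoding>" splits into ["", "eth2", digest, name, encoding]
    if len(parts) != 5 or parts[0] != "" or parts[1] != "eth2":
        raise ValueError("not an eth2 gossip topic: %r" % topic)
    digest, name, encoding = parts[2], parts[3], parts[4]
    if len(digest) != 8 or any(c not in "0123456789abcdef" for c in digest):
        raise ValueError("fork digest must be 8 lowercase hex characters")
    if not name:
        raise ValueError("topic name is empty")
    if encoding != "ssz_snappy":
        raise ValueError("unsupported encoding: %r" % encoding)
    return GossipTopic(digest, name, encoding)
```

Rejecting a topic whose fork digest does not match the local one is a cheap first filter before any message on that topic is decompressed or decoded.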

5. The used libraries or protocols

As shown above, beacon clients use many libraries and protocols, such as:

libp2p

discv5

gossipsub v1.0

gossipsub v1.1

Simple Serialize (SSZ)

BLS signatures: chosen because signatures can be aggregated and verified efficiently

Some of these libraries are used not only by Ethereum but also by other well-known distributed systems, such as IPFS. Polkadot also uses libp2p.

To some degree, a security audit of these libraries benefits not only Ethereum consensus but every project that uses them. For example, it seems that Trail of Bits once published an audit report related to libp2p, but I have not been able to find it.

6. API

Platforms like Alchemy and Infura provide blockchain API services, through which we can get information from execution clients. There is also a Beacon API for beacon clients, but so far I have not found a corresponding beacon API service on these platforms.

We can check the official specification of the Beacon APIs.

https://ethereum.github.io/beacon-APIs/

From the security perspective, when these APIs are exposed externally, operators should keep in mind Denial-of-Service attacks and the disclosure of node information.

Take the Prysm client as an example: there are APIs between the validator and the beacon client, which serve validators so they can perform their duties and get the information they need. Prysm implements this API with the popular gRPC framework. These APIs do not need to be exposed to the outside world.

7. Verification

A huge number of verifications of different types are involved, such as data type checks, signature verification, data limit checks, and node info verification. Some of them play a critical role in defending the network; for example, if the data length exceeds the maximum length, the request is refused.

One type applies when a node receives incoming messages: it should check them thoroughly before taking any further action. There are many considerations behind this, such as whether the sending node obeys the consensus rules.

Here I only emphasize the P2P network; some examples are below.

  1. Request length checks.
  2. Rate-limiting mechanism.
  3. The use of third-party libraries: libp2p, gossipsub.
  4. Validation specific to each topic.
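As an illustration of the first item, here is a minimal sketch of a request length check for a blocks-by-range style request. MAX_REQUEST_BLOCKS = 1024 is the phase0 constant as I recall it; the request class and the rejection policy are hypothetical and do not reproduce Prysm's logic.

```python
from dataclasses import dataclass
from typing import List

MAX_REQUEST_BLOCKS = 1024  # phase0 value (assumption: check the spec for later forks)


@dataclass
class BlocksByRangeRequest:
    start_slot: int
    count: int
    step: int = 1


def validate_blocks_by_range(req: BlocksByRangeRequest, max_blocks: int = MAX_REQUEST_BLOCKS) -> List[str]:
    """Return a list of problems; an empty list means the request is acceptable."""
    problems = []
    if req.start_slot < 0:
        problems.append("start_slot must not be negative")
    if req.count < 1:
        problems.append("count must be at least 1")
    if req.count > max_blocks:
        problems.append("count exceeds the maximum of %d blocks" % max_blocks)
    if req.step != 1:
        problems.append("step is deprecated and must be 1")
    return problems
```

Running this check before any database lookup means a malformed request costs the responder almost nothing.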

Specification for beacon_block validation

https://github.com/ethereum/consensus-specs/blob/dev/specs/phase0/p2p-interface.md#beacon_block

Related Prysm client code, showing the different validations for different topics: https://github.com/prysmaticlabs/prysm/blob/31044206b8447cf290e128cb3531569721d31712/beacon-chain/sync/subscriber.go#L81

[Figure: Prysm client verification code]

From the security perspective

  1. Is the client's implementation appropriate?

  2. Defensive: Prevent flood attacks, limit request rates, implement a scoring mechanism, and prevent cheating.

  3. Version management: Handle different forks and maintain compatibility across them

8. Design or mechanism

When starting a beacon client, two types of sync are involved: initial sync and regular sync. How can we ensure that all historical data is synced and that the client continues syncing the latest data after completing the initial sync? The design

One component of this design is called Blocks Fetcher, which deals with handling p2p communication.

Some related Prysm code https://github.com/prysmaticlabs/prysm/blob/ce397ce797c33dbcf77fa7670c356844ef6aad43/beacon-chain/sync/initial-sync/blocks_fetcher.go#L68
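To picture what a fetcher has to do during initial sync, here is a minimal sketch that splits a slot range into fixed-size batches, each of which could become one by-range request to a peer. This is only my illustration of the idea; the batch size of 64 and the function name are hypothetical and are not taken from Prysm's Blocks Fetcher.

```python
from typing import List, Tuple


def plan_batches(start_slot: int, end_slot: int, batch_size: int = 64) -> List[Tuple[int, int]]:
    """Split [start_slot, end_slot] into (start, count) batches."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batches = []
    slot = start_slot
    while slot <= end_slot:
        count = min(batch_size, end_slot - slot + 1)
        batches.append((slot, count))
        slot += count
    return batches
```

Once the planned range catches up with the current head slot, the client can switch from initial sync to regular sync, where new blocks arrive over gossip.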

9. Test tools and methods

One challenge with the P2P network is how to test it, as there are so many different nodes. For example, during each slot there is a lot of communication among the nodes and between the execution client and the consensus client. The question then is how to guarantee that clients behave as expected.

Many tools and methods address this, and they help us test and verify whether our understanding is right. To some degree, using these tools is necessary for achieving a true understanding.

9.1 Execution specification, consensus specification

The unit tests for the consensus specification: https://github.com/ethereum/consensus-specs/tree/dev/tests

The execution specification:

https://github.com/ethereum/consensus-specs/tree/dev/tests

9.2 Simulating the network

https://github.com/kurtosis-tech/kurtosis

https://github.com/ethpandaops/ethereum-package

By editing the config (https://github.com/ethpandaops/ethereum-package/blob/main/network_params.yaml), we can configure the network based on our needs. For example, below is one test config. We can add different clients, add more services like Blockscout, which can inspect ongoing transactions, and add specific tests using Assertoor.

[Figure: ethereum-package config]

One video teaches how to use the tool step by step. https://www.youtube.com/watch?v=yB94ukeWKiw&t=1s

Trail of Bits has also implemented another way of simulating the network: chaos testing

https://blog.trailofbits.com/2024/03/18/releasing-the-attacknet-a-new-tool-for-finding-bugs-in-blockchain-nodes-using-chaos-testing/

https://www.youtube.com/watch?v=G_yTFuhYRFo

https://github.com/crytic/attacknet

9.3 Client-specific tests (Prysm)

In the Prysm client, each Go source file has a related test file.

To prepare network configurations, simulate the network, and check different scenarios, Prysm uses Bazel to run end-to-end tests.

https://docs.prylabs.network/docs/devtools/end-to-end#running-e2e-tests

[Figure: Prysm client test scenarios]

9.4 Running a node

A more verifiable way to check your understanding is to run a node. By running a node, many concepts become easier to understand, such as the sync mechanism, or how a validator's status changes from deposited to exited.

9.5 Automated tests

You can also check the GitHub Actions workflows and scan the related automated tests.

https://github.com/ethpandaops/ethereum-package/actions

9.6 Monitoring

Because the network changes dynamically, monitoring it also plays a necessary role in guaranteeing its sustainability. For example, Prysm uses Prometheus and Grafana to display metrics data (see "Configure dashboarding and alerts with Prometheus and Grafana"). These tools can also be added to https://github.com/ethpandaops/ethereum-package.

10. Other references

10.1 configs

consensus config params

https://github.com/prysmaticlabs/prysm/blob/develop/config/params/config.go

Topics

https://github.com/prysmaticlabs/prysm/blob/develop/beacon-chain/p2p/topics.go

https://github.com/prysmaticlabs/prysm/blob/develop/beacon-chain/p2p/rpc_topic_mappings.go

10.2 Core materials

https://immunefi.slite.page/p/OCk4QLgZz4Euuo/Resources

https://immunefi.com/academy/ethereum-protocol-attackathon

10.3 Ethereum bug-bounty

https://ethereum.org/en/bug-bounty/

https://github.com/ethereum/public-disclosures