Cluster Configuration
Documenting a Distributed Validator Cluster in a standardised file format
These cluster definition and cluster lock files are a work in progress. The intention is for the files to be standardised for operating distributed validators via the EIP process when appropriate.
This document describes the configuration options for running a Charon client or cluster.
A Charon cluster is configured in two steps:

1. `cluster-definition.json`, which defines the intended cluster configuration before keys have been created in a distributed key generation ceremony.
2. `cluster-lock.json`, which includes and extends `cluster-definition.json` with distributed validator BLS public key shares.

In the case of a solo operator running a cluster, the `charon create cluster` command combines both steps into one and just outputs the final `cluster-lock.json`, without a DKG step.
Cluster Definition File
The `cluster-definition.json` file is provided as input to the DKG ceremony, which generates the key shares and the `cluster-lock.json` file.
Using the CLI
The `charon create dkg` command is used to create the `cluster-definition.json` file, which is then used as input to `charon dkg`.

The schema of `cluster-definition.json` is defined as:
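The full schema is versioned and evolves over time, so the authoritative definition is the one shipped with your Charon release. As an abridged sketch only (field names here follow the general shape of the v1.x format and should not be treated as authoritative), a definition looks roughly like:

```json
{
  "name": "example cluster",
  "operators": [
    {
      "address": "0x...",
      "enr": "enr:-...",
      "config_signature": "0x...",
      "enr_signature": "0x..."
    }
  ],
  "uuid": "...",
  "version": "v1.x.x",
  "timestamp": "...",
  "num_validators": 2,
  "threshold": 3,
  "fee_recipient_address": "0x...",
  "withdrawal_address": "0x...",
  "dkg_algorithm": "default",
  "fork_version": "0x...",
  "config_hash": "0x...",
  "definition_hash": "0x..."
}
```

Only one operator entry is shown above; in practice the `operators` array contains one entry per node in the cluster, and the `config_hash`/`definition_hash` fields commit to the serialized contents of the other fields.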
Using the DV Launchpad
1. A leader/creator who wishes to coordinate the creation of a new Distributed Validator Cluster navigates to the launchpad and selects "Create new Cluster".
2. The leader/creator uses the user interface to configure all of the important details about the cluster, including:
   - The Withdrawal Address for the created validators;
   - The Fee Recipient Address for block proposals, if it differs from the withdrawal address;
   - The number of distributed validators to create;
   - The list of participants in the cluster, specified by Ethereum address (or ENS);
   - The threshold of fault tolerance required.
3. These key pieces of information form the basis of the cluster configuration. These fields (and some technical fields, like the DKG algorithm to use) are serialized and merklized to produce the definition's `cluster_definition_hash`. This merkle root is used to confirm that there is no ambiguity or deviation between definitions when they are provided to Charon nodes.
4. Once the leader/creator is satisfied with the configuration, they publish it to the launchpad's data availability layer for the other participants to access. (For early development, the launchpad will use a centralized backend database to store the cluster configuration. Nearer production, solutions like IPFS or Arweave may be more suitable for the long-term decentralization of the launchpad.)
Cluster Lock File
The `cluster-lock.json` file has the following schema:
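As with the definition file, the authoritative schema is version-dependent. An abridged sketch of the general shape (field names are illustrative, not authoritative): the lock embeds the full cluster definition and adds the DKG outputs.

```json
{
  "cluster_definition": { "...": "the full cluster-definition.json contents" },
  "distributed_validators": [
    {
      "distributed_public_key": "0x...",
      "public_shares": ["0x...", "0x...", "0x...", "0x..."]
    }
  ],
  "signature_aggregate": "0x...",
  "lock_hash": "0x..."
}
```

Each entry in `distributed_validators` corresponds to one distributed validator: a group BLS public key plus one public key share per operator.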
Cluster Size and Resilience
The cluster size (the number of nodes/operators in the cluster) determines the resilience of the cluster; its ability to remain operational under diverse failure scenarios. Larger clusters can tolerate more faulty nodes. However, increased cluster size implies higher operational costs and potential network latency, which may negatively affect performance.
Optimal cluster size is therefore a trade-off between resilience (larger is better) versus cost-efficiency and performance (smaller is better).
Cluster resilience can be broadly classified into two categories:
Byzantine Fault Tolerance (BFT) - the ability to tolerate nodes that are actively trying to disrupt the cluster.
Crash Fault Tolerance (CFT) - the ability to tolerate nodes that have crashed or are otherwise unavailable.
Different cluster sizes tolerate different counts of Byzantine vs crashed nodes. In practice, hardware and software crash relatively frequently, while Byzantine behaviour is relatively uncommon. However, Byzantine fault tolerance is crucial for trust-minimised systems like distributed validators. Thus, cluster size can be chosen to optimise for either BFT or CFT.
The table below lists different cluster sizes and their characteristics:
- **Cluster Size** - the number of nodes in the cluster.
- **Threshold** - the minimum number of nodes that must collaborate to reach consensus quorum and to create signatures.
- **BFT #** - the maximum number of Byzantine nodes that can be tolerated.
- **CFT #** - the maximum number of crashed nodes that can be tolerated.
| Cluster Size | Threshold | BFT # | CFT # | Notes |
|---|---|---|---|---|
| 1 | 1 | 0 | 0 | ❌ Invalid: Not CFT nor BFT! |
| 2 | 2 | 0 | 0 | ❌ Invalid: Not CFT nor BFT! |
| 3 | 2 | 0 | 1 | ⚠️ Warning: CFT but not BFT! |
| 4 | 3 | 1 | 1 | ✅ CFT and BFT optimal for 1 faulty |
| 5 | 4 | 1 | 1 | |
| 6 | 4 | 1 | 2 | ✅ CFT optimal for 2 crashed |
| 7 | 5 | 2 | 2 | ✅ BFT optimal for 2 byzantine |
| 8 | 6 | 2 | 2 | |
| 9 | 6 | 2 | 3 | ✅ CFT optimal for 3 crashed |
| 10 | 7 | 3 | 3 | ✅ BFT optimal for 3 byzantine |
| 11 | 8 | 3 | 3 | |
| 12 | 8 | 3 | 4 | ✅ CFT optimal for 4 crashed |
| 13 | 9 | 4 | 4 | ✅ BFT optimal for 4 byzantine |
| 14 | 10 | 4 | 4 | |
| 15 | 10 | 4 | 5 | ✅ CFT optimal for 5 crashed |
| 16 | 11 | 5 | 5 | ✅ BFT optimal for 5 byzantine |
| 17 | 12 | 5 | 5 | |
| 18 | 12 | 5 | 6 | ✅ CFT optimal for 6 crashed |
| 19 | 13 | 6 | 6 | ✅ BFT optimal for 6 byzantine |
| 20 | 14 | 6 | 6 | |
| 21 | 14 | 6 | 7 | ✅ CFT optimal for 7 crashed |
| 22 | 15 | 7 | 7 | ✅ BFT optimal for 7 byzantine |
The table above is determined by the QBFT consensus algorithm, with the following formulas from this paper (reconstructed here to match the table, where `n` is the cluster size):

- Threshold (quorum): `Quorum(n) = ceil(2n/3)`
- BFT #: `f(n) = floor((n-1)/3)`
- CFT #: `n - Quorum(n)`
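The table can be reproduced from these formulas directly. A minimal sketch (the function names are illustrative, not part of Charon):

```python
import math


def quorum(n: int) -> int:
    """Threshold: minimum nodes that must collaborate for quorum/signatures."""
    return math.ceil(2 * n / 3)


def bft(n: int) -> int:
    """Maximum number of Byzantine nodes tolerated."""
    return (n - 1) // 3


def cft(n: int) -> int:
    """Maximum number of crashed nodes tolerated: the nodes beyond the quorum."""
    return n - quorum(n)


# Print the cluster-size table for n = 1..22.
for n in range(1, 23):
    print(f"{n:>2} {quorum(n):>2} {bft(n):>2} {cft(n):>2}")
```

Running this reproduces the Threshold, BFT # and CFT # columns above, e.g. a 4-node cluster has a threshold of 3 and tolerates 1 faulty node.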